Can Watermarks Survive Translation? On the Cross-lingual Consistency of Text Watermark for Large Language Models

Published 21 Feb 2024 in cs.CL and cs.AI | arXiv:2402.14007v2

Abstract: Text watermarking technology aims to tag and identify content produced by LLMs to prevent misuse. In this study, we introduce the concept of cross-lingual consistency in text watermarking, which assesses the ability of text watermarks to maintain their effectiveness after being translated into other languages. Preliminary empirical results from two LLMs and three watermarking methods reveal that current text watermarking technologies lack consistency when texts are translated into various languages. Based on this observation, we propose a Cross-lingual Watermark Removal Attack (CWRA) to bypass watermarking by first obtaining a response from an LLM in a pivot language, which is then translated into the target language. CWRA can effectively remove watermarks, decreasing the AUCs to a random-guessing level without performance loss. Furthermore, we analyze two key factors that contribute to the cross-lingual consistency in text watermarking and propose X-SIR as a defense method against CWRA. Code: https://github.com/zwhe99/X-SIR.
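The CWRA pipeline described in the abstract (generate in a pivot language, then translate into the target language) can be sketched as follows. This is a minimal illustrative sketch, not the authors' implementation: `query_llm` and `translate` are hypothetical placeholders standing in for a watermarked LLM endpoint and a translation system.

```python
def cwra(prompt: str, pivot_lang: str, target_lang: str,
         query_llm, translate) -> str:
    """Hedged sketch of the Cross-lingual Watermark Removal Attack (CWRA).

    A token-level watermark is embedded while the LLM generates text;
    routing the generation through a pivot language and translating the
    result back tends to destroy that token-level signal.
    """
    # Step 1: translate the user's prompt into the pivot language.
    pivot_prompt = translate(prompt, src=target_lang, tgt=pivot_lang)
    # Step 2: query the (watermarked) LLM in the pivot language.
    pivot_response = query_llm(pivot_prompt)
    # Step 3: translate the response into the target language; the
    # cross-lingual round trip is what removes the watermark.
    return translate(pivot_response, src=pivot_lang, tgt=target_lang)
```

Detection would then be run on the translated output; the paper reports that this drives detector AUCs down to random-guessing levels without degrading task performance.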

