k-SemStamp: A Clustering-Based Semantic Watermark for Detection of Machine-Generated Text

Published 17 Feb 2024 in cs.CL, cs.CY, cs.LG, and cs.CR | (2402.11399v2)

Abstract: Recent watermarked generation algorithms inject detectable signatures during language generation to facilitate post-hoc detection. While token-level watermarks are vulnerable to paraphrase attacks, SemStamp (Hou et al., 2023) applies a watermark to the semantic representation of sentences and demonstrates promising robustness. SemStamp employs locality-sensitive hashing (LSH) to partition the semantic space with arbitrary hyperplanes, which results in a suboptimal tradeoff between robustness and speed. We propose k-SemStamp, a simple yet effective enhancement of SemStamp that uses k-means clustering in place of LSH to partition the embedding space with awareness of its inherent semantic structure. Experimental results indicate that k-SemStamp substantially improves robustness and sampling efficiency while preserving generation quality, offering a more effective tool for machine-generated text detection.
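
The sketch below illustrates, at a high level, the clustering-based partitioning described in the abstract: sentence embeddings of an in-domain corpus are clustered offline with k-means, and at generation time candidate sentences are rejection-sampled until their embedding lands in a pseudorandomly selected set of "valid" clusters. All names, hyperparameters (DIM, K, VALID_RATIO), the toy embed() encoder, and the propose() callback are illustrative assumptions rather than the authors' implementation; scikit-learn's KMeans stands in for the paper's clustering step.

# Minimal sketch (not the authors' implementation) of the clustering-based
# partitioning described in the abstract: k-means replaces LSH hyperplanes
# as the way to carve the sentence-embedding space into regions.
import hashlib
import numpy as np
from sklearn.cluster import KMeans

DIM = 768          # embedding dimensionality (assumed)
K = 8              # number of semantic clusters (hypothetical choice)
VALID_RATIO = 0.5  # fraction of clusters treated as "valid" per step (assumed)

def embed(sentence: str) -> np.ndarray:
    # Placeholder for a paraphrase-robust sentence encoder; deterministic
    # pseudo-embeddings keep the sketch self-contained and runnable.
    seed = int(hashlib.sha256(sentence.encode()).hexdigest(), 16) % (2**32)
    return np.random.default_rng(seed).normal(size=DIM)

# Offline step: partition the embedding space by fitting k-means on an
# in-domain corpus (here a toy corpus of placeholder sentences).
corpus = [f"example sentence {i}" for i in range(200)]
kmeans = KMeans(n_clusters=K, n_init=10, random_state=0).fit(
    np.stack([embed(s) for s in corpus])
)

def cluster_of(sentence: str) -> int:
    return int(kmeans.predict(embed(sentence)[None, :])[0])

def valid_clusters(prev_cluster: int) -> set:
    # Pseudorandomly mark a subset of clusters as "valid", seeded by the
    # previous sentence's cluster (a real scheme would mix in a secret key).
    step_rng = np.random.default_rng(prev_cluster)
    n_valid = max(1, int(VALID_RATIO * K))
    return set(step_rng.choice(K, size=n_valid, replace=False).tolist())

def watermarked_next_sentence(prev_sentence: str, propose, max_tries: int = 50):
    # Rejection-sample candidates from the language model (`propose`) until
    # one embeds into a valid cluster; fall back to the last candidate.
    allowed = valid_clusters(cluster_of(prev_sentence))
    candidate = None
    for _ in range(max_tries):
        candidate = propose()
        if cluster_of(candidate) in allowed:
            return candidate
    return candidate

In a scheme of this kind, detection would mirror generation: embed each sentence of a suspect text, check membership in the valid set seeded by its predecessor, and test whether the fraction of valid sentences exceeds chance. Paraphrase robustness rests on cluster assignments staying largely stable under meaning-preserving rewrites.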

References (24)
  1. Undetectable watermarks for language models. ArXiv, abs/2306.09194.
  2. Watermarking conditional text generation for AI detection: Unveiling challenges and a semantic-aware watermark remedy. ArXiv, abs/2307.13808.
  3. Measuring and improving semantic diversity of dialogue generation. In Findings of the Association for Computational Linguistics: EMNLP 2022.
  4. SemStamp: A semantic watermark with paraphrastic robustness for text generation. arXiv preprint arXiv:2310.03991.
  5. Piotr Indyk and Rajeev Motwani. 1998. Approximate nearest neighbors: Towards removing the curse of dimensionality. In Proceedings of the Thirtieth Annual ACM Symposium on Theory of Computing, STOC ’98, pages 604–613, New York, NY, USA. Association for Computing Machinery.
  6. A watermark for large language models. arXiv preprint arXiv:2301.10226.
  7. On the reliability of watermarks for large language models.
  8. Paraphrasing evades detectors of AI-generated text, but retrieval is an effective defense. arXiv preprint arXiv:2303.13408.
  9. BookSum: A collection of datasets for long-form narrative summarization. arXiv preprint arXiv:2105.08209.
  10. Robust distortion-free watermarks for language models. ArXiv, abs/2307.15593.
  11. A semantic invariant robust watermark for large language models. arXiv preprint arXiv:2310.06356.
  12. Stuart P. Lloyd. 1982. Least squares quantization in PCM. IEEE Transactions on Information Theory, 28(2):129–137.
  13. Model cards for model reporting. In Proceedings of the Conference on Fairness, Accountability, and Transparency, FAT* ’19, pages 220–229, New York, NY, USA. Association for Computing Machinery.
  14. OpenAI. 2022. ChatGPT.
  15. Exploring the limits of transfer learning with a unified text-to-text transformer. Journal of Machine Learning Research (JMLR).
  16. Can AI-generated text be reliably detected?
  17. Towards codable text watermarking for large language models. ArXiv, abs/2307.15992.
  18. Paraphrastic representations at scale. In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing: System Demonstrations, pages 379–388, Abu Dhabi, UAE. Association for Computational Linguistics.
  19. Robust multi-bit natural language watermarking through invariant features. In Annual Meeting of the Association for Computational Linguistics.
  20. PEGASUS: Pre-training with extracted gap-sentences for abstractive summarization. In International Conference on Machine Learning (ICML).
  21. OPT: Open Pre-trained Transformer Language Models. arXiv preprint arXiv:2205.01068.
  22. BERTScore: Evaluating text generation with BERT. In International Conference on Learning Representations (ICLR).
  23. Generating informative and diverse conversational responses via adversarial information maximization. In NeurIPS.
  24. Provable robust watermarking for AI-generated text. arXiv preprint arXiv:2306.17439.