MarkLLM: An Open-Source Toolkit for LLM Watermarking
Abstract: LLM watermarking, which embeds imperceptible yet algorithmically detectable signals in model outputs to identify LLM-generated text, has become crucial in mitigating the potential misuse of LLMs. However, the abundance of LLM watermarking algorithms, their intricate mechanisms, and the complexity of evaluation procedures and perspectives make it difficult for researchers and the broader community to experiment with, understand, and assess the latest advancements. To address these issues, we introduce MarkLLM, an open-source toolkit for LLM watermarking. MarkLLM offers a unified and extensible framework for implementing LLM watermarking algorithms, with user-friendly interfaces that ensure ease of access. Furthermore, it deepens understanding by supporting automatic visualization of the underlying mechanisms of these algorithms. For evaluation, MarkLLM offers a comprehensive suite of 12 tools spanning three perspectives, along with two types of automated evaluation pipelines. Through MarkLLM, we aim to support researchers while improving the comprehension and involvement of the general public in LLM watermarking technology, fostering consensus and driving further advancements in research and application. Our code is available at https://github.com/THU-BPM/MarkLLM.
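To make "algorithmically detectable signals" concrete, below is a minimal, self-contained sketch of the green-list watermarking idea (the KGW scheme of Kirchenbauer et al., one family of algorithms a toolkit like MarkLLM implements). This is an illustrative toy, not MarkLLM's API: the vocabulary size, hash seeding, and helper names are assumptions for the example. Generation biases sampling toward a context-dependent "green" subset of the vocabulary; detection counts green tokens and computes a z-score against the no-watermark expectation.

```python
import hashlib
import math
import random

VOCAB_SIZE = 50   # toy vocabulary; real models have tens of thousands of tokens
GAMMA = 0.5       # fraction of the vocabulary placed in the "green" list

def green_list(prev_token: int) -> set[int]:
    """Pseudo-randomly partition the vocabulary, seeded by the previous token.

    Generator and detector share this function, so the detector can recompute
    each position's green list without access to the model.
    """
    seed = int(hashlib.sha256(str(prev_token).encode()).hexdigest(), 16)
    rng = random.Random(seed)
    perm = rng.sample(range(VOCAB_SIZE), VOCAB_SIZE)
    return set(perm[: int(GAMMA * VOCAB_SIZE)])

def z_score(tokens: list[int]) -> float:
    """Detection statistic: how far the green-token count exceeds chance.

    Under no watermark, each token lands in its green list with probability
    GAMMA, so hits ~ Binomial(n, GAMMA); a large z flags watermarked text.
    """
    n = len(tokens) - 1
    hits = sum(1 for prev, tok in zip(tokens, tokens[1:])
               if tok in green_list(prev))
    return (hits - GAMMA * n) / math.sqrt(n * GAMMA * (1 - GAMMA))

# "Generation" with a hard green-list constraint (a real scheme adds a soft
# logit bias instead of sampling greens exclusively).
rng = random.Random(0)
watermarked = [0]
for _ in range(30):
    watermarked.append(rng.choice(sorted(green_list(watermarked[-1]))))

z = z_score(watermarked)  # every continuation token is green, so z is large
```

Because every appended token is drawn from its green list, hits equals n here and the z-score is (1 - GAMMA) * n / sqrt(n * GAMMA * (1 - GAMMA)) ≈ 5.48 for n = 30, well above a typical detection threshold of 4. Unwatermarked text hovers near z ≈ 0.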