Performance-lossless Black-box Model Watermarking

Published 11 Dec 2023 in cs.CR | (2312.06488v2)

Abstract: With the development of deep learning, high-value and high-cost models have become valuable assets, and related intellectual property protection technologies have become a hot topic. However, existing model watermarking work in black-box scenarios mainly originates from training-based backdoor methods, which probably degrade primary task performance. To address this, we propose a branch backdoor-based model watermarking protocol to protect model intellectual property, where a construction based on a message authentication scheme is adopted as the branch indicator after a comparative analysis with secure cryptographic technologies primitives. We prove the lossless performance of the protocol by reduction. In addition, we analyze the potential threats to the protocol and provide a secure and feasible watermarking instance for LLMs.