KinGuard: Hierarchical Kinship-Aware Fingerprinting to Defend Against Large Language Model Stealing
Abstract: Protecting the intellectual property of LLMs requires robust ownership verification. Conventional backdoor fingerprinting, however, is flawed by a stealth-robustness paradox: to be robust, these methods force models to memorize fixed responses to high-perplexity triggers, but this targeted overfitting creates detectable statistical artifacts. We resolve this paradox with KinGuard, a framework that embeds a private knowledge corpus built on structured kinship narratives. Instead of memorizing superficial triggers, the model internalizes this knowledge via incremental pre-training, and ownership is verified by probing its conceptual understanding. Extensive experiments demonstrate KinGuard's superior effectiveness, stealth, and resilience against a battery of attacks including fine-tuning, input perturbation, and model merging. Our work establishes knowledge-based embedding as a practical and secure paradigm for model fingerprinting.
Paper Prompts
Sign up for free to create and run prompts on this paper using GPT-5.
Top Community Prompts
Collections
Sign up for free to add this paper to one or more collections.