Lifting the Veil on Composition, Risks, and Mitigations of the Large Language Model Supply Chain
Abstract: Large language models (LLMs) have had a significant impact on both intelligence and productivity, and numerous enterprises have integrated them into their applications to solve domain-specific tasks. However, integrating LLMs into a specific scenario is a systematic process involving numerous components, which are collectively referred to as the LLM supply chain. A comprehensive understanding of the supply chain's composition, and of the relationships among its components, is crucial for devising effective mitigations for the associated risks. While existing literature has explored various risks associated with LLMs, there remains a notable gap in systematically characterizing the LLM supply chain from the dual perspectives of contributors and consumers. In this work, we develop a structured taxonomy encompassing risk types, risky actions, and corresponding mitigations across the different stakeholders and components of the supply chain. We believe that a thorough review of the LLM supply chain's composition, inherent risks, and mitigation measures will help industry practitioners avoid potential damage and losses, and will encourage academic researchers to rethink existing approaches and explore new avenues of research.