Privacy-Preserving Prompt Engineering: A Survey
Abstract: Pre-trained language models (PLMs) have demonstrated significant proficiency in solving a wide range of general natural language processing (NLP) tasks. Researchers have observed a direct correlation between the performance of these models and their sizes. As a result, model sizes have expanded notably in recent years, leading researchers to adopt the term large language models (LLMs) for the larger-sized PLMs. This growth in scale brings a distinct capability called in-context learning (ICL), a special form of prompting that allows a model to be used by presenting demonstration examples in the prompt, without any modification to the model parameters. Despite its appeal, privacy concerns have become a major obstacle to its widespread adoption. Multiple studies have examined the privacy risks associated with ICL and prompting in general and have devised techniques to mitigate these risks; there is therefore a need to organize these mitigation techniques for the benefit of the community. This survey provides a systematic overview of the privacy-protection methods employed during ICL and prompting in general. We review, analyze, and compare the methods under this paradigm, and we summarize the resources available for developing such frameworks. Finally, we discuss the limitations of these frameworks and offer a detailed examination of promising areas that require further exploration.
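To make the ICL mechanism described above concrete, the following is a minimal, illustrative sketch of how a few-shot prompt is assembled from demonstration examples: the model's behavior is steered entirely by the prompt text, with no parameter updates. The task framing (sentiment classification) and the `build_icl_prompt` helper are hypothetical choices for illustration; note that the demonstrations themselves may contain private data, which is precisely the privacy risk this survey addresses.

```python
# Illustrative ICL prompt construction: demonstrations condition the model
# through the input text alone; no fine-tuning or gradient updates occur.
def build_icl_prompt(demonstrations, query):
    """Assemble a few-shot prompt from (text, label) demonstration pairs."""
    lines = []
    for text, label in demonstrations:
        lines.append(f"Review: {text}\nSentiment: {label}")
    # The query is appended in the same format, with the label left blank
    # for the model to complete.
    lines.append(f"Review: {query}\nSentiment:")
    return "\n\n".join(lines)

# Demonstration examples (in practice these might come from a private dataset,
# which is why privacy-preserving ICL methods are needed).
demos = [
    ("The movie was wonderful.", "positive"),
    ("A dull, lifeless film.", "negative"),
]
prompt = build_icl_prompt(demos, "An absolute delight from start to finish.")
print(prompt)
```

The resulting string would then be sent to an LLM; privacy-preserving approaches surveyed here intervene at exactly this point, e.g., by sanitizing or privatizing the demonstrations before they enter the prompt.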