Distributed In-Context Learning under Non-IID Among Clients

Published 31 Jul 2024 in cs.CL and cs.AI (arXiv:2408.00144v1)

Abstract: Advances in LLMs have demonstrated their effectiveness on a range of complex natural language reasoning tasks. A key challenge remains in adapting these models efficiently to new or unfamiliar tasks. In-context learning (ICL) offers a promising solution for few-shot adaptation: a set of data points relevant to a query, called in-context examples (ICE), is retrieved from a training dataset and provided as context during inference. Most existing studies assume a centralized training dataset, yet many real-world datasets are distributed among multiple clients, and remote data retrieval can incur costs. Especially when client data are not independently and identically distributed (non-IID), retrieving a proper set of ICEs for a test query from the clients presents critical challenges. In this paper, we first show that in this setting, because of the non-IIDness, test queries have different preferences among clients, and equal contribution from every client often leads to suboptimal performance. We then introduce a novel approach to the distributed non-IID ICL problem under a data usage budget. The guiding principle is that each client's contribution (budget) should be set according to each query's preference for that client. Our approach allocates a budget to each client in a data-driven manner, tailored to each test query. Through extensive empirical studies on diverse datasets, our framework demonstrates superior performance relative to competing baselines.
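
To make the idea concrete, below is a minimal, hypothetical sketch of per-query budget allocation for distributed ICL. The encoder embed, the mean-cosine affinity score, and the softmax allocation rule are all illustrative assumptions on my part; the paper's actual data-driven allocator is not described in this abstract.

import numpy as np

def embed(text: str) -> np.ndarray:
    """Stand-in sentence encoder (hypothetical); any embedding model would do."""
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    return rng.standard_normal(64)

def allocate_budget(query: str, client_data: list[list[str]], total_budget: int) -> np.ndarray:
    """Split the total ICE budget across clients by query-to-client affinity."""
    q = embed(query)
    affinities = []
    for data in client_data:
        embs = np.stack([embed(x) for x in data])
        # Affinity = mean cosine similarity between the query and a client's data.
        sims = embs @ q / (np.linalg.norm(embs, axis=1) * np.linalg.norm(q) + 1e-9)
        affinities.append(sims.mean())
    a = np.asarray(affinities)
    w = np.exp(a - a.max())        # softmax turns affinities into budget shares
    w /= w.sum()
    budgets = np.floor(w * total_budget).astype(int)
    budgets[np.argmax(w)] += total_budget - budgets.sum()   # repair rounding loss
    return budgets

def retrieve_ice(query: str, client_data: list[list[str]], total_budget: int) -> list[str]:
    """Each client returns its top-k most similar local examples, k = its budget."""
    q = embed(query)
    ice = []
    for data, k in zip(client_data, allocate_budget(query, client_data, total_budget)):
        sims = np.array([embed(x) @ q for x in data])
        for i in np.argsort(sims)[::-1][:k]:
            ice.append(data[i])
    return ice

Under such a rule, a client whose local distribution matches the query receives most of the ICE budget, while poorly matched clients contribute few or no examples; this is the behavior that, per the abstract, an equal per-client contribution fails to capture under non-IID data.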
