Social Skill Training with Large Language Models
Abstract: People rely on social skills like conflict resolution to communicate effectively and to thrive in both work and personal life. However, practice environments for social skills are typically out of reach for most people. How can we make social skill training more available, accessible, and inviting? Drawing upon interdisciplinary research from communication and psychology, this perspective paper identifies social skill barriers to entering specialized fields. We then present a solution that leverages LLMs for social skill training via a generic framework. Our AI Partner, AI Mentor framework merges experiential learning with realistic practice and tailored feedback. This work ultimately calls for cross-disciplinary innovation to address the broader implications for workforce development and social equality.