Cross-cultural Inspiration Detection and Analysis in Real and LLM-generated Social Media Data
Abstract: Inspiration is linked to various positive outcomes, such as increased creativity, productivity, and happiness. Despite this potential, there has been limited effort toward identifying content that is inspiring, as opposed to merely engaging or positive. Additionally, most research has concentrated on Western data, with little attention paid to other cultures. This work is the first to study cross-cultural inspiration through machine learning methods. We aim to identify and analyze real and AI-generated cross-cultural inspiring posts. To this end, we compile and make publicly available the InspAIred dataset, which consists of 2,000 real inspiring posts, 2,000 real non-inspiring posts, and 2,000 generated inspiring posts, evenly distributed across India and the UK. The real posts are sourced from Reddit, while the generated posts are created using the GPT-4 model. Using this dataset, we conduct extensive computational linguistic analyses to (1) compare inspiring content across cultures, (2) compare AI-generated inspiring posts to real inspiring posts, and (3) determine whether detection models can accurately distinguish between inspiring content across cultures and data sources.