Measuring the happiness of large-scale written expression: Songs, Blogs, and Presidents

Published 6 Mar 2017 in cs.SI and physics.soc-ph | (1703.09774v1)

Abstract: The importance of quantifying the nature and intensity of emotional states at the level of populations is evident: we would like to know how, when, and why individuals feel as they do if we wish, for example, to better construct public policy, build more successful organizations, and, from a scientific perspective, more fully understand economic and social phenomena. Here, by incorporating direct human assessment of words, we quantify happiness levels on a continuous scale for a diverse set of large-scale texts: song titles and lyrics, weblogs, and State of the Union addresses. Our method is transparent, improvable, capable of rapidly processing Web-scale texts, and moves beyond approaches based on coarse categorization. Among a number of observations, we find that the happiness of song lyrics trends downward from the 1960's to the mid 1990's while remaining stable within genres, and that the happiness of blogs has steadily increased from 2005 to 2009, exhibiting a striking rise and fall with blogger age and distance from the equator.

Summary

The paper presents a quantitative method that uses the ANEW dataset to compute a frequency-weighted average happiness score for large texts.
It reveals key trends such as declining happiness in song lyrics from the 1960s to the mid-1990s and a steady rise in blog happiness from 2005 to 2009, with peaks around significant cultural events.
The analysis offers practical insights for public policy and sociolinguistic research, paving the way for more nuanced future emotional assessments.

Analysis and Measurement of Happiness in Large-Scale Texts: An Evaluation of Songs, Blogs, and Presidential Addresses

The paper by Dodds and Danforth presents a quantitative approach to analyzing the emotional tone of large-scale written texts, focusing on song titles and lyrics, blogs, and State of the Union addresses. Using the Affective Norms for English Words (ANEW) dataset, the authors introduce a method that quantifies the happiness of these texts from human ratings of the emotional content of individual words.

Methodology and Core Findings

The method computes a text's happiness as a weighted average of word valences taken from the ANEW study, in which human raters scored each word's emotional valence on a scale from 1 to 9; each word is weighted by how often it appears in the text. Large text corpora make the resulting statistics robust, which is especially useful for identifying trends over time and across demographic groups. The approach moves beyond traditional categorical analyses by offering a continuous measure of emotional valence.
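As a rough illustration of the weighted average described above, the following Python sketch scores a text by averaging the valence of the rated words it contains, weighted by how often each appears. The `TOY_VALENCE` table, the tokenizer, and the function names are assumptions made for this example; the valence numbers are made up and are not actual ANEW ratings.

```python
import re
from collections import Counter

# Hypothetical valence scores on the 1-9 scale (NOT real ANEW values).
TOY_VALENCE = {"love": 8.0, "happy": 8.0, "rain": 5.0, "war": 2.0}


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def average_valence(text, valence=TOY_VALENCE):
    """Frequency-weighted mean valence of the rated words in `text`.

    Words without a rating are ignored. Returns None when the text
    contains no rated words at all.
    """
    counts = Counter(w for w in tokenize(text) if w in valence)
    total = sum(counts.values())
    if total == 0:
        return None
    return sum(valence[w] * n for w, n in counts.items()) / total


score = average_valence("Love, love, war")
```

Here "love" counts twice and "war" once, so the score is (2 × 8.0 + 1 × 2.0) / 3. Only words present in the rating table influence the result, which is why a larger rated vocabulary gives better coverage of a text.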

A key observation made in the paper is the decreasing trend in the emotional tone of song lyrics from the 1960s to the mid-1990s, a finding substantiated through time series analyses. The authors attribute this trend partly to the rise of new music genres that embraced a broader emotional spectrum, thus impacting aggregate measures of happiness in lyrics. This richness of content is further explored through 'Valence Shift Word Graphs,' allowing for a detailed breakdown of word-level contributions to changes in average valence.
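One way to picture a word-level breakdown of a change in average valence is the sketch below, which splits the difference between a reference text and a comparison text into per-word terms. It is an illustrative decomposition under stated assumptions, not necessarily the exact formula behind the paper's graphs; the function names and the toy valence table in the usage lines are hypothetical.

```python
from collections import Counter


def normalized_freqs(words, valence):
    """Relative frequency of each rated word among all rated words."""
    counts = Counter(w for w in words if w in valence)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("text contains no rated words")
    return {w: n / total for w, n in counts.items()}


def valence_shift(ref_words, comp_words, valence):
    """Return (v_comp - v_ref, per-word contributions to that difference).

    Each contribution is (v_w - v_ref) * (p_comp_w - p_ref_w). Because the
    frequency differences sum to zero, the contributions add up exactly
    to the overall difference in average valence.
    """
    p_ref = normalized_freqs(ref_words, valence)
    p_comp = normalized_freqs(comp_words, valence)
    v_ref = sum(valence[w] * p for w, p in p_ref.items())
    v_comp = sum(valence[w] * p for w, p in p_comp.items())
    contributions = {
        w: (valence[w] - v_ref) * (p_comp.get(w, 0.0) - p_ref.get(w, 0.0))
        for w in set(p_ref) | set(p_comp)
    }
    return v_comp - v_ref, contributions


# Toy example with made-up valence scores (not real ANEW values).
toy = {"love": 8.0, "hate": 2.0}
diff, parts = valence_shift(["love", "love", "hate"],
                            ["love", "hate", "hate"], toy)
```

A word pushes the comparison text's average down either when it is above the reference average and becomes rarer (here "love") or when it is below the reference average and becomes more common (here "hate"). Sorting `parts` by absolute value would give the kind of ranked word list such a graph could display.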

For blogs, the paper documents a steady increase in happiness between 2005 and 2009, with notable peaks around significant cultural and political events, such as the US presidential election and holiday seasons. The demographic analysis shows that happiness in blog posts rises with blogger age, peaks in mid-life, and then declines; the abstract reports a similar rise and fall with distance from the equator.

Practical and Theoretical Implications

The implications of this research are manifold. Pragmatically, the development of a scalable and non-invasive method to assess collective happiness offers potential applications in public policy, marketing, and sociological research. Theoretically, the research invites further exploration into the socio-linguistic dynamics manifesting at the population level and presents an opportunity to refine theories of emotional expression and reception in public texts.

The authors call for future studies to build on this research by expanding the set of words evaluated for emotional content and by exploring metrics beyond the single good-bad dimension of emotional polarity. Methods from artificial intelligence and data mining could also be integrated to improve the granularity and scope of emotional assessments.

Moreover, while the current method captures emotion at the word level, it opens avenues for future work that incorporates phrasal nuance and context-specific interpretation, improving the resolution of emotional valence detection in complex texts.

Conclusion

This research marks a step towards a comprehensive understanding of how large-scale written texts can reflect societal moods and emotional dynamics. By integrating human emotional assessments with analytical rigor, this study provides a foundation for subsequent explorations into the intersection of language, emotion, and public sentiment. There remains a vast field for continued research to refine these methodologies and expand their applicability across diverse linguistic and cultural contexts.
