- The paper introduces BuzzProphet, which integrates LLM-generated rationales with classical regression to improve hashtag popularity prediction.
- The methodology fuses contextual reasoning with raw hashtag features, achieving up to a 2.8% RMSE reduction and a 30% boost in correlation.
- The interpretable and modular design offers practical insights for optimizing social media engagement and advancing future analytics research.
BuzzProphet: Enhancing Hashtag Popularity Prediction with LLM Reasoning
Introduction
The paper "Forecasting the Buzz: Enriching Hashtag Popularity Prediction with LLM Reasoning" proposes BuzzProphet, a framework that predicts the popularity of hashtags on social media platforms by combining large language models (LLMs) with traditional regressors. By pairing the contextual reasoning abilities of LLMs with the numeric precision of classical models, BuzzProphet aims to improve both the accuracy and the interpretability of popularity forecasts.
Figure 1: Comparison of BuzzProphet with prior work.
Background and Motivation
Predicting the popularity of hashtags is crucial for optimizing engagement and resource allocation across social media platforms. Existing methods treat hashtag popularity as a classification task or rely on classical regressors, which fail to capture rich contextual signals. LLMs, while proficient in language understanding and contextual reasoning, struggle with direct numeric prediction. The need for an interpretable, context-aware solution motivated BuzzProphet, which integrates explanatory reasoning from LLMs into a regression framework.
BuzzProphet Framework
BuzzProphet consists of several components designed to utilize the strengths of LLMs and regressors:
- Reasoning Elicitation: The framework instructs LLMs to generate human-readable rationales on key engagement factors such as topical virality, audience reach, and timing advantage. These rationales are used to enrich input features for the regression model.
- Encoding and Fusion: The enriched input, which combines raw hashtag features and LLM-generated rationales, is encoded using a pre-trained LLM. This dense representation facilitates the incorporation of contextual insights from LLMs into the regression process.
- Regression: A regressor such as CatBoost takes the enriched features as input and produces the popularity prediction, aiming for more accurate forecasts than either regressors or LLMs achieve in isolation.
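The three stages above can be illustrated with a minimal sketch. It is not the authors' implementation: the prompt wording, the `post_hour` feature, the example hashtag, and the functions `stub_llm` and `stub_encoder` are all hypothetical stand-ins. A real pipeline would call an actual LLM for the rationale and a pre-trained model for the encoding, then pass the resulting vectors to a regressor such as CatBoost.

```python
import hashlib
import math
from typing import Callable, List

# Engagement factors named in the framework description.
FACTORS = ["topical virality", "audience reach", "timing advantage"]


def build_reasoning_prompt(hashtag: str, post_hour: int) -> str:
    """Build a rationale-eliciting prompt (hypothetical wording)."""
    factor_list = ", ".join(FACTORS)
    return (
        f"Hashtag: #{hashtag}\n"
        f"Posted at hour: {post_hour}\n"
        f"Briefly assess this hashtag in terms of: {factor_list}."
    )


def stub_llm(prompt: str) -> str:
    """Placeholder for a real LLM call; returns a canned rationale."""
    return (
        "Topical virality: moderate. Audience reach: broad. "
        "Timing advantage: posted during peak hours."
    )


def stub_encoder(text: str, dim: int = 8) -> List[float]:
    """Placeholder for a pre-trained encoder.

    Hashes each token into one of `dim` buckets and L2-normalises the
    counts. A real encoder would return a learned dense embedding.
    """
    vec = [0.0] * dim
    for token in text.lower().split():
        digest = hashlib.sha256(token.encode("utf-8")).digest()
        vec[digest[0] % dim] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def enrich_and_encode(
    hashtag: str,
    post_hour: int,
    llm: Callable[[str], str] = stub_llm,
    encoder: Callable[[str], List[float]] = stub_encoder,
) -> List[float]:
    """Elicit a rationale, fuse it with the raw features as text, then encode."""
    rationale = llm(build_reasoning_prompt(hashtag, post_hour))
    enriched = f"#{hashtag} (hour {post_hour}) | rationale: {rationale}"
    return encoder(enriched)


# Hypothetical example input; the vector would be one row of the
# regressor's training matrix.
features = enrich_and_encode("WorldCupFinal", 20)
print(len(features))
```

In this sketch the fusion happens at the text level, before encoding, which matches the description of encoding the enriched input as a whole. Swapping `stub_llm` and `stub_encoder` for real models leaves the rest of the flow unchanged.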
Benchmark and Evaluation
The study introduces HashView, a benchmark dataset of 7,532 hashtags from the social media platform Weibo. HashView supports systematic evaluation by covering a diverse range of topics and engagement patterns (Figure 2 and Figure 3).
Figure 2: Domain distribution of our HashView benchmark.
Figure 3: Temporal distribution of hashtag postings in HashView, bucketized by hour of day.
Experimental Results
BuzzProphet improves on baseline methods, achieving up to a 2.8% reduction in RMSE and a 30% increase in correlation.
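For readers less familiar with the two metrics quoted for BuzzProphet, RMSE and correlation, the sketch below shows how each is computed and how a relative change between a baseline and a new model is expressed. It assumes the correlation is Pearson's coefficient, which this summary does not specify, and all numbers in the example are made up for illustration rather than taken from the paper.

```python
import math
from typing import Sequence


def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root mean squared error: lower is better."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def pearson(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Pearson correlation coefficient: closer to 1 is better."""
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    return cov / math.sqrt(var_t * var_p)


def relative_change(baseline: float, new: float) -> float:
    """Signed relative change of `new` with respect to `baseline`."""
    return (new - baseline) / baseline


# Illustrative values only (not results from the paper).
actual = [2.0, 3.5, 1.0, 4.2]
baseline_pred = [2.6, 3.0, 1.8, 3.5]
enriched_pred = [2.3, 3.3, 1.4, 3.9]

rmse_change = relative_change(rmse(actual, baseline_pred), rmse(actual, enriched_pred))
corr_change = relative_change(pearson(actual, baseline_pred), pearson(actual, enriched_pred))
print(f"RMSE change: {rmse_change:+.1%}, correlation change: {corr_change:+.1%}")
```

Under this convention, a 2.8% RMSE reduction corresponds to a relative change of -0.028 in RMSE, and a 30% correlation boost to a relative change of +0.30 in the correlation coefficient.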
Conclusion
BuzzProphet presents a practical and interpretable approach to hashtag popularity prediction by integrating LLM-generated reasoning within a regression framework. The research underscores the potential of LLMs to serve as contextual reasoners in social media analytics. Future work could extend the approach to other languages and platforms and explore integrating graph-based relational data to improve prediction accuracy. The HashView benchmark gives future research a common basis for cross-comparison and for developing more advanced prediction models.