Forecasting the Buzz: Enriching Hashtag Popularity Prediction with LLM Reasoning

Published 9 Oct 2025 in cs.SI | (2510.08481v1)

Abstract: Hashtag trends ignite campaigns, shift public opinion, and steer millions of dollars in advertising spend, yet forecasting which tag goes viral is elusive. Classical regressors digest surface features but ignore context, while LLMs excel at contextual reasoning but misestimate numbers. We present BuzzProphet, a reasoning-augmented hashtag popularity prediction framework that (1) instructs an LLM to articulate a hashtag's topical virality, audience reach, and timing advantage; (2) utilizes these popularity-oriented rationales to enrich the input features; and (3) regresses on these inputs. To facilitate evaluation, we release HashView, a 7,532-hashtag benchmark curated from social media. Across diverse regressor-LLM combinations, BuzzProphet reduces RMSE by up to 2.8% and boosts correlation by 30% over baselines, while producing human-readable rationales. Results demonstrate that using LLMs as context reasoners rather than numeric predictors injects domain insight into tabular models, yielding an interpretable and deployable solution for social media trend forecasting.

Summary

The paper demonstrates that BuzzProphet integrates LLM-generated rationales with classical regression to enhance hashtag popularity predictions.
The methodology fuses contextual reasoning with raw hashtag features, achieving up to a 2.8% RMSE reduction and a 30% boost in correlation.
The interpretable and modular design offers practical insights for optimizing social media engagement and advancing future analytics research.

BuzzProphet: Enhancing Hashtag Popularity Prediction with LLM Reasoning

Introduction

The paper "Forecasting the Buzz: Enriching Hashtag Popularity Prediction with LLM Reasoning" proposes a novel framework named BuzzProphet to predict the popularity of hashtags on social media platforms by combining LLMs with traditional regressors. By leveraging the contextual reasoning abilities of LLMs and the numeric precision of classical models, BuzzProphet aims to improve the accuracy and interpretability of popularity forecasts.

Figure 1: Comparison of BuzzProphet with prior work.

Background and Motivation

Predicting the popularity of hashtags is crucial for optimizing engagement and resource allocation across social media platforms. Existing methods either treat hashtag popularity as a classification task or rely on classical regressors, and both fail to capture rich contextual signals. LLMs, while proficient in language understanding and contextual reasoning, struggle with direct numeric prediction. The need for an interpretable, context-aware solution motivates BuzzProphet, which integrates explanatory reasoning from LLMs into a regression framework.

BuzzProphet Framework

BuzzProphet consists of several components designed to utilize the strengths of LLMs and regressors:

Reasoning Elicitation: The framework instructs LLMs to generate human-readable rationales on key engagement factors such as topical virality, audience reach, and timing advantage. These rationales are used to enrich input features for the regression model.
Encoding and Fusion: The enriched input, which combines raw hashtag features and LLM-generated rationales, is encoded using a pre-trained LLM. This dense representation facilitates the incorporation of contextual insights from LLMs into the regression process.
Regression: A regressor such as CatBoost takes these enriched features as input and produces the popularity prediction, with the aim of forecasting more accurately than regressors or LLMs used in isolation.
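
The three stages above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the prompt wording, the `posted_hour` feature, the function names, and the `llm` callable are all hypothetical. To stay dependency-free, a hashed bag-of-words vector stands in for the pre-trained encoder and a small gradient-descent linear model stands in for a regressor such as CatBoost.

```python
import hashlib
import math
from typing import Callable, List, Tuple

# The three rationale dimensions named in the paper's abstract.
DIMENSIONS = ("topical virality", "audience reach", "timing advantage")


def build_prompt(hashtag: str, posted_hour: int) -> str:
    """Reasoning elicitation: ask for a rationale per dimension (wording is hypothetical)."""
    dims = "; ".join(DIMENSIONS)
    return (
        f"Hashtag: #{hashtag}\nPosted at hour: {posted_hour}\n"
        f"Explain briefly how each factor may affect popularity: {dims}."
    )


def fuse(hashtag: str, posted_hour: int, rationale: str) -> str:
    """Fusion: join raw features and the LLM rationale into one text input."""
    return f"hashtag={hashtag} hour={posted_hour} rationale={rationale}"


def encode(text: str, dim: int = 64) -> List[float]:
    """Stand-in encoder: hashed bag-of-words, L2-normalised (not a pre-trained model)."""
    vec = [0.0] * dim
    for token in text.lower().split():
        idx = int(hashlib.sha256(token.encode("utf-8")).hexdigest(), 16) % dim
        vec[idx] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def featurize(hashtag: str, posted_hour: int, llm: Callable[[str], str]) -> List[float]:
    """Run elicitation, fusion, and encoding for one hashtag."""
    rationale = llm(build_prompt(hashtag, posted_hour))
    return encode(fuse(hashtag, posted_hour, rationale))


def fit_linear(
    X: List[List[float]], y: List[float],
    lr: float = 0.1, epochs: int = 200, l2: float = 1e-3,
) -> Tuple[List[float], float]:
    """Stand-in regressor: L2-regularised linear model fit by batch gradient descent."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sum(wj * xj for wj, xj in zip(w, xi)) + b - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * (g / n + l2 * wj) for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict(model: Tuple[List[float], float], x: List[float]) -> float:
    """Predict a popularity score from one encoded feature vector."""
    w, b = model
    return sum(wj * xj for wj, xj in zip(w, x)) + b
```

In use, `llm` would wrap a real model call, and the vectors returned by `featurize` would be passed to whichever regressor is being evaluated. Keeping the LLM on the text side and leaving the numeric output to the regressor mirrors the division of labour the paper argues for.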

Benchmark and Evaluation

The study introduces HashView, a benchmark dataset of 7,532 hashtags collected from the social media platform Weibo. HashView supports systematic evaluation by covering a diverse range of topics and engagement patterns (Figure 2 and Figure 3).

Figure 2: Domain distribution of our HashView benchmark.

Figure 3: Temporal distribution of hashtag postings in HashView, bucketized by hour of day.

Experimental Results

BuzzProphet improves on baseline methods:

Across regressor-LLM combinations, it reduces RMSE by up to 2.8% and boosts correlation by 30% relative to baselines, that is, classical regressors or LLMs used on their own.
The modular design makes individual components easy to update, and the structured explanations generated by the LLM keep predictions interpretable (Figure 4).

Figure 4: Illustration of how BuzzProphet generates more accurate predictions through interpretable reasoning. (Orange: LLM predictions for the three dimensions; blue: explanations about their potential influence on hashtag popularity.)
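
For readers who want to reproduce this kind of comparison, the sketch below computes RMSE, a correlation coefficient, and the relative change between a baseline and an enriched model. It rests on assumptions: the summary does not say which correlation measure is reported, so Pearson is used here purely for illustration, and the function names are hypothetical.

```python
import math
from typing import Sequence


def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root mean squared error between targets and predictions."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation (assumed; the paper may report a different measure)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def relative_change(baseline: float, new: float) -> float:
    """Percentage change of `new` relative to `baseline` (negative means a reduction)."""
    return (new - baseline) / baseline * 100.0
```

As a purely illustrative example with made-up numbers, a baseline RMSE of 1.000 and an enriched-model RMSE of 0.972 give `relative_change(1.000, 0.972)` of about -2.8, which is the form a "2.8% RMSE reduction" would take.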

Conclusion

BuzzProphet presents a practical and interpretable approach to hashtag popularity prediction by integrating LLM-generated reasoning within a regression framework. The research underscores the potential of LLMs to serve as contextual reasoners in social media analytics. Future work could extend the approach to other languages and platforms, and could integrate graph-based relational data to improve prediction accuracy. As a public benchmark, HashView gives future studies a common basis for comparing and developing prediction models.
