Selecting Cut-off Thresholds for Topic Prevalences in Topic Modeling
Determine principled, generalizable cut-off thresholds for topic prevalences (the expected topic proportions produced by topic modeling algorithms), so that empirical analyses have a clear criterion for when a prevalence value is large enough to warrant substantive discussion and interpretation.
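To make the object of the question concrete, here is a minimal sketch of how a cut-off would be applied once one is chosen. It assumes prevalence is computed as the mean of each topic's proportion across documents. The matrix `theta`, the helper names `topic_prevalences` and `topics_above_cutoff`, and both cut-off values are hypothetical and chosen only for illustration; none of them comes from the cited paper.

```python
from statistics import fmean


def topic_prevalences(theta):
    """Average each topic's proportion over all documents.

    theta: list of per-document topic distributions (each row sums to 1).
    """
    n_topics = len(theta[0])
    return [fmean([row[k] for row in theta]) for k in range(n_topics)]


def topics_above_cutoff(prevalences, cutoff):
    """Return the indices of topics whose prevalence meets the cut-off."""
    return [k for k, p in enumerate(prevalences) if p >= cutoff]


# Hypothetical document-topic matrix: 4 documents, 3 topics.
theta = [
    [0.70, 0.25, 0.05],
    [0.60, 0.30, 0.10],
    [0.50, 0.45, 0.05],
    [0.80, 0.20, 0.00],
]

prevalences = topic_prevalences(theta)

# Both cut-offs are arbitrary; no principled value is known.
selected_loose = topics_above_cutoff(prevalences, cutoff=0.10)
selected_strict = topics_above_cutoff(prevalences, cutoff=0.50)
```

With these made-up numbers, the looser cut-off keeps two topics for discussion and the stricter one keeps only one, so the set of topics an analysis interprets depends directly on a value for which no principled choice is currently available.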
References
In topic modeling, absolute values are often the focus of discussion due to the open question of which cut-off values should be considered appropriate when discussing topic prevalences.
— Quantitative Tools for Time Series Analysis in Natural Language Processing: A Practitioners Guide
(2404.18499 - Schmal, 2024) in Subsection “Non-stationarity in topic modeling” (Section: Stationarity)