- The paper demonstrates that integrating quantitative market indicators with qualitative social sentiment data significantly enhances prediction accuracy.
- It applies correlation-based feature selection to obtain 36 market features, combines them with daily sentiment vectors extracted via FinBERT, and feeds the fused input to a two-layer BiLSTM network.
- Evaluation on Bitcoin data reports 96.8% accuracy and a 99.6% MCC, and backtesting indicates strong potential for profitable trading strategies.
Summary of "A Multisource Fusion Framework for Cryptocurrency Price Movement Prediction"
Introduction and Objective
Cryptocurrency markets are highly volatile and have complex dynamics, which makes accurate price trend prediction difficult. The paper proposes a multisource fusion framework that combines quantitative market indicators with qualitative sentiment signals from social media, particularly tweets from X (formerly Twitter), to improve prediction accuracy. Sentiment analysis uses Financial Bidirectional Encoder Representations from Transformers (FinBERT) to process financial text, while sequential dependencies are modeled with a Bidirectional Long Short-Term Memory (BiLSTM) network. On a large Bitcoin dataset the approach achieves an accuracy of 96.8%, a result that underscores the importance of sentiment data in financial predictions.
The Proposed Model
The model integrates heterogeneous data sources for cryptocurrency price prediction:
- Quantitative Market Information: This includes historical trade data and technical indicators from Yahoo Finance. A correlation-based feature selection method was applied to derive a set of 36 uncorrelated features from an initial set of 53. The feature vector is represented as a matrix H_t within a rolling time window of size T.
- Qualitative Sentiment Information: Tweets related to Bitcoin are preprocessed and sentiment-extracted using FinBERT. A daily sentiment vector s_t is calculated and represented as a matrix S_t over the same rolling window.
- Fusion Strategy: The quantitative and qualitative features are concatenated into a combined matrix X, which serves as input to a two-layer BiLSTM architecture for sequential pattern analysis. Dropout layers are employed to prevent overfitting.
Figure 1: Overview of the proposed framework.
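The correlation-based selection of market features can be illustrated with a greedy filter that drops any feature whose absolute Pearson correlation with an already-kept feature exceeds a threshold. This is only a sketch of one common variant: the summary does not state the paper's exact rule or cutoff, so the 0.9 threshold, the function names, and the toy feature names are all assumptions.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # constant column: treat as uncorrelated
    return cov / sqrt(var_x * var_y)


def select_features(columns, threshold=0.9):
    """Greedily keep features not strongly correlated with any kept one.

    columns: dict mapping feature name -> list of values (same length).
    threshold: hypothetical cutoff; the paper's value is not given here.
    """
    kept = []
    for name, values in columns.items():
        if all(abs(pearson(values, columns[k])) <= threshold for k in kept):
            kept.append(name)
    return kept


# Toy data: "high" is almost a copy of "close", "volume" is not.
toy = {
    "close": [1.0, 2.0, 3.0, 4.0, 5.0],
    "high": [1.1, 2.1, 3.2, 4.1, 5.2],
    "volume": [5.0, 1.0, 4.0, 2.0, 3.0],
}
selected = select_features(toy)
```

In this toy case "high" is dropped because it nearly duplicates "close", while "volume" is kept. The result of a greedy filter depends on the order in which features are visited, which is one reason the paper's actual procedure may differ.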
Figure 2: The proposed model for final prediction of cryptocurrency price movements.
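The construction of the fused input can be sketched without a deep learning library: for each day, the last T rows of market features (H_t) are placed next to the last T daily sentiment vectors (S_t) to form one window X. This is a minimal sketch under stated assumptions. Averaging per-tweet scores into the daily vector, the three-class score format, the function names, and the toy numbers are illustrative and not taken from the paper.

```python
def daily_sentiment(tweet_scores):
    """Average per-tweet class probabilities into one daily vector.

    tweet_scores: list of [positive, negative, neutral] probabilities,
    as a FinBERT-style classifier might return (assumed format).
    """
    n = len(tweet_scores)
    return [sum(col) / n for col in zip(*tweet_scores)]


def fuse_windows(market_rows, sentiment_rows, window):
    """Build one fused matrix X per day from a rolling window.

    market_rows[i] and sentiment_rows[i] describe the same day i.
    Each X has `window` rows; each row is the market features followed
    by that day's sentiment vector (feature-wise concatenation).
    """
    assert len(market_rows) == len(sentiment_rows)
    windows = []
    for end in range(window, len(market_rows) + 1):
        x = [market_rows[i] + sentiment_rows[i] for i in range(end - window, end)]
        windows.append(x)
    return windows


# Toy example: 4 days, 2 market features, 3 sentiment scores, T = 3.
market = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0], [4.0, 40.0]]
tweets_per_day = [
    [[0.5, 0.25, 0.25], [1.0, 0.0, 0.0]],
    [[0.0, 1.0, 0.0]],
    [[0.25, 0.25, 0.5]],
    [[0.5, 0.5, 0.0], [0.5, 0.0, 0.5]],
]
sentiment = [daily_sentiment(day) for day in tweets_per_day]
X = fuse_windows(market, sentiment, window=3)
```

Here each window has shape 3 x 5. With the settings reported in the summary (36 market features, T=21), each window would have 21 rows, and each row would hold the 36 market features plus the sentiment scores. A sequence model such as a BiLSTM would then read these rows in order.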
Experimental Evaluation
Experiments used a dataset spanning April 2015 to December 2022, partitioned into training, validation, and test sets (70%, 15%, and 15%, respectively). Hyperparameters were tuned via grid search, with a window size of T=21 giving the best results.
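For time-series data, a 70/15/15 partition is usually cut in chronological order so that validation and test samples come after the training period. The summary does not say how the paper ordered its partitions, so the chronological cut below is an assumption, as is the function name.

```python
def chronological_split(samples, train_pct=70, val_pct=15):
    """Split an ordered sequence into train/validation/test without shuffling.

    Percentages are integers so the cut points use exact integer arithmetic;
    whatever remains after train and validation becomes the test set.
    """
    n = len(samples)
    train_end = n * train_pct // 100
    val_end = train_end + n * val_pct // 100
    return samples[:train_end], samples[train_end:val_end], samples[val_end:]


days = list(range(100))  # stand-in for 100 daily samples, oldest first
train, val, test = chronological_split(days)
```

Keeping the order intact means the model is never evaluated on days that precede its training data, which avoids one form of look-ahead leakage.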
Figure 3: Performance metrics for machine learning models.
Baselines include classical machine learning (ML) models and deep learning (DL) models. Among the DL baselines, StackedLSTM, CNNLSTM, and ordinary LSTM architectures were evaluated and differed in how well they handled sequential data.
The proposed BiLSTM model consistently outperformed all baseline methods. Evaluation metrics included accuracy, precision, recall, F1-score, and the Matthews correlation coefficient (MCC), with the BiLSTM model achieving an MCC of 99.6%.
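For reference, MCC is computed from the four cells of a binary confusion matrix and ranges from -1 to 1, so a value quoted as a percentage corresponds to that fraction of 1. The sketch below uses the standard formula; the counts are made up for illustration and are not the paper's results.

```python
from math import sqrt


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient for a binary confusion matrix."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0  # common convention when a row or column is empty
    return (tp * tn - fp * fn) / denom


def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


# Illustrative counts only (not taken from the paper).
tp, tn, fp, fn = 45, 45, 5, 5
score = mcc(tp, tn, fp, fn)
acc = accuracy(tp, tn, fp, fn)
```

With these toy counts the accuracy is 0.9 and the MCC is 0.8. Unlike accuracy, MCC uses all four cells of the confusion matrix, which makes it informative when the classes are imbalanced.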
Figure 4: Cumulative trading capital (TMoney) of different models.
The model was also evaluated in backtesting simulations of trading performance, where it yielded higher returns than traditional strategies and the baseline models.
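A backtest of this kind can be approximated with a simple long-or-cash rule: hold Bitcoin on days where the model predicts an upward move, stay in cash otherwise, and track the resulting capital. The paper's actual trading rule, fees, and starting capital are not given in this summary, so everything below (the name `backtest`, the starting capital, the toy prices) is a hypothetical sketch.

```python
def backtest(prices, predictions, start_capital=1000.0):
    """Track capital under a long-or-cash rule.

    prices: closing prices, one per day.
    predictions[i]: 1 if the model expects the price to rise from day i
    to day i + 1, else 0. Fees and slippage are ignored.
    Returns the capital curve, one value per day.
    """
    capital = start_capital
    curve = [capital]
    for i in range(len(prices) - 1):
        if predictions[i] == 1:
            capital *= prices[i + 1] / prices[i]  # invested for this step
        curve.append(capital)
    return curve


prices = [100.0, 110.0, 99.0, 108.9]
perfect = backtest(prices, [1, 0, 1])       # sits out the drop after day 1
buy_and_hold = backtest(prices, [1, 1, 1])  # always invested
```

In this toy run the perfectly timed strategy ends near 1210 while buy-and-hold ends near 1089. A realistic backtest would also need to account for transaction costs and use only out-of-sample predictions.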
Figure 5: Performance metrics results of models: (a) LSTM, (b) BiLSTM (Proposed model), (c) CNNLSTM, and (d) StackedLSTM.
Key Insights
Two major findings are emphasized: the inclusion of sentiment data significantly boosts predictive performance, and domain-specific sentiment models like FinBERT offer superior results over alternatives such as RoBERTa. Integrating real-time social sentiment data and technical indicators provides a comprehensive approach to cryptocurrency market analysis.
Future Research Directions
To increase the robustness and applicability of the framework, future research could explore weighted news impact measurement and multilingual sentiment analysis support. Additionally, analyzing broader social media platforms and integrating external economic factors could enhance prediction accuracy in dynamic cryptocurrency markets.
Conclusion
This multisource fusion framework provides substantial improvements in cryptocurrency price movement prediction by leveraging technical and sentiment information. Its demonstrated accuracy and financial profitability underscore its potential benefits for investment strategies and automated trading systems, stressing the value of heterogeneous data sources in financial forecasting.
Overall, the proposed approach offers practical insights for using social sentiment in conjunction with traditional financial indicators, paving the way for more informed and profitable trading decisions in volatile cryptocurrency markets.