Combined machine learning for stock selection strategy based on dynamic weighting methods

Published 26 Aug 2025 in q-fin.ST | (2508.18592v1)

Abstract: This paper proposes a novel stock selection strategy framework based on combined machine learning algorithms. Two types of weighting methods for three representative machine learning algorithms are developed to predict the returns of the stock selection strategy. One is static weighting based on model evaluation metrics, the other is dynamic weighting based on Information Coefficients (IC). Using CSI 300 index data, we empirically evaluate the strategy's backtested performance and model predictive accuracy. The main results are as follows: (1) The strategy by combined machine learning algorithms significantly outperforms single-model approaches in backtested returns. (2) IC-based weighting (particularly IC_Mean) demonstrates greater competitiveness than evaluation-metric-based weighting in both backtested returns and predictive performance. (3) Factor screening substantially enhances the performance of combined machine learning strategies.

Summary

The paper's main contribution is its novel dynamic weighting method using Information Coefficients (IC) to improve stock selection accuracy.
It integrates static and dynamic weighting approaches across Ridge Regression, MLP, and Random Forest models for enhanced predictive performance.
Empirical findings on the CSI 300 index show a 39.09% return and improved Sharpe ratios, indicating robust risk-adjusted returns.

Combined Machine Learning for Stock Selection Strategy Based on Dynamic Weighting Methods

Introduction

The research paper "Combined machine learning for stock selection strategy based on dynamic weighting methods" (2508.18592) introduces a stock selection framework that combines several machine learning algorithms rather than relying on a single model, with the aim of improving both predictive accuracy and backtested returns. Its main contribution is a pair of weighting methods, one static and one dynamic, applied to three machine learning models: Ridge Regression, Multilayer Perceptron (MLP), and Random Forest. The dynamic method, based on Information Coefficients (IC), proves more competitive than weighting based on conventional evaluation metrics.

Methodology

Static Weighting

The first part of the methodology is static weighting based on conventional model evaluation metrics such as RMSE, MAPE, precision, recall, and F1-score. A model that performs better on these metrics receives a greater weight in the ensemble. Although useful for measuring prediction error, these metrics do not effectively capture whether a model predicts the direction of stock price movements correctly.
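A minimal sketch of static weighting follows. It assumes inverse-RMSE normalization as the weighting rule, which is an illustrative choice: the summary does not state the paper's exact formula. The function names `rmse` and `static_weights` and all of the return values are hypothetical.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root-mean-square error between realized and predicted returns."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def static_weights(errors):
    """Map per-model errors to weights that sum to 1 (lower error, larger weight).

    Assumption: inverse-error normalization, used here for illustration only.
    """
    inverse = 1.0 / np.asarray(errors, dtype=float)
    return inverse / inverse.sum()

# Hypothetical validation-set returns and predictions for the three models.
y_true = np.array([0.02, -0.01, 0.03, 0.00])
predictions = {
    "ridge": np.array([0.01, -0.02, 0.02, 0.01]),
    "mlp": np.array([0.03, 0.01, 0.00, -0.02]),
    "random_forest": np.array([0.02, -0.01, 0.01, 0.00]),
}

errors = [rmse(y_true, p) for p in predictions.values()]
weights = static_weights(errors)

# The weights are fixed once and reused for every later prediction.
combined = sum(w * p for w, p in zip(weights, predictions.values()))
print(dict(zip(predictions, weights.round(3))))
```

Because the weights are computed once from a validation sample, they do not react to later changes in each model's predictive quality, which is the limitation the dynamic scheme targets.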

Dynamic Weighting

The dynamic weighting approach is the core innovation of this work. It uses the Information Coefficient (IC), the correlation between predicted and actual returns, so that model assessment reflects both the direction and the relative magnitude of predictions. The IC is computed as a Spearman rank correlation, which is less sensitive to outliers than a correlation on raw values and therefore more robust under fluctuating market conditions. Two schemes are devised:

IC Mean: This scheme calculates a moving average of IC values over a specified window, so that model weights are updated each period according to recent predictive performance.
IC Ratio: This scheme adjusts the IC Mean by the standard deviation of IC over the window, which keeps the weights stable when a model's predictive performance is volatile.
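The two schemes can be sketched as follows. Several details are assumptions rather than the paper's stated method: the IC Ratio is taken to be the rolling mean of IC divided by its rolling standard deviation, negative scores are clipped to zero before normalizing, and the helper names (`rank`, `spearman_ic`, `ic_weights`) and IC values are hypothetical.

```python
import numpy as np

def rank(x):
    """Ordinal ranks 0..n-1; ties are broken by position, not averaged."""
    x = np.asarray(x, dtype=float)
    order = np.argsort(x, kind="stable")
    ranks = np.empty(len(x), dtype=float)
    ranks[order] = np.arange(len(x))
    return ranks

def spearman_ic(predicted, actual):
    """Spearman IC: Pearson correlation of the ranks of predicted and actual returns."""
    return float(np.corrcoef(rank(predicted), rank(actual))[0, 1])

def ic_weights(ic_history, window, scheme="mean"):
    """Model weights from the most recent `window` periods of IC values.

    ic_history has shape (periods, models).
    scheme="mean":  score is the rolling mean of IC.
    scheme="ratio": score is rolling mean / rolling std (assumed form of IC Ratio).
    """
    recent = np.asarray(ic_history, dtype=float)[-window:]
    score = recent.mean(axis=0)
    if scheme == "ratio":
        spread = recent.std(axis=0, ddof=1)
        score = score / np.maximum(spread, 1e-12)  # guard against zero spread
    score = np.clip(score, 0.0, None)  # assumption: no weight for negative scores
    total = score.sum()
    if total == 0.0:
        return np.full(score.shape, 1.0 / score.size)  # fall back to equal weights
    return score / total

# Hypothetical per-period ICs for Ridge, MLP, and Random Forest.
ic_history = np.array([
    [0.05, 0.02, 0.04],
    [0.07, -0.01, 0.03],
    [0.06, 0.03, 0.05],
    [0.08, 0.00, 0.04],
])
w_mean = ic_weights(ic_history, window=3, scheme="mean")
w_ratio = ic_weights(ic_history, window=3, scheme="ratio")
print(w_mean.round(3), w_ratio.round(3))
```

Re-running `ic_weights` each period with the latest IC row appended gives the rolling update; the ratio scheme additionally penalizes a model whose IC swings widely within the window.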

Results

Empirical analyses on CSI 300 index data show that dynamic IC-based weighting has significant advantages over static metric-based weighting. Key results include:

The strategy return under $IC_{Mean}$ weighting reaches 39.09%, higher than the returns of the single models and of the static metric-based combinations.
Sharpe ratios indicate improved risk-adjusted returns, substantially exceeding those derived from conventional single-model methods.
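For reference, the two backtest measures mentioned above can be computed as in the sketch below. The annualization factor of 252 trading days, the zero default risk-free rate, and the function names are assumptions for illustration; the summary does not give the paper's conventions.

```python
import numpy as np

def cumulative_return(returns):
    """Compounded return over the whole backtest from per-period simple returns."""
    returns = np.asarray(returns, dtype=float)
    return float(np.prod(1.0 + returns) - 1.0)

def sharpe_ratio(returns, risk_free=0.0, periods_per_year=252):
    """Annualized Sharpe ratio: mean excess return divided by its standard deviation.

    risk_free is an annual rate, spread evenly across the periods in a year.
    """
    excess = np.asarray(returns, dtype=float) - risk_free / periods_per_year
    return float(np.sqrt(periods_per_year) * excess.mean() / excess.std(ddof=1))

# Hypothetical daily strategy returns.
daily = np.array([0.004, -0.002, 0.003, 0.001, -0.001, 0.005])
print(cumulative_return(daily), sharpe_ratio(daily))
```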

Factor screening, performed using Lasso regression, further optimizes predictive accuracy by eliminating redundant or ineffective factors, enhancing speed and adaptability in model execution.
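The sketch below illustrates Lasso-based factor screening: factors whose coefficients are shrunk exactly to zero are dropped. It is a small coordinate-descent implementation written for illustration, not the paper's code; in practice a library implementation such as scikit-learn's `Lasso` would normally be used. The penalty `alpha`, the synthetic data, and the function names are assumptions.

```python
import numpy as np

def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold; return 0 if it lies within the threshold."""
    return np.sign(value) * max(abs(value) - threshold, 0.0)

def lasso_coordinate_descent(X, y, alpha, n_iter=200):
    """Minimize (1 / 2n) * ||y - X b||^2 + alpha * ||b||_1 by cyclic coordinate descent."""
    n, p = X.shape
    beta = np.zeros(p)
    col_scale = (X ** 2).sum(axis=0) / n
    residual = y - X @ beta
    for _ in range(n_iter):
        for j in range(p):
            residual += X[:, j] * beta[j]   # remove factor j's current contribution
            rho = X[:, j] @ residual / n    # covariance of factor j with the partial residual
            beta[j] = soft_threshold(rho, alpha) / col_scale[j]
            residual -= X[:, j] * beta[j]   # add the updated contribution back
    return beta

def screen_factors(X, y, alpha):
    """Return the indices of factors with non-zero Lasso coefficients."""
    X_std = (X - X.mean(axis=0)) / X.std(axis=0)  # standardize so one penalty fits all
    beta = lasso_coordinate_descent(X_std, y - y.mean(), alpha)
    return np.flatnonzero(beta != 0.0)

# Synthetic example: five candidate factors, only factors 0 and 2 drive returns.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = 0.5 * X[:, 0] - 0.3 * X[:, 2] + 0.01 * rng.normal(size=200)
selected = screen_factors(X, y, alpha=0.05)
print(selected)
```

The retained columns would then be passed to the three models in place of the full factor set. A larger `alpha` removes more factors, so the penalty is typically tuned on validation data.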

Implications and Future Directions

The implications of this work extend beyond quantitative finance to other applied domains such as healthcare analytics and macroeconomic forecasting, where adaptive model weighting matters because the environment changes over time. Future research can explore deep learning architectures, such as Transformer models and LSTM networks, to refine temporal modeling. Additionally, interpretability techniques such as SHAP or LIME can be integrated to clarify model decisions and feature relevance.

Conclusion

The combination of machine learning for stock selection with dynamic weighting methods presents a significant progression in quantitative finance. The paper contributes by addressing traditional model limitations through innovative use of Information Coefficient-based dynamic weighting, resulting in superior predictive accuracy and robust return profiles. This strategy not only exemplifies effective algorithmic integration but also provides a scalable framework applicable across diverse predictive environments.
