StockGPT: A GenAI Model for Stock Prediction and Trading

Published 7 Apr 2024 in q-fin.CP, cs.AI, q-fin.PM, q-fin.PR, and q-fin.ST | (2404.05101v3)

Abstract: This paper introduces StockGPT, an autoregressive "number" model trained and tested on 70 million daily U.S. stock returns over nearly 100 years. Treating each return series as a sequence of tokens, StockGPT automatically learns the hidden patterns predictive of future returns via its attention mechanism. On a held-out test sample from 2001 to 2023, daily and monthly rebalanced long-short portfolios formed from StockGPT predictions yield strong performance. The StockGPT-based portfolios span momentum and long-/short-term reversals, eliminating the need for manually crafted price-based strategies, and yield highly significant alphas against leading stock market factors, suggesting a novel AI pricing effect. This highlights the immense promise of generative AI in surpassing human in making complex financial investment decisions.

Summary

The paper introduces StockGPT, a transformer-based model that tokenizes continuous stock return data for prediction and trading.
It uses a decoder-only architecture with four attention blocks operating on 256-day return sequences. Trained on historical returns, it produces a daily rebalanced long-short portfolio that earns a 119% annual return with a Sharpe ratio of 6.5 on held-out data.
Its predictions add value beyond benchmark factors, and the approach could potentially be scaled to high-frequency data.

"StockGPT: A GenAI Model for Stock Prediction and Trading" (2404.05101)

Introduction

The paper presents StockGPT, an autoregressive generative AI model developed specifically for stock return prediction and trading. StockGPT uses the transformer architecture common in NLP but adapts it to continuous numerical stock return data. By converting historical stock returns into sequences of tokens, StockGPT captures temporal dependencies through its attention mechanism and learns predictive patterns directly from the data, rather than relying on manually crafted price-based trading strategies.

Model Architecture

StockGPT is built on a decoder-only transformer architecture, similar in form to that used by ChatGPT. This choice stems from the natural compatibility of the transformer model with time-series data, where the prediction of future values depends on previously observed data.
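The dependence on previously observed data is what a causal (masked) attention step enforces. The sketch below is a minimal illustration in plain Python, not the paper's implementation: the function name `causal_attention_weights` and the toy score matrix are assumptions made for this example.

```python
import math

def causal_attention_weights(scores):
    """Row-wise softmax over a square score matrix with future positions masked.

    scores[i][j] is the raw affinity of position i (query) for position j (key).
    Position i may only attend to positions j <= i.
    """
    weights = []
    for i, row in enumerate(scores):
        visible = row[: i + 1]                       # drop keys from the future
        m = max(visible)                             # subtract max for numerical stability
        exps = [math.exp(s - m) for s in visible]
        total = sum(exps)
        # Masked (future) positions receive exactly zero weight.
        weights.append([e / total for e in exps] + [0.0] * (len(row) - i - 1))
    return weights

# Toy scores for a three-day sequence (hypothetical values).
scores = [[0.0, 5.0, 5.0],
          [1.0, 1.0, 5.0],
          [0.0, 0.0, 0.0]]
for row in causal_attention_weights(scores):
    print([round(w, 3) for w in row])
```

Even though day 1 has large scores for days 2 and 3, the mask forces all of its weight onto itself, so no information from the future can leak into a prediction.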

Figure 1: StockGPT Architecture

The model adopts a light configuration of four attention blocks, each containing four self-attention heads, resulting in approximately one million parameters. Each input sequence consists of 256 consecutive daily returns for a single stock, which plays the role that the context window plays in an LLM. The model is trained to predict the next day’s return from the preceding sequence, so trading signals are learned end to end from the numeric data rather than specified by hand.
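As a rough sanity check on the parameter count, the arithmetic below tallies the weights of a small GPT-style decoder. The four blocks and the 256-step context come from the text above; the vocabulary size (402) and embedding width (128) are assumed values that this summary does not state, and the layer layout (bias-free query/key/value projections, a 4x feed-forward expansion, an untied output head) is likewise an assumption. The number of heads does not appear because splitting the embedding across heads leaves the weight count unchanged.

```python
def gpt_param_count(vocab_size, block_size, n_embd, n_layer):
    """Count trainable parameters of a small GPT-style decoder (assumed layout)."""
    d = n_embd
    embeddings = vocab_size * d + block_size * d           # token + position tables
    attention = 3 * d * d + (d * d + d)                    # Q, K, V (no bias) + output projection
    feed_forward = (d * 4 * d + 4 * d) + (4 * d * d + d)   # two linear layers, 4x expansion
    layer_norms = 2 * (2 * d)                              # two LayerNorms per block (scale + shift)
    per_block = attention + feed_forward + layer_norms
    head = 2 * d + (d * vocab_size + vocab_size)           # final LayerNorm + output projection
    return embeddings + n_layer * per_block + head

# vocab_size and n_embd are assumptions, not values given in this summary.
total = gpt_param_count(vocab_size=402, block_size=256, n_embd=128, n_layer=4)
print(f"{total:,} parameters")
```

Under these assumptions the total lands a little under one million, in line with the figure quoted above.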

Implementation and Training

Stock returns from 1926 to 2000 are used as the training dataset, giving the model a long history that covers a wide range of market conditions. The returns are discretized: each continuous return is mapped to an interval (bin), and the bin index serves as a token suitable for the transformer architecture. This discretization retains the essential market movements while letting the transformer operate on data formatted similarly to text.
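The discretization step can be sketched as follows. The bin width (50 basis points) and the clipping bound (plus or minus 100%) are assumptions chosen for illustration, as are the function names `return_to_token` and `token_to_return`; this summary does not specify the exact binning scheme.

```python
import math

BIN_BPS = 50        # assumed bin width, in basis points
MAX_BPS = 10_000    # assumed clipping bound (+/-100%), in basis points
OFFSET = MAX_BPS // BIN_BPS   # shifts bin indices so token ids start at 0

def return_to_token(ret):
    """Map a simple daily return (0.0123 means 1.23%) to an integer token id."""
    bps = ret * 10_000
    bps = max(-MAX_BPS, min(MAX_BPS, bps))       # clip extreme moves
    # floor() assigns the return to the bin [k*50, (k+1)*50) bps.
    return math.floor(bps / BIN_BPS) + OFFSET

def token_to_return(token):
    """Approximate inverse: the midpoint of the token's bin, as a simple return."""
    lower_bps = (token - OFFSET) * BIN_BPS
    return (lower_bps + BIN_BPS / 2) / 10_000

returns = [0.0123, -0.0123, 0.0, 3.0]            # hypothetical daily returns
tokens = [return_to_token(r) for r in returns]
print(tokens)
```

A sequence of such token ids is what the model consumes, in the same way a language model consumes word-piece ids. Mapping a token back to its bin midpoint loses at most half a bin width of precision for returns inside the clipping range.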

The model is trained by minimizing the cross-entropy between the predicted and actual return bins over the training dataset. It is then evaluated out of sample on 2001 to 2023, where it displays substantial prediction accuracy.
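For concreteness, the cross-entropy objective for a single prediction is the negative log-probability that the model assigns to the bin that actually occurred. The sketch below is a plain-Python illustration with hypothetical function names and toy logits; a real training loop would use an autodiff framework.

```python
import math

def cross_entropy(logits, target):
    """Negative log-probability of the target bin under softmax(logits)."""
    m = max(logits)                                   # for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[target]

def mean_cross_entropy(batch_logits, targets):
    """Average loss over a batch of (logits, realized bin) pairs."""
    losses = [cross_entropy(l, t) for l, t in zip(batch_logits, targets)]
    return sum(losses) / len(losses)

# Toy example with four bins: a confident correct guess and an uninformed guess.
batch_logits = [[0.0, 4.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0]]
targets = [1, 2]
print(round(mean_cross_entropy(batch_logits, targets), 4))
```

A model that spreads probability uniformly over `n` bins incurs a loss of `log(n)` per prediction, which gives a natural baseline for judging whether training has learned anything.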

Results and Performance Metrics

Applied to unseen data from 2001 to 2023, StockGPT produces a daily rebalanced long-short portfolio that earns an annual return of 119% with a Sharpe ratio of 6.5. This substantially exceeds the performance of strategies driven by text-based language models. The strong performance persists under more stringent constraints, such as excluding micro-cap stocks, which are costly to trade.
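The mechanics of a long-short portfolio and its Sharpe ratio can be sketched as below. The equal weighting, the 10% cutoff for each leg, the 252-day annualization, and the function names are all assumptions made for illustration; the toy numbers are hypothetical and unrelated to the paper's results.

```python
import math
import statistics

def long_short_return(predicted, realized, fraction=0.1):
    """One-period return from buying the top and shorting the bottom
    `fraction` of stocks ranked by predicted return (equal-weighted legs)."""
    order = sorted(range(len(predicted)), key=lambda i: predicted[i])
    k = max(1, int(len(order) * fraction))
    short_leg, long_leg = order[:k], order[-k:]
    long_ret = sum(realized[i] for i in long_leg) / k
    short_ret = sum(realized[i] for i in short_leg) / k
    return long_ret - short_ret

def annualized_sharpe(period_returns, periods_per_year=252):
    """Mean over sample standard deviation, scaled by sqrt(periods per year)."""
    mean = statistics.mean(period_returns)
    std = statistics.stdev(period_returns)
    return mean / std * math.sqrt(periods_per_year)

# Hypothetical one-day cross-section of four stocks.
predicted = [0.01, -0.02, 0.03, 0.00]
realized = [0.005, -0.01, 0.02, 0.001]
print(long_short_return(predicted, realized, fraction=0.25))
```

Because the portfolio is zero-cost (the short leg funds the long leg), this sketch computes the Sharpe ratio on raw long-short returns without subtracting a risk-free rate.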

Figure 2: Daily Cumulative Returns

Figure 3: Monthly Cumulative Returns

Further analysis shows that adding StockGPT’s predictions to benchmark factors such as momentum, value, and reversal significantly improves overall portfolio performance, indicating that the model subsumes and, in some cases, surpasses conventional market factors.
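One common way to test whether a strategy adds something beyond a benchmark factor is to regress the strategy's returns on the factor's returns and inspect the intercept (alpha). The sketch below does this for a single factor with ordinary least squares in plain Python. It is an illustration only: the function name and data are hypothetical, and the paper's tests involve multiple factors and significance statistics that are omitted here.

```python
def alpha_beta(strategy_returns, factor_returns):
    """OLS fit of strategy = alpha + beta * factor + error (single factor)."""
    n = len(strategy_returns)
    mean_s = sum(strategy_returns) / n
    mean_f = sum(factor_returns) / n
    cov = sum((f - mean_f) * (s - mean_s)
              for f, s in zip(factor_returns, strategy_returns))
    var = sum((f - mean_f) ** 2 for f in factor_returns)
    beta = cov / var                     # exposure to the factor
    alpha = mean_s - beta * mean_f       # average return not explained by the factor
    return alpha, beta

# Hypothetical data: the strategy earns 0.2% per period on top of half the factor.
factor = [0.01, -0.02, 0.03, 0.00]
strategy = [0.002 + 0.5 * f for f in factor]
alpha, beta = alpha_beta(strategy, factor)
print(round(alpha, 6), round(beta, 6))
```

A positive alpha that remains after controlling for the factor indicates return that the factor does not account for, while a factor whose own alpha vanishes when regressed on the strategy is said to be spanned by it.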

Comparison with Traditional Models

Unlike LLM-based approaches that rely on sentiment analysis or contextual embeddings of financial text, StockGPT learns directly from the latent dynamics of numerical return sequences. It is trained once and needs no regular retraining, which keeps costs low, yet its performance holds up over a test period of more than two decades.

Implications and Future Directions

StockGPT’s ability to learn autonomously from numeric stock returns invites further exploration of larger architectures. Increasing the parameter count, for example through wider embedding dimensions or more attention layers, could allow the model to capture more complex patterns. Moreover, applying StockGPT to high-frequency data, such as intraday or minute-level prices, could reveal how well it adapts to short-term, high-volatility trading environments.

Conclusion

StockGPT shows that generative transformers can be applied beyond traditional NLP domains, learning trading strategies directly from numerical stock market data. Its results bear on debates about market efficiency and suggest possible adjustments to the methods of empirical finance and quantitative trading. Scaling the model up and extending it to higher-frequency data are natural next steps for testing how far the approach goes.