Chain-structured neural architecture search for financial time series forecasting

Published 15 Mar 2024 in q-fin.ST and cs.LG (arXiv:2403.14695v2)

Abstract: Neural architecture search (NAS) emerged as a way to automatically optimize neural networks for a specific task and dataset. Despite an abundance of research on NAS for images and natural language applications, similar studies for time series data are lacking. Among NAS search spaces, chain-structured spaces are the simplest and most applicable to small datasets like time series. We compare three popular NAS strategies on chain-structured search spaces in the context of financial time series forecasting: Bayesian optimization (specifically the Tree-structured Parzen Estimator), the hyperband method, and reinforcement learning. These strategies were employed to optimize simple, well-understood neural architectures such as the MLP, 1D CNN, and RNN, with the more complex temporal fusion transformer (TFT) and its own optimizers included for comparison. We find that Bayesian optimization and the hyperband method perform best among the strategies, and the RNN and 1D CNN best among the architectures, but all methods performed very similarly, with high variance caused by the difficulty of working with financial datasets. We discuss our approach to overcoming this variance and provide implementation recommendations for future users and researchers.


Summary

  • The paper introduces a chain-structured NAS approach that automates the selection of optimal neural network architectures for financial forecasting.
  • It compares NAS strategies including Bayesian optimization, reinforcement learning, and hyperband across multivariate financial datasets from Japan, Germany, and the US.
  • Results indicate that LSTMs and CNNs optimized with hyperband and Bayesian methods marginally outperform other models despite inherent data variability challenges.

The paper "Chain-structured neural architecture search for financial time series forecasting" by Denis Levchenko and colleagues explores the application of neural architecture search (NAS) strategies to financial time series forecasting. The authors aim to improve model performance by automating the selection of the best neural network architecture using NAS techniques.

Introduction

The paper begins by discussing the advantages of deep neural networks (DNNs), notably their ability to perform automatic feature extraction without extensive manual engineering. Despite these benefits, choosing an optimal DNN architecture still often relies on manual selection. Automated machine learning (AutoML) and NAS techniques have emerged to automate this architecture selection process, but most NAS research focuses on domains like computer vision and natural language processing, with limited exploration of time series data. This work attempts to fill that gap by evaluating NAS strategies on financial time series datasets.

Data and Problem Formulation

The research utilizes real-world financial data provided by Predictive Layer SA, focusing on multivariate daily time series for Japanese, German, and US bonds. The task is a binary classification problem, predicting whether a target feature will increase or decrease in value in the future. This involves dealing with datasets characterized by high dimensionality and limited samples, which is challenging for deep learning methods. To manage this, the authors employ feature reduction techniques, including removing time-derived features and applying principal component analysis (PCA).
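A hedged sketch of this preprocessing pipeline: the snippet below derives binary up/down labels from a toy target series and reduces a high-dimensional feature matrix with PCA via SVD. The data, horizon `h`, and component count `k` are placeholders for illustration, not the paper's actual settings.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for the proprietary data: 300 daily observations, 120 features.
X = rng.standard_normal((300, 120))
target = np.cumsum(rng.standard_normal(300))  # hypothetical target series

# Binary label: does the target rise over the next h days?
h = 1
y = (target[h:] > target[:-h]).astype(int)
X = X[:-h]  # drop rows without a future label

# PCA via SVD on standardized features, keeping the top k components.
Xs = (X - X.mean(axis=0)) / X.std(axis=0)
U, S, Vt = np.linalg.svd(Xs, full_matrices=False)
k = 20
X_reduced = Xs @ Vt[:k].T

print(X_reduced.shape)  # (299, 20)
```

The same reduction is available off the shelf (e.g. scikit-learn's `PCA`); the manual SVD is shown only to make the dimensionality-reduction step explicit.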

Architecture Types and Search Spaces

The authors focus on chain-structured search spaces, which describe simple architectural topologies composed of sequential layers. This approach offers simplicity and robustness, making it suitable for the small datasets they employ. They consider several architectures, including feedforward neural networks (FFNNs), convolutional neural networks (CNNs), and recurrent neural networks (RNNs), each with its own hyperparameter search space. The study also includes the Temporal Fusion Transformer (TFT) for comparison, though it performed poorly given the limited data.
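A chain-structured search space can be sketched as a flat set of per-layer choices from which a sequential stack is sampled. The hyperparameter names and value grids below are illustrative, not the paper's exact search spaces.

```python
import random

random.seed(0)

# A chain-structured search space: a sequential stack of blocks, each with
# its own hyperparameters. Names and grids here are assumptions for the sketch.
SEARCH_SPACE = {
    "n_layers": [1, 2, 3],
    "units": [16, 32, 64, 128],
    "activation": ["relu", "tanh"],
    "dropout": [0.0, 0.1, 0.3],
    "learning_rate": [1e-4, 1e-3, 1e-2],
}

def sample_architecture(space):
    """Draw one chain-structured MLP configuration at random."""
    n_layers = random.choice(space["n_layers"])
    layers = [
        {
            "units": random.choice(space["units"]),
            "activation": random.choice(space["activation"]),
            "dropout": random.choice(space["dropout"]),
        }
        for _ in range(n_layers)
    ]
    return {"layers": layers, "learning_rate": random.choice(space["learning_rate"])}

arch = sample_architecture(SEARCH_SPACE)
print(arch)
```

Random sampling is shown only to make the space concrete; the search strategies discussed below replace `random.choice` with more informed proposals.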

Challenges and Performance Estimation

Predicting financial markets is intrinsically difficult, so metrics tend to sit close to random performance. The stochastic nature of deep learning training adds high variance across random seeds, which complicates NAS: a strategy may favor a lucky run over a genuinely better architecture. To mitigate seed-related performance variation, the authors average results over multiple trials.
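The seed-averaging idea can be sketched as follows; `train_and_score` is a hypothetical stand-in for a full training run, not the authors' code.

```python
import random
import statistics

def train_and_score(config, seed):
    """Stand-in for one training run; replace with real model fitting.
    The returned score fluctuates with the seed, mimicking training noise."""
    rng = random.Random(seed)
    base = 0.52 + 0.01 * config["units"] / 64  # hypothetical config effect
    return base + rng.gauss(0.0, 0.02)         # seed-dependent noise

def robust_score(config, n_seeds=5):
    """Average the metric over several seeds so the NAS strategy compares
    configurations rather than lucky initializations."""
    return statistics.mean(train_and_score(config, s) for s in range(n_seeds))

config = {"units": 64}
print(round(robust_score(config), 3))
```

The trade-off is cost: each candidate architecture is trained `n_seeds` times, so the evaluation budget shrinks by the same factor.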

Search Strategies

The paper evaluates three NAS strategies: Bayesian optimization, reinforcement learning, and the hyperband method. All three search for a strong network configuration by iteratively training and evaluating candidate models, but they navigate the hyperparameter space differently: Bayesian optimization builds a probabilistic surrogate model of the objective, reinforcement learning trains a controller via a reward signal, and hyperband uses successive halving to allocate training resources efficiently.
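Successive halving, the resource-allocation core of hyperband, can be sketched in a few lines: start many configurations on a small training budget, keep the best-scoring half, double the budget, and repeat. The scoring stub and configuration fields below are illustrative assumptions, not the paper's implementation.

```python
import random

random.seed(0)

def partial_train(config, budget):
    """Stand-in for training `config` for `budget` epochs; returns a score.
    Noise shrinks as the budget grows, as with real partial training."""
    rng = random.Random(hash((config["lr_idx"], budget)) & 0xFFFF)
    return config["quality"] + rng.uniform(-0.05, 0.05) / budget

def successive_halving(configs, min_budget=1, rounds=3):
    """Keep the top half of survivors each round while doubling the budget."""
    budget = min_budget
    survivors = list(configs)
    for _ in range(rounds):
        scored = sorted(survivors, key=lambda c: partial_train(c, budget),
                        reverse=True)
        survivors = scored[: max(1, len(scored) // 2)]
        budget *= 2
    return survivors[0]

configs = [{"lr_idx": i, "quality": random.uniform(0.45, 0.60)} for i in range(8)]
best = successive_halving(configs)
print(best)
```

Full hyperband additionally sweeps several (number of configurations, starting budget) brackets; the single bracket above is the essential mechanism.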

Results

The authors report the best result on the German dataset with an LSTM optimized by the hyperband method, achieving an average AUC of 0.56. On the Japanese dataset, a Bayesian-optimized CNN performs best. The US dataset posed the greatest challenge, with average scores not exceeding those of random prediction (AUC 0.5).

Architectures and Search Strategies Compared

LSTMs and CNNs generally outperform FFNNs, and Bayesian optimization and hyperband show a slight performance edge over reinforcement learning. Implementation difficulties and computational cost further favor Bayesian optimization and hyperband as the more practical choices.

Discussion and Future Work

The study notes that hyperparameter optimization showed no significant convergence, indicating the complexity inherent in the task. The authors propose exploring further NAS methods like cell-based search spaces and one-shot NAS techniques. They emphasize examining publicly available datasets to ascertain the generalizability of their findings.

In conclusion, this paper contributes to understanding how NAS can be applied to financial time series forecasting, highlighting both the potential and challenges of employing automated architecture selection in such complex domains.
