- The paper presents FR-LUX, a reinforcement learning framework that integrates transaction cost modeling and regime conditioning to optimize portfolio management.
- It introduces a microstructure-consistent execution model and a trade-space trust region; the latter stabilizes updates and reduces turnover. FR-LUX achieves the highest average Sharpe ratio among the compared methods.
- Empirical results show consistent outperformance across volatility-liquidity regimes, supporting its use in ML-driven portfolio strategies.
FR-LUX: Friction-Aware, Regime-Conditioned Policy Optimization for Implementable Portfolio Management
Introduction
The paper presents FR-LUX, a reinforcement learning framework for optimizing trading strategies under realistic portfolio management conditions, in particular transaction costs and regime shifts. It is designed to produce trading policies whose after-cost performance is robust across volatility and liquidity regimes. FR-LUX combines three components: a microstructure-consistent execution model, a trade-space trust region, and explicit regime conditioning.
Framework Components
Microstructure-Consistent Execution Model
FR-LUX includes an execution model that incorporates both proportional and impact costs directly into the reward structure. This approach allows for more accurate modeling of real-world trading conditions, where transaction costs significantly affect portfolio performance.
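As a minimal sketch of how proportional and impact costs can enter a reward, the function below subtracts both from the gross portfolio return. The function name, the power-law impact form, and all parameter values (`prop_cost_bps`, `impact_coef`, `impact_exponent`) are illustrative assumptions, not the paper's specification.

```python
import numpy as np

def after_cost_reward(weights_prev, weights_new, asset_returns,
                      prop_cost_bps=5.0, impact_coef=0.1, impact_exponent=1.5):
    """One-period reward net of trading costs (hypothetical cost form).

    proportional cost: linear in traded notional, quoted in basis points
    impact cost:       convex power law in trade size (assumed, exponent > 1)
    """
    weights_prev = np.asarray(weights_prev, dtype=float)
    weights_new = np.asarray(weights_new, dtype=float)
    asset_returns = np.asarray(asset_returns, dtype=float)

    trades = weights_new - weights_prev                 # inventory flow
    gross = float(weights_new @ asset_returns)          # return before costs
    proportional = prop_cost_bps * 1e-4 * np.abs(trades).sum()
    impact = impact_coef * (np.abs(trades) ** impact_exponent).sum()
    return gross - proportional - impact
```

Because the costs sit inside the reward, a policy trained on it is penalized for every unit of turnover it generates, rather than having costs deducted only at evaluation time.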
Trade-Space Trust Region
Conventional trust regions limit how far the policy's logits (its action distribution) move in a single update. FR-LUX adds a trust region defined in trade space, which constrains how much the resulting inventory flow can change. This component stabilizes updates and keeps turnover low, which is crucial for controlling costs in live trading.
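One way to picture a constraint in trade space is as a projection: if the trades implied by the updated policy move too far from those of the previous policy, they are pulled back to the boundary of a ball. The sketch below uses an L2 ball and a hard projection; both choices, along with the function name and `radius`, are assumptions for illustration. The paper may instead use a penalty term or a different distance.

```python
import numpy as np

def clip_to_trade_trust_region(trade_old, trade_new, radius):
    """Shrink trade_new toward trade_old so that their L2 distance is <= radius."""
    trade_old = np.asarray(trade_old, dtype=float)
    trade_new = np.asarray(trade_new, dtype=float)

    delta = trade_new - trade_old          # change in inventory flow
    dist = np.linalg.norm(delta)
    if dist <= radius:
        return trade_new                   # already inside the trust region
    # Rescale the change so it lands exactly on the boundary.
    return trade_old + delta * (radius / dist)
```

The point of measuring distance in trades rather than in logits is that a small logit change can still imply a large change in positions, and it is the position change that incurs cost.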
Regime Conditioning
Explicit regime conditioning lets a single policy adapt to different market states without splitting the training data into separate per-regime subsets. The policy can thereby specialize its behavior for each volatility and liquidity condition, namely the LL, LH, HL, and HH regimes (the four combinations of low, L, or high, H, volatility and liquidity).
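A simple way to condition one policy on the regime is to append a one-hot regime indicator to the state vector, so all data trains a shared network. The sketch below assumes the first letter of each label refers to volatility and the second to liquidity, and that regimes are assigned by fixed thresholds; the thresholds and function names are hypothetical.

```python
import numpy as np

REGIMES = ("LL", "LH", "HL", "HH")  # assumed order: volatility, then liquidity

def label_regime(volatility, liquidity, vol_threshold, liq_threshold):
    """Assign a two-letter regime label by thresholding (assumed rule)."""
    v = "H" if volatility > vol_threshold else "L"
    q = "H" if liquidity > liq_threshold else "L"
    return v + q

def conditioned_state(features, regime):
    """Append a one-hot regime indicator to the market features."""
    one_hot = np.zeros(len(REGIMES))
    one_hot[REGIMES.index(regime)] = 1.0
    return np.concatenate([np.asarray(features, dtype=float), one_hot])
```

For example, `conditioned_state([1.0, 2.0], "HL")` yields a six-element vector whose last four entries mark the HL regime.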
FR-LUX was tested on a grid of regimes and cost levels and outperformed strong baselines such as vanilla PPO and mean-variance models. It achieved the highest average Sharpe ratio across scenarios, with narrow confidence intervals indicating that the advantage is precisely estimated rather than an artifact of noise.
Figure 1: Top methods by Sharpe (95% bootstrap CI). Bars show scenario-mean Sharpe with seeds averaged first; whiskers are percentile CIs. All statistics are computed on after-cost returns.
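The procedure in the caption (average over seeds first, then form a percentile bootstrap interval for the scenario mean) can be sketched as follows. The input layout, the choice to resample scenarios, and the defaults for `n_boot` and `seed` are assumptions; the paper's exact resampling scheme is not described here.

```python
import numpy as np

def scenario_mean_sharpe_ci(sharpe, n_boot=2000, alpha=0.05, seed=0):
    """Percentile-bootstrap CI for the scenario-mean Sharpe ratio.

    sharpe: array of shape (n_scenarios, n_seeds) holding after-cost Sharpe ratios.
    Returns (mean, lower, upper).
    """
    sharpe = np.asarray(sharpe, dtype=float)
    per_scenario = sharpe.mean(axis=1)               # seeds averaged first
    rng = np.random.default_rng(seed)
    n = per_scenario.size
    idx = rng.integers(0, n, size=(n_boot, n))       # resample scenarios with replacement
    boot_means = per_scenario[idx].mean(axis=1)
    lower, upper = np.percentile(
        boot_means, [100 * alpha / 2, 100 * (1 - alpha / 2)]
    )
    return float(per_scenario.mean()), float(lower), float(upper)
```

Averaging seeds before resampling treats each scenario as one observation, so seed-to-seed noise does not artificially narrow the interval.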
FR-LUX also maintained the flattest cost-performance slope: its Sharpe ratio degraded least as transaction costs rose, indicating more robust handling of costs than the competing methods.
Figure 2: Cost robustness. Scenario-mean Sharpe versus cost (bps). Shaded bands are ±1 standard error across regimes (HAC). The slope for FR-LUX is the smallest among competitors, evidencing friction-aware learning.
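The slope referred to in the caption can be estimated with an ordinary least-squares line through the (cost, Sharpe) points for each method. This is a minimal sketch with a hypothetical function name; it uses plain OLS and does not compute the HAC standard errors mentioned in the caption.

```python
import numpy as np

def cost_slope(cost_bps, mean_sharpe):
    """Least-squares slope of scenario-mean Sharpe against cost in bps."""
    x = np.asarray(cost_bps, dtype=float)
    y = np.asarray(mean_sharpe, dtype=float)
    slope, _intercept = np.polyfit(x, y, deg=1)
    return float(slope)
```

A slope close to zero means performance barely deteriorates as costs rise; comparing slopes across methods on the same cost grid gives the ranking the figure describes.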
Theoretical Guarantees
The paper provides several theoretical guarantees for FR-LUX: a conservative improvement bound for policy updates under a KL trust region, a result suggesting a positive value advantage for regime-conditioned policies, a proof of robustness to cost misspecification, and the existence of an optimal stationary policy under convex frictions.
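For orientation, the classical conservative improvement bound from the trust region policy optimization (TRPO) literature has the form below. It is shown only as the standard template for such results; the paper's own bound for FR-LUX may use different constants and additional friction terms. Here $J$ is the expected discounted return, $A_\pi$ the advantage function, $\rho_\pi$ the discounted state visitation frequency, and $\gamma$ the discount factor.

```latex
J(\pi') \;\ge\; L_{\pi}(\pi') \;-\; C \,\max_{s} D_{\mathrm{KL}}\!\big(\pi(\cdot \mid s)\,\|\,\pi'(\cdot \mid s)\big),
\qquad
C = \frac{4\,\epsilon\,\gamma}{(1-\gamma)^{2}},
\quad
\epsilon = \max_{s,a} \lvert A_{\pi}(s,a) \rvert,

\text{where}\quad
L_{\pi}(\pi') = J(\pi) + \sum_{s} \rho_{\pi}(s) \sum_{a} \pi'(a \mid s)\, A_{\pi}(s,a).
```

The bound is conservative in the sense that any update which increases the right-hand side is guaranteed not to decrease the true return $J$.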
Empirical Results
The empirical results show that FR-LUX delivers positive Sharpe ratios in every volatility-liquidity regime and outperforms the baselines consistently in both the LL/LH and the HL/HH states.
Figure 3: Regime profile (mean Sharpe). The color scale is centered at zero, making positive vs. negative cells directly comparable. FR-LUX attains consistently positive Sharpe across all volatility-liquidity regimes.
Further, pairwise scenario-level comparisons confirm that FR-LUX's outperformance remains statistically significant after multiple-testing corrections.
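The text does not name the correction that was applied. As one common example, Holm's step-down procedure controls the family-wise error rate across a set of pairwise comparisons; the sketch below is an illustration of that procedure, not necessarily the method used in the paper.

```python
import numpy as np

def holm_reject(p_values, alpha=0.05):
    """Holm step-down procedure: boolean mask of rejected null hypotheses."""
    p = np.asarray(p_values, dtype=float)
    m = p.size
    order = np.argsort(p)                  # test smallest p-values first
    reject = np.zeros(m, dtype=bool)
    for rank, i in enumerate(order):
        # The threshold loosens as hypotheses are rejected: alpha/m, alpha/(m-1), ...
        if p[i] <= alpha / (m - rank):
            reject[i] = True
        else:
            break                          # stop at the first non-rejection
    return reject
```

Each entry of `p_values` would correspond to one comparison of FR-LUX against a baseline; a comparison counts as significant only if its entry in the returned mask is `True`.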
Practical Implications
The research highlights the practical importance of incorporating transaction costs within the learning framework. By internalizing execution costs and learning to adjust trading intensity in different liquidity environments, FR-LUX presents a viable pathway for deploying ML-driven strategies in institutional portfolios.
Conclusion
FR-LUX is a robust framework for portfolio management strategies that remain effective in the face of transaction costs and regime shifts. Its design and performance highlight the need to integrate cost modeling with adaptive strategy learning in order to obtain implementable and statistically credible results. These attributes make FR-LUX a valuable contribution to machine learning applications in financial markets.