FR-LUX: Friction-Aware, Regime-Conditioned Policy Optimization for Implementable Portfolio Management

Published 3 Oct 2025 in q-fin.TR and cs.LG | (2510.02986v1)

Abstract: Transaction costs and regime shifts are major reasons why paper portfolios fail in live trading. We introduce FR-LUX (Friction-aware, Regime-conditioned Learning under eXecution costs), a reinforcement learning framework that learns after-cost trading policies and remains robust across volatility-liquidity regimes. FR-LUX integrates three ingredients: (i) a microstructure-consistent execution model combining proportional and impact costs, directly embedded in the reward; (ii) a trade-space trust region that constrains changes in inventory flow rather than logits, yielding stable low-turnover updates; and (iii) explicit regime conditioning so the policy specializes to LL/LH/HL/HH states without fragmenting the data. On a 4 x 5 grid of regimes and cost levels with multiple random seeds, FR-LUX achieves the top average Sharpe ratio with narrow bootstrap confidence intervals, maintains a flatter cost-performance slope than strong baselines, and attains superior risk-return efficiency for a given turnover budget. Pairwise scenario-level improvements are strictly positive and remain statistically significant after multiple-testing corrections. We provide formal guarantees on optimality under convex frictions, monotonic improvement under a KL trust region, long-run turnover bounds and induced inaction bands due to proportional costs, positive value advantage for regime-conditioned policies, and robustness to cost misspecification. The methodology is implementable: costs are calibrated from standard liquidity proxies, scenario-level inference avoids pseudo-replication, and all figures and tables are reproducible from released artifacts.


Summary

The paper presents FR-LUX, a reinforcement learning framework that integrates transaction cost modeling and regime conditioning to optimize portfolio management.
It introduces a microstructure-consistent execution model and a trade-space trust region to stabilize updates and reduce turnover, achieving superior average Sharpe ratios.
Empirical results confirm consistent outperformance across diverse volatility-liquidity regimes, offering robust practical insights for ML-driven portfolio strategies.

Introduction

The paper presents FR-LUX, a novel reinforcement learning framework developed to optimize trading strategies in realistic portfolio management scenarios, particularly addressing challenges posed by transaction costs and regime shifts. The framework is designed to produce after-cost trading policies that are robust across varying volatility and liquidity regimes. FR-LUX integrates several components to achieve this: a microstructure-consistent execution model, a trade-space trust region, and explicit regime conditioning.

Framework Components

Microstructure-Consistent Execution Model

FR-LUX includes an execution model that incorporates both proportional and impact costs directly into the reward structure. This approach allows for more accurate modeling of real-world trading conditions, where transaction costs significantly affect portfolio performance.
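A minimal sketch of how proportional and impact costs can be folded into a one-step reward is given below. The functional form (a linear term plus a power-law impact term), the parameter names, and the default values are all assumptions made for illustration; this summary does not state the paper's exact cost specification or calibration.

```python
def execution_cost(trade, prop_cost_bps=5.0, impact_coeff=0.1, impact_exponent=1.5):
    """Cost of a signed trade, expressed as a fraction of portfolio value.

    Hypothetical form: a proportional (linear) term plus a convex
    power-law impact term. All parameters are illustrative assumptions.
    """
    size = abs(trade)
    proportional = prop_cost_bps * 1e-4 * size   # bps converted to a fraction
    impact = impact_coeff * size ** impact_exponent
    return proportional + impact


def after_cost_reward(position, trade, asset_return, **cost_params):
    """One-step reward: P&L on the post-trade position minus execution cost."""
    new_position = position + trade
    return new_position * asset_return - execution_cost(trade, **cost_params)
```

Because the impact term is convex in trade size, splitting a large trade into smaller ones is cheaper under this sketch, which is the kind of incentive an after-cost reward is meant to expose to the learner.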

Trade-Space Trust Region

Rather than constraining changes in policy logits, FR-LUX uses a trade-space trust region that constrains changes in inventory flow, that is, in the trades the policy actually proposes. This stabilizes updates and keeps turnover low, which is crucial for controlling costs in live trading.
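One way to picture a trust region defined on trades is sketched below: compare the trades that the old and updated policies propose on the same batch of states, and shrink the update if the mean squared difference exceeds a radius. The distance measure, the `radius` parameter, and the projection step are assumptions chosen for illustration, not the paper's stated constraint.

```python
import math


def trade_space_distance(old_trades, new_trades):
    """Mean squared difference between two policies' trades on the same states."""
    diffs = [(n - o) ** 2 for o, n in zip(old_trades, new_trades)]
    return sum(diffs) / len(diffs)


def project_to_trust_region(old_trades, new_trades, radius):
    """Shrink proposed trades toward the old ones until the distance is <= radius."""
    dist = trade_space_distance(old_trades, new_trades)
    if dist <= radius:
        return list(new_trades)
    # Scaling every difference by s multiplies the distance by s**2.
    scale = math.sqrt(radius / dist)
    return [o + scale * (n - o) for o, n in zip(old_trades, new_trades)]
```

Measuring the step in trade units bounds the change in turnover directly, whereas a bound on logits gives no direct control over how much the resulting trades move.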

Regime Conditioning

Explicit regime conditioning enables the policy to adapt to different market states without fragmenting the data. This feature allows the model to specialize its strategies for different volatility and liquidity conditions, namely LL, LH, HL, and HH regimes.
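The idea of conditioning one shared policy on a regime label, instead of training a separate policy per regime, can be sketched as follows. The threshold-based labeling rule, the convention that the first letter refers to volatility and the second to liquidity, and the one-hot encoding are assumptions for illustration; the summary does not specify how the paper assigns or encodes regimes.

```python
REGIMES = ("LL", "LH", "HL", "HH")


def regime_label(volatility, liquidity, vol_threshold, liq_threshold):
    """Label a state as L/H on each axis (assumed order: volatility, liquidity)."""
    vol_flag = "H" if volatility > vol_threshold else "L"
    liq_flag = "H" if liquidity > liq_threshold else "L"
    return vol_flag + liq_flag


def conditioned_state(features, regime):
    """Append a one-hot regime indicator to the policy's input features."""
    one_hot = [1.0 if r == regime else 0.0 for r in REGIMES]
    return list(features) + one_hot
```

Because every sample passes through the same network with the regime as an extra input, all regimes share parameters and data, which is one way to specialize without fragmenting the training set.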

Performance and Evaluation

FR-LUX was evaluated on a 4 x 5 grid of regimes and cost levels with multiple random seeds, against strong baselines such as vanilla PPO and mean-variance models. It achieved the top average Sharpe ratio across scenarios, and the narrow bootstrap confidence intervals indicate that this estimate is precise.

Figure 1: Top methods by Sharpe (95% bootstrap CI). Bars show scenario-mean Sharpe with seeds averaged first; whiskers are percentile CIs. All statistics are computed on after-cost returns.
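The procedure described in the caption (average seeds within each scenario first, then form a percentile bootstrap interval over scenarios) might look like the sketch below. The function names, the number of resamples, and the input layout are assumptions; any numbers passed in would be the user's own, and none are taken from the paper.

```python
import random
from statistics import mean


def scenario_means(sharpe_by_scenario):
    """Average per-seed Sharpe ratios within each scenario.

    Input: dict mapping a scenario key to a list of per-seed Sharpe ratios.
    Averaging seeds first keeps the scenario as the unit of inference.
    """
    return [mean(seeds) for seeds in sharpe_by_scenario.values()]


def bootstrap_ci(values, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the mean of scenario-level values."""
    rng = random.Random(seed)
    n = len(values)
    boots = sorted(mean(rng.choices(values, k=n)) for _ in range(n_boot))
    lower = boots[int((alpha / 2) * n_boot)]
    upper = boots[int((1 - alpha / 2) * n_boot) - 1]
    return mean(values), lower, upper
```

Resampling scenarios rather than individual seed runs avoids treating correlated seeds as independent observations, which is the pseudo-replication concern the abstract mentions.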

FR-LUX also maintains a flatter cost-performance slope than the baselines: its Sharpe ratio degrades more slowly as transaction costs rise, which reflects robust handling of frictions.

Figure 2: Cost robustness. Scenario-mean Sharpe versus cost (bps). Shaded bands are $\pm 1$ standard error across regimes (HAC). The slope for FR-LUX is the smallest among competitors, evidencing friction-aware learning.
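A cost-performance slope of the kind compared in this figure can be estimated with an ordinary least-squares fit of scenario-mean Sharpe on cost level. The sketch below is a plain OLS slope; the paper's HAC standard errors are not reproduced here, and the function name is hypothetical.

```python
def ols_slope(costs_bps, sharpe):
    """Least-squares slope of Sharpe ratio against cost level (in bps)."""
    n = len(costs_bps)
    mean_x = sum(costs_bps) / n
    mean_y = sum(sharpe) / n
    sxx = sum((x - mean_x) ** 2 for x in costs_bps)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(costs_bps, sharpe))
    return sxy / sxx
```

A slope closer to zero means less Sharpe is lost per additional basis point of cost, so methods can be ranked by the magnitude of this coefficient.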

Theoretical Guarantees

The paper states several formal guarantees for FR-LUX: existence of an optimal stationary policy under convex frictions, a conservative (monotonic) improvement bound for policy updates under a KL trust region, long-run turnover bounds and inaction bands induced by proportional costs, a positive value advantage for regime-conditioned policies, and robustness to cost misspecification.
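To illustrate why proportional costs induce an inaction band, the sketch below solves a textbook one-period mean-variance problem with a proportional cost, maximizing mu*w - (gamma/2)*sigma2*w^2 - kappa*|w - w0| over the new weight w. This is a standard simplified setting chosen for illustration, not the paper's theorem; the symbols `mu`, `gamma`, `sigma2`, and `kappa` are assumptions.

```python
def optimal_weight_with_band(w0, mu, gamma, sigma2, kappa):
    """Optimal one-period weight under proportional cost kappa per unit traded.

    The frictionless target is mu / (gamma * sigma2). With costs, the
    first-order conditions give a no-trade band of half-width
    kappa / (gamma * sigma2) around that target.
    """
    target = mu / (gamma * sigma2)
    half_width = kappa / (gamma * sigma2)
    lower, upper = target - half_width, target + half_width
    if w0 < lower:
        return lower   # buy only up to the band's lower edge
    if w0 > upper:
        return upper   # sell only down to the band's upper edge
    return w0          # inside the band: trading costs more than it gains
```

Inside the band the marginal benefit of moving toward the target is smaller than the marginal cost kappa, so the optimal action is not to trade; the band widens as costs rise.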

Empirical Results

The empirical results demonstrate FR-LUX's ability to deliver positive Sharpe ratios across all volatility-liquidity regimes, with consistently superior performance in both LL/LH and HL/HH states.

Figure 3: Regime profile (mean Sharpe). The color scale is centered at zero, making positive vs. negative cells directly comparable. FR-LUX attains consistently positive Sharpe across all volatility-liquidity regimes.

Further, pairwise scenario-level improvements confirmed that FR-LUX's outperformance remains statistically significant even after rigorous multiple-testing corrections.
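The summary does not say which multiple-testing correction the paper applies. As one common choice, the sketch below implements the Holm-Bonferroni step-down procedure on a list of p-values from pairwise comparisons; the function name and the default `alpha` are assumptions.

```python
def holm_reject(p_values, alpha=0.05):
    """Holm-Bonferroni step-down procedure.

    Returns a list of booleans, aligned with p_values, marking which
    hypotheses are rejected while controlling the family-wise error rate.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, idx in enumerate(order):
        # The k-th smallest p-value is compared with alpha / (m - k + 1).
        if p_values[idx] <= alpha / (m - rank):
            reject[idx] = True
        else:
            break  # once one test fails, all larger p-values fail too
    return reject
```

Applied to FR-LUX versus each baseline, a comparison would count as significant after correction only if its entry in the returned list is `True`.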

Practical Implications

The research highlights the practical importance of incorporating transaction costs within the learning framework. By internalizing execution costs and learning to adjust trading intensity in different liquidity environments, FR-LUX presents a viable pathway for deploying ML-driven strategies in institutional portfolios.

Conclusion

FR-LUX is a framework for portfolio management strategies that remain effective in the face of transaction costs and regime shifts. Its design and results highlight the value of integrating cost modeling with adaptive strategy learning to obtain implementable and statistically credible results. These attributes make FR-LUX a useful contribution to machine learning applications in financial markets.
