Papers
Topics
Authors
Recent
Search
2000 character limit reached

Rethinking Soft Actor-Critic in High-Dimensional Action Spaces: The Cost of Ignoring Distribution Shift

Published 22 Oct 2024 in cs.LG and cs.AI | (2410.16739v2)

Abstract: Soft Actor-Critic algorithm is widely recognized for its robust performance across a range of deep reinforcement learning tasks, where it leverages the tanh transformation to constrain actions within bounded limits. However, this transformation induces a distribution shift, distorting the original Gaussian action distribution and potentially leading the policy to select suboptimal actions, particularly in high-dimensional action spaces. In this paper, we conduct a comprehensive theoretical and empirical analysis of this distribution shift, deriving the precise probability density function (PDF) for actions following the tanh transformation to clarify the misalignment introduced between the transformed distribution's mode and the intended action output. We substantiate these theoretical insights through extensive experiments on high-dimensional tasks within the HumanoidBench benchmark. Our findings indicate that accounting for this distribution shift substantially enhances SAC's performance, resulting in notable improvements in cumulative rewards, sample efficiency, and reliability across tasks. These results underscore a critical consideration for SAC and similar algorithms: addressing transformation-induced distribution shifts is essential to optimizing policy effectiveness in high-dimensional deep reinforcement learning environments, thereby expanding the robustness and applicability of SAC in complex control tasks.

Summary

Paper to Video (Beta)

Whiteboard

No one has generated a whiteboard explanation for this paper yet.

Open Problems

We haven't generated a list of open problems mentioned in this paper yet.

Continue Learning

We haven't generated follow-up questions for this paper yet.

Collections

Sign up for free to add this paper to one or more collections.

Tweets

Sign up for free to view the 1 tweet with 0 likes about this paper.