Optimal learning strategy for non-differentiable transformation choices in computation graph synthesis

Ascertain whether gradient descent is an optimal learning strategy for selecting sequences of non-differentiable mathematical transformations when synthesizing computation graphs for math word problem solvers.

Background

The authors note that solving a math word problem involves applying a series of mathematical transformations, a process that can be framed as synthesizing a computation graph. They observe that conventional learning via gradient descent relies on differentiability and iterative error reduction, which may be ill-suited when the action choices (which transformation to apply) are discrete and non-differentiable.
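To see why gradient descent struggles here, consider a toy illustration (assumed for exposition, not taken from the paper): if an operation is selected by a hard argmax over logits, the loss is piecewise constant in those logits, so its gradient is zero almost everywhere and gradient descent receives no learning signal.

```python
import numpy as np

def loss(logits, a=3.0, b=4.0, target=12.0):
    """Loss of a one-step 'computation graph' whose single op is chosen by argmax."""
    ops = [lambda: a + b, lambda: a - b, lambda: a * b]
    choice = int(np.argmax(logits))  # hard, non-differentiable selection
    return (ops[choice]() - target) ** 2

logits = np.array([1.0, 0.0, 0.0])
eps = 1e-4
# Central-difference numerical gradient of the loss w.r.t. each logit.
num_grad = np.array([
    (loss(logits + eps * e) - loss(logits - eps * e)) / (2 * eps)
    for e in np.eye(3)
])
print(num_grad)  # all zeros: no gradient flows through the discrete choice
```

Small perturbations of the logits never change which operation wins the argmax, so the loss surface is flat around any such point and the gradient vanishes.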

They highlight uncertainty about the optimality of gradient descent in this setting and suggest reinforcement learning as a potentially better-suited paradigm, given the exponential search space and non-differentiable decisions inherent in constructing computation graphs.
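The reinforcement-learning alternative can be sketched with a minimal score-function (REINFORCE-style) example; this is an illustrative toy setup assumed here, not the paper's implementation. Instead of differentiating through the discrete choice, it optimizes the expected reward of sampled choices, which requires only the gradient of the policy's log-probability.

```python
import numpy as np

rng = np.random.default_rng(0)
OPS = {"add": lambda a, b: a + b,
       "sub": lambda a, b: a - b,
       "mul": lambda a, b: a * b}
names = list(OPS)

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

# Toy problem: learn to pick the operation mapping (3, 4) -> 12, i.e. "mul".
a, b, target = 3.0, 4.0, 12.0
theta = np.zeros(len(names))  # logits over the discrete transformation choices
lr = 0.5

for _ in range(300):
    probs = softmax(theta)
    i = rng.choice(len(names), p=probs)            # sample a transformation
    reward = 1.0 if OPS[names[i]](a, b) == target else 0.0
    grad_logp = -probs                              # d log pi(i) / d theta
    grad_logp[i] += 1.0
    theta += lr * reward * grad_logp                # score-function update

print(names[int(np.argmax(softmax(theta)))])
```

Because the update weights the log-probability gradient by the observed reward, no gradient ever needs to pass through the non-differentiable operation itself; the same idea scales (with much more machinery) to sequences of choices that build a full computation graph.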

References

The choices of these mathematical transformations are not differentiable, and hence it is unclear whether gradient descent is the optimal strategy.

Towards Tractable Mathematical Reasoning: Challenges, Strategies, and Opportunities for Solving Math Word Problems (Faldu et al., 2021, arXiv:2111.05364), Section: Reinforcement Learning