Extend the framework beyond linear reward decompositions
Develop and analyze extensions of the framework to reward functions that do not decompose linearly into the sum of a Chain-of-Thought-only term R_cot and an output-only term R_out, including cases where the reward depends on interactions between the CoT and the final output.
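To make the distinction concrete, the following is a minimal sketch of the linear case and one possible non-linear extension. The notation is an assumption introduced here for illustration and is not taken from the paper: c denotes the Chain-of-Thought, o the final output, and R_int is a hypothetical interaction term.

```latex
% Linear (additive) case: each term depends on only one component.
R(c, o) = R_{\mathrm{cot}}(c) + R_{\mathrm{out}}(o)

% One possible extension (assumed form): add a term that depends on both.
R(c, o) = R_{\mathrm{cot}}(c) + R_{\mathrm{out}}(o) + R_{\mathrm{int}}(c, o)

% Additivity test: R decomposes as in the linear case if and only if
% the mixed difference vanishes for all c, c', o, o'.
R(c, o) - R(c', o) - R(c, o') + R(c', o') = 0
```

Under the assumed extension, the mixed difference of R equals the mixed difference of R_int, so a nonzero value for some pair of CoTs and outputs indicates an interaction that the linear decomposition cannot capture. A multiplicative reward such as R(c, o) = R_cot(c) R_out(o) is one simple example that generally fails this test.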
References
We consider this case where the two rewards act separately and linearly on the CoT and Final Output, leaving other reward decompositions for future work.
— Aligned, Orthogonal or In-conflict: When can we safely optimize Chain-of-Thought?
(2603.30036 - Kaufmann et al., 31 Mar 2026) in Appendix: Mathematical Model of Aligned / In-Conflict / Orthogonal, Reward decomposition (R_cot and R_out)