Predicting per‑query actual API cost for reasoning language models
Develop methods to predict, before issuing the request, the actual API cost c_m(q) that a given reasoning language model m will incur on a specific query q. The prediction should use the model’s listed input and output token prices together with the query content, and should account for model-specific thinking-token consumption so that it enables cost-aware model selection.
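To make the target quantity concrete, here is a minimal Python sketch of how c_m(q) could be computed once token counts are known, and how predicted costs might drive model selection. It assumes that thinking tokens are billed at the listed output price, which is a common provider convention but is not stated in the source. The names `ModelPricing`, `actual_cost`, and `cheapest_model`, and all prices and token counts, are hypothetical illustrations. The open problem is the part this sketch takes as given: estimating the thinking and answer token counts from q before the request is sent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelPricing:
    """Listed prices in USD per one million tokens (hypothetical values)."""
    name: str
    input_price_per_m: float
    output_price_per_m: float


def actual_cost(pricing: ModelPricing, input_tokens: int,
                thinking_tokens: int, answer_tokens: int) -> float:
    """Cost of one request, assuming thinking tokens are billed as output."""
    billed_output = thinking_tokens + answer_tokens
    total = (pricing.input_price_per_m * input_tokens
             + pricing.output_price_per_m * billed_output)
    return total / 1_000_000


def cheapest_model(predicted_costs: dict[str, float]) -> str:
    """Return the model name with the lowest predicted cost."""
    return min(predicted_costs, key=predicted_costs.get)


# Hypothetical models: model_a has the lower listed prices.
model_a = ModelPricing("model_a", input_price_per_m=1.0, output_price_per_m=4.0)
model_b = ModelPricing("model_b", input_price_per_m=2.0, output_price_per_m=8.0)

# Hypothetical token counts for the same 1,000-token query:
# model_a thinks for far longer than model_b.
cost_a = actual_cost(model_a, input_tokens=1_000,
                     thinking_tokens=10_000, answer_tokens=500)
cost_b = actual_cost(model_b, input_tokens=1_000,
                     thinking_tokens=2_000, answer_tokens=500)

choice = cheapest_model({"model_a": cost_a, "model_b": cost_b})
```

With these made-up numbers, the model with the lower listed prices ends up costing more on this query because of its larger thinking-token count, so a selector that looks only at listed prices would pick the wrong model.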
References
We formalize actual cost prediction as an open problem and provide initial evidence that it is challenging due to high per-query cost variance (Section~\ref{sec:priceinverse:prediction}).
— The Price Reversal Phenomenon: When Cheaper Reasoning Models End Up Costing More
(2603.23971 - Chen et al., 25 Mar 2026) in Introduction, Contributions list (Section 1)