Cost estimates for non-OpenAI models in GDPval speed/cost analysis
Determine the model completion costs for Claude Opus 4.1, Gemini 2.5 Pro, and Grok 4 on the GDPval gold subset tasks, under the same evaluation setup used for the reported OpenAI models, so that these systems can be included in the paper’s speed and cost savings analyses alongside those models.
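As a rough illustration of what such an estimate involves, the sketch below computes a per-task completion cost from token usage. It assumes simple token-based billing with separate input and output prices, which the source does not specify. The names `Pricing`, `completion_cost`, and `mean_task_cost` are hypothetical, and all prices and token counts are placeholders, not real rates or measurements.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    """Hypothetical per-million-token prices in USD (placeholders, not real rates)."""
    input_per_mtok: float
    output_per_mtok: float


def completion_cost(input_tokens: int, output_tokens: int, pricing: Pricing) -> float:
    """Cost in USD of one completion, assuming simple token-based billing."""
    total = (input_tokens * pricing.input_per_mtok
             + output_tokens * pricing.output_per_mtok)
    return total / 1_000_000


def mean_task_cost(usages, pricing: Pricing) -> float:
    """Average cost over (input_tokens, output_tokens) pairs, one pair per task."""
    costs = [completion_cost(i, o, pricing) for i, o in usages]
    return sum(costs) / len(costs)


# Illustrative values only: neither the prices nor the token counts are measured.
placeholder = Pricing(input_per_mtok=2.0, output_per_mtok=10.0)
usages = [(10_000, 2_000), (30_000, 4_000)]
print(f"mean cost per task: ${mean_task_cost(usages, placeholder):.4f}")
```

A real estimate would replace the placeholders with each provider's published prices and the token counts logged when running the gold subset tasks, and would need to account for any billing details (for example, reasoning tokens or tool calls) that this sketch ignores.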
References
We were not able to obtain cost estimates for Claude, Gemini, and Grok.
— GDPval: Evaluating AI Model Performance on Real-World Economically Valuable Tasks
(2510.04374 - Patwardhan et al., 5 Oct 2025) in Footnote in Subsection 3.2 “Speed and cost comparison,” Section 3 (Experiments and Results)