Roo Code tests each frontier model against a suite of hundreds of exercises of varying difficulty across 5 programming languages. These results can help you find the right price-to-intelligence ratio for your use case.
Want to see the results for a model we haven't tested yet? Ping us in Discord.
| Model | Context Window | Pricing | Cost (USD) | Score |
| --- | --- | --- | --- | --- |
| Claude 3.7 Sonnet | 200K | $3.00 / $15.00 | $40.39 | 97% |
| Gemini 2.5 Pro Preview | 1M | $1.25 / $10.00 | $45.49 | 92% |
| GPT 4.1 | 1M | $2.00 / $8.00 | $41.52 | 91% |
| Claude 3.5 Sonnet | 200K | $3.00 / $15.00 | $34.07 | 90% |
| GPT 4.1 Mini | 1M | $0.40 / $1.60 | $9.42 | 81% |
| O3 Mini (High) | 200K | $1.10 / $4.40 | $24.55 | 81% |
| DeepSeek V3 | 64K | $0.27 / $1.10 | $12.20 | 73% |
Cost Versus Intelligence
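One rough way to compare cost against intelligence is to divide each model's score by its cost. The sketch below does this for the figures in the results table. The "points per dollar" metric and the function names are illustrative assumptions, not something Roo Code publishes, and the sketch assumes the Cost (USD) column is the total cost of one evaluation run.

```python
# Figures copied from the results table: (model, cost in USD, score in %).
# Assumption: "Cost (USD)" is the total cost of one evaluation run.
RESULTS = [
    ("Claude 3.7 Sonnet", 40.39, 97),
    ("Gemini 2.5 Pro Preview", 45.49, 92),
    ("GPT 4.1", 41.52, 91),
    ("Claude 3.5 Sonnet", 34.07, 90),
    ("GPT 4.1 Mini", 9.42, 81),
    ("O3 Mini (High)", 24.55, 81),
    ("DeepSeek V3", 12.20, 73),
]


def points_per_dollar(cost_usd, score_pct):
    """Hypothetical value metric: score percentage points per dollar spent."""
    return score_pct / cost_usd


def rank_by_value(results):
    """Return the results sorted from best to worst points per dollar."""
    return sorted(
        results,
        key=lambda row: points_per_dollar(row[1], row[2]),
        reverse=True,
    )


if __name__ == "__main__":
    for model, cost, score in rank_by_value(RESULTS):
        value = points_per_dollar(cost, score)
        print(f"{model:<24} {score:>3}%  ${cost:>6.2f}  {value:5.2f} points per dollar")
```

A single ratio like this favors cheap models, so it is best read alongside the raw score: a model that is far cheaper per point may still fail exercises that a higher-scoring model completes.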