Gemini 3.5 Flash benchmark

Benchmark snapshot across popular evaluation categories. Higher scores are better unless noted. Source status is tracked separately from SEO keyword demand.

Last updated:

Benchmark snapshot

Benchmark  | Gemini 3.5 Flash | Gemini 3.5 Pro | Kimi K2.6 | GPT-5 (API) | GPT-4o
MMLU-Pro   | 87.3             | 90.2           | 84.6      | 89.6        | 86.1
GPQA       | 71.4             | 76.2           | 68.5      | 76.3        | 68.2
HumanEval+ | 92.1             | 94.5           | 88.2      | 94.5        | 90.7
AIME 2024  | 66               | 72.1           | 61.8      | 72.1        | 63.4

Latency vs cost

Chart series: Gemini 3.5 Flash, Gemini 3.5 Pro, Kimi K2.6, GPT-5 (API), GPT-4o

This visual is an implementation placeholder for the launch chart. V1 keeps the data table crawlable and the methodology visible.

Methodology

ModelMeter stores benchmark snapshots with source URLs and capture dates. Watchlist models are clearly labeled when official confirmation is still required. Future refresh jobs run through Cloudflare Cron and Queues.
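
The storage and labeling rules described above can be sketched as a small data model. This is an illustrative sketch only: the type `BenchmarkSnapshot`, its field names, and the helpers `displayLabel` and `staleSnapshots` are hypothetical and are not taken from ModelMeter's actual schema. The staleness window is likewise an assumed parameter.

```typescript
// Hypothetical record shape; field names are assumptions for illustration.
type SourceStatus = "confirmed" | "watchlist";

interface BenchmarkSnapshot {
  model: string;
  benchmark: string;
  score: number;
  sourceUrl: string | null; // null while no official source has been captured
  capturedAt: string;       // capture date as an ISO 8601 string
  status: SourceStatus;
}

// Build the label shown next to a score, flagging watchlist models.
function displayLabel(s: BenchmarkSnapshot): string {
  return s.status === "watchlist"
    ? `${s.model} (watchlist: awaiting official confirmation)`
    : s.model;
}

// Select snapshots captured more than maxAgeDays before `now`.
// A scheduled refresh job could hand each of these to a queue for re-checking.
function staleSnapshots(
  snapshots: BenchmarkSnapshot[],
  now: Date,
  maxAgeDays: number,
): BenchmarkSnapshot[] {
  const cutoff = now.getTime() - maxAgeDays * 24 * 60 * 60 * 1000;
  return snapshots.filter((s) => Date.parse(s.capturedAt) < cutoff);
}

// Placeholder values only; not real benchmark data.
const example: BenchmarkSnapshot = {
  model: "Example Model",
  benchmark: "Example Benchmark",
  score: 50.0,
  sourceUrl: null,
  capturedAt: "2025-01-01T00:00:00Z",
  status: "watchlist",
};

console.log(displayLabel(example));
```

Keeping the selection logic as a pure function makes it easy to test outside the scheduler. In a Cron-triggered Worker, the returned records would presumably be sent to a Queues binding one message at a time, but that wiring is left out here because it depends on the deployment.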
