The only generative image benchmark that shows the images
17 models, 192 prompts, 6 categories — every output published. Judge with your own eyes which model is best for your use case, your budget, your quality bar.

Text Rendering › Typography Style › Easy
Model: fal/google/nano-banana-2
Prompt: The word 'CHAPTER ONE' typed on aged paper with a vintage typewriter font, complete with slightly uneven ink
V1 Leaderboard
192 prompts, 6 categories, graded pass/fail by VLM judges.
| # | Model | Pass Rate | Pass / Fail | Avg Latency |
|---|---|---|---|---|
| 1 | fal/google/nano-banana-2 | 95.3% | 183/9 | 28.1s |
| 2 | openai/gpt-image-2 | 95.3% | 183/9 | 45.3s |
| 3 | fal/google/nano-banana-pro | 91.1% | 175/17 | 23.4s |
| 4 | bfl/flux-2-max | 90.6% | 174/18 | 26.7s |
| 5 | fal/bytedance/seedream-v4 | 84.4% | 162/30 | 14.1s |
| 6 | bfl/flux-2-pro | 82.8% | 159/33 | 11.8s |
| 7 | fal/ideogram/v4 | 82.3% | 158/34 | 16.6s |
| 8 | bfl/flux-2-klein-9b | 78.6% | 151/41 | 4.1s |
| 9 | local/z-image-6b | 75.5% | 145/47 | 130.7s |
| 10 | local/z-image-turbo-6b | 74.5% | 143/49 | 18.1s |
| 11 | bfl/flux-2-klein-4b | 72.4% | 139/53 | 3.8s |
| 12 | local/qwen-image-2512-20b | 69.3% | 133/59 | 80.2s |
| 13 | local/bonsai-image-ternary-4b | 68.2% | 131/61 | 4.1s |
| 14 | fal/ideogram/v3 | 68.2% | 131/61 | 12.9s |
| 15 | local/nucleus-image-17b-a2b | 64.1% | 123/69 | 39.1s |
| 16 | local/hidream-i1-full-17b | 56.8% | 109/83 | 91.3s |
| 17 | local/sana-1.5-1.6b | 51.0% | 98/94 | 11.1s |
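The Pass Rate column follows directly from the Pass / Fail counts: passes divided by the 192 prompts. The sketch below recomputes it for a few rows copied from the table and sorts them. The tie-break on lower average latency is an assumption inferred from the order of the tied rows, not a documented rule, and `pass_rate` is a hypothetical helper name.

```python
# Rows copied from the leaderboard: (model, passed, failed, avg_latency_s).
rows = [
    ("local/sana-1.5-1.6b", 98, 94, 11.1),
    ("openai/gpt-image-2", 183, 9, 45.3),
    ("bfl/flux-2-klein-9b", 151, 41, 4.1),
    ("fal/google/nano-banana-2", 183, 9, 28.1),
]


def pass_rate(passed: int, failed: int) -> float:
    """Percentage of prompts passed, rounded to one decimal place."""
    return round(100 * passed / (passed + failed), 1)


# Sort by pass rate, highest first.
# Assumption: ties are broken by lower average latency.
ranked = sorted(rows, key=lambda r: (-pass_rate(r[1], r[2]), r[3]))

for rank, (model, passed, failed, latency) in enumerate(ranked, start=1):
    rate = pass_rate(passed, failed)
    print(f"{rank}. {model}: {rate}% ({passed}/{failed}), {latency}s")
```

Sorting on latency alone, or filtering by a minimum pass rate first, gives a different ranking for readers who care more about speed or cost than about the top score.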
What we evaluate
Each model is tested across 6 categories with 192 prompts spanning easy to extreme difficulty.
See how every model performs
Compare models side-by-side with our interactive benchmark explorer.
Explore ImageBench V1