ImageBench

vs

192 evaluations across 6 categories

Pass Rate: 92% vs 95%
Avg Latency: 26.7s vs 28.1s
Text Rendering: 100% vs 87%
Spatial Reasoning: 93% vs 97%
Human realism: 100% vs 98%
Truthfulness: 89% vs 93%
Professional Studio: 89% vs 100%
Graphical design: 75% vs 92%
Latency: 3% vs 2%
Text Rendering | Typography Style | Easy | bfl/flux-2-max

Prompt: The word 'CHAPTER ONE' typed on aged paper with a vintage typewriter font, complete with slightly uneven ink

Text Rendering 100% vs 87%

Typography Style 100% vs 100%
Easy 100% vs 100%
bfl/flux-2-max: PASS
The word 'CHAPTER ONE' typed on aged paper with a vintage typewriter font, complete with slightly uneven ink

Evaluator (Qwen 3 VL)

The text “CHAPTER ONE” is spelled correctly. The font style mimics a vintage typewriter with blocky, uniform characters and slight imperfections in stroke consistency, which aligns with the description of “vintage typewriter font.” The paper appears aged, with visible texture, slight discoloration, and creases, matching “aged paper.” The ink impression is not perfectly even — some letters show slight variations in saturation and alignment, suggesting a mechanical typewriter’s ink transfer, which

bfl/flux-2-max: PASS
The phrase 'TODAY'S SPECIAL' written in white chalk on a dark green chalkboard with chalk dust

Evaluator (Qwen 3 VL)

The text “TODAY’S SPECIAL” is rendered in a style that convincingly mimics white chalk on a dark green chalkboard. The lettering shows the characteristic uneven, textured strokes and smudged edges of chalk writing, with visible chalk dust scattered at the bottom edge and faint smudges on the board surface. The spelling is correct, including the apostrophe in “TODAY’S”. VERDICT: PASS
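The complete evaluator responses on this page end with a marker of the form VERDICT: PASS or VERDICT: FAIL. A minimal Python sketch of how such a marker could be pulled out of the free-text response is shown here. The function name `extract_verdict` and the regular expression are illustrative assumptions, not part of ImageBench, and responses that are cut off before the marker yield no verdict.

```python
import re
from typing import Optional

# Hypothetical pattern: matches the "VERDICT: PASS" / "VERDICT: FAIL" marker
# that closes the evaluator responses quoted on this page.
VERDICT_RE = re.compile(r"VERDICT:\s*(PASS|FAIL)\b")


def extract_verdict(evaluator_text: str) -> Optional[str]:
    """Return "PASS" or "FAIL", or None if the text carries no verdict marker."""
    matches = VERDICT_RE.findall(evaluator_text)
    # Take the last marker in case the evaluator restates itself mid-response.
    return matches[-1] if matches else None


response = "The spelling is correct, including the apostrophe. VERDICT: PASS"
print(extract_verdict(response))
```

Taking the last match rather than the first is a design choice for responses where the evaluator revises its reasoning before settling on a final answer.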

bfl/flux-2-max: PASS
The word 'OPEN' displayed on a red LED sign with glowing pixel-style letters

Evaluator (Qwen 3 VL)

The text "OPEN" is rendered in a glowing, pixel-style format consistent with an LED display. The letters are composed of illuminated blocks, matching the described "glowing pixel-style letters" on a red LED sign. The color is red, and the lighting effect, including the reflection on the surface below, reinforces the LED aesthetic. The spelling is correct. VERDICT: PASS

Writing accuracy 100% vs 83%
Easy 100% vs 100%
bfl/flux-2-max: PASS
A single large word 'HELLO' centered on a white background

Evaluator (Qwen 3 VL)

The intended text was "HELLO", but the image displays "H E L L O" — the letters are correctly formed and readable. However, the fourth letter is an "L", and the fifth letter is an "O", which is correct. The word is spelled "HELLO", but the "L" in the fourth position is visually distorted — it appears as a rounded, almost circular shape, which may be misread as an "O" or a stylized "L". However, upon close inspection, it is clearly an "L" — the vertical stem and horizontal crossbar are present, e

bfl/flux-2-max: PASS
The word 'ADVENTURE' in big bold letters centered on a plain cream background

Evaluator (Qwen 3 VL)

The word "ADVENTURE" is spelled correctly and is fully readable. All letters are clearly visible and properly formed. There are no misspellings, missing words, or illegible parts. VERDICT: PASS

bfl/flux-2-max: PASS
The word 'SERENITY' written large and centered on a solid light blue background

Evaluator (Qwen 3 VL)

The word displayed is "SERENITY", which is spelled correctly. The text is large, centered, and fully readable against the solid light blue background. There are no misspellings, missing words, or illegible parts. VERDICT: PASS

Medium 100% vs 100%
bfl/flux-2-max: PASS
The sentence 'The quick brown fox jumps high' in clean black text on a white background

Evaluator (Qwen 3 VL)

All words in the sentence "The quick brown fox jumps high" are spelled correctly and fully readable. There are no misspellings, missing words, or illegible parts. VERDICT: PASS

bfl/flux-2-max: PASS
'Every sunset brings a new dawn ahead' written neatly on a pale yellow background

Evaluator (Qwen 3 VL)

The text in the image reads: "Every sunset brings a new dawn ahead" This matches the intended text exactly: "'Every sunset brings a new dawn ahead'" All words are spelled correctly. All words are fully readable. There are no misspellings, missing words, or illegible parts. VERDICT: PASS

bfl/flux-2-max: PASS
'Stars shine brightest in the dark night' displayed on a clean off-white background

Evaluator (Qwen 3 VL)

The text in the image reads: "Stars shine brightest in the dark night" This matches the intended text exactly: "'Stars shine brightest in the dark night'" All words are spelled correctly, fully present, and clearly readable. There are no misspellings, missing words, or illegible parts. VERDICT: PASS

Hard 100% vs 33%
bfl/flux-2-max: PASS
A poster with headline 'SUMMER SALE' in large bold text, body text 'Up to 50% off all items this weekend only' in medium size, and fine print 'Terms and conditions apply. See store for details.' at the bottom in small text

Evaluator (Qwen 3 VL)

All text in the image matches the intended text exactly: - Headline: "SUMMER SALE" — correct and fully readable. - Body text: "Up to 50% off all items this weekend only" — correct and fully readable. - Fine print: "Terms and conditions apply. See store for details." — correct and fully readable. No misspellings, missing words, or illegible parts are present. VERDICT: PASS

bfl/flux-2-max: PASS
A flyer with large title 'GRAND OPENING', subtitle 'Join us for food, music, and fun' in medium text, and small disclaimer 'RSVP required. Limited seating available.' at the bottom

Evaluator (Qwen 3 VL)

All text in the image matches the intended text exactly: - Main title: “GRAND OPENING” — spelled correctly and fully readable. - Subtitle: “Join us for food, music, and fun” — spelled correctly and fully readable. - Disclaimer: “RSVP required. Limited seating available.” — spelled correctly and fully readable. No misspellings, missing words, or illegible parts are present. VERDICT: PASS

bfl/flux-2-max: PASS
An event banner with headline 'ANNUAL GALA 2025' at the top in large serif font, description 'An evening of fine dining and live jazz' in medium text, and footnote 'Black tie optional. Doors open at 6 PM.' in small text

Evaluator (Qwen 3 VL)

All text in the image matches the intended text exactly: - Headline: "ANNUAL GALA 2025" — spelled correctly, in large serif font. - Description: "An evening of fine dining and live jazz" — spelled correctly, in medium text. - Footnote: "Black tie optional. Doors open at 6 PM." — spelled correctly, in small text. No misspellings, missing words, or illegible parts are present. The text is fully readable and accurate. VERDICT: PASS

Extreme 100% vs 100%
bfl/flux-2-max: PASS
A vintage movie poster for 'THE LAST VOYAGE' featuring a tagline 'One ship. One chance. No turning back.' with credits block reading 'Directed by James Harlow. Starring Elena Voss, Marcus Chen, and Sofia Reyes. Music by Thomas Grant. A Paramount Pict

Evaluator (Qwen 3 VL)

All text in the image matches the intended text exactly: - Title: “THE LAST VOYAGE” — correct. - Tagline: “One ship. One chance. No turning back.” — correct. - Credits block: - “Directed by James Harlow” — correct. - “Starring Elena Voss, Marcus Chen, and Sofia Reyes” — correct. - “Music by Thomas Grant. A Paramount Pictures Release.” — correct. - “Coming Summer 2025” — correct. All words are spelled correctly, fully present, and clearly readable. No misspellings, missing words, or ill

bfl/flux-2-max: PASS
A magazine cover titled 'NATIONAL EXPLORER' with cover lines 'The Hidden Temples of Cambodia — page 34', 'Interview: Dr. Maria Santos on Climate Action', 'Top 10 Trails You Must Hike This Year', and a barcode area with 'Issue 247 | March 2025 | $6.99

Evaluator (Qwen 3 VL)

All text is spelled correctly and fully readable. - "NATIONAL EXPLORER" — correct. - "The Hidden Temples of Cambodia — page 34" — correct. - "Interview: Dr. Maria Santos on Climate Action" — correct. - "Top 10 Trails You Must Hike This Year" — correct. - Barcode area: "Issue 247 | March 2025 | $6.99" — correct. No misspellings, missing words, or illegible parts. VERDICT: PASS

bfl/flux-2-max: PASS
A product label for 'GOLDEN HARVEST ORGANIC HONEY' with subtitle 'Raw & Unfiltered', weight '12 oz (340g)', description 'Pure wildflower honey harvested from sustainable apiaries in the Pacific Northwest. No additives or preservatives.', and small te

Evaluator (Qwen 3 VL)

All text on the label matches the intended text exactly. - "GOLDEN HARVEST ORGANIC HONEY" — correct. - "Raw & Unfiltered" — correct. - "12 oz (340g)" — correct. - "Pure wildflower honey harvested from sustainable apiaries in the Pacific Northwest. No additives or preservatives." — correct. - "Best before: 06/2026. Store in cool, dry place. Product of USA." — correct. No misspellings, missing words, or illegible parts are present. VERDICT: PASS

Spatial Reasoning 93% vs 97%

Attributes Binding 100% vs 100%
Easy 100% vs 100%
bfl/flux-2-max: PASS
A red ball and a blue cube sitting side by side on a white surface

Evaluator (Qwen 3.5 122B)

Object 1: Red ball - Color: Red ✅ - Shape: Sphere (ball) ✅ - Size: Not specified in prompt, but consistent with cube (relative size is plausible) ✅ Object 2: Blue cube - Color: Blue ✅ - Shape: Cube ✅ - Size: Not specified in prompt, but consistent with ball (relative size is plausible) ✅ All attributes are correctly bound to the right object with no swaps. VERDICT: PASS

bfl/flux-2-max: PASS
A green apple and an orange pumpkin on a wooden cutting board

Evaluator (Qwen 3.5 122B)

Object 1: Green apple - Color: Green ✅ - Shape: Round with a slight indentation at the top ✅ - Size: Small to medium ✅ Object 2: Orange pumpkin - Color: Orange ✅ - Shape: Round with vertical ridges ✅ - Size: Medium ✅ All attributes are correctly bound to the right object with no swaps. VERDICT: PASS

bfl/flux-2-max: PASS
A yellow taxi and a black limousine parked next to each other on a street

Evaluator (Qwen 3.5 122B)

Object 1: Yellow taxi - Color: Yellow (correct) - Shape: Standard sedan shape with taxi roof sign (correct) - Size: Medium-sized, typical taxi dimensions (correct) Object 2: Black limousine - Color: Black (correct) - Shape: Extended-length vehicle with elongated body (correct) - Size: Larger than the taxi, consistent with limousine dimensions (correct) All attributes are correctly bound to the right object with no swaps. VERDICT: PASS

Medium 100% vs 100%
bfl/flux-2-max: PASS
A large red circle, a small blue triangle, and a medium green star arranged on a white canvas

Evaluator (Qwen 3.5 122B)

Object 1: Large red circle — Observed: Red color, circular shape, large size. ✅ Correct. Object 2: Small blue triangle — Observed: Blue color, triangular shape, small size. ✅ Correct. Object 3: Medium green star — Observed: Green color, star shape, medium size. ✅ Correct. All attributes (color, size, shape) are correctly bound to the right object with no swaps. VERDICT: PASS

bfl/flux-2-max: PASS
A tall thin white candle, a short fat red candle, and a medium spiral yellow candle on a shelf

Evaluator (Qwen 3.5 122B)

Object 1 (left): Tall, thin, white candle — matches the description. Object 2 (middle): Short, fat, red candle — matches the description. Object 3 (right): Medium, spiral, yellow candle — matches the description. All three objects have their specified color, size, and shape correctly assigned without any swaps. VERDICT: PASS

bfl/flux-2-max: PASS
A big brown horse, a small white rabbit, and a medium-sized black cat standing in a meadow

Evaluator (Qwen 3.5 122B)

Object: Big brown horse - Color: Brown (correct) - Size: Big (correct) - Shape: Horse (correct) Object: Small white rabbit - Color: White (correct) - Size: Small (correct) - Shape: Rabbit (correct) Object: Medium-sized black cat - Color: Black (correct) - Size: Medium-sized (correct) - Shape: Cat (correct) All attributes are correctly bound to the right object with no swaps. VERDICT: PASS

Hard 100% vs 100%
bfl/flux-2-max: PASS
A pink fire truck, a blue banana, and a green sun in a surreal landscape

Evaluator (Qwen 3.5 122B)

Object: Pink fire truck - Color: Pink (main body), white (trim), red (lights) - Shape: Standard fire truck shape with cab, ladder, hoses, etc. - Size: Realistic scale for a vehicle Object: Blue banana - Color: Blue (shades of blue with gradient) - Shape: Curved banana shape, with stem at top - Size: Giant, surreal scale — much larger than life Object: Green sun - Color: Green (soft, glowing green) - Shape: Circular, full sun shape - Size: Large, dominating the sky All three objects are correc

bfl/flux-2-max: PASS
A purple carrot, a red tree with red leaves, and a white watermelon sliced open

Evaluator (Qwen 3.5 122B)

Object: Purple carrot Observed attributes: Color = purple, Shape = conical/elongated, Size = large (relative to other objects) → Matches prompt: "A purple carrot" — PASS Object: Red tree Observed attributes: Color = red (leaves and trunk), Shape = tree with branches and leaves, Size = medium (in pot) → Matches prompt: "A red tree with red leaves" — PASS Object: White watermelon sliced open Observed attributes: Color = white flesh (inside), Shape = round (sliced), Size = medium (slice

bfl/flux-2-max: PASS
An orange elephant, a yellow penguin, and a silver strawberry arranged on a table

Evaluator (Qwen 3.5 122B)

Object: Orange elephant — Observed: Orange color, elephant shape, medium size. Object: Yellow penguin — Observed: Yellow color, penguin shape, medium size. Object: Silver strawberry — Observed: Silver color, strawberry shape, medium size. All objects match their specified color and shape. No attribute swaps are present. VERDICT: PASS

Compositionality 100% vs 100%
Easy 100% vs 100%
bfl/flux-2-max: PASS
A tabby cat wearing oversized red sunglasses

Evaluator (Qwen 3.5 122B)

Element 1: Tabby cat — Present. The cat has distinct tabby markings (stripes and swirls) on its fur. Element 2: Oversized red sunglasses — Present. The cat is wearing large, bright red-framed sunglasses that cover most of its face. Element 3: Coherent combination — Present. The sunglasses are placed on the cat’s face in a way that is visually plausible and stylistically humorous, fitting the prompt’s intent. All described elements are present and coherently combined. VERDICT: PASS

bfl/flux-2-max: PASS
A golden retriever wearing a tiny cowboy hat

Evaluator (Qwen 3.5 122B)

Element: Golden retriever — Present. Element: Tiny cowboy hat — Present. (The hat is proportionally small relative to the dog’s head, fitting the “tiny” descriptor.) Both elements are clearly and coherently combined in the image. The dog is wearing the cowboy hat, and the scene is consistent with the prompt’s unusual concept. VERDICT: PASS

bfl/flux-2-max: PASS
A penguin holding a colorful umbrella

Evaluator (Qwen 3.5 122B)

Element: Penguin — Present. Element: Colorful umbrella — Present (rainbow-colored segments: red, yellow, green, blue). Element: Penguin holding the umbrella — Present (penguin is gripping the umbrella handle with both flippers). Element: Coherent combination — Yes, the penguin and umbrella are logically combined in a whimsical, plausible scene (though biologically odd, the composition is coherent and visually consistent). VERDICT: PASS

Medium 100% vs 100%
bfl/flux-2-max: PASS
A medieval knight in full armor sitting at a desk typing on a modern laptop

Evaluator (Qwen 3.5 122B)

Element: Medieval knight in full armor → Present (visible full suit of plate armor, including helmet, gauntlets, and chainmail undergarment). Element: Sitting at a desk → Present (knight is seated on a modern office chair at a wooden desk). Element: Typing on a modern laptop → Present (hands are positioned over the laptop keyboard, actively typing). Element: Unusual conceptual combination → Coherently combined (the knight’s armor is seamlessly integrated into the modern office setting, with no v

bfl/flux-2-max: PASS
A Roman gladiator taking a selfie with a smartphone in the Colosseum

Evaluator (Qwen 3.5 122B)

Element 1: Roman gladiator — Present. The figure is wearing traditional Roman gladiator armor, including a helmet with a red plume, metal cuirass, red tunic, and leg guards. Element 2: Taking a selfie — Present. The gladiator is holding a smartphone in one hand, raised to eye level, consistent with taking a selfie. Element 3: Smartphone — Present. A modern black smartphone is clearly visible in the gladiator’s hand. Element 4: Colosseum — Present. The iconic structure of the Roman Colosseum f

bfl/flux-2-max: PASS
An astronaut in a spacesuit riding a bicycle through a park

Evaluator (Qwen 3.5 122B)

Element: Astronaut in a spacesuit — Present. Element: Riding a bicycle — Present. Element: Through a park — Present (trees, benches, pathway, pond, grass, people walking, dogs). All described elements are present and coherently combined in the image. The scene successfully merges the unusual concept of an astronaut riding a bicycle in a terrestrial park setting, with no visual inconsistencies. VERDICT: PASS

Hard 100% vs 100%
bfl/flux-2-max: PASS
An octopus wearing a top hat and monocle playing chess on the surface of the moon with Earth visible in the background

Evaluator (Qwen 3.5 122B)

Element 1: Octopus — Present. A large, detailed octopus is the central figure in the image. Element 2: Top hat — Present. The octopus is wearing a black top hat with a silver band. Element 3: Monocle — Present. A gold monocle is attached to the octopus’s right eye via a chain. Element 4: Playing chess — Present. The octopus is seated at a chessboard with pieces arranged for play; one tentacle is reaching to move a piece. Element 5: Surface of the moon — Present. The setting is a gray, rocky, cra

bfl/flux-2-max: PASS
A dinosaur in a business suit giving a PowerPoint presentation in a modern office while drinking coffee

Evaluator (Qwen 3.5 122B)

Element 1: Dinosaur — Present. A Tyrannosaurus rex is the central figure. Element 2: Business suit — Present. The dinosaur is wearing a dark suit, white shirt, and tie. Element 3: PowerPoint presentation — Present. A large screen behind the dinosaur displays a bar chart titled “Quarterly Performance.” Element 4: Modern office — Present. The setting includes a conference table, laptops, glass walls, and city views. Element 5: Drinking coffee — Present. The dinosaur is holding a white coffee mug i

bfl/flux-2-max: PASS
A robot dressed as a pirate sailing a paper boat on a lake of lava while juggling three kittens

Evaluator (Qwen 3.5 122B)

Element 1: Robot dressed as a pirate — Present. The central figure is a robot with a pirate hat, eye patch, and red cape, clearly styled as a pirate. Element 2: Sailing a paper boat — Present. The robot is standing inside a white, folded paper boat that is floating on the lava. Element 3: Lake of lava — Present. The boat is on a surface of flowing, glowing orange lava, with volcanic rock formations in the background. Element 4: Juggling three kittens — Present. Three kittens (one orange, one

Counting 67% vs 100%
Easy 67% vs 100%
bfl/flux-2-max: PASS
Exactly 3 red apples arranged on a wooden table

Evaluator (Qwen 3.5 122B)

I count exactly 3 red apples in the image. They are arranged on a wooden table, as requested. The apples are red, glossy, and have water droplets on them. The wooden table surface is clearly visible beneath them. The count of red apples (3) exactly matches the request. VERDICT: PASS

bfl/flux-2-max: FAIL
Exactly 5 yellow rubber ducks sitting in a row on a bathroom shelf

Evaluator (Qwen 3.5 122B)

I count exactly 4 yellow rubber ducks sitting in a row on a white bathroom shelf. The prompt requested exactly 5. Since the number of ducks (4) does not match the requested number (5), the count does not exactly match what was asked. VERDICT: FAIL

bfl/flux-2-max: PASS
Exactly 4 white candles standing on a dark surface

Evaluator (Qwen 3.5 122B)

I see exactly 4 white candles standing on a dark surface. Each candle is white, cylindrical, and has a wick at the top. The surface beneath them is dark and reflective, consistent with the prompt. The count exactly matches what was requested: 4 white candles on a dark surface. VERDICT: PASS
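The three Easy counting prompts for bfl/flux-2-max came back PASS, FAIL, PASS, which is consistent with the 67% shown for this tier. The arithmetic can be sketched as follows, assuming each prompt counts equally and the result is rounded to a whole percent. The helper name `pass_rate` is hypothetical and not taken from ImageBench.

```python
from typing import Sequence


def pass_rate(verdicts: Sequence[str]) -> int:
    """Percentage of "PASS" verdicts, rounded to the nearest whole percent."""
    if not verdicts:
        raise ValueError("at least one verdict is required")
    passed = sum(1 for v in verdicts if v == "PASS")
    return round(100 * passed / len(verdicts))


# Verdicts for the three Easy counting prompts listed above.
counting_easy = ["PASS", "FAIL", "PASS"]
print(pass_rate(counting_easy))  # 67
```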

Medium 67% vs 100%
bfl/flux-2-max: PASS
2 orange cats and 3 black dogs sitting together on a grassy lawn

Evaluator (Qwen 3.5 122B)

I see 2 orange cats and 3 black dogs sitting together on a grassy lawn. - Orange cats: 2 (one on the left, one on the right) - Black dogs: 3 (one on the far left, one in the center, one on the far right) The count exactly matches what was requested. VERDICT: PASS

bfl/flux-2-max: PASS
4 red roses and 2 white daisies in a glass vase on a table

Evaluator (Qwen 3.5 122B)

Counting the objects in the image: - Red roses: There are 4 distinct red roses visible. - White daisies: There are 2 distinct white daisies visible. The image shows a glass vase on a wooden table, as requested. The count exactly matches the prompt: 4 red roses and 2 white daisies. VERDICT: PASS

bfl/flux-2-max: FAIL
1 blue butterfly and 3 yellow butterflies resting on a garden fence

Evaluator (Qwen 3.5 122B)

I count: - 1 blue butterfly (centered on the middle fence post) - 4 yellow butterflies (one on the leftmost post, one on the middle post to the left of the blue butterfly, one on the middle post to the right of the blue butterfly, and one on the rightmost post) The prompt requested: "1 blue butterfly and 3 yellow butterflies". The count does not exactly match — there are 4 yellow butterflies, not 3. VERDICT: FAIL

Hard 67% vs 100%
bfl/flux-2-max: PASS
Exactly 9 goldfish swimming in a large round fishbowl

Evaluator (Qwen 3.5 122B)

I count exactly 9 goldfish in the image, all swimming inside a large, round glass fishbowl. The fish are clearly visible and distinct, and the bowl’s shape and water are consistent with the description. The prompt requested: “Exactly 9 goldfish swimming in a large round fishbowl.” The image matches this request perfectly in both quantity and context. VERDICT: PASS

bfl/flux-2-max: PASS
A fruit bowl containing exactly 3 bananas, 4 oranges, and 2 apples

Evaluator (Qwen 3.5 122B)

Counting the objects in the image: - Bananas: There is a bunch of bananas. Visually, there are clearly 3 bananas connected at the stem. - Oranges: There are 4 oranges visible — one is partially obscured, but all are countable. - Apples: There are 2 apples — one red and one green. The requested count was: 3 bananas, 4 oranges, and 2 apples. The image exactly matches this count. VERDICT: PASS

bfl/flux-2-max: FAIL
Exactly 11 colorful balloons floating against a clear blue sky

Evaluator (Qwen 3.5 122B)

Counting the balloons in the image: From left to right: 1. Red 2. Yellow 3. Blue (with a yellow stripe) 4. Orange 5. Orange (second orange) 6. Green 7. Pink 8. Gold 9. Silver 10. Teal (between green and pink) 11. Another orange (partially obscured behind the teal balloon) Actually, let’s recount carefully to avoid miscounting: Left to right: 1. Red 2. Yellow 3. Blue 4. Orange 5. Orange (second orange) 6. Green 7. Teal 8. Pink 9. Gold 10. Silver That’s only 10. But wait — there’s a second ora

Negation 89% vs 89%
Easy 100% vs 100%
bfl/flux-2-max: PASS
A tropical beach scene with no people

Evaluator (Qwen 3.5 122B)

The image depicts a tropical beach scene with white sand, turquoise water, palm trees, and distant islands under a blue sky with clouds. The foreground includes seashells and a piece of driftwood. A careful scan of the entire image reveals no human figures, either on the beach, in the water, or in the background vegetation. The prompt explicitly excluded “people,” and no people are visible in the image. VERDICT: PASS

bfl/flux-2-max: PASS
A cat lying on a couch without a collar

Evaluator (Qwen 3.5 122B)

The image shows a tabby cat lying on a couch, resting on a cushion. A close inspection of the cat’s neck and shoulders reveals no collar of any kind — it is collar-free. The prompt explicitly excluded “a collar,” and this element is indeed absent. VERDICT: PASS

bfl/flux-2-max: PASS
A classroom with desks and a chalkboard but no students

Evaluator (Qwen 3.5 122B)

The prompt explicitly excluded "students" from the classroom scene. I have carefully examined the image and can confirm that there are no students present in the room. The classroom contains desks, chairs, a chalkboard with writing, a clock, posters, and a teacher’s desk — but no people. VERDICT: PASS

Medium 100% vs 100%
bfl/flux-2-max: PASS
A sandy beach with palm trees, seashells, and a beach umbrella but no water or ocean visible

Evaluator (Qwen 3.5 122B)

The prompt explicitly excluded "water or ocean" from the image. In the provided image, there is no visible water or ocean. The scene shows only sand stretching to the horizon, with palm trees, a beach umbrella, and seashells — all elements that are permitted. The horizon line is flat and shows no indication of water, waves, or shoreline transition to water. The sky is visible above, but no water is present. VERDICT: PASS

bfl/flux-2-max: PASS
An office workspace with a desk, chair, bookshelf, and lamp but no computers or screens

Evaluator (Qwen 3.5 122B)

The image shows an office workspace with a desk, chair, bookshelf, and lamp — all elements explicitly included in the prompt. The prompt specifically excluded “computers or screens.” Upon careful inspection, there are no computers, monitors, tablets, or any other electronic display devices visible on the desk, on the bookshelf, or anywhere else in the scene. The desk contains only papers, a notebook, a mug, a pencil holder, and small potted plants. The lamp is a floor lamp, not a computer monito

bfl/flux-2-max: PASS
A kitchen with pots, pans, a stove, and a cutting board but no food

Evaluator (Qwen 3.5 122B)

The image shows a kitchen with pots and pans (on the stove and hanging above the sink), a stove (with blue flames visible), and a cutting board (on the countertop). There is no food visible anywhere in the image. VERDICT: PASS

Hard 67% vs 67%
bfl/flux-2-max: PASS
A pepperoni pizza with tomato sauce, basil, and crust but absolutely no cheese

Evaluator (Qwen 3.5 122B)

The image shows a pizza with tomato sauce, pepperoni, and fresh basil leaves on top. The crust is visible and appears golden-brown with charred spots. There is no visible cheese on the pizza. The prompt explicitly excluded cheese, and the image does not contain any cheese. VERDICT: PASS

bfl/flux-2-max: PASS
A car body with windows, doors, and headlights but no wheels

Evaluator (Qwen 3.5 122B)

The image shows a silver car body with visible windows, doors, and headlights. However, the wheels are completely absent — the wheel wells are empty and show only the inner structure of the fenders. This matches the prompt’s requirement: “A car body with windows, doors, and headlights but no wheels.” VERDICT: PASS

bfl/flux-2-max: FAIL
A human face with eyes, nose, and mouth but no eyebrows

Evaluator (Qwen 3.5 122B)

The image shows a close-up of a human face with clearly visible eyes, nose, and mouth. Above each eye, there are distinct, well-defined eyebrows. The prompt explicitly excluded "a human face with eyes, nose, and mouth but no eyebrows," meaning the eyebrows should be absent. Since eyebrows are clearly present in the image, the excluded element is not absent. VERDICT: FAIL
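For bfl/flux-2-max, the Negation tiers above come to 3 of 3 (Easy), 3 of 3 (Medium), and 2 of 3 (Hard), or 8 of 9 prompts overall, which rounds to the 89% shown for the subcategory. The page does not state how it aggregates tiers, so the sketch below assumes the simplest reading: pool all prompts and divide. The name `pooled_pass_rate` is illustrative only.

```python
from typing import Dict, Tuple


def pooled_pass_rate(tiers: Dict[str, Tuple[int, int]]) -> int:
    """Pool (passed, total) counts across tiers and return a whole percent."""
    passed = sum(p for p, _ in tiers.values())
    total = sum(t for _, t in tiers.values())
    if total == 0:
        raise ValueError("no prompts to aggregate")
    return round(100 * passed / total)


# (passed, total) per difficulty tier, read off the Negation verdicts above.
negation = {"Easy": (3, 3), "Medium": (3, 3), "Hard": (2, 3)}
print(pooled_pass_rate(negation))  # 89
```

Because every tier here has the same number of prompts, averaging the three tier percentages would give the same figure; the two approaches only diverge when tier sizes differ.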

Relative Position 100% vs 92%
Easy 100% vs 100%
bfl/flux-2-max: PASS
A cat sitting on top of a cardboard box

Evaluator (Qwen 3.5 122B)

The prompt specified: “A cat sitting on top of a cardboard box.” In the image: - The cat is clearly positioned on the top surface of the cardboard box. - The cat’s body is resting on the lid of the box, with its paws hanging over the edge — consistent with “sitting on top of.” - The box is on the floor, and the cat is above it — fulfilling “on top of.” - No other positional relationships (left/right/above/below) are specified in the prompt, so there is no requirement to verify those. The core

bfl/flux-2-max: PASS
A red ball underneath a wooden chair

Evaluator (Qwen 3.5 122B)

The prompt specifies: “A red ball underneath a wooden chair.” In the image: - There is a wooden chair with four legs. - A red ball is positioned directly beneath the chair, between the front legs, and clearly under the seat. - The spatial relationship “underneath” is accurately depicted — the ball is located below the chair’s seat and legs, as expected. The prompt only specifies one spatial relationship: “underneath.” No other positional relationships (left, right, above, below) are mentioned

bfl/flux-2-max: PASS
A coffee mug next to an open book on a desk

Evaluator (Qwen 3.5 122B)

The prompt specifies: “A coffee mug next to an open book on a desk.” In the image: - Both the coffee mug and the open book are placed on the wooden desk — satisfying “on a desk.” - The coffee mug is positioned to the left of the open book, and the book is to the right of the mug — satisfying “next to” (which implies adjacency, and here it is clearly left/right adjacency). - No objects are placed “under” or “above” each other in a way that contradicts the prompt — both items are on the same surf

Medium 100% vs 100%
A bird perched on a branch, the branch extending from a tree next to a park bench
bfl/flux-2-max: PASS

Evaluator (Qwen 3.5 122B)

The bird is perched on a branch — this spatial relationship is correctly depicted. The branch extends from a tree — visually, the branch clearly originates from the tree trunk on the left side of the image. The tree is next to a park bench — the bench is visible in the background to the right of the tree, and the spatial relationship “next to” is accurate, even if not explicitly “left/right” or “above/below” in a strict sense. The prompt asks for positional relationships such as “on,” “under,

A cat sitting on a pillow, the pillow placed on a wooden chair
bfl/flux-2-max: PASS

Evaluator (Qwen 3.5 122B)

The prompt specifies: "A cat sitting on a pillow, the pillow placed on a wooden chair." In the image: - The cat is sitting directly on top of a pillow → satisfies "cat sitting on a pillow". - The pillow is resting on the seat of a wooden chair → satisfies "pillow placed on a wooden chair". All spatial relationships described in the prompt are visually accurate: - “on” is correctly used for both the cat-on-pillow and pillow-on-chair relationships. No other positional relationships (left/right/

A candle on top of a stack of books, the books next to a potted plant
bfl/flux-2-max: PASS

Evaluator (Qwen 3.5 122B)

The prompt specifies: "A candle on top of a stack of books, the books next to a potted plant." - The candle is indeed placed on top of the stack of books — this spatial relationship ("on top of") is correct. - The stack of books is positioned next to the potted plant — this spatial relationship ("next to") is also correct. All described positional relationships are accurately depicted in the image. VERDICT: PASS

Hard 100% vs 67%
Five chess pieces on a board: a king in the center, a queen to its left, a bishop to its right, a rook above the king, and a knight below the king
bfl/flux-2-max: PASS

Evaluator (Qwen 3.5 122B)

The image shows five white chess pieces arranged on a wooden chessboard. From the viewer’s perspective: - The **king** is in the center. - To the **left** of the king is the **queen** (identifiable by its crown). - To the **right** of the king is the **bishop** (identifiable by its pointed top). - **Above** the king is the **rook** (identifiable by its castle-like structure). - **Below** the king is the **knight** (identifiable by its horse head). All five pieces are correctly positioned relat

A desk scene: laptop in the center, coffee mug to the left, phone to the right, lamp above, and notebook below
bfl/flux-2-max: PASS

Evaluator (Qwen 3.5 122B)

The prompt specified: "laptop in the center, coffee mug to the left, phone to the right, lamp above, and notebook below". - Laptop is centered on the desk — ✅ Correct. - Coffee mug is to the left of the laptop — ✅ Correct. - Phone is to the right of the laptop — ✅ Correct. - Lamp is above the laptop — ✅ Correct. - Notebook is below the laptop — ✅ Correct. All specified spatial relationships are accurately depicted in the image. VERDICT: PASS

Five animals in a field: a horse on the far left, a cow to the right of the horse, a sheep in the center, a goat to the right of the sheep, and a dog on the far right
bfl/flux-2-max: PASS

Evaluator (Qwen 3.5 122B)

The prompt specifies the following spatial relationships: - Horse on the far left - Cow to the right of the horse - Sheep in the center - Goat to the right of the sheep - Dog on the far right In the image: - The horse is on the far left — ✅ correct. - The cow is to the right of the horse — ✅ correct. - The sheep is in the center — ✅ correct. - The goat is to the right of the sheep — ✅ correct. - The dog is on the far right — ✅ correct. All positional relationships described in the prompt are a

Extreme 100% vs 100%
A landscape with a red barn in the foreground at large scale, a windmill in the midground at medium scale, and snow-capped mountains in the background at small scale, all with consistent depth perspective
bfl/flux-2-max: PASS

Evaluator (Qwen 3.5 122B)

The image correctly depicts the spatial relationships as specified in the prompt: - The red barn is in the foreground at a large scale, occupying the left side of the frame and appearing closest to the viewer. - The windmill is in the midground at a medium scale, positioned to the right of the barn and further back, consistent with depth perspective. - The snow-capped mountains are in the background at a small scale, appearing farthest away and behind both the barn and windmill, creating a clea

A beach scene with a sandcastle in the foreground, a surfer on the waves in the midground, and a cruise ship on the horizon in the background, sizes decreasing with distance
bfl/flux-2-max: PASS

Evaluator (Qwen 3.5 122B)

The image correctly depicts the spatial relationships as specified in the prompt: - The sandcastle is in the foreground, clearly positioned closest to the viewer, built on the beach sand. - The surfer is in the midground, riding a wave further out in the ocean, positioned behind the sandcastle. - The cruise ship is in the background, on the horizon, appearing smaller due to distance, as expected. All positional relationships (foreground, midground, background, decreasing size with distance) ar

A city park with a park bench and pigeons in the foreground, a fountain in the midground, and skyscrapers in the background, each layer progressively smaller
bfl/flux-2-max: PASS

Evaluator (Qwen 3.5 122B)

The image correctly depicts the spatial relationships as described in the prompt: - **Foreground**: A park bench with pigeons on and around it — this is clearly visible and accurate. - **Midground**: A fountain is positioned behind the bench, centered in the frame — this matches the description. - **Background**: Skyscrapers are visible behind the fountain and trees, appearing progressively smaller due to perspective — this is also accurate. All positional relationships (foreground → midground
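The Relative Position figures above are consistent with a plain pass ratio over three prompts per tier: 3 of 3 gives 100%, 2 of 3 gives 67%, and 11 of 12 across the four tiers gives 92%. The sketch below reproduces that arithmetic under the assumption that rates are rounded to the nearest whole percent. The helper `pass_rate` and the per-prompt verdicts for the second model are hypothetical, since this listing shows only its percentages and not which Hard prompt it failed.

```python
from typing import Dict, List


def pass_rate(verdicts: List[bool]) -> int:
    """Share of passing verdicts, rounded to the nearest whole percent."""
    if not verdicts:
        raise ValueError("no verdicts to aggregate")
    return round(100 * sum(verdicts) / len(verdicts))


# Hypothetical verdicts for the second model in Relative Position.
# Only the counts per tier are implied by the page; the position of the
# failing Hard prompt is an arbitrary placeholder.
second_model: Dict[str, List[bool]] = {
    "Easy": [True, True, True],
    "Medium": [True, True, True],
    "Hard": [True, True, False],
    "Extreme": [True, True, True],
}

tier_rates = {tier: pass_rate(v) for tier, v in second_model.items()}
overall = pass_rate([v for tier in second_model.values() for v in tier])

print(tier_rates)  # {'Easy': 100, 'Medium': 100, 'Hard': 67, 'Extreme': 100}
print(overall)     # 92
```

The subcategory rate is computed over all twelve verdicts rather than by averaging the four tier percentages. Both give 92 here, but they could differ if tiers held different numbers of prompts.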

Scale & Proportions 100% vs 100%
Easy 100% vs 100%
A person standing next to a yellow school bus, showing realistic relative scale
bfl/flux-2-max: PASS

Evaluator (Qwen 3.5 122B)

The person is standing next to the school bus, and their relative scale is realistic. The bus is a standard-sized yellow school bus, and the person is proportionally sized to it — not dwarfed or unnaturally large. The bus’s windows, wheels, and body are correctly proportioned to real-world dimensions, and the person’s height and stance are consistent with a human standing beside a full-sized vehicle. The perspective and shadows also support accurate scale. VERDICT: PASS

A ladybug sitting on a large green leaf, shown at realistic proportions
bfl/flux-2-max: PASS

Evaluator (Qwen 3.5 122B)

The ladybug is shown sitting on a large green leaf, and the proportions are realistic. The leaf is significantly larger than the ladybug, which is consistent with real-world scale — ladybugs are small insects, and leaves they rest on are typically much larger. The image captures this size relationship accurately, with the ladybug appearing appropriately tiny in comparison to the leaf’s surface and veins. VERDICT: PASS

A child holding a basketball, both at correct relative sizes
bfl/flux-2-max: PASS

Evaluator (Qwen 3.5 122B)

The child in the image is holding a basketball that is proportionally appropriate for his size. A standard basketball for a child (typically size 5 or 6) is not oversized relative to his torso or arms. The ball fits naturally in his hands and is consistent with real-world proportions for a young child playing basketball. The scale of the basketball relative to the child’s body is realistic and correctly rendered. VERDICT: PASS

Medium 100% vs 100%
A tiny person standing on the rim of a coffee cup, looking down into it
bfl/flux-2-max: PASS

Evaluator (Qwen 3.5 122B)

The image depicts a tiny person standing on the rim of a coffee cup, looking down into it. The person is clearly scaled down to fit on the rim, and their posture and position are consistent with the prompt. The coffee cup is proportionally large relative to the person, and the perspective reinforces the scale difference — the cup’s rim is wide enough to accommodate the person, and the coffee surface appears vast from the person’s viewpoint. The lighting, shadows, and reflections also support a r

A giant tabby cat walking between city skyscrapers like a kaiju
bfl/flux-2-max: PASS

Evaluator (Qwen 3.5 122B)

The image depicts a giant tabby cat walking between city skyscrapers, with its size relative to the buildings and vehicles clearly exaggerated to match the “kaiju” concept. The cat’s body is proportionally massive — its head and torso are comparable in scale to the height of the skyscrapers, and its paw is large enough to crush cars and disrupt the street. The perspective and scale are consistent: the cat’s tail extends upward and beyond the building’s height, and its body occupies the entire wi

A miniature house sitting on the palm of someone's hand
bfl/flux-2-max: PASS

Evaluator (Qwen 3.5 122B)

The image shows a miniature house resting on the palm of a human hand. The house is clearly scaled down to fit comfortably within the palm, with its proportions (roof, windows, door, porch) appearing proportionally correct relative to the hand. The hand’s size and anatomical features (fingers, palm lines, knuckles) are realistic and provide a clear sense of scale. The house is not just small — it is appropriately miniature, fitting the prompt’s description. VERDICT: PASS

Hard 100% vs 100%
A group photo of an elephant, a horse, a dog, a cat, and a mouse, all standing in a line with correct real-world proportional sizes
bfl/flux-2-max: PASS

Evaluator (Qwen 3.5 122B)

The image shows an elephant, a horse, a dog, a cat, and a mouse standing in a line. The elephant is the largest, followed by the horse, then the dog, then the cat, and finally the mouse — which is the smallest. This ordering reflects the correct real-world proportional sizes of these animals. - The elephant is significantly larger than the horse, which is accurate. - The horse is larger than the dog, which is correct. - The dog is larger than the cat, which is accurate. - The cat is larger than

A dinner table scene with a plate, fork, knife, wine glass, salt shaker, and peppercorn, all at realistic relative sizes
bfl/flux-2-max: PASS

Evaluator (Qwen 3.5 122B)

The image depicts a dinner table scene with a plate, fork, knife, wine glass, salt shaker, and a single peppercorn — all arranged in a realistic, standard place setting. The relative sizes are consistent with real-world proportions: - The plate is the largest object, appropriately sized for a main course. - The fork and knife are standard dinner utensils, proportionally sized to fit beside the plate. - The wine glass is appropriately sized for a standard serving of red wine, with a stem and bas

A farm scene with a barn, tractor, farmer, cow, chicken, and a fence post, all at correct proportional scale
bfl/flux-2-max: PASS

Evaluator (Qwen 3.5 122B)

The image depicts a farm scene with a barn, tractor, farmer, cow, chicken, and fence post. The relative sizes are consistent with real-world proportions: - The barn is large and appropriately scaled for the background. - The tractor is proportionally sized relative to the barn and farmer. - The farmer is realistically sized compared to the tractor and barn. - The cow is correctly proportioned relative to the farmer and tractor. - The chicken is appropriately small compared to the cow and farmer

Human realism 100% vs 98%

Faces & Expressions 100% vs 100%
Easy 100% vs 100%
A close-up portrait of a young woman with a warm natural smile
bfl/flux-2-max: PASS

Evaluator (Qwen 3 VL)

The face in the image is largely symmetrical, with balanced eyes, eyebrows, and lips. The features are well-proportioned and rendered with high fidelity, though there is a slight softness and uniformity to the skin texture that suggests AI generation. The smile is clearly visible and conveys a warm, natural emotion — the corners of the mouth lift, cheeks rise, and teeth are slightly visible, all consistent with a genuine or posed happy expression. There are no major distortions, though the eyes

A portrait of a middle-aged man with a calm, serious expression
bfl/flux-2-max: PASS

Evaluator (Qwen 3 VL)

The face in the image is largely symmetrical with correctly rendered features: eyes, nose, and ears are balanced, and the facial structure appears natural. The man’s expression is calm and serious, with neutral lips, direct gaze, and minimal emotional cues — consistent with the prompt. There are no obvious distortions; skin texture, hair, and beard are realistic and well-defined. The lighting and composition support a professional portrait style. VERDICT: PASS

A headshot of a teenager with a relaxed, neutral face
bfl/flux-2-max: PASS

Evaluator (Qwen 3 VL)

The face in the image is largely symmetrical with correctly proportioned features: eyes, nose, and ears are well-aligned, and the facial structure appears natural. The teenager’s expression is neutral and relaxed, with no strong emotion conveyed — fitting the prompt. There are no obvious distortions; the skin texture, freckles, and lighting are consistent with a realistic portrait. The eyes are clear and focused, and the mouth is closed in a calm, neutral line. VERDICT: PASS

Medium 100% vs 100%
A close-up of a person's face showing clear disgust, nose wrinkled and lip curled
bfl/flux-2-max: PASS

Evaluator (Qwen 3 VL)

The face in the image is not perfectly symmetric — the left side (from the viewer’s perspective) appears slightly more compressed and the right side is more relaxed, especially around the eyes and brow. The nose is slightly asymmetrical in shape and angle, and the lips are not perfectly even, with the right side appearing slightly more pulled back. However, these asymmetries are natural and consistent with a human face expressing strong emotion. The intended emotion — disgust — is clearly reada

A portrait capturing genuine surprise — raised eyebrows, wide eyes, slightly open mouth
bfl/flux-2-max: PASS

Evaluator (Qwen 3 VL)

The face is largely symmetric with correctly rendered features: the eyes, eyebrows, nose, and mouth are well-proportioned and aligned. The intended emotion — genuine surprise — is clearly readable through the wide, bulging eyes, raised eyebrows, and slightly open mouth. There are no significant distortions; the facial structure is anatomically plausible, and the expression is natural and convincing. The lighting and detail are consistent with a high-quality portrait. VERDICT: PASS

A face showing quiet melancholy — downcast eyes, slightly drawn mouth, distant gaze
bfl/flux-2-max: PASS

Evaluator (Qwen 3 VL)

The face is largely symmetric with correctly rendered anatomical features — the eyes, nose, and mouth are proportionally accurate and aligned. There are no major distortions or unnatural deformities. The expression conveys quiet melancholy: the gaze is distant and downward-looking, the mouth is slightly drawn and closed with a subtle downturn, and the brow is gently furrowed, enhancing the somber mood. The lighting and texture add to the emotional realism. The prompt’s requested emotion is clear

Hard 100% vs 100%
An elderly woman in her 80s with deep wrinkles and white hair, smiling warmly with kind eyes
bfl/flux-2-max: PASS

Evaluator (Qwen 3 VL)

The face is largely symmetrical with natural, well-defined features. The deep wrinkles around the eyes and mouth are consistent with an elderly woman in her 80s, and the white hair is realistic and textured. The warm smile is clearly readable, with crinkled eyes and a gentle curve of the lips conveying kindness and warmth. There are no obvious distortions — the proportions, eye alignment, and facial structure appear natural. The lighting and skin texture enhance the authenticity of the expressio

A 5-year-old child with round cheeks and large eyes, crying with tears streaming down their face
bfl/flux-2-max: PASS

Evaluator (Qwen 3 VL)

The face is largely symmetric with correctly proportioned features: the eyes are large and evenly spaced, the nose and mouth are centered, and the cheeks are round and plump as requested. Tears are clearly streaming down both cheeks, and the child’s expression — furrowed brow, slightly open mouth, and tear tracks — strongly conveys crying. There are no significant distortions; the facial structure is anatomically plausible, and the emotion is unmistakably readable as distress or sadness. VERDIC

An elderly man in his 70s with age spots and thinning gray hair, looking skeptically over reading glasses
bfl/flux-2-max: PASS

Evaluator (Qwen 3 VL)

The face is largely symmetric and features are anatomically plausible. The elderly man’s skin shows age spots as requested, and his hair is thinning gray with a natural receding pattern. He wears reading glasses, and his expression — furrowed brow, slightly downturned mouth, and narrowed eyes — clearly conveys skepticism or disapproval. There are no major distortions; the facial structure, eye alignment, and proportions are natural. The lighting and texture are realistic, enhancing the authentic

Extreme 100% vs 100%
A group of three friends at a table: one laughing with head tilted back, one looking shocked with wide eyes, and one rolling their eyes with a smirk
bfl/flux-2-max: PASS

Evaluator (Qwen 3 VL)

The face on the left (man laughing with head tilted back) is symmetric and features are correct — natural smile, visible teeth, and facial structure are well-defined. The central woman’s face shows clear shock: wide eyes, open mouth, and raised eyebrows — all symmetric and emotionally expressive. The man on the right has a smirk with eyes half-closed and head tilted — this conveys rolling eyes and smirking, though the “rolling eyes” is not literally depicted (eyes are not visibly rolling), the e

A family photo with four people: a father beaming with pride, a mother smiling gently, a teenager looking bored, and a child grinning excitedly
bfl/flux-2-max: PASS

Evaluator (Qwen 3 VL)

The faces in the image are largely symmetrical with correctly rendered features — no obvious distortions or anatomical errors. All four individuals have clear, readable expressions matching the prompt: - Father (top left): Beaming with pride — wide, genuine smile, crinkled eyes. - Mother (center): Smiling gently — soft, pleasant expression. - Teenager (top right): Looking bored — neutral to slightly downturned mouth, uninterested gaze. - Child (front): Grinning excitedly — wide, toothy grin, br

Three coworkers in a meeting: one looking frustrated with furrowed brows, one nodding with a pleased expression, and one looking confused with a tilted head
bfl/flux-2-max: PASS

Evaluator (Qwen 3 VL)

The image shows three men at a meeting table, each with a distinct facial expression: - Left: The man has furrowed brows, a downturned mouth, and a tense posture — clearly conveying frustration. - Center: The man is smiling, looking pleased, and nodding slightly — consistent with satisfaction or agreement. - Right: The man has a tilted head, wide eyes, and an open mouth — conveying confusion or surprise. All faces are symmetric with correct anatomical features — no distortions, asymmetries, or

Full Body 100% vs 100%
Easy 100% vs 100%
A woman standing upright in a relaxed pose, arms at her sides, wearing jeans and a white t-shirt
bfl/flux-2-max: PASS

Evaluator (Qwen 3 VL)

The image shows a full-body shot of a woman standing upright with her arms at her sides, wearing a white t-shirt and blue jeans — matching the prompt exactly. Her posture is relaxed and natural, with no visible distortions. Limb count is correct (two arms, two legs), and proportions are anatomically accurate — head-to-body ratio, limb lengths, and joint placements are normal. The pose is clearly recognizable as “standing upright with arms at sides.” There are no unnatural bends, missing limbs, o

A man standing straight facing the camera, hands in pockets, normal proportions
bfl/flux-2-max: PASS

Evaluator (Qwen 3 VL)

The image shows a full-body shot of a man standing straight and facing the camera. His right hand is in his right pocket, and his left hand hangs naturally by his side. The proportions appear normal: head, torso, arms, and legs are in proportion to each other, with no obvious limb count anomalies (two arms, two legs, one head). The pose is anatomically correct — the spine is straight, shoulders are level, and the posture is natural for standing with one hand in a pocket. The described action — “

A person standing in a neutral pose in a park, full body visible from head to feet
bfl/flux-2-max: PASS

Evaluator (Qwen 3 VL)

The image shows a full-body shot of a person standing on a park pathway, with head-to-feet visibility as requested. The pose is neutral: the person is upright, arms relaxed at the sides, and feet parallel and shoulder-width apart. The limbs are correctly proportioned — two arms, two legs, with natural joint angles and hand placement. There are no anatomical distortions; the torso, head, and limbs align naturally with no extra or missing limbs. The setting (park, trees, path) matches the prompt.

Medium 100% vs 100%
A woman mid-stride running through a park, hair and clothes showing motion
bfl/flux-2-max: PASS

Evaluator (Qwen 3 VL)

The image shows a woman in mid-stride running through a park, with her hair and clothing dynamically flowing to suggest motion — consistent with the prompt. Anatomical assessment: - Limb count: All four limbs are present — two arms and two legs — no missing or extra limbs. - Proportions: The figure’s proportions appear anatomically plausible — head, torso, limbs, and feet are proportionally scaled relative to each other. - Pose: The pose is recognizable as a running stride — one leg forward, th

A man captured mid-jump over a puddle, body in the air with bent knees
bfl/flux-2-max: PASS

Evaluator (Qwen 3 VL)

The image shows a man in mid-air, seemingly jumping over a puddle. His body is crouched with bent knees, arms extended outward for balance — a pose consistent with the prompt. Limb count is correct: two arms, two legs, and the torso are all present. Proportions appear anatomically plausible — the torso, limbs, and head are proportionally scaled relative to each other. The pose is recognizable as a dynamic jump, and the reflection in the puddle below reinforces the action, showing the moment of i

A dancer in a dynamic spin, dress flowing outward with one leg extended
bfl/flux-2-max: PASS

Evaluator (Qwen 3 VL)

The dancer in the image is captured mid-spin, with her dress flowing outward in a dynamic, circular motion. One leg is extended forward and slightly lifted, while the other supports her weight on the ball of the foot — a pose consistent with a ballet or contemporary dance turn. The limbs are clearly visible: two arms, two legs, two feet, and the torso — all anatomically present and proportionally accurate. The pose is recognizable as a spinning movement, with the dress’s motion suggesting rotati

Hard 100% vs 100%
A person in a yoga tree pose — standing on one leg with the other foot pressed to the inner thigh, arms raised overhead
bfl/flux-2-max: PASS

Evaluator (Qwen 3 VL)

The image shows a person performing the yoga tree pose (Vrksasana). The pose is recognizable: one foot is placed on the inner thigh of the standing leg, arms are raised overhead with palms pressed together. Limb count is correct — two arms, two legs, two feet, and the torso. Proportions appear anatomically plausible, though the pose is inherently challenging and may involve slight compression or adjustment in the hip and knee joints — which is normal. There are no gross distortions or anatomical

A rock climber clinging to a cliff face with arms and legs extended to different holds
bfl/flux-2-max: PASS

Evaluator (Qwen 3 VL)

The climber has two arms and two legs — limb count is correct. Proportions appear anatomically plausible: the torso, limbs, and head are proportioned in a way consistent with a human body, though the pose is dynamic and exaggerated for dramatic effect. The pose is recognizable as a rock climbing stance — arms extended to grip holds, legs positioned for leverage, body angled against the cliff face. The climber’s feet are on holds, and the body is tensed appropriately for the action. There are no

A cellist seated with correct posture, legs around the instrument, bow arm extended
bfl/flux-2-max: PASS

Evaluator (Qwen 3 VL)

The image shows a cellist seated with the cello positioned between her legs, supported by a stand. Her left hand is on the fingerboard, and her right hand holds the bow, which is extended across the strings — consistent with the described action. The posture appears correct: she is seated upright, with her back straight, and her legs are wrapped around the instrument as expected for cello playing. Limb count is normal (two arms, two legs), and proportions appear anatomically accurate — no obviou

Extreme 100% vs 100%
Two tango dancers in a close embrace, the man dipping the woman backward with intertwined legs
bfl/flux-2-max: PASS

Evaluator (Qwen 3 VL)

The image depicts two tango dancers in a dramatic pose. The man is holding the woman’s hand and supporting her back, while she is arched backward with her head tilted up. Her legs are intertwined with his — one leg is wrapped around his waist/back, and the other is extended backward, creating a classic tango dip. The pose is dynamic and recognizable as a tango figure. Anatomically, the limbs are mostly proportionate and correctly rendered. The woman’s body is arched backward in a way that is ph

Two soccer players mid-tackle, one sliding and the other jumping over with the ball at their feet
bfl/flux-2-max: PASS

Evaluator (Qwen 3 VL)

The image depicts two soccer players in a dynamic mid-tackle moment: one player is sliding on the ground, kicking up dirt, while the other is airborne, jumping over the sliding player with the ball at their feet. The action is recognizable and matches the prompt. Anatomical analysis: - Limb count: Both players have all limbs intact — two arms and two legs each. No extra or missing limbs. - Proportions: The figures are proportionally accurate — heads, torsos, limbs, and feet are in realistic sca

Two gymnasts performing a synchronized handstand, side by side with identical body alignment
bfl/flux-2-max: PASS

Evaluator (Qwen 3 VL)

The image shows two gymnasts performing a synchronized handstand side by side, with nearly identical body alignment — their legs are straight and parallel, their torsos are aligned vertically, and their arms are positioned symmetrically. The pose is anatomically plausible for a handstand, with limbs appearing correctly proportioned relative to each other and to the body. Both figures have the expected number of limbs (two arms, two legs) and no obvious anatomical distortions such as extra limbs,

Hands 100% vs 92%
Easy 100% vs 100%
A person standing with both hands visible at their sides, fingers relaxed and naturally spread
bfl/flux-2-max: PASS

Evaluator (Qwen 3 VL)

Both hands are visible at the person’s sides, fingers relaxed and naturally spread, as per the prompt. Counting the fingers on each hand: - Left hand: Thumb, index, middle, ring, pinky — 5 fingers. - Right hand: Thumb, index, middle, ring, pinky — 5 fingers. Anatomically, the hands are correctly proportioned. The fingers are relaxed, slightly curved downward, and the thumbs are naturally positioned. There are no extra or missing fingers, no fused digits, and no unnatural deformities. The postu

A woman holding her hands out in front of her palms-up, all ten fingers clearly visible
bfl/flux-2-max: PASS

Evaluator (Qwen 3 VL)

Both hands are clearly visible, palms facing forward, with all ten fingers distinctly shown. Each hand displays five fingers: thumb, index, middle, ring, and pinky — no fingers are missing or duplicated. The positioning is anatomically correct for a “palms-up” gesture with hands extended forward. The fingers are proportionally aligned and the knuckles and creases appear natural. No anomalies such as extra fingers, fused digits, or unnatural deformities are present. VERDICT: PASS

A man with arms crossed, both hands resting on opposite upper arms with fingers visible
bfl/flux-2-max: PASS

Evaluator (Qwen 3 VL)

Both hands are visible in the image, resting on opposite upper arms as described in the prompt. - Left hand (resting on right upper arm): All five fingers are clearly visible — thumb, index, middle, ring, and little finger. - Right hand (resting on left upper arm): All five fingers are also clearly visible — thumb, index, middle, ring, and little finger. The grip/gesture is anatomically plausible. The hands are positioned with palms facing inward and fingers slightly curled, which is a natural

Medium 100% vs 100%
A person holding a white ceramic coffee mug by the handle with their right hand
bfl/flux-2-max: PASS

Evaluator (Qwen 3 VL)

The visible hand in the image is holding the white ceramic coffee mug by its handle with the right hand, as prompted. Finger count: - Thumb: visible - Index finger: visible - Middle finger: visible - Ring finger: visible - Pinky finger: visible All five fingers are clearly visible and accounted for. Anatomical correctness: The grip is natural and anatomically plausible. The thumb wraps around the handle from the front, while the other fingers curl around from the back, forming a secure but re

bfl/flux-2-max: PASS
A woman writing in a notebook with a pen gripped correctly in her right hand

Evaluator (Qwen 3 VL)

The visible right hand is gripping the pen in a standard writing posture. All five fingers are clearly visible: thumb, index, middle, ring, and pinky. The grip is anatomically correct — the pen is held between the index finger and thumb, with the middle finger providing support, and the other fingers curled naturally. The hand appears relaxed and functional for writing. The left hand is partially visible, resting on the notebook, and also shows five fingers — thumb, index, middle, ring, and pin

bfl/flux-2-max: PASS
A person playing an acoustic guitar with left hand on the fretboard and right hand strumming

Evaluator (Qwen 3 VL)

The left hand is clearly visible on the fretboard. Counting the fingers: thumb, index, middle, ring, and pinky — exactly 5 fingers. The positioning is anatomically correct for fretting strings: the fingers are curved over the strings, pressing down at the frets, with the thumb supporting the back of the neck. The right hand is partially visible near the soundhole, but only the thumb and part of the palm are visible — not enough to count fingers or assess strumming posture. However, the prompt sp

Hard 100%vs67%
bfl/flux-2-max: PASS
A person making a peace sign with their right hand, clearly showing two fingers up and three curled

Evaluator (Qwen 3 VL)

The image shows a single hand making a peace sign gesture, with the index and middle fingers extended upward and the remaining three fingers (ring, pinky, and thumb) curled inward. All five fingers are clearly visible and anatomically correct in their positioning. - Count of fingers: 5 per hand (only one hand is visible, but it has all five fingers). - Anatomical correctness: The gesture is standard and natural — the thumb is curled inward, the ring and pinky fingers are curled, and the index a

bfl/flux-2-max: PASS
A hand giving a thumbs-up with four fingers curled and thumb extended upward

Evaluator (Qwen 3 VL)

The image shows a single hand making a thumbs-up gesture. The thumb is extended upward, and the other four fingers are curled into a fist. Counting the fingers: there are clearly five fingers visible — the thumb (extended), and the index, middle, ring, and little fingers (curled). The anatomy is consistent with a human hand: the knuckles, finger joints, and skin texture appear natural, and the gesture is anatomically correct for a thumbs-up. There are no anomalies — no extra fingers, missing fi

bfl/flux-2-max: PASS
A person counting to three on their fingers, with index, middle, and ring fingers extended

Evaluator (Qwen 3 VL)

The image shows one hand, not two. The prompt requested counting to three using the index, middle, and ring fingers — which is exactly what is depicted: three fingers extended (index, middle, ring), and the thumb and pinky are curled inward, forming a “three” gesture. - Fingers per hand: Only one hand is visible. It has 3 extended fingers (index, middle, ring) and 2 curled fingers (thumb and pinky). So, total fingers visible = 5 — but only one hand is shown. - Anatomical correctness: The gestur

Extreme 100%vs100%
bfl/flux-2-max: PASS
Two people shaking hands in a professional setting, all fingers of both hands anatomically correct

Evaluator (Qwen 3 VL)

Counting the fingers on each visible hand: - The hand on the left (wearing the dark suit) clearly shows five fingers: thumb, index, middle, ring, and pinky. All are anatomically present and properly aligned. - The hand on the right (wearing the gray suit) also clearly shows five fingers. The thumb is visible on the side, and the other four fingers are wrapped around the other hand in a standard handshake grip. The grip is anatomically correct: both hands are interlocked with fingers wrapped ar

bfl/flux-2-max: PASS
Two people doing a high-five, palms meeting with all fingers extended on both hands

Evaluator (Qwen 3 VL)

Each visible hand shows exactly five fingers: thumb, index, middle, ring, and pinky. The fingers are extended and meet palm-to-palm in a high-five gesture, which is anatomically correct for this action. There are no anomalies — no extra fingers, missing fingers, or unnatural deformities. The alignment and positioning of the hands are natural and consistent with a high-five. VERDICT: PASS
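
The complete evaluator responses on this page end with a line of the form "VERDICT: PASS". A minimal sketch of how such a verdict could be read out of the free-form evaluator text follows; the function name `extract_verdict` and the regular expression are hypothetical and are not taken from the benchmark's own code. Excerpts that are cut off before the verdict line yield `None` rather than a guess.

```python
import re
from typing import Optional

# Assumed format: the evaluator closes its response with "VERDICT: PASS" or "VERDICT: FAIL".
VERDICT_RE = re.compile(r"VERDICT:\s*(PASS|FAIL)\b")


def extract_verdict(evaluator_text: str) -> Optional[str]:
    """Return the last stated verdict, or None if the text contains none."""
    matches = VERDICT_RE.findall(evaluator_text)
    # Use the final match so an earlier quoted verdict does not override the conclusion.
    return matches[-1] if matches else None


complete = "There are no anomalies. The alignment is natural. VERDICT: PASS"
truncated = "The grip is anatomically correct: both hands are interlocked with fingers wrapped ar"

print(extract_verdict(complete))   # PASS
print(extract_verdict(truncated))  # None
```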

bfl/flux-2-max: PASS
One person handing a small gift box to another person, both sets of hands gripping the box correctly

Evaluator (Qwen 3 VL)

Both hands are clearly visible, gripping the gift box from opposite sides. Each hand shows five distinct fingers: thumb, index, middle, ring, and pinky. The fingers are positioned naturally around the box, with the thumbs on the sides and the other fingers wrapping around the front and back, forming a secure, anatomically correct grip. There are no anomalies — no extra fingers, missing fingers, or unnatural deformities. The gesture is consistent with the prompt: “One person handing a small gift

Multi-Subject 100%vs100%
Easy 100%vs100%
bfl/flux-2-max: PASS
Two women standing side by side: one with short blonde hair wearing a red jacket, the other with long black hair wearing a blue dress

Evaluator (Qwen 3 VL)

Person 1: - Hair: Short, blonde, styled in a neat pixie cut. - Clothing: Wearing a red jacket over a light-colored top. - Position: Standing on the left side of the image. Person 2: - Hair: Long, straight, black hair. - Clothing: Wearing a blue dress with short sleeves and a fitted waist. - Position: Standing on the right side of the image, next to Person 1. Both individuals are visually distinct and match the described features exactly. VERDICT: PASS

bfl/flux-2-max: PASS
Two men at a cafe: one with a beard and glasses wearing a green sweater, the other clean-shaven in a black hoodie

Evaluator (Qwen 3 VL)

Person 1 (left): - Beard: Present, full and well-defined. - Glasses: Wearing black-framed glasses. - Sweater: Wearing a textured green sweater. - Gender: Male. - Visually distinct: Yes, clearly visible and matches description. Person 2 (right): - Beard: Clean-shaven (no facial hair visible). - Hoodie: Wearing a black hooded sweatshirt. - Gender: Male. - Visually distinct: Yes, clearly visible and matches description. All individuals are present, visually distinct, and match their described fea

bfl/flux-2-max: PASS
Two friends posing: one tall with curly red hair in denim overalls, the other short with straight brown hair in a yellow sundress

Evaluator (Qwen 3 VL)

Person 1 (tall): - Hair: Curly, red/orange (matches "curly red hair") - Clothing: Denim overalls over a white t-shirt (matches "denim overalls") - Height: Appears taller than the other person (matches "tall") - Additional traits: Smiling, wearing white sneakers, standing with one hand on hip Person 2 (short): - Hair: Straight, brown (matches "straight brown hair") - Clothing: Yellow sundress (matches "yellow sundress") - Height: Clearly shorter than the other person (matches "short") - Addition

Medium 100%vs100%
bfl/flux-2-max: PASS
Four people at a dinner table: a bald man in a suit, a woman with silver hair in a red blouse, a young man with dreadlocks in a denim jacket, and a woman with a hijab in a green dress

Evaluator (Qwen 3 VL)

- Bald man in a suit: Present. He is on the far left, bald, wearing a dark suit with a white shirt. ✅ - Woman with silver hair in a red blouse: Present. She is second from left, has silver hair styled back, and is wearing a red blouse. ✅ - Young man with dreadlocks in a denim jacket: Present. He is third from left, has dreadlocks, and is wearing a denim jacket over a gray shirt. ✅ - Woman with a hijab in a green dress: Present. She is on the far right, wearing a light green hijab and a matching

bfl/flux-2-max: PASS
Three musicians on stage: a tall woman with short pink hair playing bass, a man with a beard and bandana on drums, and a slim man with glasses playing keyboard

Evaluator (Qwen 3 VL)

- Tall woman with short pink hair playing bass: Present. She has short, vibrant pink hair, is playing a bass guitar, and appears tall and slender. Matches description. - Man with beard and bandana on drums: Present. He is seated behind a drum kit, has a full beard, and is wearing a patterned bandana. Matches description. - Slim man with glasses playing keyboard: Present. He is on the right, playing a keyboard, wearing glasses, and appears slim. Matches description. All three individuals are vis

bfl/flux-2-max: PASS
Three hikers on a trail: a man with a backpack and sunhat, a woman with braided hair and hiking poles, and a teenager with a baseball cap and water bottle

Evaluator (Qwen 3 VL)

Person 1 (Man): - Wearing a backpack (large, grayish, with straps visible) - Wearing a sunhat (light brown, wide-brimmed) - Wearing sunglasses - Appears to be middle-aged, smiling - Matches description: ✅ Person 2 (Woman): - Wearing a backpack (smaller, grayish, with straps visible) - Has long, dark braided hair (one thick braid down the side) - Holding two hiking poles - Wearing hiking attire (shorts and shirt) - Smiling, appears engaged in conversation - Matches description: ✅ Person 3 (Tee

Truthfulness 89%vs93%

Photorealism 100%vs100%
Medium 100%vs100%
bfl/flux-2-max: PASS
A still life on a wooden table featuring a clear glass of water, a polished silver spoon, a folded linen napkin, and a ripe red apple — each material rendered with accurate texture and light behavior

Evaluator (Qwen 3.6 27B)

The image depicts a still life on a wooden table with a glass of water, a silver spoon, a folded linen napkin, and a red apple — all materials rendered with high fidelity to their physical properties. - **Glass**: The glass is transparent with realistic refraction of the wooden table beneath, subtle surface condensation, and accurate reflections of the surroundings. The water inside shows natural surface tension and clarity. - **Metal (silver spoon)**: The spoon exhibits sharp, coherent reflect

bfl/flux-2-max: PASS
A workspace desk with a brushed aluminum laptop, a leather-bound notebook, a ceramic coffee mug, and a glass paperweight — all materials visually distinct and realistic

Evaluator (Qwen 3.6 27B)

The image contains the following materials as requested: - **Brushed aluminum laptop**: The laptop lid shows a clear brushed metal texture with subtle directional highlights and reflections consistent with real brushed aluminum. - **Leather-bound notebook**: The brown cover exhibits realistic leather texture, creases, stitching, and slight wear — all characteristic of genuine leather. - **Ceramic coffee mug**: The white mug has a smooth, matte finish with soft, diffused reflections, approp

bfl/flux-2-max: PASS
A kitchen counter with a stainless steel pot, a wooden cutting board, a glass bottle of olive oil, and a cotton dish towel — each surface reflecting light correctly

Evaluator (Qwen 3.6 27B)

The image depicts a kitchen counter with the requested items: a stainless steel pot, a wooden cutting board, a glass bottle of olive oil, and a white cotton dish towel. Let’s evaluate each material’s visual properties: 1. Stainless steel pot: The pot shows strong, clear reflections of the surrounding environment (including the wooden board and light sources), which is characteristic of polished metal. The reflections are coherent and physically plausible. ✅ 2. Wooden cutting board: The wood gr

Physics & Reflections 100%vs100%
Easy 100%vs100%
bfl/flux-2-max: PASS
A red sphere on a white surface with a single shadow cast to the right, consistent with a light source from the upper left

Evaluator (Qwen 3.6 27B)

The shadow is cast to the right of the sphere, which is consistent with a light source from the upper left — this matches the prompt. The shadow’s shape is soft and elliptical, appropriate for a spherical object under a directional light. The shadow’s direction and position are physically plausible. The highlight on the sphere is located on the upper-left side, consistent with the light source direction. The gradient of light across the sphere’s surface follows the expected curvature — brighter

bfl/flux-2-max: PASS
A coffee mug on a sunny windowsill casting a sharp shadow to the left

Evaluator (Qwen 3.6 27B)

The shadow cast by the coffee mug is positioned to the left of the mug, which is consistent with a light source (the sun) coming from the right. The shadow’s shape accurately follows the mug’s form, including the handle, and its edges are sharp and well-defined, which is physically plausible for direct sunlight. There is no visible distortion in the shadow’s shape or orientation. The shadow’s color is a natural dark gray, appropriate for a bright, direct light source on a white surface. No refle

bfl/flux-2-max: PASS
A person standing outdoors at midday with a short shadow directly below them

Evaluator (Qwen 3.6 27B)

The shadow is cast directly beneath the person, consistent with a high sun angle at midday. The shadow’s shape is a clear, inverted silhouette of the person, with proportions and orientation matching the body’s form — no distortion or unnatural stretching. The shadow’s color is a natural dark gray-blue, appropriate for a bright, clear sky with strong direct sunlight. The direction is correct: the shadow falls straight down, indicating the sun is directly overhead. The ground is flat and uniform,

Medium 100%vs100%
bfl/flux-2-max: PASS
A woman standing in front of a full-length mirror, her reflection accurately showing the back of her outfit and reversed orientation

Evaluator (Qwen 3.6 27B)

The reflection in the mirror correctly shows the reversed orientation of the woman — her face is mirrored, her body is reversed left-to-right, and the back of her suit is visible as expected. The reflection also accurately shows the front of her outfit, including the white shirt and lapel details, which is physically correct for a mirror image. Shadows are consistent: the woman casts a shadow on the floor to her right (viewer’s left), and the mirror’s reflection shows a corresponding shadow on

bfl/flux-2-max: PASS
A cat sitting on a polished marble floor with its reflection visible below

Evaluator (Qwen 3.6 27B)

The cat is sitting on a polished marble floor, and its reflection is clearly visible below it. The reflection is vertically aligned with the cat, which is physically plausible for a flat, reflective surface. The reflection is slightly distorted — as expected — due to the angle of the floor tiles and the curvature of the cat’s body, but it remains coherent and recognizable. The color of the reflection matches the cat’s fur and eyes, with appropriate tonal fidelity. The shadow beneath the cat is s

bfl/flux-2-max: PASS
A vase of flowers on a glass table with a clear reflection visible on the table surface

Evaluator (Qwen 3.6 27B)

The reflection on the glass table is clear and well-defined, mirroring the vase and flowers with correct orientation and proportion. The reflection is slightly distorted by the curvature of the glass and the angle of the surface, which is physically accurate. The color fidelity is good — the reflection retains the tones of the vase and flowers, though slightly muted by the glass’s translucency. Shadows are minimal but present, consistent with soft, ambient lighting. The direction of the reflecti

Hard 100%vs100%
bfl/flux-2-max: PASS
A row of trees reflected in a calm lake at sunset, with the reflection slightly darker and gently rippled

Evaluator (Qwen 3.6 27B)

The reflection in the image is physically plausible. The trees are mirrored vertically in the water, with slight distortion caused by gentle ripples — consistent with a calm lake surface. The reflection is slightly darker than the actual trees, which is realistic due to absorption and scattering in water. The color of the reflection matches the warm sunset tones above, with natural tonal gradation. The direction and orientation of the reflection are correct — inverted vertically, as expected in

bfl/flux-2-max: PASS
A city skyline reflected in a rain puddle on a street, with mild distortion and color shift

Evaluator (Qwen 3.6 27B)

The reflection in the puddle shows a city skyline with buildings inverted as expected in a water mirror, and the distortion is mild — consistent with a shallow puddle’s surface. The color shift is present, with purplish and pinkish hues from ambient city lighting reflecting off the water, which is physically plausible under nighttime or twilight conditions. The orientation of the reflection is correct: the tops of buildings appear at the bottom of the puddle, and the horizon line is roughly leve

bfl/flux-2-max: PASS
A mountain range reflected in a still alpine lake, with the reflection inverted and subtly muted in color

Evaluator (Qwen 3.6 27B)

The reflection in the lake is inverted as expected — the peaks appear upside down relative to their actual positions, which is physically accurate for a calm water surface acting as a mirror. The orientation is correct: the reflection is vertically flipped, not rotated or distorted in an unnatural way. The color is subtly muted in the reflection, as requested — the reflected image is slightly less saturated and cooler in tone than the actual mountains, likely due to the water’s absorption and t

Extreme 100%vs100%
bfl/flux-2-max: PASS
A clear glass sphere sitting on a mirror surface near a pool of water, showing refraction through the sphere, reflection on the mirror, and water reflection simultaneously

Evaluator (Qwen 3.6 27B)

The image depicts a clear glass sphere resting on a mirror surface near a pool of water, with three distinct reflective/refractive elements: the sphere’s own reflection on the mirror, the distorted reflection of the surroundings inside the sphere (refraction), and the water’s surface reflection nearby. Physical plausibility check: 1. Direction and orientation: - The sun is visible as a bright starburst in the sphere’s reflection, and its position matches the reflection on the mirror surface

bfl/flux-2-max: PASS
A chrome sculpture in a room with a glass floor and large windows, showing multiple reflections and environmental mapping on every surface

Evaluator (Qwen 3.6 27B)

The reflections and shadows in the image are largely physically plausible, though with minor inconsistencies. - **Reflections**: The chrome sculpture reflects the cityscape and sky accurately, with correct perspective and curvature matching the shape of the sculpture. The glass floor reflects the sculpture and the environment below with appropriate distortion based on the angle and curvature of the floor panels. The reflections are coherent and consistent with the scene’s lighting. - **Shadows

bfl/flux-2-max: PASS
A crystal wine glass on a polished marble counter next to a window overlooking a lake — showing refraction in the glass, reflection on the marble, and the lake's own reflective surface

Evaluator (Qwen 3.6 27B)

The image depicts a crystal wine glass on a polished marble counter, with a window behind it showing a lake and sky. The prompt requested refraction in the glass, reflection on the marble, and the lake’s reflective surface — all of which are present and physically plausible. - **Refraction in the glass**: The glass shows clear refraction — the background scenery (sky, clouds, trees) is visibly distorted and bent through the glass, especially around the cut facets. This is accurate for how light

World Knowledge 75%vs83%
Easy 100%vs100%
bfl/flux-2-max: PASS
The Eiffel Tower in Paris on a clear day, architecturally accurate with its iron lattice structure and three levels

Evaluator (Qwen 3.6 27B)

The depicted subject is the Eiffel Tower in Paris, clearly recognizable and presented under a clear sky, matching the prompt’s description. The image accurately shows its iconic iron lattice structure, and the three levels (first, second, and top observation decks) are visible and correctly proportioned. The architectural details, including the arches at the base and the intricate trusswork, are factually accurate. There are no discernible distortions or inaccuracies in the structure’s form or p

bfl/flux-2-max: PASS
The Taj Mahal in Agra seen from the reflecting pool, with accurate dome, minarets, and symmetry

Evaluator (Qwen 3.6 27B)

The image depicts the Taj Mahal in Agra viewed from the reflecting pool, as requested. The central dome, four minarets, and overall symmetry are clearly visible and align with the architectural design of the real Taj Mahal. The reflecting pool and surrounding gardens are also accurately represented in composition and perspective. However, there are subtle inaccuracies: - The minarets are not perfectly symmetrical in scale and spacing relative to the main structure — the two outer minarets appea

bfl/flux-2-max: PASS
The Statue of Liberty on her island pedestal, with correct torch, crown, tablet, and green patina

Evaluator (Qwen 3.6 27B)

The depicted subject is the Statue of Liberty, clearly recognizable and positioned on her island pedestal as requested. The statue shows the correct features: the torch in her raised right hand, the seven-pointed crown on her head, and the tablet held in her left arm. The green patina is accurately rendered across the copper surface. The pedestal structure is also correctly represented, including its layered stone design and surrounding island landscaping. There are no significant architectural

Medium 67%vs33%
bfl/flux-2-max: PASS
A traditional Japanese shamisen instrument with accurate three-stringed design, wooden body, and bachi plectrum

Evaluator (Qwen 3.6 27B)

The depicted subject is a traditional Japanese shamisen, and it is recognizable and factually accurate in most key aspects: - It has a wooden body with a white, stretched membrane (kōshō) on the soundboard — correct. - It has three strings — correct. - It has a long neck with tuning pegs at the headstock — correct. - The accompanying plectrum (bachi) is a fan-shaped piece of material (typically sharkskin or plastic) mounted on a wooden handle — correct. Minor observations: - The tuning pegs ap

bfl/flux-2-max: FAIL
A medieval astrolabe with accurate rete, tympan, and engraved degree markings on brass

Evaluator (Qwen 3.6 27B)

The image depicts a circular brass instrument with intricate engravings, star symbols, and degree markings — consistent with the appearance of a medieval astrolabe. The central perforated plate (rete) with star pointers and the surrounding graduated disc (tympan) are clearly visible, and the degree markings (0° to 360°) are present around the perimeter. The script appears to be Arabic, which is historically accurate for many Islamic-era astrolabes. However, upon closer inspection, the “rete” (t

bfl/flux-2-max: PASS
A Japanese kintsugi bowl — cracked ceramic repaired with visible gold lacquer lines

Evaluator (Qwen 3.6 27B)

The image depicts a ceramic bowl with visible gold lacquer lines filling cracks — a hallmark of kintsugi, the traditional Japanese art of repairing broken pottery with gold. The bowl’s form, glaze, and repair technique are consistent with authentic kintsugi practice. The cracks are clearly visible and intentionally highlighted with gold, which aligns with the cultural and aesthetic intent of the art form. The bowl’s material appears ceramic, and the gold lines are applied in a way that is both d

Hard 67%vs100%
bfl/flux-2-max: PASS
A detailed diagram of the water cycle showing evaporation from the ocean, condensation forming clouds, precipitation as rain, and collection in rivers flowing back to the sea

Evaluator (Qwen 3.6 27B)

The image depicts a diagram of the water cycle with labeled stages: evaporation from the ocean, condensation forming clouds, precipitation as rain, and collection in rivers flowing back to the sea. The sun is shown as the energy source driving evaporation. The arrows indicate the direction of water movement, and the components are clearly labeled. The diagram is factually accurate in its representation of the core processes of the water cycle. While it is a simplified illustration (as expected

bfl/flux-2-max: PASS
A cross-section showing tectonic plates at a convergent boundary with subduction, a trench, volcanic arc, and layered mantle

Evaluator (Qwen 3.6 27B)

The image depicts a cross-section of a convergent plate boundary with subduction, a trench, a volcanic arc, and layered mantle — all of which are recognizable and factually accurate representations of a subduction zone. Key elements correctly shown: - A subducting oceanic plate (dark gray) diving beneath another plate. - A deep ocean trench at the surface where the subduction begins. - A volcanic arc (mountain range with active volcanoes) forming on the overriding plate. - Layered mantle struct

bfl/flux-2-max: FAIL
An anatomical illustration of the human heart showing four chambers, valves, aorta, pulmonary arteries, and veins with correct placement

Evaluator (Qwen 3.6 27B)

The image depicts a cross-sectional anatomical illustration of the human heart with labels pointing to various structures. However, it contains numerous factual and labeling inaccuracies: 1. **Mislabeling of Chambers and Valves:** - “Right Atrium” is labeled twice, but the label “Right Atrium, Leftfrientiuaty (right ventricle)” is nonsensical and incorrect. - “Mitral (bicuspid) (valve)” is labeled in the wrong location — it should be between the left atrium and left ventricle, but the lab

Extreme 67%vs100%
bfl/flux-2-max: PASS
A Morpho menelaus butterfly with accurate iridescent blue dorsal wings, brown ventral wing pattern with eyespots, and correct body morphology

Evaluator (Qwen 3.6 27B)

The depicted subject is a Morpho menelaus butterfly, and it is recognizable as such. The dorsal (upper) wings display a striking iridescent blue, which is characteristic of this species. The ventral (lower) wings show a brown background with prominent eyespots — notably, the large, well-defined eyespots near the hindwing margin are accurate to Morpho menelaus. The body morphology, including the antennae, legs, and thorax, appears anatomically correct for a butterfly. However, there is one notab

bfl/flux-2-max: PASS
The Hagia Sophia interior showing the massive central dome, pendentives, semi-domes, and Islamic calligraphy medallions alongside Byzantine mosaics

Evaluator (Qwen 3.6 27B)

The image depicts the interior of the Hagia Sophia, and it correctly shows the massive central dome, pendentives, and semi-domes — all key architectural features of the building. The Islamic calligraphy medallions (including “Allah,” “Muhammad,” and “A’lā” or “Al-Azīz”) are accurately rendered and appropriately placed, reflecting the mosque’s current configuration. The Byzantine mosaics are also visible, though partially obscured or altered by the Islamic overlay — which is historically accurate

bfl/flux-2-max: FAIL
A specimen of bismuth crystal showing accurate iridescent oxide layers, geometric hopper crystal formation, and characteristic staircase structure

Evaluator (Qwen 3.6 27B)

The depicted subject is a bismuth crystal, and it is recognizable as such — exhibiting the characteristic geometric, stepped, or “hopper” morphology that bismuth crystals commonly form due to their low melting point and tendency to grow in layered, stair-step fashion. The iridescent oxide layers are also present, as bismuth naturally develops a thin oxide film that reflects light in a rainbow-like spectrum, especially under magnification or in certain lighting conditions — which is accurately re
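
The percentages shown for bfl/flux-2-max in the headings can be reproduced from the individual verdicts listed above. The sketch below assumes that each figure is the pooled share of PASS verdicts rounded to the nearest whole percent. That rule is an assumption that happens to match the displayed numbers, not a documented formula, and the helper names `pass_rate` and `percent` are hypothetical.

```python
from fractions import Fraction

# bfl/flux-2-max verdicts for World Knowledge, copied from the entries above.
world_knowledge = {
    "Easy": ["PASS", "PASS", "PASS"],
    "Medium": ["PASS", "FAIL", "PASS"],
    "Hard": ["PASS", "PASS", "FAIL"],
    "Extreme": ["PASS", "PASS", "FAIL"],
}


def pass_rate(verdicts):
    """Exact fraction of PASS verdicts in a list."""
    return Fraction(sum(v == "PASS" for v in verdicts), len(verdicts))


def percent(rate):
    """Round a fraction to a whole-number percentage (assumed display rule)."""
    return round(float(rate) * 100)


tier_rates = {tier: percent(pass_rate(v)) for tier, v in world_knowledge.items()}
pooled = [v for verdicts in world_knowledge.values() for v in verdicts]
world_knowledge_rate = percent(pass_rate(pooled))

# Truthfulness pools Photorealism (3 PASS), Physics & Reflections (12 PASS)
# and World Knowledge (9 PASS, 3 FAIL), all as listed above.
truthfulness_rate = percent(pass_rate(["PASS"] * 15 + pooled))

print(tier_rates)            # {'Easy': 100, 'Medium': 67, 'Hard': 67, 'Extreme': 67}
print(world_knowledge_rate)  # 75
print(truthfulness_rate)     # 89
```

Under this assumption, 9 passes out of 12 give the 75% shown for World Knowledge, and 24 out of 27 give the 89% shown for Truthfulness.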

Professional Studio 89%vs100%

Camera & Lighting 100%vs100%
Easy 100%vs100%
bfl/flux-2-max: PASS
A wide-angle shot of a modern living room interior showing expansive space with slight barrel distortion at the edges

Evaluator (Gemma 4 26B)

The image presents a wide-angle shot of a modern living room interior, and the perspective clearly exhibits barrel distortion at the edges — especially noticeable in the curved distortion of the walls and floor near the frame’s periphery. This matches the prompt’s request for “slight barrel distortion at the edges.” The depth of field appears uniformly shallow across the scene, which is typical for wide-angle shots with a wide aperture — though not explicitly “shallow” in the traditional portra

bfl/flux-2-max: PASS
A close-up macro shot of a single dewdrop on a rose petal, shallow depth of field

Evaluator (Gemma 4 26B)

The image presents a close-up macro shot of a single dewdrop resting on a rose petal, with the dewdrop sharply in focus while the surrounding petal and background are softly blurred — indicating a shallow depth of field. The perspective is indeed macro, emphasizing fine details like the petal’s texture and the spherical clarity of the droplet. The lighting is soft and diffused, creating gentle reflections and highlights on the dewdrop’s surface, consistent with natural outdoor or studio lighting

bfl/flux-2-max: PASS
An aerial bird's-eye view looking straight down at a traffic roundabout with cars and trees

Evaluator (Gemma 4 26B)

The image presents a clear aerial bird’s-eye view looking straight down at a traffic roundabout, with cars circulating around a central landscaped island and trees visible both within the island and along the periphery. The perspective is top-down and consistent with a drone or high-altitude camera shot. The depth of field appears uniform across the scene, typical of such an aerial shot, with no noticeable blur or shallow focus. Lighting is bright and even, suggesting daylight with shadows cast

Medium 100% vs 100%
bfl/flux-2-max: PASS
A portrait lit with classic Rembrandt lighting — triangle of light on the shadowed cheek, moody atmosphere

Evaluator (Gemma 4 26B)

The image depicts a portrait with a strong chiaroscuro effect, featuring a distinct triangle of light on the shadowed cheek — a hallmark of Rembrandt lighting. The lighting creates deep shadows on one side of the face while illuminating the other, contributing to a moody, dramatic atmosphere as requested. The perspective is a three-quarter view, typical for such portraits, and the depth of field is shallow, with the subject sharply in focus against a softly blurred, dark background. The composit

bfl/flux-2-max: PASS
A landscape photograph bathed in golden hour light — long warm shadows, orange sky, soft glow

Evaluator (Gemma 4 26B)

The image depicts a landscape bathed in warm, golden-hour lighting, with long, soft shadows cast by the trees and terrain. The sky is a gradient of orange and soft pink hues, consistent with a sunset or sunrise. The perspective is wide and immersive, showing a winding dirt path leading into the distance, which enhances depth. The depth of field is shallow — foreground elements are sharp while background hills and trees gradually blur, creating a sense of atmospheric perspective. The lighting pat

bfl/flux-2-max: PASS
A high-key portrait of a person against a bright white background, minimal shadows, airy and luminous

Evaluator (Gemma 4 26B)

The image presents a high-key portrait with a bright white background, as specified. The lighting is soft and even, minimizing shadows across the face and neck, which contributes to an airy, luminous quality. The depth of field appears shallow, keeping the subject’s face in sharp focus while the background is uniformly white and slightly blurred, enhancing the portrait’s clean, minimalist aesthetic. The perspective is a direct frontal shot, typical for high-key portraiture, which aligns with the

Hard 100% vs 100%
bfl/flux-2-max: PASS
A portrait shot at 85mm f/1.4 with creamy bokeh background, backlit with warm rim light separating the subject from a blurred garden

Evaluator (Gemma 4 26B)

The image shows a portrait with a shallow depth of field, producing a creamy, blurred background consistent with a wide aperture (f/1.4) and likely an 85mm lens — both of which are implied by the prompt. The subject is backlit with a warm rim light that beautifully separates her hair and shoulders from the background, creating a glowing halo effect. The background is out of focus with soft, natural-looking bokeh, suggesting a garden setting. The perspective is a close-up portrait shot, centered

bfl/flux-2-max: PASS
A street photograph with shallow depth of field as if shot at 50mm f/1.8, subject sharp with soft city lights in the background

Evaluator (Gemma 4 26B)

The image depicts a street scene at night with a person walking on a wet sidewalk. The subject is in focus, while the background — including streetlights, storefronts, and a car — is softly blurred with circular bokeh, which is characteristic of a shallow depth of field. The lighting comes from ambient city sources (streetlights, signs), creating warm reflections on the wet pavement and soft, out-of-focus highlights in the background. The perspective is eye-level and natural, consistent with a 5

bfl/flux-2-max: PASS
A food photograph shot from above at 35mm f/2.8 with natural window light from the left, slight vignette, warm color temperature

Evaluator (Gemma 4 26B)

The image is shot from a top-down perspective, which matches the “shot from above” requirement. The depth of field appears shallow, with the main dish in focus and the background (table, cutlery, glass) softly blurred — consistent with a 35mm f/2.8 lens. The lighting comes from the upper left, casting soft shadows to the right, suggesting natural window light from the left. There is a subtle vignette effect, with darker corners and edges, and the color temperature is warm, enhancing the rustic,

Extreme 100% vs 100%
bfl/flux-2-max: PASS
A tilt-shift photograph of a city street making real-life look like a miniature model — selective focus strip, saturated colors

Evaluator (Gemma 4 26B)

The image successfully emulates a tilt-shift photograph of a city street, with a narrow band of sharp focus running horizontally across the middle of the frame — precisely matching the “selective focus strip” requirement. The perspective is high-angle, looking down on the street, which enhances the miniature effect. The depth of field is shallow, with foreground and background elements blurred, which is characteristic of tilt-shift photography. The lighting appears bright and even, with consiste

bfl/flux-2-max: PASS
A cinematic frame with blue and orange anamorphic lens flare streaking horizontally across a dark moody scene

Evaluator (Gemma 4 26B)

The image presents a low-angle, cinematic perspective with a shallow depth of field, focusing on scattered objects on a wet surface while blurring the background into bokeh. The lighting pattern features horizontal streaks of blue and orange light — likely from motion blur or long exposure — that cut across the frame, creating a strong contrast and mood. These streaks resemble anamorphic lens flares in their elongated, horizontal form and color temperature contrast (blue on the left, orange on t

bfl/flux-2-max: PASS
A portrait with dramatic split lighting — half the face brightly lit, the other half in deep shadow, high contrast chiaroscuro

Evaluator (Gemma 4 26B)

The image presents a portrait with a clear, sharp division down the center of the face — one half brightly lit with detailed texture and color, the other half completely in deep shadow. This perfectly matches the “dramatic split lighting” and “high contrast chiaroscuro” described in the prompt. The lighting pattern is symmetrical and intentional, creating a strong visual contrast. The perspective is a direct, centered close-up, and the depth of field is shallow, keeping the face in sharp focus w

Color Precision 83% vs 100%
Easy 100% vs 100%
bfl/flux-2-max: PASS
A bright red sports car parked on a gray asphalt road — the red is vivid and unambiguous

Evaluator (Gemma 4 26B)

The car in the image is a bright, vivid red — consistent with Ferrari’s iconic “Rosso Corsa” color. The red is unambiguous, saturated, and stands out clearly against the gray asphalt road. The prompt specified “a bright red sports car” and “the red is vivid and unambiguous,” which is accurately reflected in the image. VERDICT: PASS

bfl/flux-2-max: PASS
A woman wearing a royal blue evening dress against a neutral background

Evaluator (Gemma 4 26B)

The dress in the image is a rich, deep royal blue, which closely matches the color described in the prompt. The background is a neutral beige or light taupe, which also aligns with the “neutral background” requirement. There are no conflicting or incorrect colors visible. VERDICT: PASS

bfl/flux-2-max: PASS
A field of pure sunflower yellow flowers under a clear blue sky

Evaluator (Gemma 4 26B)

The image depicts a vast field of sunflowers under a clear sky. The sunflowers are a vibrant, saturated yellow, consistent with the description “pure sunflower yellow.” The sky is a clear, gradient blue, matching “clear blue sky.” There are no conflicting or inaccurate colors present. The visual representation aligns precisely with the prompt’s color specifications. VERDICT: PASS

Medium 33% vs 100%
bfl/flux-2-max: PASS
A bedroom wall painted in #FF6B35 (burnt orange) with white furniture and natural daylight

Evaluator (Gemma 4 26B)

The wall in the image is painted in a warm, burnt orange hue that closely matches the hex color #FF6B35. The furniture is uniformly white, as specified. Natural daylight is present, casting soft shadows and illuminating the room as described. VERDICT: PASS

bfl/flux-2-max: FAIL
A forest scene where the foliage is precisely #2D5F2D (dark muted green) with brown tree trunks

Evaluator (Gemma 4 26B)

The foliage in the image is not precisely #2D5F2D (dark muted green). Instead, it appears as a range of desaturated, cool-toned greens and blues, with significant variation due to lighting and atmospheric effects. The color #2D5F2D is a specific, deep green, but the leaves in the image are more grayish-blue and shadowed, lacking the exact hue specified. The tree trunks are brown, which matches the prompt, but the foliage does not. VERDICT: FAIL

bfl/flux-2-max: FAIL
A room with walls in #4A90D9 (medium blue) with warm wooden floor and white trim

Evaluator (Gemma 4 26B)

The prompt specified walls in the exact hex color #4A90D9 (a medium blue). In the image, the walls are painted in a darker, more muted blue — closer to a navy or slate blue — which is visually distinct from #4A90D9. While the white trim and warm wooden floor are correctly rendered, the wall color does not match the specified hex value. VERDICT: FAIL
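
The hex comparisons in this category are judged visually by the evaluator. As a rough illustration of how such a check could be made numeric, here is a minimal sketch that assumes plain Euclidean distance in RGB. The sampled color `#3B5B7A` is a hypothetical stand-in for a "slate blue", not a value measured from the image, and the tolerance of 40 is likewise an assumption.

```python
import math


def hex_to_rgb(code):
    """Convert '#RRGGBB' to an (R, G, B) tuple of ints in 0..255."""
    code = code.lstrip("#")
    if len(code) != 6:
        raise ValueError(f"expected 6 hex digits, got {code!r}")
    return tuple(int(code[i:i + 2], 16) for i in (0, 2, 4))


def rgb_distance(a, b):
    """Euclidean distance between two hex colors in RGB space."""
    return math.dist(hex_to_rgb(a), hex_to_rgb(b))


def matches(target, sampled, tolerance=40.0):
    """True if the sampled color lies within the (assumed) tolerance."""
    return rgb_distance(target, sampled) <= tolerance


TARGET = "#4A90D9"   # wall color requested in the prompt
SAMPLED = "#3B5B7A"  # hypothetical sample, for illustration only

if __name__ == "__main__":
    print(round(rgb_distance(TARGET, SAMPLED), 1), matches(TARGET, SAMPLED))
```

Euclidean RGB distance is only a crude proxy for perceived difference. A perceptual metric such as a CIELAB Delta E would be closer to what a human or vision-model evaluator sees.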

Hard 100% vs 100%
bfl/flux-2-max: PASS
A minimalist room with walls in #F5E6CC (warm cream), a sofa in #2C3E50 (dark navy), and accent pillows in #E74C3C (bright red)

Evaluator (Gemma 4 26B)

The walls in the image are a warm cream color that closely matches #F5E6CC. The sofa is a dark navy that visually aligns with #2C3E50. The accent pillows are a bright red that matches #E74C3C. All specified colors are accurately represented in the image. VERDICT: PASS

bfl/flux-2-max: PASS
A product flat-lay on a #1A1A2E (deep dark blue) background, with a #F0C27F (soft gold) watch and #FFFFFF (pure white) earbuds

Evaluator (Gemma 4 26B)

The image shows:
- A deep dark blue background that closely matches #1A1A2E.
- A watch with a soft gold finish that matches #F0C27F.
- White earbuds that match #FFFFFF.
All specified colors are accurately represented in the image. VERDICT: PASS

bfl/flux-2-max: PASS
A graphic design composition with background #FAFAFA (near-white), main shape #FF4500 (orange-red), and accent lines #333333 (dark charcoal)

Evaluator (Gemma 4 26B)

The image contains: - Background: #FAFAFA (near-white) — matches perfectly. - Main shape: #FF4500 (orange-red) — the main rounded shape is a reddish-orange, very close to #FF4500 (a classic orange-red). Visually, it’s a slightly muted or desaturated version, but it’s clearly in the same color family and within acceptable tolerance for design purposes. - Accent lines: #333333 (dark charcoal) — the thin lines are a dark gray, very close to #333333. Slight variations in rendering may occur, but it’

Extreme 100% vs 100%
bfl/flux-2-max: PASS
A smooth gradient on a wall going from #FF0000 (red) on the left to #0000FF (blue) on the right, transitioning evenly through purple

Evaluator (Gemma 4 26B)

The image displays a smooth gradient transitioning from red on the left to blue on the right, passing through purple in the center — exactly as described in the prompt. The red on the left matches #FF0000 (pure red), and the blue on the right matches #0000FF (pure blue). The transition is even and visually consistent with a linear blend through purple. VERDICT: PASS
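
To make "transitioning evenly through purple" concrete, the sketch below computes evenly spaced stops between the two endpoint colors. It assumes straight linear interpolation of 8-bit sRGB channel values, which is one reading of the prompt rather than anything the benchmark specifies. The helper names are illustrative.

```python
def lerp_rgb(start, end, t):
    """Linearly interpolate two RGB tuples; t=0 gives start, t=1 gives end."""
    return tuple(round(s + (e - s) * t) for s, e in zip(start, end))


def to_hex(rgb):
    """Format an (R, G, B) tuple as '#RRGGBB'."""
    return "#{:02X}{:02X}{:02X}".format(*rgb)


RED = (255, 0, 0)    # #FF0000, left edge in the prompt
BLUE = (0, 0, 255)   # #0000FF, right edge in the prompt

# Five evenly spaced stops from left to right.
stops = [to_hex(lerp_rgb(RED, BLUE, i / 4)) for i in range(5)]

if __name__ == "__main__":
    print(stops)
```

Under this assumption the middle stop is `#800080`, a mid purple, so a correct render should show purple at the horizontal center.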

bfl/flux-2-max: PASS
A sunset sky gradient on a flat surface from #FF6B35 (orange) at the bottom to #1B1464 (deep indigo) at the top

Evaluator (Gemma 4 26B)

The image displays a smooth vertical gradient transitioning from a bright orange at the bottom to a dark, muted indigo at the top. The bottom color is a vivid orange that closely matches #FF6B35 (a bright, saturated orange). The top color is a deep, dark indigo that is visually consistent with #1B1464 (a rich, dark indigo). The gradient is smooth and follows the specified direction and color progression. VERDICT: PASS

bfl/flux-2-max: PASS
A horizontal banner with a gradient from #00C9FF (cyan) on the left to #92FE9D (mint green) on the right

Evaluator (Gemma 4 26B)

The image shows a horizontal banner with a smooth gradient transitioning from a cyan hue on the left to a mint green hue on the right. The leftmost color is visually consistent with #00C9FF (a bright cyan), and the rightmost color matches #92FE9D (a light mint green). The gradient is seamless and follows the exact color progression specified in the prompt. VERDICT: PASS
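
The Color Precision percentages shown for bfl/flux-2-max (83% overall; 100%, 33%, 100%, 100% by difficulty) are consistent with the twelve verdicts listed in this category. A small sketch of that arithmetic follows, assuming each rate is the share of PASS verdicts rounded to the nearest whole percent. The rounding rule is an assumption, since the page does not state it.

```python
# Verdicts for bfl/flux-2-max in Color Precision, in the order listed above.
VERDICTS = {
    "Easy": ["PASS", "PASS", "PASS"],
    "Medium": ["PASS", "FAIL", "FAIL"],
    "Hard": ["PASS", "PASS", "PASS"],
    "Extreme": ["PASS", "PASS", "PASS"],
}


def pass_rate(verdicts):
    """Percentage of PASS verdicts, rounded to a whole number."""
    passed = sum(v == "PASS" for v in verdicts)
    return round(100 * passed / len(verdicts))


per_level = {level: pass_rate(v) for level, v in VERDICTS.items()}
overall = pass_rate([v for group in VERDICTS.values() for v in group])

if __name__ == "__main__":
    print(per_level, overall)
```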

Photorealism 67% vs 100%
Easy 67% vs 100%
bfl/flux-2-max: PASS
A single red sneaker on a clean white background with soft studio lighting and gentle shadow — product photography style

Evaluator (Gemma 4 26B)

The image presents a clean, studio-quality product shot of a red sneaker on a white background with soft lighting and a gentle shadow — exactly matching the prompt. The textures, stitching, laces, and sole details are rendered with high fidelity. There are no obvious AI artifacts: no warped geometry, inconsistent lighting, unnatural reflections, or texture smearing. The shadow is soft and physically plausible, and the white background is uniformly clean. The sneaker’s materials appear realistic,

bfl/flux-2-max: PASS
A clear glass perfume bottle on a reflective black surface with studio lighting, no background distractions

Evaluator (Gemma 4 26B)

The image presents a highly polished, studio-quality product shot of a clear glass perfume bottle on a reflective black surface, with no background distractions — exactly matching the prompt. The lighting is dramatic and controlled, creating realistic highlights and reflections. The glass refracts light naturally, and the reflection below is crisp and symmetrical. There are no obvious AI artifacts: no warped geometry, inconsistent reflections, or unnatural textures. The bottle’s internal structu

bfl/flux-2-max: FAIL
A ripe green apple on a seamless white background with soft side lighting and subtle shadow — catalog style

Evaluator (Gemma 4 26B)

The image presents a green apple on a seamless white background with soft side lighting and a subtle shadow — matching the prompt’s description. At first glance, it appears to be a real photograph due to its natural-looking highlights, smooth gradient shadows, and realistic texture. However, upon closer inspection, there are subtle AI artifacts: - The apple’s skin has an unnaturally uniform, almost “painted” texture with perfectly even stippling — not the random, organic variation of real apple

Graphical design 75% vs 92%

Data Visualisation 33% vs 100%
Easy 33% vs 100%
bfl/flux-2-max: FAIL
A bar chart showing monthly sales for January through June with values 12K, 18K, 15K, 22K, 19K, 25K — labeled axes, color-coded bars, and a title 'Monthly Sales 2025'

Evaluator (Qwen 3 VL)

The requested bar chart shows monthly sales for January through June with values 12K, 18K, 15K, 22K, 19K, 25K — but the chart incorrectly assigns these values to months. Specifically: - January (Jan) is labeled 12K — correct. - February (Feb) is labeled 18K — correct. - March (Mar) is labeled 18K — correct. - April (Apr) is labeled 15K — correct. - May (May) is labeled 22K — correct. - June (Jun) is labeled 25K — correct. The values are correctly assigned to the months. However, the legend is

bfl/flux-2-max: PASS
A pie chart showing market share: Company A 35%, Company B 25%, Company C 20%, Others 20% — each slice a different color with labels and percentages

Evaluator (Qwen 3 VL)

The chart is a pie chart titled "Market Share" and displays four slices:
- Company A: 35% (blue)
- Company B: 25% (red)
- Company C: 20% (green)
- Others: 20% (orange)
Each slice has a distinct color and is labeled with its corresponding company name and percentage. The percentages add up to 100% (35 + 25 + 20 + 20 = 100), matching the requested data. The labels are clearly readable and positioned appropriately. The chart type is correctly a pie chart as requested. VERDICT: PASS

bfl/flux-2-max: FAIL
A simple line graph showing daily website visitors over one week (Mon-Sun): 1200, 1500, 1100, 1800, 2000, 1700, 1400 — with grid lines, axis labels, and title 'Weekly Traffic'

Evaluator (Qwen 3 VL)

The requested data is: Mon=1200, Tue=1500, Wed=1100, Thu=1800, Fri=2000, Sat=1700, Sun=1400. In the image: - Mon shows 1800 (should be 1200) — incorrect. - Tue shows 1500 — correct. - Wed shows 1100 — correct. - Thu shows 1800 — correct. - Fri shows 1700 — correct. - Sat shows 1800 — should be 2000 — incorrect. - Sun shows 2000 — should be 1400 — incorrect. The chart type is a line graph — correct. Grid lines are present — correct. Axis labels (“Visitors” and “Day”) and title (“Weekly Traffic”
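
The day-by-day comparison in the evaluator note above can also be done mechanically. The sketch below is one way to do that, assuming the values the evaluator reports reading from the image are accurate. `OBSERVED` is transcribed from the note, not re-measured from the chart.

```python
DAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]
REQUESTED = [1200, 1500, 1100, 1800, 2000, 1700, 1400]  # from the prompt
OBSERVED = [1800, 1500, 1100, 1800, 1700, 1800, 2000]   # as read by the evaluator


def mismatches(days, requested, observed):
    """Return (day, requested, observed) for every position that differs."""
    return [
        (day, want, got)
        for day, want, got in zip(days, requested, observed)
        if want != got
    ]


wrong = mismatches(DAYS, REQUESTED, OBSERVED)

if __name__ == "__main__":
    for day, want, got in wrong:
        print(f"{day}: requested {want}, chart shows {got}")
```

A positional comparison like this flags Mon, Fri, Sat and Sun. It includes Fri, which the note marks as correct even though 1700 differs from the requested 2000.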

Layout & Design 78% vs 89%
Easy 100% vs 100%
bfl/flux-2-max: PASS
A movie night poster with a large popcorn image in the center, the title 'MOVIE NIGHT' in bold at the top, and the tagline 'Every Friday at 8 PM' at the bottom

Evaluator (Qwen 3 VL)

The image features a large popcorn bucket centered in the composition, which matches the prompt’s description. The title “MOVIE NIGHT” is prominently displayed at the top in bold, white, uppercase letters, establishing clear visual hierarchy. The tagline “Every Friday at 8 PM” is correctly placed at the bottom, also in readable white text. All described layout zones and elements are present, and the text is legible and correctly positioned. VERDICT: PASS

bfl/flux-2-max: PASS
An event poster featuring a guitar illustration, the headline 'LIVE MUSIC FEST' in large text, and 'Tickets available now' as a tagline below

Evaluator (Qwen 3 VL)

The image displays a guitar illustration on the right side, the headline "LIVE MUSIC FEST" in large, bold white text on the left, and the tagline "Tickets available now" in smaller yellow text directly beneath the headline. The visual hierarchy is clear: the headline is the most prominent element, followed by the tagline, and the guitar serves as a thematic visual anchor. All text is readable and correctly placed according to the prompt. VERDICT: PASS

bfl/flux-2-max: PASS
A book club poster with an open book illustration, the title 'JOIN THE BOOK CLUB' prominently displayed, and 'Meets every Wednesday' as the tagline

Evaluator (Qwen 3 VL)

All described layout zones/elements are present:
- The title "JOIN THE BOOK CLUB" is prominently displayed at the top in large, bold text.
- An open book illustration is centered below the title.
- The tagline "Meets every Wednesday" is correctly placed at the bottom.
The visual hierarchy is clear: the title is largest and most prominent, followed by the central illustration, then the smaller tagline at the bottom. All text is readable and correctly positioned. VERDICT: PASS

Medium 67% vs 67%
bfl/flux-2-max: PASS
A landing page design with a hero header image at the top, a headline and subtitle in the middle, a 'Sign Up' call-to-action button, and a footer with three social media icons

Evaluator (Qwen 3 VL)

The image displays a landing page with the following elements: - Hero header image: Present at the top, featuring a stylized abstract cloud graphic on the right side. - Headline and subtitle: The headline “Grow your business faster” is prominently displayed in large, bold text. Below it, the subtitle “All-in-one platform to manage customers, analytics, and marketing in one place.” is clearly visible and correctly placed. - “Sign Up” call-to-action button: A blue, rounded button with the text “S

bfl/flux-2-max: PASS
A newsletter layout with a banner image header, a two-column body section with text on the left and an image on the right, a highlighted quote block, and a footer with unsubscribe link

Evaluator (Qwen 3 VL)

The image displays a newsletter layout that includes: - A banner image header with the title “Monthly Insights” and subtitle “Your update on the latest trends” — ✅ Present. - A two-column body section: text on the left (“Featured Article” and “Key Takeaways”) and an image on the right — ✅ Present. - A highlighted quote block (“Innovation starts with a single idea.” — Editorial Team) — ✅ Present. - A footer with “View in browser,” “© 2026 Your Company,” and an “Unsubscribe” button — ✅ Present.

bfl/flux-2-max: FAIL
A restaurant menu layout with a logo header, three category sections (appetizers, mains, desserts) each with items and prices, and a footer with address and hours

Evaluator (Qwen 3 VL)

The image shows a restaurant menu with a logo header (“Harbor Bistro”), three category sections (Appetizers, Mains, Desserts), and a footer with address and hours — all of which are present. However, the text under each menu item is garbled and unreadable (e.g., “Grilled salmon s |acid”, “Pasta Primavera when savom”), which violates the requirement that text must be readable and correctly placed. The prices are mostly present but one (Chocolate Mousse) is missing a dollar sign. The layout hiera

Hard 67% vs 100%
bfl/flux-2-max: FAIL
A magazine double-page spread with a left page containing a large hero photo, and a right page with a two-column article, a pull quote, a sidebar with author bio, and a page number

Evaluator (Qwen 3 VL)

The layout matches the prompt’s description: the left page has a large hero photo of a man in a suit, and the right page contains a two-column article, a pull quote (“Creativity is no longer a luxury – it is the engine of progress.”), a sidebar with an author bio (“About the author: Alex Margan...”), and a page number (27) at the bottom right. However, the text in the article columns is gibberish — it appears to be placeholder text (e.g., “Lorem ipsum”-style nonsense) and is not readable or mea

bfl/flux-2-max: PASS
A mobile app UI mockup showing a dashboard with a top navigation bar, a search field, a 2x2 grid of feature cards with icons, a list of recent activity items, and a bottom tab bar

Evaluator (Qwen 3 VL)

The image shows a mobile app UI mockup that closely matches the described layout: - Top navigation bar: Present as the “Dashboard” title at the top left and a profile avatar at the top right. - Search field: Clearly visible below the title, with a magnifying glass icon and placeholder text “Search”. - 2x2 grid of feature cards: Present with icons and labels: “Analytics”, “Messages”, “Calendar”, and “Settings”. - Recent Activity list: Clearly labeled and positioned below the grid, with entries s

bfl/flux-2-max: PASS
A fashion magazine spread with a full-bleed left-page photo, right page with three-column layout, headline, body copy, a smaller inset image, and styled page numbers

Evaluator (Qwen 3 VL)

The image shows a fashion magazine spread that closely matches the described layout: - Left page: Full-bleed photo of a model walking outdoors — ✅ Present. - Right page: Three-column layout — ✅ Present (two columns of body copy with a smaller inset image on the right, visually forming a three-column structure). - Headline: “URBAN ELEGANCE” — ✅ Present, large and centered at the top. - Body copy: Two columns of text — ✅ Present, readable (though placeholder text like “Lorem ipsum” is used). - In

Style Diversity 83% vs 92%
Easy 100% vs 100%
bfl/flux-2-max: PASS
A golden retriever sitting in a garden, painted in the style of an oil painting with visible brushstrokes and rich texture

Evaluator (Qwen 3 VL)

The image depicts a golden retriever sitting in a garden, surrounded by flowers and foliage. The style is clearly that of an oil painting: thick, visible brushstrokes are evident throughout the composition, especially in the fur of the dog and the leaves and petals in the background. The texture is rich and painterly, with layered impasto effects and a warm, luminous quality typical of traditional oil painting. The lighting and color palette also support the impression of a classical, textured o

bfl/flux-2-max: PASS
A golden retriever sitting in a garden, rendered in Japanese anime style with flat colors and large expressive eyes

Evaluator (Qwen 3 VL)

The image depicts a golden retriever sitting in a garden, rendered in a style that strongly resembles Japanese anime. Key characteristics of the requested style are clearly present: - **Large, expressive eyes**: The dog has oversized, sparkling, emotive eyes with prominent highlights — a hallmark of anime style. - **Flat colors with minimal shading**: The image uses solid, vibrant color blocks with soft gradients, avoiding complex lighting or realistic textures. - **Simplified anatomy and styli

bfl/flux-2-max: PASS
A golden retriever sitting in a garden, rendered in pixel art style with a 32x32 grid and limited color palette

Evaluator (Qwen 3 VL)

The image depicts a golden retriever sitting in a garden, rendered in a pixel art style. The visual characteristics of pixel art are clearly present: individual pixels form the shapes, edges are blocky and defined, and the image uses a limited color palette (primarily shades of brown, green, blue, red, and white). The 32x32 grid resolution is not explicitly verifiable from the image alone, but the overall blocky, low-resolution aesthetic is consistent with that scale. The subject and setting mat

Medium 100% vs 100%
bfl/flux-2-max: PASS
A city street scene in the style of Art Deco — geometric shapes, gold and black palette, ornamental symmetry

Evaluator (Qwen 3 VL)

The image clearly embodies the requested Art Deco style. It features: - Geometric shapes: sharp lines, zigzags, chevrons, and symmetrical facades on buildings and street elements. - Gold and black palette: dominant use of gold outlines and accents against a black background, with gold-reflective surfaces on cars and pavement. - Ornamental symmetry: the composition is centrally aligned, with mirrored buildings, street lamps, and decorative motifs (e.g., sunburst patterns) creating strong visual

bfl/flux-2-max: PASS
A woman with a parasol in the style of Impressionism — soft brushstrokes, dappled light, pastel palette

Evaluator (Qwen 3 VL)

The image depicts a woman holding a parasol in a field of wildflowers under a soft, cloudy sky. The brushwork is visible and textured, with loose, expressive strokes that suggest movement and light — particularly in the sky, foliage, and the flowing dress. The palette is predominantly pastel, with gentle blues, pinks, yellows, and greens, evoking the luminous, atmospheric quality of Impressionism. Dappled light is implied through the interplay of color and shadow in the field and on the figure.

bfl/flux-2-max: PASS
A portrait of a man in Pop Art style — bold outlines, Ben-Day dots, flat vivid primary colors

Evaluator (Qwen 3 VL)

The image clearly embodies the requested Pop Art style. It features: - Bold, clean black outlines defining all shapes and forms. - Ben-Day dots (a hallmark of Pop Art) used for shading and background gradients, visible in the skin tones, hair, suit, and background. - Flat, vivid primary colors: red, yellow, and blue dominate the composition, with white and black used for contrast and definition. All key visual characteristics of the requested Pop Art style are present and unmistakable. VERDIC

Hard 33% vs 67%
bfl/flux-2-max: FAIL
A forest rendered in photorealistic watercolor — the detail and lighting of a photo combined with visible watercolor bleeds and paper texture

Evaluator (Qwen 3 VL)

The image depicts a forest scene with soft, diffused lighting and atmospheric depth, reminiscent of a photograph in tone and composition. However, it lacks the key visual characteristics of watercolor: visible paper texture, watercolor bleeds, translucent washes, or brushstroke imperfections. The rendering is smooth and digitally clean, with consistent lighting and sharp detail that aligns more with photorealism or digital painting than watercolor. While the mood and lighting evoke a watercolor

bfl/flux-2-max: PASS
A woman in a futuristic city rendered in cyberpunk Art Nouveau — neon colors and tech elements combined with flowing organic lines and floral borders

Evaluator (Qwen 3 VL)

The image successfully embodies the requested “cyberpunk Art Nouveau” style. Key visual characteristics are clearly present: - **Cyberpunk elements**: Neon signage (pink and cyan), rain-slicked futuristic cityscape, glowing tech interfaces, and a woman in a biomechanical outfit with visible circuitry and cybernetic enhancements. - **Art Nouveau elements**: Flowing, organic lines frame the composition; floral motifs (lilies and roses) are integrated into the border; elegant, curvilinear decorati

bfl/flux-2-max: FAIL
A medieval castle depicted in Japanese ukiyo-e woodblock print style combined with photorealistic sky and lighting

Evaluator (Qwen 3 VL)

The image attempts to merge a medieval castle with ukiyo-e woodblock print aesthetics, but it fails to clearly satisfy the requested style. The ukiyo-e elements — such as flat planes of color, bold outlines, and stylized composition — are largely absent. Instead, the castle is rendered with photorealistic textures and shading, especially in the stone walls and the detailed wooden bridge. The sky and lighting, while atmospheric, are rendered in a way that resembles cinematic realism rather than t

Extreme 100% vs 100%
bfl/flux-2-max: PASS
An interior scene rendered in soft brutalism — raw concrete textures combined with rounded pastel furniture and warm diffused light

Evaluator (Qwen 3 VL)

The image clearly depicts an interior scene that aligns with the requested “soft brutalism” style. Key visual characteristics are present: - **Raw concrete textures**: The walls, ceiling, and structural elements are made of exposed, textured concrete with visible cracks and imperfections — a hallmark of brutalism. - **Rounded pastel furniture**: The seating — including the curved sofa and armchair — is upholstered in soft, rounded forms in pastel shades (pink, mint, beige, yellow), contrasting

bfl/flux-2-max: PASS
A still life painted in anxious minimalism — sparse composition with uncomfortable negative space, muted colors, and subtle visual tension

Evaluator (Qwen 3 VL)

The image presents a still life with a sparse composition: three objects — a glass, a bowl, and a folded napkin — arranged on a table against a vast, unadorned wall. The negative space is indeed substantial and uncomfortable in its emptiness, dominating the frame and amplifying the quiet isolation of the objects. The color palette is muted — soft grays, pale greens, and off-whites — avoiding vibrancy and reinforcing a subdued, contemplative mood. The brushwork is visible and painterly, suggestin

bfl/flux-2-max: PASS
A cityscape rendered in nostalgic futurism — retro-futuristic 1960s space-age aesthetics with a melancholic warm-toned patina

Evaluator (Qwen 3 VL)

The image successfully embodies the requested “nostalgic futurism — retro-futuristic 1960s space-age aesthetics with a melancholic warm-toned patina.” Key visual characteristics are clearly present: - **Retro-futurism / 1960s Space-Age Aesthetics**: The architecture features flying saucers, rocket ships, flying cars, domed towers, and curved, chrome-and-glass structures reminiscent of mid-century sci-fi visions (e.g., “The Jetsons” or “2001: A Space Odyssey” era). The vehicles and buildings hav