ImageBench

vs

192 evaluations across 6 categories

Pass Rate: 79% vs 95%
Avg Latency: 4.1s vs 28.1s

Text Rendering 40% vs 87%
Spatial Reasoning 77% vs 97%
Human realism 95% vs 98%
Truthfulness 82% vs 93%
Professional Studio 93% vs 100%
Graphical design 58% vs 92%
Latency 59% vs 2%

Text Rendering 40% vs 87%

Typography Style 67% vs 100%
Easy 67% vs 100%
bfl/flux-2-klein-9b: FAIL
The word 'CHAPTER ONE' typed on aged paper with a vintage typewriter font, complete with slightly uneven ink

Evaluator (Qwen 3 VL)

The text “CHAPTER ONE” is rendered in a font that strongly resembles a vintage typewriter style — blocky, slightly uneven, with visible ink smudges and imperfections that mimic the look of old typewriter output. The paper background is aged, with creases, stains, and a yellowed texture, matching the “aged paper” description. The ink appears slightly uneven and worn, consistent with the prompt’s request. However, there is a spelling error: the word “CHAPTER” has an apostrophe inserted between th

bfl/flux-2-klein-9b: PASS
The phrase 'TODAY'S SPECIAL' written in white chalk on a dark green chalkboard with chalk dust

Evaluator (Qwen 3 VL)

The text “TODAY’S SPECIAL” is rendered in a style that convincingly mimics white chalk on a dark green chalkboard. The letters have a granular, textured appearance consistent with chalk dust, and there are visible specks of chalk dust scattered around and within the letters, especially to the right. The surface shows faint smudges and erasure marks, enhancing the realism of a used chalkboard. The spelling is correct: “TODAY’S SPECIAL” (with proper apostrophe). VERDICT: PASS

bfl/flux-2-klein-9b: PASS
The word 'OPEN' displayed on a red LED sign with glowing pixel-style letters

Evaluator (Qwen 3 VL)

The text “OPEN” is rendered in a pixel-style composed of individual glowing red LED dots, which convincingly matches the described medium of a red LED sign. The letters are clearly formed with the characteristic blocky, dot-matrix appearance typical of LED displays. The spelling is correct: “O-P-E-N”. VERDICT: PASS
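Each complete evaluator response on this page closes with a `VERDICT: PASS` or `VERDICT: FAIL` marker, while some responses are cut off before the marker. A minimal sketch of pulling that marker out of a response string follows; the function name and the choice to return `None` for truncated responses are assumptions for illustration, not part of the benchmark's published tooling.

```python
import re
from typing import Optional

# Marker format as it appears in the evaluator responses on this page.
_VERDICT_RE = re.compile(r"VERDICT:\s*(PASS|FAIL)")


def extract_verdict(response: str) -> Optional[str]:
    """Return 'PASS' or 'FAIL' from the last verdict marker, or None if absent."""
    matches = _VERDICT_RE.findall(response)
    return matches[-1] if matches else None


# One complete response tail and one truncated tail, both taken from this page.
complete = 'The spelling is correct: "O-P-E-N". VERDICT: PASS'
truncated = "the text is not fully accurate. VERDI"

print(extract_verdict(complete))   # PASS
print(extract_verdict(truncated))  # None
```

Taking the last match rather than the first guards against a response that mentions the marker while reasoning and then states its final verdict at the end.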

Writing accuracy 33% vs 83%
Easy 100% vs 100%
bfl/flux-2-klein-9b: PASS
A single large word 'HELLO' centered on a white background

Evaluator (Qwen 3 VL)

The word displayed is "HELLO", which is spelled correctly and fully readable. All five letters — H, E, L, L, O — are present, clear, and unambiguous. There are no misspellings, missing letters, or illegible parts. VERDICT: PASS

bfl/flux-2-klein-9b: PASS
The word 'ADVENTURE' in big bold letters centered on a plain cream background

Evaluator (Qwen 3 VL)

The word "ADVENTURE" is spelled correctly and is fully readable. All letters are clearly visible and properly formed. There are no misspellings, missing words, or illegible parts. VERDICT: PASS

bfl/flux-2-klein-9b: PASS
The word 'SERENITY' written large and centered on a solid light blue background

Evaluator (Qwen 3 VL)

The word "SERENITY" is spelled correctly and is fully readable. All letters are clearly visible and properly formed against the solid light blue background. There are no misspellings, missing words, or illegible parts. VERDICT: PASS

Medium 33% vs 100%
bfl/flux-2-klein-9b: PASS
The sentence 'The quick brown fox jumps high' in clean black text on a white background

Evaluator (Qwen 3 VL)

All words are spelled correctly and fully readable: - "The" - "quick" - "brown" - "fox" - "jumps" - "high" There are no misspellings, missing words, or illegible parts. VERDICT: PASS

bfl/flux-2-klein-9b: FAIL
'Every sunset brings a new dawn ahead' written neatly on a pale yellow background

Evaluator (Qwen 3 VL)

The text in the image contains several errors compared to the intended phrase: - "sunset" is misspelled as "surnset" - "brings" is misspelled as "brimgs" - "dawn" is misspelled as "nown" All words are fully visible and readable, but they are not spelled correctly. VERDICT: FAIL

bfl/flux-2-klein-9b: FAIL
'Stars shine brightest in the dark night' displayed on a clean off-white background

Evaluator (Qwen 3 VL)

The word "brightest" is misspelled. In the image, it appears as "brigh!st", where the "i" is replaced by an exclamation mark "!" and the "t" is missing. This is a clear error in spelling. Additionally, the word "dark night" is correctly spelled, and all other words ("Stars", "shine", "in", "the", "dark", "night") are spelled correctly. However, because the intended phrase is "Stars shine brightest in the dark night" and the word "brightest" is misspelled, the text is not fully accurate. VERDI

Hard 0% vs 33%
bfl/flux-2-klein-9b: FAIL
A poster with headline 'SUMMER SALE' in large bold text, body text 'Up to 50% off all items this weekend only' in medium size, and fine print 'Terms and conditions apply. See store for details.' at the bottom in small text

Evaluator (Qwen 3 VL)

The word "items" in the body text is misspelled as "iems". VERDICT: FAIL

bfl/flux-2-klein-9b: FAIL
A flyer with large title 'GRAND OPENING', subtitle 'Join us for food, music, and fun' in medium text, and small disclaimer 'RSVP required. Limited seating available.' at the bottom

Evaluator (Qwen 3 VL)

The intended text was: - Large title: "GRAND OPENING" — ✅ Correct and fully readable. - Subtitle: "Join us for food, music, and fun" — ✅ Correct and fully readable. - Small disclaimer: "RSVP required. Limited seating available." — ✅ Correct and fully readable. However, in the image, the subtitle is duplicated incorrectly: - The second line reads: "Join us te food, music, aval fun" - "te" is a misspelling of "for" - "aval" is a misspelling of "and" These are errors in the actual image, not

bfl/flux-2-klein-9b: FAIL
An event banner with headline 'ANNUAL GALA 2025' at the top in large serif font, description 'An evening of fine dining and live jazz' in medium text, and footnote 'Black tie optional. Doors open at 6 PM.' in small text

Evaluator (Qwen 3 VL)

The text in the image contains several errors: - “An evening df of fine” → “df” is not “of” (letters appear distorted or replaced). - “tnimingn and ld dive” → “tnimingn” is not “dining”, and “ld” is not “live”. - “jazz” is correctly spelled, but it’s on its own line, which is fine per layout, but the preceding line is corrupted. The bottom line “Black tie optional. Doors open at 6 PM.” is correct. VERDICT: FAIL

Extreme 0% vs 100%
bfl/flux-2-klein-9b: FAIL
A vintage movie poster for 'THE LAST VOYAGE' featuring a tagline 'One ship. One chance. No turning back.' with credits block reading 'Directed by James Harlow. Starring Elena Voss, Marcus Chen, and Sofia Reyes. Music by Thomas Grant. A Paramount Pict

Evaluator (Qwen 3 VL)

The text in the image contains several errors compared to the intended text: 1. The tagline reads: "ONE ship. ONE CHANCLE:" — "CHANCLE" is misspelled; it should be "CHANCE". 2. The title is "THE VOY AGE" — the words "VOY" and "AGE" are split, but the intended title is "THE LAST VOYAGE". The word "LAST" is missing entirely. 3. In the credits block: - "MARRUS CHEN" — "Marcus Chen" is misspelled. - "SOFIA REYEEES." — "Sofia Reyes" is misspelled (extra 'E' at the end). - "A PARAIOUT PICTUR

bfl/flux-2-klein-9b: FAIL
A magazine cover titled 'NATIONAL EXPLORER' with cover lines 'The Hidden Temples of Cambodia — page 34', 'Interview: Dr. Maria Santos on Climate Action', 'Top 10 Trails You Must Hike This Year', and a barcode area with 'Issue 247 | March 2025 | $6.99

Evaluator (Qwen 3 VL)

The intended text was: - "A magazine cover titled 'NATIONAL EXPLORER' with cover lines 'The Hidden Temples of Cambodia — page 34', 'Interview: Dr. Maria Santos on Climate Action', 'Top 10 Trails You Must Hike This Year', and a barcode area with 'Issue 247 | March 2025 | $6.99'". Actual text on the image: - Title: "NATIONAL EXPLORER" — ✅ Correct. - "The Temples of Cambodia — page 34" — ✅ Correct. - "Interview: Dr. Maria Santos Action" — ❌ "Action" is not preceded by "on", and "Santos Action" is

bfl/flux-2-klein-9b: FAIL
A product label for 'GOLDEN HARVEST ORGANIC HONEY' with subtitle 'Raw & Unfiltered', weight '12 oz (340g)', description 'Pure wildflower honey harvested from sustainable apiaries in the Pacific Northwest. No additives or preservatives.', and small te

Evaluator (Qwen 3 VL)

The text on the label contains several errors compared to the intended text: 1. "HAVESTOT" is misspelled — should be "HARVEST". 2. "Unflteted" is misspelled — should be "Unfiltered". 3. "honeyed" is misspelled — should be "honey". 4. "presertives" is misspelled — should be "preservatives". 5. "Best beere" is misspelled — should be "Best before". 6. The word "apiaries" is correctly spelled, but the phrase "honeyed from" is incorrect — it should be "honey from". 7. The weight "12 oz (340g)" is co
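The page does not state how the percentages are aggregated. The sketch below assumes each rate is simply passes divided by total items, rounded to the nearest percent, and checks that assumption against the bfl/flux-2-klein-9b verdicts listed above. The tier counts are tallied by hand from those verdicts, and the helper name `pass_rate` is hypothetical.

```python
from typing import Iterable, Tuple


def pass_rate(counts: Iterable[Tuple[int, int]]) -> int:
    """Pooled pass rate in whole percent from (passes, total) pairs."""
    counts = list(counts)
    passes = sum(p for p, _ in counts)
    total = sum(t for _, t in counts)
    return round(100 * passes / total)


# (passes, total) per difficulty tier for bfl/flux-2-klein-9b,
# counted from the PASS/FAIL verdicts in the Text Rendering section.
typography_style = {"Easy": (2, 3)}
writing_accuracy = {
    "Easy": (3, 3),
    "Medium": (1, 3),
    "Hard": (0, 3),
    "Extreme": (0, 3),
}

print(pass_rate(typography_style.values()))  # 67
print(pass_rate(writing_accuracy.values()))  # 33
# Pooling both subcategories: 6 passes out of 15 items.
print(pass_rate([*typography_style.values(), *writing_accuracy.values()]))  # 40
```

Under this assumption the results agree with the displayed 67%, 33%, and 40% figures, which suggests the category rate pools all items rather than averaging the subcategory percentages (averaging 67% and 33% would give 50%).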

Spatial Reasoning 77% vs 97%

Attributes Binding 100% vs 100%
Easy 100% vs 100%
bfl/flux-2-klein-9b: PASS
A red ball and a blue cube sitting side by side on a white surface

Evaluator (Qwen 3.5 122B)

Object 1: Red ball — Shape: sphere, Color: red Object 2: Blue cube — Shape: cube, Color: blue All attributes are correctly bound to the right object with no swaps. The red object is a ball (sphere), and the blue object is a cube — matching the prompt exactly. VERDICT: PASS

bfl/flux-2-klein-9b: PASS
A green apple and an orange pumpkin on a wooden cutting board

Evaluator (Qwen 3.5 122B)

Object 1: Green apple - Color: Green (correct) - Shape: Round (correct) - Size: Small to medium (correct for an apple) Object 2: Orange pumpkin - Color: Orange (correct) - Shape: Round with ribbed segments (correct for a pumpkin) - Size: Large (correct for a pumpkin) All attributes are correctly bound to the right object with no swaps. VERDICT: PASS

bfl/flux-2-klein-9b: PASS
A yellow taxi and a black limousine parked next to each other on a street

Evaluator (Qwen 3.5 122B)

Object 1: Yellow taxi - Color: Yellow (correct) - Shape: Standard sedan shape with taxi roof sign (correct) - Size: Medium-sized, typical taxi dimensions (correct) Object 2: Black limousine - Color: Black (correct) - Shape: Extended-length vehicle with elongated body (correct) - Size: Larger than the taxi, consistent with limousine dimensions (correct) All attributes are correctly bound to the right object with no swaps. VERDICT: PASS

Medium 100% vs 100%
bfl/flux-2-klein-9b: PASS
A large red circle, a small blue triangle, and a medium green star arranged on a white canvas

Evaluator (Qwen 3.5 122B)

Object 1: Large red circle — Correct. The large circular shape is red, matching the prompt. Object 2: Small blue triangle — Correct. The small blue triangle is visible inside the red circle, matching the prompt. Object 3: Medium green star — Correct. The green star is medium-sized and located to the right of the circle, matching the prompt. All three objects are correctly bound to their specified attributes (color, size, shape) with no swaps. VERDICT: PASS

bfl/flux-2-klein-9b: PASS
A tall thin white candle, a short fat red candle, and a medium spiral yellow candle on a shelf

Evaluator (Qwen 3.5 122B)

Object 1: Tall thin white candle — Observed: White color, tall and thin shape. ✅ Matches prompt. Object 2: Short fat red candle — Observed: Red color, short and fat (cylindrical) shape. ✅ Matches prompt. Object 3: Medium spiral yellow candle — Observed: Yellow color, spiral shape. ✅ Matches prompt. All attributes (color, size, shape) are correctly bound to the right object with no swaps. VERDICT: PASS

bfl/flux-2-klein-9b: PASS
A big brown horse, a small white rabbit, and a medium-sized black cat standing in a meadow

Evaluator (Qwen 3.5 122B)

- Big brown horse: Observed attributes — brown coat, large size, standing in meadow. Matches prompt. - Small white rabbit: Observed attributes — white coat, small size, sitting in meadow. Matches prompt. - Medium-sized black cat: Observed attributes — black coat, medium size, standing in meadow. Matches prompt. All attributes (color, size, and implied shape/animal type) are correctly bound to the right object with no swaps. VERDICT: PASS

Hard 100% vs 100%
bfl/flux-2-klein-9b: PASS
A pink fire truck, a blue banana, and a green sun in a surreal landscape

Evaluator (Qwen 3.5 122B)

Object: Fire truck — Observed attributes: Pink color, standard fire truck shape, realistic size relative to landscape. Object: Banana — Observed attributes: Blue color, curved banana shape, large size (appears oversized). Object: Sun — Observed attributes: Green color, circular face with eyes and leaf-like rays, stylized shape. All three objects are correctly bound to their specified colors: pink fire truck, blue banana, green sun. The shapes are also correct: fire truck is a vehicle, banana is

bfl/flux-2-klein-9b: PASS
A purple carrot, a red tree with red leaves, and a white watermelon sliced open

Evaluator (Qwen 3.5 122B)

Object: Purple carrot - Color: Purple (correct) - Shape: Conical/elongated (correct) - Size: Medium (as per visual proportion, correct) Object: Red tree with red leaves - Color: Red leaves (correct), brown trunk (not specified in prompt, but not a failure) - Shape: Tree with branching structure (correct) - Size: Medium (as per visual proportion, correct) Object: White watermelon sliced open - Color: White interior (correct), green rind (not specified in prompt, but not a failure) - Shape: Slic

bfl/flux-2-klein-9b: PASS
An orange elephant, a yellow penguin, and a silver strawberry arranged on a table

Evaluator (Qwen 3.5 122B)

Object: Orange elephant — Color: Orange, Shape: Elephant, Size: Medium (as expected for a toy) Object: Yellow penguin — Color: Yellow, Shape: Penguin, Size: Medium (as expected for a toy) Object: Silver strawberry — Color: Silver (mostly), Shape: Strawberry, Size: Medium (as expected for a toy) All three objects are correctly bound to their specified attributes with no swaps. The elephant is orange, the penguin is yellow, and the strawberry is silver. The shapes are also correct. VERDICT: PASS

Compositionality 100% vs 100%
Easy 100% vs 100%
bfl/flux-2-klein-9b: PASS
A tabby cat wearing oversized red sunglasses

Evaluator (Qwen 3.5 122B)

Element: Tabby cat — Present. The cat has distinct tabby markings (stripes and swirls) on its fur. Element: Oversized red sunglasses — Present. The cat is wearing large, bright red-framed sunglasses that cover most of its face. Element: Coherent combination — Yes. The sunglasses are properly placed on the cat’s face, and the overall composition is visually coherent and stylistically consistent. VERDICT: PASS

bfl/flux-2-klein-9b: PASS
A golden retriever wearing a tiny cowboy hat

Evaluator (Qwen 3.5 122B)

Element: Golden retriever — Present. Element: Tiny cowboy hat — Present. (The hat is proportionally small relative to the dog’s head, fitting the “tiny” descriptor.) Both elements are clearly and coherently combined in the image. The dog is wearing a cowboy hat that sits appropriately on its head, and the overall composition is realistic and well-executed. VERDICT: PASS

bfl/flux-2-klein-9b: PASS
A penguin holding a colorful umbrella

Evaluator (Qwen 3.5 122B)

Element: Penguin — Present. Element: Colorful umbrella — Present (multi-colored segments: red, orange, yellow, green, blue). Both elements are clearly depicted and coherently combined in the image. The penguin is holding the umbrella, and the umbrella is colorful as requested. VERDICT: PASS

Medium 100% vs 100%
bfl/flux-2-klein-9b: PASS
A medieval knight in full armor sitting at a desk typing on a modern laptop

Evaluator (Qwen 3.5 122B)

Element 1: Medieval knight in full armor — Present. The figure is clad in detailed plate armor, including helmet, breastplate, gauntlets, and articulated limbs, consistent with a medieval knight. Element 2: Sitting at a desk — Present. The knight is seated on a wooden chair at a wooden desk, which holds the laptop and other items. Element 3: Typing on a modern laptop — Present. The knight’s armored hands are positioned over the keyboard of a silver laptop, which is a modern device. All descri

bfl/flux-2-klein-9b: PASS
A Roman gladiator taking a selfie with a smartphone in the Colosseum

Evaluator (Qwen 3.5 122B)

Element 1: Roman gladiator — Present. The central figure wears a helmet with a red plume, a bronze-colored cuirass, and a red tunic, consistent with gladiatorial attire. Element 2: Taking a selfie — Present. The figure is holding a smartphone at arm’s length, angled toward his face, as if capturing a self-portrait. Element 3: Smartphone — Present. A modern black smartphone with a visible camera module is clearly held in the figure’s hand. Element 4: Colosseum — Present. The iconic Roman Colos

bfl/flux-2-klein-9b: PASS
An astronaut in a spacesuit riding a bicycle through a park

Evaluator (Qwen 3.5 122B)

Element: Astronaut in a spacesuit — Present. The figure is fully clad in a white spacesuit with visible American flag patches and a reflective helmet. Element: Riding a bicycle — Present. The astronaut is seated on and actively pedaling a blue bicycle with standard wheels and handlebars. Element: Through a park — Present. The setting is a lush green park with trees, grass, a paved path, and a bench in the background. All described elements are present and coherently combined in the image, des

Hard 100% vs 100%
bfl/flux-2-klein-9b: PASS
An octopus wearing a top hat and monocle playing chess on the surface of the moon with Earth visible in the background

Evaluator (Qwen 3.5 122B)

Element 1: Octopus — Present. A large, detailed octopus is the central figure in the image. Element 2: Top hat — Present. The octopus is wearing a black top hat. Element 3: Monocle — Present. The octopus has a single gold-framed monocle over one eye. Element 4: Playing chess — Present. A wooden chessboard with pieces is set up in front of the octopus, and its tentacles are positioned as if interacting with the game. Element 5: Surface of the moon — Present. The setting is a rocky, cratered lunar

bfl/flux-2-klein-9b: PASS
A dinosaur in a business suit giving a PowerPoint presentation in a modern office while drinking coffee

Evaluator (Qwen 3.5 122B)

Element 1: Dinosaur — Present. A Tyrannosaurus rex head and upper body are clearly visible. Element 2: Business suit — Present. The dinosaur is wearing a full gray suit with a white shirt and blue tie. Element 3: PowerPoint presentation — Present. A large monitor displays a PowerPoint slide with charts and bullet points. Element 4: Modern office — Present. The setting includes a wooden desk, glass windows with city views, office lighting, and typical office furniture. Element 5: Drinking coffee

bfl/flux-2-klein-9b: PASS
A robot dressed as a pirate sailing a paper boat on a lake of lava while juggling three kittens

Evaluator (Qwen 3.5 122B)

Element checklist: - Robot: Present. The figure is a skeletal robot with mechanical limbs and joints, wearing pirate attire. - Dressed as a pirate: Present. The robot wears a pirate hat with a skull and crossbones, a vest, and a red sash. - Sailing a paper boat: Present. The robot is seated in a clearly visible paper boat floating on the lava. - Lake of lava: Present. The boat is on a body of glowing, flowing lava with rocky shores and fire in the background. - Juggling three kittens: Present.

Counting 33% vs 100%
Easy 100% vs 100%
bfl/flux-2-klein-9b: PASS
Exactly 3 red apples arranged on a wooden table

Evaluator (Qwen 3.5 122B)

I count exactly 3 red apples arranged on a wooden table. The prompt requested "Exactly 3 red apples arranged on a wooden table", and the image perfectly matches this description. VERDICT: PASS

bfl/flux-2-klein-9b: PASS
Exactly 5 yellow rubber ducks sitting in a row on a bathroom shelf

Evaluator (Qwen 3.5 122B)

I count exactly 5 yellow rubber ducks sitting in a row on a white bathroom shelf. All ducks are identical in appearance, yellow with orange beaks and black eyes, and are arranged neatly in a single line. The background consists of white tiled walls, consistent with a bathroom setting. The prompt requested: "Exactly 5 yellow rubber ducks sitting in a row on a bathroom shelf". The image perfectly matches this description. VERDICT: PASS

bfl/flux-2-klein-9b: PASS
Exactly 4 white candles standing on a dark surface

Evaluator (Qwen 3.5 122B)

I see exactly 4 white candles standing on a dark surface. Each candle is white, cylindrical, and has a wick. The background and surface are dark, matching the description. The requested prompt was: "Exactly 4 white candles standing on a dark surface". The image contains exactly 4 white candles on a dark surface. VERDICT: PASS

Medium 0% vs 100%
bfl/flux-2-klein-9b: FAIL
2 orange cats and 3 black dogs sitting together on a grassy lawn

Evaluator (Qwen 3.5 122B)

The image contains: - 1 orange cat - 2 black dogs The prompt requested: - 2 orange cats - 3 black dogs The actual image does not match the requested count. There is only 1 orange cat (not 2) and only 2 black dogs (not 3). VERDICT: FAIL

bfl/flux-2-klein-9b: FAIL
4 red roses and 2 white daisies in a glass vase on a table

Evaluator (Qwen 3.5 122B)

Counting the objects in the image: - Red roses: There are clearly 7 red roses visible. - White daisies: There are 4 white daisies visible. The prompt requested: "4 red roses and 2 white daisies". The actual count does not match the request: - Red roses: 7 (requested: 4) → mismatch - White daisies: 4 (requested: 2) → mismatch Since the count of both object types does not match the requested quantities, the requirement is not fulfilled. VERDICT: FAIL

bfl/flux-2-klein-9b: FAIL
1 blue butterfly and 3 yellow butterflies resting on a garden fence

Evaluator (Qwen 3.5 122B)

I see: - 1 blue butterfly - 1 yellow butterfly The prompt requested: "1 blue butterfly and 3 yellow butterflies". I see only 1 yellow butterfly, not 3. VERDICT: FAIL

Hard 0% vs 100%
bfl/flux-2-klein-9b: FAIL
Exactly 9 goldfish swimming in a large round fishbowl

Evaluator (Qwen 3.5 122B)

I count 7 goldfish in the image: 6 orange-and-white goldfish and 1 white goldfish with orange fins. The prompt requested exactly 9 goldfish. VERDICT: FAIL

bfl/flux-2-klein-9b: FAIL
A fruit bowl containing exactly 3 bananas, 4 oranges, and 2 apples

Evaluator (Qwen 3.5 122B)

Counting the objects in the image: - Bananas: There is a bunch of 3 bananas (one long, one medium, one short, all connected at the stem). - Oranges: There are 4 oranges visible (three clearly in the foreground, one partially obscured behind the bananas). - Apples: There is 1 red apple visible. The prompt requested: 3 bananas, 4 oranges, and 2 apples. Actual count: 3 bananas, 4 oranges, 1 apple. The apple count does not match the request (1 instead of 2). VERDICT: FAIL

bfl/flux-2-klein-9b: FAIL
Exactly 11 colorful balloons floating against a clear blue sky

Evaluator (Qwen 3.5 122B)

Counting the balloons in the image: - Top center: 1 yellow balloon - Left: 1 green, 1 red, 1 white, 1 yellow, 1 orange-red - Right: 1 pink, 1 light blue, 1 purple, 1 light blue, 1 yellow (partially obscured) - Bottom left: 1 red balloon - Bottom right: 1 pink balloon Total: 13 balloons. The prompt requested exactly 11 colorful balloons. There are 13 visible balloons in the image, which exceeds the requested count. VERDICT: FAIL
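The counting verdicts above all follow the same reasoning: tally each object type in the image, compare against the requested quantity, and fail on any mismatch. The sketch below expresses only that comparison step. It is not how the vision-language evaluator works internally; the function name and the dictionary representation are assumptions for illustration, with the numbers taken from the fruit bowl response.

```python
from typing import Dict, Tuple


def count_mismatches(
    requested: Dict[str, int], observed: Dict[str, int]
) -> Dict[str, Tuple[int, int]]:
    """Map each object type whose count is off to (requested, observed)."""
    return {
        name: (want, observed.get(name, 0))
        for name, want in requested.items()
        if observed.get(name, 0) != want
    }


# Counts from the fruit bowl prompt and its evaluator response.
requested = {"bananas": 3, "oranges": 4, "apples": 2}
observed = {"bananas": 3, "oranges": 4, "apples": 1}

mismatches = count_mismatches(requested, observed)
verdict = "FAIL" if mismatches else "PASS"
print(mismatches, verdict)  # {'apples': (2, 1)} FAIL
```

A single wrong count is enough to fail the item, which matches the fruit bowl case where two of the three quantities were correct.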

Negation 56% vs 89%
Easy 100% vs 100%
bfl/flux-2-klein-9b: PASS
A tropical beach scene with no people

Evaluator (Qwen 3.5 122B)

The image depicts a tropical beach scene with palm trees, turquoise water, and white sand under a blue sky with clouds. A careful inspection of the entire image reveals no human figures, no people on the beach, in the water, or anywhere else in the scene. The beach is empty, as required by the prompt. VERDICT: PASS

bfl/flux-2-klein-9b: PASS
A cat lying on a couch without a collar

Evaluator (Qwen 3.5 122B)

The image shows a tabby cat lying on a gray couch. A careful inspection of the cat’s neck and upper body reveals no collar or any other form of neckwear. The cat’s fur is visible uninterrupted around the neck area. VERDICT: PASS

bfl/flux-2-klein-9b: PASS
A classroom with desks and a chalkboard but no students

Evaluator (Qwen 3.5 122B)

The image shows an empty classroom with desks and a chalkboard. There are no students visible in the room. The desks are arranged in rows, and the chalkboard has writing on it. The room is otherwise empty of people. VERDICT: PASS

Medium 67% vs 100%
bfl/flux-2-klein-9b: FAIL
A sandy beach with palm trees, seashells, and a beach umbrella but no water or ocean visible

Evaluator (Qwen 3.5 122B)

The prompt explicitly excluded “water or ocean” from the image. In the image, however, a clear horizon line is visible in the distance, beyond the sandy beach and beach umbrellas, where the sand meets a body of water — the ocean — under a blue sky. This confirms that the excluded element (water/ocean) is present. VERDICT: FAIL

bfl/flux-2-klein-9b: PASS
An office workspace with a desk, chair, bookshelf, and lamp but no computers or screens

Evaluator (Qwen 3.5 122B)

The image shows an office workspace with a wooden desk, a black ergonomic chair, a wooden bookshelf filled with books, and a silver adjustable desk lamp — all elements explicitly included in the prompt. The prompt explicitly excluded “computers or screens.” Upon close inspection, there are no computers (laptops, desktops, monitors, tablets, etc.) or screens visible on the desk or anywhere else in the image. The only electronic device present is a black telephone (a landline-style corded phone)

bfl/flux-2-klein-9b: PASS
A kitchen with pots, pans, a stove, and a cutting board but no food

Evaluator (Qwen 3.5 122B)

The image shows a kitchen stove with multiple pots and pans on it, and a wooden cutting board leaning against the tiled wall. The prompt explicitly excluded “food,” and there is no food visible in the image — no ingredients, leftovers, or prepared dishes are present on the stove, in the pots, or anywhere else in the frame. All required elements (pots, pans, stove, cutting board) are present, and the excluded element (food) is indeed absent. VERDICT: PASS

Hard 0% vs 67%
bfl/flux-2-klein-9b: FAIL
A pepperoni pizza with tomato sauce, basil, and crust but absolutely no cheese

Evaluator (Qwen 3.5 122B)

The image shows a pepperoni pizza with tomato sauce, basil, and crust. However, it also clearly features a generous layer of melted, golden-brown cheese covering the entire surface beneath the pepperoni and basil. The prompt explicitly excluded cheese, and this element is visibly present. VERDICT: FAIL

bfl/flux-2-klein-9b: FAIL
A car body with windows, doors, and headlights but no wheels

Evaluator (Qwen 3.5 122B)

The image shows a silver classic car with clearly visible wheels — both front and rear — featuring white-wall tires. The prompt explicitly excluded “wheels,” yet they are present in the image. VERDICT: FAIL

bfl/flux-2-klein-9b: FAIL
A human face with eyes, nose, and mouth but no eyebrows

Evaluator (Qwen 3.5 122B)

The image clearly shows a human face with eyes, nose, and mouth. Additionally, well-defined eyebrows are visible above both eyes. The prompt explicitly excluded "a human face with eyes, nose, and mouth but no eyebrows," meaning the absence of eyebrows is the required condition. Since eyebrows are present in the image, the excluded element is not absent. VERDICT: FAIL

Relative Position 92% vs 92%
Easy 100% vs 100%
bfl/flux-2-klein-9b: PASS
A cat sitting on top of a cardboard box

Evaluator (Qwen 3.5 122B)

The prompt specifies: “A cat sitting on top of a cardboard box.” In the image: - The cat is clearly positioned on the upper surface of the cardboard box. - The phrase “on top of” is accurately represented — the cat is sitting directly on the lid/top of the box. - No other positional relationships (left/right/above/below) are specified in the prompt, so there is no requirement to verify those. - The spatial arrangement matches the prompt exactly. VERDICT: PASS

bfl/flux-2-klein-9b: PASS
A red ball underneath a wooden chair

Evaluator (Qwen 3.5 122B)

The prompt specifies: "A red ball underneath a wooden chair". In the image: - There is a red ball. - There is a wooden chair. - The red ball is positioned directly beneath the chair, resting on the floor between the chair’s legs — which accurately fulfills the “underneath” spatial relationship. The prompt only specifies one spatial relationship: “underneath”. The other directional terms (on, left, right, above, below) are not part of the prompt and therefore do not need to be evaluated. Since

bfl/flux-2-klein-9b: PASS
A coffee mug next to an open book on a desk

Evaluator (Qwen 3.5 122B)

The prompt specifies: "A coffee mug next to an open book on a desk." In the image: - An open book is on a wooden desk. - A white coffee mug is also on the same wooden desk. - The mug is positioned to the right of the open book. - The spatial relationship “next to” is satisfied — they are adjacent on the desk. - The phrase “on a desk” is also satisfied — both objects are resting on the desk surface. All positional relationships described in the prompt are correctly rendered in the image. VERDI

Medium 100% vs 100%
bfl/flux-2-klein-9b: PASS
A bird perched on a branch, the branch extending from a tree next to a park bench

Evaluator (Qwen 3.5 122B)

The prompt specifies: “A bird perched on a branch, the branch extending from a tree next to a park bench.” Spatial analysis: - A bird is clearly perched on a branch — ✅ correct. - The branch extends from a tree — ✅ correct. - The tree is next to a park bench — ✅ correct. The bench is positioned on a paved path adjacent to the tree trunk, with visible space between them, satisfying “next to.” All positional relationships described in the prompt are accurately represented in the image. VERDICT:

bfl/flux-2-klein-9b: PASS
A cat sitting on a pillow, the pillow placed on a wooden chair

Evaluator (Qwen 3.5 122B)

The prompt specifies: "A cat sitting on a pillow, the pillow placed on a wooden chair." In the image: - A cat is clearly sitting on a pillow. - The pillow is clearly placed on a wooden chair. - The spatial relationships described — “on” — are correctly represented. There are no directional terms like “left,” “right,” “above,” or “below” in the prompt, so those are not required to be verified. The only positional relationship specified is “on,” which is accurately depicted. VERDICT: PASS

bfl/flux-2-klein-9b: PASS
A candle on top of a stack of books, the books next to a potted plant

Evaluator (Qwen 3.5 122B)

The candle is placed directly on top of the stack of books — satisfying “on top of”. The stack of books is positioned to the left of the potted plant — satisfying “next to” (specifically, to the left). There are no objects explicitly positioned “under” the candle or “above” the books — but “on top of” and “next to” are sufficient to fulfill the prompt’s spatial requirements. All positional relationships described in the prompt are correctly represented. VERDICT: PASS

Hard 67% vs 67%
bfl/flux-2-klein-9b: FAIL
Five chess pieces on a board: a king in the center, a queen to its left, a bishop to its right, a rook above the king, and a knight below the king

Evaluator (Qwen 3.5 122B)

The image shows five chess pieces arranged on a chessboard. Let’s evaluate the spatial relationships as described in the prompt: Prompt: - King in the center → The central piece is a white king (with cross-shaped crown), which is correct. - Queen to its left → To the left of the king (from viewer’s perspective) is a white rook (crowned with a castle-like top), not a queen. The queen has a crown with a cross or cross-like design, but this piece is a rook. ❌ Incorrect. - Bishop to its right → To

bfl/flux-2-klein-9b: PASS
A desk scene: laptop in the center, coffee mug to the left, phone to the right, lamp above, and notebook below

Evaluator (Qwen 3.5 122B)

The prompt specified: "laptop in the center, coffee mug to the left, phone to the right, lamp above, and notebook below". - Laptop is centered on the desk — ✅ Correct. - Coffee mug is to the left of the laptop — ✅ Correct. - Phone is to the right of the laptop — ✅ Correct. - Lamp is above the desk (positioned above the laptop) — ✅ Correct. - Notebook is below the laptop — ✅ Correct (it is in front of the laptop, which is spatially “below” from the viewer’s perspective). All described spatial r

bfl/flux-2-klein-9b: PASS
Five animals in a field: a horse on the far left, a cow to the right of the horse, a sheep in the center, a goat to the right of the sheep, and a dog on the far right

Evaluator (Qwen 3.5 122B)

The image shows five animals standing in a row from left to right in a field: 1. Horse — on the far left. 2. Cow — immediately to the right of the horse. 3. Sheep — in the center (third position). 4. Goat — to the right of the sheep (fourth position). 5. Dog — on the far right. All described spatial relationships are correctly depicted: - Horse is on the far left. - Cow is to the right of the horse. - Sheep is in the center. - Goat is to the right of the sheep. - Dog is on the far right. No p

Extreme 100% vs 100%
bfl/flux-2-klein-9b: PASS
A landscape with a red barn in the foreground at large scale, a windmill in the midground at medium scale, and snow-capped mountains in the background at small scale, all with consistent depth perspective

Evaluator (Qwen 3.5 122B)

The image correctly depicts the spatial relationships as specified in the prompt: - The red barn is in the foreground at a large scale, occupying the left side of the frame and appearing closest to the viewer. - The windmill is in the midground at a medium scale, positioned to the right of the barn and further back, consistent with depth perspective. - The snow-capped mountains are in the background at a small scale, appearing farthest away and spanning the horizon behind the windmill and barn.

bfl/flux-2-klein-9b: PASS
A beach scene with a sandcastle in the foreground, a surfer on the waves in the midground, and a cruise ship on the horizon in the background, sizes decreasing with distance

Evaluator (Qwen 3.5 122B)

The image correctly depicts the spatial relationships as described in the prompt: - The sandcastle is in the foreground, occupying the lower portion of the image and closest to the viewer. - The surfer is in the midground, positioned on the waves further back than the sandcastle, between the shore and the horizon. - The cruise ship is in the background, located on the horizon line, farthest from the viewer. The sizes of the objects decrease with distance: the sandcastle is large and detailed,

bfl/flux-2-klein-9b: PASS
A city park with a park bench and pigeons in the foreground, a fountain in the midground, and skyscrapers in the background, each layer progressively smaller

Evaluator (Qwen 3.5 122B)

The image correctly depicts the spatial relationships as described in the prompt: - In the foreground: A park bench is clearly visible, with pigeons scattered around and on it. This matches “a park bench and pigeons in the foreground.” - In the midground: A large fountain is centrally located, with pigeons gathered around it. This matches “a fountain in the midground.” - In the background: Skyscrapers form the skyline behind the fountain and trees, appearing progressively smaller due to perspec
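The tier and subcategory percentages can be reproduced from the bfl/flux-2-klein-9b verdicts listed for Relative Position. The sketch below assumes each prompt counts equally and that percentages are rounded to the nearest integer; both are inferences from the displayed numbers, not a documented ImageBench rule.

```python
# Verdicts transcribed from the Relative Position entries for bfl/flux-2-klein-9b.
verdicts = {
    "Easy": ["PASS", "PASS", "PASS"],
    "Medium": ["PASS", "PASS", "PASS"],
    "Hard": ["FAIL", "PASS", "PASS"],
    "Extreme": ["PASS", "PASS", "PASS"],
}

def pass_rate(results):
    """Percentage of PASS verdicts, rounded to the nearest integer (assumed rule)."""
    return round(100 * sum(r == "PASS" for r in results) / len(results))

per_tier = {tier: pass_rate(results) for tier, results in verdicts.items()}
overall = pass_rate([r for results in verdicts.values() for r in results])

print(per_tier)  # {'Easy': 100, 'Medium': 100, 'Hard': 67, 'Extreme': 100}
print(overall)   # 92
```

With 11 of 12 prompts passing, the overall figure is 91.7%, which rounds to the 92% shown for this model.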

Scale & Proportions 78% vs 100%
Easy 67% vs 100%
bfl/flux-2-klein-9b: PASS
A person standing next to a yellow school bus, showing realistic relative scale

Evaluator (Qwen 3.5 122B)

The person is standing next to the school bus, and their relative scale is realistic. The bus is a standard-sized school bus, and the person appears to be an average adult, fitting the expected proportions. The bus’s windows, wheels, and overall dimensions are consistent with real-world school buses, and the person’s height relative to the bus (e.g., the bus’s door height, window height, and roofline) is proportionally accurate. The perspective and shadows also support correct scale. VERDICT: P

bfl/flux-2-klein-9b: PASS
A ladybug sitting on a large green leaf, shown at realistic proportions

Evaluator (Qwen 3.5 122B)

The ladybug is depicted sitting on a large green leaf, and the proportions are realistic. The leaf is significantly larger than the ladybug, which is appropriately scaled for a real-world macro photograph. The ladybug’s size relative to the leaf’s veins and edges is consistent with natural proportions. There is no distortion or exaggeration of scale. VERDICT: PASS

bfl/flux-2-klein-9b: FAIL
A child holding a basketball, both at correct relative sizes

Evaluator (Qwen 3.5 122B)

The basketball is disproportionately large compared to the child holding it. A standard basketball is about 24 cm (9.4 inches) in diameter, while a typical child of this age (around 4–6 years old) is about 90–100 cm (3–3.3 feet) tall. The ball appears to be roughly the same size as the child’s torso, which is physically implausible — a child cannot hold a basketball that is nearly as large as their own body. This violates the “correct relative sizes” requirement. VERDICT: FAIL
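The evaluator's size argument comes down to a ratio. As a rough check that takes the evaluator's own figures at face value (a 24 cm ball and a child 90 to 100 cm tall), the ball should span roughly a quarter of the child's height. The variable names below are illustrative only.

```python
# Figures quoted by the evaluator; treated here as assumptions, not measurements.
BALL_DIAMETER_CM = 24.0
CHILD_HEIGHT_RANGE_CM = (90.0, 100.0)

# Ball diameter as a fraction of the child's height, for the tallest and shortest case.
ratio_low = BALL_DIAMETER_CM / max(CHILD_HEIGHT_RANGE_CM)
ratio_high = BALL_DIAMETER_CM / min(CHILD_HEIGHT_RANGE_CM)

print(f"expected ball/height ratio: {ratio_low:.2f} to {ratio_high:.2f}")
# expected ball/height ratio: 0.24 to 0.27
```

A ball rendered well beyond this range relative to the child's full height would be consistent with the FAIL verdict.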

Medium 100% vs 100%
bfl/flux-2-klein-9b: PASS
A tiny person standing on the rim of a coffee cup, looking down into it

Evaluator (Qwen 3.5 122B)

The image shows a miniature figurine standing on the rim of a coffee cup. The figurine is clearly scaled down relative to the cup — its height is roughly comparable to the width of the cup’s rim, which is a reasonable and consistent scale for a “tiny person” in this context. The perspective and proportions are coherent: the person appears small enough to fit on the rim without distorting the cup’s scale, and the cup itself is proportionally realistic relative to the saucer and spoon. The scene i

bfl/flux-2-klein-9b: PASS
A giant tabby cat walking between city skyscrapers like a kaiju

Evaluator (Qwen 3.5 122B)

The image depicts a tabby cat walking between city skyscrapers, and the cat is rendered as enormous — its body and limbs are proportionally massive compared to the surrounding buildings, cars, and pedestrians. The cat’s head and torso are nearly as tall as the buildings, and its paws dwarf the cars on the street. This creates a clear kaiju-like scale, consistent with the prompt’s request for a “giant tabby cat.” The relative sizes are intentionally exaggerated for dramatic effect, and the propo

bfl/flux-2-klein-9b: PASS
A miniature house sitting on the palm of someone's hand

Evaluator (Qwen 3.5 122B)

The image shows a miniature house resting on the palm of a human hand. The house is clearly scaled down to fit comfortably within the palm, with its roof, windows, door, and base all proportionally small relative to the hand. The fingers and palm are anatomically correct and provide a realistic scale reference. The house is not just “on” the hand — it sits nestled within the palm, with the fingers curved naturally around it, reinforcing the sense of scale. The proportions are consistent with the

Hard 67% vs 100%
bfl/flux-2-klein-9b: FAIL
A group photo of an elephant, a horse, a dog, a cat, and a mouse, all standing in a line with correct real-world proportional sizes

Evaluator (Qwen 3.5 122B)

The image shows an elephant, a horse, a dog, a cat, and a mouse standing in a line. However, the mouse is not visible in the image — it is missing entirely. Since the prompt explicitly requires all five animals to be present with correct real-world proportional sizes, and the mouse is absent, the requirement is not met. Additionally, while the elephant, horse, dog, and cat are roughly proportionally sized relative to each other (elephant largest, then horse, then dog, then cat), the absence of

bfl/flux-2-klein-9b: PASS
A dinner table scene with a plate, fork, knife, wine glass, salt shaker, and peppercorn, all at realistic relative sizes

Evaluator (Qwen 3.5 122B)

The image depicts a dinner table scene with a plate, fork, knife, wine glass, salt shaker, and peppercorn (visible as scattered peppercorns near the jar). The relative sizes of these objects are consistent with real-world dining settings: - The plate is appropriately sized for a main course. - The wine glasses are standard height and bowl size for red wine. - The fork, knife, and spoon are proportionally sized relative to the plate and each other — the knife is longer than the spoon, and the fo

bfl/flux-2-klein-9b: PASS
A farm scene with a barn, tractor, farmer, cow, chicken, and a fence post, all at correct proportional scale

Evaluator (Qwen 3.5 122B)

The image depicts a farm scene with a barn, tractor, farmer, cow, chickens, and a fence post. The relative sizes of these objects are consistent with real-world proportions. The barn is large and appropriately scaled to the tractor, which is in turn larger than the farmer. The cow is proportionally larger than the chickens, and the fence post is appropriately sized relative to the other elements. All objects are rendered with realistic scale and perspective, maintaining a coherent spatial relati

Human realism 95% vs 98%

Faces & Expressions 92% vs 100%
Easy 100% vs 100%
bfl/flux-2-klein-9b: PASS
A close-up portrait of a young woman with a warm natural smile

Evaluator (Qwen 3 VL)

The face in the image is largely symmetrical with natural, correctly proportioned features. The eyes, eyebrows, nose, and mouth are well-aligned and appear anatomically accurate. The smile is warm and natural, with visible teeth and slight crinkling around the eyes, conveying genuine happiness. There are no obvious distortions or unnatural artifacts. The lighting and skin texture look realistic, and the emotion is clearly readable as joyful and approachable. VERDICT: PASS

bfl/flux-2-klein-9b: PASS
A portrait of a middle-aged man with a calm, serious expression

Evaluator (Qwen 3 VL)

The face in the image is symmetric and features are anatomically correct — eyes, nose, mouth, and ears are proportionally aligned and natural. The man appears middle-aged, with visible signs of aging such as wrinkles, graying hair, and a beard. His expression is calm and serious, conveyed through neutral lips, direct gaze, and lack of smiling or emotive cues. There are no noticeable distortions — the facial structure is balanced, and lighting and composition support a realistic portrait. The emo

bfl/flux-2-klein-9b: PASS
A headshot of a teenager with a relaxed, neutral face

Evaluator (Qwen 3 VL)

The face in the image is symmetric and features are anatomically correct: eyes are evenly spaced and aligned, eyebrows are balanced, nose and mouth are proportionate, and ears are symmetrical. There are no noticeable distortions. The expression is neutral and relaxed, with no strong emotion conveyed — consistent with the prompt’s request for a “relaxed, neutral face.” The gaze is direct and steady, which is appropriate for a headshot. VERDICT: PASS

Medium 67% vs 100%
bfl/flux-2-klein-9b: FAIL
A close-up of a person's face showing clear disgust, nose wrinkled and lip curled

Evaluator (Qwen 3 VL)

The face is not symmetric — the left side (viewer’s left) appears slightly more compressed and distorted compared to the right, especially around the nose and mouth. The eyebrows are unevenly drawn together, and the eyes are asymmetrical in their squinting. The nose is wrinkled and the lips are curled downward, which aligns with the intended emotion of disgust. However, the lower lip and chin area show unnatural, exaggerated bulging and texture, which appears to be a rendering artifact or distor

bfl/flux-2-klein-9b: PASS
A portrait capturing genuine surprise — raised eyebrows, wide eyes, slightly open mouth

Evaluator (Qwen 3 VL)

The face is largely symmetric with correctly rendered features: both eyes are wide and alert, eyebrows are raised, and the mouth is slightly open — all aligning with the prompt’s request for “genuine surprise.” The expression is clear and emotionally readable. There are no significant distortions; the facial structure, proportions, and details (eyelashes, skin texture, nose, lips) appear natural and well-defined. Minor artifacts like slight unevenness in the eyebrow arches or subtle asymmetry in

bfl/flux-2-klein-9b: PASS
A face showing quiet melancholy — downcast eyes, slightly drawn mouth, distant gaze

Evaluator (Qwen 3 VL)

The face is largely symmetric with correctly rendered features: eyes, nose, mouth, and ears are proportionally aligned and anatomically plausible. There are minor natural asymmetries (e.g., slight unevenness in eyebrow shape or skin texture), but nothing that disrupts facial coherence or suggests AI distortion. The intended emotion — quiet melancholy — is clearly readable. The downcast gaze, slightly drawn mouth, and furrowed brow convey deep sadness or introspective sorrow. The lighting and te

Hard 100% vs 100%
bfl/flux-2-klein-9b: PASS
An elderly woman in her 80s with deep wrinkles and white hair, smiling warmly with kind eyes

Evaluator (Qwen 3 VL)

The face is largely symmetrical, with both eyes, eyebrows, and the smile appearing balanced. The features are anatomically plausible: the deep wrinkles around the eyes and mouth are consistent with an elderly person, and the white hair is natural-looking. The emotion is clearly readable — the warm, genuine smile and crinkled eyes convey kindness and joy. There are no significant distortions; the facial structure is coherent, and the lighting and texture are realistic. The image successfully capt

bfl/flux-2-klein-9b: PASS
A 5-year-old child with round cheeks and large eyes, crying with tears streaming down their face

Evaluator (Qwen 3 VL)

The face in the image is largely symmetric with correctly rendered features: the eyes are large and proportionate, the cheeks appear round, and the tear tracks are clearly visible streaming down from each eye. The child’s expression — furrowed brow, downturned mouth, and visible tears — strongly conveys the emotion of crying. There are no significant distortions; the facial structure is anatomically plausible for a young child, and the details (like the reflection in the eyes and the droplet of

bfl/flux-2-klein-9b: PASS
An elderly man in his 70s with age spots and thinning gray hair, looking skeptically over reading glasses

Evaluator (Qwen 3 VL)

The face is largely symmetric with natural, correct anatomical features. The elderly man’s facial structure is balanced, with no obvious distortions. Age spots are clearly visible across the forehead, cheeks, and chin, consistent with the prompt. Thinning gray hair is present, receding slightly at the temples. He wears reading glasses, and his expression — a skeptical, slightly downturned look with furrowed brows and a tight mouth — is clearly readable and matches the requested emotion. There ar

Extreme 100% vs 100%
bfl/flux-2-klein-9b: PASS
A group of three friends at a table: one laughing with head tilted back, one looking shocked with wide eyes, and one rolling their eyes with a smirk

Evaluator (Qwen 3 VL)

The image shows three distinct facial expressions that align with the prompt: - Left person: Laughing with head tilted back, eyes closed, mouth wide open — clearly conveying joy. - Center person: Wide eyes, open mouth, raised eyebrows — clearly conveying shock. - Right person: One eye winking, smirk, hand on cheek, and a sweat drop — clearly conveying a dismissive or sarcastic “rolling eyes” expression. All faces are symmetric and feature correctly rendered anatomy for the stylized manga/anime

bfl/flux-2-klein-9b: PASS
A family photo with four people: a father beaming with pride, a mother smiling gently, a teenager looking bored, and a child grinning excitedly

Evaluator (Qwen 3 VL)

The faces in the image are largely symmetrical with correctly rendered features — eyes, noses, mouths, and ears are proportionate and anatomically plausible. There are no obvious distortions or unnatural deformities. Regarding the intended emotions: - The father (left) is beaming with pride — his wide, toothy grin and crinkled eyes convey genuine joy. - The mother (right) is smiling gently — her soft, closed-mouth smile and relaxed expression match this description. - The teenager (front center

bfl/flux-2-klein-9b: PASS
Three coworkers in a meeting: one looking frustrated with furrowed brows, one nodding with a pleased expression, and one looking confused with a tilted head

Evaluator (Qwen 3 VL)

The image shows three individuals seated at a conference table, each displaying distinct facial expressions. - The person on the left has a furrowed brow and a clenched jaw, with hands clasped near the mouth — conveying frustration or concern. The facial symmetry and features are natural and correctly rendered. - The person in the center has a slight smile and is looking toward the right, with a relaxed posture — conveying a pleased or satisfied expression. Features are symmetrical and realisti

Full Body 100% vs 100%
Easy 100% vs 100%
bfl/flux-2-klein-9b: PASS
A woman standing upright in a relaxed pose, arms at her sides, wearing jeans and a white t-shirt

Evaluator (Qwen 3 VL)

The image shows a full-body shot of a woman standing upright with her arms at her sides, wearing a white t-shirt and blue jeans — exactly matching the prompt. - Limb count: Correct — two arms, two legs, both visible and intact. - Proportions: Anatomically correct — head, torso, limbs are proportionally balanced and natural. - Pose: The woman is standing upright in a relaxed pose with arms at her sides — fully recognizable and matches the prompt. - No distortions: The figure is symmetrical, limb

bfl/flux-2-klein-9b: PASS
A man standing straight facing the camera, hands in pockets, normal proportions

Evaluator (Qwen 3 VL)

The image shows a full-body shot of a man standing straight and facing the camera, with both hands in his pockets. His posture is upright and symmetrical, and his proportions appear normal — head, torso, arms, and legs are in proportion to each other without obvious exaggeration or distortion. He has two arms, two legs, and all limbs are present and correctly attached. The pose is clearly recognizable as “standing straight with hands in pockets,” a common and natural stance. There are no anatomi

bfl/flux-2-klein-9b: PASS
A person standing in a neutral pose in a park, full body visible from head to feet

Evaluator (Qwen 3 VL)

The image shows a full-body view of a person standing in a park, as requested. The person is positioned in a neutral, upright stance, facing the camera, with arms hanging loosely at the sides and feet shoulder-width apart — a pose that is anatomically plausible and recognizable as a standard “neutral standing” pose. Limb count is correct: two arms, two legs, with all limbs attached to the torso and head in a natural configuration. Proportions appear anatomically accurate — head size relative to

Medium 100% vs 100%
bfl/flux-2-klein-9b: PASS
A woman mid-stride running through a park, hair and clothes showing motion

Evaluator (Qwen 3 VL)

The image shows a woman in mid-stride running through a park. Her hair is flowing backward, and her clothes (tank top and leggings) show dynamic motion consistent with running. Her pose is anatomically plausible: arms are bent and swinging naturally, legs are in a running gait with one foot off the ground, and her torso is slightly forward — all typical of running motion. Limb count is correct (two arms, two legs), and proportions appear natural and proportional to her body. There are no obvious

bfl/flux-2-klein-9b: PASS
A man captured mid-jump over a puddle, body in the air with bent knees

Evaluator (Qwen 3 VL)

The image shows a man mid-air, jumping over a puddle. His body is captured with bent knees, arms slightly out for balance, and feet positioned above the water — consistent with the described action. Limb count is correct: two arms, two legs, and the torso. Proportions appear anatomically plausible — no obvious limb duplication or missing limbs. The pose is recognizable as a jump, and the reflection in the puddle reinforces the action. There are no significant distortions; the anatomy is coherent

bfl/flux-2-klein-9b: PASS
A dancer in a dynamic spin, dress flowing outward with one leg extended

Evaluator (Qwen 3 VL)

The dancer in the image is captured mid-spin, with her dress flowing outward as described, and one leg extended in a classic ballet pose (likely a développé or attitude en l’air). She is balancing on one foot en pointe, with the other leg lifted and extended behind her. Her arms are gracefully extended outward, enhancing the sense of motion. Limb count is correct: two arms, two legs, two feet, and the torso and head are intact. Proportions are anatomically accurate — the body is slender and bal

Hard 100% vs 100%
bfl/flux-2-klein-9b: PASS
A person in a yoga tree pose — standing on one leg with the other foot pressed to the inner thigh, arms raised overhead

Evaluator (Qwen 3 VL)

The image shows a person from behind performing a yoga pose. The pose matches the description: standing on one leg, with the other foot pressed against the inner thigh, and arms raised overhead with palms together. The limb count is correct (two arms, two legs), and proportions appear anatomically plausible — no obvious limb duplication or gross distortion. The pose is clearly recognizable as the “Tree Pose” (Vrksasana), a standard yoga posture. The body alignment, though viewed from behind, is

bfl/flux-2-klein-9b: PASS
A rock climber clinging to a cliff face with arms and legs extended to different holds

Evaluator (Qwen 3 VL)

The image shows a real photograph of a rock climber on a steep cliff face. The climber has two arms and two legs, with proportions consistent with a human body. The pose is anatomically plausible: one arm is extended upward gripping a hold, the other arm is extended outward gripping a chain or anchor point, and the legs are positioned to maintain balance and leverage on the rock. The climber’s body is angled appropriately for climbing, with visible muscle engagement and realistic joint articulat

bfl/flux-2-klein-9b: PASS
A cellist seated with correct posture, legs around the instrument, bow arm extended

Evaluator (Qwen 3 VL)

The cellist in the image is seated with correct posture, and the cello is properly positioned between his legs, resting on a stand. His left hand holds the neck of the cello, and his right hand holds the bow, which is extended across the strings — consistent with the described action. The limb count is correct (two arms, two legs), and proportions appear anatomically accurate. The pose is recognizable as a standard cello-playing posture, though the bow is held slightly lower than typical for act

Extreme 100% vs 100%
bfl/flux-2-klein-9b: PASS
Two tango dancers in a close embrace, the man dipping the woman backward with intertwined legs

Evaluator (Qwen 3 VL)

The image shows a real photograph of two tango dancers in a dynamic pose. The man is dipping the woman backward, and their legs are intertwined — her left leg is wrapped around his right thigh, and her right leg is extended back, lifted, and resting on his left thigh or hip area. The woman’s body is arched backward, supported by the man’s arms, and her head is tilted up toward him. Their facial expressions and posture convey the intimacy and intensity of tango. Anatomical correctness: - Limb co

bfl/flux-2-klein-9b: PASS
Two soccer players mid-tackle, one sliding and the other jumping over with the ball at their feet

Evaluator (Qwen 3 VL)

The image shows two soccer players in a dynamic, realistic mid-tackle scenario. The player in the white jersey is sliding on the ground, with his body low to the turf, one leg extended toward the ball, and his arms out for balance. The player in the red jersey is airborne, jumping over the sliding player, with his body upright and his right foot near the ball. The ball is clearly visible at the feet of the sliding player, though it is not explicitly under his control — it is near his foot and be

bfl/flux-2-klein-9b: PASS
Two gymnasts performing a synchronized handstand, side by side with identical body alignment

Evaluator (Qwen 3 VL)

The image shows two gymnasts performing a synchronized handstand, side by side, with nearly identical body alignment — both legs are straight and pointed, torsos are parallel to the floor, and heads are aligned downward. Their limb count is correct (two arms, two legs per person), and proportions appear anatomically accurate: limbs are proportionally long relative to torsos, joints are positioned naturally, and there is no obvious limb duplication or missing limbs. The pose is clearly recogniza

Hands 92% vs 92%
Easy 100% vs 100%
bfl/flux-2-klein-9b: PASS
A person standing with both hands visible at their sides, fingers relaxed and naturally spread

Evaluator (Qwen 3 VL)

Both hands are visible at the person’s sides, fingers relaxed and naturally spread, as per the prompt. Counting the fingers on each hand: - Left hand: Thumb, index, middle, ring, pinky — 5 fingers. - Right hand: Thumb, index, middle, ring, pinky — 5 fingers. Anatomically, the hands are correctly formed. The fingers are relaxed, slightly curled at the tips, which is natural for a standing pose with hands at the sides. There are no extra fingers, missing fingers, or anatomical anomalies. The thu

bfl/flux-2-klein-9b: PASS
A woman holding her hands out in front of her palms-up, all ten fingers clearly visible

Evaluator (Qwen 3 VL)

Both hands are clearly visible, palms up, and all ten fingers are distinctly visible — five on each hand. The fingers are anatomically correct in number and arrangement. The gesture is natural and consistent with the prompt: hands held out in front of the body, palms up, fingers spread. No anomalies such as extra or missing fingers, fused digits, or unnatural deformities are present. VERDICT: PASS

bfl/flux-2-klein-9b: PASS
A man with arms crossed, both hands resting on opposite upper arms with fingers visible

Evaluator (Qwen 3 VL)

The man in the image has his arms crossed, with both hands resting on opposite upper arms, and fingers are visible. Counting the fingers on each hand: - The left hand (on the viewer’s right, resting on the left upper arm) clearly shows 5 fingers: thumb, index, middle, ring, and pinky. - The right hand (on the viewer’s left, resting on the right upper arm) also clearly shows 5 fingers. The grip and gesture are anatomically plausible. The hands are positioned naturally on the upper arms, with fi

Medium 100% vs 100%
bfl/flux-2-klein-9b: PASS
A person holding a white ceramic coffee mug by the handle with their right hand

Evaluator (Qwen 3 VL)

The visible hand in the image is holding the white ceramic coffee mug by its handle with the right hand, as prompted. Counting the fingers: - Thumb: visible, curled around the handle. - Index finger: visible, wrapping around the handle. - Middle finger: visible, gripping the handle. - Ring finger: visible, gripping the handle. - Pinky finger: visible, wrapping around the handle. All five fingers are clearly visible and accounted for. Anatomical correctness: The grip is natural and anatomicall

bfl/flux-2-klein-9b: PASS
A woman writing in a notebook with a pen gripped correctly in her right hand

Evaluator (Qwen 3 VL)

The right hand, holding the pen, clearly shows five fingers: thumb, index, middle, ring, and pinky — all visible and anatomically positioned. The grip is a standard tripod grip, with the pen held between the thumb, index, and middle fingers, which is anatomically correct for writing. The left hand is resting on the notebook, also showing five fingers, though less prominently. No anomalies are visible — no extra or missing fingers, no unnatural joint angles, and the hands appear proportionate and

bfl/flux-2-klein-9b: PASS
A person playing an acoustic guitar with left hand on the fretboard and right hand strumming

Evaluator (Qwen 3 VL)

Left hand (on fretboard): 5 fingers visible — thumb, index, middle, ring, pinky — all anatomically positioned to press strings. Right hand (strumming): 5 fingers visible — thumb, index, middle, ring, pinky — thumb is partially obscured but clearly present; fingers are positioned as if strumming or fretting near the soundhole. Anatomical correctness: - Left hand: Fingers are bent and pressing down on strings in a standard fingerstyle/fretting position. Thumb is wrapped behind the neck — ty

Hard 67%vs67%
bfl/flux-2-klein-9b: PASS
A person making a peace sign with their right hand, clearly showing two fingers up and three curled

Evaluator (Qwen 3 VL)

The visible right hand in the image is making a peace sign: the index and middle fingers are extended upward, while the ring, pinky, and thumb are curled or folded inward. This is a standard “V” sign gesture. Counting the fingers: - Index finger: visible and extended - Middle finger: visible and extended - Ring finger: curled inward - Pinky finger: curled inward - Thumb: curled inward (partially visible, tucked under the other fingers) That’s five fingers — all present and accounted for. Anat

bfl/flux-2-klein-9b: PASS
A hand giving a thumbs-up with four fingers curled and thumb extended upward

Evaluator (Qwen 3 VL)

The image shows a single hand performing a thumbs-up gesture. The thumb is extended upward, and the other four fingers are curled into a fist. Counting the fingers: There are clearly four fingers curled (index, middle, ring, and pinky) and one thumb extended — totaling five fingers per hand. The hand is shown in full, and no anomalies are visible in the finger count or structure. Anatomically, the gesture is correct: the thumb is extended upward, and the other four fingers are curled inward,

bfl/flux-2-klein-9b: FAIL
A person counting to three on their fingers, with index, middle, and ring fingers extended

Evaluator (Qwen 3 VL)

The image shows a person holding up their left hand with the index, middle, and ring fingers extended — matching the prompt’s request to “count to three.” The thumb is also visible and slightly curled, and the pinky finger is not extended. That makes 3 extended fingers on the left hand, not 5. The right hand is partially visible, with the thumb and index finger extended, and the other fingers curled — not matching the “count to three” gesture. It does not show 5 fingers. Anatomically, the gest

Extreme 100%vs100%
bfl/flux-2-klein-9b: PASS
Two people shaking hands in a professional setting, all fingers of both hands anatomically correct

Evaluator (Qwen 3 VL)

Counting the fingers on the visible hands: - The hand on the left (in the navy suit) clearly shows 5 fingers: thumb, index, middle, ring, and pinky. - The hand on the right (in the gray suit) also clearly shows 5 fingers, though slightly obscured by the grip. The grip is a standard professional handshake: palms facing each other, fingers interlaced, thumbs touching or nearly touching. The posture is anatomically plausible — no fingers are fused, duplicated, or missing. The knuckles and finger

bfl/flux-2-klein-9b: PASS
Two people doing a high-five, palms meeting with all fingers extended on both hands

Evaluator (Qwen 3 VL)

Each visible hand in the image clearly shows five fingers — thumb, index, middle, ring, and pinky — with no missing or extra digits. The gesture is a high-five, with palms meeting and all fingers extended, which matches the prompt. The anatomy is accurate: the hands are positioned naturally for a high-five, with no unnatural deformities, overlapping fingers, or anatomical anomalies. The fingers are aligned and proportionate, and the thumbs are visible and correctly oriented. VERDICT: PASS

bfl/flux-2-klein-9b: PASS
One person handing a small gift box to another person, both sets of hands gripping the box correctly

Evaluator (Qwen 3 VL)

Both hands in the image are clearly visible and show all five fingers per hand — thumb, index, middle, ring, and little finger — with no missing or extra digits. The grip is anatomically correct: each hand is holding the gift box with the thumb and fingers wrapped around the sides, supporting the box securely. The thumbs are positioned on opposite sides of the box, and the fingers are curled around the edges — a natural and realistic way to hold a small object during a gift exchange. There are

Multi-Subject 100%vs100%
Easy 100%vs100%
bfl/flux-2-klein-9b: PASS
Two women standing side by side: one with short blonde hair wearing a red jacket, the other with long black hair wearing a blue dress

Evaluator (Qwen 3 VL)

Person 1 (left): - Hair: Short, blonde, styled in a neat, slightly wavy bob. - Clothing: Wearing a bright red jacket over a white and black patterned top. - Accessories: Gold necklace and hoop earrings visible. - Matches description: Yes — short blonde hair, red jacket. Person 2 (right): - Hair: Long, straight, black hair with soft waves. - Clothing: Wearing a sleeveless royal blue dress. - Accessories: Gold necklace with a pendant. - Matches description: Yes — long black hair, blue dress. Bot

bfl/flux-2-klein-9b: PASS
Two men at a cafe: one with a beard and glasses wearing a green sweater, the other clean-shaven in a black hoodie

Evaluator (Qwen 3 VL)

Person 1 (left): - Has a full beard. - Wears black-framed glasses. - Wears a green ribbed sweater. - Visually distinct and matches description. Person 2 (right): - Clean-shaven. - Wears a black hoodie. - Visually distinct and matches description. All individuals are present, visually distinct, and match their described features. VERDICT: PASS
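
The complete evaluator responses on this page end with a marker of the form "VERDICT: PASS" or "VERDICT: FAIL". As a minimal sketch of how such a marker could be read from the response text, assuming the verdict is taken from the last marker in the response (the function name `extract_verdict` and the regular expression are illustrative, not the benchmark's own code):

```python
import re

# Hypothetical pattern for the verdict marker seen at the end of
# complete evaluator responses, e.g. "VERDICT: PASS".
VERDICT_RE = re.compile(r"VERDICT:\s*(PASS|FAIL)\b")


def extract_verdict(evaluator_text):
    """Return 'PASS' or 'FAIL' from the last verdict marker, or None if absent."""
    matches = VERDICT_RE.findall(evaluator_text)
    return matches[-1] if matches else None


example = (
    "All individuals are present, visually distinct, and match their "
    "described features. VERDICT: PASS"
)
print(extract_verdict(example))
```

Responses that are cut off before the marker, as many of the excerpts here are, would yield `None` under this sketch rather than a guessed verdict.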

bfl/flux-2-klein-9b: PASS
Two friends posing: one tall with curly red hair in denim overalls, the other short with straight brown hair in a yellow sundress

Evaluator (Qwen 3 VL)

Person 1 (tall): - Hair: Curly, red (appears vibrant and distinctly curly) - Clothing: Denim overalls (worn over a light-colored tank top) - Height: Appears significantly taller than the other person, consistent with “tall” Person 2 (short): - Hair: Straight, brown (medium-length, straight-cut) - Clothing: Yellow sundress (strap-style, flowing, sleeveless) - Height: Clearly shorter than the first person, consistent with “short” Both individuals are visually distinct and match the described fea

Medium 100%vs100%
bfl/flux-2-klein-9b: PASS
Four people at a dinner table: a bald man in a suit, a woman with silver hair in a red blouse, a young man with dreadlocks in a denim jacket, and a woman with a hijab in a green dress

Evaluator (Qwen 3 VL)

- Bald man in a suit: Present. He is on the far left, bald, wearing a dark suit with a white shirt and red tie. - Woman with silver hair in a red blouse: Present. She is second from the left, has silver-gray hair, and is wearing a red blouse. - Young man with dreadlocks in a denim jacket: Present. He is third from the left, has dark dreadlocks, and is wearing a denim jacket over a white shirt. - Woman with a hijab in a green dress: Present. She is on the far right, wearing a dark gray hijab and

bfl/flux-2-klein-9b: PASS
Three musicians on stage: a tall woman with short pink hair playing bass, a man with a beard and bandana on drums, and a slim man with glasses playing keyboard

Evaluator (Qwen 3 VL)

- Tall woman with short pink hair playing bass: Present. She has short, bright pink hair, is playing a bass guitar, and appears tall and slender. Her features match the description. - Man with a beard and bandana on drums: Present. He is seated behind the drum kit, has a full beard, and is wearing a patterned bandana on his head. His features match the description. - Slim man with glasses playing keyboard: Present. He is on the right, playing a keyboard, wearing glasses, and appears slim. His fe

bfl/flux-2-klein-9b: PASS
Three hikers on a trail: a man with a backpack and sunhat, a woman with braided hair and hiking poles, and a teenager with a baseball cap and water bottle

Evaluator (Qwen 3 VL)

Person 1 (left): Man with a backpack and sunhat — Observed: Wearing a beige sunhat, carrying a large black backpack, smiling, has a beard, wearing a blue t-shirt and khaki shorts. Matches description. Person 2 (center): Woman with braided hair and hiking poles — Observed: Has long braided hair, wearing a purple shirt and black leggings, holding two hiking poles, carrying a blue backpack. Matches description. Person 3 (right): Teenager with a baseball cap and water bottle — Observed: Wearing a

Truthfulness 82%vs93%

Photorealism 100%vs100%
Medium 100%vs100%
bfl/flux-2-klein-9b: PASS
A still life on a wooden table featuring a clear glass of water, a polished silver spoon, a folded linen napkin, and a ripe red apple — each material rendered with accurate texture and light behavior

Evaluator (Qwen 3.6 27B)

The image depicts a still life on a wooden table with a clear glass of water, a polished silver spoon, a folded linen napkin, and a ripe red apple — all rendered with attention to material-specific visual properties. - **Glass (water glass)**: The glass is transparent with accurate refraction and reflection. The water inside shows realistic surface tension and subtle distortion of the background. Light bends through the glass correctly, and reflections on the rim and surface are plausible. - *

bfl/flux-2-klein-9b: PASS
A workspace desk with a brushed aluminum laptop, a leather-bound notebook, a ceramic coffee mug, and a glass paperweight — all materials visually distinct and realistic

Evaluator (Qwen 3.6 27B)

The image displays the following materials as requested: - **Brushed aluminum laptop**: The laptop’s surface shows a clear brushed metal texture with realistic specular highlights and reflections, consistent with aluminum. - **Leather-bound notebook**: The notebook’s cover has visible grain, stitching, and subtle wear, accurately conveying leather’s tactile and visual properties. - **Ceramic coffee mug**: The mug exhibits matte, slightly porous texture with a speckled glaze and soft reflections

bfl/flux-2-klein-9b: PASS
A kitchen counter with a stainless steel pot, a wooden cutting board, a glass bottle of olive oil, and a cotton dish towel — each surface reflecting light correctly

Evaluator (Qwen 3.6 27B)

The stainless steel pot shows realistic metallic reflections, including clear reflections of the window and surrounding objects, with appropriate specular highlights and brushed texture. The glass bottle of olive oil correctly renders translucency — the green oil inside is visible through the glass, and the glass surface reflects light and surroundings accurately, including subtle refraction of the window. The wooden cutting board displays natural wood grain, matte finish, and appropriate diff

Physics & Reflections 92%vs100%
Easy 67%vs100%
bfl/flux-2-klein-9b: PASS
A red sphere on a white surface with a single shadow cast to the right, consistent with a light source from the upper left

Evaluator (Qwen 3.6 27B)

The shadow is cast to the right of the red sphere, which is consistent with a light source from the upper left — as specified in the prompt. The shadow’s shape is soft and elongated, which is physically plausible for a directional light source on a spherical object. The shadow’s direction and orientation match the expected behavior: the shadow falls to the right, and its edges are soft and naturally diffused, not distorted or unnaturally shaped. There is no reflection visible on the white surfac

bfl/flux-2-klein-9b: PASS
A coffee mug on a sunny windowsill casting a sharp shadow to the left

Evaluator (Qwen 3.6 27B)

The image shows a coffee mug on a windowsill with a sharp, well-defined shadow cast to the left. This is consistent with a strong, direct light source (like sunlight from a window) coming from the right. The shadow’s direction and shape accurately correspond to the mug’s form — the handle casts a distinct crescent-shaped shadow, and the body of the mug casts a soft-edged shadow that follows its cylindrical contour. The shadow’s edges are crisp, suggesting a bright, direct light source, which mat

bfl/flux-2-klein-9b: FAIL
A person standing outdoors at midday with a short shadow directly below them

Evaluator (Qwen 3.6 27B)

The shadow cast by the person is located directly beneath their feet, consistent with a midday sun position (when the sun is high overhead). The shadow is elongated and slightly distorted — it appears to stretch out behind the person’s feet, which is not physically accurate for a true midday sun. At midday, shadows should be very short and directly beneath the subject, not trailing behind. The shadow’s direction and orientation are therefore incorrect. Additionally, the shadow’s edges are soft a

Medium 100%vs100%
bfl/flux-2-klein-9b: PASS
A woman standing in front of a full-length mirror, her reflection accurately showing the back of her outfit and reversed orientation

Evaluator (Qwen 3.6 27B)

The image shows a woman standing with her back to the viewer, facing two full-length mirrors. The reflections in both mirrors accurately show the back of her outfit — her pink t-shirt and blue jeans — and are correctly reversed left-right, as expected from a mirror’s physical properties. The orientation is consistent: her right shoulder appears on the left side of the reflection, and her left shoulder on the right — a standard mirror reversal. Shadows and reflections are physically plausible: -

bfl/flux-2-klein-9b: PASS
A cat sitting on a polished marble floor with its reflection visible below

Evaluator (Qwen 3.6 27B)

The reflection of the cat on the polished marble floor is mostly physically plausible. The reflection is inverted as expected (upside down), and its shape and proportions match the cat above, with clear mirroring of features like the eyes, nose, and ears. The distortion is minimal and consistent with the perspective and surface curvature of the floor tiles — slight warping near the edges is natural due to the tiled surface and viewing angle. The color fidelity is accurate, with the cat’s fur ton

bfl/flux-2-klein-9b: PASS
A vase of flowers on a glass table with a clear reflection visible on the table surface

Evaluator (Qwen 3.6 27B)

The reflection on the glass table is clear, symmetrical, and accurately mirrors the vase and flowers, including the water level and stem details. The reflection is slightly distorted by the curvature of the glass and the angle of the surface, which is physically accurate for a curved glass tabletop. The colors of the reflection match the original objects, with no unnatural color shifts. Shadows are not prominently visible in the image, but the overall lighting is consistent — the reflection appe

Hard 100%vs100%
bfl/flux-2-klein-9b: PASS
A row of trees reflected in a calm lake at sunset, with the reflection slightly darker and gently rippled

Evaluator (Qwen 3.6 27B)

The reflection in the image is physically plausible. The trees are reflected in the calm lake, and the reflection is slightly darker and gently rippled — consistent with the surface texture and lighting conditions of a sunset. The direction of the reflection matches the orientation of the trees above, with the reflection inverted as expected. The color of the reflection accurately captures the warm orange and pink hues of the sunset sky, though slightly muted due to the water’s absorption and th

bfl/flux-2-klein-9b: PASS
A city skyline reflected in a rain puddle on a street, with mild distortion and color shift

Evaluator (Qwen 3.6 27B)

The image depicts a city skyline reflected in a rain puddle, with visible distortion and color shift — consistent with the prompt. Physical plausibility check: - Direction and orientation: The reflection is inverted vertically, as expected in a puddle. The buildings appear upside down relative to their real-world orientation, which is physically correct for a water surface acting as a mirror. - Distortion: The reflection is warped at the edges due to the curved surface of the puddle and the sh

bfl/flux-2-klein-9b: PASS
A mountain range reflected in a still alpine lake, with the reflection inverted and subtly muted in color

Evaluator (Qwen 3.6 27B)

The reflection in the lake is inverted as expected — the top of the mountain appears at the bottom of the reflection, and vice versa, which is physically correct for a calm water surface acting as a mirror. The orientation is accurate: the reflection is flipped vertically, not horizontally. The direction of light and shadows is consistent: the sun appears to be low in the sky (likely morning or late afternoon), casting warm highlights on the mountain faces, and the reflection mirrors this light

Extreme 100%vs100%
bfl/flux-2-klein-9b: PASS
A clear glass sphere sitting on a mirror surface near a pool of water, showing refraction through the sphere, reflection on the mirror, and water reflection simultaneously

Evaluator (Qwen 3.6 27B)

The image depicts a clear glass sphere resting on the surface of water, which is itself on a reflective surface (likely a pool edge or tiled deck). The prompt required simultaneous refraction through the sphere, reflection on the mirror surface, and water reflection — all of which are present. Let’s break down physical plausibility: 1. **Refraction through the sphere**: The sphere acts as a lens. The image shows distorted, inverted, and magnified views of the surroundings inside the sphere — c

bfl/flux-2-klein-9b: PASS
A chrome sculpture in a room with a glass floor and large windows, showing multiple reflections and environmental mapping on every surface

Evaluator (Qwen 3.6 27B)

The reflections and shadows in the image are largely physically plausible. The chrome sculpture accurately reflects the surrounding environment — including the large windows, buildings outside, and interior lights — with correct perspective and distortion based on its curved surfaces. The reflections on the glass floor mirror the sculpture and the room with appropriate perspective and slight distortion from the glass’s flat, tiled surface. The direction of reflections corresponds to the light so

bfl/flux-2-klein-9b: PASS
A crystal wine glass on a polished marble counter next to a window overlooking a lake — showing refraction in the glass, reflection on the marble, and the lake's own reflective surface

Evaluator (Qwen 3.6 27B)

The image depicts a crystal wine glass on a polished marble countertop next to a window overlooking a lake. Let’s evaluate the physical plausibility of the requested elements: refraction in the glass, reflection on the marble, and the lake’s reflective surface. 1. Refraction in the glass: - The glass shows clear refraction: the background scenery (trees, sky, lake) is distorted and bent as it passes through the curved glass walls. The cut patterns on the glass also refract light, creating br

World Knowledge 67%vs83%
Easy 100%vs100%
bfl/flux-2-klein-9b: PASS
The Eiffel Tower in Paris on a clear day, architecturally accurate with its iron lattice structure and three levels

Evaluator (Qwen 3.6 27B)

The depicted subject is the Eiffel Tower in Paris, clearly recognizable and presented under a clear blue sky, matching the prompt’s setting. Architecturally, the image accurately shows the iconic iron lattice structure, including the three levels (base, middle, and top observation decks), with correct proportions and design details. The tower’s distinctive arched base and tapering form are faithfully rendered. No significant factual or structural inaccuracies are visible. VERDICT: PASS

bfl/flux-2-klein-9b: PASS
The Taj Mahal in Agra seen from the reflecting pool, with accurate dome, minarets, and symmetry

Evaluator (Qwen 3.6 27B)

The depicted subject is the Taj Mahal in Agra, viewed from the reflecting pool, as requested. The image clearly shows the main dome, four minarets, and the symmetrical layout of the structure, which are all architecturally accurate. The reflection in the pool is well-rendered and enhances the symmetry. The building’s details, including the arches, ornamentation, and proportions, are faithful to the real Taj Mahal. There are no noticeable factual or architectural inaccuracies. VERDICT: PASS

bfl/flux-2-klein-9b: PASS
The Statue of Liberty on her island pedestal, with correct torch, crown, tablet, and green patina

Evaluator (Qwen 3.6 27B)

The depicted subject is the Statue of Liberty, clearly recognizable and positioned on her island pedestal as requested. The statue features: - A correct torch (golden, held high in the right hand). - A crown with seven rays (symbolizing the seven continents or seven seas). - A tablet held in the left hand, inscribed with “JULY IV MDCCLXXVI” (July 4, 1776). - A green patina, which is accurate for the copper statue’s age and weathering. The pedestal and surrounding island are also correctly rend

Medium 67%vs33%
bfl/flux-2-klein-9b: PASS
A traditional Japanese shamisen instrument with accurate three-stringed design, wooden body, and bachi plectrum

Evaluator (Qwen 3.6 27B)

The depicted instrument is a traditional Japanese shamisen, and it is recognizable as such. It features: - A wooden body with a characteristic rounded, slightly asymmetrical shape (often called “kakushi” or “kakushi-gata”). - Three strings, which is accurate for a standard shamisen. - A long neck with tuning pegs at the top (though the pegs appear to be somewhat stylized or simplified, they are present). - A bachi (plectrum) is not visible in the image — it is typically held separately by the p

bfl/flux-2-klein-9b: FAIL
A medieval astrolabe with accurate rete, tympan, and engraved degree markings on brass

Evaluator (Qwen 3.6 27B)

The depicted object is a circular brass instrument with engraved degree markings, a central pivot, and radial lines — consistent with the general form of a medieval astrolabe. However, upon closer inspection, several inaccuracies undermine its factual/architectural accuracy: 1. **Rete and Tympan Misrepresentation**: The “rete” (the star-mapping framework) is not clearly distinguishable as a separate, movable component. The radial lines and star positions appear fused with the tympan (the fixed

bfl/flux-2-klein-9b: PASS
A Japanese kintsugi bowl — cracked ceramic repaired with visible gold lacquer lines

Evaluator (Qwen 3.6 27B)

The image depicts a ceramic bowl with visible cracks repaired using gold lacquer lines, which is consistent with the traditional Japanese art of kintsugi. The bowl’s form, material texture, and the application of gold seams are recognizable and factually accurate. The cracks are not hidden but highlighted, which aligns with the philosophy and aesthetic of kintsugi. The bowl appears to be a real-world object, not an AI-generated or stylized abstraction. There are no significant inaccuracies in t

Hard 0%vs100%
bfl/flux-2-klein-9b: FAIL
A detailed diagram of the water cycle showing evaporation from the ocean, condensation forming clouds, precipitation as rain, and collection in rivers flowing back to the sea

Evaluator (Qwen 3.6 27B)

The image depicts a diagram of the water cycle, but it contains multiple inaccuracies in labeling and representation: 1. The process labeled “Evaporation” is shown as rain falling into the ocean — this is backwards. Evaporation is water turning into vapor and rising, not rain falling into the ocean. 2. “Condensation forming” is incorrectly labeled with a wavy arrow pointing to a cloud, but the arrow direction and placement are misleading — condensation forms clouds from rising vapor, not from r

bfl/flux-2-klein-9b: FAIL
A cross-section showing tectonic plates at a convergent boundary with subduction, a trench, volcanic arc, and layered mantle

Evaluator (Qwen 3.6 27B)

The image depicts a cross-section of a convergent tectonic boundary with subduction, a trench, a volcanic arc, and layered mantle — which aligns with the requested subject. However, the labels are not scientifically accurate. For example: - “Do Adenlervt” is not a real geological feature; it appears to be a garbled or fictional name. - “Voveat Arc,” “Teich aac,” “Beudohe Arc,” “Woletiet,” “Wodonvemrie Fralluwy,” and “Convergent Buundaon” are all made-up or incorrectly spelled terms. - The “tren

bfl/flux-2-klein-9b: FAIL
An anatomical illustration of the human heart showing four chambers, valves, aorta, pulmonary arteries, and veins with correct placement

Evaluator (Qwen 3.6 27B)

The depicted subject is anatomically recognizable as a human heart with four chambers, major vessels, and internal structures. However, the labels are not accurate — they appear to be gibberish or randomly generated text (e.g., “Stechariele,” “Noambarlk,” “Paposalacty,” “Aumobcnarine”). This renders the image factually inaccurate for educational or medical purposes, as the labels fail to correctly identify any real anatomical structures. While the visual representation of the heart’s shape, cha

Extreme 100%vs100%
bfl/flux-2-klein-9b: PASS
A Morpho menelaus butterfly with accurate iridescent blue dorsal wings, brown ventral wing pattern with eyespots, and correct body morphology

Evaluator (Qwen 3.6 27B)

The depicted subject is a Morpho menelaus butterfly, and it is recognizable as such. The dorsal (upper) wings display a striking iridescent blue, which is accurate for this species. The ventral (lower) wings show a brown background with prominent eyespots — a key identifying feature — and the wing shape, including the characteristic tail-like extensions, is correct. The body morphology, including the antennae, thorax, and abdomen, appears anatomically accurate. There are no significant factual

bfl/flux-2-klein-9b: PASS
The Hagia Sophia interior showing the massive central dome, pendentives, semi-domes, and Islamic calligraphy medallions alongside Byzantine mosaics

Evaluator (Qwen 3.6 27B)

The image depicts the interior of the Hagia Sophia in Istanbul, correctly showcasing its massive central dome, the supporting pendentives, semi-domes, and the juxtaposition of Islamic calligraphy medallions (added during its conversion to a mosque) alongside the original Byzantine mosaics. Architecturally, the dome’s structure, the pendentives that transition the dome to the square base, and the semi-domes are accurately rendered. The calligraphy medallions (notably featuring verses from the Qu

bfl/flux-2-klein-9b: PASS
A specimen of bismuth crystal showing accurate iridescent oxide layers, geometric hopper crystal formation, and characteristic staircase structure

Evaluator (Qwen 3.6 27B)

The depicted subject is a bismuth crystal, and it is recognizable as such. It displays: - **Iridescent oxide layers**: The colorful, rainbow-like striations across the crystal faces are characteristic of bismuth’s surface oxidation, which forms thin oxide films that interfere with light to produce iridescence. - **Geometric hopper crystal formation**: The crystal exhibits a “hopper” morphology — a common growth pattern in bismuth where the crystal grows outward from the center, forming stepped,
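
The subcategory and difficulty-tier percentages for bfl/flux-2-klein-9b can be checked against the verdicts listed on the cards above. The following is a minimal sketch, assuming each percentage is the unweighted share of PASS verdicts among the listed cards, rounded to a whole percent; the dictionary layout and function names are illustrative and not part of the benchmark. Under that assumption it reproduces the 92% shown for Physics & Reflections and the 67% shown for World Knowledge.

```python
# Verdicts for bfl/flux-2-klein-9b, transcribed from the cards above.
# True = PASS, False = FAIL. The layout is illustrative only.
VERDICTS = {
    "Physics & Reflections": {
        "Easy": [True, True, False],
        "Medium": [True, True, True],
        "Hard": [True, True, True],
        "Extreme": [True, True, True],
    },
    "World Knowledge": {
        "Easy": [True, True, True],
        "Medium": [True, False, True],
        "Hard": [False, False, False],
        "Extreme": [True, True, True],
    },
}


def pass_percent(results):
    """Share of PASS verdicts, rounded to a whole percent."""
    return round(100 * sum(results) / len(results))


def summarize(tiers):
    """Per-tier and overall pass percentages for one subcategory."""
    summary = {tier: pass_percent(results) for tier, results in tiers.items()}
    all_results = [r for results in tiers.values() for r in results]
    summary["overall"] = pass_percent(all_results)
    return summary


RATES = {name: summarize(tiers) for name, tiers in VERDICTS.items()}

for name, summary in RATES.items():
    print(name, summary)
```

The second percentage in each pair belongs to the comparison model, whose per-card verdicts are not listed in this excerpt, so it is left out of the sketch.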

Professional Studio 93%vs100%

Camera & Lighting 100%vs100%
Easy 100%vs100%
bfl/flux-2-klein-9b: PASS
A wide-angle shot of a modern living room interior showing expansive space with slight barrel distortion at the edges

Evaluator (Gemma 4 26B)

The image presents a wide-angle shot of a modern living room interior, and the perspective clearly exhibits barrel distortion at the edges — a hallmark of ultra-wide or fisheye lenses. The curved distortion is visible in the floorboards, ceiling, and walls, especially at the periphery, which matches the prompt’s description. The depth of field appears to be shallow to moderate, with the central area of the room (sofa, coffee table) in sharp focus while the background elements (windows, dining a

bfl/flux-2-klein-9b: PASS
A close-up macro shot of a single dewdrop on a rose petal, shallow depth of field

Evaluator (Gemma 4 26B)

The image presents a close-up macro shot of a single dewdrop resting on a rose petal, with the dewdrop sharply in focus while the surrounding petal and background are softly blurred — indicating a shallow depth of field. The perspective is intimate and macro, emphasizing the dewdrop’s spherical form and reflective surface. The lighting is soft and diffused, creating gentle highlights on the droplet and subtle reflections, which enhances the macro detail without harsh shadows. The composition and

bfl/flux-2-klein-9b: PASS
An aerial bird's-eye view looking straight down at a traffic roundabout with cars and trees

Evaluator (Gemma 4 26B)

The image presents a clear aerial bird’s-eye view looking straight down at a traffic roundabout, with cars circulating and trees surrounding the central green space. The perspective is perfectly aligned with the prompt — directly overhead, capturing the full circular layout. Depth of field appears uniform, with no selective blur, suggesting a wide-angle or drone shot with consistent focus across the scene. Lighting is even and natural, with no harsh shadows or artificial lighting patterns, consi

Medium 100% vs 100%
bfl/flux-2-klein-9b: PASS
A portrait lit with classic Rembrandt lighting — triangle of light on the shadowed cheek, moody atmosphere

Evaluator (Gemma 4 26B)

The image displays a portrait with a clear, dramatic lighting setup. The key element — the “triangle of light” on the shadowed cheek — is present: a distinct, warm-toned triangular highlight is visible on the subject’s left cheek (viewer’s right), casting the opposite side of the face into deep shadow. This is the hallmark of Rembrandt lighting. The background is softly blurred, suggesting a shallow depth of field, which isolates the subject and enhances the moody atmosphere. The perspective is

bfl/flux-2-klein-9b: PASS
A landscape photograph bathed in golden hour light — long warm shadows, orange sky, soft glow

Evaluator (Gemma 4 26B)

The image depicts a landscape bathed in warm, golden-hour lighting, with an orange sky and soft glow. Long, directional shadows stretch across the foreground grassy field, consistent with low-angle sunlight. The perspective is wide and layered, showing foreground terrain, midground trees and fields, and distant rolling hills, creating a strong sense of depth. The lighting pattern — warm tones, soft gradients, and silhouetted elements — aligns with the prompt’s description. VERDICT: PASS

bfl/flux-2-klein-9b: PASS
A high-key portrait of a person against a bright white background, minimal shadows, airy and luminous

Evaluator (Gemma 4 26B)

The image presents a high-key portrait with a bright white background, as specified. The lighting is soft and even, minimizing harsh shadows and creating an airy, luminous effect across the subject’s face and shoulders. The depth of field is shallow, keeping the subject’s face in sharp focus while gently blurring the background — consistent with portrait photography. The perspective is centered and frontal, typical for a portrait shot. All elements align with the prompt’s description. VERDICT:

Hard 100% vs 100%
bfl/flux-2-klein-9b: PASS
A portrait shot at 85mm f/1.4 with creamy bokeh background, backlit with warm rim light separating the subject from a blurred garden

Evaluator (Gemma 4 26B)

The image matches the described camera/lighting prompt in key ways: - Perspective: It’s a portrait shot, framed to focus on the subject’s face and shoulders, consistent with a 85mm lens which provides a flattering, compressed perspective. - Depth of field: The background is significantly blurred with creamy bokeh — a hallmark of a wide aperture (f/1.4) and a telephoto lens like 85mm. The subject is sharply in focus while the background dissolves into soft, circular highlights. - Lighting: There

bfl/flux-2-klein-9b: PASS
A street photograph with shallow depth of field as if shot at 50mm f/1.8, subject sharp with soft city lights in the background

Evaluator (Gemma 4 26B)

The image depicts a street photograph with a subject in sharp focus while the background — including city lights, cars, and buildings — is softly blurred, creating a shallow depth of field. The bokeh effect from the lights is consistent with a wide aperture (f/1.8), and the perspective appears natural for a 50mm lens, with a centered subject and symmetrical urban framing. The lighting pattern includes ambient city lights that are rendered as soft, out-of-focus circles, matching the “soft city li

bfl/flux-2-klein-9b: PASS
A food photograph shot from above at 35mm f/2.8 with natural window light from the left, slight vignette, warm color temperature

Evaluator (Gemma 4 26B)

The image is shot from a top-down perspective, which matches the “shot from above” requirement. The depth of field appears shallow, with the food in focus and the background (wooden table and cup) softly blurred — consistent with a 35mm f/2.8 lens. The lighting comes from the left, as evidenced by the shadows cast to the right of the food and the bright window frame visible on the left edge. There is a subtle vignette effect, with darker corners and edges. The color temperature is warm, with gol

Extreme 100% vs 100%
bfl/flux-2-klein-9b: PASS
A tilt-shift photograph of a city street making real-life look like a miniature model — selective focus strip, saturated colors

Evaluator (Gemma 4 26B)

The image successfully emulates a tilt-shift photograph of a city street with a miniature model effect. It features a strong perspective looking down a straight street, with buildings converging toward a vanishing point — a hallmark of tilt-shift composition. The depth of field is selectively focused along a narrow horizontal band in the middle of the street, while the foreground and background are blurred, mimicking the shallow depth of field typical of tilt-shift lenses. The lighting appears e

bfl/flux-2-klein-9b: PASS
A cinematic frame with blue and orange anamorphic lens flare streaking horizontally across a dark moody scene

Evaluator (Gemma 4 26B)

The image features a dark, moody scene with a strong horizontal lens flare composed of blue on the left and orange on the right — matching the described “blue and orange anamorphic lens flare streaking horizontally.” The flare is positioned across the frame, creating a cinematic, high-contrast look. The background and subject are intentionally blurred, suggesting a shallow depth of field, which enhances the focus on the flare and contributes to the cinematic feel. The perspective is centered and

bfl/flux-2-klein-9b: PASS
A portrait with dramatic split lighting — half the face brightly lit, the other half in deep shadow, high contrast chiaroscuro

Evaluator (Gemma 4 26B)

The image displays a portrait with a clear dramatic split lighting effect: the left side of the face (from the viewer’s perspective) is brightly illuminated, revealing fine details like skin texture and facial hair, while the right side is plunged into deep shadow, creating a stark contrast. This is a classic chiaroscuro lighting setup. The lighting pattern directly matches the prompt’s description. The perspective is a tight, centered close-up portrait, which emphasizes the facial contrast. Th

Color Precision 92% vs 100%
Easy 100% vs 100%
bfl/flux-2-klein-9b: PASS
A bright red sports car parked on a gray asphalt road — the red is vivid and unambiguous

Evaluator (Gemma 4 26B)

The car in the image is a bright, vivid red — consistent with Ferrari’s iconic “Rosso Corsa” color, which is a saturated, unambiguous red. The asphalt road is gray, matching the prompt’s description. There are no conflicting or inaccurate colors in the primary elements specified. VERDICT: PASS

bfl/flux-2-klein-9b: PASS
A woman wearing a royal blue evening dress against a neutral background

Evaluator (Gemma 4 26B)

The dress in the image is a vibrant, saturated royal blue, which closely matches the color described in the prompt. The background is a neutral, light beige or off-white, which also aligns with the prompt’s description. There are no color mismatches or deviations from the specified “royal blue” and “neutral background.” VERDICT: PASS

bfl/flux-2-klein-9b: PASS
A field of pure sunflower yellow flowers under a clear blue sky

Evaluator (Gemma 4 26B)

The image displays a field of sunflowers under a clear blue sky, matching the prompt’s description. The sunflowers are a vibrant, saturated yellow — consistent with “pure sunflower yellow” — and the sky is a clear, bright blue, matching “clear blue sky.” There are no unnatural or mismatched hues; the colors are accurate and visually aligned with the prompt’s request. VERDICT: PASS

Medium 67% vs 100%
bfl/flux-2-klein-9b: PASS
A bedroom wall painted in #FF6B35 (burnt orange) with white furniture and natural daylight

Evaluator (Gemma 4 26B)

The wall in the image is painted in a warm, burnt orange hue that closely matches the hex value #FF6B35. The furniture — including the bed frame, nightstand, and lamp — is white, as specified. The lighting appears to be natural daylight coming through the window, which aligns with the prompt. While there are some subtle variations in tone (e.g., the bedding and pillows are off-white or cream, not pure white), these are minor and consistent with a realistic, natural rendering. The core requireme

bfl/flux-2-klein-9b: FAIL
A forest scene where the foliage is precisely #2D5F2D (dark muted green) with brown tree trunks

Evaluator (Gemma 4 26B)

The prompt specified that the foliage must be precisely #2D5F2D (a dark muted green) and the tree trunks must be brown. In the image: - The foliage is rendered in multiple shades of green, ranging from dark to lighter green, but none of the foliage areas match the exact hex value #2D5F2D. The dominant green tones are closer to #2D5F2D but not precisely that value — for example, some areas are slightly lighter or darker, and the overall palette is more varied. - The tree trunks are brown, which
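
The "precisely #2D5F2D" judgment above is made visually by the evaluator model. For comparison, a numeric check might look like the minimal sketch below. It assumes a plain Euclidean distance in RGB, which is not a perceptual metric. The `tolerance` value and the sampled color `#3A6B3A` are hypothetical illustrations and are not part of the benchmark.

```python
import math


def hex_to_rgb(hex_code):
    """Convert a '#RRGGBB' string to an (r, g, b) tuple of 0-255 integers."""
    value = hex_code.lstrip("#")
    if len(value) != 6:
        raise ValueError(f"expected #RRGGBB, got {hex_code!r}")
    return tuple(int(value[i:i + 2], 16) for i in (0, 2, 4))


def rgb_distance(a, b):
    """Euclidean distance between two hex colors in RGB space (not perceptual)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(hex_to_rgb(a), hex_to_rgb(b))))


def is_close(target, sampled, tolerance=30.0):
    """Hypothetical match rule: the sampled color lies within `tolerance` of the target."""
    return rgb_distance(target, sampled) <= tolerance


target = "#2D5F2D"   # hex value from the prompt
sampled = "#3A6B3A"  # hypothetical pixel sampled from the foliage
print(hex_to_rgb(target))          # (45, 95, 45)
print(is_close(target, sampled))
```

With a tolerance of zero, almost any rendered foliage would fail, which matches the strict reading the evaluator applied to "precisely".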

bfl/flux-2-klein-9b: PASS
A room with walls in #4A90D9 (medium blue) with warm wooden floor and white trim

Evaluator (Gemma 4 26B)

The walls in the image are a medium blue that closely matches the hex color #4A90D9. The floor is a warm, natural wood tone, and the trim (baseboards and crown molding) is white, as specified. All elements of the prompt are accurately represented. VERDICT: PASS

Hard 100% vs 100%
bfl/flux-2-klein-9b: PASS
A minimalist room with walls in #F5E6CC (warm cream), a sofa in #2C3E50 (dark navy), and accent pillows in #E74C3C (bright red)

Evaluator (Gemma 4 26B)

The walls in the image are a warm cream color that closely matches #F5E6CC. The sofa is a dark navy that aligns with #2C3E50. The accent pillows are a bright red that matches #E74C3C. All specified colors are accurately represented in the image. VERDICT: PASS

bfl/flux-2-klein-9b: PASS
A product flat-lay on a #1A1A2E (deep dark blue) background, with a #F0C27F (soft gold) watch and #FFFFFF (pure white) earbuds

Evaluator (Gemma 4 26B)

The prompt specified: - Background: #1A1A2E (deep dark blue) - Watch: #F0C27F (soft gold) - Earbuds: #FFFFFF (pure white) In the image: - The background is a deep, dark blue — visually matches #1A1A2E. - The watch has a rose-gold tone, which is close to #F0C27F (soft gold), though not an exact match — #F0C27F is a slightly more yellow-gold tone, while the watch appears more rose-gold. However, in visual design, these are often considered close enough for “soft gold” in context. - The earbuds ar

bfl/flux-2-klein-9b: PASS
A graphic design composition with background #FAFAFA (near-white), main shape #FF4500 (orange-red), and accent lines #333333 (dark charcoal)

Evaluator (Gemma 4 26B)

The image contains: - A background that is a very light off-white, matching #FAFAFA. - A main shape (an oval) that is a vibrant orange-red, matching #FF4500 (a classic orange-red, often called "coral red" or "tangerine"). - Accent lines that are a dark charcoal gray, matching #333333. All three specified colors are present and visually accurate to the given hex values. VERDICT: PASS

Extreme 100% vs 100%
bfl/flux-2-klein-9b: PASS
A smooth gradient on a wall going from #FF0000 (red) on the left to #0000FF (blue) on the right, transitioning evenly through purple

Evaluator (Gemma 4 26B)

The image displays a smooth horizontal gradient transitioning from red on the left to blue on the right, with a clear purple band in the center. The left edge is a vivid red (#FF0000), the right edge is a vivid blue (#0000FF), and the transition through purple is even and continuous, matching the description exactly. VERDICT: PASS
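
To illustrate what "transitioning evenly through purple" means numerically, the sketch below linearly interpolates each RGB channel between the two endpoint colors. This assumes straightforward per-channel interpolation on the 0-255 values. Image generators and design tools may blend in other color spaces, so the helper `lerp_hex` is an illustration and not the benchmark's method.

```python
def lerp_hex(start, end, t):
    """Linearly interpolate between two '#RRGGBB' colors; t=0 gives start, t=1 gives end."""
    t = max(0.0, min(1.0, t))  # clamp t to the [0, 1] range
    s = [int(start.lstrip("#")[i:i + 2], 16) for i in (0, 2, 4)]
    e = [int(end.lstrip("#")[i:i + 2], 16) for i in (0, 2, 4)]
    # Round half up so the result does not depend on banker's rounding.
    channels = [int(a + (b - a) * t + 0.5) for a, b in zip(s, e)]
    return "#{:02X}{:02X}{:02X}".format(*channels)


print(lerp_hex("#FF0000", "#0000FF", 0.0))  # #FF0000 (left edge)
print(lerp_hex("#FF0000", "#0000FF", 0.5))  # #800080 (midpoint, a purple)
print(lerp_hex("#FF0000", "#0000FF", 1.0))  # #0000FF (right edge)
```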

bfl/flux-2-klein-9b: PASS
A sunset sky gradient on a flat surface from #FF6B35 (orange) at the bottom to #1B1464 (deep indigo) at the top

Evaluator (Gemma 4 26B)

The image shows a smooth vertical gradient transitioning from a bright orange-yellow at the bottom to a dark indigo-blue at the top. The bottom color is very close to #FF6B35 (a vibrant orange-red), and the top color is very close to #1B1464 (a deep indigo). The gradient is well-executed and matches the specified color scheme in both hue and direction. VERDICT: PASS

bfl/flux-2-klein-9b: PASS
A horizontal banner with a gradient from #00C9FF (cyan) on the left to #92FE9D (mint green) on the right

Evaluator (Gemma 4 26B)

The image shows a horizontal banner with a smooth gradient transitioning from a bright cyan on the left to a mint green on the right. The leftmost color matches #00C9FF (a vibrant cyan), and the rightmost color matches #92FE9D (a light mint green). The gradient is continuous and accurately follows the specified hex values. VERDICT: PASS

Photorealism 67% vs 100%
Easy 67% vs 100%
bfl/flux-2-klein-9b: FAIL
A single red sneaker on a clean white background with soft studio lighting and gentle shadow — product photography style

Evaluator (Gemma 4 26B)

The image presents a single red sneaker on a clean white background with soft studio lighting and a gentle shadow — matching the prompt exactly. At first glance, it appears to be a real photograph due to its natural lighting gradient, realistic shadow placement, and convincing texture rendering on the leather and rubber. However, upon closer inspection, there are subtle AI artifacts: - The stitching and seams are overly uniform and slightly unnaturally perfect, lacking the micro-irregularities

bfl/flux-2-klein-9b: PASS
A clear glass perfume bottle on a reflective black surface with studio lighting, no background distractions

Evaluator (Gemma 4 26B)

The image presents a highly polished, studio-quality product shot of a clear glass perfume bottle on a reflective black surface, with no background distractions — exactly matching the prompt. The lighting is even and professional, creating realistic highlights and reflections on the glass and metallic cap. The reflection below is crisp and symmetrical, and the bottle’s geometry is clean and physically plausible. There are no obvious AI artifacts: no warped edges, inconsistent reflections, unnatu

bfl/flux-2-klein-9b: PASS
A ripe green apple on a seamless white background with soft side lighting and subtle shadow — catalog style

Evaluator (Gemma 4 26B)

The image presents a clean, studio-quality product shot of a green apple on a seamless white background, with soft side lighting and a subtle shadow — matching the prompt exactly. The apple’s surface shows natural texture, including tiny lenticels and a realistic sheen, and the stem appears organic and detailed. There are no obvious AI artifacts: no warped geometry, unnatural blending, or inconsistent lighting. The shadow is soft and directionally consistent with side lighting. The composition i

Graphical design 58% vs 92%

Data Visualisation 0% vs 100%
Easy 0% vs 100%
bfl/flux-2-klein-9b: FAIL
A bar chart showing monthly sales for January through June with values 12K, 18K, 15K, 22K, 19K, 25K — labeled axes, color-coded bars, and a title 'Monthly Sales 2025'

Evaluator (Qwen 3 VL)

The requested bar chart is missing several critical elements and contains significant errors: 1. **Chart Type Mismatch**: The prompt requested a bar chart showing monthly sales for January through June. The image shows a stacked bar chart, which is not the same as a simple bar chart. 2. **Incorrect Month Labels**: The x-axis labels are not January through June. Instead, they show “January”, then “15K”, “23K”, “198K”, and “Jul” — which is not a valid sequence for months 1–6. 3. **Incorrect Y-A

bfl/flux-2-klein-9b: FAIL
A pie chart showing market share: Company A 35%, Company B 25%, Company C 20%, Others 20% — each slice a different color with labels and percentages

Evaluator (Qwen 3 VL)

The requested pie chart should show: - Company A: 35% - Company B: 25% - Company C: 20% - Others: 20% In the provided image: - Company A is labeled 35% — correct. - Company B is labeled 25% — correct. - Company C is labeled 20% — correct. - “Others” is labeled 20% — correct. However, the chart contains **incorrect percentages** in two slices: - One slice is labeled “218%” — this is impossible for market share (total exceeds 100%). - Another slice is labeled “200%” — also impossible. Additiona

bfl/flux-2-klein-9b: FAIL
A simple line graph showing daily website visitors over one week (Mon-Sun): 1200, 1500, 1100, 1800, 2000, 1700, 1400 — with grid lines, axis labels, and title 'Weekly Traffic'

Evaluator (Qwen 3 VL)

The requested visualization is a line graph showing daily website visitors from Monday to Sunday with specific values: 1200, 1500, 1100, 1800, 2000, 1700, 1400. In the provided image: - The title “Weekly Traffic” is correct. - The chart type is a line graph, as requested. - Grid lines are present. - However, the x-axis labels (“Wen-Sun”) are incorrect — they should represent Mon-Sun, and “Wen” is not a standard abbreviation for Monday (it’s typically “Mon” or “Mon-Sun” for the full range). The

Layout & Design 33% vs 89%
Easy 100% vs 100%
bfl/flux-2-klein-9b: PASS
A movie night poster with a large popcorn image in the center, the title 'MOVIE NIGHT' in bold at the top, and the tagline 'Every Friday at 8 PM' at the bottom

Evaluator (Qwen 3 VL)

The image features a large, centrally placed popcorn bucket, which matches the prompt’s description of a “large popcorn image in the center.” The title “MOVIE NIGHT” is prominently displayed at the top in bold, stylized lettering, and the tagline “Every Friday at 8 PM” is correctly positioned at the bottom. The visual hierarchy is clear: the popcorn is the focal point, the title is dominant at the top, and the tagline is legible at the bottom. All text is readable and correctly placed according

bfl/flux-2-klein-9b: PASS
An event poster featuring a guitar illustration, the headline 'LIVE MUSIC FEST' in large text, and 'Tickets available now' as a tagline below

Evaluator (Qwen 3 VL)

The poster includes: - A guitar illustration (acoustic guitar, correctly placed diagonally across the text). - The headline “LIVE MUSIC FEST” in large, bold, stacked text — visually dominant and correctly formatted. - The tagline “Tickets available now” at the bottom — readable, correctly placed, and in a smaller font as expected. The visual hierarchy is clear: the headline is largest and most prominent, the guitar integrates with the text, and the tagline is clearly subordinate but legible. A

bfl/flux-2-klein-9b: PASS
A book club poster with an open book illustration, the title 'JOIN THE BOOK CLUB' prominently displayed, and 'Meets every Wednesday' as the tagline

Evaluator (Qwen 3 VL)

All described layout zones/elements are present: - An open book illustration is centrally located. - The title “JOIN THE BOOK CLUB” is prominently displayed at the top, with clear visual hierarchy (larger font, bold styling, and color variation). - The tagline “Meets every Wednesday” is correctly placed at the bottom in a smaller, readable font. All text is readable and correctly positioned according to the prompt. VERDICT: PASS

Medium 0% vs 67%
bfl/flux-2-klein-9b: FAIL
A landing page design with a hero header image at the top, a headline and subtitle in the middle, a 'Sign Up' call-to-action button, and a footer with three social media icons

Evaluator (Qwen 3 VL)

The image shows a landing page with the following elements: - Hero header image at the top: Present — a photo of a person holding a scale, with a navigation bar including a logo and menu items. - Headline and subtitle in the middle: Present — large bold text (“Thle Mabem acbeak Goartiny.”) and a smaller subtitle below it. - “Sign Up” call-to-action button: Present — a blue button centered below the subtitle. - Footer with three social media icons: Present — Facebook, a custom icon, and Twitter,

bfl/flux-2-klein-9b: FAIL
A newsletter layout with a banner image header, a two-column body section with text on the left and an image on the right, a highlighted quote block, and a footer with unsubscribe link

Evaluator (Qwen 3 VL)

The image shows a newsletter layout with the following elements: - A banner image header with two people and stylized text (“ForsteuiClela” and “Vlotic Istantix kortocchef”) — this matches the requested header. - A two-column body section: text on the left, image on the right — this is visually present. - A highlighted quote block with quotation marks and attributed text — this is present on the right side, below the image. - A footer with an “Unsubscribe” link — this is present at the bottom.

bfl/flux-2-klein-9b: FAIL
A restaurant menu layout with a logo header, three category sections (appetizers, mains, desserts) each with items and prices, and a footer with address and hours

Evaluator (Qwen 3 VL)

The image shows a restaurant menu with a logo header (a circular emblem with “Mats” and a teapot icon), three category sections (Appetizers, Mains, Desserts), and a footer with address and hours. However, the text within all sections — including item names, prices, and footer information — is rendered in a garbled, nonsensical script that is not readable. The visual hierarchy is present (header, three columns, footer), but the content is illegible, which violates the requirement that “text is re

Hard 0% vs 100%
bfl/flux-2-klein-9b: FAIL
A magazine double-page spread with a left page containing a large hero photo, and a right page with a two-column article, a pull quote, a sidebar with author bio, and a page number

Evaluator (Qwen 3 VL)

The image shows a magazine double-page spread. The left page features a large hero photo of a man, which matches the prompt. The right page contains a two-column article layout, a pull quote in a distinct font and block, and a sidebar with an author bio and photo. Page numbers (44 and 13) are visible at the top corners. However, all text is gibberish — it appears to be placeholder or randomly generated text, not readable content. While the layout structure and visual hierarchy are correctly impl

bfl/flux-2-klein-9b: FAIL
A mobile app UI mockup showing a dashboard with a top navigation bar, a search field, a 2x2 grid of feature cards with icons, a list of recent activity items, and a bottom tab bar

Evaluator (Qwen 3 VL)

The image displays two mobile app UI mockups that largely match the described layout: both include a top navigation bar (with status icons and user profile), a search field, a 2x2 grid of feature cards with icons, a “Recent Activity” list, and a bottom tab bar. The visual hierarchy is consistent with standard mobile app design — top to bottom: navigation, search, feature cards, activity list, tab bar. However, the text is not readable. All labels — including app names (“Foath”, “Nesido”), featu

bfl/flux-2-klein-9b: FAIL
A fashion magazine spread with a full-bleed left-page photo, right page with three-column layout, headline, body copy, a smaller inset image, and styled page numbers

Evaluator (Qwen 3 VL)

The image shows a magazine spread with a full-bleed photo on the left page and a three-column layout on the right page, which matches the prompt’s description. A headline is present at the top of the right page, and body copy fills the three columns beneath it. A smaller inset image is also visible in the lower-left portion of the right page. Page numbers (90 and 25) are placed at the bottom of each respective page. However, the text is not readable — it appears to be gibberish or placeholder t
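
The Layout & Design percentages for bfl/flux-2-klein-9b can be reproduced from the nine verdicts listed in this subsection. The sketch below assumes each rate is the share of PASS verdicts rounded to the nearest whole percent. That rounding rule is inferred from the displayed figures and is not stated on the page.

```python
from collections import defaultdict

# Verdicts for bfl/flux-2-klein-9b in Layout & Design, as listed in this subsection.
verdicts = [
    ("Easy", "PASS"), ("Easy", "PASS"), ("Easy", "PASS"),
    ("Medium", "FAIL"), ("Medium", "FAIL"), ("Medium", "FAIL"),
    ("Hard", "FAIL"), ("Hard", "FAIL"), ("Hard", "FAIL"),
]


def pass_rate(results):
    """Share of PASS verdicts as a whole-number percentage (assumed rounding)."""
    return round(100 * sum(r == "PASS" for r in results) / len(results))


by_difficulty = defaultdict(list)
for difficulty, verdict in verdicts:
    by_difficulty[difficulty].append(verdict)

for difficulty, results in by_difficulty.items():
    print(difficulty, f"{pass_rate(results)}%")      # Easy 100%, Medium 0%, Hard 0%

overall = pass_rate([v for _, v in verdicts])
print("Layout & Design", f"{overall}%")              # Layout & Design 33%
```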

Style Diversity 92% vs 92%
Easy 100% vs 100%
bfl/flux-2-klein-9b: PASS
A golden retriever sitting in a garden, painted in the style of an oil painting with visible brushstrokes and rich texture

Evaluator (Qwen 3 VL)

The image depicts a golden retriever sitting in a garden, surrounded by flowers and foliage, with a wooden fence in the background. The visual style is clearly that of an oil painting: there are visible brushstrokes, especially in the fur of the dog, the grass, and the leaves. The texture is rich and painterly, with layered colors and blending that mimics traditional oil painting techniques. The lighting and composition also evoke a classical, impressionistic feel. All key characteristics reque

bfl/flux-2-klein-9b: PASS
A golden retriever sitting in a garden, rendered in Japanese anime style with flat colors and large expressive eyes

Evaluator (Qwen 3 VL)

The image depicts a golden retriever sitting in a garden, which matches the subject matter of the prompt. The style is cartoonish and stylized, with large, expressive eyes — a hallmark of Japanese anime aesthetics. The coloring is largely flat with minimal shading, consistent with the requested “flat colors” characteristic. The outlines are clean and bold, and the overall composition is simplified and illustrative, aligning with anime conventions. However, the image lacks some key elements typi

bfl/flux-2-klein-9b: PASS
A golden retriever sitting in a garden, rendered in pixel art style with a 32x32 grid and limited color palette

Evaluator (Qwen 3 VL)

The image clearly depicts a golden retriever sitting in a garden, rendered in a pixel art style. The visual composition is built on a visible 32x32 grid, with each element — the dog, flowers, trees, and sky — composed of distinct square pixels. The color palette is limited, using a small number of colors per element (e.g., shades of green, blue, pink, brown, and yellow) to create the scene. The outlines are sharp and blocky, consistent with classic pixel art. All key characteristics of the reque

Medium 100% vs 100%
bfl/flux-2-klein-9b: PASS
A city street scene in the style of Art Deco — geometric shapes, gold and black palette, ornamental symmetry

Evaluator (Qwen 3 VL)

The image clearly embodies the requested Art Deco style. It features: - **Geometric shapes**: Angular buildings, stepped facades, zigzag patterns on the street, and stylized signage. - **Gold and black palette**: The entire scene is rendered in a monochromatic gold-on-black scheme, emphasizing luxury and contrast. - **Ornamental symmetry**: The composition is highly symmetrical, with buildings mirroring each other down the street, and decorative motifs centered and repeated. All key visual cha

bfl/flux-2-klein-9b: PASS
A woman with a parasol in the style of Impressionism — soft brushstrokes, dappled light, pastel palette

Evaluator (Qwen 3 VL)

The image depicts a woman holding a parasol in a sunlit garden, rendered with visible, textured brushwork that suggests soft, blended strokes. The palette is dominated by pastels — pinks, blues, greens, and creams — evoking the gentle, luminous quality characteristic of Impressionism. Dappled light is evident through the filtered foliage and the way light plays across the parasol and the woman’s dress. While the composition is somewhat idealized and the figure’s features are rendered with clarit

bfl/flux-2-klein-9b: PASS
A portrait of a man in Pop Art style — bold outlines, Ben-Day dots, flat vivid primary colors

Evaluator (Qwen 3 VL)

The image clearly embodies the requested Pop Art style. It features: - Bold, black outlines defining all shapes and features. - Ben-Day dots (halftone patterns) used for shading and color transitions across the face, hair, and background. - Flat, vivid primary colors (yellow, red, green, blue, purple) with high contrast and saturation. All key visual characteristics from the prompt are present and unmistakable. VERDICT: PASS

Hard 67% vs 67%
bfl/flux-2-klein-9b: PASS
A forest rendered in photorealistic watercolor — the detail and lighting of a photo combined with visible watercolor bleeds and paper texture

Evaluator (Qwen 3 VL)

The image depicts a forest scene rendered in a watercolor style, with visible paper texture and watercolor bleeds—especially along the edges and in the background. The lighting is soft and atmospheric, with a sense of depth and realism in the tree trunks, foliage, and path, resembling photographic lighting and composition. However, the “photorealistic” aspect is somewhat ambiguous: while the scene has photographic depth and lighting, the textures (tree bark, leaves, ground) are stylized and pain

bfl/flux-2-klein-9b: PASS
A woman in a futuristic city rendered in cyberpunk Art Nouveau — neon colors and tech elements combined with flowing organic lines and floral borders

Evaluator (Qwen 3 VL)

The image successfully merges cyberpunk and Art Nouveau aesthetics as requested. Key characteristics are clearly present: - **Cyberpunk elements**: Neon signage, futuristic cityscape, rain-slicked streets, tech-enhanced clothing, and a dystopian urban environment. - **Art Nouveau elements**: The ornate, flowing floral border framing the image, curvilinear decorative motifs, and stylized organic forms in the architecture and framing. - **Neon colors**: Dominant purples, pinks, and blues with glo

bfl/flux-2-klein-9b: FAIL
A medieval castle depicted in Japanese ukiyo-e woodblock print style combined with photorealistic sky and lighting

Evaluator (Qwen 3 VL)

The image successfully incorporates the key visual characteristics of Japanese ukiyo-e woodblock prints: flat areas of color, bold outlines, stylized clouds and mountains, and the inclusion of Japanese text in the margins and blocks — all hallmarks of the style. The castle itself is rendered with the architectural precision and decorative flair typical of ukiyo-e depictions of historical structures. However, the prompt also requested “photorealistic sky and lighting.” While the sky does show a
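
The tier percentage above is consistent with a simple share of PASS verdicts: the three Hard prompts shown for bfl/flux-2-klein-9b came back PASS, PASS, FAIL, and 2 of 3 rounds to 67%. The page does not document how ImageBench aggregates its scores, so the following is only a minimal sketch under that assumption. The function names `parse_verdict` and `pass_rate` are hypothetical, and the `VERDICT: PASS` / `VERDICT: FAIL` marker format is taken from the untruncated evaluator outputs on this page.

```python
import re


def parse_verdict(evaluator_text):
    """Return 'PASS' or 'FAIL' from the last 'VERDICT: ...' marker, or None.

    Assumes the evaluator ends its answer with 'VERDICT: PASS' or
    'VERDICT: FAIL', as the complete outputs on this page do.
    """
    matches = re.findall(r"VERDICT:\s*(PASS|FAIL)", evaluator_text)
    return matches[-1] if matches else None


def pass_rate(verdicts):
    """Share of PASS verdicts as a whole-number percentage (assumed rounding)."""
    if not verdicts:
        raise ValueError("no verdicts to aggregate")
    passed = sum(1 for v in verdicts if v == "PASS")
    return round(100 * passed / len(verdicts))


# Verdicts shown on this page for the three Hard prompts.
hard_verdicts = ["PASS", "PASS", "FAIL"]
print(pass_rate(hard_verdicts))  # 67
```

Evaluator outputs that are cut off before the marker yield `None` from `parse_verdict`, so a real aggregation would need the full text or the verdict label shown on each card.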

Extreme 100% vs 100%
bfl/flux-2-klein-9b: PASS
An interior scene rendered in soft brutalism — raw concrete textures combined with rounded pastel furniture and warm diffused light

Evaluator (Qwen 3 VL)

The image clearly embodies the requested “soft brutalism” style. Key visual characteristics are present:
- **Raw concrete textures**: The walls, ceiling, and floor are all rendered in exposed, unpolished concrete with visible aggregate and seams, a hallmark of brutalism.
- **Rounded pastel furniture**: The furniture, including the sofas, armchairs, and coffee tables, is distinctly rounded, organic in form, and upholstered in soft pastel tones (pinks, mint, beige).
- **Warm diffused light**:

bfl/flux-2-klein-9b: PASS
A still life painted in anxious minimalism — sparse composition with uncomfortable negative space, muted colors, and subtle visual tension

Evaluator (Qwen 3 VL)

The image presents a still life with a solitary glass bottle on a weathered wooden surface against a large, textured, muted gray wall. The composition is sparse, with the bottle positioned off-center to the right, leaving significant negative space — a key element of “anxious minimalism.” The color palette is subdued, dominated by grays, browns, and the transparent glass, fitting “muted colors.” The brushwork is visible and textured, suggesting a painted medium, and the overall atmosphere feels

bfl/flux-2-klein-9b: PASS
A cityscape rendered in nostalgic futurism — retro-futuristic 1960s space-age aesthetics with a melancholic warm-toned patina

Evaluator (Qwen 3 VL)

The image clearly embodies the requested “nostalgic futurism — retro-futuristic 1960s space-age aesthetics with a melancholic warm-toned patina.” Key visual characteristics are present: - **Retro-futurism**: The architecture features sleek, optimistic 1960s sci-fi design — domed buildings, spires, flying saucers, and elevated walkways — all hallmarks of mid-century speculative design. - **Warm-toned patina**: The entire scene is bathed in a sepia-toned, golden-hour glow with soft, faded edges a