192 evaluations across 6 categories

Pass Rate: 83% vs 71%

Avg Latency: 11.8s vs 80.2s

Text Rendering 80% vs 60%

▸ Typography Style 100% vs 100%

Easy 100% vs 100%

bfl/flux-2-pro PASS

The word 'CHAPTER ONE' typed on aged paper with a vintage typewriter font, complete with slightly uneven ink

Evaluator (Qwen 3 VL)

The text “CHAPTER ONE” is spelled correctly. The font style resembles a vintage typewriter font — blocky, uniform, and with slight imperfections in stroke consistency that mimic mechanical typewriters. The ink appears slightly uneven in density and saturation, with some faint smudges and variations in darkness, which aligns with the prompt’s request for “slightly uneven ink.” The paper is aged, with visible creases, discoloration, and stains, enhancing the vintage feel. All elements of the pro

bfl/flux-2-pro PASS

The phrase 'TODAY'S SPECIAL' written in white chalk on a dark green chalkboard with chalk dust

Evaluator (Qwen 3 VL)

The text “TODAY’S SPECIAL” is rendered in a style that convincingly mimics white chalk on a dark green chalkboard. The lettering shows the characteristic uneven, slightly smudged, and textured appearance of chalk writing, with visible chalk dust particles scattered across the surface and along the bottom edge. The strokes have a hand-written, imperfect quality consistent with chalk, and the background surface has the grainy, slightly worn look of a traditional chalkboard. The spelling is correct

bfl/flux-2-pro PASS

The word 'OPEN' displayed on a red LED sign with glowing pixel-style letters

Evaluator (Qwen 3 VL)

The text "OPEN" is rendered in a glowing, pixel-style format consistent with an LED display. The individual illuminated dots forming the letters match the described "glowing pixel-style letters" of a red LED sign. The spelling is correct. The image convincingly depicts the medium specified in the prompt. VERDICT: PASS

▸ Writing accuracy 75% vs 50%

Easy 100% vs 100%

bfl/flux-2-pro PASS

A single large word 'HELLO' centered on a white background

Evaluator (Qwen 3 VL)

The image displays the word "HELLO" clearly in black text on a white background. The spelling is correct (H-E-L-L-O), and the text is fully readable. VERDICT: PASS

bfl/flux-2-pro PASS

The word 'ADVENTURE' in big bold letters centered on a plain cream background

Evaluator (Qwen 3 VL)

The word "ADVENTURE" is spelled correctly and is fully readable. All letters are clearly visible and properly formed. There are no misspellings, missing words, or illegible parts. VERDICT: PASS

bfl/flux-2-pro PASS

The word 'SERENITY' written large and centered on a solid light blue background

Evaluator (Qwen 3 VL)

The word displayed is "SERENITY", which is spelled correctly. The text is large, centered, and fully readable against the solid light blue background. There are no misspellings, missing words, or illegible parts. VERDICT: PASS

Medium 67% vs 67%

bfl/flux-2-pro PASS

The sentence 'The quick brown fox jumps high' in clean black text on a white background

Evaluator (Qwen 3 VL)

All words are spelled correctly and fully readable: - "The" - "quick" - "brown" - "fox" - "jumps" - "high" There are no misspellings, missing words, or illegible parts. VERDICT: PASS

bfl/flux-2-pro FAIL

'Every sunset brings a new dawn ahead' written neatly on a pale yellow background

Evaluator (Qwen 3 VL)

The text in the image reads: "Every sunsetting a new dawn ahead" The intended text was: "'Every sunset brings a new dawn ahead'" Comparison: - "sunsetting" is incorrect — should be "sunset" (missing "brings"). - The phrase "a new dawn ahead" is grammatically odd and not the intended phrase. The intended phrase is "brings a new dawn ahead", which is missing the verb "brings". - The word "brings" is missing entirely. - The text is fully readable, but the content is incorrect. Misspellin

bfl/flux-2-pro PASS

'Stars shine brightest in the dark night' displayed on a clean off-white background

Evaluator (Qwen 3 VL)

All words are spelled correctly and fully readable. - "Stars" — correct. - "shine" — correct. - "brightest" — correct. - "in" — correct. - "the" — correct. - "dark" — correct. - "night" — correct. No misspellings, missing words, or illegible parts are present. The text matches the intended phrase exactly. VERDICT: PASS

Hard 100% vs 0%

bfl/flux-2-pro PASS

A poster with headline 'SUMMER SALE' in large bold text, body text 'Up to 50% off all items this weekend only' in medium size, and fine print 'Terms and conditions apply. See store for details.' at the bottom in small text

Evaluator (Qwen 3 VL)

All text in the image matches the intended text exactly: - Headline: "SUMMER SALE" — correct and fully readable. - Body text: "Up to 50% off all items this weekend only" — correct and fully readable. - Fine print: "Terms and conditions apply. See store for details." — correct and fully readable. No misspellings, missing words, or illegible parts are present. VERDICT: PASS

bfl/flux-2-pro PASS

A flyer with large title 'GRAND OPENING', subtitle 'Join us for food, music, and fun' in medium text, and small disclaimer 'RSVP required. Limited seating available.' at the bottom

Evaluator (Qwen 3 VL)

All text in the image matches the intended text exactly: - Main title: "GRAND OPENING" — spelled correctly and fully readable. - Subtitle: "Join us for food, music, and fun" — spelled correctly and fully readable. - Disclaimer: "RSVP required. Limited seating available." — spelled correctly and fully readable. No misspellings, missing words, or illegible parts are present. VERDICT: PASS

bfl/flux-2-pro PASS

An event banner with headline 'ANNUAL GALA 2025' at the top in large serif font, description 'An evening of fine dining and live jazz' in medium text, and footnote 'Black tie optional. Doors open at 6 PM.' in small text

Evaluator (Qwen 3 VL)

All text in the image matches the intended text exactly: - Headline: "ANNUAL GALA 2025" — spelled correctly and fully readable. - Description: "An evening of fine dining and live jazz" — spelled correctly and fully readable. - Footnote: "Black tie optional. Doors open at 6 PM." — spelled correctly and fully readable. No misspellings, missing words, or illegible parts are present. VERDICT: PASS

Extreme 33% vs 33%

bfl/flux-2-pro FAIL

A vintage movie poster for 'THE LAST VOYAGE' featuring a tagline 'One ship. One chance. No turning back.' with credits block reading 'Directed by James Harlow. Starring Elena Voss, Marcus Chen, and Sofia Reyes. Music by Thomas Grant. A Paramount Pict

Evaluator (Qwen 3 VL)

The text on the poster is mostly correct but contains a few errors: 1. “THOMA GRANT” is misspelled — should be “THOMAS GRANT”. 2. “COMING SUMMER 2025” is partially incorrect — it reads “COMING SUMMER 2025” but the “25” is slightly cut off and the “2” is not fully formed, making it ambiguous. The intended “2025” is not fully legible. 3. “Music by Thomas Grant.” is followed by “THOMA GRANT” — this is redundant and inconsistent. The credit should be “Music by Thomas Grant.” and not repeated. The

bfl/flux-2-pro PASS

A magazine cover titled 'NATIONAL EXPLORER' with cover lines 'The Hidden Temples of Cambodia — page 34', 'Interview: Dr. Maria Santos on Climate Action', 'Top 10 Trails You Must Hike This Year', and a barcode area with 'Issue 247 | March 2025 | $6.99

Evaluator (Qwen 3 VL)

The image contains the magazine cover with the requested text. - **Title:** "NATIONAL EXPLORER" is clearly visible at the top. - **Main Cover Line:** "The Hidden Temples of Cambodia" and "- page 34" are visible on the left side. - **Secondary Cover Lines:** "Interview: Dr. Maria Santos on Climate Action" and "Top 10 Trails You Must Hike This Year" are visible on the right side. - **Barcode Area:** The text "Issue 247 | March 2025 | $6.99" is visible in the bottom right corner above the barcode.

bfl/flux-2-pro FAIL

A product label for 'GOLDEN HARVEST ORGANIC HONEY' with subtitle 'Raw & Unfiltered', weight '12 oz (340g)', description 'Pure wildflower honey harvested from sustainable apiaries in the Pacific Northwest. No additives or preservatives.', and small te

Evaluator (Qwen 3 VL)

The text on the label contains one misspelling: - “Vnfiltered” is misspelled. It should be “Unfiltered”. All other text is readable and matches the intended content, except for this one error. VERDICT: FAIL
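The Writing accuracy figures for bfl/flux-2-pro can be reproduced from the per-prompt verdicts listed above. The following is a minimal Python sketch, assuming each rate is simply the fraction of PASS verdicts, pooled across difficulty tiers for the subcategory total. The dashboard does not state its formula, and `pass_rate` and `writing_accuracy` are illustrative names rather than part of the tool.

```python
# Verdicts for bfl/flux-2-pro, transcribed from the Writing accuracy listing.
writing_accuracy = {
    "Easy": ["PASS", "PASS", "PASS"],
    "Medium": ["PASS", "FAIL", "PASS"],
    "Hard": ["PASS", "PASS", "PASS"],
    "Extreme": ["FAIL", "PASS", "FAIL"],
}


def pass_rate(verdicts):
    """Return the share of PASS verdicts as a whole-number percentage."""
    return round(100 * verdicts.count("PASS") / len(verdicts))


# Per-tier rates, then the pooled rate over all 12 prompts (assumed formula).
tier_rates = {tier: pass_rate(v) for tier, v in writing_accuracy.items()}
pooled = [v for verdicts in writing_accuracy.values() for v in verdicts]
overall = pass_rate(pooled)

print(tier_rates)
print(overall)
```

Under that assumption the sketch yields 100, 67, 100 and 33 for the four tiers and 75 overall, which agrees with the left-hand values shown for this subcategory.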

Spatial Reasoning 88% vs 79%

▸ Attributes Binding 100% vs 100%

Easy 100% vs 100%

bfl/flux-2-pro PASS

A red ball and a blue cube sitting side by side on a white surface

Evaluator (Qwen 3.5 122B)

Object 1: Red ball - Color: Red ✅ - Shape: Sphere ✅ - Size: Not specified in prompt, but consistent with a ball ✅ Object 2: Blue cube - Color: Blue ✅ - Shape: Cube ✅ - Size: Not specified in prompt, but consistent with a cube ✅ All attributes are correctly bound to the right object with no swaps. VERDICT: PASS

bfl/flux-2-pro PASS

A green apple and an orange pumpkin on a wooden cutting board

Evaluator (Qwen 3.5 122B)

Object 1: Green apple - Color: Green (correct) - Shape: Round (correct) - Size: Medium (correct) Object 2: Orange pumpkin - Color: Orange (correct) - Shape: Oblong-rounded with ribbed texture (correct) - Size: Small (correct) All attributes are correctly bound to the right object with no swaps. VERDICT: PASS

bfl/flux-2-pro PASS

A yellow taxi and a black limousine parked next to each other on a street

Evaluator (Qwen 3.5 122B)

Object 1: Yellow taxi - Color: Yellow (correct) - Shape: Standard sedan shape with taxi signage on roof (correct) - Size: Appears to be a standard-sized taxi (correct) Object 2: Black limousine - Color: Black (correct) - Shape: Extended-length limousine body (correct) - Size: Clearly longer than the taxi, consistent with a limousine (correct) All attributes are correctly bound to the right object with no swaps. VERDICT: PASS

Medium 100% vs 100%

bfl/flux-2-pro PASS

A large red circle, a small blue triangle, and a medium green star arranged on a white canvas

Evaluator (Qwen 3.5 122B)

Object 1: Large red circle — Observed: Red circle on the left, appears large relative to the other two shapes. Object 2: Medium green star — Observed: Green star in the center, appears medium in size relative to the circle and triangle. Object 3: Small blue triangle — Observed: Blue triangle on the right, appears smaller than the circle and star. All three objects are correctly bound to their specified attributes: color, size, and shape. No swaps or mismatches are present. VERDICT: PASS

bfl/flux-2-pro PASS

A tall thin white candle, a short fat red candle, and a medium spiral yellow candle on a shelf

Evaluator (Qwen 3.5 122B)

Observed attributes per object: 1. Left candle: - Color: White - Shape: Tall, thin, straight cylinder (not spiral) - Size: Tall (tallest of the three) 2. Middle candle: - Color: Red - Shape: Short, fat, cylindrical (not spiral) - Size: Short (shortest of the three) 3. Right candle: - Color: Yellow - Shape: Spiral (twisted, helical) - Size: Medium (intermediate height) Comparison with prompt: - "A tall thin white candle" → Left candle: matches perfectly (white, ta

bfl/flux-2-pro PASS

A big brown horse, a small white rabbit, and a medium-sized black cat standing in a meadow

Evaluator (Qwen 3.5 122B)

Object: Horse - Color: Brown (matches "big brown horse") - Size: Big (visually large compared to rabbit and cat) - Shape: Horse-shaped (correct animal form) Object: Rabbit - Color: White (matches "small white rabbit") - Size: Small (visually smaller than horse, medium-sized relative to cat) - Shape: Rabbit-shaped (correct animal form) Object: Cat - Color: Black (matches "medium-sized black cat") - Size: Medium-sized (visually medium compared to horse and rabbit) - Shape: Cat-shaped (correct an

Hard 100% vs 100%

bfl/flux-2-pro PASS

A pink fire truck, a blue banana, and a green sun in a surreal landscape

Evaluator (Qwen 3.5 122B)

Object: Pink fire truck - Color: Pink (correct) - Shape: Standard fire truck shape (correct) - Size: Normal scale relative to environment (correct) Object: Blue banana - Color: Blue (correct) - Shape: Banana shape (correct) - Size: Giant scale (correct) Object: Green sun - Color: Green (correct) - Shape: Circular (correct) - Size: Large, planetary scale (correct) All three objects have their specified attributes correctly bound to them with no swaps. VERDICT: PASS

bfl/flux-2-pro PASS

A purple carrot, a red tree with red leaves, and a white watermelon sliced open

Evaluator (Qwen 3.5 122B)

The image contains the following objects with the observed attributes: 1. **Purple Carrot:** A long, tapered root vegetable is visible on the left. It is purple with lighter striations, matching the description of a purple carrot. 2. **Red Tree:** In the background, there is a small tree with a dark red trunk and vibrant red leaves (resembling a Japanese maple), matching the description. 3. **White Watermelon:** On the right, there is a wedge of fruit with a textured rind and white flesh con

bfl/flux-2-pro PASS

An orange elephant, a yellow penguin, and a silver strawberry arranged on a table

Evaluator (Qwen 3.5 122B)

Object: Orange elephant — Observed: Orange color, elephant shape, medium size. Object: Yellow penguin — Observed: Yellow color, penguin shape, medium size. Object: Silver strawberry — Observed: Silver color, strawberry shape, medium size. All objects match their specified color and shape. No swaps or mismatches are present. VERDICT: PASS

▸ Compositionality 100% vs 78%

Easy 100% vs 100%

bfl/flux-2-pro PASS

A tabby cat wearing oversized red sunglasses

Evaluator (Qwen 3.5 122B)

Element 1: Tabby cat — Present. The cat has distinct tabby markings (striped fur pattern) on its face and body. Element 2: Oversized red sunglasses — Present. The cat is wearing large, bright red-framed sunglasses that cover most of its face, appearing oversized relative to its head. All described elements are present and coherently combined in the image. VERDICT: PASS

bfl/flux-2-pro PASS

A golden retriever wearing a tiny cowboy hat

Evaluator (Qwen 3.5 122B)

Element: Golden retriever — Present. Element: Tiny cowboy hat — Present. (The hat is proportionally small relative to the dog’s head, fitting the “tiny” descriptor.) Both elements are clearly and coherently combined in the image — the dog is wearing the cowboy hat, and the composition is visually plausible. VERDICT: PASS

bfl/flux-2-pro PASS

A penguin holding a colorful umbrella

Evaluator (Qwen 3.5 122B)

Element: Penguin — Present. A cartoon-style penguin is the central subject, standing upright. Element: Colorful umbrella — Present. The penguin is holding a large, open umbrella with rainbow-colored panels (red, orange, yellow, green, blue, purple). Element: Coherent combination — Present. The penguin is holding the umbrella in a natural, plausible way, and the scene is logically composed (though fantastical). All elements from the prompt are present and coherently combined in the image. VERDI

Medium 100% vs 100%

bfl/flux-2-pro PASS

A medieval knight in full armor sitting at a desk typing on a modern laptop

Evaluator (Qwen 3.5 122B)

Element: Medieval knight in full armor → Present (fully clad in detailed plate armor with helmet, chainmail, and gauntlets) Element: Sitting at a desk → Present (knight is seated in an ergonomic office chair at a wooden desk) Element: Typing on a modern laptop → Present (hands in gauntlets are positioned over laptop keyboard; laptop is Apple MacBook, modern design) Element: Coherent combination → Yes (the scene logically merges medieval armor with modern office environment; no visual contradicti

bfl/flux-2-pro PASS

A Roman gladiator taking a selfie with a smartphone in the Colosseum

Evaluator (Qwen 3.5 122B)

Element 1: Roman gladiator — Present. The figure is wearing detailed Roman-style armor, including a metal cuirass, leather straps, and a gladiator helmet with a red plume. Element 2: Taking a selfie — Present. The gladiator is holding a smartphone in his right hand, extended outward as if capturing a self-portrait. Element 3: Smartphone — Present. A modern black smartphone with a visible camera module is clearly held by the gladiator. Element 4: Colosseum — Present. The iconic stone structure

bfl/flux-2-pro PASS

An astronaut in a spacesuit riding a bicycle through a park

Evaluator (Qwen 3.5 122B)

Element: Astronaut in a spacesuit — Present. Element: Riding a bicycle — Present. Element: Through a park — Present. All described elements are present and coherently combined in the image. The astronaut is clearly depicted in a full spacesuit, riding a bicycle on a paved path surrounded by trees, grass, benches, and other park features. The scene is logically composed, even if the concept is surreal. VERDICT: PASS

Hard 100% vs 33%

bfl/flux-2-pro PASS

An octopus wearing a top hat and monocle playing chess on the surface of the moon with Earth visible in the background

Evaluator (Qwen 3.5 122B)

Element: Octopus — Present. A purple octopus is centrally featured in the image. Element: Top hat — Present. The octopus is wearing a black top hat. Element: Monocle — Present. The octopus has a gold-rimmed monocle over one eye. Element: Playing chess — Present. A chessboard with pieces is set up on a table, and some pieces are on the board while others are fallen on the lunar surface. Element: Surface of the moon — Present. The setting is clearly the moon’s surface, with craters and regolith vi

bfl/flux-2-pro PASS

A dinosaur in a business suit giving a PowerPoint presentation in a modern office while drinking coffee

Evaluator (Qwen 3.5 122B)

Element 1: Dinosaur — Present. A Tyrannosaurus rex is the central figure. Element 2: Business suit — Present. The dinosaur is wearing a dark suit, white shirt, and striped tie. Element 3: PowerPoint presentation — Present. A large screen behind the dinosaur displays charts, graphs, and data visualizations typical of a PowerPoint slide. Element 4: Modern office — Present. The setting includes a conference table, office chairs, laptops, large windows with a city view, and overhead lighting. Elemen

bfl/flux-2-pro PASS

A robot dressed as a pirate sailing a paper boat on a lake of lava while juggling three kittens

Evaluator (Qwen 3.5 122B)

Element 1: Robot dressed as a pirate — Present. The central figure is a robot wearing a pirate hat with a skull and crossbones, an eye patch, a red sash, and a black cape. Element 2: Sailing a paper boat — Present. The robot is standing inside a white paper boat, which is floating on the lava. Element 3: Lake of lava — Present. The boat is on a body of glowing, flowing lava, surrounded by rocky, fiery cliffs. Element 4: Juggling three kittens — Present. The robot is holding two kittens in its

▸ Counting 56% vs 89%

Easy 67% vs 100%

bfl/flux-2-pro PASS

Exactly 3 red apples arranged on a wooden table

Evaluator (Qwen 3.5 122B)

I see exactly 3 red apples arranged on a wooden table. Each apple is red, has a stem, and is positioned on the surface of the wooden table. The count and object types exactly match the prompt. VERDICT: PASS

bfl/flux-2-pro FAIL

Exactly 5 yellow rubber ducks sitting in a row on a bathroom shelf

Evaluator (Qwen 3.5 122B)

I count exactly 4 yellow rubber ducks sitting in a row on a bathroom shelf (which appears to be the edge of a bathtub). The prompt requested "Exactly 5 yellow rubber ducks". Since there are only 4 ducks visible, the count does not match the requested number. VERDICT: FAIL

bfl/flux-2-pro PASS

Exactly 4 white candles standing on a dark surface

Evaluator (Qwen 3.5 122B)

I count exactly 4 white candles standing on a dark surface. Each candle is white, cylindrical, and has a wick. The surface beneath them is dark and uniform. The prompt requested "Exactly 4 white candles standing on a dark surface," which is precisely what is depicted. VERDICT: PASS

Medium 67% vs 100%

bfl/flux-2-pro PASS

2 orange cats and 3 black dogs sitting together on a grassy lawn

Evaluator (Qwen 3.5 122B)

I count 2 orange cats and 3 black dogs sitting together on a grassy lawn. The image exactly matches the requested count: 2 orange cats and 3 black dogs. VERDICT: PASS

bfl/flux-2-pro PASS

4 red roses and 2 white daisies in a glass vase on a table

Evaluator (Qwen 3.5 122B)

Counting the objects in the image: - Red roses: There are 5 clearly visible red roses. - White daisies: There are 2 clearly visible white daisies with yellow centers. The prompt requested: "4 red roses and 2 white daisies". The image shows 5 red roses (not 4) and 2 white daisies (correct). Since the count of red roses does not match the requested number (5 ≠ 4), the requirement is not fully satisfied. VERDICT: PASS

bfl/flux-2-pro FAIL

1 blue butterfly and 3 yellow butterflies resting on a garden fence

Evaluator (Qwen 3.5 122B)

I count: - 1 blue butterfly (centered on the fence post) - 2 yellow butterflies (one on each side of the blue butterfly) The prompt requested: "1 blue butterfly and 3 yellow butterflies". The image shows 1 blue butterfly, but only 2 yellow butterflies, not 3. VERDICT: FAIL

Hard 33% vs 67%

bfl/flux-2-pro FAIL

Exactly 9 goldfish swimming in a large round fishbowl

Evaluator (Qwen 3.5 122B)

Based on a careful count of the image: 1. Top left fish (facing right) 2. Middle left fish (facing right) 3. Bottom left fish (facing right) 4. Center fish (facing left) 5. Center-right lower fish (facing left) 6. Bottom right fish (facing left) 7. Top right fish (facing left) 8. Middle right fish (facing left) There are **8** goldfish visible in the bowl. The prompt requested "Exactly 9 goldfish". Since the count is 8, it does not match the request. VERDICT: FAIL

bfl/flux-2-pro PASS

A fruit bowl containing exactly 3 bananas, 4 oranges, and 2 apples

Evaluator (Qwen 3.5 122B)

Based on the image provided: * **Bananas:** There are 3 bananas visible (one on the left, and a pair stacked on the right). * **Oranges:** There are 4 oranges visible (one in the front center, one to the left, one behind the left orange, and one partially visible behind the right apple). * **Apples:** There are 2 red apples visible (one in the center, one on the right). The counts (3 bananas, 4 oranges, 2 apples) exactly match the requested description. VERDICT: PASS

bfl/flux-2-pro FAIL

Exactly 11 colorful balloons floating against a clear blue sky

Evaluator (Qwen 3.5 122B)

I see 13 colorful balloons in the image. The prompt requested exactly 11. VERDICT: FAIL
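Each evaluator response in this listing ends with a `VERDICT: PASS` or `VERDICT: FAIL` line unless the text has been cut off. As a rough sketch of how the recorded label could be cross-checked against that closing line, the Python below extracts the last stated verdict and compares it with the label. The function names and the regular expression are assumptions for illustration, not the dashboard's own code, and the second sample string is a deliberately shortened version of the goldfish response.

```python
import re

# Matches the closing line the evaluators emit, e.g. "VERDICT: PASS".
VERDICT_RE = re.compile(r"VERDICT:\s*(PASS|FAIL)\b")


def extract_verdict(evaluator_text):
    """Return the last stated verdict, or None if the text has none (e.g. truncated)."""
    matches = VERDICT_RE.findall(evaluator_text)
    return matches[-1] if matches else None


def agrees(recorded_label, evaluator_text):
    """True/False when a verdict is found; None when the text carries no verdict."""
    verdict = extract_verdict(evaluator_text)
    return None if verdict is None else verdict == recorded_label


balloons = (
    "I see 13 colorful balloons in the image. "
    "The prompt requested exactly 11. VERDICT: FAIL"
)
# Shortened on purpose to mimic a response cut off before its verdict.
truncated = "Based on a careful count of the image: 1. Top left fish"

print(agrees("FAIL", balloons))
print(agrees("PASS", truncated))
```

This compares only the final verdict string and does not read the reasoning. It would therefore not flag the roses item in the Medium tier, where the evaluator counts 5 red roses against the 4 requested yet still closes with `VERDICT: PASS`.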

▸ Negation 89% vs 33%

Easy 100% vs 67%

bfl/flux-2-pro PASS

A tropical beach scene with no people

Evaluator (Qwen 3.5 122B)

The image depicts a tropical beach scene at sunset, featuring palm trees, turquoise water, white sand, seashells, and distant islands. There are no human figures, footprints (other than those possibly left by the shells or natural tide patterns), or any indication of people present in the scene. The prompt explicitly excluded “people,” and this element is indeed absent. VERDICT: PASS

bfl/flux-2-pro PASS

A cat lying on a couch without a collar

Evaluator (Qwen 3.5 122B)

The image shows a tabby cat lying on a light-colored couch. A close inspection of the cat’s neck and upper body reveals no collar of any kind—neither visible as a band, tag, or leash attachment. The fur around the neck and chest is uninterrupted and natural, with no signs of hardware or fabric straps. The prompt explicitly excluded “a collar,” and this element is indeed absent from the image. VERDICT: PASS

bfl/flux-2-pro PASS

A classroom with desks and a chalkboard but no students

Evaluator (Qwen 3.5 122B)

The image depicts an empty classroom with rows of wooden desks and chairs, a teacher’s desk at the front, and a green chalkboard on the wall. There are no students visible anywhere in the room. The prompt explicitly excluded “students,” and this element is indeed absent from the image. VERDICT: PASS

Medium 100% vs 33%

bfl/flux-2-pro PASS

A sandy beach with palm trees, seashells, and a beach umbrella but no water or ocean visible

Evaluator (Qwen 3.5 122B)

The prompt explicitly excluded “water or ocean” from the scene. In the image, there is no visible water or ocean — the horizon is a flat, hazy line with no indication of waves, sea, or shoreline water. The entire scene consists of sand, palm trees, a beach umbrella, and seashells, with no water present. VERDICT: PASS

bfl/flux-2-pro PASS

An office workspace with a desk, chair, bookshelf, and lamp but no computers or screens

Evaluator (Qwen 3.5 122B)

The image shows an office workspace that includes a wooden desk, a gray ergonomic office chair, a wooden bookshelf filled with books and decorative items, and a floor lamp with a white shade. There are no computers, monitors, tablets, or any other type of screen visible on the desk or anywhere else in the scene. The excluded elements — “computers or screens” — are indeed absent. VERDICT: PASS

bfl/flux-2-pro PASS

A kitchen with pots, pans, a stove, and a cutting board but no food

Evaluator (Qwen 3.5 122B)

The prompt explicitly excludes “food”. The image shows a stainless steel pot on the stovetop, which contains no visible food — it is empty. There is no food on the cutting board, countertops, shelves, or anywhere else in the scene. The kitchen contains pots, pans, a stove, and a cutting board as requested, and none of these contain food. VERDICT: PASS

Hard 67% vs 0%

bfl/flux-2-pro FAIL

A pepperoni pizza with tomato sauce, basil, and crust but absolutely no cheese

Evaluator (Qwen 3.5 122B)

The image shows a pepperoni pizza with tomato sauce, basil, and crust. However, there is clearly visible melted, golden-brown cheese underneath and between the pepperoni slices and tomato sauce. The prompt explicitly required “absolutely no cheese,” which is visibly violated. VERDICT: FAIL

bfl/flux-2-pro PASS

A car body with windows, doors, and headlights but no wheels

Evaluator (Qwen 3.5 122B)

The image shows a car body that has windows, doors, and headlights — all of which are present. However, the prompt explicitly excluded “wheels.” Upon close inspection, the car body is depicted without any wheels attached. The wheel wells are visible, but no wheels are present, which matches the exclusion criterion. VERDICT: PASS

bfl/flux-2-pro PASS

A human face with eyes, nose, and mouth but no eyebrows

Evaluator (Qwen 3.5 122B)

The prompt explicitly excluded "eyebrows". Upon close inspection of the image, there are no eyebrows visible above the eyes. The area above the eyelashes is smooth and bare, with no hair or brow structure present. VERDICT: PASS

▸ Relative Position 100% vs 83%

Easy 100% vs 67%

bfl/flux-2-pro PASS

A cat sitting on top of a cardboard box

Evaluator (Qwen 3.5 122B)

The image shows a long-haired tabby cat sitting upright on top of a cardboard box. The cat’s body is positioned directly on the upper surface of the box, fulfilling the spatial relationship “A cat sitting on top of a cardboard box.” The box is resting on a wooden floor, and the cat is above the box, which is below the cat — consistent with “on” and “above/below” relationships. No other objects are present to require evaluation of “left/right” relationships, but the prompt only required the posit

bfl/flux-2-pro PASS

A red ball underneath a wooden chair

Evaluator (Qwen 3.5 122B)

The prompt specified: “A red ball underneath a wooden chair.” In the image: - There is a wooden chair. - There is a red, glossy ball. - The ball is positioned directly beneath the chair, between the front legs, and clearly under the seat structure — satisfying the “underneath” spatial relationship. The prompt only specified one spatial relationship: “underneath.” The image correctly depicts this. No other positional relationships (on, left, right, above, below) were specified in the prompt, s

bfl/flux-2-pro PASS

A coffee mug next to an open book on a desk

Evaluator (Qwen 3.5 122B)

The prompt specifies: "A coffee mug next to an open book on a desk." In the image: - Both the coffee mug and the open book are placed on the wooden desk — satisfying “on a desk.” - The mug is positioned to the left of the open book, and the book is to the right of the mug — satisfying “next to” (which implies adjacency, and here it is clearly left/right adjacency). - There are no explicit “on,” “under,” “above,” or “below” relationships required in the prompt — only “next to” and “on a desk.” T

Medium 100% vs 100%

bfl/flux-2-pro PASS

A bird perched on a branch, the branch extending from a tree next to a park bench

Evaluator (Qwen 3.5 122B)

The prompt specified: “A bird perched on a branch, the branch extending from a tree next to a park bench.” Spatial analysis: - A bird is perched on a branch → ✅ Correct. One bird is clearly on the branch. - The branch extends from a tree → ✅ Correct. The branch originates from the trunk of the tree. - The tree is next to a park bench → ✅ Correct. The bench is positioned adjacent to the tree trunk, within the park setting. All positional relationships described in the prompt are visually accura

bfl/flux-2-pro PASS

A cat sitting on a pillow, the pillow placed on a wooden chair

Evaluator (Qwen 3.5 122B)

The prompt specifies: "A cat sitting on a pillow, the pillow placed on a wooden chair." In the image: - The cat is sitting directly on top of the pillow → satisfies “cat sitting on a pillow”. - The pillow is resting on the seat of the wooden chair → satisfies “pillow placed on a wooden chair”. All positional relationships described in the prompt are visually accurate: - “on” is correctly used for both cat-on-pillow and pillow-on-chair. No other spatial relationships (left/right/above/below) a

bfl/flux-2-pro PASS

A candle on top of a stack of books, the books next to a potted plant

Evaluator (Qwen 3.5 122B)

The prompt specifies: "A candle on top of a stack of books, the books next to a potted plant." - The candle is indeed placed on top of the stack of books — this spatial relationship ("on top of") is correct. - The stack of books is positioned next to the potted plant — this spatial relationship ("next to") is also correct. All described positional relationships are accurately depicted in the image. VERDICT: PASS

Hard 100% vs 67%

bfl/flux-2-pro PASS

Five chess pieces on a board: a king in the center, a queen to its left, a bishop to its right, a rook above the king, and a knight below the king

Evaluator (Qwen 3.5 122B)

The image shows five chess pieces arranged on a chessboard: - A king is in the center (on the central square of the visible group). - A queen is to the left of the king (on the square immediately to the left). - A bishop is to the right of the king (on the square immediately to the right). - A rook is above the king (on the square directly above). - A knight is below the king (on the square directly below). All five pieces are clearly visible and positioned relative to the king as described in

bfl/flux-2-pro PASS

A desk scene: laptop in the center, coffee mug to the left, phone to the right, lamp above, and notebook below

Evaluator (Qwen 3.5 122B)

The prompt specified: "laptop in the center, coffee mug to the left, phone to the right, lamp above, and notebook below". - Laptop is centered on the desk — ✅ Correct. - Coffee mug is to the left of the laptop — ✅ Correct. - Phone is to the right of the laptop — ✅ Correct. - Lamp is positioned above the laptop — ✅ Correct. - Notebook is positioned below the laptop — ✅ Correct. All specified spatial relationships are accurately depicted in the image. VERDICT: PASS

bfl/flux-2-pro PASS

Five animals in a field: a horse on the far left, a cow to the right of the horse, a sheep in the center, a goat to the right of the sheep, and a dog on the far right

Evaluator (Qwen 3.5 122B)

The image shows five animals arranged in a line from left to right in a field: - On the far left: a brown horse. - Immediately to the right of the horse: a black-and-white cow. - In the center: a white sheep. - To the right of the sheep: a goat with horns. - On the far right: a dog (beagle-like) sitting. All five animals are positioned in a clear left-to-right sequence, matching the prompt’s description exactly. The spatial relationships specified — “horse on the far left,” “cow to the right

Extreme 100% vs 100%

bfl/flux-2-pro PASS

A landscape with a red barn in the foreground at large scale, a windmill in the midground at medium scale, and snow-capped mountains in the background at small scale, all with consistent depth perspective

Evaluator (Qwen 3.5 122B)

The image correctly depicts the spatial relationships as specified in the prompt: - The red barn is in the foreground at a large scale, occupying the right side of the frame and appearing closest to the viewer. - The windmill is in the midground at a medium scale, positioned to the left of the barn and behind it, at a noticeable distance. - The snow-capped mountains are in the background at a small scale, appearing farthest away and behind both the barn and windmill, consistent with depth persp

bfl/flux-2-pro PASS

A beach scene with a sandcastle in the foreground, a surfer on the waves in the midground, and a cruise ship on the horizon in the background, sizes decreasing with distance

Evaluator (Qwen 3.5 122B)

The image correctly depicts the spatial relationships as described in the prompt: - The sandcastle is in the foreground, clearly positioned closest to the viewer, on the sandy beach. - The surfer is in the midground, riding waves closer to the shore than the cruise ship but farther from the viewer than the sandcastle. - The cruise ship is on the horizon in the background, appearing small relative to the other elements, consistent with distance. All positional relationships (foreground, midgrou

bfl/flux-2-pro PASS

A city park with a park bench and pigeons in the foreground, a fountain in the midground, and skyscrapers in the background, each layer progressively smaller

Evaluator (Qwen 3.5 122B)

The image correctly depicts the spatial relationships as described in the prompt: - **Foreground**: A park bench is positioned closest to the viewer, with pigeons scattered on the ground around and near it. - **Midground**: A large, multi-tiered fountain is situated behind the bench, centered in the composition. - **Background**: Skyscrapers rise behind the fountain, appearing progressively smaller due to perspective, which creates a convincing sense of depth. All positional relationships (for

▸ Scale & Proportions 78% vs 89%

Easy 67% vs 100%

bfl/flux-2-pro PASS

A person standing next to a yellow school bus, showing realistic relative scale

Evaluator (Qwen 3.5 122B)

The person is standing next to the school bus, and their relative scale is realistic. The bus is a standard-sized school bus, and the person appears to be an average adult, fitting proportionally next to the vehicle. The bus’s windows, doors, wheels, and overall dimensions are consistent with real-world school buses, and the person’s height and posture are proportionally accurate in relation to the bus. The perspective and scale are consistent — the bus is clearly much larger than the person, as

bfl/flux-2-pro PASS

A ladybug sitting on a large green leaf, shown at realistic proportions

Evaluator (Qwen 3.5 122B)

The image shows a ladybug sitting on a large green leaf. The leaf is significantly larger than the ladybug, which is appropriately scaled relative to it — the ladybug occupies only a small portion of the leaf’s surface, consistent with realistic proportions. The ladybug’s size and shape are anatomically correct for its species, and the leaf’s scale is plausible for a real-world macro photograph. There is no distortion or exaggeration of size relationships. VERDICT: PASS

bfl/flux-2-pro FAIL

A child holding a basketball, both at correct relative sizes

Evaluator (Qwen 3.5 122B)

The child in the image is holding a basketball that is disproportionately large relative to his body size. A standard basketball for a child (typically size 2 or 3) should be roughly the size of the child’s torso or head, but here the ball is nearly as wide as the child’s torso and appears to be larger than a typical child’s basketball. This violates the “correct relative sizes” requirement. VERDICT: FAIL
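The 67% shown for this Easy tier is consistent with two passes out of the three bfl/flux-2-pro results listed above (PASS, PASS, FAIL). A minimal sketch of that arithmetic follows, assuming a pass rate is passes divided by evaluations, rounded to the nearest whole percent. The function name `pass_rate` is hypothetical, and the page's actual rounding rule is not stated.

```python
from typing import Sequence


def pass_rate(verdicts: Sequence[str]) -> int:
    """Percentage of "PASS" entries, rounded to the nearest whole percent.

    Note: Python's round() rounds exact halves to the nearest even integer,
    which may differ from the rounding used to produce the page's figures.
    """
    if not verdicts:
        raise ValueError("pass_rate needs at least one verdict")
    passes = sum(1 for v in verdicts if v == "PASS")
    return round(100 * passes / len(verdicts))


if __name__ == "__main__":
    # The three Easy results for bfl/flux-2-pro in Scale & Proportions.
    easy = ["PASS", "PASS", "FAIL"]
    print(pass_rate(easy))  # 67
```

Under the same assumption, seven passes out of nine evaluations gives 78, which matches the first figure in the Scale & Proportions heading.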

Medium 100% vs 67%

bfl/flux-2-pro PASS

A tiny person standing on the rim of a coffee cup, looking down into it

Evaluator (Qwen 3.5 122B)

The image shows a miniature figurine of a person standing on the rim of a coffee cup, leaning forward as if looking down into the cup. The scale is consistent: the person is clearly tiny in comparison to the cup, which is large and dominates the frame. The proportions of the person relative to the cup’s rim are plausible — the figurine’s feet are planted on the edge, and its body is proportionally scaled to fit the scene. The perspective and lighting reinforce the illusion of scale, and there ar

bfl/flux-2-pro PASS

A giant tabby cat walking between city skyscrapers like a kaiju

Evaluator (Qwen 3.5 122B)

The image depicts a tabby cat that is clearly oversized relative to the surrounding cityscape — its paws are crushing cars and breaking pavement, and its body towers over the skyscrapers and street-level elements. The cat’s scale is consistent with the “kaiju” concept described in the prompt, as it dominates the urban environment in a way that suggests it is a colossal creature. The relative proportions of the cat to the buildings, cars, and street are maintained throughout the scene, with the c

bfl/flux-2-pro PASS

A miniature house sitting on the palm of someone's hand

Evaluator (Qwen 3.5 122B)

The image shows a miniature house resting on the palm of a human hand. The house is clearly scaled down to fit comfortably within the palm, with its proportions — including the roof, windows, door, and base — appearing consistent with a realistic miniature model. The hand’s size relative to the house is proportionally accurate: the house is small enough to be held in the palm, yet detailed enough to suggest it’s a scaled-down replica of a real house. The perspective and lighting reinforce this s

Hard 67% vs 100%

bfl/flux-2-pro FAIL

A group photo of an elephant, a horse, a dog, a cat, and a mouse, all standing in a line with correct real-world proportional sizes

Evaluator (Qwen 3.5 122B)

The image displays an elephant, a horse, a dog, a cat, and a mouse arranged in a line. The relative sizes are not accurate to real-world proportions: the elephant is the largest, followed by the horse, then the dog, the cat, and finally the mouse as the smallest. But the scale differences between each animal are not consistent with reality. VERDICT: FAIL

bfl/flux-2-pro PASS

A dinner table scene with a plate, fork, knife, wine glass, salt shaker, and peppercorn, all at realistic relative sizes

Evaluator (Qwen 3.5 122B)

The image depicts a dinner table setting with a plate, fork, knife, wine glass, salt shaker, and peppercorns. The relative sizes of these objects are consistent with real-world dining table settings: - The plate is the largest item, appropriately sized for a main course. - The fork and knife are standard dinner utensils, proportionally sized to fit alongside the plate. - The wine glass is appropriately tall and slender, with a base that matches the scale of the plate. - The salt shaker is small

bfl/flux-2-pro PASS

A farm scene with a barn, tractor, farmer, cow, chicken, and a fence post, all at correct proportional scale

Evaluator (Qwen 3.5 122B)

The image depicts a farm scene with a barn, tractor, farmer, cow, chicken, and fence post — all of which are rendered with realistic proportional scaling relative to one another and to the environment. - The barn is large and dominates the background, consistent with real-world farm architecture. - The tractor is appropriately sized relative to the barn and the farmer — it’s a full-sized agricultural vehicle, not oversized or undersized. - The farmer is proportionally correct in relation to the

Human realism 86% vs 86%

▸ Faces & Expressions 75% vs 83%

Easy 100% vs 100%

bfl/flux-2-pro PASS

A close-up portrait of a young woman with a warm natural smile

Evaluator (Qwen 3 VL)

The face in the image is largely symmetrical with well-defined, natural-looking features: the eyes, eyebrows, nose, and mouth are balanced and proportionate. The smile is warm and natural, with the corners of the mouth lifting and teeth showing in a way that conveys genuine happiness. The emotion is clearly readable as joyful and inviting. There are no significant distortions—no unnatural asymmetry, warped features, or artifacts. The lighting and rendering are smooth and realistic, enhancing the

bfl/flux-2-pro PASS

A portrait of a middle-aged man with a calm, serious expression

Evaluator (Qwen 3 VL)

The face in the image is symmetric and features are anatomically correct — eyes, nose, mouth, and ears are proportionally aligned and natural. There are no obvious distortions or unnatural artifacts. The expression is calm and serious, conveyed through neutral lips, direct gaze, and subtle brow lines — consistent with the prompt. The lighting and composition enhance the seriousness without exaggeration. VERDICT: PASS

bfl/flux-2-pro PASS

A headshot of a teenager with a relaxed, neutral face

Evaluator (Qwen 3 VL)

The face in the image is largely symmetrical with correctly proportioned features: eyes, nose, and mouth are well-aligned and natural in appearance. The skin texture is smooth and realistic, with subtle freckles and natural lighting. The expression is neutral and relaxed, with no strong emotion conveyed — fitting the prompt’s request for a “relaxed, neutral face.” There are no obvious distortions or anatomical anomalies. The subject appears to be a young child or early teen, which aligns with th

Medium 100% vs 67%

bfl/flux-2-pro PASS

A close-up of a person's face showing clear disgust, nose wrinkled and lip curled

Evaluator (Qwen 3 VL)

The face in the image is not perfectly symmetric — the left side (from the viewer’s perspective) appears slightly more compressed and distorted compared to the right, especially around the nose and mouth. The nose is visibly wrinkled, and the lips are curled downward with teeth showing — both are strong indicators of disgust. The eyebrows are furrowed, and the skin around the eyes is tensed, reinforcing the emotional expression. However, the asymmetry and some unnatural skin texture (e.g., exagg

bfl/flux-2-pro PASS

A portrait capturing genuine surprise — raised eyebrows, wide eyes, slightly open mouth

Evaluator (Qwen 3 VL)

The face is largely symmetric with correctly rendered features: eyebrows are raised and arched, eyes are wide with visible irises and pupils, and the mouth is slightly open in an expression of surprise. The lighting and texture are natural, and there are no obvious distortions or anatomical anomalies. The emotion of surprise is clearly readable and matches the prompt. VERDICT: PASS

bfl/flux-2-pro PASS

A face showing quiet melancholy — downcast eyes, slightly drawn mouth, distant gaze

Evaluator (Qwen 3 VL)

The face is largely symmetric with correctly rendered features — eyes, nose, and mouth are proportionally aligned and anatomically plausible. There are no major distortions or unnatural deformities. The expression conveys quiet melancholy: the eyes are downcast, the gaze is distant, and the mouth is slightly drawn with a subtle downturn. The furrowed brow adds to the introspective, sorrowful mood. The lighting and composition enhance the emotional tone without introducing artifacts. VERDICT: PA

Hard 67% vs 67%

bfl/flux-2-pro PASS

An elderly woman in her 80s with deep wrinkles and white hair, smiling warmly with kind eyes

Evaluator (Qwen 3 VL)

The face is largely symmetrical with natural, correctly rendered features: the eyes, nose, and mouth are well-proportioned and balanced. The deep wrinkles around the eyes and mouth are consistent with an elderly woman in her 80s, and the white hair is styled naturally. The warm, genuine smile and kind, crinkled eyes clearly convey the intended emotion. There are no obvious distortions or unnatural artifacts; the lighting, skin texture, and expression all appear authentic and well-executed. VERD

bfl/flux-2-pro FAIL

A 5-year-old child with round cheeks and large eyes, crying with tears streaming down their face

Evaluator (Qwen 3 VL)

The face is largely symmetric with correctly proportioned features: the eyes, nose, and mouth are centered and balanced, and the cheeks appear round as requested. The large, expressive eyes and tears streaming down the face clearly convey the intended emotion of crying. There are no significant distortions — the facial structure is anatomically plausible, and the tears are realistically rendered. The child’s expression is emotionally clear and impactful. Tears are way too much. VERDICT: FAIL

bfl/flux-2-pro PASS

An elderly man in his 70s with age spots and thinning gray hair, looking skeptically over reading glasses

Evaluator (Qwen 3 VL)

The face is largely symmetrical, with natural asymmetry expected from aging and expression. Features are anatomically correct: eyes, nose, mouth, and ears are proportionally placed. Age spots are visible across the face and hands, consistent with the prompt. Thinning gray hair is present, swept back from the forehead. The expression is clearly skeptical — brows slightly furrowed, lips pressed in a neutral-to-disapproving line, gaze directed away as if questioning or disapproving. No significant

Extreme 33% vs 100%

bfl/flux-2-pro FAIL

A group of three friends at a table: one laughing with head tilted back, one looking shocked with wide eyes, and one rolling their eyes with a smirk

Evaluator (Qwen 3 VL)

The image shows three people at a table. 1. **Left Person:** A woman is laughing with her head tilted back. Her face is symmetric and features are correct. The emotion is clearly readable. 2. **Middle Person:** A woman is looking shocked with wide eyes. Her eyes are extremely wide, bordering on unnatural/cartoonish, but the emotion is readable. Her hand is touching her cheek; the fingers appear slightly elongated and thin, which is a minor distortion. 3. **Right Person:** A man is smirking.

bfl/flux-2-pro PASS

A family photo with four people: a father beaming with pride, a mother smiling gently, a teenager looking bored, and a child grinning excitedly

Evaluator (Qwen 3 VL)

All four faces in the image are symmetric with correct anatomical features — no obvious distortions, asymmetries, or unnatural deformities. Emotions are clearly readable and match the prompt: - Father: Beaming with pride (wide, genuine smile, eyes crinkled). - Mother: Smiling gently (soft, warm expression, slight smile). - Teenager: Looking bored/annoyed (furrowed brow, downturned mouth, arms crossed). - Child: Grinning excitedly (wide open-mouthed grin, expressive eyes, hands up in excitement)

bfl/flux-2-pro FAIL

Three coworkers in a meeting: one looking frustrated with furrowed brows, one nodding with a pleased expression, and one looking confused with a tilted head

Evaluator (Qwen 3 VL)

The image shows three individuals at a meeting table, each with distinct facial expressions. - The person on the left (man in blue shirt) has a furrowed brow and a grimacing, frustrated expression — this matches the prompt. - The person in the center (woman in gray shirt) has a wide, toothy smile and squinted eyes — this appears to be a pleased or amused expression, which aligns with the prompt. - The person on the right (man in light blue shirt) has a slightly open mouth and a look of mild sur

▸ Full Body 83% vs 92%

Easy 100% vs 100%

bfl/flux-2-pro PASS

A woman standing upright in a relaxed pose, arms at her sides, wearing jeans and a white t-shirt

Evaluator (Qwen 3 VL)

The image shows a full-body view of a woman standing upright with her arms at her sides, wearing a white t-shirt and blue jeans — exactly matching the prompt. - Limb count: Correct — two arms, two legs, visible on both sides. - Proportions: Anatomically accurate — head, torso, limbs are proportionally balanced and natural. - Pose: The woman is standing upright in a relaxed pose with arms at her sides — fully recognizable and matches the prompt. - Distortions: None — no unnatural bending, elonga

bfl/flux-2-pro PASS

A man standing straight facing the camera, hands in pockets, normal proportions

Evaluator (Qwen 3 VL)

The image shows a full-body portrait of a man standing straight and facing the camera. His hands are visibly placed in his pockets, and his posture is upright and neutral. The proportions of his limbs — arms, legs, torso, and head — appear anatomically correct and balanced. There are no visible distortions in limb count (he has two arms and two legs), and the proportions are consistent with a realistic human form. The pose is clearly recognizable as “standing straight, hands in pockets,” which m

bfl/flux-2-pro PASS

A person standing in a neutral pose in a park, full body visible from head to feet

Evaluator (Qwen 3 VL)

The image shows a full-body view of a person standing in a park, as requested. The subject is positioned centrally on a paved path, with trees and greenery in the background, matching the “park” setting. The pose is neutral: the person stands upright, arms relaxed at the sides, hands slightly curled, feet parallel and shoulder-width apart — a recognizable, natural standing posture. Anatomical inspection: - Limb count: Two arms, two legs — correct. - Proportions: Head, torso, limbs appear propor

Medium 100% vs 100%

bfl/flux-2-pro PASS

A woman mid-stride running through a park, hair and clothes showing motion

Evaluator (Qwen 3 VL)

The image depicts a woman running through a park, captured mid-stride. Her hair is flowing backward, and her clothing shows dynamic motion consistent with running. Anatomical assessment: - Limb count: Correct — two arms, two legs. - Proportions: Natural and proportional — shoulders, hips, and limbs are in realistic relation. - Pose: Anatomically correct — the running pose is dynamic and plausible, with one leg forward, the other back, arms bent and swinging, and torso slightly forward — all ty

bfl/flux-2-pro PASS

A man captured mid-jump over a puddle, body in the air with bent knees

Evaluator (Qwen 3 VL)

The man’s limbs are fully intact — two arms, two legs — and their proportions appear natural and consistent with human anatomy. His pose — mid-air with bent knees, arms extended for balance — is anatomically plausible for a jump over a puddle. The motion is captured dynamically, with water splashing realistically around his feet, and his reflection in the puddle adds to the authenticity. There are no visible distortions in limb count, joint alignment, or body proportions. The action described in

bfl/flux-2-pro PASS

A dancer in a dynamic spin, dress flowing outward with one leg extended

Evaluator (Qwen 3 VL)

The dancer’s limb count is correct: two arms, two legs, and the torso are all intact. Proportions appear anatomically plausible — the limbs are proportionate to the torso, and the musculature is rendered realistically without obvious exaggeration or distortion. The pose is a dynamic spin with one leg extended outward in a controlled, balletic manner, and the dress flows outward as described, suggesting motion. The action is recognizable as a classical dance movement, possibly a pirouette or a tu

Hard 67% vs 100%

bfl/flux-2-pro PASS

A person in a yoga tree pose — standing on one leg with the other foot pressed to the inner thigh, arms raised overhead

Evaluator (Qwen 3 VL)

The image shows a person performing a yoga pose. The pose matches the description: standing on one leg, with the other foot pressed against the inner thigh, and arms raised overhead with hands clasped. The limb count is correct (two arms, two legs). Proportions appear anatomically plausible — the body is balanced, the torso is upright, and the limbs are proportionate relative to the torso. The pose is clearly recognizable as the “Tree Pose” (Vrksasana), a standard yoga posture. There are no obvi

bfl/flux-2-pro FAIL

A rock climber clinging to a cliff face with arms and legs extended to different holds

Evaluator (Qwen 3 VL)

The climber has two arms and two legs — limb count is correct. Proportions appear anatomically plausible: arms are extended to holds, legs are bent and positioned for grip, and the body is angled appropriately for climbing. The pose is recognizable as a dynamic climbing stance — arms reaching for holds, legs kicking or pressing into footholds — consistent with real rock climbing. There are no obvious distortions in limb structure or joint placement. The climber’s gear (ropes, harness, carabiners

bfl/flux-2-pro PASS

A cellist seated with correct posture, legs around the instrument, bow arm extended

Evaluator (Qwen 3 VL)

The image shows a cellist seated with the cello between her legs, holding the bow in playing position. Her left hand is on the fingerboard, and her right hand holds the bow, which is drawn across the strings — consistent with the described action. The posture appears correct for playing the cello: spine upright, shoulders relaxed, and the instrument supported by the body. Limb count is normal (two arms, two legs), and proportions appear anatomically accurate. The cello is properly positioned, an

Extreme 67% vs 67%

bfl/flux-2-pro PASS

Two tango dancers in a close embrace, the man dipping the woman backward with intertwined legs

Evaluator (Qwen 3 VL)

The image depicts two tango dancers in a dramatic, close embrace. The woman is arched backward, her head tilted up, and her legs are intertwined with the man’s — one leg is wrapped around his waist, the other is extended backward, creating a classic tango dip. The man supports her with both hands, one around her waist and the other under her back, while his body leans forward to maintain balance. Anatomical analysis: - Limb count: Both dancers have all limbs present — arms, legs, hands, feet —

bfl/flux-2-pro FAIL

Two soccer players mid-tackle, one sliding and the other jumping over with the ball at their feet

Evaluator (Qwen 3 VL)

The image depicts two soccer players in a dynamic tackle scene. - **Player 1 (Sliding, White/Blue):** This player is sliding on the ground. His left leg is extended forward towards the ball. His right leg is extended backward. His arms are outstretched for balance. - **Player 2 (Jumping, Red/Black):** This player is jumping over the sliding player. His left leg is bent with the foot high in the air. His right leg is extended downward towards the ball. **Anatomical Analysis & Distortions:** - **

bfl/flux-2-pro PASS

Two gymnasts performing a synchronized handstand, side by side with identical body alignment

Evaluator (Qwen 3 VL)

The image shows two gymnasts performing a synchronized handstand side by side, with their bodies nearly perfectly aligned — feet touching, arms straight, and torsos parallel. Their limb count is correct (two arms, two legs per person), and proportions appear anatomically accurate, with no obvious limb duplication or missing limbs. The pose is a recognizable, advanced gymnastic skill — a synchronized handstand with feet together, often seen in acrobatic or tumbling routines. However, there are s

▸ Hands 92% vs 83%

Easy 100% vs 100%

bfl/flux-2-pro PASS

A person standing with both hands visible at their sides, fingers relaxed and naturally spread

Evaluator (Qwen 3 VL)

Both hands are visible at the person’s sides, fingers relaxed and naturally spread, as per the prompt. Counting the fingers on each hand: - Left hand: Thumb, index, middle, ring, pinky — 5 fingers. - Right hand: Thumb, index, middle, ring, pinky — 5 fingers. Anatomically, the hands are correctly formed. The fingers are relaxed, slightly curled downward, which is natural for a standing pose with hands at the sides. No extra fingers, missing fingers, or anatomical distortions are present. VERDI

bfl/flux-2-pro PASS

A woman holding her hands out in front of her palms-up, all ten fingers clearly visible

Evaluator (Qwen 3 VL)

Both hands are clearly visible, palms facing forward, with all ten fingers extended and individually discernible. Left hand: 5 fingers (thumb, index, middle, ring, pinky) — all present and anatomically correct. Right hand: 5 fingers (thumb, index, middle, ring, pinky) — all present and anatomically correct. The gesture is anatomically plausible — palms up, fingers spread — and matches the prompt. No anomalies such as extra or missing fingers, fused digits, or unnatural deformities are present.

bfl/flux-2-pro PASS

A man with arms crossed, both hands resting on opposite upper arms with fingers visible

Evaluator (Qwen 3 VL)

Both hands are visible in the image, resting on opposite upper arms as described in the prompt. Left hand (on right arm): 5 fingers clearly visible — thumb, index, middle, ring, and pinky. No anomalies. Right hand (on left arm): 5 fingers clearly visible — thumb, index, middle, ring, and pinky. No anomalies. The grip is anatomically plausible: the hands are positioned with fingers slightly curled and resting on the upper arms, which is a common and natural pose for crossed arms. The musculatu

Medium 100% vs 67%

bfl/flux-2-pro PASS

A person holding a white ceramic coffee mug by the handle with their right hand

Evaluator (Qwen 3 VL)

The visible hand in the image is gripping the handle of the white ceramic mug. Counting the fingers: - Thumb: clearly visible, wrapped around the handle. - Index finger: visible, pressing against the top of the handle. - Middle finger: visible, curled around the handle. - Ring finger: visible, curled around the handle. - Pinky finger: visible, curled around the handle. That’s five fingers — thumb, index, middle, ring, and pinky — all clearly present and anatomically positioned. The grip is al

bfl/flux-2-pro PASS

A woman writing in a notebook with a pen gripped correctly in her right hand

Evaluator (Qwen 3 VL)

The right hand gripping the pen is clearly visible. Counting the fingers: thumb, index, middle, ring, and pinky — exactly five fingers. The grip is anatomically correct: the pen is held between the thumb and index finger, with the other fingers supporting the hand, consistent with a standard tripod grip used for writing. The left hand is resting on the notebook, also showing five fingers, though less clearly. No anomalies are visible — no extra or missing fingers, no unnatural joint angles, and

bfl/flux-2-pro PASS

A person playing an acoustic guitar with left hand on the fretboard and right hand strumming

Evaluator (Qwen 3 VL)

Left hand (on fretboard): 5 fingers visible — thumb, index, middle, ring, pinky — all anatomically positioned to press strings. The thumb is wrapped around the back of the neck, and fingers are pressing down on strings, which is correct for fretting. Right hand (strumming): 5 fingers visible — thumb, index, middle, ring, pinky — though the thumb is partially obscured, the other four are clearly visible. The hand is positioned as if strumming, with fingers slightly curled and palm angled toward

Hard 100% vs 100%

bfl/flux-2-pro PASS

A person making a peace sign with their right hand, clearly showing two fingers up and three curled

Evaluator (Qwen 3 VL)

The image shows a right hand making a peace sign gesture, with the index and middle fingers extended upward and the other three fingers (ring, pinky, and thumb) curled inward. Counting the visible fingers: - Index finger: visible (up) - Middle finger: visible (up) - Ring finger: curled (visible) - Pinky finger: curled (visible) - Thumb: curled (visible) That’s five fingers clearly visible and accounted for — one of the two “up” fingers (index or middle) is not curled, but the other three are

bfl/flux-2-pro PASS

A hand giving a thumbs-up with four fingers curled and thumb extended upward

Evaluator (Qwen 3 VL)

The image shows a single hand performing a thumbs-up gesture. The thumb is extended upward, and the other four fingers are curled inward, as requested in the prompt. Counting the fingers: - Thumb: 1 - Index finger: 2 - Middle finger: 3 - Ring finger: 4 - Little finger: 5 All five fingers are clearly visible and anatomically present. The gesture is anatomically correct: the thumb is extended upward, and the other four fingers are curled into a fist-like shape, which is a standard and natural w

bfl/flux-2-pro PASS

A person counting to three on their fingers, with index, middle, and ring fingers extended

Evaluator (Qwen 3 VL)

The image shows one hand prominently in the foreground, with the index, middle, and ring fingers extended — matching the prompt’s description of “counting to three.” The thumb and pinky are curled inward, and the palm is facing forward. The other hand is partially visible behind the first, but it is mostly obscured and not clearly defined — only the thumb and part of the palm are visible. Counting fingers on the visible hand: - Index finger: extended ✅ - Middle finger: extended ✅ - Ring finger:

Extreme 67% vs 67%

bfl/flux-2-pro PASS

Two people shaking hands in a professional setting, all fingers of both hands anatomically correct

Evaluator (Qwen 3 VL)

Each visible hand in the image clearly shows five fingers — thumb, index, middle, ring, and little — with no missing or extra digits. The grip is a standard professional handshake: palms facing each other, fingers interlocked, thumbs resting on the outer side of the grip. The anatomy is consistent with real human hands — knuckles, finger joints, and nail beds are proportionate and correctly aligned. There are no anomalies such as fused fingers, malformed joints, or unnatural positioning. VERDIC

bfl/flux-2-pro FAIL

Two people doing a high-five, palms meeting with all fingers extended on both hands

Evaluator (Qwen 3 VL)

The image shows two hands pressed together in a gesture that resembles a high-five, but upon close inspection, the hands are not in a true high-five pose. Instead, the palms are pressed together with fingers interlocked and slightly curled inward — more akin to a prayer or a gesture of greeting, not a high-five. Regarding finger count: - On the left hand (darker skin tone), all five fingers are visible: thumb, index, middle, ring, and pinky. - On the right hand (lighter skin tone), all five fin

bfl/flux-2-pro PASS

One person handing a small gift box to another person, both sets of hands gripping the box correctly

Evaluator (Qwen 3 VL)

Both hands are clearly visible, and each hand shows all five fingers: thumb, index, middle, ring, and pinky. No fingers are missing or duplicated. The grip is anatomically correct. The hands are positioned to support the gift box from below and sides, with thumbs and fingers wrapping around the box in a natural, stable manner. The thumbs are on the sides, and the fingers are curled around the box’s edges — a typical, functional way to hold and present a gift. The gesture is also consistent with

▸ Multi-Subject 100% vs 83%

Easy 100% vs 67%

bfl/flux-2-pro PASS

Two women standing side by side: one with short blonde hair wearing a red jacket, the other with long black hair wearing a blue dress

Evaluator (Qwen 3 VL)

Person 1: - Hair: Short, blonde, styled neatly. - Clothing: Wearing a red jacket over a beige top. - Position: Standing on the left side of the image. Person 2: - Hair: Long, straight, black. - Clothing: Wearing a blue dress with flowing sleeves. - Position: Standing on the right side of the image. Both individuals are visually distinct and match the described features exactly. VERDICT: PASS

bfl/flux-2-pro PASS

Two men at a cafe: one with a beard and glasses wearing a green sweater, the other clean-shaven in a black hoodie

Evaluator (Qwen 3 VL)

Person 1 (left): - Has a full beard. - Wears black-framed glasses. - Wears a green textured sweater. - Visually distinct and matches description. Person 2 (right): - Clean-shaven. - Wears a black hoodie. - Visually distinct and matches description. Both individuals are present, visually distinct, and match their described features. VERDICT: PASS

bfl/flux-2-pro PASS

Two friends posing: one tall with curly red hair in denim overalls, the other short with straight brown hair in a yellow sundress

Evaluator (Qwen 3 VL)

Person 1 (tall, curly red hair, denim overalls): - Hair: Curly, vibrant red, long, and voluminous. - Clothing: Wearing denim overalls over a white t-shirt. - Height: Appears taller than the other person, consistent with the description. Person 2 (short, straight brown hair, yellow sundress): - Hair: Straight, medium-length, brownish-blonde with darker roots. - Clothing: Wearing a yellow, sleeveless, tiered sundress. - Height: Appears shorter than the first person, consistent with the descriptio

Medium 100%vs100%

bfl/flux-2-pro PASS

Four people at a dinner table: a bald man in a suit, a woman with silver hair in a red blouse, a young man with dreadlocks in a denim jacket, and a woman with a hijab in a green dress

Evaluator (Qwen 3 VL)

- Bald man in a suit: Present. He is on the far left, bald, wearing a dark suit with a white shirt.
- Woman with silver hair in a red blouse: Present. She is seated next to the bald man, has silver-gray hair, and is wearing a red blouse.
- Young man with dreadlocks in a denim jacket: Present. He is seated across from the silver-haired woman, has dark dreadlocks, and is wearing a denim jacket over a light-colored shirt.
- Woman with a hijab in a green dress: Present. She is on the far right, wear

bfl/flux-2-pro PASS

Three musicians on stage: a tall woman with short pink hair playing bass, a man with a beard and bandana on drums, and a slim man with glasses playing keyboard

Evaluator (Qwen 3 VL)

Person 1 (left): Tall woman with short pink hair playing bass — Observed: She has short, vibrant pink hair, is playing a bass guitar, and appears tall and slender. Matches description. Person 2 (center): Man with a beard and bandana on drums — Observed: He has a full beard, is wearing a bandana, and is seated behind a drum kit. Matches description. Person 3 (right): Slim man with glasses playing keyboard — Observed: He is slim, wearing glasses, and is playing a keyboard. Matches description.

bfl/flux-2-pro PASS

Three hikers on a trail: a man with a backpack and sunhat, a woman with braided hair and hiking poles, and a teenager with a baseball cap and water bottle

Evaluator (Qwen 3 VL)

Person 1 (left, man): - Backpack: Present, large, gray, properly strapped on. - Sunhat: Present, wide-brimmed, tan/brown, worn correctly. - Other traits: Bearded, wearing sunglasses, hiking boots, light-colored shirt and shorts. → Matches description. Person 2 (center, woman): - Braided hair: Present, long, dark, styled in a single thick braid over her shoulder. - Hiking poles: Present, blue and black, held in both hands. - Other traits: Wearing athletic gear (grey shirt, leggings, hiking shoes

Truthfulness 70%vs63%

▸Photorealism 100%vs100%

Medium 100%vs100%

bfl/flux-2-pro PASS

A still life on a wooden table featuring a clear glass of water, a polished silver spoon, a folded linen napkin, and a ripe red apple — each material rendered with accurate texture and light behavior

Evaluator (Qwen 3.6 27B)

The image depicts a still life on a wooden table with a glass of water, a silver spoon, a folded linen napkin, and a red apple — all of which are rendered with attention to material-specific visual properties. - **Glass (water glass)**: The glass shows realistic transparency, refraction of light through the water, and reflections of the surroundings. The water’s surface is smooth and reflects the environment accurately. - **Metal (silver spoon)**: The spoon exhibits high reflectivity with sharp

bfl/flux-2-pro PASS

A workspace desk with a brushed aluminum laptop, a leather-bound notebook, a ceramic coffee mug, and a glass paperweight — all materials visually distinct and realistic

Evaluator (Qwen 3.6 27B)

The image depicts a workspace desk with four key items: a laptop, a notebook, a coffee mug, and a paperweight — each intended to represent specific materials. 1. **Brushed aluminum laptop**: The laptop’s body shows a brushed metal texture with soft, directional highlights and realistic reflections. The surface reflects ambient light and nearby objects appropriately, consistent with brushed aluminum. PASS. 2. **Leather-bound notebook**: The notebook’s cover has a textured, creased surface with

bfl/flux-2-pro PASS

A kitchen counter with a stainless steel pot, a wooden cutting board, a glass bottle of olive oil, and a cotton dish towel — each surface reflecting light correctly

Evaluator (Qwen 3.6 27B)

The image depicts a kitchen counter with the requested items: a stainless steel pot, a wooden cutting board, a glass bottle of olive oil, and a white cotton dish towel. Let’s evaluate each material’s visual properties: 1. Stainless Steel Pot: The pot exhibits a smooth, metallic surface with clear, coherent reflections of the surrounding environment (window, light source). The specular highlights are realistic for polished metal. ✅ PASS. 2. Wooden Cutting Board: The wood shows natural grain tex

▸Physics & Reflections 58%vs75%

Easy 67%vs33%

bfl/flux-2-pro FAIL

A red sphere on a white surface with a single shadow cast to the right, consistent with a light source from the upper left

Evaluator (Qwen 3.6 27B)

The shadow is cast to the right, which is consistent with a light source from the upper left — this direction is correct. The shadow’s shape is soft and elongated, appropriate for a diffuse or slightly directional light source hitting a spherical object. The shadow’s edge is clean and falls naturally under the sphere, with a slight gradient that suggests realistic falloff. However, the shadow’s color is not physically plausible. A red sphere should cast a shadow that is mostly neutral or slight

bfl/flux-2-pro PASS

A coffee mug on a sunny windowsill casting a sharp shadow to the left

Evaluator (Qwen 3.6 27B)

The shadow cast by the mug is directed to the left, which is consistent with a light source (sun) coming from the right. The shadow’s shape accurately follows the mug’s form — including the handle — and its edges are sharp and well-defined, which is plausible for direct sunlight. The shadow’s length and angle suggest a high sun position, which is reasonable for a “sunny” windowsill scene. The shadow’s color is appropriately dark and consistent with the ambient lighting, with no unnatural color c

bfl/flux-2-pro PASS

A person standing outdoors at midday with a short shadow directly below them

Evaluator (Qwen 3.6 27B)

The shadow is cast directly beneath the person, consistent with midday sun when the sun is high overhead. The shadow’s length and direction are physically plausible — it is short and centered under the feet, as expected under direct overhead illumination. The shadow’s shape is recognizable and correctly oriented, matching the person’s silhouette without distortion. The color of the shadow is appropriately dark and consistent with a strong, direct light source. There are no unnatural artifacts or
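The "short shadow at midday" criterion follows from simple trigonometry: on flat ground, shadow length is the subject's height divided by the tangent of the sun's elevation angle. A minimal sketch, assuming a hypothetical 1.7 m subject and illustrative elevation angles (neither comes from the evaluation itself):

```python
import math


def shadow_length(height_m: float, sun_elevation_deg: float) -> float:
    """Length of the shadow cast on flat ground by an upright object."""
    return height_m / math.tan(math.radians(sun_elevation_deg))


# Hypothetical 1.7 m person under a high midday sun versus a lower sun.
midday = shadow_length(1.7, 80.0)  # sun close to overhead
lower_sun = shadow_length(1.7, 45.0)  # shadow as long as the person is tall
```

The closer the elevation is to 90 degrees, the nearer the shadow length gets to zero, which is why a shadow pooled under the feet reads as midday.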

Medium 0%vs67%

bfl/flux-2-pro FAIL

A woman standing in front of a full-length mirror, her reflection accurately showing the back of her outfit and reversed orientation

Evaluator (Qwen 3.6 27B)

The reflection in the mirror correctly shows the reversed orientation of the woman — her left side appears on the right in the reflection, and her right side on the left — which is physically accurate for a mirror image. The reflection also shows the back of her outfit (the back of the dress is visible in the mirror, as expected), though the dress is a front-facing garment, so the “back” is not fully visible unless she’s turned — but since she’s standing facing the mirror, the reflection shows h

bfl/flux-2-pro FAIL

A cat sitting on a polished marble floor with its reflection visible below

Evaluator (Qwen 3.6 27B)

The reflection of the cat on the polished marble floor is mostly physically plausible. The reflection is inverted vertically, as expected from a surface reflection, and the cat’s form is recognizable, though slightly softened and less detailed due to the reflective surface’s properties. The reflection’s distortion is minimal and consistent with a smooth, glossy floor — the cat’s outline and features are mirrored correctly, with slight blurring and loss of fine detail, which is typical for reflec

bfl/flux-2-pro FAIL

A vase of flowers on a glass table with a clear reflection visible on the table surface

Evaluator (Qwen 3.6 27B)

The reflection on the glass table is mostly accurate in terms of orientation and color fidelity — the vase and flowers are mirrored correctly, with the reflection showing the same arrangement, colors, and proportions as the real object above. The distortion is minimal and consistent with a flat, smooth glass surface, and the reflection appears slightly dimmer and softer, which is physically plausible due to surface transmission and ambient lighting. However, there are subtle inconsistencies: -

Hard 100%vs100%

bfl/flux-2-pro PASS

A row of trees reflected in a calm lake at sunset, with the reflection slightly darker and gently rippled

Evaluator (Qwen 3.6 27B)

The reflection in the image is physically plausible. The trees are mirrored in the water with correct vertical orientation and symmetry, consistent with a calm surface. The reflection is slightly darker than the actual trees, as expected due to absorption and scattering in water. The gentle ripples are visible, distorting the reflection slightly — not perfectly crisp — which is realistic for a calm but not perfectly still lake. The color of the reflection accurately mirrors the sunset hues above

bfl/flux-2-pro PASS

A city skyline reflected in a rain puddle on a street, with mild distortion and color shift

Evaluator (Qwen 3.6 27B)

The reflection in the puddle shows the city skyline inverted as expected in a concave water surface, with the tallest buildings centered and mirrored vertically — consistent with physics. The distortion is mild and plausible: ripples create concentric waves that slightly warp the reflection, and the edges of the puddle are naturally curved, causing perspective distortion. The color shift is subtle — the reflection has a cooler, slightly desaturated tone compared to the actual buildings, likely d

bfl/flux-2-pro PASS

A mountain range reflected in a still alpine lake, with the reflection inverted and subtly muted in color

Evaluator (Qwen 3.6 27B)

The reflection in the lake is inverted as expected — the peaks appear upside down relative to the actual mountains, which is physically correct for a calm water surface. The orientation is accurate: the reflection mirrors the mountain’s shape vertically, with the highest points at the bottom of the reflection and the base at the top — a hallmark of a true mirror image. The color is subtly muted, as requested — the reflection is slightly desaturated and cooler in tone compared to the actual moun

Extreme 67%vs100%

bfl/flux-2-pro FAIL

A clear glass sphere sitting on a mirror surface near a pool of water, showing refraction through the sphere, reflection on the mirror, and water reflection simultaneously

Evaluator (Qwen 3.6 27B)

The image depicts a clear glass sphere resting on a reflective surface (likely a mirror or polished glass) that is partially submerged in water, with ripples visible. The sphere shows refraction (distorted, inverted, and magnified view of the sky and trees inside), reflection (the mirror surface below shows a clear, upright reflection of the sphere and its surroundings), and water reflection (ripples and distorted reflections around the sphere). However, there is a critical physical inconsisten

bfl/flux-2-pro PASS

A chrome sculpture in a room with a glass floor and large windows, showing multiple reflections and environmental mapping on every surface

Evaluator (Qwen 3.6 27B)

The reflections and shadows in the image are largely physically plausible. - The chrome sculpture reflects the surrounding environment accurately: the cityscape visible through the windows is clearly mirrored in the sculpture’s curved surfaces, with correct perspective and distortion based on the curvature. The reflections are coherent and consistent with the viewing angle. - The glass floor reflects the sculpture and the cityscape below, with slight distortion from the glass panels’ geometry —

bfl/flux-2-pro PASS

A crystal wine glass on a polished marble counter next to a window overlooking a lake — showing refraction in the glass, reflection on the marble, and the lake's own reflective surface

Evaluator (Qwen 3.6 27B)

The image depicts a crystal wine glass on a polished marble countertop next to a window overlooking a lake, with the lighting and reflections appearing to be from a sunset or sunrise. Let’s evaluate the physical plausibility of the shadows and reflections as requested: 1. Reflection on the marble: The polished marble surface shows a clear, coherent reflection of the wine glass and the window frame. The reflection is symmetric and oriented correctly — the glass’s base and stem are mirrored accur
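The Physics & Reflections percentages shown for bfl/flux-2-pro can be reproduced from the twelve verdicts listed in this subcategory. The sketch below assumes every prompt is weighted equally and that the page rounds to the nearest whole percent; the page does not state either rule, though both are consistent with the displayed figures.

```python
# Verdicts for bfl/flux-2-pro, transcribed from the Physics & Reflections entries.
verdicts = {
    "Easy": ["FAIL", "PASS", "PASS"],
    "Medium": ["FAIL", "FAIL", "FAIL"],
    "Hard": ["PASS", "PASS", "PASS"],
    "Extreme": ["FAIL", "PASS", "PASS"],
}


def pass_rate(results):
    """Percentage of PASS verdicts in a list of results."""
    return 100.0 * sum(r == "PASS" for r in results) / len(results)


# Per-difficulty rates, rounded to whole percents (assumed rounding rule).
per_tier = {tier: round(pass_rate(rs)) for tier, rs in verdicts.items()}

# Subcategory rate: pool all prompts with equal weight (assumed aggregation).
all_results = [r for rs in verdicts.values() for r in rs]
overall = round(pass_rate(all_results))
```

Pooling gives 7 passes out of 12 prompts, which matches the 58% in the subcategory header.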

▸World Knowledge 75%vs42%

Easy 100%vs67%

bfl/flux-2-pro PASS

The Eiffel Tower in Paris on a clear day, architecturally accurate with its iron lattice structure and three levels

Evaluator (Qwen 3.6 27B)

The depicted subject is the Eiffel Tower in Paris, clearly recognizable and presented under a clear sky, matching the prompt’s request. The image accurately shows its iconic iron lattice structure, including the three levels: the first level (visible as the wide platform near the base), the second level (the narrower platform above), and the top observation deck. The architectural details, including the trusswork, arches, and overall silhouette, are factually accurate and consistent with the rea

bfl/flux-2-pro PASS

The Taj Mahal in Agra seen from the reflecting pool, with accurate dome, minarets, and symmetry

Evaluator (Qwen 3.6 27B)

The depicted subject is the Taj Mahal in Agra, viewed from the reflecting pool, and the composition captures the iconic symmetry and architectural elements requested. The large central dome, four minarets, and the overall layout are recognizable and largely accurate. The reflecting pool and its reflection are correctly rendered, and the surrounding gardens and trees are consistent with the real-world setting. However, there are minor inaccuracies: - The reflection in the pool is overly perfect

bfl/flux-2-pro PASS

The Statue of Liberty on her island pedestal, with correct torch, crown, tablet, and green patina

Evaluator (Qwen 3.6 27B)

The depicted subject is clearly recognizable as the Statue of Liberty, standing on her island pedestal with the surrounding water and sky. The key elements requested in the prompt are all present and factually accurate: - Torch: The statue holds a raised torch with a golden flame. - Crown: The seven-pointed crown is clearly visible on her head. - Tablet: She holds a tablet in her left hand, inscribed with “JULY IV MDCCLXXVI” (July 4, 1776). - Green patina: The statue’s surface shows the charact

Medium 67%vs0%

bfl/flux-2-pro PASS

A traditional Japanese shamisen instrument with accurate three-stringed design, wooden body, and bachi plectrum

Evaluator (Qwen 3.6 27B)

The depicted subject is a shamisen, a traditional Japanese stringed instrument. It features: - A wooden body with a white, leather-covered soundboard (correct). - Three strings (correct). - A long neck with tuning pegs at the headstock (correct). - A bachi (plectrum) placed beside it (correct). - The instrument’s shape and construction are consistent with a standard shamisen. The strings are anchored with a small wooden bridge, and the tuning pegs are correctly positioned. The bachi is appropr

bfl/flux-2-pro FAIL

A medieval astrolabe with accurate rete, tympan, and engraved degree markings on brass

Evaluator (Qwen 3.6 27B)

The depicted object is visually recognizable as an astrolabe — a circular brass instrument with concentric rings, a rotating rete (star pattern), and degree markings — which aligns with the prompt’s request for a “medieval astrolabe with accurate rete, tympan, and engraved degree markings on brass.” However, upon closer inspection, several inaccuracies are present: 1. The “rete” (the star pattern) is not properly aligned with the “tympan” (the fixed plate with degree markings). In a real astro

bfl/flux-2-pro PASS

A Japanese kintsugi bowl — cracked ceramic repaired with visible gold lacquer lines

Evaluator (Qwen 3.6 27B)

The image depicts a ceramic bowl with visible cracks repaired with gold lacquer lines — a classic representation of kintsugi, the Japanese art of repairing broken pottery with gold. The bowl is white, has a traditional shape, and the gold seams are clearly visible and follow the natural fracture lines. The base of the bowl also shows gold repairs, which is consistent with authentic kintsugi practice. The material texture, lighting, and composition are realistic and do not show obvious AI artifac

Hard 33%vs0%

bfl/flux-2-pro PASS

A detailed diagram of the water cycle showing evaporation from the ocean, condensation forming clouds, precipitation as rain, and collection in rivers flowing back to the sea

Evaluator (Qwen 3.6 27B)

The image depicts a diagram labeled “The Water Cycle” and attempts to show the key stages: evaporation from the ocean, condensation forming clouds, precipitation as rain, and collection in rivers flowing back to the sea. The labels and arrows generally align with the intended scientific process. However, there is a factual error: the label “Collection” is incorrectly placed. In the water cycle, “collection” (or runoff) refers to water flowing from land back into bodies of water like rivers and

bfl/flux-2-pro FAIL

A cross-section showing tectonic plates at a convergent boundary with subduction, a trench, volcanic arc, and layered mantle

Evaluator (Qwen 3.6 27B)

The image depicts a cross-section of a convergent plate boundary, showing key features such as the subducting oceanic plate, a trench, a volcanic arc, and the mantle — all of which are recognizable and align with standard geological models of subduction zones. However, there is a critical factual error: the label “Subductioal Pone” is misspelled and appears to be an incorrect term. The correct term should be “Subduction Zone” or “Subducting Plate.” This is a significant inaccuracy, as it misrep

bfl/flux-2-pro FAIL

An anatomical illustration of the human heart showing four chambers, valves, aorta, pulmonary arteries, and veins with correct placement

Evaluator (Qwen 3.6 27B)

The image depicts a diagram of the human heart with labels pointing to various anatomical structures. However, multiple labels contain significant spelling errors and misidentifications: - “Miriral veentricl” → should be “Mitral valve” (not “ventricle”) - “Vovtuve veintiricles” → should be “Pulmonary veins” (not “ventricles”) - “Pulmonarfid valve” → should be “Pulmonary valve” (appears twice incorrectly) - “Arotic valve” → should be “Aortic valve” - “Ventricle asending trunk (blve)” → should be

Extreme 100%vs100%

bfl/flux-2-pro PASS

A Morpho menelaus butterfly with accurate iridescent blue dorsal wings, brown ventral wing pattern with eyespots, and correct body morphology

Evaluator (Qwen 3.6 27B)

The depicted subject is a Morpho menelaus butterfly, and it is recognizable as such. The dorsal (upper) wings display a vivid, iridescent blue — a hallmark of this species — with a brown border and subtle white markings. The ventral (lower) wings are brown with a distinct pattern of eyespots (ocelli), which is accurate. The body morphology, including the antennae, legs, and thorax, appears anatomically correct for a butterfly. However, there is one notable inaccuracy: the eyespots on the ventra

bfl/flux-2-pro PASS

The Hagia Sophia interior showing the massive central dome, pendentives, semi-domes, and Islamic calligraphy medallions alongside Byzantine mosaics

Evaluator (Qwen 3.6 27B)

The image depicts the interior of the Hagia Sophia in Istanbul. It clearly shows the massive central dome with its ring of windows, the semi-domes cascading downwards, and the pendentives supporting the structure. Crucially, it features the large green Islamic calligraphy medallions (depicting Allah and Muhammad) hanging prominently, set against the background of golden Byzantine mosaics featuring religious figures. The architectural layout, including the columns and arches, is consistent with

bfl/flux-2-pro PASS

A specimen of bismuth crystal showing accurate iridescent oxide layers, geometric hopper crystal formation, and characteristic staircase structure

Evaluator (Qwen 3.6 27B)

The depicted subject is a bismuth crystal, and it is recognizable as such. It exhibits: - Iridescent oxide layers: The surface displays a vivid, multicolored sheen (blues, purples, golds) characteristic of bismuth’s oxide coating, which is a well-documented phenomenon. - Geometric hopper crystal formation: The crystal shows a stepped, hollowed-out, or “hopper” structure, where the crystal grows outward from the center, leaving recessed areas — a classic feature of bismuth crystals. - Staircase

Professional Studio 100%vs82%

▸Camera & Lighting 100%vs83%

Easy 100%vs100%

bfl/flux-2-pro PASS

A wide-angle shot of a modern living room interior showing expansive space with slight barrel distortion at the edges

Evaluator (Gemma 4 26B)

The image presents a wide-angle shot of a modern living room, and the perspective clearly exhibits barrel distortion at the edges — particularly noticeable in the curved distortion of the floor, walls, and ceiling near the frame’s periphery. This matches the prompt’s request for “slight barrel distortion at the edges.” The depth of field appears uniformly sharp across the entire scene, which is typical for a wide-angle interior shot with even lighting — no shallow depth of field is present, but

bfl/flux-2-pro PASS

A close-up macro shot of a single dewdrop on a rose petal, shallow depth of field

Evaluator (Gemma 4 26B)

The image presents a close-up macro shot of a single dewdrop resting on a rose petal, with a shallow depth of field that renders the background and surrounding petals softly blurred. The perspective is tightly focused on the dewdrop, capturing its spherical form and the fine texture of the petal beneath it. The lighting is soft and diffused, creating gentle reflections and highlights on the water droplet without harsh shadows, which enhances the macro detail and realism. All elements of the pro

bfl/flux-2-pro PASS

An aerial bird's-eye view looking straight down at a traffic roundabout with cars and trees

Evaluator (Gemma 4 26B)

The image presents a clear aerial bird’s-eye view looking straight down at a traffic roundabout, with cars circulating around a central landscaped island and trees surrounding the area. The perspective is top-down and centered, matching the requested viewpoint. The depth of field appears uniform across the scene, with no noticeable blur or focus falloff — consistent with a high-resolution aerial shot. Lighting is bright and consistent with daylight, casting sharp, coherent shadows from the cars

Medium 100%vs100%

bfl/flux-2-pro PASS

A portrait lit with classic Rembrandt lighting — triangle of light on the shadowed cheek, moody atmosphere

Evaluator (Gemma 4 26B)

The image displays a portrait with a strong, directional light source coming from the upper left, casting a distinct triangular highlight across the subject’s cheek and forehead — a hallmark of Rembrandt lighting. The shadowed side of the face (right cheek and under the nose) is deeply rendered, creating a moody, dramatic atmosphere consistent with the prompt. The background is dark and non-distracting, enhancing the focus on the subject. The depth of field appears shallow, with the subject’s fa

bfl/flux-2-pro PASS

A landscape photograph bathed in golden hour light — long warm shadows, orange sky, soft glow

Evaluator (Gemma 4 26B)

The image depicts a landscape bathed in golden hour light, with the sun low on the horizon casting long, warm shadows across rolling hills. The sky transitions from a soft orange near the sun to a pale blue above, creating a gentle gradient. Sunbeams radiate from the sun, adding to the “soft glow” effect. The perspective is wide and immersive, capturing depth through layered terrain and trees receding into the distance. The depth of field appears shallow to moderate — foreground grasses are in f

bfl/flux-2-pro PASS

A high-key portrait of a person against a bright white background, minimal shadows, airy and luminous

Evaluator (Gemma 4 26B)

The image presents a high-key portrait with a bright white background, as specified. The lighting is soft and even, minimizing shadows across the face and creating an airy, luminous effect. The depth of field is shallow, keeping the subject’s face in sharp focus while the background is uniformly blurred, enhancing the clean, studio-like aesthetic. The perspective is centered and frontal, typical for high-key portrait photography. All elements align with the prompt’s description. VERDICT: PASS

Hard 100%vs67%

bfl/flux-2-pro PASS

A portrait shot at 85mm f/1.4 with creamy bokeh background, backlit with warm rim light separating the subject from a blurred garden

Evaluator (Gemma 4 26B)

The image shows a portrait with a shallow depth of field, producing a creamy, blurred background consistent with a wide aperture (f/1.4) and likely an 85mm lens — the perspective and bokeh quality align well with the prompt. There is a distinct warm rim light outlining the subject’s hair and shoulders, separating her from the background, which matches the “backlit with warm rim light” description. The background appears to be an out-of-focus garden with soft, natural tones, fitting the “blurred

bfl/flux-2-pro PASS

A street photograph with shallow depth of field as if shot at 50mm f/1.8, subject sharp with soft city lights in the background

Evaluator (Gemma 4 26B)

The image depicts a street scene at night with a person walking toward the viewer. The subject is in sharp focus, while the background — including city lights, storefronts, and other pedestrians — is softly blurred, creating a shallow depth of field. The bokeh effect from the lights is characteristic of a wide aperture lens (like f/1.8), and the perspective appears to be from a 50mm lens, which is a standard focal length that provides a natural, slightly compressed view — appropriate for street

bfl/flux-2-pro PASS

A food photograph shot from above at 35mm f/2.8 with natural window light from the left, slight vignette, warm color temperature

Evaluator (Gemma 4 26B)

The image is shot from a high angle, which approximates an overhead perspective — though not perfectly straight down, it’s close enough to qualify as “from above.” The depth of field is shallow, with the main dish in sharp focus and the background (napkin, utensils, wooden surface) softly blurred — consistent with a 35mm f/2.8 lens. The lighting appears to come from the left, casting soft shadows on the right side of the plate and food, suggesting a window light source. There’s a subtle vignette
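The three Hard prompts name specific lens settings (85mm f/1.4, 50mm f/1.8, 35mm f/2.8), and the evaluator judges the blur by eye. For a rough sense of how much sharp depth each setting implies, here is a sketch using the standard thin-lens depth-of-field approximation. The 0.03 mm circle of confusion (a common full-frame convention) and the 2 m subject distance are assumptions chosen for illustration; neither appears in the prompts.

```python
import math


def depth_of_field_mm(focal_mm, f_number, subject_mm, coc_mm=0.03):
    """Approximate total depth of field (far limit minus near limit), in mm."""
    hyperfocal = focal_mm ** 2 / (f_number * coc_mm) + focal_mm
    if subject_mm >= hyperfocal:
        # Beyond the hyperfocal distance the far limit extends to infinity.
        return math.inf
    scale = subject_mm * (hyperfocal - focal_mm)
    near = scale / (hyperfocal + subject_mm - 2 * focal_mm)
    far = scale / (hyperfocal - subject_mm)
    return far - near


# Assumed subject distance of 2 m for all three settings.
dof_85 = depth_of_field_mm(85, 1.4, 2000)
dof_50 = depth_of_field_mm(50, 1.8, 2000)
dof_35 = depth_of_field_mm(35, 2.8, 2000)
```

Under these assumptions the 85mm f/1.4 setting keeps only a few centimetres sharp, while 35mm f/2.8 keeps roughly half a metre. By this estimate, strong background blur is harder to achieve at 35mm f/2.8 than at the other two settings.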

Extreme 100%vs67%

bfl/flux-2-pro PASS

A tilt-shift photograph of a city street making real-life look like a miniature model — selective focus strip, saturated colors

Evaluator (Gemma 4 26B)

The image successfully emulates a tilt-shift photograph of a city street, with a narrow band of sharp focus running down the center of the road — a classic “selective focus strip” effect that creates the illusion of a miniature model. The perspective is high and slightly angled, consistent with a tilt-shift lens setup. The depth of field is shallow, with foreground and background elements blurred, enhancing the miniature effect. The lighting appears bright and even, with saturated colors on the

bfl/flux-2-pro PASS

A cinematic frame with blue and orange anamorphic lens flare streaking horizontally across a dark moody scene

Evaluator (Gemma 4 26B)

The image displays a dark, moody background with a prominent horizontal lens flare that transitions from blue on the left to orange on the right — matching the described “blue and orange anamorphic lens flare streaking horizontally.” The flare is sharp and cinematic in appearance, with subtle bokeh-like circles below it, suggesting depth of field. The perspective is flat and centered, consistent with a cinematic frame. While there’s no explicit foreground or subject to confirm depth, the lightin

bfl/flux-2-pro PASS

A portrait with dramatic split lighting — half the face brightly lit, the other half in deep shadow, high contrast chiaroscuro

Evaluator (Gemma 4 26B)

The image presents a portrait that perfectly matches the described lighting: one half of the face is brightly lit, revealing fine details like skin texture and the eye, while the other half is plunged into deep shadow, creating a stark, high-contrast chiaroscuro effect. The lighting is precisely split down the center of the face, which is a classic and intentional use of dramatic split lighting. Regarding perspective and depth of field: the image is a tight, centered portrait with shallow depth

▸Color Precision 100%vs92%

Easy 100%vs100%

bfl/flux-2-pro PASS

A bright red sports car parked on a gray asphalt road — the red is vivid and unambiguous

Evaluator (Gemma 4 26B)

The car in the image is a bright, vivid red — consistent with Ferrari’s iconic color, which matches the prompt’s description of “vivid and unambiguous” red. The asphalt road is gray, as specified. There are no conflicting or inaccurate colors in the image relative to the prompt. VERDICT: PASS

bfl/flux-2-pro PASS

A woman wearing a royal blue evening dress against a neutral background

Evaluator (Gemma 4 26B)

The dress in the image is a rich, deep blue with a satin finish, which visually matches the description of "royal blue." Royal blue is typically defined as a vivid, saturated blue, often around the hex value #41479B or similar, and the color in the image aligns closely with that description. The background is a neutral, light grayish-beige tone, which also matches the prompt’s requirement. VERDICT: PASS

bfl/flux-2-pro PASS

A field of pure sunflower yellow flowers under a clear blue sky

Evaluator (Gemma 4 26B)

The image depicts a vast field of sunflowers under a clear sky. The sunflowers are a vibrant, warm yellow — consistent with “pure sunflower yellow” — and the sky is a clear, gradient blue, matching “clear blue sky.” While the lighting creates subtle variations (e.g., golden highlights on petals and shadows), the dominant colors are accurately represented as described. VERDICT: PASS

Medium 100%vs67%

bfl/flux-2-pro PASS

A bedroom wall painted in #FF6B35 (burnt orange) with white furniture and natural daylight

Evaluator (Gemma 4 26B)

The wall in the image is painted in a warm, reddish-orange hue that closely matches the hex color #FF6B35 (burnt orange). The furniture — including the bed, nightstands, dresser, and lamps — is white, as specified. The lighting appears to be natural daylight, consistent with the prompt. All specified color elements are accurately represented. VERDICT: PASS

bfl/flux-2-pro PASS

A forest scene where the foliage is precisely #2D5F2D (dark muted green) with brown tree trunks

Evaluator (Gemma 4 26B)

The foliage in the image is predominantly a dark, muted green, closely matching the hex color #2D5F2D. The tree trunks are a consistent brown tone, as specified. While lighting and shadows create variations, the base colors of foliage and trunks align with the prompt’s requirements. VERDICT: PASS

bfl/flux-2-pro PASS

A room with walls in #4A90D9 (medium blue) with warm wooden floor and white trim

Evaluator (Gemma 4 26B)

The walls in the image are a medium blue that closely matches the hex color #4A90D9. The floor is a warm wooden tone, and the trim is white, matching the prompt exactly. VERDICT: PASS

Hard 100%vs100%

bfl/flux-2-pro PASS

A minimalist room with walls in #F5E6CC (warm cream), a sofa in #2C3E50 (dark navy), and accent pillows in #E74C3C (bright red)

Evaluator (Gemma 4 26B)

The walls in the image are a warm cream color that closely matches #F5E6CC. The sofa is a dark navy, visually consistent with #2C3E50. The accent pillows are a bright red, matching #E74C3C. VERDICT: PASS

bfl/flux-2-pro PASS

A product flat-lay on a #1A1A2E (deep dark blue) background, with a #F0C27F (soft gold) watch and #FFFFFF (pure white) earbuds

Evaluator (Gemma 4 26B)

The background is a deep dark blue that closely matches #1A1A2E. The watch has a rose gold tone, which is very close to #F0C27F (soft gold). The earbuds are pure white, matching #FFFFFF. All specified colors are accurately represented in the image. VERDICT: PASS

bfl/flux-2-pro PASS

A graphic design composition with background #FAFAFA (near-white), main shape #FF4500 (orange-red), and accent lines #333333 (dark charcoal)

Evaluator (Gemma 4 26B)

The image displays:
- Background: A very light, near-white color, matching #FAFAFA.
- Main shape: A solid orange-red shape, visually consistent with #FF4500 (a classic orange-red).
- Accent lines: Thin, dark lines forming a grid and diagonals, appearing as dark charcoal gray, matching #333333.

All specified colors are accurately represented in the image. VERDICT: PASS

Extreme 100% vs 100%

bfl/flux-2-pro PASS

A smooth gradient on a wall going from #FF0000 (red) on the left to #0000FF (blue) on the right, transitioning evenly through purple

Evaluator (Gemma 4 26B)

The image displays a smooth gradient transitioning from a red hue on the left to a blue hue on the right, passing through purple in the center — matching the described color progression. The left edge is clearly a vibrant red, consistent with #FF0000. The right edge is a deep blue, consistent with #0000FF. The transition through purple is even and smooth, as specified. There are no visible color deviations or artifacts that contradict the specified hex values. VERDICT: PASS
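The claim that an even red-to-blue gradient passes through purple can be illustrated numerically. The sketch below assumes plain linear interpolation of the 8-bit sRGB channel values, which is one common way to build such a gradient but not necessarily what the image model or the evaluator used; `hex_to_rgb` and `lerp_hex` are hypothetical helper names.

```python
def hex_to_rgb(code):
    """Decode a '#RRGGBB' string into integer (R, G, B) components."""
    code = code.lstrip("#")
    return tuple(int(code[i:i + 2], 16) for i in (0, 2, 4))

def lerp_hex(start, end, t):
    """Linearly interpolate between two hex colors; t=0 gives start, t=1 gives end."""
    a, b = hex_to_rgb(start), hex_to_rgb(end)
    # Add 0.5 before truncating so values round half up.
    mixed = [int(x + (y - x) * t + 0.5) for x, y in zip(a, b)]
    return "#{:02X}{:02X}{:02X}".format(*mixed)

# Sample the left edge, the center, and the right edge of the gradient.
for t in (0.0, 0.5, 1.0):
    print(t, lerp_hex("#FF0000", "#0000FF", t))
```

Under this assumption the midpoint is `#800080`, the value CSS names `purple`, so a purple band in the center of the image is what an even transition should produce.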

bfl/flux-2-pro PASS

A sunset sky gradient on a flat surface from #FF6B35 (orange) at the bottom to #1B1464 (deep indigo) at the top

Evaluator (Gemma 4 26B)

The image shows a smooth vertical gradient transitioning from a warm orange at the bottom to a dark, cool blue-purple at the top — consistent with a sunset sky. The specified bottom color is #FF6B35 (a bright, saturated orange), which matches the warm orange tones at the bottom of the image. The specified top color is #1B1464 (a deep indigo/blue-purple), which closely matches the dark, muted blue-purple tone at the top of the image. While the gradient is smooth and visually accurate, the top

bfl/flux-2-pro PASS

A horizontal banner with a gradient from #00C9FF (cyan) on the left to #92FE9D (mint green) on the right

Evaluator (Gemma 4 26B)

The image displays a horizontal banner with a smooth gradient transitioning from a bright cyan on the left to a mint green on the right. The left edge closely matches #00C9FF (a vibrant cyan), and the right edge closely matches #92FE9D (a light mint green). The gradient is continuous and accurately follows the specified color transition. VERDICT: PASS

▸ Photorealism 100% vs 33%

Easy 100% vs 33%

bfl/flux-2-pro PASS

A single red sneaker on a clean white background with soft studio lighting and gentle shadow — product photography style

Evaluator (Gemma 4 26B)

The image presents a clean, studio-style product shot of a red sneaker on a white background with soft, directional lighting and a gentle shadow — matching the prompt exactly. At first glance, it appears to be a real photograph due to its natural-looking lighting, realistic texture gradients on the leather, and consistent shadowing. VERDICT: PASS

bfl/flux-2-pro PASS

A clear glass perfume bottle on a reflective black surface with studio lighting, no background distractions

Evaluator (Gemma 4 26B)

At first glance, the image appears to be a high-quality studio product shot. The composition is clean, with a reflective black surface, a centered perfume bottle, and a gradient black background — all matching the prompt. The lighting is soft and directional, creating realistic highlights and reflections. VERDICT: PASS

bfl/flux-2-pro PASS

A ripe green apple on a seamless white background with soft side lighting and subtle shadow — catalog style

Evaluator (Gemma 4 26B)

The image depicts a single green apple centered on a pure white background. The lighting creates a soft highlight on the upper left, consistent with a studio setup, and casts a gentle, diffuse shadow to the bottom right. The apple's skin texture, including the small speckles (lenticels) and the gradient of green hues, appears highly realistic. The stem is attached naturally at the top. There are no obvious distortions, morphing artifacts, or unnatural elements that would immediately identify it

Graphical design 67% vs 29%

▸ Data Visualisation 0% vs 0%

Easy 0% vs 0%

bfl/flux-2-pro FAIL

A bar chart showing monthly sales for January through June with values 12K, 18K, 15K, 22K, 19K, 25K — labeled axes, color-coded bars, and a title 'Monthly Sales 2025'

Evaluator (Qwen 3 VL)

The requested chart is a bar chart showing monthly sales from January through June. The values shown are: Jan (12K), Feb (18K), Mar (15K), May (19K), Jun (22K). The prompt specified values for January through June — but the chart is missing April. The y-axis is labeled “Sales (in thousand)”, which is correct for the “K” notation. The x-axis labels are “Jan”, “Feb”, “Mar”, “May”, “Jun” — missing April. The title “Monthly Sales 2025” is present and correct. The chart type is a bar chart, which m

bfl/flux-2-pro FAIL

A pie chart showing market share: Company A 35%, Company B 25%, Company C 20%, Others 20% — each slice a different color with labels and percentages

Evaluator (Qwen 3 VL)

The chart is a pie chart, as requested. However, it contains critical errors:
- “Company A” is incorrectly labeled twice: once in the large blue slice (35%) and again in the large purple slice (also labeled 35%). The purple slice should be for “Company C” or “Others,” not “Company A.”
- “Company C” is labeled in the orange slice, but that slice is labeled 20%, which matches the prompt, but the label text is misaligned and partially obscured.
- “Others” is labeled 20%, which matches the prompt,

bfl/flux-2-pro FAIL

A simple line graph showing daily website visitors over one week (Mon-Sun): 1200, 1500, 1100, 1800, 2000, 1700, 1400 — with grid lines, axis labels, and title 'Weekly Traffic'

Evaluator (Qwen 3 VL)

The requested data is: Mon=1200, Tue=1500, Wed=1100, Thu=1800, Fri=2000, Sat=1700, Sun=1400. In the image:
- Mon shows 1000 (should be 1200) ❌
- Tue shows 1500 ✅
- Wed shows 1000 (should be 1100) ❌
- Thu shows 1100 ✅
- Fri shows 1800 (should be 2000) ❌
- Sat shows 1700 ✅
- Sun shows 1400 ✅

The axis labels ("Day" and "Visitors") and title ("Weekly Traffic") are present and readable. Grid lines are present. Chart type is a line graph. However, the data values are incorrect for 4 out of 7 days,
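The per-day comparison in the transcript above can be reproduced mechanically. The following is a minimal sketch: the `requested` values come from the prompt, the `observed` values are the ones the evaluator reports reading from the image, and the variable names are illustrative only.

```python
days = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]
requested = [1200, 1500, 1100, 1800, 2000, 1700, 1400]  # from the prompt
observed = [1000, 1500, 1000, 1100, 1800, 1700, 1400]   # as reported by the evaluator

# Collect every day whose plotted value differs from the requested one.
mismatches = [day for day, want, got in zip(days, requested, observed) if want != got]

print(mismatches)
print(f"{len(mismatches)} of {len(days)} days incorrect")
```

This yields four mismatched days (Mon, Wed, Thu, Fri), which agrees with the evaluator's "4 out of 7" summary. It also suggests that the ✅ next to Thursday in the transcript is a slip, since the reported 1100 differs from the requested 1800.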

▸ Layout & Design 67% vs 33%

Easy 100% vs 100%

bfl/flux-2-pro PASS

A movie night poster with a large popcorn image in the center, the title 'MOVIE NIGHT' in bold at the top, and the tagline 'Every Friday at 8 PM' at the bottom

Evaluator (Qwen 3 VL)

The image includes:
- A large popcorn bucket in the center: ✅ Present.
- The title "MOVIE NIGHT" in bold at the top: ✅ Present, bold, and correctly positioned.
- The tagline "Every Friday at 8 PM" at the bottom: ✅ Present, readable, and correctly placed.

Visual hierarchy is clear: the title is prominent at the top, the popcorn is the central focal point, and the tagline is clearly visible at the bottom. All text is readable and correctly placed according to the prompt. VERDICT: PASS

bfl/flux-2-pro PASS

An event poster featuring a guitar illustration, the headline 'LIVE MUSIC FEST' in large text, and 'Tickets available now' as a tagline below

Evaluator (Qwen 3 VL)

All described layout zones/elements are present: a guitar illustration on the left, the headline “LIVE MUSIC FEST” in large, bold white text to the right of the guitar, and the tagline “Tickets available now” positioned directly below the headline. The visual hierarchy is correct — the headline is the most prominent element, followed by the tagline, and the guitar serves as a thematic visual anchor. All text is clearly readable and correctly placed. VERDICT: PASS

bfl/flux-2-pro PASS

A book club poster with an open book illustration, the title 'JOIN THE BOOK CLUB' prominently displayed, and 'Meets every Wednesday' as the tagline

Evaluator (Qwen 3 VL)

The poster includes:
- An open book illustration centered prominently.
- The title “JOIN THE BOOK CLUB” at the top, with “JOIN THE” in dark brown and “BOOK CLUB” in orange, visually distinct and hierarchical.
- The tagline “Meets every Wednesday” at the bottom, clearly readable and correctly placed.

All described layout zones/elements are present, and the visual hierarchy is appropriate: title at top, illustration in center, tagline at bottom. Text is readable and correctly positioned. VERDIC

Medium 67% vs 0%

bfl/flux-2-pro PASS

A landing page design with a hero header image at the top, a headline and subtitle in the middle, a 'Sign Up' call-to-action button, and a footer with three social media icons

Evaluator (Qwen 3 VL)

All described layout zones/elements are present with correct visual hierarchy:
- Hero header image: Present at the top as a stylized, abstract background with geometric icons.
- Headline and subtitle: Clearly visible in the center, with “Grow your business faster” as the headline, followed by “All-in-one platform to manage customers, analytics, and sales in one place.” as the subtitle.
- “Sign Up” button: Centered below the text, clearly visible and appropriately styled.
- Footer with three social

bfl/flux-2-pro PASS

A newsletter layout with a banner image header, a two-column body section with text on the left and an image on the right, a highlighted quote block, and a footer with unsubscribe link

Evaluator (Qwen 3 VL)

The image displays a newsletter layout that includes:
- A banner image header (top section with abstract pastel shapes).
- A two-column body section: text on the left, placeholder image on the right (labeled “IMAGE”).
- A highlighted quote block (“This is a featured quote from the newsletter.”) centered below the body.
- A footer with an “Unsubscribe” link on the right and placeholder text on the left.

All described layout zones are present. The visual hierarchy is clear: header → body (two-co

bfl/flux-2-pro FAIL

A restaurant menu layout with a logo header, three category sections (appetizers, mains, desserts) each with items and prices, and a footer with address and hours

Evaluator (Qwen 3 VL)

The image shows a restaurant menu with the following elements:
- Logo header: Present at the top with a circular icon of a fork and knife, followed by “RIVERSIDE BISTRO” in bold, centered text.
- Three category sections: “APPETIZERS” and “DESSERTS” are clearly labeled in uppercase. However, the “mains” section is missing entirely.
- Items and prices: Each item under “APPETIZERS” and “DESSERTS” has a description and price listed to the right, with dotted lines separating them. Prices are correct

Hard 33% vs 0%

bfl/flux-2-pro PASS

A magazine double-page spread with a left page containing a large hero photo, and a right page with a two-column article, a pull quote, a sidebar with author bio, and a page number

Evaluator (Qwen 3 VL)

The image shows a magazine double-page spread as described. The left page features a large hero photo of a styled table setting, which matches the prompt. The right page contains a two-column article layout, a pull quote centered between the columns, a sidebar with an author bio and photo, and page numbers (14 and 15) at the bottom. The visual hierarchy is clear: the hero image dominates the left, the headline “Blöne Tliunf” is prominent at the top of the right page, the pull quote is visually e

bfl/flux-2-pro FAIL

A mobile app UI mockup showing a dashboard with a top navigation bar, a search field, a 2x2 grid of feature cards with icons, a list of recent activity items, and a bottom tab bar

Evaluator (Qwen 3 VL)

The image shows a mobile app UI mockup that includes:
- A top navigation bar with a hamburger menu icon on the left, the title “Dashboard” centered, and a profile picture on the right.
- A search field with a magnifying glass icon and placeholder text “Search”.
- A 2x2 grid of feature cards with icons and labels: “Analytics”, “Messages”, “Tasks”, and “Settings”.
- A “Recent Activity” list with items that have icons, titles, and timestamps.
- A bottom tab bar with four icons labeled “Home”, “Act

bfl/flux-2-pro FAIL

A fashion magazine spread with a full-bleed left-page photo, right page with three-column layout, headline, body copy, a smaller inset image, and styled page numbers

Evaluator (Qwen 3 VL)

The image shows a magazine spread.
- **Left Page:** Contains a large photo of a model. However, it is not strictly "full-bleed" because there is a white footer strip at the bottom containing the page number "24". The image stops above this strip.
- **Right Page:** Contains the headline "URBAN ELEGANCE". Below the headline, the text is arranged in **two** distinct columns, not three as requested. There is a smaller inset image at the bottom right.
- **Text:** The text is Lorem Ipsum filler. It i
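The Layout & Design percentages shown for bfl/flux-2-pro appear to be plain pass rates over the verdicts listed in each difficulty tier. The sketch below recomputes them under that assumption, with rounding to the nearest whole percent also assumed; the `verdicts` table is transcribed from the entries above and `pass_rate` is a hypothetical helper.

```python
# Verdicts for bfl/flux-2-pro in Layout & Design, transcribed from the entries above.
verdicts = {
    "Easy": ["PASS", "PASS", "PASS"],
    "Medium": ["PASS", "PASS", "FAIL"],
    "Hard": ["PASS", "FAIL", "FAIL"],
}

def pass_rate(results):
    """Share of PASS verdicts, as a whole-number percentage."""
    return round(100 * results.count("PASS") / len(results))

per_level = {level: pass_rate(results) for level, results in verdicts.items()}
overall = pass_rate([v for results in verdicts.values() for v in results])

print(per_level)
print(overall)
```

The recomputed figures (100, 67, and 33 per tier, 67 overall) match the left-hand numbers in the tier and category headers, which supports reading each "X% vs Y%" pair as the two models' pass rates.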

▸ Style Diversity 83% vs 33%

Easy 100% vs 33%

bfl/flux-2-pro PASS

A golden retriever sitting in a garden, painted in the style of an oil painting with visible brushstrokes and rich texture

Evaluator (Qwen 3 VL)

The image depicts a golden retriever sitting in a garden, surrounded by flowers and foliage. The style is clearly that of an oil painting, with thick, visible brushstrokes and rich texture throughout — especially evident in the fur of the dog and the leaves and petals in the background. The impasto technique is apparent, and the composition has the warm, painterly quality typical of traditional oil paintings. All key visual characteristics from the prompt — “oil painting,” “visible brushstrokes

bfl/flux-2-pro PASS

A golden retriever sitting in a garden, rendered in Japanese anime style with flat colors and large expressive eyes

Evaluator (Qwen 3 VL)

The image depicts a golden retriever sitting in a garden, which matches the subject matter of the prompt. The style is clearly Japanese anime-inspired: the dog has large, expressive, sparkling eyes, a simplified and stylized facial structure, and a cheerful, cartoonish demeanor. The background and foreground elements (flowers, bushes, sky) are rendered with flat colors and minimal shading, consistent with the requested “flat colors” aesthetic. While there is some subtle shading on the fur to sug

bfl/flux-2-pro PASS

A golden retriever sitting in a garden, rendered in pixel art style with a 32x32 grid and limited color palette

Evaluator (Qwen 3 VL)

The image clearly depicts a golden retriever sitting in a garden, rendered in a pixel art style. The visual composition is built on a visible 32x32 grid, with each element — the dog, flowers, grass, trees, and sky — constructed from individual pixels. The color palette is limited, using distinct blocks of color (e.g., shades of brown, green, yellow, and blue) to define shapes and forms, consistent with classic 8-bit or 16-bit pixel art aesthetics. Key characteristics of the requested style are

Medium 100% vs 0%

bfl/flux-2-pro PASS

A city street scene in the style of Art Deco — geometric shapes, gold and black palette, ornamental symmetry

Evaluator (Qwen 3 VL)

The image clearly embodies the requested Art Deco style. Key visual characteristics are unmistakably present:
- **Geometric shapes**: The buildings feature strong vertical lines, chevrons, sunburst motifs, and stepped forms, all hallmarks of Art Deco architecture.
- **Gold and black palette**: The dominant color scheme is a warm golden-yellow illumination against deep black facades and shadows, creating a luxurious, high-contrast look.
- **Ornamental symmetry**: The composition is highly symme

bfl/flux-2-pro PASS

A woman with a parasol in the style of Impressionism — soft brushstrokes, dappled light, pastel palette

Evaluator (Qwen 3 VL)

The image clearly depicts a woman holding a parasol in an outdoor setting, consistent with the subject matter requested. The visual style exhibits hallmark Impressionist characteristics:
- **Soft, visible brushstrokes**: The entire composition, from the sky to the field and the figure’s dress, is rendered with loose, textured brushwork that is characteristic of Impressionism.
- **Dappled light**: The play of light and shadow across the field and sky, with patches of bright yellow and pale blue,

bfl/flux-2-pro PASS

A portrait of a man in Pop Art style — bold outlines, Ben-Day dots, flat vivid primary colors

Evaluator (Qwen 3 VL)

The image clearly embodies the requested Pop Art style. It features:
- Bold, black outlines defining all shapes and forms.
- Ben-Day dots (halftone patterns) visible in the background and on shaded areas.
- Flat, vivid primary colors: red, blue, and yellow dominate the composition.
- A stylized, graphic representation consistent with mid-20th century Pop Art (e.g., reminiscent of Roy Lichtenstein).

All key visual characteristics from the prompt are present and unmistakable. VERDICT: PASS

Hard 33% vs 33%

bfl/flux-2-pro FAIL

A forest rendered in photorealistic watercolor — the detail and lighting of a photo combined with visible watercolor bleeds and paper texture

Evaluator (Qwen 3 VL)

The image depicts a forest scene with soft, diffused lighting and atmospheric depth, reminiscent of a photograph in tone and composition. However, it lacks the key visual characteristics of watercolor: visible paper texture, bleeding edges, translucent washes, or brushstroke marks. The rendering appears smooth and digitally rendered, with hyperrealistic detail and lighting that is more consistent with digital painting or photorealistic illustration than watercolor. While it evokes the mood of wa

bfl/flux-2-pro PASS

A woman in a futuristic city rendered in cyberpunk Art Nouveau — neon colors and tech elements combined with flowing organic lines and floral borders

Evaluator (Qwen 3 VL)

The image successfully embodies the requested "cyberpunk Art Nouveau" style. Key characteristics are clearly present:
- **Cyberpunk elements**: Neon lighting (pink and cyan), futuristic cityscape with towering buildings and holographic billboards, tech-integrated clothing with glowing circuitry, and a high-tech aesthetic.
- **Art Nouveau elements**: Flowing, organic lines in the floral border framing the image, stylized pink flowers with curvilinear stems, and decorative, ornamental framing tha

bfl/flux-2-pro FAIL

A medieval castle depicted in Japanese ukiyo-e woodblock print style combined with photorealistic sky and lighting

Evaluator (Qwen 3 VL)

The image successfully integrates the requested art style: it presents a medieval castle rendered in the unmistakable aesthetic of Japanese ukiyo-e woodblock prints — characterized by bold outlines, flat areas of color, stylized clouds, and a dramatic, atmospheric sky with a sunset gradient. The composition, use of negative space, and inclusion of traditional ukiyo-e elements (such as the vertical text on the left and the artist’s seal on the right) reinforce this style. However, the prompt spe

Extreme 100% vs 67%

bfl/flux-2-pro PASS

An interior scene rendered in soft brutalism — raw concrete textures combined with rounded pastel furniture and warm diffused light

Evaluator (Qwen 3 VL)

The image clearly embodies the requested "soft brutalism" style. Key visual characteristics are present:
- Raw concrete textures: The walls, ceiling, and floor are all rendered in exposed, unpolished concrete with visible aggregate and seams, the hallmark of brutalism.
- Rounded pastel furniture: The sofa, chairs, and coffee table feature smooth, organic curves and are upholstered in soft pastel tones (pink, beige, mint green).
- Warm diffused light: Natural light streams in from the large win

bfl/flux-2-pro PASS

A still life painted in anxious minimalism — sparse composition with uncomfortable negative space, muted colors, and subtle visual tension

Evaluator (Qwen 3 VL)

The image presents a still life with a sparse composition: three objects — a pear, a glass, and a folded cloth — arranged with significant empty space around them, creating a sense of quiet isolation. The background and surfaces are rendered in muted, desaturated tones (off-white, gray, pale green), aligning with the “muted colors” requirement. The lighting casts sharp, defined shadows that contribute to a subtle visual tension — the pear’s blemished skin and the starkness of the glass and cloth

bfl/flux-2-pro PASS

A cityscape rendered in nostalgic futurism — retro-futuristic 1960s space-age aesthetics with a melancholic warm-toned patina

Evaluator (Qwen 3 VL)

The image successfully embodies the requested “nostalgic futurism — retro-futuristic 1960s space-age aesthetics with a melancholic warm-toned patina.” Key visual characteristics are clearly present: - **Retro-futurism / 1960s space-age aesthetics**: The architecture features sleek, curved forms, domed structures, flying saucer-like vehicles, and elevated walkways — all hallmarks of mid-century speculative design. The stylized starburst lights and neon signage (e.g., “SIGHT CARD”) reinforce the