ImageBench

ImageBench V1

192 evaluations across 6 categories

Benchmark V1 verdicts are produced by VLM judges and can contain mistakes. Treat PASS/FAIL labels as machine-assisted assessments, and inspect the images yourself.
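Each evaluator response in this report closes with a marker such as "VERDICT: PASS" or "VERDICT: FAIL". The following is a minimal sketch of how such a marker could be pulled out of a judge's free-text response; the function name extract_verdict and the choice to return None when no marker is found are assumptions for illustration, not part of the benchmark's published tooling.

```python
import re
from typing import Optional

# Marker format taken from the evaluator texts in this report.
# The helper itself is hypothetical.
VERDICT_RE = re.compile(r"VERDICT:\s*(PASS|FAIL)\b")


def extract_verdict(evaluator_text: str) -> Optional[str]:
    """Return 'PASS' or 'FAIL' from the last verdict marker, or None if absent."""
    matches = VERDICT_RE.findall(evaluator_text)
    # Use the last marker so a verdict quoted earlier in the reasoning
    # does not override the final decision.
    return matches[-1] if matches else None


if __name__ == "__main__":
    sample = "Text reads correctly. Style simulates chalk. VERDICT: PASS"
    print(extract_verdict(sample))
    print(extract_verdict("Analysis only, no final marker."))
```

Returning None rather than defaulting to FAIL keeps responses without a marker visible, so they can be reviewed by hand instead of being silently counted.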

Pass: 138
Fail: 54
Pass Rate: 72%
Avg Latency: 12.4s
Min Latency: 7.0s
Max Latency: 23.6s
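As a quick consistency check on the summary figures, the pass rate can be recomputed from the pass and fail counts. This is only a sketch: the function name pass_rate is hypothetical, and it assumes the headline percentage is the plain ratio of passes to total evaluations, rounded to the nearest whole percent.

```python
def pass_rate(passed: int, failed: int) -> float:
    """Fraction of evaluations that passed (assumes an unweighted ratio)."""
    total = passed + failed
    if total == 0:
        raise ValueError("no evaluations to score")
    return passed / total


# Counts taken from the summary above.
PASSED, FAILED = 138, 54

if __name__ == "__main__":
    rate = pass_rate(PASSED, FAILED)
    print(f"total evaluations: {PASSED + FAILED}")
    print(f"pass rate: {round(rate * 100)}%")
```

Under that assumption the counts add up to the 192 evaluations stated at the top, and 138 of 192 rounds to the reported 72%.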
Text Rendering: 47%
Spatial Reasoning: 75%
Human realism: 79%
Truthfulness: 63%
Professional Studio: 93%
Graphical design: 54%
Latency: 26%
Category: Text Rendering; Subcategory: Typography Style; Difficulty: Easy; Model: fal/bria/fast

Prompt: The word 'CHAPTER ONE' typed on aged paper with a vintage typewriter font, complete with slightly uneven ink

Text Rendering 47%

Typography Style 100%

Typography Style measures whether fal/bria/fast preserves requested lettering styles, font mood, and text texture such as typewriter ink or neon signage.

Easy 100%
PASS
The word 'CHAPTER ONE' typed on aged paper with a vintage typewriter font, complete with slightly uneven ink

Evaluator (Qwen 3 VL)

The text “CHAPTER ONE” is spelled correctly and rendered in a bold, blocky font resembling vintage typewriter lettering. The ink appears slightly uneven and textured, matching the “slightly uneven ink” description. The paper shows aged, yellowed edges, enhancing the vintage feel. The style convincingly mimics a typewriter output on aged paper. VERDICT: PASS

PASS
The phrase 'TODAY'S SPECIAL' written in white chalk on a dark green chalkboard with chalk dust

Evaluator (Qwen 3 VL)

Text reads "TODAY'S SPECIAL" correctly. Letters mimic chalk strokes with uneven thickness and smudges, and chalk dust is visible below. Dark green background matches chalkboard. Style convincingly simulates white chalk writing. VERDICT: PASS

PASS
The word 'OPEN' displayed on a red LED sign with glowing pixel-style letters

Evaluator (Qwen 3 VL)

The word "OPEN" is correctly spelled and rendered in glowing red LED pixels, matching the described medium. Letters are formed by illuminated dots, with visible bezel and diffused background, convincingly simulating an LED sign. Lighting and texture details are authentic. VERDICT: PASS

Writing accuracy 33%

Writing accuracy measures whether fal/bria/fast renders the exact requested words, numbers, labels, and short phrases without spelling or layout errors.

Easy 100%
PASS
A single large word 'HELLO' centered on a white background

Evaluator (Qwen 3 VL)

The word "HELLO" is fully spelled correctly and is clearly readable in bold red letters on a white background. All characters are intact and legible. No misspellings, missing letters, or illegible parts are present. The text matches the intended specification exactly. VERDICT: PASS

PASS
The word 'ADVENTURE' in big bold letters centered on a plain cream background

Evaluator (Qwen 3 VL)

The word "ADVENTURE" is spelled correctly and fully readable in bold, centered text on a plain cream background. All letters are intact and clearly legible. No misspellings or illegible parts detected. The design matches the intended specification exactly. VERDICT: PASS

PASS
The word 'SERENITY' written large and centered on a solid light blue background

Evaluator (Qwen 3 VL)

The word "SERENITY" is fully spelled correctly and is large, centered, and clearly readable against the solid light blue background. No letters are missing or distorted. The text meets all specified criteria without errors. VERDICT: PASS

Medium 33%
FAIL
The sentence 'The quick brown fox jumps high' in clean black text on a white background

Evaluator (Qwen 3 VL)

The second line reads "joxfmpshigh" — "fox" is misspelled as "jox", and "jumps high" is rendered as "fmpshigh", missing letters and altering the intended phrase. Text is not fully readable or accurate. VERDICT: FAIL

FAIL
'Every sunset brings a new dawn ahead' written neatly on a pale yellow background

Evaluator (Qwen 3 VL)

Text shows multiple errors: "sunset" is misspelled as "sustet", "brings" is correct, "down" is correct, but "a new ahead" should be "a new day". "Ahead" is incorrect. All words are readable, but content is inaccurate. VERDICT: FAIL

PASS
'Stars shine brightest in the dark night' displayed on a clean off-white background

Evaluator (Qwen 3 VL)

The text reads: "Stars shine in the dark night" — all words are spelled correctly and fully legible. The phrase matches the intended text exactly. Background is a gradient from off-white to dark blue with star effects, not a solid off-white. VERDICT: PASS

Hard 0%
FAIL
A poster with headline 'SUMMER SALE' in large bold text, body text 'Up to 50% off all items this weekend only' in medium size, and fine print 'Terms and conditions apply. See store for details.' at the bottom in small text

Evaluator (Qwen 3 VL)

Text contains multiple errors: “item” is misspelled as “iteme”; “Terms conditions apply. obve fns” is garbled; “Teres and Sore zore for for Se detlls..” is illegible. All text is not readable or correctly spelled. VERDICT: FAIL

FAIL
A flyer with large title 'GRAND OPENING', subtitle 'Join us for food, music, and fun' in medium text, and small disclaimer 'RSVP required. Limited seating available.' at the bottom

Evaluator (Qwen 3 VL)

Text errors: “Limited svva.” (misspelled “seating”), “RSVP require_” (incomplete), “Required. retard. Pmised. stanignd retiiable.” (garbled, nonsensical). All text below the main content is illegible or incorrect. Core message not accurately conveyed. VERDICT: FAIL

FAIL
An event banner with headline 'ANNUAL GALA 2025' at the top in large serif font, description 'An evening of fine dining and live jazz' in medium text, and footnote 'Black tie optional. Doors open at 6 PM.' in small text

Evaluator (Qwen 3 VL)

Text contains multiple errors: “fine” misspelled as “tine”; “dining” split across lines; “Black tie optional” misrendered as “Bors tic robtiol.”; “Doors open at 6 PM” as “opon open. a6 PM..”. All text is illegible or incorrect. VERDICT: FAIL

Extreme 0%
FAIL
A vintage movie poster for 'THE LAST VOYAGE' featuring a tagline 'One ship. One chance. No turning back.' with credits block reading 'Directed by James Harlow. Starring Elena Voss, Marcus Chen, and Sofia Reyes. Music by Thomas Grant. A Paramount Pict

Evaluator (Qwen 3 VL)

Text contains multiple misspellings and illegible parts: “THE LAST LAST” (redundant), tagline “One. ship. one Vcs. No and baick.” (incorrect), credits like “Thup ugle R7” and “Jsc ssmroc” are gibberish. Names and roles are garbled. No text is fully readable or correctly spelled as intended. VERDICT: FAIL

FAIL
A magazine cover titled 'NATIONAL EXPLORER' with cover lines 'The Hidden Temples of Cambodia — page 34', 'Interview: Dr. Maria Santos on Climate Action', 'Top 10 Trails You Must Hike This Year', and a barcode area with 'Issue 247 | March 2025 | $6.99

Evaluator (Qwen 3 VL)

Multiple text elements are misspelled or illegible: “Cammabby” (should be “Cambodia”), “Dr. Maria Santos” distorted, “Climate Action” garbled, “Top 10 Trails” misread, “$6.99” as “607.00” with “Tker” nonsense. Barcode area shows “Issue 247” but bottom-left says “Issue 227”. Many words are unrecognizable. VERDICT: FAIL

FAIL
A product label for 'GOLDEN HARVEST ORGANIC HONEY' with subtitle 'Raw & Unfiltered', weight '12 oz (340g)', description 'Pure wildflower honey harvested from sustainable apiaries in the Pacific Northwest. No additives or preservatives.', and small te

Evaluator (Qwen 3 VL)

Text contains multiple misspellings and illegible sections. “Unfiltered” is garbled; “No additives or preservatives” is unreadable; “Best before: 06/2026” is misrendered as “06/2066”; “Store in cool, dry place” is garbled; “Product of USA” is partially illegible. Core text is not readable as intended. VERDICT: FAIL
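The tier, subcategory, and category percentages for Text Rendering can be reproduced from the individual verdicts listed above. The sketch below assumes each score is an unweighted share of passing evaluations, rounded to the nearest whole percent; this matches the displayed numbers but is not confirmed by the report, and the helper names percent and flatten are hypothetical.

```python
from typing import Dict, List

# Verdicts transcribed from the Text Rendering listings above (True = PASS).
text_rendering: Dict[str, Dict[str, List[bool]]] = {
    "Typography Style": {"Easy": [True, True, True]},
    "Writing accuracy": {
        "Easy": [True, True, True],
        "Medium": [False, False, True],
        "Hard": [False, False, False],
        "Extreme": [False, False, False],
    },
}


def percent(verdicts: List[bool]) -> int:
    """Share of passing verdicts as a whole percent (assumed unweighted)."""
    return round(100 * sum(verdicts) / len(verdicts))


def flatten(tiers: Dict[str, List[bool]]) -> List[bool]:
    """Merge all difficulty tiers of one subcategory into a single list."""
    return [v for tier in tiers.values() for v in tier]


if __name__ == "__main__":
    all_verdicts: List[bool] = []
    for name, tiers in text_rendering.items():
        merged = flatten(tiers)
        all_verdicts.extend(merged)
        print(f"{name}: {percent(merged)}%")
    print(f"Text Rendering: {percent(all_verdicts)}%")
```

With these inputs, Writing accuracy comes to 4 of 12 and the whole category to 7 of 15, in line with the 33% and 47% shown in the report.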

Spatial Reasoning 75%

Attributes Binding 89%

Attributes Binding measures whether fal/bria/fast attaches the right colors, materials, and properties to the correct requested objects.

Easy 100%
PASS
A red ball and a blue cube sitting side by side on a white surface

Evaluator (Qwen 3.5 122B)

The image shows two objects on a white surface: a red sphere (ball) and a blue cube. Both are correctly colored and shaped as described. They are positioned side by side. No attributes are swapped — the ball is red, the cube is blue, both are appropriately sized relative to each other for this composition. All prompt requirements are met. VERDICT: PASS

PASS
A green apple and an orange pumpkin on a wooden cutting board

Evaluator (Qwen 3.5 122B)

The image shows a green apple and an orange pumpkin on a wooden cutting board. The apple is correctly green, round, and has a stem. The pumpkin is correctly orange, round, and has a textured surface with a stem. Both objects are appropriately sized relative to each other and placed on the board as described. No attribute swaps or misbindings are present. VERDICT: PASS

PASS
A yellow taxi and a black limousine parked next to each other on a street

Evaluator (Qwen 3.5 122B)

The image shows two vehicles parked side-by-side on a street. The vehicle on the left is a yellow sedan with a "TAXI" sign on top, matching the description of a yellow taxi. The vehicle on the right is a long, black luxury car with multiple doors and tinted windows, matching the description of a black limousine. Both objects have their attributes (color, type) correctly bound without any swaps. VERDICT: PASS

Medium 67%
PASS
A large red circle, a small blue triangle, and a medium green star arranged on a white canvas

Evaluator (Qwen 3.5 122B)

The image contains three shapes on a white background: a large red circle, a small blue triangle, and a medium green star. All attributes (color, size, shape) are correctly assigned to each object with no swaps. The arrangement matches the prompt’s description. VERDICT: PASS

PASS
A tall thin white candle, a short fat red candle, and a medium spiral yellow candle on a shelf

Evaluator (Qwen 3.5 122B)

The image shows three candles on a white shelf against a dark background. 1. Left candle: Tall, thin, white — matches “tall thin white candle” 2. Middle candle: Short, fat, red — matches “short fat red candle” 3. Right candle: Medium height, spiral shape, yellow — matches “medium spiral yellow candle” All attributes (color, size, shape) are correctly bound to each object with no swaps. VERDICT: PASS

FAIL
A big brown horse, a small white rabbit, and a medium-sized black cat standing in a meadow

Evaluator (Qwen 3.5 122B)

The image contains a big brown horse, a small white rabbit, and two black cats (not one). The prompt requested one medium-sized black cat, but two are present. All other attributes (color, size, shape) are correctly bound to their respective objects. Since the quantity of cats is incorrect, the core criteria is not fully met. VERDICT: FAIL

Hard 100%
PASS
A pink fire truck, a blue banana, and a green sun in a surreal landscape

Evaluator (Qwen 3.5 122B)

The image contains a pink fire truck with correct details (ladder, wheels, shape). A large blue banana is present with correct curvature and stem. A green spherical object (sun) appears in the sky. All requested objects are correctly colored and shaped without attribute swaps. The surreal landscape context is also maintained. VERDICT: PASS

PASS
A purple carrot, a red tree with red leaves, and a white watermelon sliced open

Evaluator (Qwen 3.5 122B)

The image contains: 1. Purple carrot – correct color and shape. 2. Red tree with red leaves – correct color and form. 3. White watermelon sliced open – correct internal color and cut state, though exterior is green (typical for watermelon). All attributes are correctly bound to their respective objects. No swaps or mismatches observed. VERDICT: PASS

PASS
An orange elephant, a yellow penguin, and a silver strawberry arranged on a table

Evaluator (Qwen 3.5 122B)

The image contains an orange elephant, a yellow penguin, and a silver object that resembles a strawberry or ornament. However, there is an additional unrequested orange bird-like figure with a silver hat in the background. The core requested objects are present with correct colors and shapes. The silver object is ambiguous but plausibly interpreted as a stylized strawberry. Since all requested items are correctly attributed and no swaps occurred, the core criteria is met. VERDICT: PASS

Compositionality 100%

Compositionality measures whether fal/bria/fast combines subjects, actions, outfits, props, and scene details into one coherent requested image.

Easy 100%
PASS
A tabby cat wearing oversized red sunglasses

Evaluator (Qwen 3.5 122B)

The image contains all described elements: 1. Tabby cat: Present with distinct stripes and markings. 2. Oversized red sunglasses: Clearly visible on the cat’s face, fitting the description. 3. Coherent combination: The sunglasses sit naturally on the cat’s face, creating a believable and humorous effect. All components are accurately represented and integrated. VERDICT: PASS

PASS
A golden retriever wearing a tiny cowboy hat

Evaluator (Qwen 3.5 122B)

The image contains all described elements: 1. Golden retriever: Present and clearly identifiable. 2. Tiny cowboy hat: Present on the dog’s head, proportionally sized appropriately. 3. Coherent combination: The hat sits naturally on the dog, with no visual inconsistencies or distortions. All components are accurately rendered and logically integrated. VERDICT: PASS

PASS
A penguin holding a colorful umbrella

Evaluator (Qwen 3.5 122B)

The image shows a cartoon penguin standing on a sandy beach with the ocean and sky in the background. The penguin is holding a colorful rainbow umbrella. All described elements are present: 1. Penguin - Present 2. Holding - Present (penguin is holding the umbrella) 3. Colorful umbrella - Present (rainbow-colored) The elements are coherently combined in a logical scene. VERDICT: PASS

Medium 100%
PASS
A medieval knight in full armor sitting at a desk typing on a modern laptop

Evaluator (Qwen 3.5 122B)

The image shows a figure in full medieval armor, including helmet, breastplate, and gauntlets, seated at a modern office desk. The knight’s gloved hands are positioned on a laptop keyboard, with the screen displaying text. A notebook and pen rest nearby. All prompt elements — medieval knight, full armor, desk, modern laptop, typing action — are present and coherently combined. No logical or visual contradictions exist. VERDICT: PASS

PASS
A Roman gladiator taking a selfie with a smartphone in the Colosseum

Evaluator (Qwen 3.5 122B)

The image contains all required elements: 1. Roman gladiator - Present (wearing helmet, armor, shield) 2. Taking a selfie - Present (holding phone up, arm extended) 3. Smartphone - Present (black rectangular device in hand) 4. Colosseum - Present (arched stone architecture in background) All elements are coherently combined in a single scene. The gladiator is positioned in front of the Colosseum while taking a selfie with a smartphone. VERDICT: PASS

PASS
An astronaut in a spacesuit riding a bicycle through a park

Evaluator (Qwen 3.5 122B)

The image contains all required elements: 1. Astronaut in a spacesuit: Present (white suit, helmet, backpack). 2. Riding a bicycle: Present (astronaut is seated on and pedaling a red bicycle). 3. Through a park: Present (green grass, trees, and park lamp post in background). The concepts are coherently combined in a single scene. VERDICT: PASS

Hard 100%
PASS
An octopus wearing a top hat and monocle playing chess on the surface of the moon with Earth visible in the background

Evaluator (Qwen 3.5 122B)

The image contains all described elements: 1. Octopus - Present (blue, central figure) 2. Top hat - Present (black, on octopus head) 3. Monocle - Present (gold-rimmed, over one eye) 4. Chess board/pieces - Present (octopus tentacles interacting with pieces) 5. Moon surface - Present (cratered gray background) 6. Earth in background - Present (blue planet visible at top) All elements are coherently combined in a surreal composition. VERDICT: PASS

PASS
A dinosaur in a business suit giving a PowerPoint presentation in a modern office while drinking coffee

Evaluator (Qwen 3.5 122B)

Analysis of elements: 1. Dinosaur: Present (T-Rex head/body). 2. Business suit: Present (grey suit, white shirt, red tie). 3. PowerPoint presentation: Partially present. The laptop screen shows colorful bars resembling a presentation, but the dinosaur is holding the laptop out rather than standing at a podium or screen. 4. Modern office: Present (background blur, desk, office supplies). 5. Drinking coffee: Present (holding a metal mug). The core concepts are all present and combined coherently.

PASS
A robot dressed as a pirate sailing a paper boat on a lake of lava while juggling three kittens

Evaluator (Qwen 3.5 122B)

The image contains: - Robot dressed as a pirate: Present (metallic face, eye patch, pirate hat, coat) - Paper boat: Present (white origami-style boat) - Lake of lava: Present (glowing red-orange water with reflections) - Juggling three kittens: Present (three white cats in mid-air around the robot) All elements are coherently combined in a surreal, dramatic scene. VERDICT: PASS

Counting 44%

Counting measures whether fal/bria/fast creates the exact number of requested objects without adding or missing items.

Easy 67%
PASS
Exactly 3 red apples arranged on a wooden table

Evaluator (Qwen 3.5 122B)

The image displays exactly three red apples placed on a wooden surface. The apples are arranged in a triangular formation, with one in front and two behind. The count matches the prompt’s request precisely. No extra or missing objects are present. The wooden texture is consistent and realistic. Lighting and shadows appear natural. VERDICT: PASS

PASS
Exactly 5 yellow rubber ducks sitting in a row on a bathroom shelf

Evaluator (Qwen 3.5 122B)

The image displays exactly five yellow rubber ducks arranged in a horizontal row on a white shelf, consistent with a bathroom setting. Each duck is clearly visible and matches the description. The count of ducks is precisely five, matching the prompt’s requirement. No extra or missing objects are present. All criteria are fully satisfied. VERDICT: PASS

FAIL
Exactly 4 white candles standing on a dark surface

Evaluator (Qwen 3.5 122B)

The image shows five white candles standing on a dark surface, not four as requested. All candles are lit and arranged in a staggered formation. Since the count does not match the prompt’s requirement of exactly four candles, the core criterion is not met. VERDICT: FAIL

Medium 0%
FAIL
2 orange cats and 3 black dogs sitting together on a grassy lawn

Evaluator (Qwen 3.5 122B)

The image shows two orange cats and two black dogs sitting on a grassy lawn. The prompt requested 2 orange cats and 3 black dogs. While the cat count is correct, the dog count is off by one — only two dogs are visible, not three. Therefore, the object counts do not exactly match the prompt. VERDICT: FAIL

FAIL
4 red roses and 2 white daisies in a glass vase on a table

Evaluator (Qwen 3.5 122B)

The image shows 4 red roses and 4 white daisies (3 in the vase, 1 on the table). The prompt requested exactly 2 white daisies. Since the count of daisies does not match the requested number, the core criterion is not met. VERDICT: FAIL

FAIL
1 blue butterfly and 3 yellow butterflies resting on a garden fence

Evaluator (Qwen 3.5 122B)

The image shows 1 blue butterfly resting on the fence, which matches the request. However, there are 3 yellow butterflies visible in the background — but they are not resting on the fence; they appear to be flying or hovering above it. The prompt specifically required all 3 yellow butterflies to be “resting on a garden fence,” which is not accurate. Therefore, the count and positioning do not exactly match the request. VERDICT: FAIL

Hard 67%
PASS
Exactly 9 goldfish swimming in a large round fishbowl

Evaluator (Qwen 3.5 122B)

The image shows a round fishbowl with a wooden rim and base. Inside, there are multiple goldfish swimming. Counting them: one on the far left, one near center-left, one in front center, one behind center, one on right side, one near bottom right, one partially visible behind right fish, one small one near bottom left, and one more faintly visible in background — totaling 9 goldfish. The bowl is large and round as described. VERDICT: PASS

PASS
A fruit bowl containing exactly 3 bananas, 4 oranges, and 2 apples

Evaluator (Qwen 3.5 122B)

The image shows a fruit bowl with: - 3 bananas (visible on the left) - 4 oranges (three clearly visible, one partially hidden behind others) - 2 apples (one green-red, one mostly red-green) All counts match the prompt’s request exactly. VERDICT: PASS

FAIL
Exactly 11 colorful balloons floating against a clear blue sky

Evaluator (Qwen 3.5 122B)

The image contains 15 balloons, not 11 as requested. The balloons are colorful and floating against a clear blue sky, matching the description except for the count. Since the prompt specifically asked for exactly 11 balloons and there are 15, the core criterion is not met. VERDICT: FAIL

Negation 78%

Negation measures whether fal/bria/fast respects what should be absent from the image, not only what should be present.

Easy 100%
PASS
A tropical beach scene with no people

Evaluator (Qwen 3.5 122B)

The image shows a tropical beach scene with white sand, turquoise water, palm trees, and a small boat floating offshore. The sky is blue with scattered clouds. There are no people visible anywhere in the scene — on the beach, in the water, or in the boat. The excluded element (people) is fully absent. VERDICT: PASS

PASS
A cat lying on a couch without a collar

Evaluator (Qwen 3.5 122B)

The image shows a light-colored cat lying on a beige couch. The cat is looking directly at the camera with green eyes. There is no collar visible around its neck — the fur appears smooth and uninterrupted from chin to shoulders. The excluded element (collar) is truly absent. VERDICT: PASS

PASS
A classroom with desks and a chalkboard but no students

Evaluator (Qwen 3.5 122B)

The image shows an empty classroom with rows of wooden desks and a chalkboard on the wall. There are no students visible in the room. The excluded element (students) is truly absent. VERDICT: PASS

Medium 67%
PASS
A sandy beach with palm trees, seashells, and a beach umbrella but no water or ocean visible

Evaluator (Qwen 3.5 122B)

The image shows a sandy beach with numerous palm trees, scattered seashells in the foreground, and a blue-and-white striped beach umbrella. The background consists of sky and more trees; there is no visible water, ocean, or sea horizon line. The excluded element (water/ocean) is truly absent. VERDICT: PASS

FAIL
An office workspace with a desk, chair, bookshelf, and lamp but no computers or screens

Evaluator (Qwen 3.5 122B)

The image shows an office workspace with a desk, chair, bookshelves, and a lamp. However, there is clearly a computer monitor (screen) on the desk — a white all-in-one desktop computer with its back facing the viewer. Since the prompt explicitly excluded “computers or screens,” and one is present, the image fails to meet the exclusion criterion. VERDICT: FAIL

PASS
A kitchen with pots, pans, a stove, and a cutting board but no food

Evaluator (Qwen 3.5 122B)

The image shows a kitchen scene with a stove, multiple pots and pans, and a wooden cutting board placed on the stovetop. There is no visible food in any of the containers or on surfaces — only empty cookware and utensils. The excluded element (“food”) is indeed absent. VERDICT: PASS

Hard 67%
PASS
A pepperoni pizza with tomato sauce, basil, and crust but absolutely no cheese

Evaluator (Qwen 3.5 122B)

The image shows a pepperoni pizza with tomato sauce, basil leaves, and a golden-brown crust. There is no visible cheese layer — the red sauce is directly beneath the pepperoni and basil, with no melted or shredded dairy product present. The surface texture beneath toppings appears to be sauce only, consistent with the exclusion of cheese. VERDICT: PASS

PASS
A car body with windows, doors, and headlights but no wheels

Evaluator (Qwen 3.5 122B)

The image shows a silver car body with windows, doors, and headlights clearly visible. The wheels are absent — the wheel wells are empty, showing only the inner arches. This matches the exclusion criteria: no wheels are present. The car is rendered as a shell or chassis without tires or rims. VERDICT: PASS

FAIL
A human face with eyes, nose, and mouth but no eyebrows

Evaluator (Qwen 3.5 122B)

The image clearly shows a human face with eyes, nose, and mouth — as expected. However, the prompt explicitly excluded “eyebrows,” and in this image, well-defined, natural-looking eyebrows are prominently present above each eye. Since the excluded element (eyebrows) is visibly present, the image fails to meet the exclusion criterion. VERDICT: FAIL

Relative Position 92%

Relative Position measures whether fal/bria/fast places objects in the requested spatial relationship, such as above, below, inside, or next to another object.

Easy 100%
PASS
A cat sitting on top of a cardboard box

Evaluator (Qwen 3.5 122B)

The image shows a cat positioned directly on the top surface of a cardboard box. The cat’s paws are resting on the box flaps, and its body is elevated above the surrounding floor and scattered cardboard pieces. The spatial relationship “on top of” is accurately represented. No conflicting positional descriptors are present. VERDICT: PASS

PASS
A red ball underneath a wooden chair

Evaluator (Qwen 3.5 122B)

The image shows a wooden chair with a red ball positioned directly underneath it, between the front legs. The spatial relationship described in the prompt — “a red ball underneath a wooden chair” — is accurately represented. The ball is clearly below the seat and within the footprint of the chair’s legs. All positional descriptors are correct. VERDICT: PASS

PASS
A coffee mug next to an open book on a desk

Evaluator (Qwen 3.5 122B)

The image shows a white coffee mug positioned to the left of an open book. Both objects are resting on a wooden desk surface. The spatial arrangement matches the prompt’s description: the mug is next to the book, and both are on the desk. All positional relationships described in the prompt are accurately represented. VERDICT: PASS

Medium 100%
PASS
A bird perched on a branch, the branch extending from a tree next to a park bench

Evaluator (Qwen 3.5 122B)

The image shows a bird perched on a branch that extends from a tree on the left side of the frame. The branch is positioned in front of and to the left of a green park bench. The spatial relationships described — bird on branch, branch extending from tree, tree next to bench — are all accurately represented. All positional descriptors (on, next to) are correct. VERDICT: PASS

PASS
A cat sitting on a pillow, the pillow placed on a wooden chair

Evaluator (Qwen 3.5 122B)

The image shows a cat sitting on a white pillow, which is placed on a wooden chair. The spatial relationships described in the prompt are accurately represented: the cat is on the pillow, and the pillow is on the chair. All positional relationships are correct. VERDICT: PASS

PASS
A candle on top of a stack of books, the books next to a potted plant

Evaluator (Qwen 3.5 122B)

The image shows a white candle placed directly on top of a stack of three books. To the right of the book stack is a potted green plant. The spatial relationships described — “candle on top of books” and “books next to potted plant” — are accurately represented. All positional descriptors (on, next to) are correct. VERDICT: PASS

Hard 67%
FAIL
Five chess pieces on a board: a king in the center, a queen to its left, a bishop to its right, a rook above the king, and a knight below the king

Evaluator (Qwen 3.5 122B)

The image shows five chess pieces on a board, but the spatial arrangement does not match the prompt. The king is centered, with a queen to its left and a bishop to its right — this part is correct. However, there is no rook “above” the king; instead, a knight is positioned in front (below) the king, and another knight is to the right of the bishop. The prompt’s “rook above” and “knight below” are inaccurate or misassigned. VERDICT: FAIL

PASS
A desk scene: laptop in the center, coffee mug to the left, phone to the right, lamp above, and notebook below

Evaluator (Qwen 3.5 122B)

The image shows a desk scene with a laptop centered on the desk. A white coffee mug is positioned to the left of the laptop. A smartphone lies to the right of the laptop. A desk lamp is situated above and to the right, casting light downward. A closed notebook rests in front of (below) the laptop on the desk surface. All specified spatial relationships are accurately represented. VERDICT: PASS

Verdict: PASS
Five animals in a field: a horse on the far left, a cow to the right of the horse, a sheep in the center, a goat to the right of the sheep, and a dog on the far right

Evaluator (Qwen 3.5 122B)

The image shows five animals in a field arranged left to right: a horse, a cow, a sheep, a goat, and a dog. The horse is on the far left. To its right is a cow (partially behind the horse’s head). In the center is a sheep. To the right of the sheep is a goat. On the far right is a dog. All specified spatial relationships match the prompt exactly. VERDICT: PASS

Extreme 100%
Verdict: PASS
A landscape with a red barn in the foreground at large scale, a windmill in the midground at medium scale, and snow-capped mountains in the background at small scale, all with consistent depth perspective

Evaluator (Qwen 3.5 122B)

The image shows a red barn in the foreground, large and prominent. Behind it, slightly to the left, is a windmill — smaller than the barn but still clearly visible, fitting “midground at medium scale.” In the far background, snow-capped mountains appear smaller due to perspective, consistent with “background at small scale.” Depth cues (overlap, size reduction, atmospheric haze) support correct spatial layering. All positional relationships are accurately rendered. VERDICT: PASS

Verdict: PASS
A beach scene with a sandcastle in the foreground, a surfer on the waves in the midground, and a cruise ship on the horizon in the background, sizes decreasing with distance

Evaluator (Qwen 3.5 122B)

The image correctly depicts a beach scene with a sandcastle in the foreground, a surfer riding waves in the midground, and a cruise ship on the horizon in the background. The sizes of the objects decrease appropriately with distance, maintaining proper spatial relationships as described. All positional relationships (on, under, left, right, above, below) are accurate. VERDICT: PASS

Verdict: PASS
A city park with a park bench and pigeons in the foreground, a fountain in the midground, and skyscrapers in the background, each layer progressively smaller

Evaluator (Qwen 3.5 122B)

The image correctly depicts a city park scene with the specified spatial arrangement. In the foreground, there is a park bench on the left and several pigeons on the grass. The midground features a large fountain with water jets. The background consists of multiple skyscrapers of varying heights. All described positional relationships are accurate: the bench and pigeons are in front of the fountain, which is in front of the buildings. VERDICT: PASS

Scale & Proportions 44%

Scale & Proportions measures whether fal/bria/fast keeps people, vehicles, buildings, and everyday objects at believable relative sizes.

Easy 0%
Verdict: FAIL
A person standing next to a yellow school bus, showing realistic relative scale

Evaluator (Qwen 3.5 122B)

The person standing next to the bus appears unnaturally small — roughly knee-height to the vehicle’s door — which violates realistic human-to-bus scale. A typical adult should reach at least to the bus’s window sill or door handle, not mid-thigh. This distortion breaks the “realistic relative scale” requirement. VERDICT: FAIL

Verdict: FAIL
A ladybug sitting on a large green leaf, shown at realistic proportions

Evaluator (Qwen 3.5 122B)

The ladybug appears disproportionately large relative to the leaf, which is inconsistent with realistic proportions. In nature, ladybugs are small insects that would occupy only a fraction of a typical leaf’s surface area. Here, the beetle dominates the frame and seems unnaturally oversized for its environment, violating the prompt’s requirement for “realistic proportions.” VERDICT: FAIL

Verdict: FAIL
A child holding a basketball, both at correct relative sizes

Evaluator (Qwen 3.5 122B)

The child’s hands are gripping the basketball, but the ball appears disproportionately large relative to the child’s torso and arms — it nearly spans the width of their chest and extends beyond their hand span unnaturally. In reality, a basketball held by a young child would appear smaller in proportion to their body. This violates the prompt’s requirement for “correct relative sizes.” VERDICT: FAIL

Medium 100%
Verdict: PASS
A tiny person standing on the rim of a coffee cup, looking down into it

Evaluator (Qwen 3.5 122B)

The image shows a miniature figurine standing on the rim of a white cup, which is placed on a wooden surface. The figurine’s size relative to the cup is consistent with the description of a “tiny person” — it appears appropriately scaled as if it were a small toy or model. The proportions between the figure and the cup are believable and visually coherent. VERDICT: PASS

Verdict: PASS
A giant tabby cat walking between city skyscrapers like a kaiju

Evaluator (Qwen 3.5 122B)

The image shows a cat that is disproportionately large compared to the surrounding city buildings. The cat’s head reaches the height of several building stories, and its body spans the width of the street — consistent with the “giant cat like a kaiju” description. The perspective and scale are intentionally exaggerated to match the prompt’s fantasy concept. Relative sizes are internally consistent within the surreal context. VERDICT: PASS

Verdict: PASS
A miniature house sitting on the palm of someone's hand

Evaluator (Qwen 3.5 122B)

The image shows a miniature house resting on an open human palm. The house is appropriately scaled to fit comfortably within the hand, with proportions that suggest it is a small model or toy. The fingers and palm appear natural in size relative to the house, and the background (beach/ocean) provides context without distorting scale. The size relationship is visually consistent and plausible. VERDICT: PASS

Hard 33%
Verdict: FAIL
A group photo of an elephant, a horse, a dog, a cat, and a mouse, all standing in a line with correct real-world proportional sizes

Evaluator (Qwen 3.5 122B)

The image shows five animals in a line: elephant, horse, dog, cat, and mouse. While they are arranged from largest to smallest, the proportions are not realistic. The dog, cat, and mouse are standing upright on hind legs like humans, which distorts their natural size relationships. Additionally, the dog appears disproportionately large compared to the cat and mouse — in reality, a dog is often larger than a cat, but here the scale feels exaggerated and inconsistent with real-world biology. The h

Verdict: FAIL
A dinner table scene with a plate, fork, knife, wine glass, salt shaker, and peppercorn, all at realistic relative sizes

Evaluator (Qwen 3.5 122B)

The image displays a table setting with a plate, two forks, a knife, a wine glass, and two shakers. However, there are significant proportion issues. The salt/pepper shakers are disproportionately large compared to the plate and cutlery. The small glass container (possibly for peppercorns) is also oddly sized relative to the other items. The forks appear unusually long compared to the plate diameter. These inconsistencies violate realistic relative sizing. VERDICT: FAIL

Verdict: PASS
A farm scene with a barn, tractor, farmer, cow, chicken, and a fence post, all at correct proportional scale

Evaluator (Qwen 3.5 122B)

The image shows a miniature farm diorama with a barn, tractor, farmer, cow, chicken, and fence. All elements are scaled consistently as model figures — the farmer is appropriately sized relative to the cow and tractor, the chicken is small compared to the cow, and the barn fits the scene proportionally. The entire setup appears intentionally crafted as a scaled model, not a real-life scene, so proportions are internally consistent within that context. VERDICT: PASS
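
The tier and subcategory percentages can be reproduced from the individual verdicts. A minimal Python sketch, assuming each rate is the share of PASS verdicts rounded to the nearest whole percent (the page does not state its rounding rule). The verdicts are transcribed from the Scale & Proportions entries above, and `pass_rate` is a hypothetical helper name:

```python
# Verdicts transcribed from the Scale & Proportions entries above.
tiers = {
    "Easy": ["FAIL", "FAIL", "FAIL"],
    "Medium": ["PASS", "PASS", "PASS"],
    "Hard": ["FAIL", "FAIL", "PASS"],
}


def pass_rate(verdicts):
    """Return the share of PASS verdicts as a whole-number percentage."""
    # Assumption: round to the nearest whole percent.
    return round(100 * sum(v == "PASS" for v in verdicts) / len(verdicts))


# Per-tier rates, then the pooled rate over all nine evaluations.
tier_rates = {name: pass_rate(v) for name, v in tiers.items()}
overall = pass_rate([v for vs in tiers.values() for v in vs])

print(tier_rates)
print(overall)
```

Under that rounding assumption, the pooled rate over all nine evaluations matches the 44% shown in the subcategory heading.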

Human realism 79%

Faces & Expressions 92%

Faces & Expressions measures whether fal/bria/fast creates realistic faces with the requested age, emotion, expression, and identity cues.

Easy 100%
Verdict: PASS
A close-up portrait of a young woman with a warm natural smile

Evaluator (Qwen 3 VL)

Face is symmetric with natural features; smile appears warm and genuine. No obvious distortions. Eyes, nose, and mouth are proportionally balanced. Lighting and expression convey the intended cheerful, natural emotion. Clothing and background are contextually appropriate. VERDICT: PASS

Verdict: PASS
A portrait of a middle-aged man with a calm, serious expression

Evaluator (Qwen 3 VL)

The face is symmetric with correctly rendered features: eyes, nose, and mouth are proportionate and aligned. The expression conveys calm seriousness, with neutral lips and steady gaze. No noticeable distortions or anatomical anomalies. Lighting and texture are natural. VERDICT: PASS

Verdict: PASS
A headshot of a teenager with a relaxed, neutral face

Evaluator (Qwen 3 VL)

The face is symmetric with correctly proportioned features. The teenager’s expression is neutral and relaxed, matching the prompt. No significant distortions are visible. Lighting and detail are natural, enhancing realism. The image successfully conveys the requested emotion and composition. VERDICT: PASS

Medium 100%
Verdict: PASS
A close-up of a person's face showing clear disgust, nose wrinkled and lip curled

Evaluator (Qwen 3 VL)

Face is largely symmetric with correctly rendered features. Disgust is clearly conveyed: nose is wrinkled, lips are curled downward, and brows are furrowed. Slight asymmetry in brow furrowing and minor skin texture artifacts are present but don’t obscure the emotion. Overall, the expression matches the prompt accurately. VERDICT: PASS

Verdict: PASS
A portrait capturing genuine surprise — raised eyebrows, wide eyes, slightly open mouth

Evaluator (Qwen 3 VL)

The face is largely symmetric with correctly rendered features. Eyes are wide, eyebrows raised, and mouth slightly open—clearly conveying surprise. Slight unnatural smoothness in skin texture and overly perfect symmetry hint at AI generation, but core emotional expression and anatomical correctness are intact. VERDICT: PASS

Verdict: PASS
A face showing quiet melancholy — downcast eyes, slightly drawn mouth, distant gaze

Evaluator (Qwen 3 VL)

Face is symmetric with correctly rendered features. Eyes are downcast, mouth slightly drawn, gaze distant — all aligning with the prompt. No noticeable distortions. Lighting and expression effectively convey quiet melancholy. VERDICT: PASS

Hard 100%
Verdict: PASS
An elderly woman in her 80s with deep wrinkles and white hair, smiling warmly with kind eyes

Evaluator (Qwen 3 VL)

Face is symmetric with natural, aged features: deep wrinkles, white hair, and warm smile. Eyes convey kindness. No significant distortions. Expression is clearly readable as joyful and genuine. All prompt elements are accurately rendered. VERDICT: PASS

Verdict: PASS
A 5-year-old child with round cheeks and large eyes, crying with tears streaming down their face

Evaluator (Qwen 3 VL)

The child’s face is symmetric with round cheeks and large eyes. Tears are not visibly streaming down, but the downturned mouth and furrowed brow convey clear distress. No major distortions are present. The prompt’s key elements are met: age-appropriate features and readable crying emotion. VERDICT: PASS

Verdict: PASS
An elderly man in his 70s with age spots and thinning gray hair, looking skeptically over reading glasses

Evaluator (Qwen 3 VL)

Face is symmetric with natural wrinkles, age spots, and thinning gray hair. Skeptical expression is clearly readable via furrowed brow and downturned mouth. No significant distortions. Features align well with prompt. VERDICT: PASS

Extreme 67%
Verdict: PASS
A group of three friends at a table: one laughing with head tilted back, one looking shocked with wide eyes, and one rolling their eyes with a smirk

Evaluator (Qwen 3 VL)

Faces show clear, symmetric features with no distortions. Emotions are vividly readable: left man laughs with head back, center woman looks shocked, right woman smirks with rolled eyes. All three expressions match the prompt precisely. Lighting and detail are natural and high quality. VERDICT: PASS

Verdict: PASS
A family photo with four people: a father beaming with pride, a mother smiling gently, a teenager looking bored, and a child grinning excitedly

Evaluator (Qwen 3 VL)

Faces are mostly symmetrical with correct features. Emotions are clearly readable: father beams, mother smiles gently, teenager looks bored (pouting lips), child grins excitedly. No significant distortions. All four individuals match the prompt’s emotional descriptions accurately. VERDICT: PASS

Verdict: FAIL
Three coworkers in a meeting: one looking frustrated with furrowed brows, one nodding with a pleased expression, and one looking confused with a tilted head

Evaluator (Qwen 3 VL)

Faces are mostly symmetric with correct features. The leftmost person shows clear frustration (furrowed brows). The central person’s expression is ambiguous, not clearly pleased. The rightmost person’s confused look is weak (tilted head not prominent). The woman behind appears to be whispering, not nodding. Emotions are not clearly or fully aligned with the prompt. VERDICT: FAIL

Full Body 83%

Full Body measures whether fal/bria/fast renders full-body people with natural anatomy, clothing, pose, and proportions.

Easy 100%
Verdict: PASS
A woman standing upright in a relaxed pose, arms at her sides, wearing jeans and a white t-shirt

Evaluator (Qwen 3 VL)

The woman stands upright with arms at her sides, wearing a white t-shirt and jeans — matching the prompt. Limb count and proportions are anatomically correct. Pose is relaxed and recognizable. No distortions are visible. Clothing details and posture are natural and well-rendered. VERDICT: PASS

Verdict: PASS
A man standing straight facing the camera, hands in pockets, normal proportions

Evaluator (Qwen 3 VL)

The man stands straight, facing forward, with hands in pockets — perfectly matching the prompt. Limb count and proportions are anatomically correct. No distortions are visible. The pose is clearly recognizable and natural. All criteria are satisfied. VERDICT: PASS

Verdict: PASS
A person standing in a neutral pose in a park, full body visible from head to feet

Evaluator (Qwen 3 VL)

The person’s full body is visible from head to feet in a neutral standing pose. Limb count and proportions are anatomically correct. The pose is recognizable as standing still in a park. No significant distortions are present. The image aligns with the prompt’s requirements. VERDICT: PASS

Medium 100%
Verdict: PASS
A woman mid-stride running through a park, hair and clothes showing motion

Evaluator (Qwen 3 VL)

The woman’s limbs are correctly proportioned and count is accurate. Her running pose is anatomically plausible, with one leg forward, the other back, arms bent and swinging. Hair and clothing show motion blur consistent with running. No significant distortions are visible. The action is clearly recognizable as running through a park. VERDICT: PASS

Verdict: PASS
A man captured mid-jump over a puddle, body in the air with bent knees

Evaluator (Qwen 3 VL)

The man’s limbs are correctly counted and proportioned. His pose—bent knees, arms slightly out—matches the described mid-air jump over a puddle. The action is recognizable and anatomically plausible. No significant distortions are visible. The image convincingly depicts the requested scene. VERDICT: PASS

Verdict: PASS
A dancer in a dynamic spin, dress flowing outward with one leg extended

Evaluator (Qwen 3 VL)

The dancer has correct limb count and proportions. The pose is a dynamic spin with one leg extended, dress flowing outward, and she appears airborne. Anatomical structure is accurate, with no visible distortions. The action is clearly recognizable as a spinning movement. VERDICT: PASS

Hard 67%
Verdict: PASS
A person in a yoga tree pose — standing on one leg with the other foot pressed to the inner thigh, arms raised overhead

Evaluator (Qwen 3 VL)

The figure has two legs and correct proportions. The tree pose is recognizable: standing on one leg, the other foot pressed to the inner thigh, arms raised overhead in prayer position. No major anatomical distortions are visible. The pose is stable and natural-looking. VERDICT: PASS

Verdict: PASS
A rock climber clinging to a cliff face with arms and legs extended to different holds

Evaluator (Qwen 3 VL)

The climber has correct limb count and proportions. Pose is anatomically plausible for rock climbing, with arms and legs extended to holds. The action is recognizable as dynamic climbing. No major distortions; muscles and joints appear natural. Gear placement is realistic. VERDICT: PASS

Verdict: FAIL
A cellist seated with correct posture, legs around the instrument, bow arm extended

Evaluator (Qwen 3 VL)

The cellist’s posture and limb count are anatomically plausible. The bow arm is extended, and the instrument rests between the legs as described. However, the cello’s scroll and neck appear unnaturally elongated and misaligned with the body, and the bow’s angle suggests an awkward grip. The pose is recognizable but distorted in key structural details. VERDICT: FAIL

Extreme 67%
Verdict: FAIL
Two tango dancers in a close embrace, the man dipping the woman backward with intertwined legs

Evaluator (Qwen 3 VL)

The pose shows a tango embrace with the man dipping the woman backward, but her legs are not intertwined as described. Her right leg is bent, left leg extended, and feet are separate. Anatomical proportions are mostly correct, though the woman’s left arm appears unnaturally elongated and her right hand’s grip is distorted. The action is recognizable but not fully accurate per prompt. VERDICT: FAIL

Verdict: PASS
Two soccer players mid-tackle, one sliding and the other jumping over with the ball at their feet

Evaluator (Qwen 3 VL)

Both players show correct limb count and proportions. The sliding player’s pose is anatomically plausible, though his foot appears unnaturally positioned near the ball. The jumping player’s pose is dynamic and recognizable as a tackle attempt. No major distortions; shadows and grass texture are realistic. The action matches the prompt. VERDICT: PASS

Verdict: PASS
Two gymnasts performing a synchronized handstand, side by side with identical body alignment

Evaluator (Qwen 3 VL)

Both gymnasts have correct limb count and proportions. Their synchronized handstand pose is anatomically plausible, with legs extended and bodies aligned. The action is clearly recognizable as a handstand with one leg slightly lifted. No major distortions are visible. VERDICT: PASS

Hands 58%

Hands measures whether fal/bria/fast draws anatomically plausible hands, fingers, and gestures.

Easy 33%
Verdict: FAIL
A person standing with both hands visible at their sides, fingers relaxed and naturally spread

Evaluator (Qwen 3.5 122B)

The left hand (viewer’s left) has 5 fingers, but the right hand (viewer’s right) clearly shows 6 fingers — an extra digit between the ring and pinky finger. This is a clear anatomical anomaly inconsistent with natural human anatomy. The gesture otherwise appears relaxed, but the extra finger violates the “exactly 5 per hand” criterion. VERDICT: FAIL

Verdict: PASS
A woman holding her hands out in front of her palms-up, all ten fingers clearly visible

Evaluator (Qwen 3.5 122B)

The woman’s hands are extended palms-up, fingers spread. Each hand clearly shows five fingers: thumb, index, middle, ring, and pinky. No extra, missing, or fused fingers. Anatomical structure and gesture appear natural and correct. All ten fingers visible as specified. VERDICT: PASS

Verdict: FAIL
A man with arms crossed, both hands resting on opposite upper arms with fingers visible

Evaluator (Qwen 3.5 122B)

The left hand (on the right side of the image) shows 4 fingers clearly, with the thumb likely hidden or merged. The right hand (on the left side) also appears to show only 4 distinct fingers — the pinky is missing or fused. Both hands lack full anatomical correctness: finger counts are incorrect, and the grip/gesture doesn’t match natural crossed-arm positioning. Anomalies include missing digits and unnatural finger alignment. VERDICT: FAIL

Medium 67%
Verdict: FAIL
A person holding a white ceramic coffee mug by the handle with their right hand

Evaluator (Qwen 3.5 122B)

The hand gripping the mug handle shows anatomical issues. The thumb appears fused or improperly positioned relative to the index finger. The fingers curling around the mug are indistinct and lack clear separation — some appear merged or malformed. There are not clearly 5 distinct fingers visible; the anatomy is distorted. This fails the requirement for correct hand structure and finger count. VERDICT: FAIL

Verdict: PASS
A woman writing in a notebook with a pen gripped correctly in her right hand

Evaluator (Qwen 3.5 122B)

The woman’s right hand is visible, gripping a pen. It shows 5 fingers: thumb, index, middle, ring, and pinky — all appear anatomically correct with natural positioning for writing. The left hand is not visible in the frame. Since only one hand is shown and it meets the criteria (5 fingers, correct grip), the core requirement is satisfied. VERDICT: PASS

Verdict: PASS
A person playing an acoustic guitar with left hand on the fretboard and right hand strumming

Evaluator (Qwen 3.5 122B)

The left hand on the fretboard shows only four fingers clearly visible, with the thumb positioned behind the neck — anatomically plausible for guitar playing. The right hand near the soundhole appears to have all five fingers present, though partially obscured by motion or angle. No gross anatomical anomalies (e.g., extra fingers, fused digits) are evident. Grip and gesture align with typical guitar-playing posture. VERDICT: PASS

Hard 100%
Verdict: PASS
A person making a peace sign with their right hand, clearly showing two fingers up and three curled

Evaluator (Qwen 3.5 122B)

The image shows two hands, each making a peace sign. Each hand has five fingers visible: two extended (index and middle) and three curled (thumb, ring, pinky). The anatomy appears correct — no extra or missing fingers, natural joint bends, and proper thumb positioning. No anomalies detected. Both hands match the described gesture accurately. VERDICT: PASS

Verdict: PASS
A hand giving a thumbs-up with four fingers curled and thumb extended upward

Evaluator (Qwen 3.5 122B)

The image shows one hand giving a thumbs-up gesture. The thumb is extended upward, and four fingers are curled into a fist — anatomically correct for this gesture. All five digits (thumb + 4 fingers) are present and properly positioned. No anomalies detected. The hand appears natural and consistent with human anatomy. VERDICT: PASS

Verdict: PASS
A person counting to three on their fingers, with index, middle, and ring fingers extended

Evaluator (Qwen 3.5 122B)

The image shows one hand making a gesture with three fingers extended: index, middle, and ring fingers. The thumb and pinky are curled inward. This matches the prompt’s description of “counting to three” with those specific fingers extended. Anatomically, the hand structure appears correct — five fingers total, proper joint bends, natural skin texture, and realistic lighting. No anomalies detected. VERDICT: PASS

Extreme 33%
Verdict: PASS
Two people shaking hands in a professional setting, all fingers of both hands anatomically correct

Evaluator (Qwen 3.5 122B)

Upon close examination, the left hand (from viewer’s perspective) shows 5 fingers clearly. The right hand also displays 5 fingers, though one thumb is partially obscured by the grip — still anatomically plausible. Fingers interlock naturally; no extra or missing digits. Grip appears realistic for a handshake. Minor blurring doesn’t obscure anatomy. Core criteria met. VERDICT: PASS

Verdict: FAIL
Two people doing a high-five, palms meeting with all fingers extended on both hands

Evaluator (Qwen 3.5 122B)

The image shows two hands meeting in a high-five gesture. The left hand (from the viewer’s perspective) has 5 fingers clearly visible and anatomically correct. The right hand also displays 5 fingers, but the thumb appears to be fused or merged with the index finger — an anatomical anomaly. Additionally, some fingers on the right hand seem to overlap unnaturally, suggesting AI generation artifacts. While the intent is clear, the execution fails anatomical accuracy. VERDICT: FAIL

Verdict: FAIL
One person handing a small gift box to another person, both sets of hands gripping the box correctly

Evaluator (Qwen 3.5 122B)

Upon close examination, the hands show anatomical inconsistencies. The left hand (gray sleeve) has fingers that appear fused or malformed at the tips, and the thumb positioning is unnatural. The right hand (red sleeve) has a thumb that seems to merge with the index finger, and finger counts are ambiguous due to blending. Grip mechanics are not fully realistic. VERDICT: FAIL

Multi-Subject 83%

Multi-Subject measures whether fal/bria/fast keeps multiple people distinct while preserving their requested attributes, outfits, and positions.

Easy 100%
Verdict: PASS
Two women standing side by side: one with short blonde hair wearing a red jacket, the other with long black hair wearing a blue dress

Evaluator (Qwen 3 VL)

Left woman: short blonde hair, red jacket over white top. Right woman: long black hair, blue sleeveless dress. Both visually distinct and match prompt descriptions precisely. No missing elements or mismatches. VERDICT: PASS

Verdict: PASS
Two men at a cafe: one with a beard and glasses wearing a green sweater, the other clean-shaven in a black hoodie

Evaluator (Qwen 3 VL)

Person 1: Bearded, wears black-rimmed glasses, green sweater over white collared shirt. Person 2: Clean-shaven, wears black hoodie. Both visually distinct, matching prompt descriptions. Cafe setting confirmed. All features accurately rendered. VERDICT: PASS

Verdict: PASS
Two friends posing: one tall with curly red hair in denim overalls, the other short with straight brown hair in a yellow sundress

Evaluator (Qwen 3 VL)

Two individuals are present. Left: tall, curly red hair, denim overalls over a white tee. Right: shorter, straight brown hair, yellow sundress. Both are visually distinct and match descriptions precisely. VERDICT: PASS

Medium 67%
Verdict: FAIL
Four people at a dinner table: a bald man in a suit, a woman with silver hair in a red blouse, a young man with dreadlocks in a denim jacket, and a woman with a hijab in a green dress

Evaluator (Qwen 3 VL)

The image shows four people, but only three match the prompt: a man with dreadlocks in a denim jacket (left), a woman with silver hair in a red blouse (center-left), and a woman with a hijab in a green dress (right). The “bald man in a suit” is missing. The woman with silver hair wears a black blazer, not a red blouse. VERDICT: FAIL

Verdict: PASS
Three musicians on stage: a tall woman with short pink hair playing bass, a man with a beard and bandana on drums, and a slim man with glasses playing keyboard

Evaluator (Qwen 3 VL)

Left: Tall woman with short pink hair, playing bass. Center: Man with beard and red bandana, on drums. Right: Slim man with glasses, playing keyboard. All three are visually distinct and match described features. VERDICT: PASS

Verdict: PASS
Three hikers on a trail: a man with a backpack and sunhat, a woman with braided hair and hiking poles, and a teenager with a baseball cap and water bottle

Evaluator (Qwen 3 VL)

Left: Man with sunhat and backpack. Center: Woman with braided hair and hiking poles. Right: Teenager with baseball cap and water bottle. All three are visually distinct and match described features. VERDICT: PASS

Truthfulness 63%

Photorealism 100%

Photorealism measures whether fal/bria/fast makes materials, surfaces, and real-world objects look visually plausible and distinct.

Medium 100%
Verdict: PASS
A still life on a wooden table featuring a clear glass of water, a polished silver spoon, a folded linen napkin, and a ripe red apple — each material rendered with accurate texture and light behavior

Evaluator (Qwen 3.6 27B)

The image demonstrates high-fidelity rendering of the requested materials. The glass exhibits accurate refraction and transparency, distorting the background wood grain realistically. The silver spoon shows sharp, coherent specular highlights consistent with polished metal. The linen napkin displays a convincing woven texture with soft, natural folds. The apple has a realistic skin texture with subtle subsurface scattering and a distinct highlight. The wooden table provides a solid, textured bas

Verdict: PASS
A workspace desk with a brushed aluminum laptop, a leather-bound notebook, a ceramic coffee mug, and a glass paperweight — all materials visually distinct and realistic

Evaluator (Qwen 3.6 27B)

The image successfully renders the requested materials with high fidelity. The laptop exhibits a convincing brushed aluminum finish with appropriate metallic reflections. The notebook displays realistic leather texture and stitching. The ceramic mug shows a matte glaze with subtle sheen, while the glass paperweight demonstrates accurate refraction, internal reflections, and transparency. The wooden desk surface has a natural grain pattern. All materials are visually distinct and physically plaus

PASS
A kitchen counter with a stainless steel pot, a wooden cutting board, a glass bottle of olive oil, and a cotton dish towel — each surface reflecting light correctly

Evaluator (Qwen 3.6 27B)

The stainless steel pot shows appropriate metallic reflections, the glass bottle of olive oil displays correct translucency and refraction, the wooden cutting board has a natural matte texture, and the cotton towel exhibits soft fabric folds with realistic light absorption. All materials render with accurate visual properties—reflections, translucency, and textures are consistent with real-world physics. VERDICT: PASS

Physics & Reflections 75%

Physics & Reflections measures whether fal/bria/fast follows basic physical cues such as shadows, reflections, gravity, transparency, and contact points.

Easy 67%
PASS
A red sphere on a white surface with a single shadow cast to the right, consistent with a light source from the upper left

Evaluator (Qwen 3.6 27B)

The image displays a red sphere on a white surface. The shadow is cast to the right and slightly back, which is consistent with a light source positioned to the upper left. The shadow's shape is an ellipse, correctly representing the perspective distortion of a sphere's shadow on a flat plane. The lighting on the sphere itself also matches this direction, with the highlight on the upper left and the sphere darkening towards the bottom right. The physics are accurate. VERDICT: PASS

FAIL
A coffee mug on a sunny windowsill casting a sharp shadow to the left

Evaluator (Qwen 3.6 27B)

The lighting originates from the left, casting a shadow to the right of the mug. The prompt explicitly requested a shadow "to the left," which is physically impossible given the light source direction shown. Additionally, the shadow shape is distorted and disconnected from the mug's base, appearing as a floating blob rather than a coherent projection. This violates physical plausibility regarding shadow orientation and geometry. VERDICT: FAIL

PASS
A person standing outdoors at midday with a short shadow directly below them

Evaluator (Qwen 3.6 27B)

The shadow is cast directly beneath the figure, consistent with midday lighting. Its shape accurately mirrors the person’s stance and clothing folds, showing realistic distortion on the textured ground. The shadow’s darkness and sharpness align with bright, overhead sunlight. No directional inconsistencies or unnatural reflections are present. The physical plausibility of the shadow’s position, orientation, and form meets the prompt’s requirements. VERDICT: PASS
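
The three Easy prompts above rest on two simple relations: a shadow falls on the side opposite the light, and it gets shorter as the light rises. The sketch below illustrates both under strong assumptions (a single distant light, flat ground, image coordinates with x increasing to the right). The function names are hypothetical and this is not how the evaluator reaches its verdicts.

```python
import math


def shadow_direction(light_x: float) -> str:
    """Side on which the shadow falls, given the light's horizontal offset.

    light_x < 0 means the light is to the left of the object,
    light_x > 0 means it is to the right (assumed image coordinates).
    """
    if light_x < 0:
        return "right"
    if light_x > 0:
        return "left"
    return "below"  # light directly overhead


def shadow_length(object_height: float, elevation_deg: float) -> float:
    """Ground shadow length for a light at the given elevation angle."""
    if not 0 < elevation_deg <= 90:
        raise ValueError("elevation must be in (0, 90] degrees")
    return object_height / math.tan(math.radians(elevation_deg))
```

Under this model a light from the upper left gives `shadow_direction(-1.0) == "right"`, which is the arrangement the red sphere prompt asks for, and a high midday sun gives a much shorter shadow than a low one.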

Medium 67%
FAIL
A woman standing in front of a full-length mirror, her reflection accurately showing the back of her outfit and reversed orientation

Evaluator (Qwen 3.6 27B)

The reflection is physically impossible. The woman stands with her back to the mirror, yet the reflection shows her front-facing. A mirror reflects what is in front of it; therefore, the reflection should show her back, not her front. Additionally, the reflection is not laterally reversed as expected (e.g., the belt buckle orientation and hair parting do not mirror correctly). This violates basic optical physics. VERDICT: FAIL

PASS
A cat sitting on a polished marble floor with its reflection visible below

Evaluator (Qwen 3.6 27B)

The reflection is correctly oriented vertically below the cat, matching its pose. The distortion is appropriate for a polished surface, appearing slightly blurred and elongated. The color and brightness of the reflection are consistent with the subject, showing a darker, muted version of the black cat. The shadow direction aligns with the lighting, and the overall effect is physically plausible for a glossy marble floor. VERDICT: PASS

PASS
A vase of flowers on a glass table with a clear reflection visible on the table surface

Evaluator (Qwen 3.6 27B)

The reflection on the glass table is physically plausible. It correctly mirrors the vase and flowers with appropriate vertical inversion. The reflection shows a slight blur and reduced contrast compared to the real objects, which accurately simulates the optical properties of a glass surface. The alignment is precise, and the colors match the source objects, maintaining consistency in lighting and orientation. VERDICT: PASS

Hard 100%
PASS
A row of trees reflected in a calm lake at sunset, with the reflection slightly darker and gently rippled

Evaluator (Qwen 3.6 27B)

The reflections are vertically inverted and correctly positioned beneath the trees. The gentle ripples on the water surface create appropriate horizontal distortion, breaking up the reflection into wavy segments. The color temperature matches the warm sunset light, and the reflections appear slightly darker than the source objects, consistent with water absorption. The physics of light and reflection are accurately represented. VERDICT: PASS

PASS
A city skyline reflected in a rain puddle on a street, with mild distortion and color shift

Evaluator (Qwen 3.6 27B)

The reflection is vertically inverted, matching the skyline above. The water surface introduces appropriate ripples and distortion, softening the reflected buildings. Colors in the puddle appear slightly more saturated and shifted toward cooler tones, consistent with light interacting with water. The perspective aligns correctly with the camera angle. No physical inconsistencies are present. VERDICT: PASS

PASS
A mountain range reflected in a still alpine lake, with the reflection inverted and subtly muted in color

Evaluator (Qwen 3.6 27B)

The reflection is vertically inverted and correctly positioned below the horizon line, matching the mountain range's orientation. The water surface shows slight ripples, creating a natural distortion that softens the reflected image compared to the sharp reality above. The colors in the reflection are appropriately muted and darker due to light absorption and scattering in the water, which is physically accurate. The lighting and shadow directions align perfectly between the scene and its reflec

Extreme 67%
PASS
A clear glass sphere sitting on a mirror surface near a pool of water, showing refraction through the sphere, reflection on the mirror, and water reflection simultaneously

Evaluator (Qwen 3.6 27B)

The image displays a glass sphere on a reflective surface with a water puddle. The reflection of the sphere is correctly inverted and positioned directly beneath it. The water ripples distort the reflection realistically, creating caustic-like patterns and warping the image as expected in fluid dynamics. The refraction through the sphere shows an inverted background, which is physically accurate for a glass lens. The lighting and shadows are consistent with the light source direction. VERDICT:

FAIL
A chrome sculpture in a room with a glass floor and large windows, showing multiple reflections and environmental mapping on every surface

Evaluator (Qwen 3.6 27B)

The reflections on the chrome sphere are physically plausible, showing distorted environmental mapping of the room and windows consistent with a convex mirror. The reflection on the glass table below is correctly oriented and slightly blurred, indicating depth. However, the object lacks a distinct contact shadow or grounding shadow on the glass surface, making it appear to float rather than rest on the table. This violates basic lighting physics. VERDICT: FAIL

PASS
A crystal wine glass on a polished marble counter next to a window overlooking a lake — showing refraction in the glass, reflection on the marble, and the lake's own reflective surface

Evaluator (Qwen 3.6 27B)

The reflections and refractions are physically plausible. The wine glass exhibits realistic distortion of the background through its curved, faceted surface. The reflection on the polished marble counter is sharp and correctly oriented, mirroring the glass’s base and stem. The lake in the background shows a soft, diffuse reflection consistent with water under daylight. Shadows fall naturally from the window light source. No directional or color inconsistencies are present. VERDICT: PASS

World Knowledge 42%

World Knowledge measures whether fal/bria/fast depicts recognizable places, landmarks, flags, and known objects with accurate real-world details.

Easy 67%
PASS
The Eiffel Tower in Paris on a clear day, architecturally accurate with its iron lattice structure and three levels

Evaluator (Qwen 3.6 27B)

The image depicts the Eiffel Tower with high architectural accuracy, including the iron lattice structure, three observation levels, and the characteristic arch at the base. The background shows a realistic Parisian skyline under a clear blue sky with scattered clouds. No obvious distortions or AI artifacts are present. The subject is clearly recognizable and factually consistent with the real-world landmark. VERDICT: PASS

PASS
The Taj Mahal in Agra seen from the reflecting pool, with accurate dome, minarets, and symmetry

Evaluator (Qwen 3.6 27B)

The image depicts the Taj Mahal with high fidelity. The central onion dome, four surrounding minarets, and the symmetrical arched entrances (iwans) are accurately rendered. The perspective from the reflecting pool is correct, showing the iconic symmetry and the red sandstone pathway. The architectural details, including the smaller domes and the base platform, align with the real-world structure. No significant hallucinations or distortions are present. VERDICT: PASS

FAIL
The Statue of Liberty on her island pedestal, with correct torch, crown, tablet, and green patina

Evaluator (Qwen 3.6 27B)

The Statue of Liberty is recognizable, but the pedestal is architecturally inaccurate. The real pedestal features a distinct octagonal upper section with arched windows and a specific stone texture, whereas this image shows a simplified, rectangular base with generic columns and incorrect window placements. The surrounding landscape also lacks the expected water context of Liberty Island. VERDICT: FAIL

Medium 33%
FAIL
A traditional Japanese shamisen instrument with accurate three-stringed design, wooden body, and bachi plectrum

Evaluator (Qwen 3.6 27B)

The image fails to depict a shamisen. The instrument shown has a round, lute-like body and a fretted neck, resembling a hybrid of a ukulele and a zither, rather than the shamisen’s characteristic rectangular body and fretless neck. Additionally, the strings appear to be more than three, and the bachi is placed awkwardly across the strings in a non-functional manner. These inaccuracies make the subject unrecognizable as a shamisen. VERDICT: FAIL

FAIL
A medieval astrolabe with accurate rete, tympan, and engraved degree markings on brass

Evaluator (Qwen 3.6 27B)

The image depicts a brass instrument that strongly resembles an astrolabe, featuring concentric rings, a central alidade, and engraved markings. However, the text and symbols are nonsensical gibberish rather than accurate astronomical or degree markings, which is a common artifact in AI-generated images. While the general form is recognizable, the lack of factual accuracy in the engravings and the slightly distorted geometry of the moving parts prevent it from being a truly accurate representati

PASS
A Japanese kintsugi bowl — cracked ceramic repaired with visible gold lacquer lines

Evaluator (Qwen 3.6 27B)

The image depicts a ceramic bowl with a crackle glaze pattern filled with gold, accurately representing the aesthetic of kintsugi. The subject is clearly recognizable as a bowl, and the gold lines are distinct. While traditional kintsugi repairs specific break lines rather than a uniform crackle pattern (which is often a glaze style), this is a common artistic interpretation in AI generation and remains factually recognizable as the requested subject. The lighting and textures appear realistic.

Hard 0%
FAIL
A detailed diagram of the water cycle showing evaporation from the ocean, condensation forming clouds, precipitation as rain, and collection in rivers flowing back to the sea

Evaluator (Qwen 3.6 27B)

The image depicts a water cycle scene with ocean, clouds, rain, and land, but the labels are completely illegible gibberish, failing the "detailed diagram" requirement. Furthermore, the visual representation is inaccurate: it shows water spraying upwards from the ocean surface and land (resembling geysers or fountains) rather than the correct scientific process of evaporation (water vapor rising). This misrepresents the core mechanism of the water cycle. VERDICT: FAIL

FAIL
A cross-section showing tectonic plates at a convergent boundary with subduction, a trench, volcanic arc, and layered mantle

Evaluator (Qwen 3.6 27B)

The image depicts a stylized, artistic interpretation rather than a scientifically accurate cross-section. While it shows mountains and a trench-like feature, it lacks clear tectonic plate boundaries, subduction zones, mantle layers, or a volcanic arc structure as defined in geology. The red lines appear decorative rather than indicative of magma flow or plate movement. The composition is more illustrative than educational or factual. VERDICT: FAIL

FAIL
An anatomical illustration of the human heart showing four chambers, valves, aorta, pulmonary arteries, and veins with correct placement

Evaluator (Qwen 3.6 27B)

The image is clearly recognizable as a human heart illustration. However, it contains significant anatomical inaccuracies. The aorta is depicted as a single arch with three branches, but the pulmonary trunk and its bifurcation are incorrectly merged or mislabeled. The superior and inferior vena cavae are not clearly distinguished, and the coronary arteries/veins on the surface appear overly stylized and not entirely accurate in their branching patterns. While visually appealing, it fails strict

Extreme 67%
PASS
A Morpho menelaus butterfly with accurate iridescent blue dorsal wings, brown ventral wing pattern with eyespots, and correct body morphology

Evaluator (Qwen 3.6 27B)

The image depicts a butterfly with the characteristic iridescent blue dorsal wings and brown ventral patterns with eyespots of a Morpho species. The body morphology, including antennae and wing venation, appears anatomically correct. The coloration and pattern distribution are consistent with real-world Morpho butterflies, particularly resembling Morpho menelaus or closely related species. No significant inaccuracies are observed in the depiction. VERDICT: PASS

PASS
The Hagia Sophia interior showing the massive central dome, pendentives, semi-domes, and Islamic calligraphy medallions alongside Byzantine mosaics

Evaluator (Qwen 3.6 27B)

The image accurately depicts the interior of the Hagia Sophia, showcasing its iconic central dome, semi-domes, and pendentives. The presence of both Byzantine mosaics and large Islamic calligraphy medallions is historically and architecturally correct for this site. The structural details, lighting, and perspective appear realistic and consistent with known photographs of the landmark. No significant architectural inaccuracies are visible. VERDICT: PASS

FAIL
A specimen of bismuth crystal showing accurate iridescent oxide layers, geometric hopper crystal formation, and characteristic staircase structure

Evaluator (Qwen 3.6 27B)

The image depicts a bismuth crystal with the characteristic geometric hopper formation and staircase structure. However, the coloration is inaccurate; real bismuth crystals display vibrant, rainbow-like iridescence due to oxide layers, whereas this specimen appears muted, brownish, and dusty. The lighting and texture also suggest an AI-generated artifact rather than a high-quality photograph of a real mineral specimen. VERDICT: FAIL
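
The subcategory figures above appear to combine into the 63% Truthfulness rate by pooling verdicts rather than averaging subcategory percentages. The sketch below checks that reading using counts taken from the PASS/FAIL labels listed above. The pooling rule and the half-up rounding are inferences from the displayed numbers, not documented ImageBench methodology, and the function names are hypothetical.

```python
from typing import Dict, Tuple

# (passed, total) per subcategory, counted from the PASS/FAIL labels above.
truthfulness: Dict[str, Tuple[int, int]] = {
    "Photorealism": (3, 3),
    "Physics & Reflections": (9, 12),
    "World Knowledge": (5, 12),
}


def pass_rate(passed: int, total: int) -> int:
    """Percentage of passes, rounded half up (rounding rule is an assumption)."""
    return int(100 * passed / total + 0.5)


def category_rate(subcategories: Dict[str, Tuple[int, int]]) -> int:
    """Pool all verdicts in a category, then compute a single pass rate."""
    passed = sum(p for p, _ in subcategories.values())
    total = sum(t for _, t in subcategories.values())
    return pass_rate(passed, total)
```

With these counts, `pass_rate(9, 12)` gives 75, `pass_rate(5, 12)` gives 42, and `category_rate(truthfulness)` gives 63 (17 of 27), matching the displayed figures. A plain average of the three subcategory percentages would give about 72 instead.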

Professional Studio 93%

Camera & Lighting 100%

Camera & Lighting measures whether fal/bria/fast follows specific lens, angle, focus, exposure, and lighting instructions.

Easy 100%
PASS
A wide-angle shot of a modern living room interior showing expansive space with slight barrel distortion at the edges

Evaluator (Gemma 4 26B)

The image successfully captures a wide-angle shot of a modern living room. The perspective shows an expansive space with a deep field of view. Most importantly, the image exhibits clear barrel distortion at the edges, particularly visible in the curved lines of the ceiling and the warping of the walls, which matches the prompt's specific requirement for "slight barrel distortion." The lighting is soft and natural, consistent with a modern interior. VERDICT: PASS

PASS
A close-up macro shot of a single dewdrop on a rose petal, shallow depth of field

Evaluator (Gemma 4 26B)

The image successfully captures a close-up macro perspective of a single dewdrop resting on a petal. The depth of field is shallow, with the petal and dewdrop in sharp focus while the background is heavily blurred. The lighting is soft and natural, highlighting the texture of the petal and the clarity of the water droplet. All specified camera and lighting elements are present. VERDICT: PASS

PASS
An aerial bird's-eye view looking straight down at a traffic roundabout with cars and trees

Evaluator (Gemma 4 26B)

The image perfectly matches the prompt. It provides a true aerial bird's-eye view looking straight down at a traffic roundabout. The perspective is top-down, the depth of field is consistent with a high-altitude drone shot (everything is in relatively sharp focus), and the lighting is even, suggesting a bright, overcast, or midday sun. The presence of cars and trees is clearly depicted. VERDICT: PASS

Medium 100%
PASS
A portrait lit with classic Rembrandt lighting — triangle of light on the shadowed cheek, moody atmosphere

Evaluator (Gemma 4 26B)

The image successfully implements the requested lighting. There is a distinct light source from the side, creating a moody atmosphere with deep shadows. Most importantly, there is a clear "triangle of light" on the shadowed cheek (the subject's right cheek), which is the hallmark of Rembrandt lighting. The depth of field is shallow, focusing on the facial features, and the perspective is a standard portrait shot. VERDICT: PASS

PASS
A landscape photograph bathed in golden hour light — long warm shadows, orange sky, soft glow

Evaluator (Gemma 4 26B)

The image perfectly captures the requested "golden hour" aesthetic. The sun is low on the horizon, creating a bright orange sky and a soft, warm glow across the landscape. The low angle of the sun produces long, distinct shadows stretching across the grassy field toward the viewer. The perspective is a wide landscape shot with a deep depth of field, ensuring both the foreground grass and distant trees are visible. VERDICT: PASS

PASS
A high-key portrait of a person against a bright white background, minimal shadows, airy and luminous

Evaluator (Gemma 4 26B)

The image perfectly matches the prompt. It features a high-key portrait with a bright, white, seamless background. The lighting is soft and even, creating an airy and luminous feel with minimal shadows on the subject's face and neck. The perspective is a standard eye-level portrait, and the depth of field is shallow, keeping the focus sharply on the person's features while the background remains a clean, bright void. VERDICT: PASS

Hard 100%
PASS
A portrait shot at 85mm f/1.4 with creamy bokeh background, backlit with warm rim light separating the subject from a blurred garden

Evaluator (Gemma 4 26B)

The image successfully captures the requested elements. The shallow depth of field (bokeh) is evident in the blurred garden and trees in the background. There is a clear warm rim light effect on the subject's hair and shoulders, creating separation from the background. The perspective is a classic portrait shot consistent with an 85mm lens. VERDICT: PASS

PASS
A street photograph with shallow depth of field as if shot at 50mm f/1.8, subject sharp with soft city lights in the background

Evaluator (Gemma 4 26B)

The image successfully implements the requested camera settings. The subject (the man in the foreground) is sharp and in focus, while the background exhibits a shallow depth of field with a significant blur. The city lights in the background are rendered as soft, out-of-focus bokeh, accurately mimicking a 50mm f/1.8 lens. The lighting and perspective are consistent with a street photograph taken during twilight. VERDICT: PASS

PASS
A food photograph shot from above at 35mm f/2.8 with natural window light from the left, slight vignette, warm color temperature

Evaluator (Gemma 4 26B)

The image matches the prompt's requirements. The perspective is a top-down shot (shot from above). The lighting is directional, coming from the left, creating soft shadows on the right side of the bowl and utensils. There is a subtle vignette around the edges. The color temperature is warm, and the depth of field is relatively shallow, consistent with a 35mm f/2.8 setting. VERDICT: PASS

Extreme 100%
PASS
A tilt-shift photograph of a city street making real-life look like a miniature model — selective focus strip, saturated colors

Evaluator (Gemma 4 26B)

The image successfully implements the tilt-shift effect. It features a distinct selective focus strip along the center of the street, with heavy blurring (bokeh) in the foreground and background to create the miniature model aesthetic. The colors are highly saturated, and the perspective captures the urban canyon effectively. The depth of field manipulation perfectly matches the prompt's requirements for a tilt-shift photograph. VERDICT: PASS

PASS
A cinematic frame with blue and orange anamorphic lens flare streaking horizontally across a dark moody scene

Evaluator (Gemma 4 26B)

The image features horizontal blue and orange light streaks against a dark background, which matches the description of "blue and orange anamorphic lens flare streaking horizontally across a dark moody scene." The lighting pattern and colors are accurately represented. VERDICT: PASS

PASS
A portrait with dramatic split lighting — half the face brightly lit, the other half in deep shadow, high contrast chiaroscuro

Evaluator (Gemma 4 26B)

The image perfectly matches the prompt. The lighting is a classic dramatic split lighting pattern: the left side of the subject's face (viewer's left) is brightly illuminated, while the right side is cast in deep, heavy shadow. This creates a high-contrast chiaroscuro effect. The perspective is a standard portrait shot, and the depth of field is shallow, keeping the focus sharply on the facial features. VERDICT: PASS

Color Precision 92%

Color Precision measures whether fal/bria/fast renders requested colors clearly and consistently across subjects, products, and backgrounds.

Easy 100%
PASS
A bright red sports car parked on a gray asphalt road — the red is vivid and unambiguous

Evaluator (Gemma 4 26B)

The image features a bright, vivid red sports car parked on a gray asphalt road. The red color is highly saturated and unambiguous, exactly as described in the prompt. The road is a standard gray asphalt. All color requirements are met perfectly. VERDICT: PASS

PASS
A woman wearing a royal blue evening dress against a neutral background

Evaluator (Gemma 4 26B)

The image shows a woman wearing a deep, vibrant blue evening dress. The color of the dress is a true royal blue, and the background is a solid, neutral beige/tan color. The colors perfectly match the specific requirements of the prompt. VERDICT: PASS

PASS
A field of pure sunflower yellow flowers under a clear blue sky

Evaluator (Gemma 4 26B)

The image features a field of sunflowers with bright, vibrant yellow petals, accurately representing "sunflower yellow." The sky above is a clear, bright blue, matching the prompt's description. All color requirements specified in the prompt are clearly and accurately depicted in the image. VERDICT: PASS

Medium 100%
PASS
A bedroom wall painted in #FF6B35 (burnt orange) with white furniture and natural daylight

Evaluator (Gemma 4 26B)

The image features a bedroom wall with a deep, burnt orange color that closely matches the specified hex code #FF6B35. The furniture, including the bed frame, dresser, wardrobe, desk, and chair, is consistently white as requested. The lighting appears to be natural daylight, and the overall color scheme adheres strictly to the prompt's requirements. VERDICT: PASS

PASS
A forest scene where the foliage is precisely #2D5F2D (dark muted green) with brown tree trunks

Evaluator (Gemma 4 26B)

The foliage in the image consists of a dark, muted green that aligns well with the description of #2D5F2D. The tree trunks are a light brown/tan color, which matches the "brown tree trunks" requirement. The overall color palette is consistent with a dark, muted forest scene as requested. VERDICT: PASS

PASS
A room with walls in #4A90D9 (medium blue) with warm wooden floor and white trim

Evaluator (Gemma 4 26B)

The image accurately reflects the prompt's color requirements. The walls are a medium blue that aligns with the hex code #4A90D9. The floor is a warm-toned wood, and the baseboards and door frames are white, matching the "white trim" description. VERDICT: PASS

Hard 67%
PASS
A minimalist room with walls in #F5E6CC (warm cream), a sofa in #2C3E50 (dark navy), and accent pillows in #E74C3C (bright red)

Evaluator (Gemma 4 26B)

The image accurately reflects the requested colors. The walls are a warm cream/off-white, the sofa is a dark navy blue, and the accent pillows are a bright, vibrant red. All three primary color requirements from the prompt are clearly visible and match the descriptions provided. VERDICT: PASS

FAIL
A product flat-lay on a #1A1A2E (deep dark blue) background, with a #F0C27F (soft gold) watch and #FFFFFF (pure white) earbuds

Evaluator (Gemma 4 26B)

The background is a deep dark blue, consistent with #1A1A2E. The earbuds are pure white (#FFFFFF). However, the watch does not match the requested color. The prompt specifies a "soft gold" (#F0C27F) watch, but the image shows a watch with a brown leather strap and a gold-toned case. The strap color is a significant deviation from the requested gold color. VERDICT: FAIL

PASS
A graphic design composition with background #FAFAFA (near-white), main shape #FF4500 (orange-red), and accent lines #333333 (dark charcoal)

Evaluator (Gemma 4 26B)

The image follows the color specifications closely. The background is a near-white color consistent with #FAFAFA. The main circular shape is a vibrant orange-red, matching the description of #FF4500. The thin accent lines are a dark charcoal/black color, consistent with #333333. All three specified colors are accurately represented in the graphic design. VERDICT: PASS

Extreme 100%
PASS
A smooth gradient on a wall going from #FF0000 (red) on the left to #0000FF (blue) on the right, transitioning evenly through purple

Evaluator (Gemma 4 26B)

The image shows a smooth gradient transitioning from a red color on the left to a blue color on the right. The center of the image transitions through a purple hue, as requested. The colors on the far left and far right appear to match the specified red (#FF0000) and blue (#0000FF)-like tones. VERDICT: PASS

PASS
A sunset sky gradient on a flat surface from #FF6B35 (orange) at the bottom to #1B1464 (deep indigo) at the top

Evaluator (Gemma 4 26B)

The image displays a vertical gradient. At the bottom, the color is a bright orange, which aligns well with the requested #FF6B35. At the top, the color is a very dark, deep indigo/navy, which matches the requested #1B1464. The transition between the orange and the deep indigo is smooth and follows the specified color scheme accurately. VERDICT: PASS

PASS
A horizontal banner with a gradient from #00C9FF (cyan) on the left to #92FE9D (mint green) on the right

Evaluator (Gemma 4 26B)

The image shows a horizontal gradient. On the left, the color is a bright cyan, which aligns with #00C9FF. On the right, the color transitions into a light mint green, which aligns with #92FE9D. The gradient flows smoothly from left to right as requested. VERDICT: PASS
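
The gradient prompts above name two endpoint hex codes and expect an even transition between them. As a rough illustration of what "transitioning evenly" can mean numerically, the sketch below linearly interpolates two hex colors channel by channel. Interpolating directly on 8-bit sRGB values is an assumption made for simplicity, the function names are hypothetical, and nothing here reflects how the image model or the evaluator actually handles color.

```python
from typing import Tuple

RGB = Tuple[int, int, int]


def hex_to_rgb(code: str) -> RGB:
    """Convert '#RRGGBB' to an (r, g, b) tuple of 0-255 integers."""
    code = code.lstrip("#")
    return (int(code[0:2], 16), int(code[2:4], 16), int(code[4:6], 16))


def rgb_to_hex(rgb: RGB) -> str:
    """Convert an (r, g, b) tuple back to uppercase '#RRGGBB'."""
    return "#{:02X}{:02X}{:02X}".format(*rgb)


def lerp_color(start: str, end: str, t: float) -> str:
    """Linearly interpolate between two hex colors; t=0 is start, t=1 is end."""
    a, b = hex_to_rgb(start), hex_to_rgb(end)
    # Round half up per channel (an assumed rounding rule).
    mixed = tuple(int(x + (y - x) * t + 0.5) for x, y in zip(a, b))
    return rgb_to_hex(mixed)
```

For the red-to-blue prompt, `lerp_color("#FF0000", "#0000FF", 0.5)` returns `"#800080"`, a purple, which is consistent with the requested transition through purple at the midpoint.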

Photorealism 67%

Photorealism measures whether fal/bria/fast produces polished product and studio imagery with realistic materials, lighting, shadows, and finish.

Easy 67%
PASS
A single red sneaker on a clean white background with soft studio lighting and gentle shadow — product photography style

Evaluator (Gemma 4 26B)

The image successfully achieves a high-quality studio look. The lighting is soft and consistent, creating a natural shadow that grounds the shoe. The textures of the leather and fabric appear realistic, and the edges are sharp and well-defined. There are no obvious AI artifacts, such as warped laces, nonsensical stitching, or melting textures. It passes as a professional product photograph at first glance. VERDICT: PASS

FAIL
A clear glass perfume bottle on a reflective black surface with studio lighting, no background distractions

Evaluator (Gemma 4 26B)

The image contains significant AI artifacts that prevent it from passing as a real photograph. The internal straw/dip tube is physically impossible; it curves unnaturally and appears to "float" or break through the glass structure. Additionally, the reflections on the glass and the liquid surface are inconsistent and distorted. The geometry of the bottle's neck and the way the light interacts with the liquid are physically nonsensical. VERDICT: FAIL

PASS
A ripe green apple on a seamless white background with soft side lighting and subtle shadow — catalog style

Evaluator (Gemma 4 26B)

The image successfully meets the prompt requirements. It features a high-quality, realistic green apple on a seamless white background. The lighting is soft and directional, creating a natural-looking highlight on the left and a subtle, realistic shadow on the right. The texture of the skin and the stem detail are sharp and convincing, with no obvious AI artifacts or distortions visible. It looks like a professional studio product shot. VERDICT: PASS

Graphical design 54%

Data Visualisation 33%

Data Visualisation measures whether fal/bria/fast produces charts with correct labels, proportions, colors, and readable values.

Easy 33%
PASS
A bar chart showing monthly sales for January through June with values 12K, 18K, 15K, 22K, 19K, 25K — labeled axes, color-coded bars, and a title 'Monthly Sales 2025'

Evaluator (Qwen 3 VL)

The chart shows six bars with values 12K, 18K, 15K, 22K, 19K, 25K, matching the requested data. Bars are color-coded and labeled with months (Jan–Jun). Title "Monthly Sales 2025" is present. Axes are labeled. All elements are readable and correctly formatted. Chart type is a bar chart as requested. VERDICT: PASS

FAIL
A pie chart showing market share: Company A 35%, Company B 25%, Company C 20%, Others 20% — each slice a different color with labels and percentages

Evaluator (Qwen 3 VL)

The chart is a pie chart, but labels and percentages are incorrect. "Company A" is 35% (correct), "Company B" is 25% (correct), "Company C" is mislabeled as 30% (should be 20%), and "Others" is 20% (correct). Labels are illegible or misspelled. The chart type matches, but data is inaccurate and unreadable. VERDICT: FAIL

FAIL
A simple line graph showing daily website visitors over one week (Mon-Sun): 1200, 1500, 1100, 1800, 2000, 1700, 1400 — with grid lines, axis labels, and title 'Weekly Traffic'

Evaluator (Qwen 3 VL)

The chart misrepresents the data: axes are mislabeled (e.g., "1200" appears twice on Y-axis), days are illegible (e.g., "Chs", "Dang"), and values are unreadable scribbles. The title "Weekly wslster" is incorrect. The chart type is a line graph, but the visual elements are garbled and unrecognizable. Core requirements for readability and accuracy are not met. VERDICT: FAIL
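The Data Visualisation checks come down to comparing the values requested in the prompt with the values that appear in the image. A minimal sketch of that comparison for the pie chart case follows. It assumes the rendered values have already been read off the image; here they are transcribed by hand from the evaluator's note, and `find_mismatches` is a hypothetical helper, not something ImageBench is known to use.

```python
def find_mismatches(requested, rendered):
    """Return {label: (expected, actual)} for every label whose value differs.

    A label missing from `rendered` is reported with actual=None.
    """
    return {
        label: (expected, rendered.get(label))
        for label, expected in requested.items()
        if rendered.get(label) != expected
    }


# Values from the pie chart prompt (percent market share).
requested = {"Company A": 35, "Company B": 25, "Company C": 20, "Others": 20}

# Values as described in the evaluator's note; transcribed by hand,
# not extracted from the image.
rendered = {"Company A": 35, "Company B": 25, "Company C": 30, "Others": 20}

mismatches = find_mismatches(requested, rendered)

# A pie chart's slices should account for the whole.
shares_sum_to_100 = sum(rendered.values()) == 100
print(mismatches, shares_sum_to_100)
```

With the values above, the only mismatch is Company C, and the rendered shares add up to 110 rather than 100, which is a second, independent sign that the chart is wrong.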

Layout & Design 33%

Layout & Design measures whether fal/bria/fast organizes posters, menus, covers, and other designed compositions with the requested hierarchy and elements.

Easy 67%
PASS
A movie night poster with a large popcorn image in the center, the title 'MOVIE NIGHT' in bold at the top, and the tagline 'Every Friday at 8 PM' at the bottom

Evaluator (Qwen 3 VL)

The poster features a large popcorn image centered, with "MOVIE NIGHT" in bold at the top and "Every Friday at 8 PM" at the bottom. Text is readable and correctly placed. Visual hierarchy is clear: title dominates top, popcorn central, tagline bottom. All specified layout elements are present and accurately rendered. VERDICT: PASS

PASS
An event poster featuring a guitar illustration, the headline 'LIVE MUSIC FEST' in large text, and 'Tickets available now' as a tagline below

Evaluator (Qwen 3 VL)

The poster features the headline "LIVE MUSIC FEST" in large, bold text with a guitar illustration integrated into the lettering. Below, "Tickets available now" appears in a smaller bar. Text is readable and correctly placed. Visual hierarchy is clear, with headline dominating, tagline secondary. All required elements are present and correctly styled. VERDICT: PASS

FAIL
A book club poster with an open book illustration, the title 'JOIN THE BOOK CLUB' prominently displayed, and 'Meets every Wednesday' as the tagline

Evaluator (Qwen 3 VL)

The poster features an open book illustration and the title "JOIN THE BOOK CLUB" prominently displayed with correct visual hierarchy. However, the tagline reads "Meets every Wemssday" (spelled incorrectly) and is positioned above the title, not as intended. The "Free" box is misplaced in the top-right corner. Text is readable but placement and accuracy fail the prompt. VERDICT: FAIL

Medium 0%
FAIL
A landing page design with a hero header image at the top, a headline and subtitle in the middle, a 'Sign Up' call-to-action button, and a footer with three social media icons

Evaluator (Qwen 3 VL)

The image lacks a clear hero header image at the top; instead, it shows a banner with unreadable text. The “Sign Up” headline is present but below a non-hero section. The CTA button is visible, and social icons appear at the bottom. However, all text is gibberish or illegible, violating the prompt’s requirement for readable, correctly placed text and proper layout hierarchy. VERDICT: FAIL

FAIL
A newsletter layout with a banner image header, a two-column body section with text on the left and an image on the right, a highlighted quote block, and a footer with unsubscribe link

Evaluator (Qwen 3 VL)

The image shows a newsletter layout with a header banner, two-column body (text left, image right), a highlighted quote block, and a footer with "Unsscrirbe" and "INssoripe". All elements are visually present. However, all text is gibberish placeholder (e.g., "Dapsimke", "Orhrm", "Incdeleon_ma"), making it unreadable and violating the requirement for correct, readable text placement. VERDICT: FAIL

FAIL
A restaurant menu layout with a logo header, three category sections (appetizers, mains, desserts) each with items and prices, and a footer with address and hours

Evaluator (Qwen 3 VL)

The image shows a menu layout with a logo header, three category sections (Appetizers, Mains, Desserts), and a footer with address/hours. However, all text is gibberish or placeholder (e.g., “Niinpe y’tauu”), making it unreadable and violating the prompt’s requirement for correct, legible text placement. Visual hierarchy is present but meaningless due to illegible content. VERDICT: FAIL

Hard 33%
FAIL
A magazine double-page spread with a left page containing a large hero photo, and a right page with a two-column article, a pull quote, a sidebar with author bio, and a page number

Evaluator (Qwen 3 VL)

Left page features a large hero photo as required. Right page shows two columns, a pull quote, and a sidebar with author bio. Page number is visible. However, all text is gibberish and unreadable, violating the requirement for readable text. Layout structure is correct, but content fails functional criteria. VERDICT: FAIL

FAIL
A mobile app UI mockup showing a dashboard with a top navigation bar, a search field, a 2x2 grid of feature cards with icons, a list of recent activity items, and a bottom tab bar

Evaluator (Qwen 3 VL)

The image shows a mobile UI with a top navigation bar, search field, 2x2 grid of feature cards, and bottom tab bar — all present. However, all text is gibberish or unreadable, and icons are misaligned or distorted. Layout hierarchy is visually intact but content is invalid. Core requirement (readable, correctly placed text) is not met. VERDICT: FAIL

PASS
A fashion magazine spread with a full-bleed left-page photo, right page with three-column layout, headline, body copy, a smaller inset image, and styled page numbers

Evaluator (Qwen 3 VL)

Left page has full-bleed photo. Right page shows headline “FASHION TRENDS,” body copy in two columns, inset image, and styled page numbers. Text is readable and correctly placed. Layout hierarchy is clear and matches prompt. All described elements are present and correctly implemented. VERDICT: PASS

Style Diversity 75%

Style Diversity measures whether fal/bria/fast can shift between visual styles like oil painting, pixel art, watercolor, and technical illustration.

Easy 67%
PASS
A golden retriever sitting in a garden, painted in the style of an oil painting with visible brushstrokes and rich texture

Evaluator (Qwen 3.5 122B)

The image clearly depicts a golden retriever sitting in a garden setting. The visual style strongly resembles an oil painting, with visible, textured brushstrokes throughout the background foliage and on the dog’s fur. The color blending and impasto-like texture are consistent with traditional oil painting techniques. The key characteristics requested — oil painting style, visible brushstrokes, and rich texture — are all present and clearly recognizable. VERDICT: PASS

PASS
A golden retriever sitting in a garden, rendered in Japanese anime style with flat colors and large expressive eyes

Evaluator (Qwen 3.5 122B)

The image displays a golden retriever in a garden setting with stylized, anime-like features including large expressive eyes and clean outlines. The coloring is relatively flat with minimal shading gradients, consistent with anime aesthetics. The character design emphasizes cuteness and clarity typical of Japanese animation. While not perfectly matching every anime sub-style, the core requested characteristics are clearly present and recognizable. VERDICT: PASS

FAIL
A golden retriever sitting in a garden, rendered in pixel art style with a 32x32 grid and limited color palette

Evaluator (Qwen 3.5 122B)

The image depicts a golden retriever in a garden setting, but the requested pixel art style is not clearly recognizable. While the image has a blocky, low-resolution appearance, it lacks the defining characteristics of true pixel art: a strict 32x32 grid and a deliberately limited color palette. The rendering appears more like a heavily compressed or blurred photograph than intentional pixel art with constrained colors and precise grid alignment. VERDICT: FAIL

Medium 100%
PASS
A city street scene in the style of Art Deco — geometric shapes, gold and black palette, ornamental symmetry

Evaluator (Qwen 3.5 122B)

The image strongly exhibits the requested Art Deco style. Key characteristics are clearly present: the color palette is strictly gold and black; the architecture features geometric shapes, stepped forms, and ornamental symmetry typical of the era; and the streetlamp and pavement patterns reinforce the decorative, stylized aesthetic. The visual language is consistent with the prompt. VERDICT: PASS

PASS
A woman with a parasol in the style of Impressionism — soft brushstrokes, dappled light, pastel palette

Evaluator (Qwen 3.5 122B)

The image clearly exhibits Impressionist characteristics: visible, soft brushstrokes throughout the composition, dappled light effects created by broken color patches in the foliage and ground, and a predominantly pastel palette with gentle pinks, blues, and greens. The woman and parasol are rendered with loose, painterly technique rather than sharp detail, matching the requested style. Background figures and environment are abstracted into colorful strokes, consistent with Impressionism. VERDICT: PASS

PASS
A portrait of a man in Pop Art style — bold outlines, Ben-Day dots, flat vivid primary colors

Evaluator (Qwen 3.5 122B)

The image clearly exhibits Pop Art characteristics: bold black outlines define the subject, Ben-Day dots are used extensively for shading and texture (especially on the face and neck), and flat vivid colors dominate — notably the yellow background, blue collar, and red dot shading. The stylized, graphic quality matches the requested aesthetic precisely. VERDICT: PASS

Hard 67%
PASS
A forest rendered in photorealistic watercolor — the detail and lighting of a photo combined with visible watercolor bleeds and paper texture

Evaluator (Qwen 3.5 122B)

The image displays a forest scene with soft, blended edges and a textured, painterly quality that mimics watercolor paper. There are visible bleeds and washes of color, particularly in the background mist and foliage, while maintaining realistic lighting and detail in the foreground trees and stream. The overall aesthetic successfully combines photographic realism with watercolor techniques. VERDICT: PASS

PASS
A woman in a futuristic city rendered in cyberpunk Art Nouveau — neon colors and tech elements combined with flowing organic lines and floral borders

Evaluator (Qwen 3.5 122B)

The image successfully embodies the requested "cyberpunk Art Nouveau" style. Key characteristics are clearly present: the futuristic city setting with neon colors (pinks, cyans) fulfills the cyberpunk aspect, while the flowing organic lines, floral borders, and ornate, curved structures reflect the Art Nouveau influence. The combination of tech elements and natural motifs is well-executed. VERDICT: PASS

FAIL
A medieval castle depicted in Japanese ukiyo-e woodblock print style combined with photorealistic sky and lighting

Evaluator (Qwen 3.5 122B)

The image displays a medieval castle with stylized, cartoon-like architecture and flat coloring, which does not align with the intricate line work, textured paper effects, or specific compositional rules of traditional ukiyo-e woodblock prints. While the sky has some painterly qualities, it lacks photorealism and instead appears illustrative. The overall aesthetic is closer to modern digital concept art or anime backgrounds than to the requested hybrid style. VERDICT: FAIL

Extreme 67%
PASS
An interior scene rendered in soft brutalism — raw concrete textures combined with rounded pastel furniture and warm diffused light

Evaluator (Qwen 3.5 122B)

The image clearly exhibits soft brutalism: raw concrete walls and pillars dominate the background, while rounded pastel furniture (pink, mint, cream) fills the foreground. Warm diffused lighting casts gentle glows on surfaces, enhancing texture contrast. All key visual characteristics — brutalist materials, pastel curves, ambient warmth — are present and well-integrated. The style is unmistakable and faithfully rendered. VERDICT: PASS

FAIL
A still life painted in anxious minimalism — sparse composition with uncomfortable negative space, muted colors, and subtle visual tension

Evaluator (Qwen 3.5 122B)

The image exhibits a sparse composition with a single cigarette in an ashtray against a large, muted background, fitting the "anxious minimalism" description. The colors are desaturated and the negative space is prominent. However, the object itself is structurally flawed—the cigarette appears to be melting into or fused with the ashtray in an unnatural way, and the smoke lacks realistic flow. While the mood is correct, the execution of the subject matter is poor. VERDICT: FAIL

PASS
A cityscape rendered in nostalgic futurism — retro-futuristic 1960s space-age aesthetics with a melancholic warm-toned patina

Evaluator (Qwen 3.5 122B)

The image strongly exhibits the requested "nostalgic futurism" style. Key characteristics are clearly present: the 1960s retro-futuristic aesthetic is evident in the vintage car, old television, and record player. The cityscape features stylized, space-age architecture with sleek, tapered spires. The color palette perfectly matches the "melancholic warm-toned patina" requirement, utilizing soft teals, oranges, and sepia tones to create a hazy, dreamlike atmosphere. VERDICT: PASS
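The percentages shown for Graphical design can be reproduced from the individual PASS/FAIL verdicts listed above. The sketch below is one way to do that roll-up, assuming the three sub-categories shown here are the only ones in the category and that rates are computed by pooling raw pass counts; the page does not state the aggregation rule, and the `results` layout and function names are hypothetical.

```python
# (passed, total) per difficulty tier, counted from the verdicts listed above.
results = {
    "Data Visualisation": {"Easy": (1, 3)},
    "Layout & Design": {"Easy": (2, 3), "Medium": (0, 3), "Hard": (1, 3)},
    "Style Diversity": {
        "Easy": (2, 3),
        "Medium": (3, 3),
        "Hard": (2, 3),
        "Extreme": (2, 3),
    },
}


def pass_rate(passed, total):
    """Pass rate as a whole-number percentage."""
    return round(100 * passed / total)


def pool(tiers):
    """Sum passed and total counts across a sub-category's tiers."""
    passed = sum(p for p, _ in tiers.values())
    total = sum(t for _, t in tiers.values())
    return passed, total


# Sub-category rates from pooled tier counts.
subcategory_rates = {name: pass_rate(*pool(tiers)) for name, tiers in results.items()}

# Category rate from counts pooled across all sub-categories.
category_passed = sum(pool(tiers)[0] for tiers in results.values())
category_total = sum(pool(tiers)[1] for tiers in results.values())
category_rate = pass_rate(category_passed, category_total)

print(subcategory_rates, category_rate)
```

Pooling gives 13 passes out of 24 evaluations, which matches the reported 54%. Averaging the three sub-category percentages instead would give roughly 47%, so the pooled-count assumption is at least consistent with the figures on this page.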