ImageBench

ImageBench V1

192 evaluations across 6 categories

Benchmark V1 verdicts are produced by VLM judges and can contain mistakes. Treat PASS/FAIL labels as machine-assisted assessments, and inspect the images yourself.
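Complete evaluator notes in this report end with a `VERDICT: PASS` or `VERDICT: FAIL` marker, so the labels can be pulled out mechanically before a manual review. The following is a minimal sketch of that extraction, not the benchmark's own tooling. The function name `extract_verdict` and the choice to return `None` when a note has no marker (for example, a note cut off mid-sentence) are assumptions made for illustration.

```python
import re
from typing import Optional

# Matches the "VERDICT: PASS" / "VERDICT: FAIL" marker used in the evaluator notes.
_VERDICT_RE = re.compile(r"VERDICT:\s*(PASS|FAIL)\b")


def extract_verdict(evaluator_text: str) -> Optional[str]:
    """Return 'PASS' or 'FAIL', or None when the note carries no marker."""
    matches = _VERDICT_RE.findall(evaluator_text)
    # Take the last marker so that an earlier quoted mention does not win.
    return matches[-1] if matches else None


print(extract_verdict("No obvious artifacts or misrendering. VERDICT: PASS"))
print(extract_verdict("the umbrella seems to float above rather than being held"))
```

Notes that come back as `None` are the ones to check by hand against the image.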

Pass: 154
Fail: 38
Pass Rate: 80%
Avg Latency: 18.5s
Min Latency: 14.6s
Max Latency: 26.1s
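The headline figures are consistent with a plain ratio of passes to total evaluations. A short sketch of that arithmetic follows; the helper name `pass_rate_from_counts` is hypothetical, and whether the site rounds in exactly this way is an assumption.

```python
def pass_rate_from_counts(passed: int, failed: int) -> float:
    """Fraction of evaluations labelled PASS."""
    total = passed + failed
    if total == 0:
        raise ValueError("no evaluations to summarise")
    return passed / total


passed, failed = 154, 38
print(passed + failed)                                   # 192
print(f"{pass_rate_from_counts(passed, failed):.1%}")    # 80.2%
```

The total of 192 matches the stated number of evaluations, and 80.2% rounds to the displayed 80%.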
Text Rendering: 87%
Spatial Reasoning: 90%
Human realism: 83%
Truthfulness: 74%
Professional Studio: 89%
Graphical design: 46%
Latency: 14%

Text Rendering 87%

Typography Style 100%

Typography Style measures whether fal/krea/v2-medium preserves requested lettering styles, font mood, and text texture such as typewriter ink or neon signage.

Easy 100%
PASS: The word 'CHAPTER ONE' typed on aged paper with a vintage typewriter font, complete with slightly uneven ink

Evaluator (Qwen 3 VL)

Text reads “CHAPTER ONE” correctly. Font mimics vintage typewriter style with distinct serif edges and consistent spacing. Paper shows aged, slightly yellowed texture. Ink appears uniform, lacking the “slightly uneven” quality requested. Overall, medium is convincingly typewriter-like, though ink uniformity is a minor shortfall. VERDICT: PASS

PASS: The phrase 'TODAY'S SPECIAL' written in white chalk on a dark green chalkboard with chalk dust

Evaluator (Qwen 3 VL)

Text reads "TODAY'S SPECIAL" correctly. Letters show convincing chalk texture, smudges, and dust on dark green surface. Handwritten style matches chalkboard medium. No obvious artifacts or misrendering. VERDICT: PASS

PASS: The word 'OPEN' displayed on a red LED sign with glowing pixel-style letters

Evaluator (Qwen 3 VL)

The word "OPEN" is rendered in glowing red pixel-style letters, matching an LED sign. The text is spelled correctly. The pixelated, dot-matrix appearance convincingly simulates an LED display with visible individual lights and ambient glow. VERDICT: PASS

Writing accuracy 83%

Writing accuracy measures whether fal/krea/v2-medium renders the exact requested words, numbers, labels, and short phrases without spelling or layout errors.

Easy 100%
PASS: A single large word 'HELLO' centered on a white background

Evaluator (Qwen 3 VL)

The word "HELLO" is spelled correctly and fully readable. All letters are intact, clearly formed, and legible against the white background. No misspellings or missing characters are present. The text is centered as intended. VERDICT: PASS

PASS: The word 'ADVENTURE' in big bold letters centered on a plain cream background

Evaluator (Qwen 3 VL)

The word "ADVENTURE" is spelled correctly and fully readable in bold, 3D-style letters. All letters are intact and clearly visible against the plain cream background. No misspellings or illegible parts exist. The text is centered as intended. VERDICT: PASS

PASS: The word 'SERENITY' written large and centered on a solid light blue background

Evaluator (Qwen 3 VL)

The word "SERENITY" is fully spelled correctly and clearly readable in white, centered on a solid light blue background. All letters are intact and legible. No misspellings or missing parts. The text matches the intended specification exactly. VERDICT: PASS

Medium 100%
PASS: The sentence 'The quick brown fox jumps high' in clean black text on a white background

Evaluator (Qwen 3 VL)

All words are spelled correctly and fully readable: "The", "quick", "brown", "fox", "jumps", "high". The text is clean black on white background. No misspellings or illegible parts. The word "fox" has a decorative underline, but this is not a spelling error. Accuracy matches the intended text exactly. VERDICT: PASS

PASS: 'Every sunset brings a new dawn ahead' written neatly on a pale yellow background

Evaluator (Qwen 3 VL)

Text reads: "Every sunset brings a new dawn ahead". All words are spelled correctly and fully readable. Handwritten style is slightly stylized but unambiguous. No misspellings or missing words. Background is pale yellow as intended. VERDICT: PASS

PASS: 'Stars shine brightest in the dark night' displayed on a clean off-white background

Evaluator (Qwen 3 VL)

Text reads: "Stars shine brightest in the dark night." — matches intended phrase exactly. All words are spelled correctly, complete, and fully legible against the off-white background. No misspellings or missing parts detected. Font has slight texture but doesn't impair readability. VERDICT: PASS

Hard 100%
PASS: A poster with headline 'SUMMER SALE' in large bold text, body text 'Up to 50% off all items this weekend only' in medium size, and fine print 'Terms and conditions apply. See store for details.' at the bottom in small text

Evaluator (Qwen 3 VL)

All text matches the intended content exactly. "SUMMER SALE" is correct. Body text "Up to 50% off all items this weekend only" is fully readable. Fine print "Terms and conditions apply. See store for details." is present and legible. No misspellings or missing words detected. Text hierarchy and placement align with specifications. VERDICT: PASS

PASS: A flyer with large title 'GRAND OPENING', subtitle 'Join us for food, music, and fun' in medium text, and small disclaimer 'RSVP required. Limited seating available.' at the bottom

Evaluator (Qwen 3 VL)

All text matches the intended content exactly: "GRAND OPENING", "Join us for food, music, and fun.", and "RSVP required. Limited seating available." No misspellings, missing words, or illegible parts. Text is fully readable and correctly formatted. VERDICT: PASS

PASS: An event banner with headline 'ANNUAL GALA 2025' at the top in large serif font, description 'An evening of fine dining and live jazz' in medium text, and footnote 'Black tie optional. Doors open at 6 PM.' in small text

Evaluator (Qwen 3 VL)

The text reads: "ANNUAL GALA 2025", "SPONSORED BY", "An evening of fine dining and live jazz", and "Black tie optional. Doors open at 6 PM." All words are spelled correctly and fully readable. The intended description and footnote match exactly. No misspellings or illegible parts. VERDICT: PASS

Extreme 33%
PASS: A vintage movie poster for 'THE LAST VOYAGE' featuring a tagline 'One ship. One chance. No turning back.' with credits block reading 'Directed by James Harlow. Starring Elena Voss, Marcus Chen, and Sofia Reyes. Music by Thomas Grant. A Paramount Pict

Evaluator (Qwen 3 VL)

All text matches the intended content exactly: tagline, credits, and release info are fully readable and correctly spelled. No misspellings or missing words detected. The text integrates realistically into the poster design as specified. VERDICT: PASS

FAIL: A magazine cover titled 'NATIONAL EXPLORER' with cover lines 'The Hidden Temples of Cambodia — page 34', 'Interview: Dr. Maria Santos on Climate Action', 'Top 10 Trails You Must Hike This Year', and a barcode area with 'Issue 247 | March 2025 | $6.99

Evaluator (Qwen 3 VL)

“Cambida” is misspelled (should be “Cambodia”). “Top 10 Trails You Must Hike This Year.” is missing a period at the end. All other text matches the intended content. The barcode area is readable. VERDICT: FAIL
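A word-level comparison between the requested text and the text read from the image is one way to surface a misspelling like the one reported here. The sketch below uses Python's standard `difflib`. The rendered string is a hypothetical transcription built from the evaluator's note (only "Cambida" is attested), and `word_diffs` is an illustrative helper, not the judge's actual method.

```python
from difflib import SequenceMatcher
from typing import List, Tuple


def word_diffs(expected: str, rendered: str) -> List[Tuple[str, str]]:
    """Return (expected words, rendered words) for every span that differs."""
    exp, got = expected.split(), rendered.split()
    matcher = SequenceMatcher(a=exp, b=got, autojunk=False)
    diffs = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            diffs.append((" ".join(exp[i1:i2]), " ".join(got[j1:j2])))
    return diffs


expected = "The Hidden Temples of Cambodia"
rendered = "The Hidden Temples of Cambida"  # hypothetical transcription of the image
print(word_diffs(expected, rendered))  # [('Cambodia', 'Cambida')]
```

An empty list means the two strings agree word for word; any non-empty result points at the span to inspect.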

FAIL: A product label for 'GOLDEN HARVEST ORGANIC HONEY' with subtitle 'Raw & Unfiltered', weight '12 oz (340g)', description 'Pure wildflower honey harvested from sustainable apiaries in the Pacific Northwest. No additives or preservatives.', and small te

Evaluator (Qwen 3 VL)

“Unfiltered” is misspelled as “Unfliltered”. “Pacific Northwest” is partially obscured and illegible. “Store in cool, dry place” is partially obscured and illegible. “No additives or preservatives” is partially obscured and illegible. “Product of USA” is partially obscured and illegible. Multiple critical words are unreadable or misspelled. VERDICT: FAIL
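The Text Rendering percentages shown above are consistent with simple PASS-over-total ratios at each level: tier, subcategory, and category. The sketch below tallies the verdicts listed in this section to illustrate that. The row layout and the `pass_rate` helper are assumptions for illustration, and the site may aggregate differently.

```python
from collections import Counter

# (subcategory, tier, verdict) rows transcribed from the Text Rendering entries above.
ROWS = (
    [("Typography Style", "Easy", "PASS")] * 3
    + [("Writing accuracy", "Easy", "PASS")] * 3
    + [("Writing accuracy", "Medium", "PASS")] * 3
    + [("Writing accuracy", "Hard", "PASS")] * 3
    + [("Writing accuracy", "Extreme", "PASS")] * 1
    + [("Writing accuracy", "Extreme", "FAIL")] * 2
)


def pass_rate(rows, subcategory=None, tier=None):
    """Share of PASS verdicts among the rows matching the optional filters."""
    picked = [
        verdict
        for sub, t, verdict in rows
        if (subcategory is None or sub == subcategory)
        and (tier is None or t == tier)
    ]
    if not picked:
        raise ValueError("no rows match the given filters")
    return Counter(picked)["PASS"] / len(picked)


print(f"{pass_rate(ROWS, 'Writing accuracy', 'Extreme'):.0%}")  # 33%
print(f"{pass_rate(ROWS, 'Writing accuracy'):.0%}")             # 83%
print(f"{pass_rate(ROWS):.0%}")                                 # 87%
```

With three prompts per tier, a single failure moves a tier by 33 points, so tier-level percentages in this report are coarse.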

Spatial Reasoning 90%

Attributes Binding 100%

Attributes Binding measures whether fal/krea/v2-medium attaches the right colors, materials, and properties to the correct requested objects.

Easy 100%
PASS: A red ball and a blue cube sitting side by side on a white surface

Evaluator (Qwen 3.5 122B)

The image shows two objects on a white surface. On the left is a red sphere (ball). On the right is a blue cube. Both objects are correctly colored and shaped as specified. They are positioned side by side. All attributes (color, shape) are correctly bound to their respective objects with no swaps. VERDICT: PASS

PASS: A green apple and an orange pumpkin on a wooden cutting board

Evaluator (Qwen 3.5 122B)

The image shows a green apple with a leaf, positioned in the foreground on a wooden cutting board. Behind it is an orange pumpkin, also on the board. Both objects match their described attributes: color (green apple, orange pumpkin), shape (round apple, ribbed pumpkin), and placement (on wooden board). No attribute swaps or misbindings are present. VERDICT: PASS

PASS: A yellow taxi and a black limousine parked next to each other on a street

Evaluator (Qwen 3.5 122B)

The image shows two vehicles parked nose-to-nose on a street. The vehicle on the left is yellow and has taxi markings (including a "T" logo and roof light), matching the description of a yellow taxi. The vehicle on the right is black, elongated, and appears to be a limousine or luxury sedan, consistent with the prompt’s “black limousine.” Both objects are correctly attributed by color and type; no attribute swaps are present. VERDICT: PASS

Medium 100%
PASS: A large red circle, a small blue triangle, and a medium green star arranged on a white canvas

Evaluator (Qwen 3.5 122B)

The image contains three distinct geometric shapes on a white background. The red circle is large and positioned on the right. The blue triangle is small and located in the upper left. The green star is medium-sized and situated between the other two shapes. All attributes (color, size, shape) are correctly bound to their respective objects with no swaps or errors. VERDICT: PASS

PASS: A tall thin white candle, a short fat red candle, and a medium spiral yellow candle on a shelf

Evaluator (Qwen 3.5 122B)

The image shows three candles on a shelf:
1. Left: Tall, thin, white candle (matches description).
2. Center: Short, fat, red candle (matches description).
3. Right: Medium height, spiral-shaped, yellow candle (matches description).
All attributes (color, size, shape) are correctly bound to each object with no swaps. VERDICT: PASS

PASS: A big brown horse, a small white rabbit, and a medium-sized black cat standing in a meadow

Evaluator (Qwen 3.5 122B)

The image shows a brown horse, a white rabbit, and a black cat in a meadow. The horse is large and brown, the rabbit is small and white, and the cat is medium-sized and black. All attributes (color, size, shape) are correctly bound to the right objects with no swaps. VERDICT: PASS

Hard 100%
PASS: A pink fire truck, a blue banana, and a green sun in a surreal landscape

Evaluator (Qwen 3.5 122B)

The image shows a pink fire truck with yellow ladder and green-tinted windows, correctly matching the prompt. A large blue banana is present beside it, also matching. The sky features a large green circular object resembling a sun or moon — while “sun” was requested, its appearance is more lunar, but color and placement align with surreal intent. No attribute swaps occur. VERDICT: PASS

PASS: A purple carrot, a red tree with red leaves, and a white watermelon sliced open

Evaluator (Qwen 3.5 122B)

The image contains:
- A purple carrot with green leaves (correct color and shape)
- A red tree with red leaves and white trunk (correct colors)
- A sliced watermelon that is white inside with green rind (correct appearance)
All three objects are present with their specified attributes correctly bound. The purple carrot is purple, the tree has red leaves, and the watermelon slice shows white flesh with green rind. No attribute swaps detected. VERDICT: PASS

PASS: An orange elephant, a yellow penguin, and a silver strawberry arranged on a table

Evaluator (Qwen 3.5 122B)

The image shows three objects on a wooden surface:
1. Orange elephant – correct color and shape.
2. Yellow penguin – correct color and shape.
3. Silver strawberry – correct color and shape (metallic silver body with green stem).
All attributes are correctly bound to the right objects; no swaps or mismatches observed. Lighting and composition are clean, and object identities are unambiguous. VERDICT: PASS

Compositionality 100%

Compositionality measures whether fal/krea/v2-medium combines subjects, actions, outfits, props, and scene details into one coherent requested image.

Easy 100%
PASS: A tabby cat wearing oversized red sunglasses

Evaluator (Qwen 3.5 122B)

The image shows a tabby cat with distinct striped fur patterns. It is wearing large, red-framed sunglasses that cover its eyes. The combination is coherent and visually integrated — the glasses sit naturally on the cat’s face, and the tabby markings are clearly visible around them. All prompt elements are present and well-combined. VERDICT: PASS

PASS: A golden retriever wearing a tiny cowboy hat

Evaluator (Qwen 3.5 122B)

The image shows a golden retriever with a small cowboy hat on its head. The dog’s fur is golden, and the hat is proportionally tiny relative to the dog’s head, fitting the description. Both elements are clearly present and coherently combined — the hat sits naturally atop the dog’s head, and the breed is unmistakably a golden retriever. No contradictions or missing components. VERDICT: PASS

PASS: A penguin holding a colorful umbrella

Evaluator (Qwen 3.5 122B)

The image contains:
- Penguin: Present (rockhopper penguin with distinctive yellow crest)
- Colorful umbrella: Present (orange, yellow, pink panels)
- Holding: The umbrella appears positioned above the penguin, but there's no visible mechanism showing how it's being held - the shaft extends downward toward the penguin's head area
While all core elements are present, the "holding" aspect lacks clear visual coherence - the umbrella seems to float above rather than being actively held by the pengu

Medium 100%
PASS: A medieval knight in full armor sitting at a desk typing on a modern laptop

Evaluator (Qwen 3.5 122B)

The image shows a knight in full plate armor seated at a wooden desk, typing on a modern laptop. The setting includes Gothic architecture (towers, arched windows) and soft backlighting. All prompt elements are present: medieval knight, full armor, desk, modern laptop, and coherent combination. No contradictions or missing components. VERDICT: PASS

PASS: A Roman gladiator taking a selfie with a smartphone in the Colosseum

Evaluator (Qwen 3.5 122B)

The image contains all required elements:
1. Roman gladiator - Present (wearing traditional armor and helmet)
2. Taking a selfie - Present (holding phone up in selfie pose)
3. Smartphone - Present (modern black smartphone in hand)
4. Colosseum - Present (ancient Roman amphitheater in background)
All elements are coherently combined in a single scene with appropriate perspective and lighting. The anachronistic combination is intentionally depicted as requested. VERDICT: PASS

PASS: An astronaut in a spacesuit riding a bicycle through a park

Evaluator (Qwen 3.5 122B)

The image contains all required elements:
1. Astronaut in a spacesuit - Present (white suit with blue stripes, helmet, backpack)
2. Riding a bicycle - Present (black mountain bike)
3. Through a park - Present (trees, grass, bench, paved path)
All elements are coherently combined in a single scene. The astronaut is actively riding the bike through what appears to be a park setting. VERDICT: PASS

Hard 100%
PASS: An octopus wearing a top hat and monocle playing chess on the surface of the moon with Earth visible in the background

Evaluator (Qwen 3.5 122B)

The image contains all described elements:
1. Octopus - Present (blue, large)
2. Top hat - Present (black, on octopus head)
3. Monocle - Present (gold-rimmed, over eye)
4. Chess game - Present (board with pieces)
5. Moon surface - Present (cratered gray terrain)
6. Earth in background - Present (blue planet visible)
All elements are coherently combined in a surreal but visually consistent scene. VERDICT: PASS

PASS: A dinosaur in a business suit giving a PowerPoint presentation in a modern office while drinking coffee

Evaluator (Qwen 3.5 122B)

The image contains all required elements:
- Dinosaur: Present (raptor-like)
- Business suit: Present (gray suit, white shirt, green tie)
- PowerPoint presentation: Present (on wall-mounted screen with charts)
- Modern office: Present (large windows, city view, wooden floor)
- Drinking coffee: Present (holding coffee cup in claw)
All elements are coherently combined in a single scene. The dinosaur is actively presenting while holding coffee, fitting the prompt exactly. VERDICT: PASS

PASS: A robot dressed as a pirate sailing a paper boat on a lake of lava while juggling three kittens

Evaluator (Qwen 3.5 122B)

The image contains:
- Robot dressed as a pirate: Present (wearing pirate hat with skull)
- Paper boat: Present (white origami boat in foreground)
- Lake of lava: Present (molten orange/red background with volcanoes)
- Three kittens: Present (four kittens actually: one gray, one orange, one black, one tabby)
- Juggling: Not accurately depicted; kittens are floating/flying around robot, not being juggled
Core elements are present but “juggling” is misinterpreted. Still, the unusual concept is c

Counting 78%

Counting measures whether fal/krea/v2-medium creates the exact number of requested objects without adding or missing items.

Easy 100%
PASS: Exactly 3 red apples arranged on a wooden table

Evaluator (Qwen 3.5 122B)

The image displays three red apples arranged on a wooden surface. The apples are positioned in a triangular formation, with one in front and two behind. The lighting creates distinct shadows, enhancing the realism. The count of apples matches the requested number exactly. VERDICT: PASS

PASS: Exactly 5 yellow rubber ducks sitting in a row on a bathroom shelf

Evaluator (Qwen 3.5 122B)

The image displays five yellow rubber ducks arranged in a row on a white surface, likely a bathroom shelf. The background is softly blurred with light blue tones. All five ducks are clearly visible and match the description. The count exactly matches the requested number. VERDICT: PASS

PASS: Exactly 4 white candles standing on a dark surface

Evaluator (Qwen 3.5 122B)

The image displays exactly four white candles arranged in descending height order from left to right. They are standing upright on a dark, flat surface against a dark background. The count matches the prompt’s request precisely: four candles, white, standing, on a dark surface. No extra or missing objects. Lighting and shadows are consistent with realism. VERDICT: PASS

Medium 100%
PASS: 2 orange cats and 3 black dogs sitting together on a grassy lawn

Evaluator (Qwen 3.5 122B)

The image shows exactly 2 orange cats in the foreground and 3 black dogs positioned behind them on a grassy lawn. The count of each animal type matches the prompt’s request precisely. All animals are sitting or positioned as described, and the setting is accurate. VERDICT: PASS

PASS: 4 red roses and 2 white daisies in a glass vase on a table

Evaluator (Qwen 3.5 122B)

The image shows 4 red roses and 2 prominent white daisies in a glass vase on a wooden table. There are also smaller, less distinct daisy-like flowers in the background, but the main subjects match the requested count exactly. The core criteria — 4 roses and 2 daisies — is met. VERDICT: PASS

PASS: 1 blue butterfly and 3 yellow butterflies resting on a garden fence

Evaluator (Qwen 3.5 122B)

The image shows exactly 1 blue butterfly and 3 yellow butterflies resting on a wooden garden fence. The counts match the prompt’s request precisely: one blue, three yellow. All butterflies are clearly visible and positioned as described. No extra or missing butterflies. The scene is well-composed and matches the specified object types and quantities. VERDICT: PASS

Hard 33%
FAIL: Exactly 9 goldfish swimming in a large round fishbowl

Evaluator (Qwen 3.5 122B)

The image shows a round glass fishbowl filled with water and containing multiple goldfish. Counting the fish: there are 10 goldfish visible — arranged in three rows (top: 3, middle: 4, bottom: 3). The prompt requested exactly 9 goldfish. Since the count is 10, not 9, the requirement is not met. VERDICT: FAIL

PASS: A fruit bowl containing exactly 3 bananas, 4 oranges, and 2 apples

Evaluator (Qwen 3.5 122B)

The image shows a fruit bowl with:
- 3 bananas (clearly visible in the front)
- 4 oranges (three fully visible, one partially hidden behind bananas on the right)
- 2 apples (both red, on the left side)
All counts match the requested quantities exactly. VERDICT: PASS

FAIL: Exactly 11 colorful balloons floating against a clear blue sky

Evaluator (Qwen 3.5 122B)

The image shows colorful balloons against a blue sky. Counting them: there are 10 balloons visible (2 pink, 2 red, 2 yellow, 2 orange, 2 purple). The prompt requested exactly 11 balloons. Since the count is off by one, the core criterion is not met. VERDICT: FAIL
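The Hard-tier counting verdicts come down to comparing a requested count with the count the judge reports. A minimal sketch of that comparison follows, using the numbers given in the evaluator notes above (goldfish in rows of 3, 4, and 3; five balloon colors with 2 each). The `count_mismatches` helper and the dictionary layout are hypothetical.

```python
from typing import Dict


def count_mismatches(requested: Dict[str, int], observed: Dict[str, int]) -> Dict[str, int]:
    """Return observed minus requested for each object type whose count is off."""
    deltas = {}
    for name in sorted(requested.keys() | observed.keys()):
        delta = observed.get(name, 0) - requested.get(name, 0)
        if delta != 0:
            deltas[name] = delta
    return deltas


# Counts as reported by the evaluator, not re-verified against the images.
goldfish_seen = sum([3, 4, 3])   # three rows of fish
balloons_seen = 5 * 2            # five colors, two balloons each

print(count_mismatches({"goldfish": 9}, {"goldfish": goldfish_seen}))   # {'goldfish': 1}
print(count_mismatches({"balloon": 11}, {"balloon": balloons_seen}))    # {'balloon': -1}
print(count_mismatches({"banana": 3, "orange": 4, "apple": 2},
                       {"banana": 3, "orange": 4, "apple": 2}))         # {}
```

Both failures are off by exactly one, in opposite directions, which a signed delta makes visible where a bare PASS/FAIL label does not.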

Negation 78%

Negation measures whether fal/krea/v2-medium respects what should be absent from the image, not only what should be present.

Easy 67%
PASS: A tropical beach scene with no people

Evaluator (Qwen 3.5 122B)

The image depicts a serene tropical beach with palm trees, gentle waves, and clear skies. There are no visible people, footprints, or human artifacts in the scene. The excluded element — “people” — is entirely absent. The composition focuses solely on natural elements: sand, sea, sky, and vegetation. No signs of human presence are detectable anywhere in the frame. VERDICT: PASS

FAIL: A cat lying on a couch without a collar

Evaluator (Qwen 3.5 122B)

The image shows a fluffy, light-brown cat lying on a couch. The cat is wearing a black collar with a round metallic tag attached. Since the prompt explicitly excluded “a cat lying on a couch without a collar,” and the cat in the image clearly has a collar, the excluded element is present. VERDICT: FAIL

PASS: A classroom with desks and a chalkboard but no students

Evaluator (Qwen 3.5 122B)

The image shows an empty classroom with multiple desks and chairs arranged in rows. A large green chalkboard is mounted on the front wall, with framed pictures above it. There are no students visible anywhere in the scene — the room is completely unoccupied. The excluded element (students) is truly absent. VERDICT: PASS

Medium 100%
PASS: A sandy beach with palm trees, seashells, and a beach umbrella but no water or ocean visible

Evaluator (Qwen 3.5 122B)

The image shows a sandy beach in the foreground with several seashells scattered on the sand. In the midground, there is a striped beach umbrella planted in the sand. In the background, there are palm trees against a clear sky. The horizon line is visible where the sand meets the sky, and there is no water or ocean visible in the image. The excluded element (water/ocean) is truly absent. VERDICT: PASS

PASS: An office workspace with a desk, chair, bookshelf, and lamp but no computers or screens

Evaluator (Qwen 3.5 122B)

The image shows a bright, minimalist office workspace featuring a wooden desk, a slat-back chair with metal legs, and a tall bookshelf filled with books. A silver desk lamp sits on the desk. There are no computers, monitors, tablets, or any other screens visible in the scene. The excluded elements are fully absent. VERDICT: PASS

PASS: A kitchen with pots, pans, a stove, and a cutting board but no food

Evaluator (Qwen 3.5 122B)

The image shows a clean kitchen scene with a gas stove, a large stainless steel pot with lid slightly ajar, a black frying pan, a small black pot, a white kettle, a wooden cutting board, and a white bowl. No food items are visible in any of the cookware or on surfaces. The excluded element — food — is indeed absent. VERDICT: PASS

Hard 67%
PASS: A pepperoni pizza with tomato sauce, basil, and crust but absolutely no cheese

Evaluator (Qwen 3.5 122B)

The image shows a pizza with a charred crust, red tomato sauce, pepperoni slices, and fresh basil leaves. There is no visible melted cheese layer covering the sauce or toppings — the surface appears to be primarily sauce with toppings resting directly on it. This matches the prompt’s exclusion of cheese. VERDICT: PASS

PASS: A car body with windows, doors, and headlights but no wheels

Evaluator (Qwen 3.5 122B)

The image shows a white car body shell without wheels, doors, or windows installed. The wheel wells are empty, confirming no wheels are present. However, the prompt’s exclusion criteria mention “windows, doors, and headlights” as elements to exclude — but the image actually lacks doors and windows entirely (they’re not installed), while headlights are partially visible as molded cavities in the front bumper area. Since the prompt says “excluded certain elements: ‘A car body with windows, doors,

FAIL: A human face with eyes, nose, and mouth but no eyebrows

Evaluator (Qwen 3.5 122B)

The image shows a close-up of a human face with clearly visible eyes, nose, and mouth. However, the prompt explicitly excluded eyebrows — and in this image, eyebrows are clearly present above both eyes. They are well-defined, textured, and anatomically correct. Since the excluded element (eyebrows) is present, the image fails to meet the exclusion criterion. VERDICT: FAIL
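A negation prompt can be modeled as a set of excluded elements that must not appear among the things detected in the image. The sketch below shows that check as a set intersection. The detected labels are hypothetical transcriptions of the evaluator notes above, and `exclusion_violations` and `negation_verdict` are illustrative names rather than part of the benchmark.

```python
from typing import Iterable, Set


def exclusion_violations(excluded: Iterable[str], detected: Iterable[str]) -> Set[str]:
    """Return the excluded elements that were detected anyway."""
    return set(excluded) & set(detected)


def negation_verdict(excluded: Iterable[str], detected: Iterable[str]) -> str:
    """FAIL if any excluded element is present, otherwise PASS."""
    return "FAIL" if exclusion_violations(excluded, detected) else "PASS"


# Labels below are assumed from the evaluator descriptions, not from a real detector.
print(negation_verdict({"collar"}, {"cat", "couch", "collar", "tag"}))         # FAIL
print(negation_verdict({"eyebrows"}, {"eyes", "nose", "mouth", "eyebrows"}))   # FAIL
print(negation_verdict({"people"}, {"palm tree", "sand", "sea", "sky"}))       # PASS
```

This only covers the absence half of the prompt; the required elements still need their own presence check.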

Relative Position 100%

Relative Position measures whether fal/krea/v2-medium places objects in the requested spatial relationship, such as above, below, inside, or next to another object.

Easy 100%
PASS: A cat sitting on top of a cardboard box

Evaluator (Qwen 3.5 122B)

The image shows a hairless cat positioned directly on the upper surface of a brown cardboard box. The cat’s paws are in contact with the box, and its body is elevated above the box’s top plane. The spatial relationship “sitting on top of” is accurately represented — the cat is above and supported by the box. No conflicting positional descriptors are present. VERDICT: PASS

PASS: A red ball underneath a wooden chair

Evaluator (Qwen 3.5 122B)

The image shows a wooden chair with a red ball positioned directly underneath it, resting on the floor between the chair legs. The spatial relationship described in the prompt — “a red ball underneath a wooden chair” — is accurately represented. No other positional relationships are specified or violated. VERDICT: PASS

PASS: A coffee mug next to an open book on a desk

Evaluator (Qwen 3.5 122B)

The image shows a black coffee mug positioned to the right of an open book, both resting on a wooden desk surface. The spatial arrangement matches the prompt: the mug is next to the book, and both are on the desk. No positional relationships are incorrect. VERDICT: PASS

Medium 100%
PASS: A bird perched on a branch, the branch extending from a tree next to a park bench

Evaluator (Qwen 3.5 122B)

The image shows a bird perched on a branch that extends from a tree trunk on the right side of the frame. The branch stretches diagonally toward the left. In the background, slightly out of focus, is a park bench positioned behind and to the left of the bird and branch. All spatial relationships described — bird on branch, branch extending from tree, tree next to bench — are accurately represented. VERDICT: PASS

PASS: A cat sitting on a pillow, the pillow placed on a wooden chair

Evaluator (Qwen 3.5 122B)

The image shows a calico cat sitting on a striped pillow. The pillow is placed on the seat of a wooden chair. All spatial relationships described in the prompt — “cat on pillow,” “pillow on chair” — are accurately represented. No positional errors are present. VERDICT: PASS

PASS: A candle on top of a stack of books, the books next to a potted plant

Evaluator (Qwen 3.5 122B)

The image shows a white candle placed directly on top of a stack of two books. To the right of the books (from the viewer’s perspective) is a potted fern-like plant in a beige pot. The spatial relationships described — “candle on top of books” and “books next to potted plant” — are accurately represented. All positional descriptors align with the visual layout. VERDICT: PASS

Hard 100%
PASS: Five chess pieces on a board: a king in the center, a queen to its left, a bishop to its right, a rook above the king, and a knight below the king

Evaluator (Qwen 3.5 122B)

The image shows five chess pieces arranged as described: a white king in the center, a white queen to its left, a white bishop to its right, a black rook above the king, and a black knight below the king. All positional relationships match the prompt exactly. VERDICT: PASS

PASS
A desk scene: laptop in the center, coffee mug to the left, phone to the right, lamp above, and notebook below

Evaluator (Qwen 3.5 122B)

The image shows a laptop centered on a wooden tray. A coffee mug is to the left of the laptop, a phone is to the right, and a notebook is below (in front of) the laptop. A white desk lamp is positioned above and behind the laptop. All specified spatial relationships — left, right, above, below — are accurately represented in the image. VERDICT: PASS

PASS
Five animals in a field: a horse on the far left, a cow to the right of the horse, a sheep in the center, a goat to the right of the sheep, and a dog on the far right

Evaluator (Qwen 3.5 122B)

The image shows five animals arranged as described: a horse on the far left, a cow to its right, a sheep in the center, a goat to the right of the sheep, and a dog on the far right. All positional relationships specified in the prompt are accurately represented in the image. VERDICT: PASS

Extreme 100%
PASS
A landscape with a red barn in the foreground at large scale, a windmill in the midground at medium scale, and snow-capped mountains in the background at small scale, all with consistent depth perspective

Evaluator (Qwen 3.5 122B)

The image shows a red barn in the foreground on the right, a windmill in the midground to the left of the barn, and snow-capped mountains in the background. The spatial arrangement matches the description: barn (foreground, large), windmill (midground, medium), mountains (background, small). Depth perspective is consistent. All positional relationships are accurate. VERDICT: PASS

PASS
A beach scene with a sandcastle in the foreground, a surfer on the waves in the midground, and a cruise ship on the horizon in the background, sizes decreasing with distance

Evaluator (Qwen 3.5 122B)

The image correctly shows a sandcastle in the foreground, a surfer riding waves in the midground, and a cruise ship on the horizon in the background. All objects are positioned as described: sandcastle closest to viewer, surfer further back on water, ship farthest away. Sizes decrease appropriately with distance. Spatial relationships (foreground/midground/background) are accurate. VERDICT: PASS

PASS
A city park with a park bench and pigeons in the foreground, a fountain in the midground, and skyscrapers in the background, each layer progressively smaller

Evaluator (Qwen 3.5 122B)

The image correctly shows a park bench and pigeons in the foreground, a fountain in the midground, and skyscrapers in the background. The spatial arrangement matches the prompt: pigeons are on and around the bench, the fountain is behind the bench, and skyscrapers rise behind the fountain. All positional relationships described are accurate. VERDICT: PASS

Scale & Proportions 78%

Scale & Proportions measures whether fal/krea/v2-medium keeps people, vehicles, buildings, and everyday objects at believable relative sizes.

Easy 100%
PASS
A person standing next to a yellow school bus, showing realistic relative scale

Evaluator (Qwen 3.5 122B)

The person standing next to the yellow school bus appears to be of realistic relative scale — their height is proportionate to the bus’s side windows and wheels, consistent with real-world dimensions. The bus itself maintains standard school bus proportions (length, window spacing, wheel size). No obvious distortions or scaling errors are present. VERDICT: PASS

PASS
A ladybug sitting on a large green leaf, shown at realistic proportions

Evaluator (Qwen 3.5 122B)

The ladybug appears appropriately sized relative to the leaf, consistent with real-world proportions. The leaf’s curvature and texture suggest a natural scale, and the insect’s anatomy (legs, antennae, spots) is rendered at a believable size for a ladybug on foliage. No exaggerated or miniature distortions are evident. VERDICT: PASS

PASS
A child holding a basketball, both at correct relative sizes

Evaluator (Qwen 3.5 122B)

The child’s hands are holding the basketball in a natural, proportionate manner — the ball fits comfortably within both hands, consistent with real-world scale for a young child. The torso, head, and limbs also appear anatomically proportional. No distortions or size inconsistencies are visible between the child and the basketball. VERDICT: PASS

Medium 100%
PASS
A tiny person standing on the rim of a coffee cup, looking down into it

Evaluator (Qwen 3.5 122B)

The image shows a tiny person standing on the rim of a coffee cup, looking down into it. The person’s reflection is visible in the liquid below. The scale relationship between the miniature figure and the oversized cup is visually consistent with the prompt’s description — the person is appropriately tiny relative to the cup, and their posture matches “looking down.” No proportional inconsistencies are evident. VERDICT: PASS

PASS
A giant tabby cat walking between city skyscrapers like a kaiju

Evaluator (Qwen 3.5 122B)

The image depicts a giant tabby cat walking between skyscrapers, consistent with the "kaiju" description. The cat’s size relative to the buildings and street elements (cars, trees, people) is appropriately scaled to convey its massive proportions. The perspective and proportions are visually coherent and match the prompt’s intent. VERDICT: PASS

PASS
A miniature house sitting on the palm of someone's hand

Evaluator (Qwen 3.5 122B)

The image shows a small house model resting on an open human palm. The house is appropriately scaled to fit comfortably within the hand, with proportions that suggest it is indeed miniature relative to the hand. The size relationship between the house and the hand is consistent with the prompt’s description. VERDICT: PASS

Hard 33%
PASS
A group photo of an elephant, a horse, a dog, a cat, and a mouse, all standing in a line with correct real-world proportional sizes

Evaluator (Qwen 3.5 122B)

The image shows an elephant, horse, dog, cat, and mouse arranged in a line from largest to smallest. The elephant is correctly depicted as the largest, followed by the horse, then the dog, cat, and finally the mouse as the smallest. The proportions between each animal appear accurate relative to real-world sizes. All animals are standing on grass in a pastoral setting. VERDICT: PASS

FAIL
A dinner table scene with a plate, fork, knife, wine glass, salt shaker, and peppercorn, all at realistic relative sizes

Evaluator (Qwen 3.5 122B)

The image shows a plate, fork, knife, wine glass, salt shaker, and a single peppercorn. The relative sizes are mostly realistic — the plate is appropriately large compared to the cutlery, and the wine glass and salt shaker are proportionally scaled. However, the single peppercorn appears disproportionately large relative to the plate and utensils — it’s nearly as tall as the fork’s tines, which is unrealistic. This breaks the “realistic relative sizes” criterion. VERDICT: FAIL

FAIL
A farm scene with a barn, tractor, farmer, cow, chicken, and a fence post, all at correct proportional scale

Evaluator (Qwen 3.5 122B)

The image depicts a farm scene with a barn, tractor, farmer, cow, chicken, and fence post. The relative sizes are mostly consistent: the barn is largest, followed by the tractor and cow, then the farmer, and the chicken is smallest. However, the tractor appears slightly oversized relative to the farmer and cow, and the fence post in the foreground seems disproportionately large compared to its distance. These minor inconsistencies affect overall proportion accuracy. VERDICT: FAIL
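
The tier and subcategory percentages for Scale & Proportions can be reproduced from the individual verdicts. The following is a minimal sketch, assuming each prompt carries equal weight and percentages are rounded to the nearest whole number; the `pass_rate` helper and the dictionary layout are illustrative and not part of ImageBench.

```python
def pass_rate(verdicts):
    """Return the share of PASS verdicts as a whole-number percentage."""
    passed = sum(1 for v in verdicts if v == "PASS")
    return round(100 * passed / len(verdicts))

# Verdicts for Scale & Proportions as listed above, grouped by difficulty tier.
scale_and_proportions = {
    "Easy": ["PASS", "PASS", "PASS"],
    "Medium": ["PASS", "PASS", "PASS"],
    "Hard": ["PASS", "FAIL", "FAIL"],
}

# Per-tier rates, then the subcategory rate over all nine prompts pooled.
tier_rates = {tier: pass_rate(v) for tier, v in scale_and_proportions.items()}
overall = pass_rate([v for tier in scale_and_proportions.values() for v in tier])

print(tier_rates)  # {'Easy': 100, 'Medium': 100, 'Hard': 33}
print(overall)     # 78
```

Under these assumptions, 7 passes out of 9 prompts gives the 78% shown in the subcategory header.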

Human realism 83%

Faces & Expressions 92%

Faces & Expressions measures whether fal/krea/v2-medium creates realistic faces with the requested age, emotion, expression, and identity cues.

Easy 100%
PASS
A close-up portrait of a young woman with a warm natural smile

Evaluator (Qwen 3 VL)

Face is largely symmetric with natural features. Smile appears warm and genuine, clearly conveying happiness. Slight asymmetry in eyebrow shape and minor skin texture artifacts are present but not distracting. Overall, the portrait successfully captures the requested emotion and likeness. VERDICT: PASS

PASS
A portrait of a middle-aged man with a calm, serious expression

Evaluator (Qwen 3 VL)

The face is symmetric with natural, correctly proportioned features. The expression is calm and serious, with subtle facial cues conveying contemplation. No distortions are visible. Lighting and detail are realistic, matching a high-quality portrait. VERDICT: PASS

PASS
A headshot of a teenager with a relaxed, neutral face

Evaluator (Qwen 3 VL)

Face is symmetric with correctly proportioned features. Expression is neutral and relaxed, matching the prompt. No significant distortions or anomalies. Hair and lighting are natural. The teen’s gaze and slight lip parting convey calm neutrality without strain. VERDICT: PASS

Medium 67%
FAIL
A close-up of a person's face showing clear disgust, nose wrinkled and lip curled

Evaluator (Qwen 3 VL)

Face is asymmetric with distorted features—eyes uneven, nose skewed, and lips misaligned. Disgust is clearly conveyed via wrinkled nose and curled lips, though anatomical distortions reduce realism. Skin texture and lighting appear natural, but structural inaccuracies violate facial symmetry expectations. VERDICT: FAIL

PASS
A portrait capturing genuine surprise — raised eyebrows, wide eyes, slightly open mouth

Evaluator (Qwen 3 VL)

Face is symmetric with natural features. Eyes are wide, eyebrows raised, mouth slightly open — clearly conveying genuine surprise. No noticeable distortions. Skin texture and hair look realistic. Emotion is vividly readable and matches the prompt precisely. VERDICT: PASS

PASS
A face showing quiet melancholy — downcast eyes, slightly drawn mouth, distant gaze

Evaluator (Qwen 3 VL)

Face is symmetric with correctly rendered features. Eyes are downcast, lips slightly drawn, and gaze distant, conveying quiet melancholy. Subtle artistic softness and lighting enhance the mood without distortion. All elements align with the prompt’s emotional intent. VERDICT: PASS

Hard 100%
PASS
An elderly woman in her 80s with deep wrinkles and white hair, smiling warmly with kind eyes

Evaluator (Qwen 3 VL)

The face shows natural asymmetry and deep wrinkles consistent with an elderly woman. White hair is present. The smile appears warm and genuine, with kind, crinkled eyes conveying emotion. No major distortions are visible. Features are anatomically plausible and well-rendered. VERDICT: PASS

PASS
A 5-year-old child with round cheeks and large eyes, crying with tears streaming down their face

Evaluator (Qwen 3 VL)

The child’s face is mostly symmetrical with proportionate features, including round cheeks and large eyes. Tears are clearly streaming down both cheeks, and the furrowed brow and downturned mouth convey a strong crying expression. Slight asymmetry in eyebrow shape and tear paths is minor. Overall, the emotion is clearly readable and aligns with the prompt. VERDICT: PASS

PASS
An elderly man in his 70s with age spots and thinning gray hair, looking skeptically over reading glasses

Evaluator (Qwen 3 VL)

Face is symmetric with natural wrinkles, age spots, and thinning gray hair. Skeptical expression is clearly readable via furrowed brow and downturned mouth. Glasses fit correctly. No significant distortions. Features align with prompt. VERDICT: PASS

Extreme 100%
PASS
A group of three friends at a table: one laughing with head tilted back, one looking shocked with wide eyes, and one rolling their eyes with a smirk

Evaluator (Qwen 3 VL)

Faces show clear, symmetrical features with natural anatomy. Emotions are vividly readable: center person laughs with head back, right person looks shocked, left person smirks while rolling eyes. No distortions or unnatural artifacts. All three expressions match the prompt precisely. VERDICT: PASS

PASS
A family photo with four people: a father beaming with pride, a mother smiling gently, a teenager looking bored, and a child grinning excitedly

Evaluator (Qwen 3 VL)

Faces show natural symmetry and correct anatomical features. Emotions are clearly readable: father beams, mother smiles gently, teenager looks bored, child grins excitedly. No significant distortions or artifacts. All four individuals match the prompt’s emotional descriptions accurately. VERDICT: PASS

PASS
Three coworkers in a meeting: one looking frustrated with furrowed brows, one nodding with a pleased expression, and one looking confused with a tilted head

Evaluator (Qwen 3 VL)

Faces are symmetric with correct features. The central man shows clear frustration (furrowed brows, tense mouth). The woman on the left displays a pleased smile. The man on the right shows confusion (tilted head, furrowed brow). All emotions are clearly readable with no distortions. VERDICT: PASS

Full Body 92%

Full Body measures whether fal/krea/v2-medium renders full-body people with natural anatomy, clothing, pose, and proportions.

Easy 100%
PASS
A woman standing upright in a relaxed pose, arms at her sides, wearing jeans and a white t-shirt

Evaluator (Qwen 3 VL)

The woman stands upright with arms at her sides, wearing jeans and a white t-shirt as described. Limb count and proportions are anatomically correct. Pose is relaxed and recognizable. No distortions or anomalies are visible. All elements match the prompt accurately. VERDICT: PASS

PASS
A man standing straight facing the camera, hands in pockets, normal proportions

Evaluator (Qwen 3 VL)

The man stands straight, facing forward, with hands clearly in pockets. Limb count and proportions are anatomically correct. Pose is recognizable and matches the prompt. No distortions or abnormalities are visible. All elements align with the requested description. VERDICT: PASS

PASS
A person standing in a neutral pose in a park, full body visible from head to feet

Evaluator (Qwen 3 VL)

The person’s full body is visible, standing neutrally in a park. Limb count and proportions are anatomically correct. Pose is recognizable as standing still, hands at sides. No significant distortions observed. Clothing and background details are coherent. VERDICT: PASS

Medium 100%
PASS
A woman mid-stride running through a park, hair and clothes showing motion

Evaluator (Qwen 3 VL)

Limb count and proportions are anatomically correct. Pose is a recognizable running stride with motion blur in hair and clothes. No significant distortions. Motion is well-rendered, matching the prompt. VERDICT: PASS

PASS
A man captured mid-jump over a puddle, body in the air with bent knees

Evaluator (Qwen 3 VL)

The man has correct limb count and proportions. His pose—bent knees, mid-air jump over a puddle—is anatomically plausible and recognizable. No distortions in limbs or body structure. The reflection adds realism. The unusual hat doesn’t affect anatomical correctness. VERDICT: PASS

PASS
A dancer in a dynamic spin, dress flowing outward with one leg extended

Evaluator (Qwen 3 VL)

The dancer’s limbs are correctly counted and proportioned, with one leg extended and arms outstretched in a dynamic spin. The dress flows outward as described, capturing motion. Anatomical structure is preserved despite artistic stylization. No distortions in limb count or proportions. The pose clearly conveys a spinning motion. VERDICT: PASS

Hard 100%
PASS
A person in a yoga tree pose — standing on one leg with the other foot pressed to the inner thigh, arms raised overhead

Evaluator (Qwen 3 VL)

The figure has two arms, two legs, and correct proportions. The tree pose is accurately depicted: standing on one leg, the other foot pressed to the inner thigh, arms raised overhead with hands clasped. No anatomical distortions are visible. The shadow confirms the pose. The description matches the visual. VERDICT: PASS

PASS
A rock climber clinging to a cliff face with arms and legs extended to different holds

Evaluator (Qwen 3 VL)

The climber has two arms and two legs, with proportions consistent with human anatomy. The pose shows arms and legs extended to different holds, recognizable as a dynamic climbing stance. Lighting creates silhouette, but no anatomical distortions are visible. Limb placement and body tension align with realistic climbing mechanics. VERDICT: PASS

PASS
A cellist seated with correct posture, legs around the instrument, bow arm extended

Evaluator (Qwen 3 VL)

The figure has correct limb count and proportions. The pose shows a seated cellist with legs around the instrument and bow arm extended, matching the prompt. No anatomical distortions are visible. The action is clearly recognizable as playing the cello. The silhouette is stylized but maintains essential structural accuracy. VERDICT: PASS

Extreme 67%
PASS
Two soccer players mid-tackle, one sliding and the other jumping over with the ball at their feet

Evaluator (Qwen 3 VL)

Both players show correct limb count and proportions. The sliding player’s body posture and the jumping player’s mid-air pose are anatomically plausible and match the described action. The ball is positioned at the jumping player’s feet, and motion blur enhances realism. No significant distortions are visible. The scene captures a dynamic, recognizable tackle. VERDICT: PASS

PASS
Two gymnasts performing a synchronized handstand, side by side with identical body alignment

Evaluator (Qwen 3 VL)

Both gymnasts show correct limb count and proportions. Their synchronized handstand pose is anatomically accurate, with aligned bodies, straight legs, and proper hand placement. The action is clearly recognizable as a handstand. No distortions are visible; musculature and symmetry are natural. VERDICT: PASS

FAIL
Two tango dancers in a close embrace, the man dipping the woman backward with intertwined legs

Not generated

Model refused to generate this prompt (HTTP 422).
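
The Extreme tier's 67% is consistent with the refused prompt being scored as a FAIL rather than dropped from the denominator. A minimal sketch of that scoring rule follows; the `Result` record, its field names, and the `final_verdict` helper are hypothetical, and only the HTTP 422 status and the resulting FAIL come from the report.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Result:
    prompt: str
    http_status: int              # status returned by the image model (assumed field)
    judge_verdict: Optional[str]  # "PASS"/"FAIL" from the evaluator, None if no image

def final_verdict(r: Result) -> str:
    # No image was produced, so there is nothing to judge: score it as FAIL.
    if r.http_status != 200 or r.judge_verdict is None:
        return "FAIL"
    return r.judge_verdict

# The three Full Body prompts in the Extreme tier (prompt text shortened).
extreme = [
    Result("Two soccer players mid-tackle", 200, "PASS"),
    Result("Two gymnasts performing a synchronized handstand", 200, "PASS"),
    Result("Two tango dancers in a close embrace", 422, None),
]

verdicts = [final_verdict(r) for r in extreme]
rate = round(100 * verdicts.count("PASS") / len(verdicts))
print(verdicts, rate)  # ['PASS', 'PASS', 'FAIL'] 67
```

Had the refusal been excluded instead, the same tier would read 100%, so the choice of denominator matters when comparing models with different refusal behavior.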

Hands 58%

Hands measures whether fal/krea/v2-medium draws anatomically plausible hands, fingers, and gestures.

Easy 0%
FAIL
A person standing with both hands visible at their sides, fingers relaxed and naturally spread

Evaluator (Qwen 3.5 122B)

The left hand (viewer’s right) shows 5 fingers, all present and anatomically plausible. The right hand (viewer’s left) also has 5 fingers, but the thumb appears abnormally short and stubby, and the index finger is slightly malformed or fused at the tip — suggesting an anatomical anomaly. While finger count is correct, the gesture and structure are not fully natural or accurate. VERDICT: FAIL

FAIL
A woman holding her hands out in front of her palms-up, all ten fingers clearly visible

Evaluator (Qwen 3.5 122B)

The left hand (viewer’s left) has 5 fingers, all clearly visible and anatomically correct. The right hand (viewer’s right) also has 5 fingers, but the thumb appears abnormally short and stubby compared to the other fingers — an anatomical anomaly. While finger count is correct, the gesture’s realism is compromised by the distorted thumb. Core criteria of “anatomically correct” is not fully met. VERDICT: FAIL

FAIL
A man with arms crossed, both hands resting on opposite upper arms with fingers visible

Evaluator (Qwen 3.5 122B)

The left hand (visible) shows 4 fingers and thumb — total 5, anatomically correct. The right hand is mostly obscured under the left arm; only a small portion of the wrist/forearm is visible, no fingers can be counted. Since one hand’s fingers are not fully visible to verify “exactly 5 per hand,” the criterion is not fully met. VERDICT: FAIL

Medium 100%
PASS
A person holding a white ceramic coffee mug by the handle with their right hand

Evaluator (Qwen 3.5 122B)

The hand gripping the mug handle shows 5 fingers: thumb, index, middle, ring, and pinky. The grip appears anatomically correct — thumb on top of handle, fingers curled around it. No visible anomalies or extra/missing digits. Lighting and focus are soft but do not obscure finger count or positioning. VERDICT: PASS

PASS
A woman writing in a notebook with a pen gripped correctly in her right hand

Evaluator (Qwen 3.5 122B)

The image shows one hand gripping a pen. The hand has 5 fingers visible: thumb, index, middle, ring, and pinky — all present and proportionally placed. The grip is anatomically correct for writing: pen held between thumb and index finger, supported by middle finger. No anomalies detected. Only one hand is visible; the other is not shown, so no count can be made for it. Since the prompt specifies “each visible hand” and only one is visible, and it meets criteria, this passes. VERDICT: PASS

PASS
A person playing an acoustic guitar with left hand on the fretboard and right hand strumming

Evaluator (Qwen 3.5 122B)

The left hand on the fretboard shows 5 fingers with correct positioning and grip. The right hand strumming area is blurred but appears to have 5 fingers in a natural strumming position. No obvious anatomical anomalies detected. Both hands show proper guitar-playing posture. VERDICT: PASS

Hard 100%
PASS
A person making a peace sign with their right hand, clearly showing two fingers up and three curled

Evaluator (Qwen 3.5 122B)

The image shows a single right hand making a peace sign. Two fingers (index and middle) are extended upward, while the thumb, ring, and pinky fingers are curled inward — totaling five fingers visible on one hand. The gesture is anatomically correct: thumb crosses over curled fingers, knuckles align naturally, no extra or missing digits. No anomalies detected. VERDICT: PASS

PASS
A hand giving a thumbs-up with four fingers curled and thumb extended upward

Evaluator (Qwen 3.5 122B)

The image shows one hand making a thumbs-up gesture. The thumb is extended upward, and the other four fingers are curled into a fist — this matches the prompt exactly. Anatomically, the hand appears correct: five digits total (one thumb + four fingers), with natural skin texture, creases, and proportions. No anomalies or extra/missing fingers are visible. The background is blurred, keeping focus on the hand. VERDICT: PASS

PASS
A person counting to three on their fingers, with index, middle, and ring fingers extended

Evaluator (Qwen 3.5 122B)

The image shows one hand making a gesture with three fingers extended: index, middle, and ring fingers. The thumb is curled inward, and the pinky is also curled — consistent with counting to three. Anatomically, this is correct for that gesture. There are no visible anomalies in finger count or structure. The hand appears natural and properly formed. VERDICT: PASS

Extreme 33%
PASS
Two people shaking hands in a professional setting, all fingers of both hands anatomically correct

Evaluator (Qwen 3.5 122B)

Upon close examination of the handshake:
- The left hand (person in white blazer) shows 4 visible fingers + thumb = 5 total. All appear anatomically correct.
- The right hand (person in blue suit) also shows 4 visible fingers + thumb = 5 total. Grip is natural and anatomically plausible.
No anomalies detected. Both hands have exactly 5 fingers, and the gesture is realistic. VERDICT: PASS

FAIL
Two people doing a high-five, palms meeting with all fingers extended on both hands

Evaluator (Qwen 3.5 122B)

The image shows two hands pressed together in a prayer-like or clapping gesture, not a high-five. Each hand has 5 fingers visible and anatomically correct — thumbs are positioned inward, fingers extended upward. However, the gesture does not match the prompt’s description of “palms meeting with all fingers extended on both hands” for a high-five; instead, palms face each other vertically. The grip is anatomically valid but semantically mismatched to the prompt. VERDICT: FAIL

FAIL
One person handing a small gift box to another person, both sets of hands gripping the box correctly

Evaluator (Qwen 3.5 122B)

Upon close examination, the hands show anatomical inconsistencies. The left hand (dark sleeve) appears to have only four visible fingers with unnatural thumb positioning. The right hand (gray sleeve) also shows distorted finger alignment and possible fusion of digits near the knuckles. The grip is plausible but the finger counts and joint structures are not anatomically correct — likely AI artifacts. VERDICT: FAIL
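
Each evaluator note in this report ends with a `VERDICT: PASS` or `VERDICT: FAIL` marker. One way such labels could be pulled out of the free-text notes is sketched here; the regular expression and the `extract_verdict` helper are assumptions for illustration, not the benchmark's actual parser, and the notes are abbreviated from the Hands Extreme tier.

```python
import re
from typing import Optional

# Matches the trailing marker the judges append to their notes.
VERDICT_RE = re.compile(r"VERDICT:\s*(PASS|FAIL)\b")

def extract_verdict(note: str) -> Optional[str]:
    """Return the last PASS/FAIL marker in an evaluator note, or None."""
    matches = VERDICT_RE.findall(note)
    return matches[-1] if matches else None

# Abbreviated evaluator notes (handshake, high-five, gift box).
notes = [
    "Both hands have exactly 5 fingers, and the gesture is realistic. VERDICT: PASS",
    "The grip is anatomically valid but semantically mismatched to the prompt. VERDICT: FAIL",
    "Upon close examination, the hands show anatomical inconsistencies. VERDICT: FAIL",
]

verdicts = [extract_verdict(n) for n in notes]
print(verdicts)                            # ['PASS', 'FAIL', 'FAIL']
print(extract_verdict("no marker here"))   # None
```

Taking the last match guards against a note that quotes the word earlier in its reasoning, and returning `None` leaves room to flag notes where the judge never committed to a verdict.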

Multi-Subject 100%

Multi-Subject measures whether fal/krea/v2-medium keeps multiple people distinct while preserving their requested attributes, outfits, and positions.

Easy 100%
PASS
Two women standing side by side: one with short blonde hair wearing a red jacket, the other with long black hair wearing a blue dress

Evaluator (Qwen 3 VL)

Left woman: short blonde hair, wearing red textured jacket. Right woman: long black hair, wearing blue dress. Both visually distinct and match prompt descriptions precisely. No missing or mismatched elements. VERDICT: PASS

PASS
Two men at a cafe: one with a beard and glasses wearing a green sweater, the other clean-shaven in a black hoodie

Evaluator (Qwen 3 VL)

Person 1: Bearded man with glasses, wearing a green sweater. Person 2: Clean-shaven man in a black hoodie. Both are visually distinct and match descriptions precisely. Background figures are blurred, not relevant. All prompt criteria met. VERDICT: PASS

PASS
Two friends posing: one tall with curly red hair in denim overalls, the other short with straight brown hair in a yellow sundress

Evaluator (Qwen 3 VL)

Tall person: curly red hair, denim overalls over a white shirt. Short person: straight brown hair, yellow sundress with black belt. Both visually distinct and match descriptions. VERDICT: PASS

Medium 100%
PASS
Four people at a dinner table: a bald man in a suit, a woman with silver hair in a red blouse, a young man with dreadlocks in a denim jacket, and a woman with a hijab in a green dress

Evaluator (Qwen 3 VL)

Bald man in suit: present, left side. Silver-haired woman in red blouse: present, next to him. Young man with dreadlocks in denim jacket: present, center. Woman in green hijab and dress: present, right side. All visually distinct and match descriptions precisely. VERDICT: PASS

PASS
Three musicians on stage: a tall woman with short pink hair playing bass, a man with a beard and bandana on drums, and a slim man with glasses playing keyboard

Evaluator (Qwen 3 VL)

Tall woman: pink short hair, playing bass. Man left: beard, bandana, on drums. Man right: glasses, slim, playing keyboard. All three are visually distinct and match descriptions precisely. Lighting and staging are consistent with a live performance. VERDICT: PASS

PASS
Three hikers on a trail: a man with a backpack and sunhat, a woman with braided hair and hiking poles, and a teenager with a baseball cap and water bottle

Evaluator (Qwen 3 VL)

Man: wears sunhat and backpack. Woman: has braided hair and holds hiking poles. Teenager: wears baseball cap and holds water bottle. All three are visually distinct and match described features. VERDICT: PASS
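
The 83% shown for Human realism appears consistent with pooling all prompts in the category rather than averaging the four subcategory percentages. The sketch below checks both readings; the pass counts are tallied from the verdicts listed in this category, and the pooling rule itself is an inference, not something the report states.

```python
# (passed, total) per subcategory, counted from the verdicts listed above.
human_realism = {
    "Faces & Expressions": (11, 12),
    "Full Body": (11, 12),
    "Hands": (7, 12),
    "Multi-Subject": (6, 6),
}

# Reading 1: pool every prompt in the category into one pass rate.
passed = sum(p for p, _ in human_realism.values())
total = sum(t for _, t in human_realism.values())
pooled = round(100 * passed / total)

# Reading 2: unweighted mean of the subcategory pass rates.
per_sub = [100 * p / t for p, t in human_realism.values()]
mean_of_rates = sum(per_sub) / len(per_sub)

print(pooled)                   # 83
print(round(mean_of_rates, 1))  # 85.4
```

The two readings differ because Multi-Subject has only 6 prompts; pooling gives it half the weight of the 12-prompt subcategories, which pulls the category score toward the weaker Hands result.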

Truthfulness 74%

Photorealism 100%

Photorealism measures whether fal/krea/v2-medium makes materials, surfaces, and real-world objects look visually plausible and distinct.

Medium 100%
PASS
A still life on a wooden table featuring a clear glass of water, a polished silver spoon, a folded linen napkin, and a ripe red apple — each material rendered with accurate texture and light behavior

Evaluator (Qwen 3.6 27B)

The image demonstrates high-quality material rendering. The glass shows accurate refraction and transparency, while the spoon exhibits sharp, realistic metallic reflections. The wood grain is distinct, and the linen napkin displays soft, natural fabric folds. The apple has a convincing waxy skin texture. All materials behave correctly under the lighting conditions. VERDICT: PASS

PASS
A workspace desk with a brushed aluminum laptop, a leather-bound notebook, a ceramic coffee mug, and a glass paperweight — all materials visually distinct and realistic

Evaluator (Qwen 3.6 27B)

The image successfully renders the requested materials with high fidelity. The brushed aluminum laptop exhibits appropriate metallic sheen and edge reflections. The leather notebook shows a soft, matte texture with realistic thickness. The ceramic mug has a subtle, non-reflective glaze consistent with pottery. The glass paperweight demonstrates correct refraction and transparency, distorting the background slightly. All materials are visually distinct and realistic. VERDICT: PASS

PASS
A kitchen counter with a stainless steel pot, a wooden cutting board, a glass bottle of olive oil, and a cotton dish towel — each surface reflecting light correctly

Evaluator (Qwen 3.6 27B)

The image accurately renders the distinct material properties requested. The stainless steel pot exhibits appropriate metallic sheen and reflections. The wooden cutting board shows natural grain and texture. The glass bottle demonstrates correct translucency and refraction of the oil inside, while the cotton towel displays soft, fibrous texture. Lighting and shadows are consistent across all objects, confirming correct visual properties for each material. VERDICT: PASS

Physics & Reflections 92%

Physics & Reflections measures whether fal/krea/v2-medium follows basic physical cues such as shadows, reflections, gravity, transparency, and contact points.

Easy 100%
PASS
A red sphere on a white surface with a single shadow cast to the right, consistent with a light source from the upper left

Evaluator (Qwen 3.6 27B)

The red sphere is positioned on a white surface with a soft, diffuse shadow extending to the right and slightly downward, consistent with a light source from the upper left. The shadow’s shape and gradient are physically plausible for a matte sphere under soft lighting. No reflections are expected or present, which aligns with the matte material. Direction, orientation, and color of the shadow are accurate. VERDICT: PASS

PASS
A coffee mug on a sunny windowsill casting a sharp shadow to the left

Evaluator (Qwen 3.6 27B)

The shadow is cast to the left, consistent with a light source from the right (the window). The shadow’s shape accurately mirrors the mug and handle, with appropriate softness near the base and sharper edges further out, matching natural sunlight behavior. The warm tone of the shadow aligns with ambient indoor lighting. No distortions or implausible reflections are present. VERDICT: PASS

PASS
A person standing outdoors at midday with a short shadow directly below them

Evaluator (Qwen 3.6 27B)

The shadow is cast directly beneath the figure, consistent with a midday sun position. The shape of the shadow accurately reflects the silhouette of the person and their hat, showing appropriate distortion due to perspective. The lighting on the figure (bright sky background, dark suit) aligns with the shadow placement. The color and opacity are realistic for a hard shadow on a light surface. VERDICT: PASS
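The "short shadow at midday" cue follows from simple trigonometry: on level ground, a vertical object of height h casts a shadow of length h / tan(elevation), so the shadow shrinks as the sun climbs. The sketch below illustrates that relation only. The 1.8 m height and the sample elevation angles are assumptions chosen for illustration, and nothing here reflects how the judge model reached its verdict.

```python
import math

def shadow_length(height_m, sun_elevation_deg):
    """Length of the shadow a vertical object casts on level ground."""
    return height_m / math.tan(math.radians(sun_elevation_deg))

# Hypothetical 1.8 m person at a low, medium, and high sun elevation.
for elevation in (20, 45, 80):
    print(f"{elevation:>2} deg -> {shadow_length(1.8, elevation):.2f} m")
```

The midday sun is not always directly overhead, so "directly below" in the prompt is best read as "very short", not as zero length.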

Medium 67%
FAIL
A woman standing in front of a full-length mirror, her reflection accurately showing the back of her outfit and reversed orientation

Evaluator (Qwen 3.6 27B)

The reflection is physically implausible. A mirror should show the front of the subject, not their back. The reflection displays the front of the outfit (blue shirt, red collar) while the subject’s back is visible to the viewer — this contradicts basic optics. Additionally, the reflection’s orientation and posture don’t match a true mirror image; it appears as if the person turned around inside the mirror. This violates fundamental reflection physics. VERDICT: FAIL

PASS
A cat sitting on a polished marble floor with its reflection visible below

Evaluator (Qwen 3.6 27B)

The reflection is physically plausible: it appears directly beneath the cat, correctly inverted vertically, and matches the cat’s pose and lighting. The marble floor’s high gloss justifies the clear reflection. Minor distortions align with surface texture and viewing angle. Colors and shadows are consistent with ambient light. No directional or orientation errors detected. VERDICT: PASS

PASS
A vase of flowers on a glass table with a clear reflection visible on the table surface

Evaluator (Qwen 3.6 27B)

The reflection of the vase and flowers on the glass table is physically plausible. It correctly mirrors the object's orientation and position, appearing directly beneath the vase. The reflection is slightly distorted and less sharp than the object itself, which is consistent with how light interacts with a glass surface. The lighting and shadows are soft and natural, enhancing the realism of the scene. VERDICT: PASS

Hard 100%
PASS
A row of trees reflected in a calm lake at sunset, with the reflection slightly darker and gently rippled

Evaluator (Qwen 3.6 27B)

The reflections are physically plausible. They are vertically inverted, matching the trees' positions. The color gradient correctly transitions from the bright orange of the sky to the dark blue of the water. The "gently rippled" aspect is well-executed, showing realistic distortion and elongation of the tree silhouettes in the water without breaking the image's coherence. VERDICT: PASS

PASS
A city skyline reflected in a rain puddle on a street, with mild distortion and color shift

Evaluator (Qwen 3.6 27B)

The reflections in the puddle are physically plausible. The orientation correctly mirrors the skyline and lights above. The distortion is consistent with ripples on a water surface, creating a wavy, fluid effect that breaks up the image naturally. The color shift is also accurate, as the reflection captures the warm tones of the sunset and city lights against the darker water. The physics of light interaction with the rippling surface are well-represented. VERDICT: PASS

PASS
A mountain range reflected in a still alpine lake, with the reflection inverted and subtly muted in color

Evaluator (Qwen 3.6 27B)

The reflection is vertically inverted, matching the mountain range above. The water surface is perfectly still, creating a mirror-like effect without distortion. The reflected colors are slightly darker and more muted than the actual mountains, which is physically accurate for water reflections. The horizon line is straight and consistent. The lighting and atmospheric perspective are coherent. VERDICT: PASS

Extreme 100%
PASS
A clear glass sphere sitting on a mirror surface near a pool of water, showing refraction through the sphere, reflection on the mirror, and water reflection simultaneously

Evaluator (Qwen 3.6 27B)

The image displays a glass sphere on a reflective surface with a water puddle. The reflection of the sphere is correctly oriented and positioned directly beneath it. The refraction through the sphere creates a realistic inverted image of the background. The water puddle interacts with the sphere's base, showing appropriate distortion and meniscus effects. The lighting and shadows are consistent with a single light source, and the overall physics of light interaction appear accurate. VERDICT: PA

PASS
A chrome sculpture in a room with a glass floor and large windows, showing multiple reflections and environmental mapping on every surface

Evaluator (Qwen 3.6 27B)

The reflections on the chrome sculpture are physically plausible, showing distorted environmental mapping of the room’s architecture and lighting. The glass floor beneath reflects the sculpture accurately with appropriate distortion and color fidelity. Shadows and highlights align with the light sources visible in the reflections. No obvious inconsistencies in direction, orientation, or color. VERDICT: PASS

PASS
A crystal wine glass on a polished marble counter next to a window overlooking a lake — showing refraction in the glass, reflection on the marble, and the lake's own reflective surface

Evaluator (Qwen 3.6 27B)

The image displays physically plausible reflections and refractions. The wine glass shows a clear, inverted reflection of the window frame and sky, consistent with optical refraction. The marble counter exhibits a sharp, vertical reflection of the glass stem and base, matching the light source direction. The lake surface reflects the sky’s warm tones with realistic distortion from ripples. All elements align in orientation, color, and lighting logic. VERDICT: PASS
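The 92% shown for Physics & Reflections is consistent with a plain tally of the twelve verdicts listed above (11 PASS, 1 FAIL). The sketch below reproduces the tier and subcategory percentages under the assumption that each score is an unweighted share of PASS labels rounded to a whole number. The page does not state its aggregation rule, so treat that as an assumption.

```python
# Verdict labels as listed above for Physics & Reflections, by difficulty tier.
verdicts = {
    "Easy": ["PASS", "PASS", "PASS"],
    "Medium": ["FAIL", "PASS", "PASS"],
    "Hard": ["PASS", "PASS", "PASS"],
    "Extreme": ["PASS", "PASS", "PASS"],
}

def pass_rate(labels):
    """Share of PASS labels as a rounded percentage (assumed unweighted)."""
    return round(100 * labels.count("PASS") / len(labels))

per_tier = {tier: pass_rate(labels) for tier, labels in verdicts.items()}
overall = pass_rate([label for labels in verdicts.values() for label in labels])

print(per_tier)
print(overall)
```

With three prompts per tier, a single FAIL moves a tier from 100% to 67%, which is why the tier figures on this page only take the values 0, 33, 67, and 100.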

World Knowledge 50%

World Knowledge measures whether fal/krea/v2-medium depicts recognizable places, landmarks, flags, and known objects with accurate real-world details.

Easy 100%
PASS
The Eiffel Tower in Paris on a clear day, architecturally accurate with its iron lattice structure and three levels

Evaluator (Qwen 3.6 27B)

The image depicts the Eiffel Tower with high architectural accuracy, including the iron lattice structure, the three distinct levels (two visible platforms and the top), and the spire. The perspective is a low-angle shot looking up, which is common for this landmark. The sky is clear blue, matching the prompt. There are no obvious artifacts or structural anomalies. VERDICT: PASS

PASS
The Taj Mahal in Agra seen from the reflecting pool, with accurate dome, minarets, and symmetry

Evaluator (Qwen 3.6 27B)

The image depicts the Taj Mahal with high architectural accuracy, including the central dome, four minarets, and smaller chhatris. The perspective from the reflecting pool is correct, and the symmetry is preserved. The reflection in the water is clear and realistic. No significant inaccuracies are present; the structure matches known photographs of the monument. The lighting and shadows appear natural for a clear day. Overall, the depiction is faithful to the real-world subject. VERDICT: PASS

PASS
The Statue of Liberty on her island pedestal, with correct torch, crown, tablet, and green patina

Evaluator (Qwen 3.6 27B)

The Statue of Liberty is clearly recognizable and accurately depicted with correct proportions, green patina, raised torch, spiked crown, and tablet inscribed “JULY IV MDCCLXXVI.” The pedestal architecture matches the real structure. No significant factual or architectural inaccuracies are present. VERDICT: PASS

Medium 33%
FAIL
A traditional Japanese shamisen instrument with accurate three-stringed design, wooden body, and bachi plectrum

Evaluator (Qwen 3.6 27B)

The image depicts a shamisen, but it contains significant factual inaccuracies. A traditional shamisen has only three strings, whereas this instrument clearly shows four strings. Additionally, the headstock features four tuning pegs instead of the standard three. The bachi (plectrum) is present but appears oddly integrated into the bridge area rather than being a separate handheld tool. These structural errors make it factually inaccurate. VERDICT: FAIL

FAIL
A medieval astrolabe with accurate rete, tympan, and engraved degree markings on brass

Evaluator (Qwen 3.6 27B)

The image depicts a complex brass instrument resembling an astrolabe, featuring a rete, tympan, and degree markings. However, the text and symbols are nonsensical gibberish, a common AI artifact, rather than accurate astronomical data or Latin inscriptions found on real medieval instruments. The mechanical structure is also somewhat ambiguous and lacks the precise engineering of a functional historical artifact. While visually evocative, it fails the "factually accurate" criterion due to illegib

PASS
A Japanese kintsugi bowl — cracked ceramic repaired with visible gold lacquer lines

Evaluator (Qwen 3.6 27B)

The image accurately depicts a Japanese kintsugi bowl. The subject is clearly recognizable as a ceramic vessel with a dark, crackled glaze. The defining feature of kintsugi—the repair of broken pottery using lacquer mixed with powdered gold, silver, or platinum—is faithfully represented by the visible golden lines tracing the cracks across the bowl’s surface. The aesthetic and structural details align perfectly with the real-world craft. VERDICT: PASS

Hard 0%
FAIL
A detailed diagram of the water cycle showing evaporation from the ocean, condensation forming clouds, precipitation as rain, and collection in rivers flowing back to the sea

Evaluator (Qwen 3.6 27B)

The image depicts the water cycle but contains significant factual and labeling errors. The term "Precification" is a misspelling of "Precipitation." Furthermore, the arrow labeled "Precification" points upward from the land, which incorrectly represents the direction of rain (which falls downward). The "collection" label is placed ambiguously near the river mouth. These inaccuracies make the diagram factually incorrect. VERDICT: FAIL

FAIL
A cross-section showing tectonic plates at a convergent boundary with subduction, a trench, volcanic arc, and layered mantle

Evaluator (Qwen 3.6 27B)

The image depicts a stylized, artistic cross-section of a subduction zone. It correctly identifies key features: a subducting plate (blue), a volcanic arc with eruption, a trench, and layered mantle. However, the style is highly abstract and non-realistic, resembling a 3D render or digital art rather than a scientific diagram. While the core geological concepts are present, the lack of factual/architectural accuracy in representation (e.g., exaggerated colors, non-anatomical layering, floating b

FAIL
An anatomical illustration of the human heart showing four chambers, valves, aorta, pulmonary arteries, and veins with correct placement

Evaluator (Qwen 3.6 27B)

The image is recognizable as a heart illustration but fails on factual accuracy. The anatomy is distorted: the aorta and pulmonary arteries are fused and misshapen, the chambers are asymmetrical and incorrectly proportioned, and the coronary vessels appear as random branches rather than following anatomical paths. The valves and septum are not clearly or correctly depicted. While stylistically detailed, it lacks the required anatomical precision. VERDICT: FAIL

Extreme 67%
FAIL
A Morpho menelaus butterfly with accurate iridescent blue dorsal wings, brown ventral wing pattern with eyespots, and correct body morphology

Evaluator (Qwen 3.6 27B)

The image depicts a butterfly with brown wings and prominent eyespots, which is characteristic of the ventral side of many Nymphalidae (like Morpho), but it completely lacks the requested "accurate iridescent blue dorsal wings." The prompt specifically asked for the blue dorsal view or a combination showing that feature. The subject shown is likely a different species entirely (resembling a Morpho ventral side or a similar satyrine), failing the specific visual criteria of the prompt. VERDICT:

PASS
The Hagia Sophia interior showing the massive central dome, pendentives, semi-domes, and Islamic calligraphy medallions alongside Byzantine mosaics

Evaluator (Qwen 3.6 27B)

The image depicts the interior of the Hagia Sophia with recognizable architectural elements: the massive central dome, semi-domes, pendentives, and large Islamic calligraphy medallions. The lighting and perspective are plausible. However, the “Byzantine mosaics” mentioned in the prompt are not clearly visible or accurately represented; instead, the walls show stylized floral/geometric patterns and faded figures that don’t match known surviving mosaics. This is a minor factual inaccuracy but does

PASS
A specimen of bismuth crystal showing accurate iridescent oxide layers, geometric hopper crystal formation, and characteristic staircase structure

Evaluator (Qwen 3.6 27B)

The image accurately depicts a bismuth crystal with its characteristic geometric hopper formation and staircase structure. The iridescent oxide layers display realistic colors (blues, purples, golds) resulting from thin-film interference. The crystalline geometry is precise, showing the cubic symmetry typical of bismuth. The texture and lighting appear natural and consistent with high-quality macro photography of mineral specimens. VERDICT: PASS

Professional Studio 89%

Camera & Lighting 92%

Camera & Lighting measures whether fal/krea/v2-medium follows specific lens, angle, focus, exposure, and lighting instructions.

Easy 67%
FAIL
A wide-angle shot of a modern living room interior showing expansive space with slight barrel distortion at the edges

Evaluator (Gemma 4 26B)

The image shows a wide-angle perspective of a modern living room with expansive space. The lines of the ceiling beams and the floor appear straight and undistorted, showing no signs of the requested "slight barrel distortion at the edges." The lighting is natural and even, but the specific camera-induced distortion requested in the prompt is absent. VERDICT: FAIL
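For readers unfamiliar with the term, barrel distortion is commonly described with a radial model in which each point is scaled by a factor that depends on its distance from the image center. The sketch below uses a one-term version of that model with a negative coefficient, which pulls points near the edges inward and makes straight lines bow. The coefficient value and the sample line are illustrative assumptions, and this is not how the judge assessed the image.

```python
def distort(x, y, k1):
    """Apply one-term radial distortion to normalized image coordinates.

    (0, 0) is the image center. k1 < 0 gives barrel distortion,
    k1 > 0 gives pincushion distortion, and k1 = 0 leaves points unchanged.
    """
    r2 = x * x + y * y
    scale = 1 + k1 * r2
    return x * scale, y * scale

# Points along a straight horizontal line near the top edge of the frame.
line = [(x / 2, 0.8) for x in range(-2, 3)]
bowed = [distort(x, y, k1=-0.1) for x, y in line]

for (x, y), (xd, yd) in zip(line, bowed):
    print(f"({x:+.1f}, {y:.1f}) -> ({xd:+.3f}, {yd:.3f})")
```

The ends of the line land closer to the center than its middle, so the line curves. A generated image whose ceiling beams stay perfectly straight all the way to the edges, as the evaluator describes, shows no such bowing.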

PASS
A close-up macro shot of a single dewdrop on a rose petal, shallow depth of field

Evaluator (Gemma 4 26B)

The image perfectly matches the prompt. It features a close-up macro shot of a single, clear dewdrop resting on a pink rose petal. The depth of field is extremely shallow, with the dewdrop and the immediate petal surface in sharp focus while the background and surrounding petals are heavily blurred. The lighting is soft and natural, highlighting the transparency and reflections within the droplet. VERDICT: PASS

PASS
An aerial bird's-eye view looking straight down at a traffic roundabout with cars and trees

Evaluator (Gemma 4 26B)

The image perfectly matches the prompt. It features an aerial bird's-eye view looking straight down at a traffic roundabout. The perspective is a true top-down view, the depth of field is consistent across the scene, and the lighting creates realistic shadows for the cars and trees within the roundabout. VERDICT: PASS

Medium 100%
PASS
A portrait lit with classic Rembrandt lighting — triangle of light on the shadowed cheek, moody atmosphere

Evaluator (Gemma 4 26B)

The image successfully implements Rembrandt lighting. There is a distinct, small triangle of light on the shadowed cheek below the eye, which is the hallmark of this technique. The lighting is directional, creating a high-contrast, moody atmosphere with deep shadows. The perspective is a standard portrait angle, and the depth of field is appropriate for a close-up shot. VERDICT: PASS

PASS
A landscape photograph bathed in golden hour light — long warm shadows, orange sky, soft glow

Evaluator (Gemma 4 26B)

The image perfectly matches the prompt. It features a landscape bathed in golden hour light with a vibrant orange sky and a soft glow from the setting sun. The lighting creates long, warm shadows across the rocky terrain and mountain ridges, accurately capturing the requested atmosphere and lighting pattern. VERDICT: PASS

PASS
A high-key portrait of a person against a bright white background, minimal shadows, airy and luminous

Evaluator (Gemma 4 26B)

The image perfectly matches the prompt. It features a high-key portrait with a bright white, seamless background. The lighting is soft and even, creating an airy and luminous feel with minimal shadows on the subject's face and clothing. The perspective is a standard portrait shot, and the depth of field is shallow, keeping the subject in sharp focus against the bright, blown-out background. VERDICT: PASS

Hard 100%
PASS
A portrait shot at 85mm f/1.4 with creamy bokeh background, backlit with warm rim light separating the subject from a blurred garden

Evaluator (Gemma 4 26B)

The image successfully matches all components of the prompt. The perspective is a classic portrait shot. The depth of field is shallow, creating a creamy bokeh background with a blurred garden. Most importantly, the lighting features a strong warm rim light on the subject's hair and shoulder, effectively separating her from the background as requested. VERDICT: PASS

PASS
A street photograph with shallow depth of field as if shot at 50mm f/1.8, subject sharp with soft city lights in the background

Evaluator (Gemma 4 26B)

The image successfully captures the requested aesthetic. The subject is sharp and in focus, while the background exhibits a shallow depth of field characteristic of a 50mm f/1.8 lens. The background elements, including the buildings and vehicles, are softly blurred. The lighting is natural and consistent with a street photograph, and the bokeh effect on the background lights matches the description. VERDICT: PASS

PASS
A food photograph shot from above at 35mm f/2.8 with natural window light from the left, slight vignette, warm color temperature

Evaluator (Gemma 4 26B)

The image matches the prompt's requirements. The perspective is a top-down shot (shot from above). The depth of field is shallow (f/2.8), as the edges of the bowl and the background elements (fork and cup) are slightly out of focus. The lighting is soft and directional, coming from the upper left, creating gentle shadows on the right side of the food. The color temperature is warm, and a subtle vignette is present around the edges. VERDICT: PASS
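The three Hard prompts specify a focal length and an f-number. Since the f-number is the focal length divided by the entrance pupil diameter, the pupil diameter implied by each prompt can be computed directly, which gives a rough sense of why the 85mm f/1.4 prompt asks for the strongest background blur. This is a minimal sketch of that ratio only. It does not compute depth of field, which also depends on subject distance and sensor size.

```python
def entrance_pupil_mm(focal_length_mm, f_number):
    """Entrance pupil diameter implied by a focal length and f-number."""
    return focal_length_mm / f_number

# Lens settings named in the three Hard prompts above.
for focal, n in [(85, 1.4), (50, 1.8), (35, 2.8)]:
    print(f"{focal}mm f/{n}: {entrance_pupil_mm(focal, n):.1f} mm")
```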

Extreme 100%
PASS
A tilt-shift photograph of a city street making real-life look like a miniature model — selective focus strip, saturated colors

Evaluator (Gemma 4 26B)

The image successfully implements the tilt-shift effect. There is a clear selective focus strip in the foreground/midground where the white van and pedestrians are sharp, while the background (trees and buildings) and extreme foreground are heavily blurred. This creates the "miniature model" look requested. The colors are also vibrant and saturated, matching the prompt's description of the lighting and aesthetic. VERDICT: PASS

PASS
A cinematic frame with blue and orange anamorphic lens flare streaking horizontally across a dark moody scene

Evaluator (Gemma 4 26B)

The image successfully captures the requested elements. It features a dark, moody scene with a shallow depth of field. Most importantly, it displays prominent blue and orange light streaks that run horizontally across the frame, perfectly mimicking the characteristic look of an anamorphic lens flare. The colors and the horizontal orientation of the light streaks match the prompt's specific requirements. VERDICT: PASS

PASS
A portrait with dramatic split lighting — half the face brightly lit, the other half in deep shadow, high contrast chiaroscuro

Evaluator (Gemma 4 26B)

The image successfully implements the requested lighting pattern. It features a dramatic split lighting effect where one side of the face is clearly illuminated while the other side is cast in deep, heavy shadow. This creates the high-contrast chiaroscuro effect described in the prompt. The perspective is a close-up portrait, and the lighting pattern perfectly matches the requirement of half the face being bright and the other half in deep shadow. VERDICT: PASS

Color Precision 92%

Color Precision measures whether fal/krea/v2-medium renders requested colors clearly and consistently across subjects, products, and backgrounds.

Easy 100%
PASS
A bright red sports car parked on a gray asphalt road — the red is vivid and unambiguous

Evaluator (Gemma 4 26B)

The image features a bright, vivid red sports car parked on a gray asphalt road. The red color of the car is intense, saturated, and unambiguous, perfectly matching the prompt's description. The road is a consistent gray color, and the color contrast between the red car and the gray background is sharp and accurate. VERDICT: PASS

PASS
A woman wearing a royal blue evening dress against a neutral background

Evaluator (Gemma 4 26B)

The image features a woman wearing a vibrant, deep blue evening dress. The color is a true royal blue, which is a rich and saturated shade of blue. The background is a light, neutral gray/off-white, which provides a clean and minimalist setting as requested. The colors match the prompt's specifications perfectly. VERDICT: PASS

PASS
A field of pure sunflower yellow flowers under a clear blue sky

Evaluator (Gemma 4 26B)

The image features a field of bright yellow sunflowers and a clear, solid blue sky. The colors perfectly match the prompt's requirement for "pure sunflower yellow flowers" and a "clear blue sky." There are no deviations in the color profiles requested. VERDICT: PASS

Medium 100%
PASS
A bedroom wall painted in #FF6B35 (burnt orange) with white furniture and natural daylight

Evaluator (Gemma 4 26B)

The image accurately reflects the prompt's color requirements. The wall is a vibrant, saturated burnt orange that closely matches the specified hex code #FF6B35. The furniture, including the large wardrobe, the bed frame, and the side chair, is a clean, bright white. The lighting is bright and even, consistent with the "natural daylight" requirement. VERDICT: PASS

PASS
A forest scene where the foliage is precisely #2D5F2D (dark muted green) with brown tree trunks

Evaluator (Gemma 4 26B)

The image features a forest scene with various shades of dark green foliage and brown tree trunks. The foliage colors are a dark, muted green that closely aligns with the requested hex code #2D5F2D. The tree trunks are a consistent brown color. The colors in the image accurately match the specific color requirements provided in the prompt. VERDICT: PASS

PASS
A room with walls in #4A90D9 (medium blue) with warm wooden floor and white trim

Evaluator (Gemma 4 26B)

The image features walls in a medium blue shade that closely aligns with the hex code #4A90D9. The floor is a warm-toned wood, and the ceiling/trim area is white. All specified color elements—the medium blue walls, the warm wooden floor, and the white trim—are accurately represented in the image. VERDICT: PASS

Hard 67%
PASS
A minimalist room with walls in #F5E6CC (warm cream), a sofa in #2C3E50 (dark navy), and accent pillows in #E74C3C (bright red)

Evaluator (Gemma 4 26B)

The image accurately reflects the requested color palette. The walls are a warm cream/off-white color, the sofa is a dark navy blue, and the accent pillows are a bright, saturated red. All three primary colors specified in the prompt (warm cream, dark navy, and bright red) are clearly present and correctly applied to the respective objects. VERDICT: PASS

PASS
A product flat-lay on a #1A1A2E (deep dark blue) background, with a #F0C27F (soft gold) watch and #FFFFFF (pure white) earbuds

Evaluator (Gemma 4 26B)

The image accurately reflects the requested colors. The background is a deep, dark navy blue consistent with #1A1A2E. The watch features a soft gold metallic finish matching #F0C27F. The earbuds are a pure, bright white consistent with #FFFFFF. All three color-coded elements are correctly represented in the flat-lay composition. VERDICT: PASS

FAIL
A graphic design composition with background #FAFAFA (near-white), main shape #FF4500 (orange-red), and accent lines #333333 (dark charcoal)

Evaluator (Gemma 4 26B)

The image features a background that is pure white (#FFFFFF), which is slightly different from the requested near-white #FAFAFA. The main shape is a vibrant orange-red, which closely aligns with the requested #FF4500. The accent lines are black, which is darker than the requested dark charcoal #333333. While the colors are very close, they do not strictly match the specific hex values provided. VERDICT: FAIL
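To put the two mismatches the evaluator names on a common scale, the sketch below converts hex codes to RGB and measures the straight-line distance between them. Several assumptions apply: Euclidean distance in RGB is a crude stand-in for perceived difference, "black" is taken to mean #000000 because the evaluator gives no hex value for the lines, and the benchmark's verdict came from a VLM judge, not from a pixel measurement like this one.

```python
def hex_to_rgb(code):
    """Convert '#RRGGBB' to an (r, g, b) tuple of integers in 0..255."""
    code = code.lstrip("#")
    return tuple(int(code[i:i + 2], 16) for i in (0, 2, 4))

def rgb_distance(a, b):
    """Euclidean distance between two hex colors in RGB space."""
    return sum((x - y) ** 2 for x, y in zip(hex_to_rgb(a), hex_to_rgb(b))) ** 0.5

# Requested color vs. the color the evaluator reports seeing.
background_gap = rgb_distance("#FAFAFA", "#FFFFFF")
accent_gap = rgb_distance("#333333", "#000000")  # "black" assumed to be #000000

print(round(background_gap, 1))
print(round(accent_gap, 1))
```

By this measure the accent lines are roughly ten times further from their target than the background is, so the background difference alone would be hard to notice by eye.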

Extreme 100%
PASS
A smooth gradient on a wall going from #FF0000 (red) on the left to #0000FF (blue) on the right, transitioning evenly through purple

Evaluator (Gemma 4 26B)

The image shows a smooth horizontal gradient. On the far left, the color is a vibrant red, which matches #FF0000. On the far right, the color is a deep blue, which matches #0000FF. The transition through the middle shows a clear and even transition through purple. The colors and the direction of the gradient perfectly match the prompt's requirements. VERDICT: PASS
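The "transitioning evenly through purple" requirement matches what a linear blend between the two endpoints produces. As a minimal sketch, assuming a straight per-channel interpolation in RGB (the prompt does not specify a color space), the midpoint of the gradient can be computed as follows. The helper names are hypothetical.

```python
def hex_to_rgb(code):
    """Convert '#RRGGBB' to an (r, g, b) tuple of integers in 0..255."""
    code = code.lstrip("#")
    return tuple(int(code[i:i + 2], 16) for i in (0, 2, 4))

def lerp_color(start, end, t):
    """Blend two hex colors channel by channel; t=0 is start, t=1 is end."""
    a, b = hex_to_rgb(start), hex_to_rgb(end)
    mixed = (int(x + (y - x) * t + 0.5) for x, y in zip(a, b))
    return "#" + "".join(f"{channel:02X}" for channel in mixed)

# Five evenly spaced stops from red (left) to blue (right).
stops = [lerp_color("#FF0000", "#0000FF", i / 4) for i in range(5)]
print(stops)
```

The middle stop is #800080, the color conventionally named purple, which is the hue the evaluator reports seeing at the center of the wall.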

PASS
A sunset sky gradient on a flat surface from #FF6B35 (orange) at the bottom to #1B1464 (deep indigo) at the top

Evaluator (Gemma 4 26B)

The image features a vertical gradient. The bottom color is a vibrant orange, which aligns well with the requested #FF6B35 (orange). The top of the image transitions into a very dark, deep indigo/near-black color, which matches the requested #1B1464 (deep indigo). The transition through the middle colors also follows a natural sunset progression. VERDICT: PASS

PASS
A horizontal banner with a gradient from #00C9FF (cyan) on the left to #92FE9D (mint green) on the right

Evaluator (Gemma 4 26B)

The image shows a horizontal banner with a smooth gradient. On the left, the color is a bright cyan/sky blue, which aligns well with #00C9FF. On the right, the color transitions into a light mint green, matching the description of #92FE9D. The color transition and the specific hues requested are accurately represented in the image. VERDICT: PASS

Photorealism 67%

Photorealism measures whether fal/krea/v2-medium produces polished product and studio imagery with realistic materials, lighting, shadows, and finish.

Easy 67%
FAIL
A single red sneaker on a clean white background with soft studio lighting and gentle shadow — product photography style

Evaluator (Gemma 4 26B)

The image fails to pass as a real photograph due to significant AI artifacts. The most glaring issue is the distorted, nonsensical white logo on the side of the shoe, which lacks coherent shape or structure. Additionally, the lace structure is physically impossible and messy, and the texture of the shoe appears unnaturally smooth and "plastic-like" rather than realistic fabric or leather. VERDICT: FAIL

PASS
A clear glass perfume bottle on a reflective black surface with studio lighting, no background distractions

Evaluator (Gemma 4 26B)

The image successfully captures the requested studio aesthetic. The glass texture, the ribbed pattern, and the reflection on the black surface are rendered realistically. The lighting is soft and professional, and there are no obvious AI artifacts, such as warped geometry or nonsensical textures, that break the illusion of a real photograph. The composition is clean and meets the prompt's requirements perfectly. VERDICT: PASS

PASS
A ripe green apple on a seamless white background with soft side lighting and subtle shadow — catalog style

Evaluator (Gemma 4 26B)

The image successfully captures the requested studio aesthetic. The lighting is soft and natural, creating a realistic subtle shadow on the white background. The texture of the apple skin, including the subtle red striations and the stem, appears highly realistic and lacks obvious AI-generated artifacts or distortions. It passes as a high-quality, professional product photograph at first glance. VERDICT: PASS

Graphical design 46%

Data Visualisation 33%

Data Visualisation measures whether fal/krea/v2-medium produces charts with correct labels, proportions, colors, and readable values.

Easy 33%
FAIL
A bar chart showing monthly sales for January through June with values 12K, 18K, 15K, 22K, 19K, 25K — labeled axes, color-coded bars, and a title 'Monthly Sales 2025'

Evaluator (Qwen 3 VL)

The chart correctly displays a bar chart with labeled axes, title, and color-coded bars. Values (12K, 18K, 15K, 22K, 19K, 25K) match the prompt for Jan–Jun. However, the Y-axis is inverted (increasing values go downward), which is misleading and violates standard chart conventions. The core data and labels are accurate, but the axis orientation is incorrect. VERDICT: FAIL

PASS
A pie chart showing market share: Company A 35%, Company B 25%, Company C 20%, Others 20% — each slice a different color with labels and percentages

Evaluator (Qwen 3 VL)

The pie chart correctly displays all four categories with accurate percentages: Company A (35%), Company B (25%), Company C (20%), and Others (20%). Each slice has a distinct color and clear, readable labels with percentages. The chart type matches the requested pie chart format. All visual elements are present and properly labeled. VERDICT: PASS

FAIL
A simple line graph showing daily website visitors over one week (Mon-Sun): 1200, 1500, 1100, 1800, 2000, 1700, 1400 — with grid lines, axis labels, and title 'Weekly Traffic'

Evaluator (Qwen 3 VL)

The chart shows a line graph titled "Weekly Traffic" with correct days (Mon-Sun) and data points: 1200, 1500, 1100, 1800, 2000, 1700, 1400. Grid lines and axis labels are present. However, the y-axis scale is mislabeled — it incorrectly shows "2500" multiple times instead of increasing in increments (e.g., 1400, 1700, 2000). This misrepresentation undermines accuracy. VERDICT: FAIL
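
Both failures in this subcategory concern the axis scale, not the plotted values: one y-axis runs in the wrong direction, and the other repeats the same tick label. If tick labels were read off the image (for example by OCR, which is an assumption here and not something the benchmark is stated to do), a check along these lines could flag both cases. The function names and the example tick lists are hypothetical; only the pie-chart shares come from the prompt.

```python
def axis_problems(ticks_bottom_to_top):
    """List problems in y-axis tick values given in bottom-to-top order."""
    problems = []
    pairs = list(zip(ticks_bottom_to_top, ticks_bottom_to_top[1:]))
    if any(a == b for a, b in pairs):
        problems.append("repeated tick value")
    if pairs and all(a > b for a, b in pairs):
        problems.append("axis inverted (values decrease upward)")
    elif any(a > b for a, b in pairs):
        problems.append("ticks not monotonic")
    return problems


def shares_sum_to_100(shares):
    """A pie chart's percentages should account for the whole."""
    return sum(shares) == 100


# Illustrative tick lists, not extracted from the benchmark images.
print(axis_problems([0, 500, 1000, 1500, 2000, 2500]))  # well-formed scale
print(axis_problems([25, 20, 15, 10, 5, 0]))            # inverted scale
print(axis_problems([2500, 2500, 2500]))                # repeated label

# Shares requested in the pie chart prompt.
print(shares_sum_to_100([35, 25, 20, 20]))
```

A check like this only covers the scale; whether bar heights or line points match the requested values would still need a separate comparison.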

Layout & Design 22%

Layout & Design measures whether fal/krea/v2-medium organizes posters, menus, covers, and other designed compositions with the requested hierarchy and elements.

Easy 67%
FAIL
A movie night poster with a large popcorn image in the center, the title 'MOVIE NIGHT' in bold at the top, and the tagline 'Every Friday at 8 PM' at the bottom

Evaluator (Qwen 3 VL)

The image shows a popcorn bucket centered, with "MOVIE NIGHT" in bold dots at the top and "Every Friday at 8 PM" at the bottom. However, the tagline text is misaligned and partially obscured by the bucket. The bucket also features incorrect branding ("MOPY NIGHY"). The core layout is present but flawed in text placement and accuracy. VERDICT: FAIL

PASS
An event poster featuring a guitar illustration, the headline 'LIVE MUSIC FEST' in large text, and 'Tickets available now' as a tagline below

Evaluator (Qwen 3 VL)

The image displays a guitar illustration, the headline "LIVE MUSIC FEST" in large, prominent text, and "Tickets available now" below it. Text is readable and correctly placed with proper visual hierarchy: headline dominates, tagline is smaller and subordinate. Layout matches prompt exactly. VERDICT: PASS

PASS
A book club poster with an open book illustration, the title 'JOIN THE BOOK CLUB' prominently displayed, and 'Meets every Wednesday' as the tagline

Evaluator (Qwen 3 VL)

The poster features an open book illustration topped with stacked books, the title "JOIN THE BOOK CLUB" in large, bold text, and the tagline "Meets every Wednesday" below it. All elements are present, correctly placed, and readable. Visual hierarchy is clear with title dominating, tagline secondary. Layout matches prompt perfectly. VERDICT: PASS

Medium 0%
FAIL
A landing page design with a hero header image at the top, a headline and subtitle in the middle, a 'Sign Up' call-to-action button, and a footer with three social media icons

Evaluator (Qwen 3 VL)

The image shows a hero header, centered headline/subtitle, a "CATTN COTTON" CTA button, and footer social icons (Facebook, search, Instagram). However, the headline and subtitle text is garbled and unreadable, violating the requirement for readable text. The layout structure is present but text content fails. VERDICT: FAIL

FAIL
A newsletter layout with a banner image header, a two-column body section with text on the left and an image on the right, a highlighted quote block, and a footer with unsubscribe link

Evaluator (Qwen 3 VL)

The image shows a newsletter layout with a header banner, two-column body (text left, image right), a highlighted quote block, and a footer with an unsubscribe link. However, all text is gibberish and unreadable, violating the requirement for readable, correctly placed text. The visual hierarchy is present but meaningless due to illegible content. VERDICT: FAIL

FAIL
A restaurant menu layout with a logo header, three category sections (appetizers, mains, desserts) each with items and prices, and a footer with address and hours

Evaluator (Qwen 3 VL)

The image shows a menu with a logo header (“RSTENO”), three category sections (Appetizers, Main courses, Desserts), and prices. However, all text is gibberish or placeholder, unreadable. The footer is present but illegible. Visual hierarchy is intact, but content fails functional purpose. VERDICT: FAIL

Hard 0%
FAIL
A magazine double-page spread with a left page containing a large hero photo, and a right page with a two-column article, a pull quote, a sidebar with author bio, and a page number

Evaluator (Qwen 3 VL)

The left page features a large hero photo as required. The right page has a two-column article, a pull quote, and a sidebar with author bio and page number. However, all text is gibberish placeholder, unreadable, and incorrectly placed. Layout structure is present but text fails core readability requirement. VERDICT: FAIL

FAIL
A mobile app UI mockup showing a dashboard with a top navigation bar, a search field, a 2x2 grid of feature cards with icons, a list of recent activity items, and a bottom tab bar

Evaluator (Qwen 3 VL)

The image shows a search bar, a 2x2 grid of feature cards with icons and labels, and a list of recent activity items with profile pictures and status indicators. However, there is no top navigation bar or bottom tab bar as specified. Text is readable but the search field contains placeholder text “Savoritesn” which is not a real prompt. Layout hierarchy is mostly correct but key elements are missing. VERDICT: FAIL

FAIL
A fashion magazine spread with a full-bleed left-page photo, right page with three-column layout, headline, body copy, a smaller inset image, and styled page numbers

Evaluator (Qwen 3 VL)

The left page has a full-bleed photo with a headline “SO MANY UNANSWERED QUESTIONS.” The right page features a three-column layout with a large quote, body copy, and a smaller inset image. Page numbers 12 and 13 are visible. However, all text is gibberish, unreadable placeholder text, violating the requirement for readable content. VERDICT: FAIL

Style Diversity 67%

Style Diversity measures whether fal/krea/v2-medium can shift between visual styles like oil painting, pixel art, watercolor, and technical illustration.

Easy 100%
PASS
A golden retriever sitting in a garden, painted in the style of an oil painting with visible brushstrokes and rich texture

Evaluator (Qwen 3.5 122B)

The image clearly depicts a golden retriever in a garden setting. The style is unmistakably that of an oil painting, with visible, expressive brushstrokes throughout the fur and background. The texture appears rich and painterly, with blended colors and soft edges characteristic of the medium. The lighting and color palette also support the oil painting aesthetic. VERDICT: PASS

PASS
A golden retriever sitting in a garden, rendered in Japanese anime style with flat colors and large expressive eyes

Evaluator (Qwen 3.5 122B)

The image clearly exhibits the requested Japanese anime style. Key characteristics are present: the golden retriever has large, expressive eyes with distinct highlights, simplified facial features, and bold black outlines. The coloring uses flat, vibrant blocks of color (yellow ground, green background, purple/pink flowers) with minimal shading gradients, consistent with cel-shaded animation. The overall aesthetic matches the prompt’s description precisely. VERDICT: PASS

PASS
A golden retriever sitting in a garden, rendered in pixel art style with a 32x32 grid and limited color palette

Evaluator (Qwen 3.5 122B)

The image clearly depicts a golden retriever in a garden setting. The art style is unmistakably pixel art, with visible square pixels forming the entire image. The grid appears to be approximately 32x32 pixels, and the color palette is intentionally limited, using only a few shades of brown, green, pink, and purple. All key visual characteristics of the requested style are present and accurately executed. VERDICT: PASS

Medium 67%
PASS
A city street scene in the style of Art Deco — geometric shapes, gold and black palette, ornamental symmetry

Evaluator (Qwen 3.5 122B)

The image strongly exhibits Art Deco characteristics: geometric building forms, ornamental symmetry in the central skyscraper’s facade, and a restricted gold-and-black palette with stylized sun rays and clouds. Human figures and vehicles are simplified into abstract shapes, consistent with the style. The composition emphasizes verticality and decorative patterns typical of Art Deco architecture. All key visual traits requested are clearly present and recognizable. VERDICT: PASS

PASS
A woman with a parasol in the style of Impressionism — soft brushstrokes, dappled light, pastel palette

Evaluator (Qwen 3.5 122B)

The image clearly exhibits Impressionist characteristics: soft, visible brushstrokes throughout the sky and clothing; dappled light effects on the parasol and dress; and a pastel palette dominated by blues, pinks, and creams. The background figures are intentionally blurred, emphasizing atmosphere over detail — a hallmark of the style. The overall effect matches the requested aesthetic. VERDICT: PASS

FAIL
A portrait of a man in Pop Art style — bold outlines, Ben-Day dots, flat vivid primary colors

Evaluator (Qwen 3.5 122B)

The image clearly exhibits Pop Art characteristics: bold black outlines define facial features and hair, and the skin tone is rendered with a visible Ben-Day dot pattern. However, the color palette is muted — primarily beige, brown, and olive green — rather than the “flat vivid primary colors” specified in the prompt. While stylistically recognizable as Pop Art, it fails to meet the color requirement. VERDICT: FAIL

Hard 67%
FAIL
A forest rendered in photorealistic watercolor — the detail and lighting of a photo combined with visible watercolor bleeds and paper texture

Evaluator (Qwen 3.5 122B)

The image clearly exhibits watercolor characteristics: visible bleeds, soft edges, and translucent layering of greens and blues. However, it lacks the “photorealistic” detail and lighting specified — tree bark textures are painterly rather than crisp, and there’s no discernible paper texture or high-fidelity photo-like rendering. The style is watercolor, but not photorealistic watercolor as requested. VERDICT: FAIL

PASS
A woman in a futuristic city rendered in cyberpunk Art Nouveau — neon colors and tech elements combined with flowing organic lines and floral borders

Evaluator (Qwen 3.5 122B)

The image successfully combines cyberpunk and Art Nouveau styles. The futuristic city background with neon lighting fulfills the cyberpunk requirement. The woman’s pose and the intricate, flowing floral borders with organic lines clearly reflect Art Nouveau aesthetics. Neon blue network overlays add tech elements. All key visual characteristics from the prompt are present and clearly recognizable. VERDICT: PASS

PASS
A medieval castle depicted in Japanese ukiyo-e woodblock print style combined with photorealistic sky and lighting

Evaluator (Qwen 3.5 122B)

The image successfully combines ukiyo-e woodblock print aesthetics (flat colors, visible grain texture, stylized clouds, bold outlines) with photorealistic lighting and sky gradients. The castle architecture blends medieval European elements with Japanese stylistic rendering. Key characteristics of both requested styles are clearly present and harmoniously integrated. VERDICT: PASS

Extreme 33%
PASS
An interior scene rendered in soft brutalism — raw concrete textures combined with rounded pastel furniture and warm diffused light

Evaluator (Qwen 3.5 122B)

The image clearly exhibits soft brutalism: raw concrete textures are visible on walls and arches, while furniture features rounded, pastel-toned forms. Warm diffused lighting enhances the minimalist, serene atmosphere. All key visual characteristics — materiality, shape, color palette, and lighting — align with the prompt. The style is unmistakably recognizable and well-executed. VERDICT: PASS

FAIL
A still life painted in anxious minimalism — sparse composition with uncomfortable negative space, muted colors, and subtle visual tension

Evaluator (Qwen 3.5 122B)

The image displays a minimalist composition with a single stool against a distressed wall, featuring muted colors and significant negative space. However, the image appears to be a photorealistic rendering or photograph rather than a painting. The prompt specifically requested a "painted" style, which is not evident here. The visual tension and anxious quality are also subtle rather than clearly defined artistic choices. VERDICT: FAIL

FAIL
A cityscape rendered in nostalgic futurism — retro-futuristic 1960s space-age aesthetics with a melancholic warm-toned patina

Evaluator (Qwen 3.5 122B)

The image displays a hazy, warm-toned cityscape with a melancholic atmosphere. However, it lacks the specific "retro-futuristic 1960s space-age" aesthetic requested. The architecture appears generic and modern rather than stylized with the distinct curves, chrome, or atomic-age motifs characteristic of 1960s futurism. The style is more akin to a generic dystopian or smoggy urban scene. VERDICT: FAIL
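
The percentages shown for Graphical design can be recomputed from the individual verdicts listed above. The sketch below assumes each rate is simply the share of PASS verdicts rounded to the nearest whole percent; the page does not state its exact method, but this assumption reproduces the displayed figures. The names `VERDICTS`, `pass_rate`, and `flatten` are illustrative.

```python
# Verdicts transcribed from the Graphical design results above,
# grouped by subcategory and difficulty tier.
VERDICTS = {
    "Data Visualisation": {
        "Easy": ["FAIL", "PASS", "FAIL"],
    },
    "Layout & Design": {
        "Easy": ["FAIL", "PASS", "PASS"],
        "Medium": ["FAIL", "FAIL", "FAIL"],
        "Hard": ["FAIL", "FAIL", "FAIL"],
    },
    "Style Diversity": {
        "Easy": ["PASS", "PASS", "PASS"],
        "Medium": ["PASS", "PASS", "FAIL"],
        "Hard": ["FAIL", "PASS", "PASS"],
        "Extreme": ["PASS", "FAIL", "FAIL"],
    },
}


def pass_rate(labels):
    """Share of PASS labels as a whole-number percentage."""
    return round(100 * labels.count("PASS") / len(labels))


def flatten(tiers):
    """Collect the verdicts of every difficulty tier into one list."""
    return [label for labels in tiers.values() for label in labels]


subcategory_rates = {name: pass_rate(flatten(tiers)) for name, tiers in VERDICTS.items()}
all_labels = [label for tiers in VERDICTS.values() for label in flatten(tiers)]
category_rate = pass_rate(all_labels)

for name, rate in subcategory_rates.items():
    print(f"{name}: {rate}%")
print(f"Graphical design: {category_rate}% ({all_labels.count('PASS')} of {len(all_labels)})")
```

Note that the category figure is computed over all 24 evaluations, not as the mean of the three subcategory percentages, so the 12 Style Diversity prompts weigh more heavily than the 3 Data Visualisation prompts.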