ImageBench

ImageBench V1

192 evaluations across 6 categories

Benchmark V1 verdicts are produced by VLM judges and can contain mistakes. Treat PASS/FAIL labels as machine-assisted assessments, and inspect the images yourself.
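Most evaluator notes on this page end with a `VERDICT: PASS` or `VERDICT: FAIL` marker, and a few are cut off before the marker. A minimal sketch of how such a label could be read from a judge note, assuming that marker format; `parse_verdict` is a hypothetical helper, not part of ImageBench:

```python
import re
from typing import Optional

# Assumed marker format, copied from the evaluator notes on this page.
VERDICT_RE = re.compile(r"VERDICT:\s*(PASS|FAIL)\b")

def parse_verdict(judge_text: str) -> Optional[str]:
    """Return 'PASS' or 'FAIL' from the last verdict marker, or None if absent."""
    matches = VERDICT_RE.findall(judge_text)
    return matches[-1] if matches else None

print(parse_verdict("Chalk dust visible at bottom. VERDICT: PASS"))  # PASS
```

Returning `None` for a truncated note keeps a missing verdict distinct from a FAIL, so those cases can be sent to manual inspection.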

Generation Details

Source-backed model context, size, cost, and request settings for this ImageBench V1 run.

local/sefi-image-2b-turbo

Local

SeFi Image 2B Turbo is a locally fine-tuned text-to-image model produced by the SeFi image-generation fine-tuning pipeline and run on an NVIDIA DGX Spark. It is the ~2B turbo (few-step) variant of the SeFi Image family. It is not a publicly released hosted product; no external model card or citation is disclosed.

Maker: SeFi pipeline
Family: SeFi Image
Model Size: ~2B (estimated)
Cost: local run; no API price (not applicable)
Run Target: gx10/sefi-image-2b-turbo
Effective Request: fields unknown

Overall: 45.8
Capability: 66%
Est. Preference: 26.1
Pass: 126
Fail: 66
Avg Latency: 3.6s
Min Latency: 3.2s
Max Latency: 4.4s
Text Rendering: 27%
Spatial Reasoning: 63%
Human realism: 76%
Truthfulness: 74%
Professional Studio: 85%
Graphical design: 46%
Preference: 26%
Latency: 62%
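The 66% Capability figure is consistent with the reported pass and fail counts (126 of 192). The page does not state the formula, so the sketch below is an inference from those numbers, and `pass_rate` is a hypothetical helper:

```python
def pass_rate(passed: int, failed: int) -> float:
    """Percentage of evaluations labelled PASS."""
    total = passed + failed
    if total == 0:
        raise ValueError("no evaluations to score")
    return 100.0 * passed / total

# Counts reported for this run: 126 PASS and 66 FAIL (192 evaluations).
capability = pass_rate(126, 66)
print(f"{capability:.1f}%")  # 65.6%
```

The same ratio matches the Writing accuracy subcategory, where 2 of 12 prompts pass and the page shows 17%. How the 45.8 Overall score combines capability, preference, and latency is not described here.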

All 192 generations

Text Rendering 27%

Typography Style 67%

Typography Style measures whether local/sefi-image-2b-turbo preserves requested lettering styles, font mood, and text texture such as typewriter ink or neon signage.

Easy 67%
[FAIL] The word 'CHAPTER ONE' typed on aged paper with a vintage typewriter font, complete with slightly uneven ink

Evaluator (Qwen 3 VL)

The text mimics a vintage typewriter font with uneven, smudged ink and paper texture. However, "CHAPTER ONE" is misspelled as "CHAPTEE" and "ONE" is incorrectly rendered as "1ONE" with a numeral "1" preceding "ONE". The style is convincing but the spelling errors violate the prompt's core requirement. VERDICT: FAIL

[PASS] The phrase 'TODAY'S SPECIAL' written in white chalk on a dark green chalkboard with chalk dust

Evaluator (Qwen 3 VL)

Text reads “TODAY’S SPECIAL” correctly. Letters mimic chalk texture with uneven edges and smudges on dark green surface. Chalk dust visible at bottom. Style convincingly matches chalkboard medium. All elements align with prompt. VERDICT: PASS

[PASS] The word 'OPEN' displayed on a red LED sign with glowing pixel-style letters

Evaluator (Qwen 3 VL)

The word "OPEN" is rendered in glowing red pixel-style letters, matching the LED sign description. The text is spelled correctly. The pixelated, illuminated appearance convincingly simulates an LED display with individual light bulbs forming each character. VERDICT: PASS

Writing accuracy 17%

Writing accuracy measures whether local/sefi-image-2b-turbo renders the exact requested words, numbers, labels, and short phrases without spelling or layout errors.
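The evaluator notes in this section list word-level misspellings. A rough sketch of an automated equivalent, assuming the rendered text has already been transcribed into a string (this page does not say ImageBench does that; its verdicts come from VLM judges). `word_mismatches` is a hypothetical helper:

```python
import difflib
from typing import List, Tuple

def word_mismatches(expected: str, rendered: str) -> List[Tuple[str, str]]:
    """Return (expected_words, rendered_words) pairs for every differing span."""
    exp_words = expected.split()
    got_words = rendered.split()
    matcher = difflib.SequenceMatcher(a=exp_words, b=got_words, autojunk=False)
    diffs = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            # Join each differing run so inserted or dropped words show up too.
            diffs.append((" ".join(exp_words[i1:i2]), " ".join(got_words[j1:j2])))
    return diffs

print(word_mismatches("HELLO", "HELO"))  # [('HELLO', 'HELO')]
```

An empty result corresponds to an exact match; any non-empty result would be a FAIL under the strict spelling criterion described above.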

Easy 67%
[FAIL] A single large word 'HELLO' centered on a white background

Evaluator (Qwen 3 VL)

The word is misspelled: "HELLO" is rendered as "HELO", missing the letter 'L' between 'E' and 'O'. The text is fully readable, but the spelling error violates the intended content. VERDICT: FAIL

[PASS] The word 'ADVENTURE' in big bold letters centered on a plain cream background

Evaluator (Qwen 3 VL)

The word "ADVENTURE" is spelled correctly and fully readable in bold, centered letters on a cream background. All characters are distinct and legible. No misspellings or missing words are present. The text matches the intended specification exactly. VERDICT: PASS

[PASS] The word 'SERENITY' written large and centered on a solid light blue background

Evaluator (Qwen 3 VL)

The word "SERENITY" is fully spelled correctly and is large, centered, and clearly readable against the solid light blue background. All letters are intact and legible. No misspellings, missing words, or illegible parts are present. VERDICT: PASS

Medium 0%
[FAIL] The sentence 'The quick brown fox jumps high' in clean black text on a white background

Evaluator (Qwen 3 VL)

"quick" is misspelled as "aquick"; "brown" is distorted to "hybrown"; "jumps" is rendered as "fumps"; "high" is shown as "hiah". All words are illegible or incorrect. The text fails to match the intended phrase precisely. VERDICT: FAIL

[FAIL] 'Every sunset brings a new dawn ahead' written neatly on a pale yellow background

Evaluator (Qwen 3 VL)

The text reads "aheud aherd" instead of "ahead". "aheud" is a misspelling of "ahead", and "aherd" is a misspelling of "ahead". The first three lines are correct. The final line contains two errors, making the text inaccurate. VERDICT: FAIL

[FAIL] 'Stars shine brightest in the dark night' displayed on a clean off-white background

Evaluator (Qwen 3 VL)

The text contains multiple errors: "Stas" (should be "Stars"), "brighthhss" (should be "brightness"), and the word "the" is partially obscured. The background is clean off-white. Not all words are correctly spelled or fully readable. VERDICT: FAIL

Hard 0%
[FAIL] A poster with headline 'SUMMER SALE' in large bold text, body text 'Up to 50% off all items this weekend only' in medium size, and fine print 'Terms and conditions apply. See store for details.' at the bottom in small text

Evaluator (Qwen 3 VL)

Text contains multiple errors: "itemss" (missing 'i'), "weeel" (misspelled 'week'), "olly" (misspelled 'only'), and "weeeand" (misspelled 'weekend'). The fine print is correct. Several words are distorted or illegible. Core text fails spelling and readability. VERDICT: FAIL

[FAIL] A flyer with large title 'GRAND OPENING', subtitle 'Join us for food, music, and fun' in medium text, and small disclaimer 'RSVP required. Limited seating available.' at the bottom

Evaluator (Qwen 3 VL)

“Limied” is misspelled (should be “Limited”). All other text is correctly spelled and readable. The misspelling is a clear error in the disclaimer line. VERDICT: FAIL

[FAIL] An event banner with headline 'ANNUAL GALA 2025' at the top in large serif font, description 'An evening of fine dining and live jazz' in medium text, and footnote 'Black tie optional. Doors open at 6 PM.' in small text

Evaluator (Qwen 3 VL)

Text contains multiple errors: "GAL225" (incorrect year format), "off" → "of", "an ter" → "and", "tn fre nıngof" → "fine dining", "jazz" → "jazz" (correct), "opptioital" → "optional", and "GIM" (unrelated). Many words are garbled or missing. Core text is not readable as intended. VERDICT: FAIL

Extreme 0%
[FAIL] A vintage movie poster for 'THE LAST VOYAGE' featuring a tagline 'One ship. One chance. No turning back.' with credits block reading 'Directed by James Harlow. Starring Elena Voss, Marcus Chen, and Sofia Reyes. Music by Thomas Grant. A Paramount Pict

Evaluator (Qwen 3 VL)

Misspellings and errors: “Oerine Vorts ARema. Cers.” → “Elena Voss, Marcus Chen, Sofia Reyes”; “Marus Rses” → “Marcus Chen”; “No turning back.” → “No turning back.” (correct); “Directed by James Elarlow.” → “James Harlow”; “Alb onis Chanl. Matnass. Releste. Raleae” → “Starring Elena Voss, Marcus Chen, and Sofia Reyes. Music by Thomas Grant. A Paramount Pictures Release.”; “Coming Summer 2025.” → correct. Many words are garbled or missing. VERDICT: FAIL

[FAIL] A magazine cover titled 'NATIONAL EXPLORER' with cover lines 'The Hidden Temples of Cambodia — page 34', 'Interview: Dr. Maria Santos on Climate Action', 'Top 10 Trails You Must Hike This Year', and a barcode area with 'Issue 247 | March 2025 | $6.99

Evaluator (Qwen 3 VL)

Text is heavily distorted, with numerous misspellings and illegible fragments: “NATIONAL EXPLORER” is garbled; “The Hidden Temples of Cambodia” has errors; “Interview” and “Dr. Maria Santos” are misrendered; “Climate Action” is partially unreadable; “Top 10 Trails You Must Hike This Year” is corrupted. Barcode area also shows errors. No text is fully accurate. VERDICT: FAIL

[FAIL] A product label for 'GOLDEN HARVEST ORGANIC HONEY' with subtitle 'Raw & Unfiltered', weight '12 oz (340g)', description 'Pure wildflower honey harvested from sustainable apiaries in the Pacific Northwest. No additives or preservatives.', and small te

Evaluator (Qwen 3 VL)

Multiple errors: "ORGANIC" misspelled as "ORGAIIS"; "340g" as "(34(g)"; "wildflower" as "wildflowe hornzaveedfon"; "sustainable apiaries" as "sustaiblle apriratsin"; "No additives or preservatives" as "No aduitites or presaraiaives"; "Best before" as "Best detefiu26"; "Store" as "Stroer"; "06/2026" correctly shown but "Best before" line has typo. VERDICT: FAIL

Spatial Reasoning 63%

Attributes Binding 100%

Attributes Binding measures whether local/sefi-image-2b-turbo attaches the right colors, materials, and properties to the correct requested objects.
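The evaluator notes in this subcategory repeatedly check for "attribute swaps", where two objects exchange their requested properties. A small sketch of that check, assuming the observed attribute of each object is already available as a string; both the dictionaries and `find_swaps` are hypothetical and not part of the benchmark:

```python
from typing import Dict, List, Tuple

def find_swaps(requested: Dict[str, str], observed: Dict[str, str]) -> List[Tuple[str, str]]:
    """Return object pairs whose requested attributes appear exchanged."""
    swaps = []
    names = sorted(requested)
    for index, first in enumerate(names):
        for second in names[index + 1:]:
            exchanged = (
                requested[first] != requested[second]
                and observed.get(first) == requested[second]
                and observed.get(second) == requested[first]
            )
            if exchanged:
                swaps.append((first, second))
    return swaps

requested = {"ball": "red", "cube": "blue"}
print(find_swaps(requested, {"ball": "blue", "cube": "red"}))  # [('ball', 'cube')]
```

A swap is only one failure mode; an object with a wholly wrong attribute would need a separate per-object comparison.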

Easy 100%
[PASS] A red ball and a blue cube sitting side by side on a white surface

Evaluator (Qwen 3.5 122B)

The image shows two objects on a white surface: 1. Left object: Red sphere (ball) — correct color and shape. 2. Right object: Blue cube — correct color and shape. Both are side by side, as requested. No attribute swaps observed. All specified attributes (color, shape, spatial arrangement) are correctly bound to the respective objects. VERDICT: PASS

[PASS] A green apple and an orange pumpkin on a wooden cutting board

Evaluator (Qwen 3.5 122B)

The image shows a green apple and an orange pumpkin on a wooden cutting board. The apple is correctly colored green, round, and smaller than the pumpkin. The pumpkin is correctly colored orange, round, and larger than the apple. Both objects are placed on a wooden surface that resembles a cutting board. All attributes (color, size, shape) are correctly bound to the right object with no swaps. VERDICT: PASS

[PASS] A yellow taxi and a black limousine parked next to each other on a street

Evaluator (Qwen 3.5 122B)

The image shows a yellow taxi in the foreground and a black limousine parked next to it on a street. The taxi is correctly colored yellow and has the typical taxi shape with a roof sign. The limousine is correctly colored black, elongated (limousine shape), and parked beside the taxi. All attributes are correctly bound to their respective objects with no swaps. VERDICT: PASS

Medium 100%
[PASS] A large red circle, a small blue triangle, and a medium green star arranged on a white canvas

Evaluator (Qwen 3.5 122B)

The image shows a large red circle on a white canvas. Inside the circle, there is a small blue triangle and a medium green star. All attributes (color, size, shape) are correctly bound to their respective objects with no swaps. VERDICT: PASS

[PASS] A tall thin white candle, a short fat red candle, and a medium spiral yellow candle on a shelf

Evaluator (Qwen 3.5 122B)

The image shows three candles on a white shelf against a neutral background. 1. Left object: Tall, thin, white candle — matches description. 2. Middle object: Short, fat, red candle — matches description. 3. Right object: Medium-height, spiral-shaped, yellow candle — matches description. All attributes (color, size, shape) are correctly bound to each object with no swaps. VERDICT: PASS

[PASS] A big brown horse, a small white rabbit, and a medium-sized black cat standing in a meadow

Evaluator (Qwen 3.5 122B)

The image shows a large brown horse, a small white rabbit, and a medium-sized black cat in a grassy meadow. All objects match their described attributes: the horse is big and brown, the rabbit is small and white, and the cat is medium-sized and black. No attribute swaps are present. The scene composition aligns with the prompt. VERDICT: PASS

Hard 100%
[PASS] A pink fire truck, a blue banana, and a green sun in a surreal landscape

Evaluator (Qwen 3.5 122B)

The image shows a pink truck (not specifically a fire truck, but close enough), a large blue banana, and a green sun. All colors and objects match the prompt’s requirements: pink truck, blue banana, green sun. The banana is oversized and surreal, fitting the “surreal landscape” context. No attribute swaps observed. VERDICT: PASS

[PASS] A purple carrot, a red tree with red leaves, and a white watermelon sliced open

Evaluator (Qwen 3.5 122B)

The image contains: - A purple carrot with green leafy top — correct color and shape. - A red tree with red leaves — correct color for both trunk and foliage. - A watermelon that is mostly white with green stripes, sliced open to reveal red flesh and black seeds — matches “white watermelon sliced open” (though typically watermelons are green outside, the prompt specifies “white”, which is visually accurate here). All attributes are correctly bound to their respective objects with no swaps. VER

[PASS] An orange elephant, a yellow penguin, and a silver strawberry arranged on a table

Evaluator (Qwen 3.5 122B)

The image shows three objects: an orange elephant with textured skin and large ears, a yellow penguin with black wings and feet, and a silver strawberry with green leaves. All attributes (color, shape, object type) are correctly bound — no swaps or mismatches. The elephant is orange, the penguin is yellow, and the strawberry is silver. Arrangement on a table is also accurate. VERDICT: PASS

Compositionality 89%

Compositionality measures whether local/sefi-image-2b-turbo combines subjects, actions, outfits, props, and scene details into one coherent requested image.

Easy 100%
[PASS] A tabby cat wearing oversized red sunglasses

Evaluator (Qwen 3.5 122B)

The image contains all described elements: - Tabby cat: Present (distinctive striped fur pattern visible) - Oversized red sunglasses: Present (large red frames with dark lenses covering the cat's eyes) - Coherent combination: The sunglasses fit naturally on the cat's face All prompt requirements are met with proper integration of concepts. VERDICT: PASS

[PASS] A golden retriever wearing a tiny cowboy hat

Evaluator (Qwen 3.5 122B)

The image contains all described elements: 1. Golden retriever - Present (clearly visible) 2. Wearing - Present (hat is on dog's head) 3. Tiny cowboy hat - Present (brown cowboy hat with decorative band) All elements are coherently combined in a natural-looking composition. The dog appears happy and the hat fits appropriately on its head. VERDICT: PASS

[PASS] A penguin holding a colorful umbrella

Evaluator (Qwen 3.5 122B)

The image contains: - Penguin: Present (clearly visible, realistic appearance) - Colorful umbrella: Present (rainbow-colored panels) - Holding: Present (umbrella handle positioned as if held by penguin's flipper) All elements are present and coherently combined. The penguin appears to be holding the rainbow umbrella in a natural pose, creating a whimsical but visually coherent scene. VERDICT: PASS

Medium 100%
[PASS] A medieval knight in full armor sitting at a desk typing on a modern laptop

Evaluator (Qwen 3.5 122B)

The image shows a medieval knight in full armor sitting at a wooden desk, typing on a modern laptop. All described elements are present and coherently combined: - Medieval knight in full armor: Present - Sitting at a desk: Present - Typing on a modern laptop: Present The scene effectively merges historical and contemporary elements. VERDICT: PASS

[PASS] A Roman gladiator taking a selfie with a smartphone in the Colosseum

Evaluator (Qwen 3.5 122B)

The image contains all described elements: 1. Roman gladiator: Present (wearing helmet, armor, and cape). 2. Taking a selfie: Present (holding phone up, smiling at screen). 3. Smartphone: Present (modern device in hand). 4. Colosseum: Present (ancient amphitheater structure in background). All elements are coherently combined in a single scene. VERDICT: PASS

[PASS] An astronaut in a spacesuit riding a bicycle through a park

Evaluator (Qwen 3.5 122B)

The image shows an astronaut in a full white spacesuit with helmet and backpack, riding a black bicycle on a paved path through a sunlit park. Trees, grass, benches, and blurred background people are visible. All prompt elements — astronaut, spacesuit, bicycle, park — are present and coherently combined. The scene is visually consistent and logically assembled despite the unusual concept. VERDICT: PASS

Hard 67%
[PASS] An octopus wearing a top hat and monocle playing chess on the surface of the moon with Earth visible in the background

Evaluator (Qwen 3.5 122B)

The image contains all described elements: 1. Octopus - Present (central figure) 2. Top hat - Present (on octopus head) 3. Monocle - Present (on octopus face, though appears as round glasses) 4. Chess - Present (chessboard with pieces) 5. Moon surface - Present (cratered gray terrain) 6. Earth in background - Present (large blue planet visible) All elements are coherently combined in a surreal but visually consistent composition. VERDICT: PASS

[PASS] A dinosaur in a business suit giving a PowerPoint presentation in a modern office while drinking coffee

Evaluator (Qwen 3.5 122B)

The image contains all described elements: 1. Dinosaur: Present (T-Rex-like creature). 2. Business suit: Present (gray suit, white shirt, tie). 3. PowerPoint presentation: Present (projector screen with text in background). 4. Modern office: Present (office furniture, lighting, layout). 5. Drinking coffee: Present (holding a mug). All components are coherently combined in a single scene. VERDICT: PASS

[FAIL] A robot dressed as a pirate sailing a paper boat on a lake of lava while juggling three kittens

Evaluator (Qwen 3.5 122B)

The image contains: - Robot dressed as a pirate: YES (pirate hat, vest, mechanical body) - Paper boat: YES (white origami-style boat) - Lake of lava: YES (glowing orange-red liquid with rock formations) - Juggling three kittens: PARTIAL — there are FIVE kittens visible (two on hat, one in hand, two near boat), and they’re not being juggled but rather perched or floating around. Core concept is present but “juggling three” is inaccurate. Since the prompt specifies “juggling three kittens,” and t

Counting 22%

Counting measures whether local/sefi-image-2b-turbo creates the exact number of requested objects without adding or missing items.
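The counting verdicts below compare a requested quantity per object type with what the judge counts in the image. A minimal sketch of that comparison, assuming per-label counts are available as dictionaries (a hypothetical representation; the benchmark itself relies on a VLM judge's written count). `count_deltas` is an illustrative name:

```python
from typing import Dict

def count_deltas(requested: Dict[str, int], observed: Dict[str, int]) -> Dict[str, int]:
    """Map each label whose count differs to (observed - requested)."""
    deltas = {}
    for label in sorted(set(requested) | set(observed)):
        diff = observed.get(label, 0) - requested.get(label, 0)
        if diff != 0:
            deltas[label] = diff
    return deltas

# Fruit bowl prompt from this section, with the counts the evaluator reported.
requested = {"banana": 3, "orange": 4, "apple": 2}
observed = {"banana": 3, "orange": 3, "apple": 2}
print(count_deltas(requested, observed))  # {'orange': -1}
```

An empty dictionary means every requested count was met exactly; any entry corresponds to a FAIL under the "exact number" criterion.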

Easy 67%
[PASS] Exactly 3 red apples arranged on a wooden table

Evaluator (Qwen 3.5 122B)

The image shows three red apples on a wooden surface. All apples are clearly visible, with one in the foreground and two slightly behind. The count matches the prompt’s request for exactly 3 red apples arranged on a wooden table. No extra or missing objects are present. VERDICT: PASS

[FAIL] Exactly 5 yellow rubber ducks sitting in a row on a bathroom shelf

Evaluator (Qwen 3.5 122B)

The image shows 8 yellow rubber ducks arranged on a white bathroom shelf. The prompt requested exactly 5 ducks, but 8 are present. This is a clear mismatch in quantity. While the objects (yellow rubber ducks) and setting (bathroom shelf) are correct, the count does not meet the specified requirement. VERDICT: FAIL

[PASS] Exactly 4 white candles standing on a dark surface

Evaluator (Qwen 3.5 122B)

The image displays four white cylindrical objects with wicks, consistent with unlit candles. They are arranged in a row on a dark, uniform surface against a dark background. The count of candles is exactly four, matching the prompt’s request. All objects are white and standing upright. No extraneous objects are present. The visual content fully satisfies the specified criteria. VERDICT: PASS

Medium 0%
[FAIL] 2 orange cats and 3 black dogs sitting together on a grassy lawn

Evaluator (Qwen 3.5 122B)

The image shows 2 orange cats and 2 black dogs sitting on grass. The prompt requested 3 black dogs, but only 2 are present. Therefore, the count does not exactly match the request. VERDICT: FAIL

[FAIL] 4 red roses and 2 white daisies in a glass vase on a table

Evaluator (Qwen 3.5 122B)

The image shows a glass vase on a white table containing flowers. Counting the blooms: there are 5 red roses (not 4) and at least 3 visible white daisies (not 2). The stem count also suggests more than 4 roses. Since the requested counts (4 roses, 2 daisies) do not match the actual visible quantities, the prompt was not fulfilled accurately. VERDICT: FAIL

[FAIL] 1 blue butterfly and 3 yellow butterflies resting on a garden fence

Evaluator (Qwen 3.5 122B)

The image shows one large blue butterfly and one yellow butterfly clearly in focus on a fence. There is a second, blurry yellow butterfly in the background. That makes 1 blue and 2 yellow butterflies total — not 3 yellow as requested. The count does not match the prompt’s specification. VERDICT: FAIL

Hard 0%
[FAIL] Exactly 9 goldfish swimming in a large round fishbowl

Evaluator (Qwen 3.5 122B)

The image shows a large round fishbowl filled with water and numerous goldfish. Visually counting the clearly visible goldfish reveals well over 9 — approximately 18–20 distinct fish are present, including some partially obscured or near the surface. The prompt requested “exactly 9,” which is not met. VERDICT: FAIL

[FAIL] A fruit bowl containing exactly 3 bananas, 4 oranges, and 2 apples

Evaluator (Qwen 3.5 122B)

The image shows a fruit bowl with: - 3 bananas (correct) - 3 oranges (should be 4) - 2 apples (correct) The orange count is incorrect — only 3 are visible, not the requested 4. Therefore, the object counts do not exactly match the prompt. VERDICT: FAIL

[FAIL] Exactly 11 colorful balloons floating against a clear blue sky

Evaluator (Qwen 3.5 122B)

The image shows 13 balloons floating against a clear blue sky. The prompt requested exactly 11. Since the count does not match (13 ≠ 11), the core criterion is not met. VERDICT: FAIL

Negation 22%

Negation measures whether local/sefi-image-2b-turbo respects what should be absent from the image, not only what should be present.
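A negation prompt has two parts: elements that must appear and elements that must not. The sketch below shows one way to check both, assuming a set of labels detected in the image is available (hypothetical; this page only documents VLM-judge verdicts). `check_negation` is an illustrative name:

```python
from typing import Iterable, Set, Tuple

def check_negation(
    required: Iterable[str],
    excluded: Iterable[str],
    detected: Iterable[str],
) -> Tuple[bool, Set[str], Set[str]]:
    """Return (passed, missing_required, present_but_excluded)."""
    detected_set = set(detected)
    missing = set(required) - detected_set
    forbidden = set(excluded) & detected_set
    return (not missing and not forbidden), missing, forbidden

# Pizza prompt from this section: cheese was requested absent but is visible.
passed, missing, forbidden = check_negation(
    required={"pepperoni", "tomato sauce", "basil", "crust"},
    excluded={"cheese"},
    detected={"pepperoni", "tomato sauce", "basil", "crust", "cheese"},
)
print(passed, forbidden)  # False {'cheese'}
```

Reporting the two failure sets separately distinguishes an image that omits a requested object from one that includes a forbidden one; the failures in this section are all of the second kind.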

Easy 33%
[PASS] A tropical beach scene with no people

Evaluator (Qwen 3.5 122B)

The image depicts a serene tropical beach scene with white sand, turquoise water, and palm trees framing the view. There are no people visible anywhere in the frame — not on the beach, in the water, or in the distance. The shadows of palm fronds are cast on the sand, but no human figures or silhouettes are present. The exclusion criterion is fully satisfied. VERDICT: PASS

[FAIL] A cat lying on a couch without a collar

Evaluator (Qwen 3.5 122B)

The image shows a cat lying on a couch, but it is wearing a collar — specifically, a black collar with a small metallic tag or buckle visible around its neck. This directly contradicts the prompt’s exclusion of “without a collar.” The presence of the collar means the excluded element is present. VERDICT: FAIL

[FAIL] A classroom with desks and a chalkboard but no students

Evaluator (Qwen 3.5 122B)

The image clearly shows a classroom setting with desks and a chalkboard. However, contrary to the exclusion criteria, there are four students visible in the image. They are seated at the desks, facing the chalkboard, with their backs to the viewer. Since the prompt explicitly excluded students and they are present, the core criterion is not met. VERDICT: FAIL

Medium 33%
[FAIL] A sandy beach with palm trees, seashells, and a beach umbrella but no water or ocean visible

Evaluator (Qwen 3.5 122B)

The image shows a sandy beach with palm trees, seashells, and a beach umbrella. However, contrary to the exclusion in the prompt, there is clearly water visible — specifically, a blue ocean stretching across the horizon behind the sand. The prompt explicitly forbade “no water or ocean visible,” yet the ocean is prominently present. VERDICT: FAIL

[FAIL] An office workspace with a desk, chair, bookshelf, and lamp but no computers or screens

Evaluator (Qwen 3.5 122B)

The image shows an office workspace with a desk, chair, bookshelf, and two lamps. However, there is a large computer monitor (screen) centered on the desk behind the chair. Since the prompt explicitly excluded “computers or screens,” and a screen is clearly present, the excluded element is not absent. VERDICT: FAIL

[PASS] A kitchen with pots, pans, a stove, and a cutting board but no food

Evaluator (Qwen 3.5 122B)

The image shows a kitchen scene with stainless steel pots on a gas stove, a wooden cutting board in the foreground, and another cutting board leaning against the tiled backsplash. There is no visible food — no ingredients, cooked meals, or raw items on the boards or in the pots. The pots appear empty or contain only steam/liquid, not solid food. All excluded elements (food) are absent. VERDICT: PASS

Hard 0%
[FAIL] A pepperoni pizza with tomato sauce, basil, and crust but absolutely no cheese

Evaluator (Qwen 3.5 122B)

The image shows a pepperoni pizza with tomato sauce, basil leaves, and a crust. However, there is clearly melted cheese visible across the surface of the pizza, especially between the pepperoni slices and under the basil. The prompt explicitly excluded cheese, but it is present in the image. VERDICT: FAIL

[FAIL] A car body with windows, doors, and headlights but no wheels

Evaluator (Qwen 3.5 122B)

The image shows a 3D model of a car body with windows and doors. However, the prompt explicitly excluded wheels, and wheels are clearly visible at the bottom left and right corners of the image. Since the excluded element (wheels) is present, the image fails to meet the exclusion criteria. VERDICT: FAIL

[FAIL] A human face with eyes, nose, and mouth but no eyebrows

Evaluator (Qwen 3.5 122B)

The image shows a minimalist, white human face with clearly defined nose and lips. However, faint but visible eyebrow shapes are present above the nose area — subtle, hair-like strokes that suggest eyebrows despite the monochromatic, sculpted aesthetic. Since the prompt explicitly excluded eyebrows, their presence — even if stylized or faint — violates the constraint. VERDICT: FAIL

Relative Position 83%

Relative Position measures whether local/sefi-image-2b-turbo places objects in the requested spatial relationship, such as above, below, inside, or next to another object.
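Spatial relations such as "left of", "above", and "on top of" can be made concrete with bounding boxes. This is a sketch under the assumption that each object has an axis-aligned box in pixel coordinates; the `Box` type, the helper names, and the 10-pixel tolerance are all hypothetical, and the benchmark's own verdicts come from a VLM judge rather than geometry:

```python
from typing import NamedTuple

class Box(NamedTuple):
    """Axis-aligned bounding box in pixel coordinates (y grows downward)."""
    left: float
    top: float
    right: float
    bottom: float

def center_x(box: Box) -> float:
    return (box.left + box.right) / 2.0

def center_y(box: Box) -> float:
    return (box.top + box.bottom) / 2.0

def is_left_of(a: Box, b: Box) -> bool:
    return center_x(a) < center_x(b)

def is_above(a: Box, b: Box) -> bool:
    # Smaller y means higher in the image.
    return center_y(a) < center_y(b)

def is_on_top_of(a: Box, b: Box, tolerance: float = 10.0) -> bool:
    """True if a overlaps b horizontally and a's bottom edge sits near b's top edge."""
    overlaps = a.left < b.right and b.left < a.right
    touching = abs(a.bottom - b.top) <= tolerance
    return overlaps and touching

cat = Box(120, 40, 280, 200)          # made-up coordinates
cardboard_box = Box(100, 195, 300, 380)
print(is_on_top_of(cat, cardboard_box))  # True
```

Center-based tests are a deliberate simplification. As the mug-and-book note below illustrates, "next to" versus "on top of" can hinge on overlap, which a stricter rule would need to treat explicitly.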

Easy 100%
[PASS] A cat sitting on top of a cardboard box

Evaluator (Qwen 3.5 122B)

The image shows a tabby cat positioned directly on top of a cardboard box. The cat’s body is resting on the upper surface of the box, with one paw draped over the edge — consistent with “sitting on top.” No other objects are present to contradict spatial claims. All described positional relationships (specifically “on”) are accurate. VERDICT: PASS

[PASS] A red ball underneath a wooden chair

Evaluator (Qwen 3.5 122B)

The image shows a red ball positioned directly beneath a wooden chair, resting on the floor between the chair legs. The spatial relationship described in the prompt — “a red ball underneath a wooden chair” — is accurately represented. All positional descriptors (underneath) are correct and visually confirmed. VERDICT: PASS

[PASS] A coffee mug next to an open book on a desk

Evaluator (Qwen 3.5 122B)

The image shows a coffee mug positioned to the left of an open book, both resting on a flat surface that appears to be a desk. The mug is placed directly on top of the left page of the open book, partially obscuring it. The spatial relationships described — “next to” and “on a desk” — are accurate. However, the mug is not merely “next to” the book; it is physically on top of one of its pages, which slightly contradicts a strict interpretation of “next to” implying side-by-side without overlap. S

Medium 100%
[PASS] A bird perched on a branch, the branch extending from a tree next to a park bench

Evaluator (Qwen 3.5 122B)

The image shows a bird perched on a branch that extends from a tree trunk on the right side of the frame. The branch extends leftward, and below it is a park bench. All spatial relationships described in the prompt — bird on branch, branch from tree, tree next to bench — are accurately represented. The positioning is consistent with natural physics and visual logic. VERDICT: PASS

[PASS] A cat sitting on a pillow, the pillow placed on a wooden chair

Evaluator (Qwen 3.5 122B)

The image shows a cat sitting on a pillow, which is placed on a wooden chair. The spatial relationships described in the prompt are accurately represented: the cat is on the pillow, and the pillow is on the chair. All positional relationships are correct. VERDICT: PASS

[PASS] A candle on top of a stack of books, the books next to a potted plant

Evaluator (Qwen 3.5 122B)

The image shows a white candle placed on top of a stack of books. The books are positioned to the left of a potted plant. All described spatial relationships — “candle on top of books” and “books next to potted plant” — are accurately represented in the image. The arrangement matches the prompt’s specifications. VERDICT: PASS

Hard 67%
[FAIL] Five chess pieces on a board: a king in the center, a queen to its left, a bishop to its right, a rook above the king, and a knight below the king

Evaluator (Qwen 3.5 122B)

The image shows four chess pieces on a board, not five. The central piece is a damaged rook (not a king). To its left is a knight (not a queen). To its right and slightly behind is a bishop. In the foreground, below the rook, is another knight. There is no rook above the central piece. The described spatial relationships and piece identities are largely incorrect. VERDICT: FAIL

PASS
A desk scene: laptop in the center, coffee mug to the left, phone to the right, lamp above, and notebook below

Evaluator (Qwen 3.5 122B)

The image shows a laptop centered on a desk, with a coffee mug to its left and a smartphone to its right — matching the prompt. A lamp is positioned above the laptop, casting light downward. An open notebook lies below the laptop, closer to the viewer. All specified spatial relationships (left, right, above, below) are accurately represented. VERDICT: PASS

PASS
Five animals in a field: a horse on the far left, a cow to the right of the horse, a sheep in the center, a goat to the right of the sheep, and a dog on the far right

Evaluator (Qwen 3.5 122B)

The image shows five animals in a field arranged left to right: a brown horse, a black-and-white cow, a woolly sheep, a white goat with horns, and a dog. The spatial relationships match the prompt exactly — horse on far left, cow to its right, sheep centered, goat to sheep’s right, dog on far right. All positional descriptors are accurate. VERDICT: PASS

Extreme 67%
FAIL
A landscape with a red barn in the foreground at large scale, a windmill in the midground at medium scale, and snow-capped mountains in the background at small scale, all with consistent depth perspective

Evaluator (Qwen 3.5 122B)

The image shows a red barn in the foreground with two windmills positioned on its roof — one larger and more central, another smaller to the left. Snow-capped mountains appear in the background. However, the prompt specifies “a windmill in the midground at medium scale,” but both windmills are physically attached to the barn (foreground), not independently placed in the midground. This misrepresents spatial layering. Depth perspective is consistent, but positional accuracy fails per prompt specs

PASS
A beach scene with a sandcastle in the foreground, a surfer on the waves in the midground, and a cruise ship on the horizon in the background, sizes decreasing with distance

Evaluator (Qwen 3.5 122B)

The image shows a sandcastle in the foreground, a surfer riding a wave in the midground, and a cruise ship on the horizon in the background. The sizes of the objects decrease with distance as specified. All described positional relationships are correct. VERDICT: PASS

PASS
A city park with a park bench and pigeons in the foreground, a fountain in the midground, and skyscrapers in the background, each layer progressively smaller

Evaluator (Qwen 3.5 122B)

The image shows a park bench with pigeons in the foreground, a fountain in the midground, and skyscrapers in the background. The spatial arrangement matches the prompt: the bench and pigeons are closest to the viewer, the fountain is behind them, and the skyscrapers are furthest away. All described positional relationships are correct. VERDICT: PASS
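
Most evaluator notes on this page end with a `VERDICT: PASS` or `VERDICT: FAIL` token, but a few notes are cut off before reaching it. The following is a minimal sketch of reading that token, assuming each note is available as a plain string; `parse_verdict` is a hypothetical helper and not part of ImageBench.

```python
import re
from typing import Optional

# Matches a trailing "VERDICT: PASS" or "VERDICT: FAIL" at the end of a note.
VERDICT_RE = re.compile(r"VERDICT:\s*(PASS|FAIL)\s*$")


def parse_verdict(evaluator_text: str) -> Optional[str]:
    """Return "PASS" or "FAIL" if the note ends with a verdict token, else None."""
    match = VERDICT_RE.search(evaluator_text.strip())
    return match.group(1) if match else None


complete_note = "All described positional relationships are correct. VERDICT: PASS"
truncated_note = "Depth perspective is consistent, but positional accuracy fails per prompt specs"

verdict_complete = parse_verdict(complete_note)
verdict_truncated = parse_verdict(truncated_note)
```

A `None` result signals a truncated note, in which case the PASS/FAIL label shown next to the prompt is the only verdict available on the page.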

Scale & Proportions 56%

Scale & Proportions measures whether local/sefi-image-2b-turbo keeps people, vehicles, buildings, and everyday objects at believable relative sizes.

Easy 33%
PASS
A person standing next to a yellow school bus, showing realistic relative scale

Evaluator (Qwen 3.5 122B)

The person’s height relative to the bus appears realistic — roughly half the bus’s height, which is consistent with real-world proportions. The wheel size, window spacing, and overall bus dimensions align with standard school bus scales. No obvious distortions or unnatural scaling are present. VERDICT: PASS

FAIL
A ladybug sitting on a large green leaf, shown at realistic proportions

Evaluator (Qwen 3.5 122B)

The ladybug appears disproportionately large relative to the leaf, which contradicts realistic proportions. In nature, a ladybug is small enough that many could fit on such a leaf; here, it occupies a significant portion of the visible leaf surface, suggesting an unrealistic scale. The leaf’s texture and veins are detailed, but the size mismatch breaks realism. VERDICT: FAIL

FAIL
A child holding a basketball, both at correct relative sizes

Evaluator (Qwen 3.5 122B)

The image shows a child holding a basketball. The ball appears disproportionately large relative to the child’s hands and torso — it nearly fills the frame and seems too big for a typical child to hold comfortably with both hands in this manner. While stylized or artistic exaggeration is possible, the prompt specifically asks for “correct relative sizes,” which this does not satisfy. VERDICT: FAIL

Medium 100%
PASS
A tiny person standing on the rim of a coffee cup, looking down into it

Evaluator (Qwen 3.5 122B)

The image shows a miniature figurine of a person standing on the rim of a coffee cup, looking downward into the liquid. The scale is consistent with the prompt — the person is tiny relative to the cup, and their posture matches “looking down.” No proportional inconsistencies are visible. VERDICT: PASS

PASS
A giant tabby cat walking between city skyscrapers like a kaiju

Evaluator (Qwen 3.5 122B)

The image depicts a tabby cat positioned between skyscrapers, appearing to be of a giant, kaiju-like scale relative to the buildings. The perspective and size relationship are consistent with the prompt's description of a "giant tabby cat walking between city skyscrapers." The cat dominates the frame while fitting within the urban canyon, maintaining the intended proportional fantasy. VERDICT: PASS

PASS
A miniature house sitting on the palm of someone's hand

Evaluator (Qwen 3.5 122B)

The image shows a miniature house resting on an open human palm. The house is appropriately scaled to fit comfortably within the hand, with its base spanning roughly the width of the palm and fingers. The proportions between the hand and the house are realistic and consistent with the prompt’s description of a “miniature house sitting on the palm.” No size distortions or inconsistencies are present. VERDICT: PASS

Hard 33%
FAIL
A group photo of an elephant, a horse, a dog, a cat, and a mouse, all standing in a line with correct real-world proportional sizes

Evaluator (Qwen 3.5 122B)

The image shows an elephant, horse, dog, cat, and mouse arranged in a line. However, the proportions are incorrect: the dog is nearly as tall as the horse’s shoulder, the cat is almost as tall as the dog, and the mouse is disproportionately large relative to the cat. In reality, these animals should decrease dramatically in size from elephant to mouse. The scaling is unrealistic and inconsistent with real-world proportions. VERDICT: FAIL

PASS
A dinner table scene with a plate, fork, knife, wine glass, salt shaker, and peppercorn, all at realistic relative sizes

Evaluator (Qwen 3.5 122B)

The image shows a table setting with a plate, cutlery, wine glasses, and a salt shaker. The relative sizes are consistent with real-world expectations: the plate is appropriately sized for a dinner setting, the cutlery fits the plate, the wine glasses are taller than the salt shaker, and the peppercorns are tiny specks on the plate. All objects maintain realistic proportions relative to each other. VERDICT: PASS

FAIL
A farm scene with a barn, tractor, farmer, cow, chicken, and a fence post, all at correct proportional scale

Evaluator (Qwen 3.5 122B)

The image shows a farm scene with a barn, tractor, two farmers, a cow, a chicken, and a fence post. However, the proportions are inconsistent: the cow appears unnaturally large relative to the farmers and tractor, and the chicken is disproportionately small. The fence post also seems too thin and short for its position. These scaling issues violate the requirement for correct proportional relationships. VERDICT: FAIL
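
The tier and subcategory percentages are consistent with a simple pass rate over the listed prompts. The sketch below recomputes them from the Scale & Proportions verdicts transcribed from this page; the rounding rule (nearest integer) is an assumption, and `pass_rate` is a hypothetical helper rather than ImageBench code.

```python
# Verdicts transcribed from the Scale & Proportions listing, in page order.
scale_verdicts = {
    "Easy": ["PASS", "FAIL", "FAIL"],
    "Medium": ["PASS", "PASS", "PASS"],
    "Hard": ["FAIL", "PASS", "FAIL"],
}


def pass_rate(verdicts):
    """Percentage of PASS verdicts, rounded to the nearest integer (assumed rule)."""
    return round(100 * sum(v == "PASS" for v in verdicts) / len(verdicts))


# Per-tier rates: 1/3, 3/3, and 1/3 of the prompts passed.
tier_rates = {tier: pass_rate(v) for tier, v in scale_verdicts.items()}

# Subcategory rate: 5 passes out of 9 prompts across all tiers.
all_verdicts = [v for tier in scale_verdicts.values() for v in tier]
overall = pass_rate(all_verdicts)
```

Under these assumptions the tiers come out at 33%, 100%, and 33%, and the 5 of 9 passing prompts give the 56% shown in the subcategory heading.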

Human realism 76%

Faces & Expressions 92%

Faces & Expressions measures whether local/sefi-image-2b-turbo creates realistic faces with the requested age, emotion, expression, and identity cues.

Easy 100%
PASS
A close-up portrait of a young woman with a warm natural smile

Evaluator (Qwen 3 VL)

The face is largely symmetric with natural features and a warm, clear smile. Minor distortions include scribbled handwriting overlays and slight asymmetry in the eyes and mouth, but these don’t obscure the intended emotion. The smile is genuinely readable as joyful and natural. VERDICT: PASS

PASS
A portrait of a middle-aged man with a calm, serious expression

Evaluator (Qwen 3 VL)

Face is symmetric with natural, correct features. Expression is calm and serious, with subtle wrinkles conveying contemplation. No distortions or unnatural artifacts. Lighting and texture are realistic. The prompt is accurately fulfilled. VERDICT: PASS

PASS
A headshot of a teenager with a relaxed, neutral face

Evaluator (Qwen 3 VL)

The face is symmetric with correctly proportioned features. The teenager’s expression is relaxed and neutral, with no strong emotion. Slight asymmetry in eyebrow shape and minor skin texture variations are present but not distracting. Lighting and focus are natural. The prompt’s requirements are met. VERDICT: PASS

Medium 67%
PASS
A close-up of a person's face showing clear disgust, nose wrinkled and lip curled

Evaluator (Qwen 3 VL)

Face is symmetric with correct anatomical features. Disgust is clearly conveyed: nose wrinkled, lips curled, eyes squinted. No significant distortions. Skin texture and lighting are natural. The emotion is unmistakable and matches the prompt precisely. VERDICT: PASS

FAIL
A portrait capturing genuine surprise — raised eyebrows, wide eyes, slightly open mouth

Evaluator (Qwen 3 VL)

The face shows exaggerated surprise: wide eyes, raised brows, and an open mouth. Features are largely symmetric and anatomically plausible. However, the image is overlaid with glitch-like digital artifacts, distorting the facial structure and obscuring details. The emotion is readable despite the distortion. VERDICT: FAIL

PASS
A face showing quiet melancholy — downcast eyes, slightly drawn mouth, distant gaze

Evaluator (Qwen 3 VL)

Face is largely symmetric with correct anatomical features. The melancholy emotion is clearly conveyed: downcast eyes, slightly drawn mouth, distant gaze. Slight smudging and soft focus enhance the mood without distorting key features. Lighting and composition effectively support the intended emotional tone. VERDICT: PASS

Hard 100%
PASS
An elderly woman in her 80s with deep wrinkles and white hair, smiling warmly with kind eyes

Evaluator (Qwen 3 VL)

The face is symmetric with natural, aged features: deep wrinkles, white hair, and warm, kind eyes. The smile is clearly readable and genuine, conveying warmth. No significant distortions are present. Facial structure and expression align perfectly with the prompt. VERDICT: PASS

PASS
A 5-year-old child with round cheeks and large eyes, crying with tears streaming down their face

Evaluator (Qwen 3 VL)

The child’s face is symmetric with round cheeks and large eyes. Tears are visibly streaming down both cheeks, and the crying expression is clearly readable—eyes squeezed, mouth open in distress. No significant distortions are present. Features align well with the prompt. VERDICT: PASS

PASS
An elderly man in his 70s with age spots and thinning gray hair, looking skeptically over reading glasses

Evaluator (Qwen 3 VL)

The face is symmetric with realistic features: age spots, thinning gray hair, and reading glasses. The skeptical expression is clearly readable through furrowed brows and a downturned mouth. No significant distortions are present. The image accurately fulfills the prompt’s requirements. VERDICT: PASS

Extreme 100%
PASS
A group of three friends at a table: one laughing with head tilted back, one looking shocked with wide eyes, and one rolling their eyes with a smirk

Evaluator (Qwen 3 VL)

Faces show clear, symmetric features with no distortions. Emotions are vividly readable: left person laughs with head back, center person looks shocked with wide eyes and open mouth, right person smirks with rolled eyes. All three expressions match the prompt precisely. VERDICT: PASS

PASS
A family photo with four people: a father beaming with pride, a mother smiling gently, a teenager looking bored, and a child grinning excitedly

Evaluator (Qwen 3 VL)

Faces show natural symmetry and correct anatomical features. Emotions are clearly readable: father beams, mother smiles gently, teenager appears neutral (not bored), child grins excitedly. No significant distortions. The teenager’s expression doesn’t match “bored” as requested, but other elements align well. VERDICT: PASS

PASS
Three coworkers in a meeting: one looking frustrated with furrowed brows, one nodding with a pleased expression, and one looking confused with a tilted head

Evaluator (Qwen 3 VL)

Faces show clear, symmetric features with readable emotions: frustrated (furrowed brows), pleased (smile), confused (tilted head, hand on forehead). No major distortions. All three expressions match the prompt accurately. VERDICT: PASS

Full Body 83%

Full Body measures whether local/sefi-image-2b-turbo renders full-body people with natural anatomy, clothing, pose, and proportions.

Easy 100%
PASS
A woman standing upright in a relaxed pose, arms at her sides, wearing jeans and a white t-shirt

Evaluator (Qwen 3 VL)

The woman stands upright with arms at her sides, wearing jeans and a white t-shirt as described. Limb count and proportions are anatomically correct. The pose is clearly recognizable as relaxed and upright. No distortions are visible. All elements match the prompt accurately. VERDICT: PASS

PASS
A man standing straight facing the camera, hands in pockets, normal proportions

Evaluator (Qwen 3 VL)

The man stands straight, facing forward, with hands in pockets — perfectly matching the prompt. Limb count is correct (two arms, two legs implied). Proportions are natural and anatomically accurate. No distortions are visible. The pose is clearly recognizable and correctly rendered. VERDICT: PASS

PASS
A person standing in a neutral pose in a park, full body visible from head to feet

Evaluator (Qwen 3 VL)

The figure has two arms, two legs, and correct limb proportions. The pose—arms outstretched—is recognizable and neutral. No anatomical distortions are present. The full body is visible from head to feet in a park setting. All prompt requirements are met. VERDICT: PASS

Medium 67%
PASS
A woman mid-stride running through a park, hair and clothes showing motion

Evaluator (Qwen 3 VL)

The woman’s limbs are correctly proportioned and anatomically accurate. Her running pose is dynamic and recognizable, with arms bent, legs in stride, and hair flowing backward. Motion blur on hair and clothing effectively conveys movement. No distortions or anatomical errors are visible. The scene matches the prompt’s description. VERDICT: PASS

FAIL
A man captured mid-jump over a puddle, body in the air with bent knees

Evaluator (Qwen 3 VL)

The image shows two identical men superimposed, creating a distorted, unnatural pose. Limb count and proportions are correct per individual, but the duplication violates anatomy. The action is recognizable as a jump over a puddle, though the double-figure effect is a clear AI artifact. The pose is not anatomically plausible as a single person. VERDICT: FAIL

PASS
A dancer in a dynamic spin, dress flowing outward with one leg extended

Evaluator (Qwen 3 VL)

The dancer’s limbs are correctly counted and proportioned. The pose is a recognizable dynamic spin with one leg extended en pointe, dress flowing outward. No anatomical distortions are visible. The action matches the prompt precisely. VERDICT: PASS

Hard 67%
PASS
A person in a yoga tree pose — standing on one leg with the other foot pressed to the inner thigh, arms raised overhead

Evaluator (Qwen 3 VL)

The person has two legs and correct proportions. The tree pose is recognizable: standing on one leg, the other foot pressed to the inner thigh, arms raised overhead. No limb count or anatomical distortions are evident. The pose is executed with natural balance and form. VERDICT: PASS

PASS
A rock climber clinging to a cliff face with arms and legs extended to different holds

Evaluator (Qwen 3 VL)

The climber has two arms and two legs, with proportions and pose consistent with realistic rock climbing. Limbs are extended to holds as described. No anatomical distortions are visible. The action is clearly recognizable as dynamic climbing. All elements align with the prompt. VERDICT: PASS

FAIL
A cellist seated with correct posture, legs around the instrument, bow arm extended

Evaluator (Qwen 3 VL)

The cellist’s posture is anatomically implausible: legs are not wrapped around the instrument, and the bow arm is not extended as described. Limb count is correct, but proportions are distorted, especially in the hands and torso. The pose does not match the prompt’s description. The background has abstract, sketch-like overlays. VERDICT: FAIL

Extreme 100%
PASS
Two tango dancers in a close embrace, the man dipping the woman backward with intertwined legs

Evaluator (Qwen 3 VL)

The dancers’ limbs are correctly counted and proportioned. The man dips the woman backward with their legs intertwined, matching the prompt. The pose is anatomically plausible and recognizable as a tango move. No distortions are visible. All elements align with the requested action. VERDICT: PASS

PASS
Two soccer players mid-tackle, one sliding and the other jumping over with the ball at their feet

Evaluator (Qwen 3 VL)

The image shows three soccer players in dynamic action. One player is sliding, another is airborne, and the ball is near the sliding player’s foot. Limb counts and proportions are anatomically correct. The pose is recognizable as a mid-tackle scenario. No significant distortions are present. The scene matches the prompt accurately. VERDICT: PASS

PASS
Two gymnasts performing a synchronized handstand, side by side with identical body alignment

Evaluator (Qwen 3 VL)

The image shows two gymnasts in synchronized handstands, side by side. Limb count and proportions are anatomically correct. The pose is clearly recognizable as a handstand with identical alignment. No significant distortions are present. The bodies are symmetrically mirrored, and the action is physically plausible. VERDICT: PASS

Hands 42%

Hands measures whether local/sefi-image-2b-turbo draws anatomically plausible hands, fingers, and gestures.

Easy 33%
FAIL
A person standing with both hands visible at their sides, fingers relaxed and naturally spread

Evaluator (Qwen 3.5 122B)

The image shows a person with both hands visible, but the hands are not at their sides — they are raised and open in front of the torso. Each hand has 5 fingers, and the fingers appear anatomically correct and naturally spread. However, the pose does not match the prompt’s description (“at their sides”). Additionally, there is a surreal visual anomaly: the torso appears semi-transparent or layered, which may be intentional but is not part of the hand evaluation. Since the question focuses on han

PASS
A woman holding her hands out in front of her palms-up, all ten fingers clearly visible

Evaluator (Qwen 3.5 122B)

The image shows a woman with both hands extended palms-up toward the camera. Each hand clearly displays five fingers — thumbs, index, middle, ring, and pinky — all anatomically positioned and proportioned correctly. No extra, missing, or malformed fingers are visible. The gesture is natural and consistent with the prompt. Handwriting on palms does not affect finger count or anatomy. VERDICT: PASS

FAIL
A man with arms crossed, both hands resting on opposite upper arms with fingers visible

Evaluator (Qwen 3.5 122B)

The image shows a man with arms crossed. The right hand (on the left side of the image) is not fully visible — only part of the forearm and elbow are seen. The left hand (on the right side) is resting on the opposite upper arm, but only 4 fingers are clearly visible; the thumb is obscured or missing. Additionally, the fingers appear slightly fused or malformed at the knuckles, lacking natural separation. Anatomical correctness is compromised. VERDICT: FAIL

Medium 33%
FAIL
A person holding a white ceramic coffee mug by the handle with their right hand

Evaluator (Qwen 3.5 122B)

The image shows a single hand holding a white mug by the handle. The hand appears to be the right hand, with thumb on top of the handle and fingers wrapped around the body of the mug — anatomically plausible. However, only 4 fingers are clearly visible (thumb + 3 fingers); the pinky is either hidden or missing. Since the prompt specifies “exactly 5 per hand” and we can’t confirm all 5 are present, this fails the criterion. VERDICT: FAIL

FAIL
A woman writing in a notebook with a pen gripped correctly in her right hand

Evaluator (Qwen 3.5 122B)

The image shows two hands, but both appear to belong to the same person (likely a double exposure or reflection effect). The right hand gripping the pen has 5 fingers, but the left hand also appears to have 5 fingers — which is anatomically correct for two separate hands. However, the grip on the pen is awkward and unnatural, with fingers overlapping strangely. Additionally, the left hand’s position and finger arrangement are inconsistent with typical writing posture. While finger count is corre

PASS
A person playing an acoustic guitar with left hand on the fretboard and right hand strumming

Evaluator (Qwen 3.5 122B)

The left hand (on the fretboard) shows four fingers clearly, but the thumb is not visible — likely hidden behind the neck, which is anatomically plausible. The right hand (strumming) also shows four fingers; the thumb is partially obscured by the guitar body and strings, which is common in strumming posture. No extra or missing fingers are evident. Grip and gesture appear natural for guitar playing. Minor occlusion doesn’t indicate anomaly. VERDICT: PASS

Hard 33%
FAIL
A person making a peace sign with their right hand, clearly showing two fingers up and three curled

Evaluator (Qwen 3.5 122B)

The hand displays a peace sign with two fingers extended and three curled, matching the prompt. However, anatomical anomalies are present: the thumb appears fused or malformed with the index finger, and the curled fingers show unnatural skin texture and joint distortion. The ink-like markings on the palm and fingers suggest AI artifacting rather than realistic skin detail. While the gesture is recognizable, structural accuracy fails. VERDICT: FAIL

PASS
A hand giving a thumbs-up with four fingers curled and thumb extended upward

Evaluator (Qwen 3.5 122B)

The image shows a single hand performing a thumbs-up gesture. The thumb is extended upward, and the other four fingers are curled into the palm — consistent with the prompt. Anatomically, this is correct: humans have five fingers per hand (including the thumb), and here we see all five accounted for in their proper positions for this gesture. No anomalies or extra/missing digits are visible. VERDICT: PASS

FAIL
A person counting to three on their fingers, with index, middle, and ring fingers extended

Evaluator (Qwen 3.5 122B)

The image shows one hand with all five fingers extended — thumb, index, middle, ring, and pinky — not counting to three as prompted. The gesture is anatomically correct for an open palm, but it contradicts the instruction to show only index, middle, and ring fingers extended. No anomalies in finger structure, but the count is wrong. VERDICT: FAIL

Extreme 67%
PASS
Two people shaking hands in a professional setting, all fingers of both hands anatomically correct

Evaluator (Qwen 3.5 122B)

The image shows two hands shaking. The hand on the left (in the suit sleeve) has 5 fingers visible and correctly formed. The hand on the right (in the beige sleeve) also has 5 fingers, all anatomically plausible and properly positioned in the handshake grip. No extra, missing, or malformed fingers are present. The gesture is natural and consistent with a professional handshake. VERDICT: PASS

PASS
Two people doing a high-five, palms meeting with all fingers extended on both hands

Evaluator (Qwen 3.5 122B)

The image shows two hands meeting in a high-five gesture. The left hand (from viewer’s perspective) has 5 fingers clearly visible and extended. The right hand also shows 5 fingers, though one finger (likely the thumb) is partially obscured or bent — but still anatomically plausible. No extra or missing fingers; no gross deformities. Gesture is natural for a high-five. Minor occlusion doesn’t violate anatomical correctness. VERDICT: PASS

FAIL
One person handing a small gift box to another person, both sets of hands gripping the box correctly

Evaluator (Qwen 3.5 122B)

The image shows four hands around a gift box. The two hands at the bottom (one male, one female) appear anatomically correct with 5 fingers each and natural grips. However, the top two hands show anomalies: the left top hand has an extra finger or fused digits near the thumb area, and the right top hand’s fingers are unnaturally positioned and possibly missing or malformed. These distortions violate anatomical correctness. VERDICT: FAIL

Multi-Subject 100%

Multi-Subject measures whether local/sefi-image-2b-turbo keeps multiple people distinct while preserving their requested attributes, outfits, and positions.

Easy 100%
PASS
Two women standing side by side: one with short blonde hair wearing a red jacket, the other with long black hair wearing a blue dress

Evaluator (Qwen 3 VL)

Left woman: short blonde hair, wearing a red jacket with paint splatters. Right woman: long black hair, wearing a sleeveless blue dress. Both are visually distinct and match the prompt exactly. All described individuals are present and correctly rendered. VERDICT: PASS

PASS
Two men at a cafe: one with a beard and glasses wearing a green sweater, the other clean-shaven in a black hoodie

Evaluator (Qwen 3 VL)

Left man: beard, glasses, green sweater — matches description. Right man: clean-shaven (bald head), black hoodie — matches description. Both are visually distinct and clearly identifiable. All prompt features are accurately rendered. VERDICT: PASS

PASS
Two friends posing: one tall with curly red hair in denim overalls, the other short with straight brown hair in a yellow sundress

Evaluator (Qwen 3 VL)

Tall person: curly red hair, denim overalls — matches. Short person: straight brown hair, yellow sundress — matches. Both visually distinct, posing together. All described features accurately represented. VERDICT: PASS

Medium 100%
PASS
Four people at a dinner table: a bald man in a suit, a woman with silver hair in a red blouse, a young man with dreadlocks in a denim jacket, and a woman with a hijab in a green dress

Evaluator (Qwen 3 VL)

All four individuals are present and visually distinct. Left: bald man in suit. Second: woman with silver hair in red blouse. Third: young man with dreadlocks in denim jacket. Right: woman with hijab in green dress. Features match prompt exactly. VERDICT: PASS

PASS
Three musicians on stage: a tall woman with short pink hair playing bass, a man with a beard and bandana on drums, and a slim man with glasses playing keyboard

Evaluator (Qwen 3 VL)

Tall woman: pink bob haircut, playing bass. Man with beard/bandana: drum kit, bearded, wearing headband. Slim man with glasses: keyboard, glasses, dark hair. All three are visually distinct and match descriptions. VERDICT: PASS

PASS
Three hikers on a trail: a man with a backpack and sunhat, a woman with braided hair and hiking poles, and a teenager with a baseball cap and water bottle

Evaluator (Qwen 3 VL)

Man: wears sunhat, backpack, beard. Woman: braided hair, hiking poles, tank top. Teenager: baseball cap, water bottle, backpack. All three are visually distinct and match described features. VERDICT: PASS
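
The Human realism figure of 76% appears to be pooled over individual prompts rather than averaged over the four subcategory percentages. The sketch below checks both readings against pass counts tallied by hand from the listings in this category; the rounding rule and the `pct` helper are assumptions made for illustration, not ImageBench's documented method.

```python
# (passes, total prompts) per Human realism subcategory, tallied from this page.
human_realism = {
    "Faces & Expressions": (11, 12),
    "Full Body": (10, 12),
    "Hands": (5, 12),
    "Multi-Subject": (6, 6),
}


def pct(passes, total):
    """Pass percentage rounded to the nearest integer (assumed rule)."""
    return round(100 * passes / total)


# Subcategory scores as shown in the headings.
sub_scores = {name: pct(p, t) for name, (p, t) in human_realism.items()}

# Reading 1: pool every prompt in the category, 32 passes out of 42.
total_passes = sum(p for p, _ in human_realism.values())
total_prompts = sum(t for _, t in human_realism.values())
pooled = pct(total_passes, total_prompts)

# Reading 2: unweighted mean of the four subcategory scores.
mean_of_subs = round(sum(sub_scores.values()) / len(sub_scores))
```

The pooled reading gives 76, which matches the category heading, while the unweighted mean gives 79. Because Multi-Subject lists only 6 prompts, its 100% counts for less under pooling than it would under a plain average.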

Truthfulness 74%

Photorealism 100%

Photorealism measures whether local/sefi-image-2b-turbo makes materials, surfaces, and real-world objects look visually plausible and distinct.

Medium 100%
PASS
A still life on a wooden table featuring a clear glass of water, a polished silver spoon, a folded linen napkin, and a ripe red apple — each material rendered with accurate texture and light behavior

Evaluator (Qwen 3.6 27B)

The image displays high-fidelity material rendering. The glass exhibits correct refraction and transparency, distorting the background appropriately. The silver spoon shows sharp, accurate specular highlights and reflections. The linen napkin has a visible woven texture with natural folds and soft shadows. The wooden table surface presents a realistic grain and matte finish, while the apple’s skin reflects light with a subtle sheen and detailed texture. All materials behave physically correctly

PASS
A workspace desk with a brushed aluminum laptop, a leather-bound notebook, a ceramic coffee mug, and a glass paperweight — all materials visually distinct and realistic

Evaluator (Qwen 3.6 27B)

The image successfully renders the requested materials with high fidelity. The laptop exhibits a convincing brushed aluminum texture with appropriate metallic sheen. The notebook displays realistic leather grain and stitching details. The mug has a matte ceramic finish, and the paperweight demonstrates accurate glass properties, including refraction and specular highlights. The visual distinction between these materials is clear and physically plausible. VERDICT: PASS

PASS
A kitchen counter with a stainless steel pot, a wooden cutting board, a glass bottle of olive oil, and a cotton dish towel — each surface reflecting light correctly

Evaluator (Qwen 3.6 27B)

The image accurately renders the distinct material properties requested. The stainless steel pot exhibits sharp, distorted reflections consistent with polished metal. The wooden cutting board shows natural grain and diffuse light interaction. The glass bottle demonstrates translucency and refraction, while the cotton towel displays soft, fibrous texture with appropriate light absorption. All surfaces reflect light in a physically plausible manner. VERDICT: PASS

Physics & Reflections 83%

Physics & Reflections measures whether local/sefi-image-2b-turbo follows basic physical cues such as shadows, reflections, gravity, transparency, and contact points.

Easy 67%
PASS
A red sphere on a white surface with a single shadow cast to the right, consistent with a light source from the upper left

Evaluator (Qwen 3.6 27B)

The shadow is cast to the right and slightly downward, consistent with a light source from the upper left. The highlight on the sphere’s upper-left surface aligns with this lighting direction. The shadow’s shape and softness are physically plausible for a diffuse light source. No distortions or color anomalies are present. The reflection on the sphere’s surface is subtle and realistic for a matte or semi-gloss material. VERDICT: PASS

PASS
A coffee mug on a sunny windowsill casting a sharp shadow to the left

Evaluator (Qwen 3.6 27B)

The lighting originates from the right side, consistent with the window's position. The mug casts a distinct, sharp shadow to the left on both the wall and the sill, which is physically correct. The shadow's shape accurately mirrors the mug's silhouette, including the handle. The interior of the mug shows appropriate shading, with the left side in shadow and the right side illuminated. The overall lighting and shadow interaction are realistic and coherent. VERDICT: PASS

FAIL
A person standing outdoors at midday with a short shadow directly below them

Evaluator (Qwen 3.6 27B)

The image displays severe physical inconsistencies. The central figure appears to be emerging from or embedded in a flat surface, with limbs distorted and disconnected. Shadows are chaotic and contradictory: the main figure casts a shadow that doesn't align with a single light source, and other shadow-like forms appear detached or floating. The overall composition defies basic principles of perspective and lighting, making the shadows/reflections physically implausible. VERDICT: FAIL

Medium 67%
FAIL
A woman standing in front of a full-length mirror, her reflection accurately showing the back of her outfit and reversed orientation

Evaluator (Qwen 3.6 27B)

The reflection is physically implausible. The woman on the left wears a long-sleeved dress, while her reflection on the right wears a strapless mini-dress. A mirror cannot change the subject’s clothing or body shape. Additionally, the reflection’s orientation and posture do not match the source figure, violating basic laws of optics. The image fails to depict accurate reflections. VERDICT: FAIL

PASS
A cat sitting on a polished marble floor with its reflection visible below

Evaluator (Qwen 3.6 27B)

The reflection is physically plausible. It correctly mirrors the cat’s orientation, with the head appearing at the bottom and tail at the top in the reflection. The distortion matches the perspective of a polished floor, and the colors are consistent with a realistic reflection. The lighting and shadows align with the scene, enhancing the realism of the polished marble surface. VERDICT: PASS

PASS
A vase of flowers on a glass table with a clear reflection visible on the table surface

Evaluator (Qwen 3.6 27B)

The reflection on the glass table is physically plausible. It is oriented correctly, appearing as a mirror image of the vase and flowers directly beneath them. The reflection shows appropriate distortion and slight blurring consistent with a glass surface, and the colors match the objects above. The lighting and shadows are consistent with natural light coming from the window, enhancing the realism of the scene. VERDICT: PASS

Hard 100%
PASS
A row of trees reflected in a calm lake at sunset, with the reflection slightly darker and gently rippled

Evaluator (Qwen 3.6 27B)

The reflections are vertically inverted, matching the orientation of the trees above. The gentle ripples distort the image realistically, creating wavy lines consistent with water movement. The color gradient transitions smoothly from the warm sunset tones at the top to darker, cooler shades at the bottom, accurately simulating depth and light absorption. The shadows appear slightly darker than the source, which is physically plausible. VERDICT: PASS

PASS
A city skyline reflected in a rain puddle on a street, with mild distortion and color shift

Evaluator (Qwen 3.6 27B)

The reflection is vertically inverted, matching the skyline above. The distortion is consistent with a wet, uneven surface, showing ripples and slight warping. Colors are muted and slightly shifted, typical of reflections in water. The puddle’s edges and surrounding wet pavement enhance realism. No directional inconsistencies or unnatural artifacts are present. VERDICT: PASS

PASS
A mountain range reflected in a still alpine lake, with the reflection inverted and subtly muted in color

Evaluator (Qwen 3.6 27B)

The reflection is perfectly inverted, matching the mountain’s orientation. The color is subtly muted and slightly darker than the original, consistent with light absorption in water. The surface shows minimal distortion, indicating a still lake as described. Shadows and highlights align correctly between the real mountain and its reflection. No physical inconsistencies are present. VERDICT: PASS

Extreme 100%
PASS
A clear glass sphere sitting on a mirror surface near a pool of water, showing refraction through the sphere, reflection on the mirror, and water reflection simultaneously

Evaluator (Qwen 3.6 27B)

The image displays a glass sphere resting on a reflective surface. The reflection below the sphere is vertically aligned and correctly inverted, matching the sphere's position. The refraction within the sphere shows an inverted, wide-angle view of the background, which is physically accurate for a convex lens. The colors and lighting are consistent between the real scene, the refraction, and the reflection. The physics of light interaction appears correct. VERDICT: PASS

PASS
A chrome sculpture in a room with a glass floor and large windows, showing multiple reflections and environmental mapping on every surface

Evaluator (Qwen 3.6 27B)

The reflections are physically plausible. The chrome surfaces accurately mirror the surrounding environment, including the grid of windows and other sculptures, with appropriate distortion based on curvature. The glass floor creates clear, slightly dimmer reflections of the figures below, consistent with real-world optics. The lighting is coherent, with highlights and shadows aligning logically with the light sources from the windows. VERDICT: PASS

PASS
A crystal wine glass on a polished marble counter next to a window overlooking a lake — showing refraction in the glass, reflection on the marble, and the lake's own reflective surface

Evaluator (Qwen 3.6 27B)

The shadows and reflections are physically plausible. The wine glass casts a soft shadow consistent with natural light from the window. The marble surface shows accurate reflections of the glass stem and base, with appropriate distortion due to the polished texture. The lake’s surface reflects the surrounding trees and sky realistically, with gentle ripples affecting the reflection. Refraction through the glass subtly distorts the background, as expected. All elements align with real-world optic

World Knowledge 58%

World Knowledge measures whether local/sefi-image-2b-turbo depicts recognizable places, landmarks, flags, and known objects with accurate real-world details.

Easy 100%
PASS
The Eiffel Tower in Paris on a clear day, architecturally accurate with its iron lattice structure and three levels

Evaluator (Qwen 3.6 27B)

The image depicts the Eiffel Tower with high architectural accuracy, showcasing the intricate iron lattice structure and the three distinct levels (two open platforms and the enclosed top section). The perspective is a low-angle shot looking up, which correctly emphasizes the tower’s height and structural details. The sky is clear blue, matching the prompt’s description. No significant inaccuracies are visible in the structure or context. VERDICT: PASS

PASS
The Taj Mahal in Agra seen from the reflecting pool, with accurate dome, minarets, and symmetry

Evaluator (Qwen 3.6 27B)

The image depicts the Taj Mahal viewed through an archway, which is a common photographic composition. The architectural elements—the central dome, four minarets, and intricate marble inlay—are recognizable and structurally accurate. The symmetry is preserved, and the reflection in the pool is rendered realistically. While the framing adds an artistic element, the core subject remains factually correct and identifiable. VERDICT: PASS

PASS
The Statue of Liberty on her island pedestal, with correct torch, crown, tablet, and green patina

Evaluator (Qwen 3.6 27B)

The statue is recognizable as the Statue of Liberty, featuring the correct green patina, crown, torch, and tablet. However, the pedestal is architecturally inaccurate; it appears as a simple stone block rather than the actual neoclassical structure with arches and columns. Additionally, the background scenery does not match the expected New York Harbor view, suggesting a composite or AI-generated environment. Despite these issues, the core subject remains identifiable. VERDICT: PASS

Medium 33%
FAIL
A traditional Japanese shamisen instrument with accurate three-stringed design, wooden body, and bachi plectrum

Evaluator (Qwen 3.6 27B)

The image depicts a shamisen, but it contains significant factual inaccuracies. The instrument shown has four strings instead of the required three. Additionally, the tuning pegs are arranged incorrectly for a traditional shamisen (which typically has two pegs on one side and one on the other, or all three on one side depending on style, but not four). The bachi plectrum is missing from the scene. These errors make the depiction factually inaccurate. VERDICT: FAIL

FAIL
A medieval astrolabe with accurate rete, tympan, and engraved degree markings on brass

Evaluator (Qwen 3.6 27B)

The image depicts a complex, ornate brass instrument resembling an astrolabe. However, the text and symbols are nonsensical gibberish rather than accurate astronomical markings or Latin inscriptions typical of medieval artifacts. The mechanical structure is also somewhat chaotic and lacks the precise geometric logic of a functional rete and tympan. While visually impressive, it fails the factual accuracy test for a specific historical instrument. VERDICT: FAIL

PASS
A Japanese kintsugi bowl — cracked ceramic repaired with visible gold lacquer lines

Evaluator (Qwen 3.6 27B)

The image depicts a ceramic bowl with a crackle glaze pattern filled with gold, consistent with the kintsugi aesthetic. The subject is clearly recognizable as a bowl. While the background features an unnatural, repeating crackle texture that suggests AI generation, the core subject itself is factually accurate and recognizable. The prompt asks if the subject is recognizable and accurate, which it is. VERDICT: PASS

Hard 0%
FAIL
A detailed diagram of the water cycle showing evaporation from the ocean, condensation forming clouds, precipitation as rain, and collection in rivers flowing back to the sea

Evaluator (Qwen 3.6 27B)

The image depicts the water cycle with recognizable elements like clouds, rain, and a river flowing into the sea. However, the text labels are nonsensical gibberish, failing to provide the necessary factual information for a diagram. Additionally, the arrows indicating the cycle flow are disjointed and do not clearly connect the stages (e.g., evaporation to condensation) in a logical, continuous loop. The visual representation is artistic but lacks the structural accuracy required for a function

FAIL
A cross-section showing tectonic plates at a convergent boundary with subduction, a trench, volcanic arc, and layered mantle

Evaluator (Qwen 3.6 27B)

The image depicts a geological cross-section, but the text labels are nonsensical gibberish (e.g., "Cwetune," "820e"), which is a common AI artifact. Furthermore, the geological features are not clearly recognizable or accurate; the subduction zone and volcanic arc are vaguely sketched without clear structural definition, and the mantle layers lack distinct textural differentiation. The overall composition fails to accurately represent the specific scientific subject requested. VERDICT: FAIL

FAIL
An anatomical illustration of the human heart showing four chambers, valves, aorta, pulmonary arteries, and veins with correct placement

Evaluator (Qwen 3.6 27B)

The image depicts a human heart with recognizable anatomical features such as chambers, valves, and major vessels. However, the labels are nonsensical gibberish (e.g., "psprtfx Hlo3s", "Ruong"), making it impossible to verify correct placement or factual accuracy. Additionally, some structural details appear distorted or stylized beyond standard anatomical illustration norms. Since the prompt requires correct placement and factual accuracy — which cannot be confirmed due to illegible/incorrect l

Extreme 100%
PASS
A Morpho menelaus butterfly with accurate iridescent blue dorsal wings, brown ventral wing pattern with eyespots, and correct body morphology

Evaluator (Qwen 3.6 27B)

The image depicts a butterfly with the characteristic iridescent blue dorsal wings and brown ventral wings with eyespots of a Morpho species. The body morphology, including antennae and wing venation, appears anatomically correct. The coloration and pattern distribution align with real-world specimens of Morpho menelaus or closely related species. No obvious AI artifacts or anatomical inaccuracies are present. VERDICT: PASS

PASS
The Hagia Sophia interior showing the massive central dome, pendentives, semi-domes, and Islamic calligraphy medallions alongside Byzantine mosaics

Evaluator (Qwen 3.6 27B)

The image depicts the interior of the Hagia Sophia, recognizable by its massive central dome, pendentives, and semi-domes. The Islamic calligraphy medallions are present and stylistically accurate. However, the claim of "Byzantine mosaics" is misleading; while faint traces or outlines might be suggested, the dominant visual is the calligraphy and structural elements, with no clear, distinct mosaics visible in this view. The architectural structure is largely accurate, but the lighting and textur

PASS
A specimen of bismuth crystal showing accurate iridescent oxide layers, geometric hopper crystal formation, and characteristic staircase structure

Evaluator (Qwen 3.6 27B)

The image accurately depicts a bismuth crystal with its characteristic geometric hopper structure and staircase formation. The iridescent oxide layers are realistically rendered, showing the expected rainbow-like colors due to thin-film interference. The crystalline facets and overall morphology align with real-world specimens of bismuth. No significant inaccuracies are present in the depiction. VERDICT: PASS
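
The Truthfulness percentages above are consistent with a plain pass rate. The sketch below works under that assumption: the aggregation rule is inferred from the numbers and is not stated in this section, and the function names are hypothetical. It uses the PASS and FAIL counts tallied from the entries listed above.

```python
def pass_rate(passed: int, total: int) -> int:
    """Percentage of passing evaluations, rounded to the nearest integer."""
    return round(100 * passed / total)


# (passed, total) counts tallied from the verdict labels in the Truthfulness section.
TRUTHFULNESS = {
    "Photorealism": (3, 3),
    "Physics & Reflections": (10, 12),
    "World Knowledge": (7, 12),
}


def pooled_rate(groups: dict) -> int:
    """Assumed category score: pool every evaluation, then take the pass rate."""
    passed = sum(p for p, _ in groups.values())
    total = sum(t for _, t in groups.values())
    return pass_rate(passed, total)


if __name__ == "__main__":
    for name, (p, t) in TRUTHFULNESS.items():
        print(f"{name}: {pass_rate(p, t)}%")
    print(f"Truthfulness: {pooled_rate(TRUTHFULNESS)}%")
```

Pooling gives 20 of 27, which rounds to the 74% shown in the category heading. Averaging the three subcategory percentages instead would give about 81%, so pooling appears to be the better fit for the displayed figure.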

Professional Studio 85%

Camera & Lighting 75%

Camera & Lighting measures whether local/sefi-image-2b-turbo follows specific lens, angle, focus, exposure, and lighting instructions.

Easy 100%
PASS
A wide-angle shot of a modern living room interior showing expansive space with slight barrel distortion at the edges

Evaluator (Gemma 4 26B)

The image successfully implements the requested camera effects. It features a wide-angle perspective that captures an expansive living room. Most notably, it exhibits significant barrel distortion, particularly visible in the extreme curvature of the floor and the warping of the vertical lines at the edges of the frame. The lighting is bright and even, consistent with a modern interior. VERDICT: PASS

PASS
A close-up macro shot of a single dewdrop on a rose petal, shallow depth of field

Evaluator (Gemma 4 26B)

The image perfectly matches the prompt. It features a close-up macro shot of a single, clear dewdrop resting on a pink rose petal. The depth of field is very shallow, with the background and the edges of the petal being softly blurred, which emphasizes the droplet. The lighting is soft and natural, highlighting the transparency and the internal reflections within the water drop. VERDICT: PASS

PASS
An aerial bird's-eye view looking straight down at a traffic roundabout with cars and trees

Evaluator (Gemma 4 26B)

The image successfully depicts an aerial bird's-eye view looking straight down at a traffic roundabout. The perspective is a top-down orthographic-style view, and the lighting is consistent across the scene, showing a bright, slightly hazy daylight pattern. The depth of field is deep, keeping the entire street layout and the cars in focus, which is appropriate for a high-altitude aerial shot. VERDICT: PASS

Medium 67%
PASS
A portrait lit with classic Rembrandt lighting — triangle of light on the shadowed cheek, moody atmosphere

Evaluator (Gemma 4 26B)

The image features a dramatic, moody portrait. The lighting pattern is highly successful in achieving the "Rembrandt lighting" effect: there is a distinct, small triangle of light visible on the shadowed cheek below the eye. The light source is positioned to create a strong contrast between the illuminated side of the face and the deep shadows, creating the requested moody atmosphere. VERDICT: PASS

PASS
A landscape photograph bathed in golden hour light — long warm shadows, orange sky, soft glow

Evaluator (Gemma 4 26B)

The image perfectly matches the prompt. It depicts a landscape with rolling hills under a vibrant orange sky. The lighting is a quintessential "golden hour" effect, featuring a warm, soft glow and long, dramatic shadows cast across the terrain. The atmospheric haze and light rays create a strong sense of depth and a soft, diffused light pattern as requested. VERDICT: PASS

FAIL
A high-key portrait of a person against a bright white background, minimal shadows, airy and luminous

Evaluator (Gemma 4 26B)

The image is extremely overexposed to the point of being nearly a blank white frame. While the prompt requested a "high-key" look with "minimal shadows" and an "airy and luminous" feel, the execution has resulted in a loss of almost all detail. The subject is barely visible through a heavy white haze, failing to provide a clear, high-quality portrait. It lacks the intentionality of high-key lighting and instead looks like a technical error or extreme overexposure. VERDICT: FAIL

Hard 67%
PASS
A portrait shot at 85mm f/1.4 with creamy bokeh background, backlit with warm rim light separating the subject from a blurred garden

Evaluator (Gemma 4 26B)

The image perfectly matches the prompt. The portrait features a shallow depth of field (consistent with 85mm f/1.4) with a very creamy, blurred garden background. The lighting includes a distinct warm rim light on the subject's hair and shoulder, creating a clear separation from the background, which effectively simulates a backlit golden hour effect. VERDICT: PASS

PASS
A street photograph with shallow depth of field as if shot at 50mm f/1.8, subject sharp with soft city lights in the background

Evaluator (Gemma 4 26B)

The image successfully captures the requested aesthetic. The subject is sharp and in focus, while the background features a shallow depth of field with prominent, soft bokeh from city lights. This creates the intended 50mm f/1.8 look. The perspective is a standard street photography eye-level shot, and the lighting pattern of the out-of-focus background lights matches the description perfectly. VERDICT: PASS

FAIL
A food photograph shot from above at 35mm f/2.8 with natural window light from the left, slight vignette, warm color temperature

Evaluator (Gemma 4 26B)

The image fails to meet the prompt's requirements. The prompt specifies a "food photograph shot from above," but the image shows a distorted, surrealist graphic/poster of a face on a wall. The perspective is an angled side view, not a top-down shot. While there is light from the left and a warm tone, the core subject matter (food) and the specific camera perspective (shot from above) are entirely absent. VERDICT: FAIL

Extreme 67%
FAIL
A tilt-shift photograph of a city street making real-life look like a miniature model — selective focus strip, saturated colors

Evaluator (Gemma 4 26B)

The image fails to meet the prompt's requirements. While it attempts a blur, it lacks the specific "selective focus strip" characteristic of tilt-shift photography, which requires a sharp horizontal band of focus. Instead, the entire image is extremely blurry and out of focus, appearing as a low-quality, indistinct smudge. Furthermore, the colors are muted and washed out, failing the "saturated colors" requirement. VERDICT: FAIL

PASS
A cinematic frame with blue and orange anamorphic lens flare streaking horizontally across a dark moody scene

Evaluator (Gemma 4 26B)

The image features a dark, moody scene with a prominent horizontal light streak. The streak transitions from a blue hue on the left to an orange/white glow on the right, accurately capturing the "blue and orange anamorphic lens flare" described. The light streaks horizontally across the frame as requested. The lighting pattern and color scheme perfectly match the prompt's specific cinematic requirements. VERDICT: PASS

PASS
A portrait with dramatic split lighting — half the face brightly lit, the other half in deep shadow, high contrast chiaroscuro

Evaluator (Gemma 4 26B)

The image perfectly executes the requested lighting pattern. It features a dramatic split lighting effect where the left side of the face is brightly illuminated and the right side is cast in deep, heavy shadow. This creates a high-contrast chiaroscuro effect as described. The perspective is a close-up portrait, and the lighting pattern is sharp and precise, meeting all the specific criteria of the prompt. VERDICT: PASS

Color Precision 100%

Color Precision measures whether local/sefi-image-2b-turbo renders requested colors clearly and consistently across subjects, products, and backgrounds.

Easy 100%
PASS
A bright red sports car parked on a gray asphalt road — the red is vivid and unambiguous

Evaluator (Gemma 4 26B)

The image features a sports car with a bright, vivid, and unambiguous red color, exactly as specified in the prompt. The car is parked on a gray asphalt road, which also matches the description. The colors are consistent with the requested "bright red" and "gray" tones. VERDICT: PASS

PASS
A woman wearing a royal blue evening dress against a neutral background

Evaluator (Gemma 4 26B)

The woman is wearing a vibrant, deep blue evening dress that perfectly matches the description of "royal blue." The background is a solid, light beige/taupe color, which qualifies as a "neutral background." All color requirements specified in the prompt are accurately reflected in the image. VERDICT: PASS

PASS
A field of pure sunflower yellow flowers under a clear blue sky

Evaluator (Gemma 4 26B)

The image features a vast field of bright, sunflower yellow flowers and a clear, solid blue sky. The colors are vibrant and accurately reflect the prompt's description of "pure sunflower yellow" and "clear blue sky." The colors are consistent throughout the field and the sky. VERDICT: PASS

Medium 100%
PASS
A bedroom wall painted in #FF6B35 (burnt orange) with white furniture and natural daylight

Evaluator (Gemma 4 26B)

The wall is a vibrant orange color that closely matches the description of "burnt orange" (#FF6B35). The furniture, including the bed frame and the side table/desk, is white as requested. The scene is illuminated by bright, natural daylight coming from the side windows. All color requirements specified in the prompt are accurately represented in the image. VERDICT: PASS

PASS
A forest scene where the foliage is precisely #2D5F2D (dark muted green) with brown tree trunks

Evaluator (Gemma 4 26B)

The foliage in the image consists of various shades of green. The dominant color for the leaves is a dark, muted green that closely aligns with the specified hex code #2D5F2D. The tree trunks are a consistent brown color as requested. The colors in the image accurately match the descriptions and the specific hex value provided in the prompt. VERDICT: PASS

PASS
A room with walls in #4A90D9 (medium blue) with warm wooden floor and white trim

Evaluator (Gemma 4 26B)

The walls are a medium blue color that closely aligns with the requested hex code #4A90D9. The floor is a warm-toned wood, and the baseboard/trim is white, matching the prompt's description of "warm wooden floor and white trim." All color requirements are met. VERDICT: PASS

Hard 100%
PASS
A minimalist room with walls in #F5E6CC (warm cream), a sofa in #2C3E50 (dark navy), and accent pillows in #E74C3C (bright red)

Evaluator (Gemma 4 26B)

The image accurately reflects the requested colors:
- The walls are a warm cream color (#F5E6CC).
- The sofa is a dark navy blue (#2C3E50).
- The accent pillows are a bright red (#E74C3C).
All three specific color requirements, including the color names and the intended visual tones, are correctly implemented in the generated image. VERDICT: PASS

PASS
A product flat-lay on a #1A1A2E (deep dark blue) background, with a #F0C27F (soft gold) watch and #FFFFFF (pure white) earbuds

Evaluator (Gemma 4 26B)

The image accurately reflects the requested colors. The background is a deep, dark navy blue consistent with #1A1A2E. The watch features a soft gold color (#F0C27F) on both the casing and the strap. The earbuds are a pure, bright white (#FFFFFF). All three color requirements are met precisely. VERDICT: PASS

PASS
A graphic design composition with background #FAFAFA (near-white), main shape #FF4500 (orange-red), and accent lines #333333 (dark charcoal)

Evaluator (Gemma 4 26B)

The image features a near-white background, which aligns with #FAFAFA. The main shapes are a vibrant orange-red, matching the description of #FF4500. The dark accent elements are a dark charcoal/grey, consistent with #333333. All specified colors are accurately represented in the graphic composition. VERDICT: PASS

Extreme 100%
PASS
A smooth gradient on a wall going from #FF0000 (red) on the left to #0000FF (blue) on the right, transitioning evenly through purple

Evaluator (Gemma 4 26B)

The image displays a smooth gradient transitioning from a bright red on the left to a blue on the right, passing through a purple middle section. The colors align with the requested hex values: the left side is a vibrant red (#FF0000), the right side is a clear blue (#0000FF), and the transition through purple is smooth and even. VERDICT: PASS

PASS
A sunset sky gradient on a flat surface from #FF6B35 (orange) at the bottom to #1B1464 (deep indigo) at the top

Evaluator (Gemma 4 26B)

The image displays a vertical gradient. The top color is a very dark, deep indigo/navy, which aligns with #1B1464. The bottom color is a vibrant, saturated orange, which aligns with the specified #FF6B35. The transition through the middle shows a reddish-purple hue, creating a smooth sunset effect that matches the requested color scheme and direction. VERDICT: PASS

A horizontal banner with a gradient from #00C9FF (cyan) on the left to #92FE9D (mint green) on the right
PASS

Evaluator (Gemma 4 26B)

The image shows a horizontal banner with a gradient. On the left, the color is a bright cyan, which closely matches the specified #00C9FF. On the right, the color transitions into a light mint green, which aligns with the specified #92FE9D. The gradient transition is smooth and follows the requested color scheme accurately. VERDICT: PASS

Photorealism 67%

Photorealism measures whether local/sefi-image-2b-turbo produces polished product and studio imagery with realistic materials, lighting, shadows, and finish.

Easy 67%
A single red sneaker on a clean white background with soft studio lighting and gentle shadow — product photography style
PASS

Evaluator (Gemma 4 26B)

The image successfully captures a high-quality product photography aesthetic. The lighting is soft and professional, creating a natural, gentle shadow beneath the sneaker. The texture of the red fabric is detailed and consistent. There are no obvious AI artifacts, such as warped shapes or nonsensical stitching, and the white background is clean and seamless. It looks like a genuine studio shot. VERDICT: PASS

A clear glass perfume bottle on a reflective black surface with studio lighting, no background distractions
FAIL

Evaluator (Gemma 4 26B)

The image fails to meet the standard of a professional studio product shot. There are significant AI artifacts: the glass bottle lacks realistic refraction and depth, appearing as a flat, distorted shape. The straw inside the bottle is unnaturally curved and disconnected from the nozzle. Furthermore, the reflection on the surface is poorly defined and lacks the high-fidelity detail expected in a real photograph. VERDICT: FAIL

A ripe green apple on a seamless white background with soft side lighting and subtle shadow — catalog style
PASS

Evaluator (Gemma 4 26B)

The image successfully captures the requested studio quality. The green apple features realistic textures, subtle specular highlights, and a natural-looking stem. The lighting is soft and directional, creating a gentle shadow on the seamless white background that grounds the object realistically. There are no obvious AI artifacts, such as distorted shapes or unnatural textures, and it passes as a high-quality professional product photograph. VERDICT: PASS

Graphical design 46%

Data Visualisation 0%

Data Visualisation measures whether local/sefi-image-2b-turbo produces charts with correct labels, proportions, colors, and readable values.

Easy 0%
A bar chart showing monthly sales for January through June with values 12K, 18K, 15K, 22K, 19K, 25K — labeled axes, color-coded bars, and a title 'Monthly Sales 2025'
FAIL

Evaluator (Qwen 3 VL)

The chart shows only 4 bars (Jan, Feb, Mar, Apr) with incorrect values (12K, 18K, 15K, 22K, 19K, 25K), missing May and June. Axes are mislabeled (y-axis reversed, x-axis labels incorrect). Title is misspelled (“Monhily”). Color-coding is present but inconsistent. Overall, it fails to meet the requested specifications. VERDICT: FAIL

A pie chart showing market share: Company A 35%, Company B 25%, Company C 20%, Others 20% — each slice a different color with labels and percentages
FAIL

Evaluator (Qwen 3 VL)

The chart misrepresents the requested data: percentages and labels are incorrect (e.g., "B55%" and "35%" overlap; "Others" is 20% but mislabeled). Labels are garbled (e.g., "Company B" appears as "Cteeer1"). The chart type is correct (pie chart), but content fails to match the prompt. VERDICT: FAIL
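For reference when inspecting the image, the requested shares can be turned into the slice angles a correct pie chart would show. This is a small illustrative sketch; `slice_angles` is a hypothetical helper, not something the benchmark is known to run.

```python
# Shares exactly as stated in the prompt.
requested = {"Company A": 35, "Company B": 25, "Company C": 20, "Others": 20}


def slice_angles(shares):
    """Map each label to its slice angle in degrees; shares must total 100."""
    total = sum(shares.values())
    if total != 100:
        raise ValueError(f"shares sum to {total}, expected 100")
    return {label: pct * 360 / 100 for label, pct in shares.items()}


angles = slice_angles(requested)
for label, degrees in angles.items():
    print(f"{label}: {degrees:.0f} degrees")
```

A label such as the "B55%" reported by the judge cannot belong to any valid rendering of this data, since no requested share is 55.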

A simple line graph showing daily website visitors over one week (Mon-Sun): 1200, 1500, 1100, 1800, 2000, 1700, 1400 — with grid lines, axis labels, and title 'Weekly Traffic'
FAIL

Evaluator (Qwen 3 VL)

The chart misrepresents data: y-axis labels are inverted, x-axis labels are illegible (showing "11", "3", "260"), and the title is duplicated. Values on axes don’t match the requested data (1200, 1500, etc.). The chart type is a line graph, but visual elements are incorrect and unreadable. VERDICT: FAIL

Layout & Design 11%

Layout & Design measures whether local/sefi-image-2b-turbo organizes posters, menus, covers, and other designed compositions with the requested hierarchy and elements.

Easy 33%
A movie night poster with a large popcorn image in the center, the title 'MOVIE NIGHT' in bold at the top, and the tagline 'Every Friday at 8 PM' at the bottom
FAIL

Evaluator (Qwen 3 VL)

The image features a large popcorn bucket centered, with "MOVIE NIGHT" at the top and "Every Friday at 8 PM" at the bottom. However, the central popcorn bucket has distorted, unreadable text ("MCP NACN") overlaid, violating the prompt’s layout and readability requirements. The visual hierarchy is partially correct but compromised by the illegible text. VERDICT: FAIL

An event poster featuring a guitar illustration, the headline 'LIVE MUSIC FEST' in large text, and 'Tickets available now' as a tagline below
PASS

Evaluator (Qwen 3 VL)

The poster features a guitar illustration and the headline "LIVE MUSIC FEST" in large, distressed white text with correct visual hierarchy. The tagline "Tickets available now" appears below, though with a minor typo ("avaialble"). Text is readable and correctly placed. The core layout elements are present and well-structured. VERDICT: PASS

A book club poster with an open book illustration, the title 'JOIN THE BOOK CLUB' prominently displayed, and 'Meets every Wednesday' as the tagline
FAIL

Evaluator (Qwen 3 VL)

The image features an open book illustration with "JOIN THE BOOK CLUB" prominently displayed in large, bold text across both pages. Below, placeholder text mimics book content. The tagline "Meets every Wednesday" appears at the bottom, but is misspelled ("Weeneesday"). The visual hierarchy is correct, though the tagline’s error undermines accuracy. VERDICT: FAIL

Medium 0%
A landing page design with a hero header image at the top, a headline and subtitle in the middle, a 'Sign Up' call-to-action button, and a footer with three social media icons
FAIL

Evaluator (Qwen 3 VL)

The image shows a hero header with a background image, a headline and subtitle (though text is gibberish and unreadable), a centered "Sign Up" button, and a single social media icon at the bottom. The layout structure is present, but the text is nonsensical and not readable. The visual hierarchy is correct, but content fails the core requirement of readable text. VERDICT: FAIL

A newsletter layout with a banner image header, a two-column body section with text on the left and an image on the right, a highlighted quote block, and a footer with unsubscribe link
FAIL

Evaluator (Qwen 3 VL)

The image shows a banner header, two-column body (text left, image right), and footer with “Ubclusrsecde”. However, all text is gibberish and unreadable, violating the requirement for readable, correctly placed text. The layout structure is present but content fails functional criteria. VERDICT: FAIL

A restaurant menu layout with a logo header, three category sections (appetizers, mains, desserts) each with items and prices, and a footer with address and hours
FAIL

Evaluator (Qwen 3 VL)

The image shows a menu with a logo header, three category columns (Appetizers, Mains, Desserts), and a footer with address/hours. However, all text is gibberish, unreadable, and incorrectly placed. The layout structure is present but the content violates the prompt’s requirement for readable, correctly placed text. VERDICT: FAIL

Hard 0%
A magazine double-page spread with a left page containing a large hero photo, and a right page with a two-column article, a pull quote, a sidebar with author bio, and a page number
FAIL

Evaluator (Qwen 3 VL)

The left page has a large hero photo with overlaid text, but no clear article layout. The right page shows two columns, a pull quote, a sidebar with author bio, and page number — all correctly placed. However, all text is gibberish, unreadable, and violates the prompt’s requirement for readable content. Layout structure is mostly present but text fails core functionality. VERDICT: FAIL

A mobile app UI mockup showing a dashboard with a top navigation bar, a search field, a 2x2 grid of feature cards with icons, a list of recent activity items, and a bottom tab bar
FAIL

Evaluator (Qwen 3 VL)

The image shows a mobile app UI with a top search bar, a 2x2 grid of feature cards, a recent activity list, and a bottom tab bar — matching the prompt’s layout. However, all text is gibberish and unreadable, violating the requirement for readable, correctly placed text. Visual hierarchy is present but meaningless due to illegible content. VERDICT: FAIL

A fashion magazine spread with a full-bleed left-page photo, right page with three-column layout, headline, body copy, a smaller inset image, and styled page numbers
FAIL

Evaluator (Qwen 3 VL)

Left page has full-bleed photo with text overlay. Right page shows three-column layout with headline, body copy, inset image, and page number. However, all text is gibberish and unreadable, violating the requirement for readable, correctly placed text. Layout structure is present but content fails functional criteria. VERDICT: FAIL

Style Diversity 83%

Style Diversity measures whether local/sefi-image-2b-turbo can shift between visual styles like oil painting, pixel art, watercolor, and technical illustration.

Easy 67%
A golden retriever sitting in a garden, painted in the style of an oil painting with visible brushstrokes and rich texture
PASS

Evaluator (Qwen 3.5 122B)

The image clearly depicts a golden retriever sitting in a garden setting. The style is unmistakably that of an oil painting, with thick, visible brushstrokes (impasto technique) creating rich texture throughout the composition. The dappled light, color blending, and painterly execution all align with the requested style. The key visual characteristics are fully present and well-executed. VERDICT: PASS

A golden retriever sitting in a garden, rendered in Japanese anime style with flat colors and large expressive eyes
PASS

Evaluator (Qwen 3.5 122B)

The image clearly exhibits the requested Japanese anime style. Key characteristics are present: the golden retriever has large, expressive eyes with distinct highlights, simplified facial features, and clean line art. The coloring uses flat tones with minimal shading gradients, typical of anime aesthetics. The background flowers and foliage also follow this stylized approach with bold outlines and vibrant, non-photorealistic colors. The overall composition and rendering match the prompt’s artist

A golden retriever sitting in a garden, rendered in pixel art style with a 32x32 grid and limited color palette
FAIL

Evaluator (Qwen 3.5 122B)

The image clearly depicts a golden retriever in a garden setting. The pixel art style is evident with visible pixelation and blocky forms. However, the grid overlay is much finer than 32x32 (appearing closer to 64x64 or higher), and the color palette is quite extensive rather than limited. While the core style is recognizable, the specific technical constraints (32x32 grid and limited palette) are not accurately met. VERDICT: FAIL
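The two constraints the judge cites, grid resolution and palette size, can in principle be measured rather than eyeballed. The sketch below is an idealized illustration on synthetic data: it assumes losslessly stored pixels given as nested lists, and all three helper names are hypothetical. A real generated image would need tolerance for anti-aliasing and compression noise.

```python
def is_uniform_grid(pixels, grid=32):
    """True if the image splits into grid x grid cells, each a single flat color."""
    height, width = len(pixels), len(pixels[0])
    if height % grid or width % grid:
        return False
    cell_h, cell_w = height // grid, width // grid
    for top in range(0, height, cell_h):
        for left in range(0, width, cell_w):
            first = pixels[top][left]
            for y in range(top, top + cell_h):
                for x in range(left, left + cell_w):
                    if pixels[y][x] != first:
                        return False
    return True


def palette_size(pixels):
    """Number of distinct color values in the image."""
    return len({color for row in pixels for color in row})


def upscale(cells, factor):
    """Nearest-neighbour upscale: each cell becomes a factor x factor block."""
    return [[c for c in row for _ in range(factor)]
            for row in cells for _ in range(factor)]


# Synthetic 128x128 test images built from a 32x32 and a 64x64 cell grid.
coarse = upscale([[(x + y) % 4 for x in range(32)] for y in range(32)], 4)
fine = upscale([[(x + y) % 2 for x in range(64)] for y in range(64)], 2)

print(is_uniform_grid(coarse, 32), palette_size(coarse))
print(is_uniform_grid(fine, 32), is_uniform_grid(fine, 64))
```

The second image mirrors the failure described above: its cells are uniform on a 64x64 grid but not on the requested 32x32 one.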

Medium 100%
A city street scene in the style of Art Deco — geometric shapes, gold and black palette, ornamental symmetry
PASS

Evaluator (Qwen 3.5 122B)

The image strongly exhibits the requested Art Deco style. Key characteristics are clearly present: 1. **Geometric Shapes:** The architecture is defined by sharp angles, vertical lines, and repeating geometric patterns (zigzags, chevrons). 2. **Gold and Black Palette:** The color scheme is strictly limited to metallic gold/brass tones against deep black shadows, creating high contrast. 3. **Ornamental Symmetry:** The composition is largely symmetrical, featuring ornate, stylized decorative elemen

A woman with a parasol in the style of Impressionism — soft brushstrokes, dappled light, pastel palette
PASS

Evaluator (Qwen 3.5 122B)

The image clearly exhibits Impressionist characteristics: soft, visible brushstrokes are evident throughout, especially in the background foliage and dress folds. Dappled light filters through the parasol and onto the subject’s face and clothing, creating a luminous effect. The palette is dominated by pastels — pale blues, yellows, pinks, and whites — consistent with the style. The overall impression prioritizes light and atmosphere over sharp detail, fulfilling the prompt’s requirements. VERDI

A portrait of a man in Pop Art style — bold outlines, Ben-Day dots, flat vivid primary colors
PASS

Evaluator (Qwen 3.5 122B)

The image clearly exhibits Pop Art characteristics: bold black outlines define facial features, Ben-Day dots are visible throughout (especially in the background and skin tones), and vivid primary/secondary colors are used in a flat, graphic manner. The halftone pattern, high contrast, and stylized color blocking align with the requested style. All key visual elements are present and recognizable. VERDICT: PASS

Hard 67%
A forest rendered in photorealistic watercolor — the detail and lighting of a photo combined with visible watercolor bleeds and paper texture
PASS

Evaluator (Qwen 3.5 122B)

The image clearly exhibits the requested photorealistic watercolor style. Key characteristics are present: visible paper texture, soft watercolor bleeds and washes, and a realistic forest scene with natural lighting and depth. The blend of photographic detail in tree forms with translucent, bleeding pigments matches the prompt’s intent. VERDICT: PASS

A woman in a futuristic city rendered in cyberpunk Art Nouveau — neon colors and tech elements combined with flowing organic lines and floral borders
PASS

Evaluator (Qwen 3.5 122B)

The image clearly embodies the requested "cyberpunk Art Nouveau" style. Key characteristics are present: the central figure is framed by ornate, flowing organic lines and floral borders typical of Art Nouveau. The background features a futuristic city with neon colors and tech elements, fulfilling the cyberpunk aspect. The combination of these elements is visually cohesive and recognizable. VERDICT: PASS

A medieval castle depicted in Japanese ukiyo-e woodblock print style combined with photorealistic sky and lighting
FAIL

Evaluator (Qwen 3.5 122B)

The image displays a medieval-style castle with some East Asian architectural elements (curved roofs), but it does not convincingly replicate the ukiyo-e woodblock print style. Key characteristics like flat color planes, bold outlines, and traditional woodblock texture are absent. The lighting and sky appear photorealistic, but the overall aesthetic is more of a digital painting or concept art than ukiyo-e. The requested style fusion is not clearly recognizable. VERDICT: FAIL

Extreme 100%
An interior scene rendered in soft brutalism — raw concrete textures combined with rounded pastel furniture and warm diffused light
PASS

Evaluator (Qwen 3.5 122B)

The image clearly exhibits soft brutalism: raw, textured concrete walls with visible imperfections and seams are prominent. Rounded, pastel pink furniture pieces dominate the foreground, contrasting with the rough surfaces. Warm, diffused light filters from above, casting gentle shadows and enhancing the serene mood. All key visual characteristics — concrete texture, pastel curves, and ambient warmth — are present and well-executed. VERDICT: PASS

A still life painted in anxious minimalism — sparse composition with uncomfortable negative space, muted colors, and subtle visual tension
PASS

Evaluator (Qwen 3.5 122B)

The image exhibits a sparse composition with a single white cylindrical object on a table against a large, empty wall — fulfilling the “uncomfortable negative space” and “sparse composition” criteria. Colors are muted (grays and whites), and there’s subtle visual tension in the starkness and isolation of the object. The style aligns closely with “anxious minimalism.” While not overtly dramatic, the mood is conveyed through restraint and emptiness. VERDICT: PASS

A cityscape rendered in nostalgic futurism — retro-futuristic 1960s space-age aesthetics with a melancholic warm-toned patina
PASS

Evaluator (Qwen 3.5 122B)

The image displays a cityscape with tall, slender spires and flying saucer-like structures, consistent with retro-futuristic 1960s aesthetics. The color palette is a monochromatic sepia/warm brown, creating a nostalgic, melancholic patina. The architectural forms are distinctly mid-century modern sci-fi. The style is clearly recognizable and matches the prompt's requirements. VERDICT: PASS
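The percentages shown for the Graphical design category can be reproduced from the individual verdicts listed above. The sketch assumes each displayed figure is the plain fraction of PASS verdicts, rounded to the nearest whole percent; that assumption matches every number shown in this category, but the page does not document the aggregation itself.

```python
# One letter per prompt, in page order: P = PASS, F = FAIL.
results = {
    "Data Visualisation": {"Easy": "FFF"},
    "Layout & Design": {"Easy": "FPF", "Medium": "FFF", "Hard": "FFF"},
    "Style Diversity": {"Easy": "PPF", "Medium": "PPP", "Hard": "PPF", "Extreme": "PPP"},
}


def pass_rate(verdicts):
    """Percentage of 'P' entries, rounded to the nearest integer."""
    return round(100 * verdicts.count("P") / len(verdicts))


summary = {}
for subcategory, tiers in results.items():
    for tier, verdicts in tiers.items():
        summary[(subcategory, tier)] = pass_rate(verdicts)
    # Subcategory score pools all of its prompts across tiers.
    summary[subcategory] = pass_rate("".join(tiers.values()))

# Category score pools every prompt in every subcategory.
all_verdicts = "".join(v for tiers in results.values() for v in tiers.values())
summary["Graphical design"] = pass_rate(all_verdicts)

for key, value in summary.items():
    print(key, f"{value}%")
```

Under this pooling, Layout & Design is 1 pass out of 9 (11%), Style Diversity is 10 out of 12 (83%), and the category as a whole is 11 out of 24 (46%).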