ImageBench

ImageBench V1

192 evaluations across 6 categories

Benchmark V1 verdicts are produced by VLM judges and can contain mistakes. Treat PASS/FAIL labels as machine-assisted assessments, and inspect the images yourself.
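As a minimal sketch of how a PASS/FAIL label could be recovered from a judge's free-text rationale, the snippet below assumes each rationale ends with "VERDICT: PASS" or "VERDICT: FAIL", as the transcripts in this report do. The function name `extract_verdict` is hypothetical and is not part of ImageBench.

```python
import re
from typing import Optional

# Assumption: the judge closes its rationale with "VERDICT: PASS" or
# "VERDICT: FAIL". This is inferred from the transcripts in this report,
# not from any documented ImageBench output format.
_VERDICT_RE = re.compile(r"VERDICT:\s*(PASS|FAIL)\s*$")


def extract_verdict(judge_text: str) -> Optional[str]:
    """Return "PASS" or "FAIL", or None when no trailing verdict is found."""
    match = _VERDICT_RE.search(judge_text.strip())
    return match.group(1) if match else None


print(extract_verdict("No misspellings or illegible parts. VERDICT: PASS"))
```

A `None` result, or a rationale that reads at odds with its own label, would be a reasonable cue to open the image and check it by hand.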

Generation Details

Source-backed model context, size, cost, and request settings for this ImageBench V1 run.

local/sefi-image-5b-base

Local

SeFi Image 5B Base is a locally fine-tuned text-to-image model produced by the SeFi image-generation fine-tuning pipeline and run on an NVIDIA DGX Spark. It is the ~5B base (non-distilled) variant of the SeFi Image family. It is not a publicly released hosted product; no external model card or citation is disclosed.

Maker: SeFi pipeline
Family: SeFi Image
Model Size: ~5B (estimated)
Cost: local run; no API price (not applicable)
Run Target: gx10/sefi-image-5b-base
Effective Request: fields unknown
Overall: 60.1
Capability: 80%
Est. Preference: 40.4
Pass: 153
Fail: 39
Avg Latency: 130.6s
Min Latency: 124.4s
Max Latency: 376.6s
Text Rendering: 80%
Spatial Reasoning: 83%
Human realism: 88%
Truthfulness: 70%
Professional Studio: 93%
Graphical design: 54%
Preference: 40%
Latency: 0%
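The reported pass and fail counts can be checked against the evaluation total and the Capability figure with a few lines of arithmetic. This is only a consistency check on the numbers shown here; it assumes Capability is a plain pass rate, which the report does not state.

```python
# Counts copied from the summary in this report.
passed, failed = 153, 39

total = passed + failed
pass_rate = passed / total  # assumption: Capability is pass / (pass + fail)

print(total)               # 192
print(f"{pass_rate:.1%}")  # 79.7%
```

The total matches the 192 evaluations, and 79.7% rounds to the 80% shown as Capability.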

Text Rendering 80%

Typography Style 100%

Typography Style measures whether local/sefi-image-5b-base preserves requested lettering styles, font mood, and text texture such as typewriter ink or neon signage.

Easy 100%
The word 'CHAPTER ONE' typed on aged paper with a vintage typewriter font, complete with slightly uneven ink

Evaluator (Qwen 3 VL)

The text "CHAPTER ONE" is spelled correctly and rendered in a bold, blocky font resembling vintage typewriter lettering. The ink appears slightly uneven and textured, matching the "slightly uneven ink" description. The aged paper background enhances the vintage feel. The style convincingly simulates a typewriter output. VERDICT: PASS

The phrase 'TODAY'S SPECIAL' written in white chalk on a dark green chalkboard with chalk dust

Evaluator (Qwen 3 VL)

Text reads "TODAY'S SPECIAL" correctly. Letters mimic chalk texture and smudges on a dark green board, with visible chalk dust residue. The style convincingly simulates chalk writing. Minor imperfections in stroke uniformity are authentic to chalk. All elements align with the prompt. VERDICT: PASS

The word 'OPEN' displayed on a red LED sign with glowing pixel-style letters

Evaluator (Qwen 3 VL)

The word "OPEN" is correctly spelled and rendered in a glowing red, pixel-style LED matrix, matching the prompt’s description. Letters are composed of distinct illuminated dots, creating a convincing digital display effect against a dark background. The style and medium are accurately represented. VERDICT: PASS

Writing accuracy 75%

Writing accuracy measures whether local/sefi-image-5b-base renders the exact requested words, numbers, labels, and short phrases without spelling or layout errors.

Easy 100%
A single large word 'HELLO' centered on a white background

Evaluator (Qwen 3 VL)

The word "HELLO" is fully spelled correctly and is clearly readable. All letters are distinct, properly formed, and legible against the white background. No misspellings or missing characters are present. The text meets the specified requirement exactly. VERDICT: PASS

The word 'ADVENTURE' in big bold letters centered on a plain cream background

Evaluator (Qwen 3 VL)

The word "ADVENTURE" is spelled correctly and fully readable in bold, centered text on a cream background. All letters are distinct and unambiguous. No misspellings or illegible parts exist. The visual matches the intended specification precisely. VERDICT: PASS

The word 'SERENITY' written large and centered on a solid light blue background

Evaluator (Qwen 3 VL)

The word "SERENITY" is spelled correctly and is fully readable, centered on a solid light blue background. No misspellings, missing words, or illegible parts are present. The text matches the intended specification exactly. VERDICT: PASS

Medium 100%
The sentence 'The quick brown fox jumps high' in clean black text on a white background

Evaluator (Qwen 3 VL)

All words are spelled correctly and fully readable: "The", "quick", "brown", "fox", "jumps", "high". Text is clean black on white background. No misspellings or illegible parts. VERDICT: PASS

'Every sunset brings a new dawn ahead' written neatly on a pale yellow background

Evaluator (Qwen 3 VL)

Text reads: "Every sunset brings a new dawn ahead" — matches intended phrase exactly. All words are spelled correctly and fully readable in black cursive on pale yellow background. No misspellings or illegible parts. VERDICT: PASS

'Stars shine brightest in the dark night' displayed on a clean off-white background

Evaluator (Qwen 3 VL)

Text reads: "Stars shine brightest in the dark night" — matches intended phrase exactly. All words are spelled correctly and fully legible against the clean off-white background. No misspellings or missing words detected. VERDICT: PASS

Hard 100%
A poster with headline 'SUMMER SALE' in large bold text, body text 'Up to 50% off all items this weekend only' in medium size, and fine print 'Terms and conditions apply. See store for details.' at the bottom in small text

Evaluator (Qwen 3 VL)

All text matches the intended content exactly. "SUMMER SALE" is prominent. Body text reads "Up to 50% off all items this weekend only". Fine print: "Terms and conditions apply. See store for details." No misspellings, missing words, or illegible parts. Text is fully readable and accurate. VERDICT: PASS

A flyer with large title 'GRAND OPENING', subtitle 'Join us for food, music, and fun' in medium text, and small disclaimer 'RSVP required. Limited seating available.' at the bottom

Evaluator (Qwen 3 VL)

All text matches the intended content exactly. "GRAND OPENING" is correct. Subtitle "Join us for food, music, and fun" is fully readable. Disclaimer "RSVP required. Limited seating available." is accurate. No misspellings, missing words, or illegible parts detected. Text is clear and correctly rendered. VERDICT: PASS

An event banner with headline 'ANNUAL GALA 2025' at the top in large serif font, description 'An evening of fine dining and live jazz' in medium text, and footnote 'Black tie optional. Doors open at 6 PM.' in small text

Evaluator (Qwen 3 VL)

All text matches the intended content exactly: "ANNUAL GALA 2025", "An evening of fine dining and live jazz", and "Black tie optional. Doors open at 6 PM.". No misspellings, missing words, or illegible parts. Font styles and sizes are appropriate. Text is fully readable. VERDICT: PASS

Extreme 0%
A vintage movie poster for 'THE LAST VOYAGE' featuring a tagline 'One ship. One chance. No turning back.' with credits block reading 'Directed by James Harlow. Starring Elena Voss, Marcus Chen, and Sofia Reyes. Music by Thomas Grant. A Paramount Pict

Evaluator (Qwen 3 VL)

Text is mostly accurate. "Elena Voss" is misspelled as "Elena Voss" (correct spelling is "Voss" with one 's'). All other names, credits, and tagline are correctly spelled and fully readable. The core intent is met despite one minor spelling error. VERDICT: FAIL

A magazine cover titled 'NATIONAL EXPLORER' with cover lines 'The Hidden Temples of Cambodia — page 34', 'Interview: Dr. Maria Santos on Climate Action', 'Top 10 Trails You Must Hike This Year', and a barcode area with 'Issue 247 | March 2025 | $6.99

Evaluator (Qwen 3 VL)

Text errors: “Ete Nons This Y ar” is garbled (should be “Top 10 Trails You Must Hike This Year”). “$6.99” is correct, but “March 20.99” is invalid (should be “March 2025”). “Issue 247 | 0025” is inconsistent. Multiple words are misspelled or missing. VERDICT: FAIL

A product label for 'GOLDEN HARVEST ORGANIC HONEY' with subtitle 'Raw & Unfiltered', weight '12 oz (340g)', description 'Pure wildflower honey harvested from sustainable apiaries in the Pacific Northwest. No additives or preservatives.', and small te

Evaluator (Qwen 3 VL)

The word "aparies" is misspelled; it should be "apiaries". All other text matches the intended content exactly and is fully readable. The misspelling is the only error. VERDICT: FAIL
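The Text Rendering percentages above can be reproduced from the per-tier outcomes. The sketch below is an illustration under stated assumptions: the tuple layout and the function names are hypothetical, and the counts are transcribed by hand from the results in this report.

```python
from collections import defaultdict

# (subcategory, tier, passes, total), transcribed from this report.
# The layout is an assumption for illustration, not ImageBench's data format.
TEXT_RENDERING = [
    ("Typography Style", "Easy", 3, 3),
    ("Writing accuracy", "Easy", 3, 3),
    ("Writing accuracy", "Medium", 3, 3),
    ("Writing accuracy", "Hard", 3, 3),
    ("Writing accuracy", "Extreme", 0, 3),
]


def pass_rate_by_subcategory(rows):
    """Sum passes and totals per subcategory, then divide."""
    passes = defaultdict(int)
    totals = defaultdict(int)
    for subcategory, _tier, passed, total in rows:
        passes[subcategory] += passed
        totals[subcategory] += total
    return {name: passes[name] / totals[name] for name in totals}


def pooled_pass_rate(rows):
    """Pool every item in the category, ignoring subcategory boundaries."""
    return sum(row[2] for row in rows) / sum(row[3] for row in rows)


print(pass_rate_by_subcategory(TEXT_RENDERING))
print(f"{pooled_pass_rate(TEXT_RENDERING):.0%}")
```

Pooling all 15 items gives 80%, which matches the category score, whereas averaging the two subcategory rates (100% and 75%) would give 87.5%. That suggests the category figure is computed per item, although the report does not say so explicitly.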

Spatial Reasoning 83%

Attributes Binding 100%

Attributes Binding measures whether local/sefi-image-5b-base attaches the right colors, materials, and properties to the correct requested objects.

Easy 100%
A red ball and a blue cube sitting side by side on a white surface

Evaluator (Qwen 3.5 122B)

The image shows two objects on a white surface. On the left is a red sphere (ball), and on the right is a blue cube. Both objects are correctly colored and shaped as specified. They are positioned side by side. No attributes are swapped. VERDICT: PASS

A green apple and an orange pumpkin on a wooden cutting board

Evaluator (Qwen 3.5 122B)

The image shows a green apple and an orange pumpkin on a wooden cutting board. The apple is correctly green, round, and small; the pumpkin is orange, ribbed, and larger with a stem. All attributes (color, shape, size) are correctly bound to each object — no swaps or misattributions. The scene matches the prompt exactly. VERDICT: PASS

A yellow taxi and a black limousine parked next to each other on a street

Evaluator (Qwen 3.5 122B)

The image shows a yellow taxi and a black limousine parked next to each other on a street. The taxi is correctly colored yellow and has the typical shape of a taxi, including a roof sign. The limousine is correctly colored black and has the elongated shape characteristic of a limousine. Both vehicles are appropriately sized relative to each other and the environment. VERDICT: PASS

Medium 100%
A large red circle, a small blue triangle, and a medium green star arranged on a white canvas

Evaluator (Qwen 3.5 122B)

The image shows three geometric shapes on a white background: 1. A large red circle — correct color, size, and shape. 2. A small blue triangle — correct color, size, and shape. 3. A medium green star — correct color, size, and shape. All attributes are correctly bound to the right objects with no swaps. The arrangement matches the prompt’s description. VERDICT: PASS

A tall thin white candle, a short fat red candle, and a medium spiral yellow candle on a shelf

Evaluator (Qwen 3.5 122B)

The image shows three candles on a shelf. The leftmost candle is tall, thin, and white — matching the first description. The middle candle is short, fat, and red — matching the second. The rightmost candle is medium height, spiral-shaped, and yellow — matching the third. All attributes are correctly bound to their respective objects with no swaps. VERDICT: PASS

A big brown horse, a small white rabbit, and a medium-sized black cat standing in a meadow

Evaluator (Qwen 3.5 122B)

The image shows a large brown horse standing in a meadow, correctly matching the prompt. A small white rabbit is present in the foreground, also correct. A medium-sized black cat is standing nearby, facing the rabbit — its color, size, and posture match the description. All attributes (color, size, shape) are correctly bound to each object with no swaps or misassignments. The scene composition aligns with the prompt’s requirements. VERDICT: PASS

Hard 100%
A pink fire truck, a blue banana, and a green sun in a surreal landscape

Evaluator (Qwen 3.5 122B)

The image contains a pink fire truck, a blue banana, and two suns (one yellow, one green). The prompt requested a green sun — which is present — but also implied only one sun. However, the core attributes are correctly bound: the fire truck is pink, the banana is blue, and there is a green sun. The extra yellow sun doesn’t violate the binding requirement for the specified objects. All requested objects with correct attributes are present. VERDICT: PASS

A purple carrot, a red tree with red leaves, and a white watermelon sliced open

Evaluator (Qwen 3.5 122B)

The image contains: 1. Purple carrot with green top – correct color and shape. 2. Red tree with red leaves – correct color and form. 3. White watermelon slice with black seeds and green rind – correct internal color, external pattern, and sliced shape. All attributes are correctly bound to their respective objects. No swaps or misattributions observed. VERDICT: PASS

An orange elephant, a yellow penguin, and a silver strawberry arranged on a table

Evaluator (Qwen 3.5 122B)

The image shows three objects on a white surface: 1. An orange elephant toy with white tusks and large ears — matches “orange elephant”. 2. A yellow penguin toy with black eyes and orange beak/feet — matches “yellow penguin”. 3. A silver strawberry with textured surface and green-tipped leaves — matches “silver strawberry”. All attributes (color, shape, object type) are correctly bound to each item. No swaps or mismatches observed. VERDICT: PASS

Compositionality 89%

Compositionality measures whether local/sefi-image-5b-base combines subjects, actions, outfits, props, and scene details into one coherent requested image.

Easy 67%
A tabby cat wearing oversized red sunglasses

Evaluator (Qwen 3.5 122B)

The image contains all described elements: 1. Tabby cat - Present (clear tabby markings on fur) 2. Oversized red sunglasses - Present (large red frames covering eyes) 3. Coherent combination - Present (glasses fit naturally on cat's face) All prompt requirements are met with proper integration of concepts. VERDICT: PASS

A golden retriever wearing a tiny cowboy hat

Evaluator (Qwen 3.5 122B)

The image shows a golden retriever wearing a cowboy hat. However, the hat is not “tiny” — it fits proportionally on the dog’s head, appearing normal-sized relative to the animal. The prompt specifically requested a “tiny” hat, which is absent. While the breed and hat type are correct, the size descriptor is inaccurate. VERDICT: FAIL

A penguin holding a colorful umbrella

Evaluator (Qwen 3.5 122B)

The image contains all required elements: 1. Penguin - Present (appears to be an Emperor penguin) 2. Holding - Present (penguin's flipper is gripping the umbrella handle) 3. Colorful umbrella - Present (rainbow-colored with red, yellow, green, blue, purple sections) All elements are coherently combined in a natural-looking composition. The penguin is standing on rocks while holding the umbrella, creating a believable scene despite the unusual concept. VERDICT: PASS

Medium 100%
A medieval knight in full armor sitting at a desk typing on a modern laptop

Evaluator (Qwen 3.5 122B)

The image shows a medieval knight in full armor sitting at a wooden desk, typing on a modern laptop. All described elements are present and coherently combined: - Medieval knight in full armor: Present - Sitting at a desk: Present - Typing on a modern laptop: Present The combination is logical and visually coherent. VERDICT: PASS

A Roman gladiator taking a selfie with a smartphone in the Colosseum

Evaluator (Qwen 3.5 122B)

The image contains all required elements: 1. Roman gladiator: Present (wearing helmet, armor, cape) 2. Selfie action: Present (holding phone up, looking at screen) 3. Smartphone: Present (modern device in hand) 4. Colosseum: Present (ancient amphitheater in background) All elements are coherently combined in a single scene with proper spatial relationships and lighting consistency. VERDICT: PASS

An astronaut in a spacesuit riding a bicycle through a park

Evaluator (Qwen 3.5 122B)

The image shows an astronaut in a full white spacesuit with helmet and backpack, riding a dark-colored bicycle on a paved path. The background includes green grass, trees, and park benches — clearly indicating a park setting. All prompt elements are present: astronaut, spacesuit, bicycle, and park. The combination is coherent and visually consistent. VERDICT: PASS

Hard 100%
An octopus wearing a top hat and monocle playing chess on the surface of the moon with Earth visible in the background

Evaluator (Qwen 3.5 122B)

The image contains all described elements: - Octopus: Present, central figure - Top hat: Present, worn by octopus - Monocle: Present, on octopus's eye - Chess game: Present, board with pieces in foreground - Moon surface: Present, rocky terrain beneath - Earth in background: Present, large planet visible All elements are coherently combined in a surreal but visually consistent composition. VERDICT: PASS

A dinosaur in a business suit giving a PowerPoint presentation in a modern office while drinking coffee

Evaluator (Qwen 3.5 122B)

The image contains all described elements: 1. Dinosaur - Present (T-Rex type) 2. Business suit - Present (jacket, tie, dress shoes) 3. PowerPoint presentation - Present (on screen behind dinosaur) 4. Modern office - Present (desks, chairs, lighting) 5. Drinking coffee - Present (holding white mug) All elements are coherently combined in a single scene. The dinosaur is wearing the suit, standing in the office, holding coffee, with presentation visible. VERDICT: PASS

A robot dressed as a pirate sailing a paper boat on a lake of lava while juggling three kittens

Evaluator (Qwen 3.5 122B)

The image contains all described elements: - Robot dressed as a pirate: Present (skull face, pirate hat, vest) - Paper boat: Present (white origami-style boat) - Lake of lava: Present (glowing red molten surface) - Three kittens: Present (three tabby kittens floating above) - Juggling action: Present (robot’s arms raised as if juggling) All components are coherently combined in a surreal but visually consistent scene. VERDICT: PASS

Counting 56%

Counting measures whether local/sefi-image-5b-base creates the exact number of requested objects without adding or missing items.

Easy 100%
Exactly 3 red apples arranged on a wooden table

Evaluator (Qwen 3.5 122B)

The image displays three red apples placed side by side on a wooden surface. The apples are uniformly red with visible stems and slight natural imperfections. The wooden table has a light grain pattern. The count of apples is exactly three, matching the prompt’s request. No other objects are present. All criteria are met. VERDICT: PASS

Exactly 5 yellow rubber ducks sitting in a row on a bathroom shelf

Evaluator (Qwen 3.5 122B)

The image shows exactly five yellow rubber ducks arranged in a row on a white bathroom shelf. Each duck is identical in appearance, with orange beaks and black eyes. The count matches the prompt’s request precisely. No extra or missing ducks are visible. The setting (bathroom shelf) is also accurate. VERDICT: PASS

Exactly 4 white candles standing on a dark surface

Evaluator (Qwen 3.5 122B)

The image shows four white candles with flames, standing upright on a dark surface. All candles are lit and evenly spaced. The count of candles is exactly four, matching the prompt’s request. No other objects are present. The visual content fully satisfies the specified criteria. VERDICT: PASS

Medium 33%
2 orange cats and 3 black dogs sitting together on a grassy lawn

Evaluator (Qwen 3.5 122B)

The image shows two orange cats and two black dogs sitting on grass. The prompt requested 2 orange cats and 3 black dogs. The cat count is correct, but the dog count is off by one — only two dogs are visible, not three. Therefore, the object count does not exactly match the request. VERDICT: FAIL

4 red roses and 2 white daisies in a glass vase on a table

Evaluator (Qwen 3.5 122B)

The image contains 4 red roses, which matches the request. However, there are 4 white daisies visible (one top left, one top right, one center left, one center right), not 2 as requested. The glass vase and table are present. Since the daisy count is incorrect, the prompt’s specification is not fully met. VERDICT: FAIL

1 blue butterfly and 3 yellow butterflies resting on a garden fence

Evaluator (Qwen 3.5 122B)

The image shows 1 blue butterfly and 3 yellow butterflies resting on a wooden garden fence. The count exactly matches the requested objects: one blue and three yellow butterflies. All are positioned on or near the fence as described. VERDICT: PASS

Hard 33%
Exactly 9 goldfish swimming in a large round fishbowl

Evaluator (Qwen 3.5 122B)

The image shows a round fishbowl containing goldfish. Counting them: there are 9 goldfish visible in the bowl. The prompt requested “Exactly 9 goldfish swimming in a large round fishbowl” — this matches precisely. All fish appear to be goldfish, and the bowl is large and round. No extra or missing fish. VERDICT: PASS

A fruit bowl containing exactly 3 bananas, 4 oranges, and 2 apples

Evaluator (Qwen 3.5 122B)

The image shows a fruit bowl with bananas, oranges, and apples. Counting carefully: there are 4 bananas (not 3), 5 oranges (not 4), and 2 apples (correct). The counts for bananas and oranges do not match the requested numbers. Since the prompt requires exact counts and two of the three categories are incorrect, the core criterion is not met. VERDICT: FAIL

Exactly 11 colorful balloons floating against a clear blue sky

Evaluator (Qwen 3.5 122B)

The image displays colorful balloons against a clear blue sky. Counting them: there are 12 balloons total — red, green, yellow, purple, orange, white, and blue variants. The prompt requested exactly 11. Since the count is 12, it does not match the requested number. VERDICT: FAIL
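The Counting verdicts above apply an exact-match rule: any object type whose observed count differs from the request fails the item. A minimal sketch of that rule follows, using the fruit bowl item as input. The function `count_mismatches` is hypothetical, and the observed counts are the judge's reported numbers, not independently verified ones.

```python
from typing import Dict, Tuple


def count_mismatches(
    requested: Dict[str, int], observed: Dict[str, int]
) -> Dict[str, Tuple[int, int]]:
    """Map each object whose counts differ to (requested, observed).

    Objects missing from either side are treated as a count of zero.
    """
    names = sorted(requested.keys() | observed.keys())
    return {
        name: (requested.get(name, 0), observed.get(name, 0))
        for name in names
        if requested.get(name, 0) != observed.get(name, 0)
    }


# Fruit bowl item: counts requested in the prompt vs. counts the judge reported.
requested = {"bananas": 3, "oranges": 4, "apples": 2}
observed = {"bananas": 4, "oranges": 5, "apples": 2}

mismatches = count_mismatches(requested, observed)
print(mismatches)                          # {'bananas': (3, 4), 'oranges': (4, 5)}
print("PASS" if not mismatches else "FAIL")  # FAIL
```

An empty result corresponds to PASS under this rule. The sketch says nothing about how the judge arrives at its counts, which is where errors are most likely to enter.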

Negation 78%

Negation measures whether local/sefi-image-5b-base respects what should be absent from the image, not only what should be present.

Easy 100%
A tropical beach scene with no people

Evaluator (Qwen 3.5 122B)

The image depicts a serene tropical beach scene with palm trees leaning over white sand and turquoise water. There are no visible people in the frame — no individuals on the beach, in the water, or in the distance. The exclusion criterion (“no people”) is fully satisfied. VERDICT: PASS

A cat lying on a couch without a collar

Evaluator (Qwen 3.5 122B)

The image shows a tabby and white cat lying on a gray couch. The cat is looking directly at the camera with a calm expression. Its fur is short and well-groomed, with distinct stripes on its head and back. The cat’s paws are tucked neatly under its body, and its tail is curled around its side. There is no collar visible on the cat, confirming the excluded element is absent. VERDICT: PASS

A classroom with desks and a chalkboard but no students

Evaluator (Qwen 3.5 122B)

The image shows an empty classroom with wooden desks and chairs arranged in rows facing a large black chalkboard mounted on the wall. There are no students visible anywhere in the scene — the room is completely unoccupied. The excluded element (students) is truly absent. VERDICT: PASS

Medium 67%
A sandy beach with palm trees, seashells, and a beach umbrella but no water or ocean visible

Evaluator (Qwen 3.5 122B)

The image shows a sandy beach with palm trees, seashells in the foreground, and a white beach umbrella. However, in the background, beyond the sand and umbrella, there is clearly visible blue water — an ocean or sea — stretching to the horizon under a clear sky. Since the prompt explicitly excluded “no water or ocean visible,” and water is present, the exclusion criterion is violated. VERDICT: FAIL

An office workspace with a desk, chair, bookshelf, and lamp but no computers or screens

Evaluator (Qwen 3.5 122B)

The image shows a clean office workspace with a wooden desk, black rolling chair, tall bookshelf filled with books, and a floor lamp with two shades. There are no computers, monitors, tablets, or any type of screen visible anywhere in the scene. The excluded elements are fully absent. VERDICT: PASS

A kitchen with pots, pans, a stove, and a cutting board but no food

Evaluator (Qwen 3.5 122B)

The image shows a kitchen with blue cabinets, a stove, and multiple stainless steel pots and pans. There is a wooden cutting board on the island. Crucially, there is no food visible in any of the pots, pans, or on the cutting board. The excluded element (food) is truly absent. VERDICT: PASS

Hard 67%
A pepperoni pizza with tomato sauce, basil, and crust but absolutely no cheese

Evaluator (Qwen 3.5 122B)

The image shows a pepperoni pizza with a visible tomato sauce base, scattered green herbs (likely basil), and a golden-brown crust. There is no melted or shredded cheese layer present — the sauce is directly exposed between pepperoni slices. The excluded element (cheese) is indeed absent. VERDICT: PASS

A car body with windows, doors, and headlights but no wheels

Evaluator (Qwen 3.5 122B)

The image shows a car body with windows, doors, and headlight openings — all consistent with the prompt. Crucially, there are no wheels present; the wheel wells are empty and open. The excluded element (wheels) is truly absent. All other described components (windows, doors, headlights as openings) are correctly included. VERDICT: PASS

A human face with eyes, nose, and mouth but no eyebrows

Evaluator (Qwen 3.5 122B)

The image shows a human face with eyes, nose, and mouth. However, the excluded element — eyebrows — is clearly present. Both eyebrows are visible above the eyes, well-defined and natural-looking. Since the prompt explicitly required “no eyebrows,” their presence violates the constraint. VERDICT: FAIL

Relative Position 92%

Relative Position measures whether local/sefi-image-5b-base places objects in the requested spatial relationship, such as above, below, inside, or next to another object.

Easy 100%
A cat sitting on top of a cardboard box

Evaluator (Qwen 3.5 122B)

The image shows a cat sitting directly on top of a cardboard box. The cat is positioned centrally on the upper surface of the box, with its paws resting on the box’s lid. The box sits on a wooden floor against a plain wall. All spatial relationships described in the prompt — specifically “on top of” — are accurately represented. VERDICT: PASS

A red ball underneath a wooden chair

Evaluator (Qwen 3.5 122B)

The image shows a red ball positioned directly beneath the seat of a wooden chair, resting on the floor between the chair legs. The spatial relationship “underneath” is accurately depicted — the ball is below the chair’s seat and within the footprint of the chair’s base. All described positional relationships are correct. VERDICT: PASS

A coffee mug next to an open book on a desk

Evaluator (Qwen 3.5 122B)

The image shows a white coffee mug filled with dark liquid positioned to the right of an open book. Both objects rest on a wooden desk surface. The spatial arrangement matches the prompt: the mug is next to the book, and both are on the desk. All described positional relationships are accurate. VERDICT: PASS

Medium 100%
A bird perched on a branch, the branch extending from a tree next to a park bench

Evaluator (Qwen 3.5 122B)

The image shows a bird perched on a tree branch that extends from a large tree trunk on the right side of the frame. Below and to the left of the branch is a park bench, situated on grass. The spatial relationships described — bird on branch, branch extending from tree, tree next to bench — are all accurately represented. All positional cues (on, next to, below) are correct. VERDICT: PASS

A cat sitting on a pillow, the pillow placed on a wooden chair

Evaluator (Qwen 3.5 122B)

The image shows a cat sitting directly on a white pillow. The pillow is placed on the seat of a wooden chair. All spatial relationships described in the prompt — “cat on pillow,” “pillow on chair” — are accurately represented. The cat is above the pillow, the pillow is above the chair seat, and no conflicting left/right/under/above relationships are present. VERDICT: PASS

A candle on top of a stack of books, the books next to a potted plant

Evaluator (Qwen 3.5 122B)

The image shows a lit candle placed on top of a stack of three books. The books are arranged vertically, with the candle centered on the top book. To the right of the stack is a potted plant in a terracotta pot. All spatial relationships described — “candle on top of books,” “books next to potted plant” — are accurately represented. No positional errors detected. VERDICT: PASS

Hard 67%
Five chess pieces on a board: a king in the center, a queen to its left, a bishop to its right, a rook above the king, and a knight below the king

Evaluator (Qwen 3.5 122B)

The image shows four chess pieces on a board: a white queen on the left, a white king in the center, a dark bishop on the right, and a white knight in front (below) the king. There is no rook visible above the king. The prompt specified five pieces including a rook above the king — this piece is missing. Therefore, not all described positional relationships are correct. VERDICT: FAIL

PASS
A desk scene: laptop in the center, coffee mug to the left, phone to the right, lamp above, and notebook below

Evaluator (Qwen 3.5 122B)

The image shows a desk scene with a laptop centered, a coffee mug to its left, and a phone to its right — all matching the prompt. A lamp is positioned above the laptop, casting light downward. An open notebook lies below the laptop, closer to the viewer. All specified spatial relationships (left, right, above, below) are accurately represented. VERDICT: PASS

PASS
Five animals in a field: a horse on the far left, a cow to the right of the horse, a sheep in the center, a goat to the right of the sheep, and a dog on the far right

Evaluator (Qwen 3.5 122B)

The image shows five animals in a field arranged left to right: a brown horse on the far left, a black-and-white cow to its right, a woolly sheep in the center, a white goat to the sheep’s right, and a tricolor dog on the far right. All specified spatial relationships match the prompt exactly. VERDICT: PASS

Extreme 100%
PASS
A landscape with a red barn in the foreground at large scale, a windmill in the midground at medium scale, and snow-capped mountains in the background at small scale, all with consistent depth perspective

Evaluator (Qwen 3.5 122B)

The image shows a red barn in the foreground, a windmill on a hill in the midground, and snow-capped mountains in the background. The spatial arrangement matches the prompt: barn (foreground, large), windmill (midground, medium), mountains (background, small). Depth perspective is consistent. All positional relationships are correctly represented. VERDICT: PASS

PASS
A beach scene with a sandcastle in the foreground, a surfer on the waves in the midground, and a cruise ship on the horizon in the background, sizes decreasing with distance

Evaluator (Qwen 3.5 122B)

The image correctly depicts a beach scene with the specified spatial arrangement: a sandcastle in the foreground, a surfer on the waves in the midground, and a cruise ship on the horizon in the background. The sizes of the objects decrease with distance as described. All positional relationships are accurate. VERDICT: PASS

PASS
A city park with a park bench and pigeons in the foreground, a fountain in the midground, and skyscrapers in the background, each layer progressively smaller

Evaluator (Qwen 3.5 122B)

The image correctly depicts a city park scene with the specified spatial arrangement. In the foreground, there is a park bench with pigeons on the ground around it. The midground features a large fountain spraying water. In the background, skyscrapers rise above the trees, appearing progressively smaller due to perspective. All described positional relationships are accurate. VERDICT: PASS
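
Every complete evaluator response above closes with a `VERDICT: PASS` or `VERDICT: FAIL` marker. As a minimal sketch of how that marker could be pulled out of the free-text response, assuming the marker format seen on this page (the function name `extract_verdict` is hypothetical and is not part of ImageBench):

```python
import re
from typing import Optional

# Assumed marker format, inferred from the evaluator responses on this page.
VERDICT_RE = re.compile(r"VERDICT:\s*(PASS|FAIL)\b")


def extract_verdict(response: str) -> Optional[str]:
    """Return 'PASS' or 'FAIL' from the last VERDICT marker, or None if absent."""
    matches = VERDICT_RE.findall(response)
    return matches[-1] if matches else None


complete = "All described positional relationships are accurate. VERDICT: PASS"
cut_off = "All described positional relationships are"  # response truncated before the marker

print(extract_verdict(complete))  # PASS
print(extract_verdict(cut_off))   # None
```

A `None` result signals a response that was cut off before the marker, in which case the PASS/FAIL label shown next to the prompt is the only verdict available.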

Scale & Proportions 78%

Scale & Proportions measures whether local/sefi-image-5b-base keeps people, vehicles, buildings, and everyday objects at believable relative sizes.

Easy 67%
PASS
A person standing next to a yellow school bus, showing realistic relative scale

Evaluator (Qwen 3.5 122B)

The person standing next to the school bus appears to be of normal adult height relative to the vehicle. The bus’s proportions — including wheel size, window height, and overall length — are consistent with a real school bus. The person’s stance and shadow placement also align naturally with the scene. No obvious scaling distortions or anatomical inconsistencies are present. VERDICT: PASS

FAIL
A ladybug sitting on a large green leaf, shown at realistic proportions

Evaluator (Qwen 3.5 122B)

The ladybug appears disproportionately large relative to the leaf’s visible surface area and vein structure. In reality, a ladybug is small enough that multiple could fit within a single leaf segment; here, the insect dominates the frame unnaturally, suggesting scale distortion. Leaf texture and veins also appear overly uniform and magnified, inconsistent with natural macro photography proportions. VERDICT: FAIL

PASS
A child holding a basketball, both at correct relative sizes

Evaluator (Qwen 3.5 122B)

The child’s hands are holding the basketball in a natural, proportionate manner — the ball is appropriately sized relative to the child’s torso and arms, consistent with real-world expectations for a young child holding a standard basketball. No anatomical or object-size distortions are present. VERDICT: PASS

Medium 100%
PASS
A tiny person standing on the rim of a coffee cup, looking down into it

Evaluator (Qwen 3.5 122B)

The image shows a miniature figurine of a person standing on the rim of a coffee cup, which aligns with the prompt’s description of “a tiny person standing on the rim of a coffee cup, looking down into it.” The scale is intentionally surreal — the person is small relative to the cup, as expected in such conceptual imagery. There are no proportional inconsistencies within the scene itself; the cup, liquid, and figurine maintain internal visual logic. The reflection in the coffee also matches the

PASS
A giant tabby cat walking between city skyscrapers like a kaiju

Evaluator (Qwen 3.5 122B)

The image depicts a giant tabby cat towering over city skyscrapers, consistent with the "kaiju" description. The cat’s paws are larger than entire buildings, and its body stretches between high-rises, maintaining the intended scale relationship. No proportional inconsistencies are visible — the cat dominates the urban landscape as described. VERDICT: PASS

PASS
A miniature house sitting on the palm of someone's hand

Evaluator (Qwen 3.5 122B)

The image shows a miniature house resting on an open human palm. The house is appropriately sized relative to the hand — small enough to fit comfortably in the palm, with proportions that suggest it’s a model or toy (e.g., windows, roof, chimney are scaled down realistically). The hand appears adult-sized, and the house is roughly the size of a large thumb or small palm object, which is consistent with the prompt’s description. No anatomical or scale inconsistencies are visible. VERDICT: PASS

Hard 67%
PASS
A group photo of an elephant, a horse, a dog, a cat, and a mouse, all standing in a line with correct real-world proportional sizes

Evaluator (Qwen 3.5 122B)

The image displays five animals in a line: elephant, dog, cat, horse, and mouse. The elephant is largest, followed by the horse — which is correctly smaller than the elephant but larger than the dog. The dog is appropriately sized relative to the cat, and the mouse is smallest. All proportions align with real-world expectations. No anatomical or scaling errors are evident. VERDICT: PASS

PASS
A dinner table scene with a plate, fork, knife, wine glass, salt shaker, and peppercorn, all at realistic relative sizes

Evaluator (Qwen 3.5 122B)

The image shows a table setting with a plate, fork, knife, wine glass, salt shaker, and peppercorn shaker. The relative sizes appear realistic: the plate is appropriately large compared to the cutlery; the wine glass is taller than the shakers but not disproportionately so; the shakers are small and consistent with real-world proportions. All objects maintain plausible scale relationships for a dining scene. VERDICT: PASS

FAIL
A farm scene with a barn, tractor, farmer, cow, chicken, and a fence post, all at correct proportional scale

Evaluator (Qwen 3.5 122B)

The image displays a farm scene with a barn, tractor, farmer, cow, and chickens. The relative sizes appear mostly consistent: the barn is large in the background, the tractor is appropriately sized next to it, the farmer is smaller than the cow, and the chickens are small in the foreground. However, the cow’s head appears unnaturally large and distorted compared to its body, and the farmer’s proportions seem slightly off. These inconsistencies violate the requirement for correct proportional sca
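
The tier and subcategory percentages for Scale & Proportions can be reproduced from the nine labels listed above. The following is a sketch, assuming each percentage is the share of PASS labels rounded to the nearest whole number; the rounding rule and the helper name `pass_rate` are assumptions, not documented ImageBench behavior.

```python
# Labels transcribed from the Scale & Proportions listing above.
scale_verdicts = {
    "Easy": ["PASS", "FAIL", "PASS"],
    "Medium": ["PASS", "PASS", "PASS"],
    "Hard": ["PASS", "PASS", "FAIL"],
}


def pass_rate(verdicts):
    """Share of PASS labels as a whole-number percentage (assumed rounding)."""
    return round(100 * verdicts.count("PASS") / len(verdicts))


per_tier = {tier: pass_rate(labels) for tier, labels in scale_verdicts.items()}
all_labels = [label for labels in scale_verdicts.values() for label in labels]

print(per_tier)               # {'Easy': 67, 'Medium': 100, 'Hard': 67}
print(pass_rate(all_labels))  # 78
```

Seven passes out of nine evaluations gives the 78% shown in the subcategory heading.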

Human realism 88%

Faces & Expressions 100%

Faces & Expressions measures whether local/sefi-image-5b-base creates realistic faces with the requested age, emotion, expression, and identity cues.

Easy 100%
PASS
A close-up portrait of a young woman with a warm natural smile

Evaluator (Qwen 3 VL)

Face is symmetric with natural, well-proportioned features. Smile is warm and clearly conveys happiness, with visible teeth and slight cheek lift. No obvious distortions; eyes, nose, and mouth are anatomically plausible. Lighting and skin texture look natural. VERDICT: PASS

PASS
A portrait of a middle-aged man with a calm, serious expression

Evaluator (Qwen 3 VL)

Face is symmetric with natural, correctly proportioned features. Expression is calm and serious, with subtle lines conveying contemplation. No visible distortions. Lighting and texture are realistic. The prompt’s intent is accurately captured. VERDICT: PASS

PASS
A headshot of a teenager with a relaxed, neutral face

Evaluator (Qwen 3 VL)

The face is symmetric with natural, correctly proportioned features. The teenager’s expression is relaxed and neutral, with no discernible emotion. No distortions are visible. Lighting and detail are realistic. VERDICT: PASS

Medium 100%
PASS
A close-up of a person's face showing clear disgust, nose wrinkled and lip curled

Evaluator (Qwen 3 VL)

The face shows strong disgust: eyes squeezed shut, nose wrinkled, lips curled downward. Features are mostly symmetric and anatomically plausible. Slight asymmetry in eyebrow tension and minor skin texture artifacts exist, but core emotion is clearly readable. No major distortions impair recognition. VERDICT: PASS

PASS
A portrait capturing genuine surprise — raised eyebrows, wide eyes, slightly open mouth

Evaluator (Qwen 3 VL)

Face is symmetric with correctly rendered features. The expression clearly conveys genuine surprise: wide eyes, raised eyebrows, and an open mouth. No distortions or anatomical errors are visible. The emotion is vivid and unmistakable. VERDICT: PASS

PASS
A face showing quiet melancholy — downcast eyes, slightly drawn mouth, distant gaze

Evaluator (Qwen 3 VL)

Face is symmetric with natural, correctly rendered features. Downcast eyes and slightly drawn mouth convey quiet melancholy. Distant gaze is evident through lowered eyelids and softened focus. Lighting enhances the somber mood without distorting anatomy. Emotion is clearly readable and aligned with prompt. VERDICT: PASS

Hard 100%
PASS
An elderly woman in her 80s with deep wrinkles and white hair, smiling warmly with kind eyes

Evaluator (Qwen 3 VL)

Face is symmetric with natural, aged features: deep wrinkles, white hair, and warm smile. Eyes convey kindness. No significant distortions. Lighting and texture are realistic. Emotion is clearly readable as joyful and gentle. All prompt elements are accurately rendered. VERDICT: PASS

PASS
A 5-year-old child with round cheeks and large eyes, crying with tears streaming down their face

Evaluator (Qwen 3 VL)

The child’s face is symmetric with natural features: round cheeks, large eyes, and tears streaming down. The crying expression is clearly readable, with furrowed brows, a downturned mouth, and visible tears. No significant distortions are present. The image accurately fulfills the prompt’s requirements. VERDICT: PASS

PASS
An elderly man in his 70s with age spots and thinning gray hair, looking skeptically over reading glasses

Evaluator (Qwen 3 VL)

Face is symmetric with realistic elderly features: age spots, thinning gray hair, and reading glasses. Skeptical expression is clearly readable via furrowed brow and downturned mouth. No significant distortions. Skin texture and lighting are natural. All prompt elements are accurately rendered. VERDICT: PASS

Extreme 100%
PASS
A group of three friends at a table: one laughing with head tilted back, one looking shocked with wide eyes, and one rolling their eyes with a smirk

Evaluator (Qwen 3 VL)

Faces are symmetric with correct anatomical features. Emotions are clearly readable: left person laughs (head back), center looks shocked (wide eyes, open mouth), right person rolls eyes with smirk. No distortions or unnatural artifacts. All three expressions match the prompt precisely. VERDICT: PASS

PASS
A family photo with four people: a father beaming with pride, a mother smiling gently, a teenager looking bored, and a child grinning excitedly

Evaluator (Qwen 3 VL)

Faces are symmetric with correct anatomical features. Emotions are clearly readable: father beams, mother smiles gently, teenager looks bored (hands on face, eyes closed), child grins excitedly. No distortions or unnatural artifacts. All four individuals match the prompt’s emotional descriptions accurately. VERDICT: PASS

PASS
Three coworkers in a meeting: one looking frustrated with furrowed brows, one nodding with a pleased expression, and one looking confused with a tilted head

Evaluator (Qwen 3 VL)

Faces are symmetric with correct features. Emotions are clearly readable: left man frustrated (furrowed brows, hand to face), center man pleased (smiling, thumbs up), right man confused (tilted head, hand to temple). No distortions. All three expressions match the prompt accurately. VERDICT: PASS

Full Body 100%

Full Body measures whether local/sefi-image-5b-base renders full-body people with natural anatomy, clothing, pose, and proportions.

Easy 100%
PASS
A woman standing upright in a relaxed pose, arms at her sides, wearing jeans and a white t-shirt

Evaluator (Qwen 3 VL)

The woman stands upright with arms at her sides, wearing jeans and a white t-shirt as described. Limb count and proportions are anatomically correct. Pose is relaxed and recognizable. No distortions or anomalies are visible. All elements match the prompt precisely. VERDICT: PASS

PASS
A man standing straight facing the camera, hands in pockets, normal proportions

Evaluator (Qwen 3 VL)

The man stands straight, facing forward, with hands in pockets. Limb count and proportions are anatomically correct. The pose is clearly recognizable as described. No distortions are present. All elements match the prompt accurately. VERDICT: PASS

PASS
A person standing in a neutral pose in a park, full body visible from head to feet

Evaluator (Qwen 3 VL)

The person’s full body is visible from head to feet in a neutral standing pose. Limb count and proportions appear anatomically correct. The pose is recognizable as standing still, barefoot on grass. No significant distortions are evident. The scene matches the prompt’s description of a park setting. VERDICT: PASS

Medium 100%
PASS
A woman mid-stride running through a park, hair and clothes showing motion

Evaluator (Qwen 3 VL)

Anatomically correct: limbs and proportions are accurate. Pose is a natural running stride with motion blur in hair and clothing. No distortions. Hair flows dynamically, clothes show realistic movement. Action is clearly recognizable as running. All elements align with the prompt. VERDICT: PASS

PASS
A man captured mid-jump over a puddle, body in the air with bent knees

Evaluator (Qwen 3 VL)

The man’s limbs are correctly counted and proportioned. His pose—bent knees, arms outstretched mid-air—is anatomically plausible for a jump. The action is clearly recognizable: he’s leaping over a puddle, with water splashing beneath him. No significant distortions are visible. The scene captures the dynamic motion accurately. VERDICT: PASS

PASS
A dancer in a dynamic spin, dress flowing outward with one leg extended

Evaluator (Qwen 3 VL)

The dancer’s limbs are correctly counted and proportioned. The pose—dynamic spin with one leg extended—is anatomically plausible and recognizable. Fabric motion suggests rotation. Slight blurring on the extended leg and arms indicates motion, enhancing realism. No major distortions. VERDICT: PASS

Hard 100%
PASS
A person in a yoga tree pose — standing on one leg with the other foot pressed to the inner thigh, arms raised overhead

Evaluator (Qwen 3 VL)

The person stands on one leg with the other foot pressed against the inner thigh, arms raised overhead in prayer position. Limb count and proportions are accurate. The pose is clearly recognizable as Tree Pose (Vrksasana). No anatomical distortions are visible. The body alignment is natural and balanced. VERDICT: PASS

PASS
A rock climber clinging to a cliff face with arms and legs extended to different holds

Evaluator (Qwen 3 VL)

The climber has two arms and two legs, with proportions and limb placement consistent with human anatomy. The pose is recognizable as dynamic rock climbing, with limbs extended to different holds. No significant distortions are visible. The action matches the prompt accurately. VERDICT: PASS

PASS
A cellist seated with correct posture, legs around the instrument, bow arm extended

Evaluator (Qwen 3 VL)

The cellist has correct posture, legs wrapped around the cello, and bow arm extended. Limb count and proportions are anatomically accurate. The pose is clearly recognizable as playing the cello. No distortions are visible. All elements of the prompt are faithfully rendered. VERDICT: PASS

Extreme 100%
PASS
Two tango dancers in a close embrace, the man dipping the woman backward with intertwined legs

Evaluator (Qwen 3 VL)

The dancers’ limbs are correctly counted and proportioned. The woman’s body is arched backward with intertwined legs, matching the “dip” description. The pose is anatomically plausible for tango, with no visible distortions. The man supports her securely, and both wear appropriate attire. The action is clearly recognizable as a tango dip. VERDICT: PASS

PASS
Two soccer players mid-tackle, one sliding and the other jumping over with the ball at their feet

Evaluator (Qwen 3 VL)

Both players show correct limb count and proportions. The sliding player’s pose is anatomically plausible, though his hand placement is slightly unnatural. The jumping player’s posture is dynamic and recognizable as a tackle evasion. No major distortions; motion blur and dirt particles enhance realism. The ball is at the sliding player’s feet as described. VERDICT: PASS

PASS
Two gymnasts performing a synchronized handstand, side by side with identical body alignment

Evaluator (Qwen 3 VL)

Both gymnasts show correct limb count and proportions. Their synchronized handstand pose is anatomically accurate and recognizable. Bodies are symmetrically aligned as requested. No significant distortions are visible. VERDICT: PASS

Hands 58%

Hands measures whether local/sefi-image-5b-base draws anatomically plausible hands, fingers, and gestures.

Easy 33%
FAIL
A person standing with both hands visible at their sides, fingers relaxed and naturally spread

Evaluator (Qwen 3.5 122B)

Upon close inspection, the left hand (viewer’s left) has 5 fingers, but the right hand (viewer’s right) appears to have only 4 visible fingers — the pinky is either missing or fused with the ring finger. The gesture is otherwise relaxed and natural, but the anatomical anomaly in finger count violates the prompt’s requirement for “exactly 5 per hand.” This is a clear structural error. VERDICT: FAIL

PASS
A woman holding her hands out in front of her palms-up, all ten fingers clearly visible

Evaluator (Qwen 3.5 122B)

The image shows a woman with both hands extended forward, palms facing up. Each hand clearly displays five fingers — thumbs and four fingers — totaling ten visible fingers. The anatomy appears correct: fingers are proportionate, joints are natural, and the gesture is anatomically plausible. No extra or missing digits; no fusion or distortion. The focus is sharp on the hands, background blurred. All criteria met. VERDICT: PASS

FAIL
A man with arms crossed, both hands resting on opposite upper arms with fingers visible

Evaluator (Qwen 3.5 122B)

The left hand (on the right side of the image) shows 4 fingers clearly, with the thumb tucked or obscured — not 5 visible fingers. The right hand (left side of image) also appears to show only 4 fingers; the pinky is partially hidden or merged. Anatomically, the grip is plausible for crossed arms, but finger count is inaccurate per prompt requirement (“fingers visible” and “exactly 5 per hand”). Anomaly: missing/obscured fingers violate explicit prompt criteria. VERDICT: FAIL

Medium 33%
FAIL
A person holding a white ceramic coffee mug by the handle with their right hand

Evaluator (Qwen 3.5 122B)

The hand gripping the mug shows anatomical issues. The thumb is positioned unusually high and appears fused or malformed at the base. The fingers curling around the handle are indistinct — the ring and pinky fingers seem merged or missing clear separation. There’s no visible fifth finger (pinky) clearly defined; instead, the hand appears to have only 4 distinct digits. Grip is unnatural for a standard mug hold. VERDICT: FAIL

FAIL
A woman writing in a notebook with a pen gripped correctly in her right hand

Evaluator (Qwen 3.5 122B)

The right hand gripping the pen shows 5 fingers, but the thumb and index finger appear fused or malformed at the tip. The left hand in the background is blurry but appears to have 5 fingers with normal spacing. The pen grip is slightly awkward due to the right hand’s thumb-index anomaly. Overall, anatomical correctness is compromised. VERDICT: FAIL

PASS
A person playing an acoustic guitar with left hand on the fretboard and right hand strumming

Evaluator (Qwen 3.5 122B)

The left hand (on the fretboard) shows only four fingers clearly, with the thumb not visible — which is anatomically plausible for a chord grip. The right hand (strumming area) also shows four fingers; the thumb is partially obscured but likely present. No obvious extra or missing fingers beyond typical playing posture. Grip appears natural for guitar playing. Minor occlusion due to angle, not anomaly. VERDICT: PASS

Hard 100%
PASS
A person making a peace sign with their right hand, clearly showing two fingers up and three curled

Evaluator (Qwen 3.5 122B)

The image shows a single right hand making a peace sign. Two fingers (index and middle) are extended upward, while the thumb, ring finger, and pinky are curled inward — consistent with the gesture described. All five digits are present and anatomically positioned correctly for this pose. No anomalies or extra/missing fingers. The grip is natural and matches the prompt. VERDICT: PASS

PASS
A hand giving a thumbs-up with four fingers curled and thumb extended upward

Evaluator (Qwen 3.5 122B)

The image shows a single hand making a thumbs-up gesture. The thumb is extended upward, and the other four fingers are curled into a fist — this matches the prompt’s description. Anatomically, the hand appears correct: 5 fingers total (1 thumb + 4 curled), proper knuckle alignment, natural skin texture, and no extra or missing digits. No anomalies detected. VERDICT: PASS

PASS
A person counting to three on their fingers, with index, middle, and ring fingers extended

Evaluator (Qwen 3.5 122B)

The image shows one hand making a gesture with three fingers extended: index, middle, and ring fingers. The thumb is curled inward, and the pinky is also curled — consistent with counting to three. Anatomically, this is correct for the described gesture. There are no visible anomalies or extra/missing fingers. The hand appears natural and properly formed. VERDICT: PASS

Extreme 67%
PASS
Two people shaking hands in a professional setting, all fingers of both hands anatomically correct

Evaluator (Qwen 3.5 122B)

The image shows two hands shaking. The left hand (from viewer’s perspective) has 5 fingers clearly visible and anatomically correct. The right hand also displays 5 fingers, all properly formed and positioned in a natural handshake grip. No extra, missing, or malformed fingers are present. The gesture is realistic and consistent with a professional handshake. All anatomical criteria are met. VERDICT: PASS

PASS
Two people doing a high-five, palms meeting with all fingers extended on both hands

Evaluator (Qwen 3.5 122B)

The image shows two hands meeting in a prayer-like or high-five gesture. The left hand clearly displays 5 fingers. The right hand also appears to have 5 fingers, though the thumb is partially obscured by the angle. The gesture is anatomically plausible — fingers are extended and palms are facing each other as described. No obvious anomalies like extra or missing fingers are visible. The lighting and shadows are consistent with real anatomy. VERDICT: PASS

FAIL
One person handing a small gift box to another person, both sets of hands gripping the box correctly

Evaluator (Qwen 3.5 122B)

Upon close examination, the hands show anatomical inconsistencies. The left hand (person in gray suit) has a thumb and fingers that appear fused or malformed near the gift box. The right hand (recipient) shows fingers that are indistinct and possibly extra or merged — particularly the lower hand’s fingers seem to blend into each other unnaturally. Grip is plausible but not fully anatomically correct due to finger anomalies. VERDICT: FAIL

Multi-Subject 100%

Multi-Subject measures whether local/sefi-image-5b-base keeps multiple people distinct while preserving their requested attributes, outfits, and positions.

Easy 100%
PASS
Two women standing side by side: one with short blonde hair wearing a red jacket, the other with long black hair wearing a blue dress

Evaluator (Qwen 3 VL)

Left woman: short blonde hair, red jacket over black top. Right woman: long black hair, blue dress. Both visually distinct and match prompt descriptions precisely. No missing or mismatched elements. VERDICT: PASS

PASS
Two men at a cafe: one with a beard and glasses wearing a green sweater, the other clean-shaven in a black hoodie

Evaluator (Qwen 3 VL)

Person 1: Bearded, wears glasses, green sweater — matches description. Person 2: Clean-shaven, wears black hoodie — matches description. Both are visually distinct and clearly identifiable. All prompt details are accurately rendered. VERDICT: PASS

PASS
Two friends posing: one tall with curly red hair in denim overalls, the other short with straight brown hair in a yellow sundress

Evaluator (Qwen 3 VL)

Tall person: curly red hair, denim overalls over a white tee. Short person: straight brown hair, yellow sundress. Both visually distinct and match descriptions precisely. No missing or mismatched elements. VERDICT: PASS

Medium 100%
PASS
Four people at a dinner table: a bald man in a suit, a woman with silver hair in a red blouse, a young man with dreadlocks in a denim jacket, and a woman with a hijab in a green dress

Evaluator (Qwen 3 VL)

Bald man in suit: present, distinct. Silver-haired woman in red blouse: present, distinct. Young man with dreadlocks in denim jacket: present, distinct. Woman in green hijab and dress: present, distinct. All match descriptions precisely. VERDICT: PASS

PASS
Three musicians on stage: a tall woman with short pink hair playing bass, a man with a beard and bandana on drums, and a slim man with glasses playing keyboard

Evaluator (Qwen 3 VL)

Tall woman: pink bob haircut, playing bass. Man with beard/bandana: behind drums. Slim man with glasses: playing keyboard. All three are visually distinct and match descriptions precisely. VERDICT: PASS

PASS
Three hikers on a trail: a man with a backpack and sunhat, a woman with braided hair and hiking poles, and a teenager with a baseball cap and water bottle

Evaluator (Qwen 3 VL)

Man: wears sunhat and backpack. Woman: has braided hair, uses hiking poles, carries hydration pack. Teenager: wears baseball cap, holds water bottle, carries backpack. All three are visually distinct and match described features. VERDICT: PASS
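
The 88% shown for Human realism is consistent with pooling every evaluation in its four subcategories instead of averaging the four subcategory percentages. The sketch below checks both options using counts taken from the listings above; which aggregation ImageBench actually uses is not stated on this page, so treat the pooled formula as an inference.

```python
# (passes, total) per subcategory, counted from the Human realism listings above.
human_realism = {
    "Faces & Expressions": (12, 12),
    "Full Body": (12, 12),
    "Hands": (7, 12),
    "Multi-Subject": (6, 6),
}

passes = sum(p for p, _ in human_realism.values())
total = sum(n for _, n in human_realism.values())

# Pooled rate: every evaluation counts equally.
pooled = round(100 * passes / total)

# Macro average: every subcategory counts equally.
macro = round(100 * sum(p / n for p, n in human_realism.values()) / len(human_realism))

print(passes, total, pooled)  # 37 42 88
print(macro)                  # 90
```

Only the pooled figure (37 of 42) matches the displayed 88%, so smaller subcategories such as Multi-Subject appear to carry proportionally less weight in the category score.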

Truthfulness 70%

Photorealism 100%

Photorealism measures whether local/sefi-image-5b-base makes materials, surfaces, and real-world objects look visually plausible and distinct.

Medium 100%
PASS
A still life on a wooden table featuring a clear glass of water, a polished silver spoon, a folded linen napkin, and a ripe red apple — each material rendered with accurate texture and light behavior

Evaluator (Qwen 3.6 27B)

The image accurately renders the requested materials. The glass shows proper refraction and transparency, the spoon exhibits metallic reflectivity, the napkin displays a soft fabric texture, the apple has a realistic skin sheen, and the table shows a natural wood grain. Lighting and shadows are consistent across all objects. VERDICT: PASS

PASS
A workspace desk with a brushed aluminum laptop, a leather-bound notebook, a ceramic coffee mug, and a glass paperweight — all materials visually distinct and realistic

Evaluator (Qwen 3.6 27B)

The image successfully renders the requested materials with high fidelity. The laptop exhibits a convincing brushed aluminum texture with appropriate metallic sheen. The notebook displays realistic leather grain and stitching. The ceramic mug has a matte, opaque finish, while the glass paperweight shows correct refraction and transparency. The wooden desk features a natural grain pattern. All materials are visually distinct and realistic. VERDICT: PASS

PASS
A kitchen counter with a stainless steel pot, a wooden cutting board, a glass bottle of olive oil, and a cotton dish towel — each surface reflecting light correctly

Evaluator (Qwen 3.6 27B)

The image accurately renders the distinct material properties requested. The stainless steel pot exhibits sharp, metallic reflections and a smooth surface. The glass bottle shows appropriate translucency and refraction of the olive oil inside. The wooden cutting board displays natural grain and matte texture, while the cotton towel features a soft, fibrous weave. The countertop’s glossy reflection correctly mirrors the objects above it, demonstrating high fidelity in material simulation. VERDICT: PASS

Physics & Reflections 75%

Physics & Reflections measures whether local/sefi-image-5b-base follows basic physical cues such as shadows, reflections, gravity, transparency, and contact points.

Easy 67%
PASS
A red sphere on a white surface with a single shadow cast to the right, consistent with a light source from the upper left

Evaluator (Qwen 3.6 27B)

The shadow is cast to the right and slightly downward, which is consistent with a light source from the upper left. The shadow’s soft edges and gradual fade suggest a diffused light source, which is physically plausible. The red sphere’s highlight on the upper left further confirms the lighting direction. No distortions or color anomalies are present in the shadow or reflection. VERDICT: PASS

PASS
A coffee mug on a sunny windowsill casting a sharp shadow to the left

Evaluator (Qwen 3.6 27B)

The lighting is consistent, with the sun coming from the right window. The mug casts a sharp, dark shadow to the left, which aligns perfectly with the light source. The shadow's shape accurately reflects the mug's form, including the handle. The warm, golden hue of the sunlight on the white surfaces is realistic. There are no distortions or impossible reflections. VERDICT: PASS

FAIL
A person standing outdoors at midday with a short shadow directly below them

Evaluator (Qwen 3.6 27B)

The shadow is cast directly forward from the subject’s feet, which contradicts the “midday” condition where shadows should be short and nearly directly beneath the person. Additionally, the shadow appears unnaturally elongated and distorted — resembling a stretched silhouette rather than a realistic projection under overhead sunlight. The orientation and length are inconsistent with midday lighting physics. VERDICT: FAIL
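The midday prompt rests on simple geometry: on flat ground, a vertical object of height h casts a shadow of length h / tan(elevation), so a high sun gives a short shadow. A minimal sketch of that relation follows. The 1.7 m height and the three elevation angles are illustrative assumptions, not values from the benchmark.

```python
import math


def shadow_length(height_m: float, sun_elevation_deg: float) -> float:
    """Shadow length on flat ground for a vertical object of the given height."""
    if not 0 < sun_elevation_deg <= 90:
        raise ValueError("sun elevation must be in (0, 90] degrees")
    return height_m / math.tan(math.radians(sun_elevation_deg))


person_height = 1.7  # hypothetical height in metres

# Low sun (20 degrees) versus progressively higher sun.
for elevation in (20, 45, 75):
    print(elevation, round(shadow_length(person_height, elevation), 2))
```

The shadow is as long as the person at 45 degrees and shrinks well below that as the sun climbs, which is the cue the evaluator expected to see.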

Medium 67%
FAIL
A woman standing in front of a full-length mirror, her reflection accurately showing the back of her outfit and reversed orientation

Evaluator (Qwen 3.6 27B)

The reflection is physically impossible. The woman is facing the mirror, so her reflection should show her front. However, the reflection shows her front while the real woman’s back is visible — this is correct for orientation. But the reflection’s pose and body alignment do not match a true mirror image; specifically, the reflection appears to be facing forward as if it were a separate person, not a mirrored version. Additionally, the reflection’s clothing and posture are inconsistent with a tr

PASS
A cat sitting on a polished marble floor with its reflection visible below

Evaluator (Qwen 3.6 27B)

The reflection is physically plausible. It is positioned directly beneath the cat, matching the expected orientation for a horizontal reflective surface. The reflection shows appropriate distortion and blurring consistent with a polished but not perfectly mirror-like floor. The colors and lighting in the reflection align with the real cat above, and the direction of light sources (evident from highlights on the floor) supports the reflection’s appearance. No obvious artifacts or inconsistencies

PASS
A vase of flowers on a glass table with a clear reflection visible on the table surface

Evaluator (Qwen 3.6 27B)

The reflection is vertically inverted and correctly positioned beneath the vase, matching the expected behavior on a horizontal glass surface. The colors and details in the reflection align with the original bouquet, though slightly softened due to the glass medium—this is physically accurate. The stems and water line are mirrored appropriately. No directional or orientation errors are present. VERDICT: PASS

Hard 67%
PASS
A row of trees reflected in a calm lake at sunset, with the reflection slightly darker and gently rippled

Evaluator (Qwen 3.6 27B)

The reflections are vertically inverted and aligned directly beneath the trees, which is physically correct. The water surface shows gentle ripples that distort the reflection appropriately, creating a slightly darker and softer appearance compared to the actual trees. The colors in the reflection match the sky’s gradient, maintaining consistency with the sunset lighting. No directional or orientation errors are present. VERDICT: PASS

FAIL
A city skyline reflected in a rain puddle on a street, with mild distortion and color shift

Evaluator (Qwen 3.6 27B)

The image depicts a city skyline that is physically impossible. The buildings are oriented upside down relative to the ground plane, with their spires pointing toward the viewer and their bases merging into the sky. This violates the laws of physics regarding how reflections work in a puddle; a reflection should mirror the upright objects above it, not invert the entire world. The orientation is fundamentally incorrect. VERDICT: FAIL

PASS
A mountain range reflected in a still alpine lake, with the reflection inverted and subtly muted in color

Evaluator (Qwen 3.6 27B)

The reflection is perfectly inverted and mirrors the mountain range accurately, consistent with a still water surface. The colors in the reflection are slightly darker and more muted than the actual mountains, which is physically plausible due to light absorption and scattering in water. There is no unnatural distortion or directional mismatch. The overall composition adheres to real-world optical principles for reflections on calm water. VERDICT: PASS
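The water-reflection prompts in this tier all describe the same expectation: the scene appears flipped top to bottom beneath the waterline, with left and right unchanged, and somewhat darker. A toy sketch of that transform on a tiny pixel grid follows. The function name, the `dim` factor of 0.8, and the sample colours are assumptions for illustration only.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]


def reflect_in_water(rows: List[List[Pixel]], dim: float = 0.8) -> List[List[Pixel]]:
    """Flip the scene top-to-bottom and darken every channel by `dim`."""
    return [
        [tuple(int(channel * dim) for channel in pixel) for pixel in row]
        for row in reversed(rows)
    ]


# Three rows from top to bottom: sunset sky, mountain, shoreline (hypothetical colours).
scene = [
    [(200, 100, 50), (210, 110, 60)],
    [(90, 90, 90), (120, 120, 120)],
    [(30, 60, 30), (40, 70, 40)],
]

reflection = reflect_in_water(scene)
for row in scene + reflection:
    print(row)
```

The shoreline row sits immediately above its own reflection, and the sky ends up at the bottom of the frame, which is the inverted layout the evaluator describes.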

Extreme 100%
PASS
A clear glass sphere sitting on a mirror surface near a pool of water, showing refraction through the sphere, reflection on the mirror, and water reflection simultaneously

Evaluator (Qwen 3.6 27B)

The image displays a glass sphere on a reflective surface with a pool in the background. The reflection of the sphere is vertically aligned and correctly inverted, showing appropriate distortion consistent with a convex surface reflecting onto a flat mirror. The refraction through the sphere captures the background elements (pool edge and sky) with realistic inversion and magnification. The lighting and shadows are consistent with an outdoor setting, and the colors match the environment. The phy

PASS
A chrome sculpture in a room with a glass floor and large windows, showing multiple reflections and environmental mapping on every surface

Evaluator (Qwen 3.6 27B)

The reflections are physically plausible. The chrome sculpture accurately mirrors the surrounding environment, including the windows and room structure, with appropriate distortion due to its curved surface. The glass floor creates a clear, inverted reflection of the sculpture, consistent with the light sources and perspective. Shadows and highlights align with the lighting from the windows, enhancing realism. The environmental mapping on the sculpture’s surface is detailed and coherent. VERDICT: PASS

PASS
A crystal wine glass on a polished marble counter next to a window overlooking a lake — showing refraction in the glass, reflection on the marble, and the lake's own reflective surface

Evaluator (Qwen 3.6 27B)

The image displays physically plausible optics. The reflection on the marble is correctly oriented and positioned directly beneath the glass, mirroring its shape with appropriate perspective distortion. The refraction within the crystal glass accurately inverts and distorts the background scenery, consistent with light passing through curved glass. The lake’s surface reflects the sky and distant trees with a soft, realistic blur, matching the depth of field. Lighting direction is consistent acro

World Knowledge 58%

World Knowledge measures whether local/sefi-image-5b-base depicts recognizable places, landmarks, flags, and known objects with accurate real-world details.

Easy 100%
PASS
The Eiffel Tower in Paris on a clear day, architecturally accurate with its iron lattice structure and three levels

Evaluator (Qwen 3.6 27B)

The image depicts the Eiffel Tower with high architectural accuracy, including the iron lattice structure, three levels, and surrounding Parisian landscape. The sky is clear, matching the prompt. Minor details like distant buildings and trees are plausible. No significant distortions or inaccuracies are visible. The subject is clearly recognizable and factually correct. VERDICT: PASS

PASS
The Taj Mahal in Agra seen from the reflecting pool, with accurate dome, minarets, and symmetry

Evaluator (Qwen 3.6 27B)

The image depicts the Taj Mahal with high fidelity. The central dome, four minarets, and the reflecting pool are accurately represented. The symmetry is perfect, matching the real-world structure. The architectural details, such as the arches and intricate carvings, are well-rendered. The lighting and reflections in the pool enhance the realism. Overall, the depiction is both recognizable and factually accurate. VERDICT: PASS

PASS
The Statue of Liberty on her island pedestal, with correct torch, crown, tablet, and green patina

Evaluator (Qwen 3.6 27B)

The image depicts the Statue of Liberty with accurate architectural details: the green patina, seven-pointed crown, raised torch, and tablet are all correctly rendered. The pedestal structure matches the real-world design. The statue is recognizable and factually accurate in its key features. No significant inaccuracies are present. VERDICT: PASS

Medium 33%
FAIL
A traditional Japanese shamisen instrument with accurate three-stringed design, wooden body, and bachi plectrum

Evaluator (Qwen 3.6 27B)

The image depicts a shamisen-like instrument, but it contains significant inaccuracies. A traditional shamisen has a rectangular body with a skin-covered soundboard, not a round wooden one like a banjo or lute. The tuning pegs are typically arranged differently, and the bachi (plectrum) is usually held in the hand, not attached to the instrument. These structural errors make it factually inaccurate. VERDICT: FAIL

FAIL
A medieval astrolabe with accurate rete, tympan, and engraved degree markings on brass

Evaluator (Qwen 3.6 27B)

The image depicts a brass astrolabe with recognizable components like the rete, mater, and alidade. However, the engraved markings are nonsensical gibberish rather than accurate degree scales or zodiac symbols, which is a common AI artifact. While the general form is correct, the lack of factual accuracy in the engravings fails the specific requirement for "accurate... engraved degree markings." VERDICT: FAIL

PASS
A Japanese kintsugi bowl — cracked ceramic repaired with visible gold lacquer lines

Evaluator (Qwen 3.6 27B)

The image accurately depicts a Japanese kintsugi bowl. The subject is clearly recognizable as a ceramic vessel with visible cracks repaired using gold lacquer, consistent with the traditional art form. The texture, glaze, and application of gold appear realistic and factually accurate to real-world examples. No architectural or factual inaccuracies are present. VERDICT: PASS

Hard 0%
FAIL
A detailed diagram of the water cycle showing evaporation from the ocean, condensation forming clouds, precipitation as rain, and collection in rivers flowing back to the sea

Evaluator (Qwen 3.6 27B)

The image depicts the water cycle with recognizable elements like evaporation, condensation, precipitation, and collection. However, the text labels are nonsensical gibberish rather than accurate scientific terms (e.g., "WHIT HJRAELTR" instead of Evaporation). This makes the diagram factually inaccurate as an educational tool. VERDICT: FAIL

FAIL
A cross-section showing tectonic plates at a convergent boundary with subduction, a trench, volcanic arc, and layered mantle

Evaluator (Qwen 3.6 27B)

The image depicts a geological cross-section, but it contains significant factual inaccuracies regarding the requested subject. The prompt asks for a "convergent boundary with subduction," yet the image shows two plates pulling apart (divergent boundary) with magma rising from a rift. There is no subduction zone, trench, or volcanic arc visible; instead, it illustrates seafloor spreading. Therefore, the subject is not factually accurate relative to the prompt's specific geological requirements. VERDICT: FAIL

FAIL
An anatomical illustration of the human heart showing four chambers, valves, aorta, pulmonary arteries, and veins with correct placement

Evaluator (Qwen 3.6 27B)

The image is clearly recognizable as a human heart, but it contains significant anatomical inaccuracies. The great vessels are misshapen and incorrectly positioned—the aorta and pulmonary artery lack proper branching structure, and the superior/inferior vena cava are not accurately depicted. The atria and ventricles are vaguely suggested but lack clear chamber definition or valve placement. While stylistically rendered, it fails factual accuracy for an anatomical illustration. VERDICT: FAIL

Extreme 100%
PASS
A Morpho menelaus butterfly with accurate iridescent blue dorsal wings, brown ventral wing pattern with eyespots, and correct body morphology

Evaluator (Qwen 3.6 27B)

The image depicts a butterfly with the characteristic iridescent blue dorsal wings and brown ventral patterns with eyespots of a Morpho menelaus. The wing venation, body structure, and coloration are anatomically correct and visually consistent with real-world specimens. The symmetry and detail in the wing patterns further enhance its realism. No significant inaccuracies are observed. VERDICT: PASS

PASS
The Hagia Sophia interior showing the massive central dome, pendentives, semi-domes, and Islamic calligraphy medallions alongside Byzantine mosaics

Evaluator (Qwen 3.6 27B)

The image accurately depicts the interior of the Hagia Sophia, featuring the massive central dome, semi-domes, and pendentives. The Islamic calligraphy medallions are correctly placed and stylized. The architectural details, including the windows and arches, are consistent with the actual structure. The Byzantine mosaics are not prominently visible in this view, but the overall composition is factually and architecturally accurate. VERDICT: PASS

PASS
A specimen of bismuth crystal showing accurate iridescent oxide layers, geometric hopper crystal formation, and characteristic staircase structure

Evaluator (Qwen 3.6 27B)

The image accurately depicts a bismuth crystal specimen with characteristic geometric hopper formations and staircase structures. The iridescent oxide layers display realistic rainbow hues (blues, greens, purples) consistent with real bismuth crystals. The crystalline geometry, metallic luster, and surface textures are factually accurate representations of this mineral. No significant architectural or factual inaccuracies are present. VERDICT: PASS
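The Truthfulness figure of 70% can be reproduced from the verdicts listed above. The sketch below tallies passes per subcategory and compares two plausible ways of combining them. The assumption, which the page does not state, is that the category score pools all verdicts rather than averaging the subcategory percentages. The function names are illustrative.

```python
from typing import Dict, Tuple

# (passes, total) counted from the PASS/FAIL labels in this category.
truthfulness: Dict[str, Tuple[int, int]] = {
    "Photorealism": (3, 3),
    "Physics & Reflections": (9, 12),
    "World Knowledge": (7, 12),
}


def pooled_rate(groups: Dict[str, Tuple[int, int]]) -> float:
    """Pass rate over all verdicts, so larger subcategories weigh more."""
    passes = sum(p for p, _ in groups.values())
    total = sum(t for _, t in groups.values())
    return 100 * passes / total


def macro_rate(groups: Dict[str, Tuple[int, int]]) -> float:
    """Unweighted mean of the subcategory pass rates."""
    return sum(100 * p / t for p, t in groups.values()) / len(groups)


print(round(pooled_rate(truthfulness)))  # 70
print(round(macro_rate(truthfulness)))   # 78
```

Pooling gives 19 of 27, which rounds to the 70% shown in the heading, while the unweighted mean would give 78%. On this evidence the pooled reading fits the page better.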

Professional Studio 93%

Camera & Lighting 92%

Camera & Lighting measures whether local/sefi-image-5b-base follows specific lens, angle, focus, exposure, and lighting instructions.

Easy 100%
PASS
A wide-angle shot of a modern living room interior showing expansive space with slight barrel distortion at the edges

Evaluator (Gemma 4 26B)

The image features a wide-angle perspective of a modern living room. It exhibits extreme barrel distortion, appearing as a fisheye or circular panorama, which aligns with the prompt's request for "slight barrel distortion" but actually goes much further. The lighting is even and bright, consistent with a modern interior. While the distortion is much more aggressive than "slight," the core elements of wide-angle perspective and the described setting are present. VERDICT: PASS

PASS
A close-up macro shot of a single dewdrop on a rose petal, shallow depth of field

Evaluator (Gemma 4 26B)

The image features a close-up macro shot of a pink rose petal. The central focus is a single, large, clear dewdrop that exhibits high detail and refraction. The depth of field is very shallow, with the background and foreground petals being heavily blurred, which is consistent with macro photography. The lighting is soft and natural, highlighting the texture of the petal and the clarity of the water droplet. VERDICT: PASS

PASS
An aerial bird's-eye view looking straight down at a traffic roundabout with cars and trees

Evaluator (Gemma 4 26B)

The image perfectly matches the prompt. It features a true bird's-eye view (nadir perspective) looking straight down at a traffic roundabout. The perspective is flat and top-down, which is consistent with the "looking straight down" instruction. The lighting is even and diffuse, typical of an aerial shot, and the depth of field is deep, keeping the entire scene from the center to the edges in sharp focus. VERDICT: PASS

Medium 67%
FAIL
A portrait lit with classic Rembrandt lighting — triangle of light on the shadowed cheek, moody atmosphere

Evaluator (Gemma 4 26B)

The image features a portrait with a moody atmosphere. However, it fails to meet the specific requirement for Rembrandt lighting. In true Rembrandt lighting, there should be a distinct, small triangle of light on the shadowed cheek. In this image, the light is a broad, harsh patch that covers a large portion of the face, lacking the characteristic light triangle on the dark side. VERDICT: FAIL

PASS
A landscape photograph bathed in golden hour light — long warm shadows, orange sky, soft glow

Evaluator (Gemma 4 26B)

The image perfectly matches the prompt's requirements. The lighting is a classic golden hour scene with a low-hanging sun creating a bright, soft glow. The sky is a vibrant orange gradient, and the landscape features long, dramatic warm shadows stretching toward the viewer. The perspective is a wide landscape shot with a deep depth of field, capturing the vastness of the field and the distant hills. VERDICT: PASS

PASS
A high-key portrait of a person against a bright white background, minimal shadows, airy and luminous

Evaluator (Gemma 4 26B)

The image perfectly matches the prompt. It features a high-key portrait with a bright, white, overexposed background. The lighting is extremely soft and luminous, creating an airy feel with virtually no visible shadows on the subject's face. The perspective is a direct, centered portrait, and the overall aesthetic is bright and ethereal as requested. VERDICT: PASS

Hard 100%
PASS
A portrait shot at 85mm f/1.4 with creamy bokeh background, backlit with warm rim light separating the subject from a blurred garden

Evaluator (Gemma 4 26B)

The image perfectly matches the technical specifications. The portrait features a shallow depth of field with a very creamy, blurred bokeh background, consistent with an 85mm f/1.4 lens. The lighting is a clear backlit setup, featuring a strong, warm golden rim light that highlights the subject's hair and shoulder, effectively separating her from the blurred garden background. VERDICT: PASS

PASS
A street photograph with shallow depth of field as if shot at 50mm f/1.8, subject sharp with soft city lights in the background

Evaluator (Gemma 4 26B)

The image perfectly matches the prompt's requirements. The subject is sharp and in focus, while the background features a shallow depth of field with soft, circular bokeh from city lights. The perspective is consistent with a 50mm lens, and the lighting captures a naturalistic street photography aesthetic during twilight/evening. VERDICT: PASS

PASS
A food photograph shot from above at 35mm f/2.8 with natural window light from the left, slight vignette, warm color temperature

Evaluator (Gemma 4 26B)

The image matches all the specified criteria. The perspective is a top-down shot (shot from above). The lighting is natural light coming from a window on the left side of the frame, creating a clear light-to-dark gradient. The color temperature is warm, and there is a noticeable vignette around the edges of the plate. The depth of field is consistent with a wide aperture (f/2.8), with the edges of the plate slightly softer than the center. VERDICT: PASS
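The three prompts in this tier name lens settings (85mm f/1.4, 50mm f/1.8, 35mm f/2.8) that imply very different amounts of background blur. A rough sketch of the standard thin-lens depth-of-field estimate shows why. The 0.03 mm circle of confusion (a common full-frame convention) and the 2 m subject distance are assumptions, since the prompts give neither.

```python
def depth_of_field_mm(
    focal_mm: float,
    f_number: float,
    subject_mm: float,
    coc_mm: float = 0.03,  # assumed circle of confusion for a full-frame sensor
) -> float:
    """Approximate total depth of field from the hyperfocal distance."""
    hyperfocal = focal_mm**2 / (f_number * coc_mm) + focal_mm
    if subject_mm >= hyperfocal:
        return float("inf")  # everything out to infinity is acceptably sharp
    near = subject_mm * (hyperfocal - focal_mm) / (hyperfocal + subject_mm - 2 * focal_mm)
    far = subject_mm * (hyperfocal - focal_mm) / (hyperfocal - subject_mm)
    return far - near


subject = 2000.0  # hypothetical subject distance: 2 m, in millimetres

for focal, aperture in ((85, 1.4), (50, 1.8), (35, 2.8)):
    dof = depth_of_field_mm(focal, aperture, subject)
    print(f"{focal}mm f/{aperture}: about {dof:.0f} mm in focus")
```

Under these assumptions the 85mm f/1.4 setting keeps only a few centimetres sharp, while 35mm f/2.8 keeps roughly half a metre, which matches the evaluator's expectation of creamy bokeh for the portrait and only slight edge softness for the overhead food shot.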

Extreme 100%
PASS
A tilt-shift photograph of a city street making real-life look like a miniature model — selective focus strip, saturated colors

Evaluator (Gemma 4 26B)

The image perfectly captures the tilt-shift effect. It features a high-angle perspective that makes the city street look like a miniature model. The depth of field is highly controlled, with a sharp selective focus strip along the center of the street and heavy blur (bokeh) in the foreground and background. Additionally, the colors are highly saturated, enhancing the toy-like appearance as requested. VERDICT: PASS

PASS
A cinematic frame with blue and orange anamorphic lens flare streaking horizontally across a dark moody scene

Evaluator (Gemma 4 26B)

The image features a dark, moody scene with a prominent horizontal light streak. The streak transitions from a bright blue on the left to a vibrant orange on the right, perfectly matching the "blue and orange anamorphic lens flare" description. The light streaks horizontally across the center of the frame as requested. VERDICT: PASS

PASS
A portrait with dramatic split lighting — half the face brightly lit, the other half in deep shadow, high contrast chiaroscuro

Evaluator (Gemma 4 26B)

The image perfectly matches the prompt's requirements. It features a close-up portrait with a clear split lighting pattern: the left side of the face is brightly illuminated, while the right side is cast in deep, heavy shadow. This creates the high-contrast chiaroscuro effect requested. The perspective is a standard portrait shot, and the lighting pattern is executed precisely as described. VERDICT: PASS

Color Precision 100%

Color Precision measures whether local/sefi-image-5b-base renders requested colors clearly and consistently across subjects, products, and backgrounds.

Easy 100%
PASS
A bright red sports car parked on a gray asphalt road — the red is vivid and unambiguous

Evaluator (Gemma 4 26B)

The image features a bright, vivid red sports car parked on a gray asphalt road. The red color of the car is intense, saturated, and unambiguous, perfectly matching the description provided in the prompt. The gray of the asphalt is also consistent with the requirement. VERDICT: PASS

PASS
A woman wearing a royal blue evening dress against a neutral background

Evaluator (Gemma 4 26B)

The image features a woman wearing a vibrant, deep blue evening dress that perfectly matches the description of "royal blue." The background is a solid, light gray, which qualifies as a "neutral background." All color requirements specified in the prompt are accurately represented in the generated image. VERDICT: PASS

PASS
A field of pure sunflower yellow flowers under a clear blue sky

Evaluator (Gemma 4 26B)

The image features a vast field of bright, sunflower yellow flowers. The petals are a vibrant, saturated yellow, and the centers are a dark brownish-yellow. Above the field, there is a clear, bright blue sky. The colors perfectly match the prompt's description of "pure sunflower yellow flowers" and a "clear blue sky." VERDICT: PASS

Medium 100%
PASS
A bedroom wall painted in #FF6B35 (burnt orange) with white furniture and natural daylight

Evaluator (Gemma 4 26B)

The wall is a vibrant burnt orange, which aligns with the color name and the hex code #FF6B35. The furniture (bed, chair, and side table) is white, and the room is filled with bright, natural daylight coming from the window on the left. All color and lighting requirements are met. VERDICT: PASS

PASS
A forest scene where the foliage is precisely #2D5F2D (dark muted green) with brown tree trunks

Evaluator (Gemma 4 26B)

The image features a forest scene with dark, muted green foliage and brown tree trunks. The foliage color is a deep, desaturated green that aligns closely with the requested hex code #2D5F2D (a dark, muted forest green). The tree trunks are a natural brown. The colors accurately reflect the specific color requirements provided in the prompt. VERDICT: PASS

PASS
A room with walls in #4A90D9 (medium blue) with warm wooden floor and white trim

Evaluator (Gemma 4 26B)

The image accurately reflects the prompt's requirements. The walls are a medium blue color consistent with #4A90D9, the floor is a warm wood tone, and the baseboard trim is white. The colors and materials match the specific descriptions provided. VERDICT: PASS

Hard 100%
PASS
A minimalist room with walls in #F5E6CC (warm cream), a sofa in #2C3E50 (dark navy), and accent pillows in #E74C3C (bright red)

Evaluator (Gemma 4 26B)

The image accurately reflects the requested colors: - The walls are a warm cream color, matching #F5E6CC. - The sofa is a dark navy blue, matching #2C3E50. - The accent pillows are a bright red, matching #E74C3C. All specified colors and hex-related descriptions are correctly represented in the generated image. VERDICT: PASS

PASS
A product flat-lay on a #1A1A2E (deep dark blue) background, with a #F0C27F (soft gold) watch and #FFFFFF (pure white) earbuds

Evaluator (Gemma 4 26B)

The image accurately reflects the requested colors. The background is a deep, dark navy blue consistent with #1A1A2E. The watch features a soft gold/champagne tone matching #F0C27F. The earbuds are a bright, pure white consistent with #FFFFFF. All three color requirements are met precisely. VERDICT: PASS

PASS
A graphic design composition with background #FAFAFA (near-white), main shape #FF4500 (orange-red), and accent lines #333333 (dark charcoal)

Evaluator (Gemma 4 26B)

The image accurately reflects the requested color palette. The background is a near-white (#FAFAFA), the main geometric shape is a vibrant orange-red (#FF4500), and the accent lines are a dark charcoal gray (#333333). The colors match the specific hex values and descriptions provided in the prompt. VERDICT: PASS
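The verdicts in this tier come from a VLM judging colour by eye, so "matches #FF4500" is a visual impression, not a measurement. For readers who want to spot-check an image themselves, here is a minimal sketch of a numeric comparison. The sampled pixel values, the tolerance of 40, and the use of plain RGB distance are all assumptions; nothing here is part of the ImageBench methodology.

```python
import math
from typing import Tuple

RGB = Tuple[int, int, int]


def hex_to_rgb(code: str) -> RGB:
    """Convert '#RRGGBB' to an (R, G, B) tuple of ints."""
    code = code.lstrip("#")
    if len(code) != 6:
        raise ValueError("expected a six-digit hex colour")
    return (int(code[0:2], 16), int(code[2:4], 16), int(code[4:6], 16))


def rgb_distance(a: RGB, b: RGB) -> float:
    """Euclidean distance in RGB space (a crude stand-in for perceptual difference)."""
    return math.dist(a, b)


def matches(target_hex: str, sampled: RGB, tolerance: float = 40.0) -> bool:
    """True when the sampled pixel lies within `tolerance` of the target colour."""
    return rgb_distance(hex_to_rgb(target_hex), sampled) <= tolerance


sampled_shape = (250, 75, 10)  # hypothetical pixel sampled from the main shape
print(matches("#FF4500", sampled_shape))  # True
print(matches("#333333", sampled_shape))  # False
```

RGB distance is easy to compute but not perceptually uniform. A stricter check would convert to a perceptual colour space before comparing.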

Extreme 100%
PASS
A smooth gradient on a wall going from #FF0000 (red) on the left to #0000FF (blue) on the right, transitioning evenly through purple

Evaluator (Gemma 4 26B)

The image displays a smooth gradient transitioning from a pure red on the left to a pure blue on the right. The transition passes through a purple middle section, exactly as described. The colors match the requested hex values: #FF0000 (red) on the left and #0000FF (blue) on the right, with an even transition through purple. VERDICT: PASS

PASS
A sunset sky gradient on a flat surface from #FF6B35 (orange) at the bottom to #1B1464 (deep indigo) at the top

Evaluator (Gemma 4 26B)

The image displays a vertical gradient. At the bottom, there is a vibrant orange color that closely aligns with the specified #FF6B35. Moving upward, the color transitions through a lighter peach/pinkish hue into a deep, dark purple/indigo at the top, which matches the description of #1B1464. The gradient transition and the specific color targets are accurately represented. VERDICT: PASS

PASS
A horizontal banner with a gradient from #00C9FF (cyan) on the left to #92FE9D (mint green) on the right

Evaluator (Gemma 4 26B)

The image shows a horizontal banner with a smooth gradient. On the left side, the color is a bright cyan, which matches the description of #00C9FF. On the right side, the color transitions into a light, pale mint green, which matches the description of #92FE9D. The color transition is accurate to the requested hex values and named colors. VERDICT: PASS
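The gradient prompts specify only the two end colours, and the red-to-blue prompt adds that the transition should pass "evenly through purple". A short sketch of per-channel linear interpolation in RGB shows what an even blend implies at the midpoint. Linear RGB interpolation is an assumption about what "evenly" means, and the helper names are illustrative.

```python
from typing import Tuple

RGB = Tuple[int, int, int]


def hex_to_rgb(code: str) -> RGB:
    """Convert '#RRGGBB' to an (R, G, B) tuple of ints."""
    code = code.lstrip("#")
    return (int(code[0:2], 16), int(code[2:4], 16), int(code[4:6], 16))


def gradient_hex(start_hex: str, end_hex: str, t: float) -> str:
    """Colour at position t in [0, 1] along a linear blend from start to end."""
    start, end = hex_to_rgb(start_hex), hex_to_rgb(end_hex)
    # Add 0.5 before truncating so each channel rounds half up.
    r, g, b = (int(s + (e - s) * t + 0.5) for s, e in zip(start, end))
    return f"#{r:02X}{g:02X}{b:02X}"


print(gradient_hex("#FF0000", "#0000FF", 0.0))  # #FF0000
print(gradient_hex("#FF0000", "#0000FF", 0.5))  # #800080
print(gradient_hex("#FF0000", "#0000FF", 1.0))  # #0000FF
```

The midpoint #800080 is a mid-tone purple, consistent with the purple middle section the evaluator reports for the red-to-blue wall.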

Photorealism 67%

Photorealism measures whether local/sefi-image-5b-base produces polished product and studio imagery with realistic materials, lighting, shadows, and finish.

Easy 67%
PASS
A single red sneaker on a clean white background with soft studio lighting and gentle shadow — product photography style

Evaluator (Gemma 4 26B)

The image successfully captures the requested studio aesthetic. The lighting is soft and even, creating a natural shadow beneath the sneaker that grounds the object. The textures of the leather and the laces appear realistic and consistent. There are no obvious AI artifacts, such as warped geometry or nonsensical stitching, and the white background is clean and professional. It passes as a high-quality product photograph. VERDICT: PASS

FAIL
A clear glass perfume bottle on a reflective black surface with studio lighting, no background distractions

Evaluator (Gemma 4 26B)

The image fails to meet the quality of a professional studio photograph. There are significant AI artifacts, most notably in the physics of the liquid and the glass. The straw/dip tube is unnaturally warped and disconnected from the pump mechanism. Furthermore, the reflection on the surface is inconsistent with the bottle's shape, and the glass edges appear blurry and structurally unsound. It does not look like a real photograph. VERDICT: FAIL

PASS
A ripe green apple on a seamless white background with soft side lighting and subtle shadow — catalog style

Evaluator (Gemma 4 26B)

The image is a high-quality studio shot of a green apple. The lighting is soft and natural, creating a realistic sense of volume and texture. The seamless white background and subtle shadow perfectly match the prompt. There are no obvious AI artifacts, such as unnatural textures, warping, or nonsensical stem structures. It looks like a professional catalog photograph. VERDICT: PASS

Graphical design 54%

Data Visualisation 0%

Data Visualisation measures whether local/sefi-image-5b-base produces charts with correct labels, proportions, colors, and readable values.

Easy 0%
FAIL
A bar chart showing monthly sales for January through June with values 12K, 18K, 15K, 22K, 19K, 25K — labeled axes, color-coded bars, and a title 'Monthly Sales 2025'

Evaluator (Qwen 3 VL)

The chart shows correct values (12K, 18K, 15K, 22K, 19K, 25K) and title. Bars are color-coded. However, x-axis labels are incorrect (e.g., "1BK", "TEL", "JUK") instead of months (Jan–Jun). Missing axis labels (x/y) and no gridlines. Chart type is correct (bar chart), but critical labeling errors invalidate the visualization. VERDICT: FAIL

FAIL
A pie chart showing market share: Company A 35%, Company B 25%, Company C 20%, Others 20% — each slice a different color with labels and percentages

Evaluator (Qwen 3 VL)

The chart is a pie chart with labeled slices, but it contains critical errors: “Company A” is incorrectly labeled as both 35% and 25%. The percentages for Company C (20%) and Others (20%) are correct. The chart type matches, but data accuracy fails. VERDICT: FAIL

FAIL
A simple line graph showing daily website visitors over one week (Mon-Sun): 1200, 1500, 1100, 1800, 2000, 1700, 1400 — with grid lines, axis labels, and title 'Weekly Traffic'

Evaluator (Qwen 3 VL)

The chart shows a line graph titled "Weekly Traffic" with grid lines. However, the x-axis labels are reversed (e.g., "Monn" for Monday, then 1500, 1100, etc.), and the y-axis is mislabeled as "00700000" instead of "Visitors". The data points do not match the requested values (1200, 1500, 1100, 1800, 2000, 1700, 1400). The chart type is correct, but critical data and labels are incorrect and unreadable. VERDICT: FAIL

Layout & Design 33%

Layout & Design measures whether local/sefi-image-5b-base organizes posters, menus, covers, and other designed compositions with the requested hierarchy and elements.

Easy 100%
PASS
A movie night poster with a large popcorn image in the center, the title 'MOVIE NIGHT' in bold at the top, and the tagline 'Every Friday at 8 PM' at the bottom

Evaluator (Qwen 3 VL)

The image features "MOVIE NIGHT" in bold yellow text at the top, a large popcorn image centered below it, and "Every Friday at 8 PM" at the bottom. All specified elements are present with correct visual hierarchy and readable text placement. The layout matches the prompt exactly. VERDICT: PASS

PASS
An event poster featuring a guitar illustration, the headline 'LIVE MUSIC FEST' in large text, and 'Tickets available now' as a tagline below

Evaluator (Qwen 3 VL)

The poster includes a guitar illustration, the headline "LIVE MUSIC FEST" in large, bold text with correct visual hierarchy, and the tagline "Tickets available now" below. Text is readable and correctly placed. All required elements are present and accurately rendered. VERDICT: PASS

PASS
A book club poster with an open book illustration, the title 'JOIN THE BOOK CLUB' prominently displayed, and 'Meets every Wednesday' as the tagline

Evaluator (Qwen 3 VL)

The image features an open book illustration with "JOIN THE BOOK CLUB" prominently displayed in large, bold text. Below it, "Meets every Wednesday" appears in smaller text, correctly placed as a tagline. All required elements are present with appropriate visual hierarchy and readable text. Layout matches the prompt exactly. VERDICT: PASS

Medium 0%
FAIL
A landing page design with a hero header image at the top, a headline and subtitle in the middle, a 'Sign Up' call-to-action button, and a footer with three social media icons

Evaluator (Qwen 3 VL)

The image shows a hero header, centered headline/subtitle, “Sign Up” button, and footer with three icons — matching the layout. However, the headline and subtitle text is gibberish and unreadable, violating the requirement for readable, correctly placed text. Visual hierarchy is intact, but content fails. VERDICT: FAIL

FAIL
A newsletter layout with a banner image header, a two-column body section with text on the left and an image on the right, a highlighted quote block, and a footer with unsubscribe link

Evaluator (Qwen 3 VL)

The image shows a newsletter layout with a header, two-column body (text left, image right), quote block, and footer unsubscribe link — all present. However, all text is gibberish, unreadable placeholder text. Visual hierarchy is structurally correct but content is meaningless. Layout elements are correctly positioned. VERDICT: FAIL

FAIL
A restaurant menu layout with a logo header, three category sections (appetizers, mains, desserts) each with items and prices, and a footer with address and hours

Evaluator (Qwen 3 VL)

The image shows a menu layout with a logo header, three category sections (Appetizers, Mains, Desserts), and a footer with address and hours. However, all text is gibberish or corrupted, making it unreadable. The visual hierarchy is present but meaningless due to illegible content. The structure is correct, but functional text is not. VERDICT: FAIL

Hard 0%
FAIL
A magazine double-page spread with a left page containing a large hero photo, and a right page with a two-column article, a pull quote, a sidebar with author bio, and a page number

Evaluator (Qwen 3 VL)

The layout matches the prompt: left page has a large hero photo, right page has two columns, a pull quote, and a sidebar with author bio. Page numbers (52, 17) are present. However, all text is gibberish, unreadable placeholder text, not actual content. Visual hierarchy is correct, but text is not legible or meaningful. VERDICT: FAIL

FAIL
A mobile app UI mockup showing a dashboard with a top navigation bar, a search field, a 2x2 grid of feature cards with icons, a list of recent activity items, and a bottom tab bar

Evaluator (Qwen 3 VL)

The image shows a mobile app UI with a top navigation bar, search field, 2x2 grid of feature cards with icons, "Recent Activity" list, and bottom tab bar — all present. However, all text is gibberish or unreadable placeholder text, violating the requirement for readable, correctly placed text. Layout hierarchy is visually intact but text fails core functionality. VERDICT: FAIL

FAIL
A fashion magazine spread with a full-bleed left-page photo, right page with three-column layout, headline, body copy, a smaller inset image, and styled page numbers

Evaluator (Qwen 3 VL)

The image shows a left-page full-bleed photo and a right-page three-column layout with a headline, body copy, and a smaller inset image. Page numbers are present. However, all text is gibberish, unreadable, and incorrectly placed, violating the prompt’s requirement for readable, correctly positioned text. VERDICT: FAIL
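
The tier and subcategory percentages for Layout & Design can be reproduced from the nine verdicts above. This is a minimal sketch, assuming each score is simply the share of PASS verdicts rounded to the nearest whole percent; the page does not state the exact formula, and `pass_rate` is a hypothetical helper.

```python
# Verdicts for the Layout & Design items above, grouped by difficulty tier.
verdicts = {
    "Easy": ["PASS", "PASS", "PASS"],
    "Medium": ["FAIL", "FAIL", "FAIL"],
    "Hard": ["FAIL", "FAIL", "FAIL"],
}


def pass_rate(labels):
    """Percentage of PASS labels, rounded to a whole number (assumed rounding)."""
    return round(100 * sum(label == "PASS" for label in labels) / len(labels))


# Per-tier rates, then the rate over all nine items pooled together.
tier_rates = {tier: pass_rate(labels) for tier, labels in verdicts.items()}
pooled = pass_rate([label for labels in verdicts.values() for label in labels])

print(tier_rates)
print(pooled)
```

Because every tier here has three items, pooling all verdicts and averaging the tier rates give the same number, so this subcategory alone does not reveal which of the two the benchmark uses.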

Style Diversity 83%

Style Diversity measures whether local/sefi-image-5b-base can shift between visual styles like oil painting, pixel art, watercolor, and technical illustration.

Easy 67%
PASS
A golden retriever sitting in a garden, painted in the style of an oil painting with visible brushstrokes and rich texture

Evaluator (Qwen 3.5 122B)

The image clearly depicts a golden retriever sitting in a garden setting. The style is unmistakably that of an oil painting, with visible, textured brushstrokes throughout the fur and background. The color blending and impasto-like effects are consistent with traditional oil painting techniques. The rich texture and painterly quality are evident across the entire composition. VERDICT: PASS

PASS
A golden retriever sitting in a garden, rendered in Japanese anime style with flat colors and large expressive eyes

Evaluator (Qwen 3.5 122B)

The image clearly exhibits Japanese anime style characteristics: the golden retriever has large, expressive eyes with prominent highlights, simplified facial features, and soft shading typical of anime. The colors are vibrant yet relatively flat with gentle gradients, and the overall composition matches the requested aesthetic. The garden setting is rendered with stylized foliage and flowers consistent with anime backgrounds. VERDICT: PASS

FAIL
A golden retriever sitting in a garden, rendered in pixel art style with a 32x32 grid and limited color palette

Evaluator (Qwen 3.5 122B)

The image clearly depicts a golden retriever in a garden setting. The pixel art style is evident with visible grid lines and blocky rendering. However, the image appears to be much larger than 32x32 pixels (likely 256x256 or similar), and the color palette is quite extensive rather than limited. While the core concept is present, the specific technical requirements (32x32 grid and limited palette) are not met. VERDICT: FAIL
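
The two technical requirements in this prompt (a 32x32 grid and a limited palette) can in principle be checked numerically instead of by eye. The following is a minimal sketch under stated assumptions: the image is already decoded into rows of `(r, g, b)` tuples, a "32x32 grid" means 32x32 uniform cells that may be upscaled to any multiple of 32 pixels, and "limited" means at most 16 colors. The threshold and the `check_pixel_art` helper are hypothetical and are not part of the benchmark.

```python
def check_pixel_art(pixels, grid=32, max_colors=16):
    """Check an image against a cell grid and a palette limit.

    pixels: list of rows, each row a list of (r, g, b) tuples.
    Returns a dict with 'on_grid', 'palette_size', and 'limited_palette'.
    """
    height, width = len(pixels), len(pixels[0])
    palette = {p for row in pixels for p in row}
    on_grid = height % grid == 0 and width % grid == 0
    if on_grid:
        cell_h, cell_w = height // grid, width // grid
        # Every pixel must equal the top-left pixel of its own cell.
        on_grid = all(
            pixels[y][x] == pixels[y - y % cell_h][x - x % cell_w]
            for y in range(height)
            for x in range(width)
        )
    return {
        "on_grid": on_grid,
        "palette_size": len(palette),
        "limited_palette": len(palette) <= max_colors,
    }


# Synthetic example: a 64x64 image made of 32x32 cells (each 2x2 pixels), 4 colors.
colors = [(200, 150, 60), (40, 120, 40), (90, 60, 30), (135, 200, 235)]
upscaled = [[colors[(x // 2 + y // 2) % 4] for x in range(64)] for y in range(64)]
print(check_pixel_art(upscaled))

# Counter-example: a smooth 64x64 ramp with a different color at every pixel.
smooth = [[(x * 4, y * 4, 0) for x in range(64)] for y in range(64)]
print(check_pixel_art(smooth))
```

Under this reading, an image larger than 32x32 pixels can still satisfy the grid requirement if it is a clean upscale, so output resolution alone would not decide the result.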

Medium 100%
PASS
A city street scene in the style of Art Deco — geometric shapes, gold and black palette, ornamental symmetry

Evaluator (Qwen 3.5 122B)

The image clearly exhibits the requested Art Deco style. Key characteristics are present: strong geometric shapes (zigzags, chevrons, sunburst motifs), a dominant gold and black palette, and ornamental symmetry framing the city street. The composition uses vertical lines and stylized patterns typical of the Art Deco movement. The vintage cars and architecture further reinforce the era’s aesthetic. All core visual elements align with the prompt. VERDICT: PASS

PASS
A woman with a parasol in the style of Impressionism — soft brushstrokes, dappled light, pastel palette

Evaluator (Qwen 3.5 122B)

The image clearly exhibits Impressionist characteristics: soft, visible brushstrokes throughout; dappled light filtering through foliage and onto the parasol; and a pastel palette dominated by lavenders, blues, and pale yellows. The woman’s form is rendered with loose, painterly strokes rather than sharp lines, and the background dissolves into abstract color patches — all hallmarks of the style. The overall effect matches the prompt’s requirements precisely. VERDICT: PASS

PASS
A portrait of a man in Pop Art style — bold outlines, Ben-Day dots, flat vivid primary colors

Evaluator (Qwen 3.5 122B)

The image clearly exhibits Pop Art characteristics: bold outlines define facial features and clothing, Ben-Day dots are visibly used for shading and texture across the face and suit, and vivid primary colors (blue, red, yellow) dominate against a flat orange background. The stylized, graphic quality matches the requested aesthetic precisely. VERDICT: PASS

Hard 100%
PASS
A forest rendered in photorealistic watercolor — the detail and lighting of a photo combined with visible watercolor bleeds and paper texture

Evaluator (Qwen 3.5 122B)

The image successfully combines photorealistic forest details (tree bark, lighting, foliage density) with distinct watercolor characteristics. Visible wet-on-wet bleeds, pigment pooling at the edges, and a textured paper background are clearly present. The style matches the prompt’s request for a hybrid of photo-realism and watercolor aesthetics. VERDICT: PASS

PASS
A woman in a futuristic city rendered in cyberpunk Art Nouveau — neon colors and tech elements combined with flowing organic lines and floral borders

Evaluator (Qwen 3.5 122B)

The image perfectly embodies the requested "cyberpunk Art Nouveau" style. The central figure wears a metallic, flowing bodysuit with organic, swirling patterns typical of Art Nouveau, while the background features a neon-lit futuristic cityscape characteristic of cyberpunk. The composition is framed by distinct floral borders with stylized flowers and leaves, completing the fusion of tech and nature. All key visual characteristics are clearly present and well-executed. VERDICT: PASS

PASS
A medieval castle depicted in Japanese ukiyo-e woodblock print style combined with photorealistic sky and lighting

Evaluator (Qwen 3.5 122B)

The image successfully combines ukiyo-e woodblock print aesthetics (flat colors, visible grain, stylized architecture) with a photorealistic sky gradient and lighting. The castle’s layered roofs, stone base, and pine trees reflect traditional Japanese castle design rendered in woodblock style. Sky transitions smoothly from dusk pink to deep blue, mimicking real atmospheric lighting. Core stylistic elements are clearly present and cohesively integrated. VERDICT: PASS

Extreme 67%
PASS
An interior scene rendered in soft brutalism — raw concrete textures combined with rounded pastel furniture and warm diffused light

Evaluator (Qwen 3.5 122B)

The image perfectly embodies "soft brutalism" with its raw, unfinished concrete walls, ceiling, and structural pillar. The furniture consists of rounded, pastel-colored chairs (pink, cream, blue) that contrast with the hard architecture. Warm, diffused lighting emanates from recessed shelves, creating the requested ambiance. All key visual characteristics are present and clearly recognizable. VERDICT: PASS

FAIL
A still life painted in anxious minimalism — sparse composition with uncomfortable negative space, muted colors, and subtle visual tension

Evaluator (Qwen 3.5 122B)

The image displays a minimalist composition with two horizontal bands of muted gray tones, separated by a subtle line. While it exhibits sparse composition and muted colors consistent with minimalism, it lacks the “anxious” quality or “subtle visual tension” described — there’s no visual unease, imbalance, or psychological weight. The negative space is calm, not uncomfortable. Thus, key stylistic elements are missing. VERDICT: FAIL

PASS
A cityscape rendered in nostalgic futurism — retro-futuristic 1960s space-age aesthetics with a melancholic warm-toned patina

Evaluator (Qwen 3.5 122B)

The image clearly exhibits the requested nostalgic futurism style. Key characteristics include: 1. **Retro-futuristic architecture:** The prominent "Googie" style structures with saucer shapes, spires, and white concrete finishes are hallmarks of 1960s space-age design. 2. **Period vehicles:** The cars on the street are distinctly mid-20th-century models, reinforcing the retro setting. 3. **Color palette:** The image has a warm, sepia-toned filter that creates the requested "melancholic warm-