ImageBench

ImageBench V1

192 evaluations across 6 categories

Benchmark V1 verdicts are produced by VLM judges and can contain mistakes. Treat PASS/FAIL labels as machine-assisted assessments, and inspect the images yourself.
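Because every label comes from a judge's free-text note, it can help to pull the verdict out mechanically and flag notes that lack one. The sketch below assumes the note ends with `VERDICT: PASS` or `VERDICT: FAIL`, which is how the notes on this page are written; `parse_verdict` is a hypothetical helper, not part of ImageBench.

```python
import re
from typing import Optional

# Matches a trailing "VERDICT: PASS" or "VERDICT: FAIL" at the end of a note.
_VERDICT_RE = re.compile(r"VERDICT:\s*(PASS|FAIL)\s*$")


def parse_verdict(evaluator_text: str) -> Optional[str]:
    """Return "PASS" or "FAIL", or None when the note has no final verdict."""
    match = _VERDICT_RE.search(evaluator_text.strip())
    return match.group(1) if match else None


notes = [
    "Spelling is accurate. VERDICT: PASS",
    "All other text is correct and readable. The barcode area is accurate. VERDICT: FAIL",
    # A note that was cut off before its verdict line.
    "Here, the tree is merged with the carrot, breaking object separation",
]

verdicts = [parse_verdict(note) for note in notes]
# Notes without a parsable verdict are the first candidates for manual review.
needs_review = [note for note, verdict in zip(notes, verdicts) if verdict is None]
```

Notes that return `None` are worth checking by eye first, since the label shown for them cannot be confirmed from the note text.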

Generation Details

Source-backed model context, size, cost, and request settings for this ImageBench V1 run.

local/sefi-image-5b-rl

Local

SeFi Image 5B RL is a locally fine-tuned text-to-image model produced by the SeFi image-generation fine-tuning pipeline and run on an NVIDIA DGX Spark. It is the ~5B reinforcement-learning-tuned variant of the SeFi Image family. It is not a publicly released hosted product; no external model card or citation is disclosed.

Maker: SeFi pipeline
Family: SeFi Image
Model Size: ~5B (estimated)
Cost: local run; no API price (not_applicable)
Run Target: gx10/sefi-image-5b-rl
Effective Request: Effective request fields unknown
Overall: 58.6
Capability: 78%
Est. Preference: 39.1
Pass: 150
Fail: 42
Avg Latency: 130.7s
Min Latency: 124.0s
Max Latency: 284.8s
Text Rendering 93%
Spatial Reasoning 74%
Human realism 81%
Truthfulness 74%
Professional Studio 93%
Graphical design 63%
Preference 39%
Latency 0%
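The headline numbers can be cross-checked with a few lines of arithmetic. The page does not state how the Overall score is derived; the sketch below only observes that the mean of Capability and Est. Preference reproduces the reported 58.6, and that averaging should be read as an assumption rather than the documented formula.

```python
# Figures copied from the summary above.
passed, failed = 150, 42
est_preference = 39.1

total = passed + failed                  # 192 generations
capability = 100 * passed / total        # 78.125, shown as 78%

# ASSUMPTION: Overall is the plain mean of Capability and Est. Preference.
# This matches the reported 58.6 but is not confirmed by the source.
overall_guess = (capability + est_preference) / 2

print(total, round(capability), round(overall_guess, 1))  # prints: 192 78 58.6
```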

All 192 generations

Text Rendering 93%

Typography Style 100%

Typography Style measures whether local/sefi-image-5b-rl preserves requested lettering styles, font mood, and text texture such as typewriter ink or neon signage.

Easy 100%
PASS
The word 'CHAPTER ONE' typed on aged paper with a vintage typewriter font, complete with slightly uneven ink

Evaluator (Qwen 3 VL)

Text reads “CHAPTER ONE” correctly. Font mimics vintage typewriter style with blocky, slightly uneven letterforms and ink texture. Paper shows aged, cream-colored texture. Typewriter carriage and platen visible below, enhancing realism. Ink appears consistent with mechanical typing, though minor smudges are absent. Overall, convincingly matches described medium. VERDICT: PASS

PASS
The phrase 'TODAY'S SPECIAL' written in white chalk on a dark green chalkboard with chalk dust

Evaluator (Qwen 3 VL)

Text reads “TODAY’S SPECIAL” correctly. Letters mimic chalk texture with smudges and uneven strokes on a dark green chalkboard surface. Chalk dust effect is visible around strokes. Medium convincingly rendered as chalk on board. Spelling is accurate. VERDICT: PASS

PASS
The word 'OPEN' displayed on a red LED sign with glowing pixel-style letters

Evaluator (Qwen 3 VL)

The word "OPEN" is correctly spelled and rendered in a glowing, pixel-style LED matrix with distinct red dots forming each letter. The style convincingly mimics a real LED sign, with appropriate brightness, glow, and dot-based construction. All visual elements align with the prompt. VERDICT: PASS

Writing accuracy 92%

Writing accuracy measures whether local/sefi-image-5b-rl renders the exact requested words, numbers, labels, and short phrases without spelling or layout errors.

Easy 100%
PASS
A single large word 'HELLO' centered on a white background

Evaluator (Qwen 3 VL)

The word "HELLO" is fully spelled correctly and is perfectly legible against the white background. All letters are intact and clearly rendered. No misspellings, missing letters, or illegible parts are present. The text matches the intended specification exactly. VERDICT: PASS

PASS
The word 'ADVENTURE' in big bold letters centered on a plain cream background

Evaluator (Qwen 3 VL)

The word "ADVENTURE" is spelled correctly and fully readable. All letters are present and clearly formed against the cream background. The text is centered and bold as intended. No misspellings or illegible parts detected. VERDICT: PASS

PASS
The word 'SERENITY' written large and centered on a solid light blue background

Evaluator (Qwen 3 VL)

The word "SERENITY" is spelled correctly and fully readable, centered on a solid light blue background. No misspellings, missing words, or illegible parts are present. Text matches the intended specification exactly. VERDICT: PASS

Medium 100%
PASS
The sentence 'The quick brown fox jumps high' in clean black text on a white background

Evaluator (Qwen 3 VL)

All words are spelled correctly and fully readable: "The", "quick", "brown", "fox", "jumps", "high". Text is clean black on white background. No misspellings or illegible parts. Matches intended text exactly. VERDICT: PASS

PASS
'Every sunset brings a new dawn ahead' written neatly on a pale yellow background

Evaluator (Qwen 3 VL)

Text reads: "Every sunset brings a new dawn ahead." — matches intended phrase exactly. All words are spelled correctly and fully legible in handwritten style. No misspellings or missing words. Background is pale yellow as specified. VERDICT: PASS

PASS
'Stars shine brightest in the dark night' displayed on a clean off-white background

Evaluator (Qwen 3 VL)

Text reads: "Stars shine brightest in the dark night" — matches intended phrase exactly. All words are spelled correctly and fully readable against clean off-white background. No misspellings or illegible parts. VERDICT: PASS

Hard 100%
PASS
A poster with headline 'SUMMER SALE' in large bold text, body text 'Up to 50% off all items this weekend only' in medium size, and fine print 'Terms and conditions apply. See store for details.' at the bottom in small text

Evaluator (Qwen 3 VL)

All text matches the intended message exactly. "SUMMER SALE", "Up to 50% off all items this weekend only", and "Terms and conditions apply. See store for details." are all spelled correctly and fully readable. No misspellings or illegible parts detected. VERDICT: PASS

PASS
A flyer with large title 'GRAND OPENING', subtitle 'Join us for food, music, and fun' in medium text, and small disclaimer 'RSVP required. Limited seating available.' at the bottom

Evaluator (Qwen 3 VL)

All text matches the intended content exactly: "GRAND OPENING", "Join us for food, music, and fun", and "RSVP required. Limited seating available." Every word is spelled correctly and fully readable. No misspellings or illegible parts detected. VERDICT: PASS

PASS
An event banner with headline 'ANNUAL GALA 2025' at the top in large serif font, description 'An evening of fine dining and live jazz' in medium text, and footnote 'Black tie optional. Doors open at 6 PM.' in small text

Evaluator (Qwen 3 VL)

All text matches the intended content exactly: "ANNUAL GALA 2025", "An evening of fine dining and live jazz", and "Black tie optional. Doors open at 6 PM." Every word is spelled correctly and fully readable. No misspellings or illegible parts detected. VERDICT: PASS

Extreme 67%
PASS
A vintage movie poster for 'THE LAST VOYAGE' featuring a tagline 'One ship. One chance. No turning back.' with credits block reading 'Directed by James Harlow. Starring Elena Voss, Marcus Chen, and Sofia Reyes. Music by Thomas Grant. A Paramount Pict

Evaluator (Qwen 3 VL)

All text matches the intended content exactly: title, tagline, credits, and release info. No misspellings, missing words, or illegible text. Font, layout, and readability are clear and accurate. Every word is present and correctly rendered. VERDICT: PASS

FAIL
A magazine cover titled 'NATIONAL EXPLORER' with cover lines 'The Hidden Temples of Cambodia — page 34', 'Interview: Dr. Maria Santos on Climate Action', 'Top 10 Trails You Must Hike This Year', and a barcode area with 'Issue 247 | March 2025 | $6.99

Evaluator (Qwen 3 VL)

“Cambodia Action” is incorrectly phrased — should be “Cambodia — page 34” (missing dash and “page 34” misaligned). “Action” is a stray word. All other text is correct and readable. The barcode area is accurate. VERDICT: FAIL

PASS
A product label for 'GOLDEN HARVEST ORGANIC HONEY' with subtitle 'Raw & Unfiltered', weight '12 oz (340g)', description 'Pure wildflower honey harvested from sustainable apiaries in the Pacific Northwest. No additives or preservatives.', and small te

Evaluator (Qwen 3 VL)

All text matches the intended content exactly: "GOLDEN HARVEST ORGANIC HONEY", "Raw & Unfiltered", "12 oz (340g)", description, and "Best before: 06/2026. Store in cool, dry place. Product of USA." No misspellings or illegible parts. All words are fully readable and accurate. VERDICT: PASS
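The tier percentages above can be rolled up by hand. The sketch below assumes a score is simply passes divided by attempts, pooled across tiers. That assumption reproduces the 92% shown for Writing accuracy and the 93% shown for Text Rendering, but the page does not document the aggregation rule, so treat it as an illustration.

```python
# (passed, total) per difficulty tier, read off the results listed above.
typography_style = {"Easy": (3, 3)}
writing_accuracy = {"Easy": (3, 3), "Medium": (3, 3), "Hard": (3, 3), "Extreme": (2, 3)}


def pass_rate(passed: int, total: int) -> int:
    """Percentage of passing items, rounded to a whole number."""
    return round(100 * passed / total)


def pooled(*groups: dict) -> tuple:
    """Sum passes and attempts across every tier of every group."""
    passed = sum(p for group in groups for p, _ in group.values())
    total = sum(t for group in groups for _, t in group.values())
    return passed, total


tier_rates = {tier: pass_rate(p, t) for tier, (p, t) in writing_accuracy.items()}
subcategory_rate = pass_rate(*pooled(writing_accuracy))                # 11 of 12
category_rate = pass_rate(*pooled(typography_style, writing_accuracy))  # 14 of 15
```

Pooling items, rather than averaging the two subcategory percentages, is what lands on 93% for the category; averaging 100% and 92% would give 96%.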

Spatial Reasoning 74%

Attributes Binding 89%

Attributes Binding measures whether local/sefi-image-5b-rl attaches the right colors, materials, and properties to the correct requested objects.

Easy 100%
PASS
A red ball and a blue cube sitting side by side on a white surface

Evaluator (Qwen 3.5 122B)

The image shows two objects on a white surface: a red sphere (ball) and a blue cube. The red object is correctly shaped as a ball and colored red. The blue object is correctly shaped as a cube and colored blue. Both are positioned side by side. All attributes (color, shape) are correctly bound to the right objects with no swaps. VERDICT: PASS

PASS
A green apple and an orange pumpkin on a wooden cutting board

Evaluator (Qwen 3.5 122B)

The image shows a green apple and an orange pumpkin on a wooden cutting board. The apple is correctly green, round, and has a stem. The pumpkin is correctly orange, round, and has a stem. All attributes are correctly bound to the right objects with no swaps. VERDICT: PASS

PASS
A yellow taxi and a black limousine parked next to each other on a street

Evaluator (Qwen 3.5 122B)

The image shows a yellow taxi and a black limousine parked side by side on a street. The taxi is correctly colored yellow and has the typical taxi roof sign. The limousine is black, elongated, and clearly larger than the taxi, matching the description of a limousine. No attributes are swapped between the vehicles. VERDICT: PASS

Medium 100%
PASS
A large red circle, a small blue triangle, and a medium green star arranged on a white canvas

Evaluator (Qwen 3.5 122B)

The image contains three distinct shapes on a white background. The large red circle is correctly rendered. The small blue triangle is correctly rendered. The medium green star is correctly rendered. All attributes (color, size, shape) are correctly bound to their respective objects with no swaps. VERDICT: PASS

PASS
A tall thin white candle, a short fat red candle, and a medium spiral yellow candle on a shelf

Evaluator (Qwen 3.5 122B)

The image shows three candles on a shelf: 1. Left: Tall, thin, white candle — matches description. 2. Center: Short, fat, red candle — matches description. 3. Right: Medium height, spiral-shaped, yellow candle — matches description. All attributes (color, size, shape) are correctly bound to each object with no swaps. The spatial arrangement and visual properties align precisely with the prompt. VERDICT: PASS

PASS
A big brown horse, a small white rabbit, and a medium-sized black cat standing in a meadow

Evaluator (Qwen 3.5 122B)

The image contains a large brown horse with a white facial marking, a small white rabbit sitting on the grass, and a medium-sized black cat standing nearby. All objects match their described attributes: the horse is big and brown, the rabbit is small and white, and the cat is medium-sized and black. No attribute swaps are present. The scene is set in a meadow as requested. VERDICT: PASS

Hard 67%
PASS
A pink fire truck, a blue banana, and a green sun in a surreal landscape

Evaluator (Qwen 3.5 122B)

The image shows a surreal landscape with three main objects. The fire truck is pink, matching the prompt. The banana is blue, also matching. The sun is green, as requested. All attributes are correctly bound to their respective objects with no swaps. The banana is large and curved, the fire truck is detailed with lights and wheels, and the sun is bright green with rays. VERDICT: PASS

FAIL
A purple carrot, a red tree with red leaves, and a white watermelon sliced open

Evaluator (Qwen 3.5 122B)

The image shows: - A purple carrot (correct color and shape) with red maple-like leaves growing from its top — this swaps the “tree” attribute onto the carrot, violating object-boundary logic. - A sliced watermelon with white flesh and red seeds (correct attributes). - No separate red tree; the red foliage is attached to the carrot. The prompt requested three distinct objects: purple carrot, red tree, white watermelon. Here, the tree is merged with the carrot, breaking object separation

PASS
An orange elephant, a yellow penguin, and a silver strawberry arranged on a table

Evaluator (Qwen 3.5 122B)

The image contains an orange elephant with white tusks and toenails, a yellow penguin with white belly and orange beak/feet, and a silver glittery strawberry with green leaves. All three objects match the prompt’s specified colors and types. No attribute swaps are present — each object retains its correct color, shape, and identity as described. The arrangement on a wooden table is also accurate. VERDICT: PASS

Compositionality 89%

Compositionality measures whether local/sefi-image-5b-rl combines subjects, actions, outfits, props, and scene details into one coherent requested image.

Easy 67%
PASS
A tabby cat wearing oversized red sunglasses

Evaluator (Qwen 3.5 122B)

The image contains all described elements: - Tabby cat: Present (clear brown/black striped pattern) - Oversized red sunglasses: Present (large red frames covering eyes) - Coherent combination: Yes (glasses fit naturally on cat’s face) All prompt requirements are met with no contradictions or missing components. VERDICT: PASS

FAIL
A golden retriever wearing a tiny cowboy hat

Evaluator (Qwen 3.5 122B)

The image shows a golden retriever wearing a cowboy hat. However, the hat is not tiny — it’s proportionally large, covering much of the dog’s head and ears, which contradicts the “tiny” descriptor in the prompt. While the core concept (dog + cowboy hat) is present, the size specification is inaccurate. VERDICT: FAIL

PASS
A penguin holding a colorful umbrella

Evaluator (Qwen 3.5 122B)

The image contains: - A penguin: Present, clearly depicted with realistic feathers and anatomy. - A colorful umbrella: Present, rainbow-colored panels (red, orange, yellow, green, blue, purple). - Holding action: Present, the penguin’s flipper grips the umbrella handle. All elements are coherently combined — the penguin stands upright, holding the umbrella naturally as if shielding itself. No contradictions or missing components. VERDICT: PASS

Medium 100%
PASS
A medieval knight in full armor sitting at a desk typing on a modern laptop

Evaluator (Qwen 3.5 122B)

The image shows a medieval knight in full armor sitting at a wooden desk, typing on a modern laptop. All described elements are present and coherently combined: - Medieval knight in full armor: Present - Sitting at a desk: Present - Typing on a modern laptop: Present The scene is well-composed with appropriate lighting and detail. VERDICT: PASS

PASS
A Roman gladiator taking a selfie with a smartphone in the Colosseum

Evaluator (Qwen 3.5 122B)

The image contains all three core elements: 1. Roman gladiator: Present (man in armor with helmet and plume) 2. Smartphone: Present (black phone held in hand) 3. Colosseum: Present (ancient amphitheater in background) All elements are coherently combined in a single scene. The gladiator is actively taking a selfie with the smartphone, and the Colosseum serves as the backdrop. The composition is logical and visually consistent. VERDICT: PASS

PASS
An astronaut in a spacesuit riding a bicycle through a park

Evaluator (Qwen 3.5 122B)

The image shows an astronaut in a full white spacesuit with helmet and backpack, riding a black bicycle on a paved path through a green park. Trees, grass, and blurred background figures are visible. All prompt elements — astronaut, spacesuit, bicycle, park — are clearly present and coherently combined. The scene is visually consistent and logically assembled. VERDICT: PASS

Hard 100%
PASS
An octopus wearing a top hat and monocle playing chess on the surface of the moon with Earth visible in the background

Evaluator (Qwen 3.5 122B)

The image contains all required elements: 1. Octopus - Present (central figure) 2. Top hat - Present (on octopus head) 3. Monocle - Present (over one eye) 4. Chess board/pieces - Present (in foreground) 5. Moon surface - Present (textured ground) 6. Earth in background - Present (visible planet) All elements are coherently combined in a surreal but visually consistent composition. The octopus appears to be actively playing chess with tentacles positioned over pieces. VERDICT: PASS

PASS
A dinosaur in a business suit giving a PowerPoint presentation in a modern office while drinking coffee

Evaluator (Qwen 3.5 122B)

Analysis of elements: - Dinosaur: Present (T-Rex head) - Business suit: Present (grey suit, tie, white shirt) - PowerPoint presentation: Present (screen with text behind) - Modern office: Present (desks, chairs, lighting) - Drinking coffee: Present (holding white mug) All elements are present and coherently combined. The image successfully merges the surreal concept with realistic office setting. VERDICT: PASS

PASS
A robot dressed as a pirate sailing a paper boat on a lake of lava while juggling three kittens

Evaluator (Qwen 3.5 122B)

The image contains all described elements: - Robot dressed as a pirate: Present (wearing pirate hat, skull-and-crossbones emblem, leather vest) - Sailing a paper boat: Present (white origami-style boat) - Lake of lava: Present (glowing red-orange molten surface with dark rocks) - Juggling three kittens: Present (three kittens floating mid-air above the robot’s hands) All components are coherently combined in a surreal, visually consistent scene. VERDICT: PASS

Counting 56%

Counting measures whether local/sefi-image-5b-rl creates the exact number of requested objects without adding or missing items.

Easy 100%
PASS
Exactly 3 red apples arranged on a wooden table

Evaluator (Qwen 3.5 122B)

The image displays three red apples placed on a wooden surface. The apples are arranged in a loose triangular formation, with one slightly forward and two behind. The wood grain is visible, and the background is neutral. The count of apples matches the requested quantity exactly. VERDICT: PASS

PASS
Exactly 5 yellow rubber ducks sitting in a row on a bathroom shelf

Evaluator (Qwen 3.5 122B)

The image shows exactly five yellow rubber ducks arranged in a row on a white bathroom shelf. Each duck is identical in appearance, with orange beaks and black eyes. The count matches the prompt’s request precisely: five ducks, correct color, correct setting. No extra or missing objects. VERDICT: PASS

PASS
Exactly 4 white candles standing on a dark surface

Evaluator (Qwen 3.5 122B)

The image displays exactly four white candles, each with a lit flame, standing upright on a dark surface. The count matches the requested specification precisely. All candles are visible and clearly distinct. VERDICT: PASS

Medium 67%
PASS
2 orange cats and 3 black dogs sitting together on a grassy lawn

Evaluator (Qwen 3.5 122B)

The image shows two orange tabby cats and three black dogs sitting together on a grassy lawn. The count matches the prompt exactly: 2 cats and 3 dogs. All animals are seated or positioned closely on grass, consistent with the description. No extra or missing animals are present. VERDICT: PASS

FAIL
4 red roses and 2 white daisies in a glass vase on a table

Evaluator (Qwen 3.5 122B)

The image shows 5 red roses and 4 white daisies in a glass vase on a table, with 2 additional daisies lying on the table surface. The prompt requested exactly 4 roses and 2 daisies — both counts are exceeded. Therefore, the object count does not match the request. VERDICT: FAIL

PASS
1 blue butterfly and 3 yellow butterflies resting on a garden fence

Evaluator (Qwen 3.5 122B)

The image shows 1 blue butterfly on the left fence post and 3 yellow butterflies — one mid-flight between posts, one resting on the right post, and a third partially visible behind it. All are positioned on or near a wooden garden fence. The count matches the prompt exactly: 1 blue, 3 yellow, all associated with the fence. VERDICT: PASS

Hard 0%
FAIL
Exactly 9 goldfish swimming in a large round fishbowl

Evaluator (Qwen 3.5 122B)

The image shows a round fishbowl containing goldfish. Counting the visible fish: there are 10 distinct goldfish swimming in the bowl. The prompt requested exactly 9 goldfish. Since the actual count (10) does not match the requested count (9), the core criterion is not met. VERDICT: FAIL

FAIL
A fruit bowl containing exactly 3 bananas, 4 oranges, and 2 apples

Evaluator (Qwen 3.5 122B)

The image shows a fruit bowl with: - 3 bananas (correct) - 3 oranges (should be 4) - 2 apples (correct) The orange count is off by one. Since the prompt requires *exactly* 4 oranges and only 3 are present, the core criterion is not met. VERDICT: FAIL

FAIL
Exactly 11 colorful balloons floating against a clear blue sky

Evaluator (Qwen 3.5 122B)

The image displays a cluster of colorful balloons against a clear blue sky. Counting them: there are 12 balloons visible (red, green, purple, orange, yellow, blue, pink, teal, white, etc.). The prompt requested exactly 11 balloons. Since the actual count is 12, it does not match the requested number. VERDICT: FAIL
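The counting verdicts above reduce to an exact comparison between requested and observed quantities. The sketch below illustrates that rule with the fruit bowl case, using the counts the judge reported. `count_mismatches` is a hypothetical helper, and the observed counts would in practice come from a judge or a manual tally, not from this code.

```python
from collections import Counter


def count_mismatches(requested: dict, observed: dict) -> dict:
    """Map each object whose count differs to a (requested, observed) pair."""
    want, got = Counter(requested), Counter(observed)
    # Counter returns 0 for absent keys, so extra or missing objects also show up.
    return {
        name: (want[name], got[name])
        for name in sorted(set(want) | set(got))
        if want[name] != got[name]
    }


# Prompt: "exactly 3 bananas, 4 oranges, and 2 apples"
requested = {"banana": 3, "orange": 4, "apple": 2}
# Counts reported in the evaluator note for that image.
observed = {"banana": 3, "orange": 3, "apple": 2}

mismatches = count_mismatches(requested, observed)
verdict = "PASS" if not mismatches else "FAIL"
```

A single object class that is off by one is enough to fail, which is consistent with the 0% on the Hard tier, where every image missed its target by exactly one item.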

Negation 44%

Negation measures whether local/sefi-image-5b-rl respects what should be absent from the image, not only what should be present.

Easy 67%
PASS
A tropical beach scene with no people

Evaluator (Qwen 3.5 122B)

The image shows a tropical beach scene with white sand, turquoise water, and palm trees along the shoreline. The beach is empty — there are no people visible anywhere in the frame, including the sand, water, or under the trees. The exclusion criterion (“no people”) is fully satisfied. VERDICT: PASS

FAIL
A cat lying on a couch without a collar

Evaluator (Qwen 3.5 122B)

The image shows a cat lying on a couch, but it is wearing a black collar with a silver buckle around its neck. Since the prompt explicitly excluded “a collar,” and the collar is clearly present, the excluded element is not absent. VERDICT: FAIL

PASS
A classroom with desks and a chalkboard but no students

Evaluator (Qwen 3.5 122B)

The image shows an empty classroom with rows of wooden desks and chairs facing a large black chalkboard. There are no students visible in the room — only furniture and architectural elements like windows, ceiling lights, and walls. The excluded element (students) is fully absent. VERDICT: PASS

Medium 33%
FAIL
A sandy beach with palm trees, seashells, and a beach umbrella but no water or ocean visible

Evaluator (Qwen 3.5 122B)

The image clearly shows a sandy beach with palm trees, seashells in the foreground, and a white beach umbrella. However, the prompt explicitly excluded “water or ocean visible,” and in this image, the ocean is prominently visible — turquoise water with gentle waves lapping the shore, extending to the horizon. The excluded element is present. VERDICT: FAIL

FAIL
An office workspace with a desk, chair, bookshelf, and lamp but no computers or screens

Evaluator (Qwen 3.5 122B)

The image shows an office workspace with a desk, chair, bookshelf, and lamp — as expected. However, there is clearly a laptop (a computer) open on the desk. Since the prompt explicitly excluded “computers or screens,” and a laptop is present, the exclusion criterion is violated. VERDICT: FAIL

PASS
A kitchen with pots, pans, a stove, and a cutting board but no food

Evaluator (Qwen 3.5 122B)

The image shows a clean kitchen scene with multiple pots and pans on a gas stove, along with wooden cutting boards on the countertop. There is no visible food in any of the pots, pans, or on the cutting boards. The excluded element (food) is truly absent. VERDICT: PASS

Hard 33%
FAIL
A pepperoni pizza with tomato sauce, basil, and crust but absolutely no cheese

Evaluator (Qwen 3.5 122B)

The image shows a pepperoni pizza with tomato sauce, basil, and crust — but it clearly contains melted cheese covering the surface beneath the toppings. The cheese is visible as a white-to-golden layer between the sauce, pepperoni, and basil. Since the prompt explicitly excluded cheese (“absolutely no cheese”), and cheese is present, the core criterion is violated. VERDICT: FAIL

PASS
A car body with windows, doors, and headlights but no wheels

Evaluator (Qwen 3.5 122B)

The image shows a white, untextured 3D model of a car body. It clearly includes windows (as openings), doors with handles, and headlight housings. Crucially, there are no wheels attached — the wheel wells are empty, and the car appears to be floating or suspended without any tires or rims. The excluded element (wheels) is truly absent. VERDICT: PASS

FAIL
A human face with eyes, nose, and mouth but no eyebrows

Evaluator (Qwen 3.5 122B)

The image shows a human face with eyes, nose, and mouth. However, the excluded element — eyebrows — are clearly present. Both eyebrows are well-defined, natural-looking, and positioned above the eyes. Since the prompt explicitly required “no eyebrows,” their presence violates the constraint. VERDICT: FAIL
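Each negation prompt carries two conditions: the listed items must appear and the excluded items must not. The sketch below spells out that two-sided check using the office workspace case. It is purely illustrative: `check_negation`, the label sets, and the idea of a label-producing detector are all assumptions, since the benchmark itself relies on a VLM judge rather than label matching.

```python
def check_negation(required: set, excluded: set, detected: set) -> dict:
    """Fail when a required item is missing or an excluded item is present."""
    missing = sorted(required - detected)
    leaked = sorted(excluded & detected)
    verdict = "PASS" if not missing and not leaked else "FAIL"
    return {"missing": missing, "leaked": leaked, "verdict": verdict}


# Prompt: desk, chair, bookshelf, and lamp, but no computers or screens.
result = check_negation(
    required={"desk", "chair", "bookshelf", "lamp"},
    # Hypothetical expansion of "computers or screens" into concrete labels.
    excluded={"computer", "laptop", "monitor", "screen"},
    # Hypothetical labels for the generated image, following the judge's note.
    detected={"desk", "chair", "bookshelf", "lamp", "laptop"},
)
```

Here every required item is present, yet the image still fails because of the laptop. The failures in this section follow that pattern: the scene is rendered correctly and the exclusion is ignored.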

Relative Position 83%

Relative Position measures whether local/sefi-image-5b-rl places objects in the requested spatial relationship, such as above, below, inside, or next to another object.

Easy 100%
PASS
A cat sitting on top of a cardboard box

Evaluator (Qwen 3.5 122B)

The image shows a tabby cat sitting upright on top of a closed cardboard box. The cat’s paws are placed on the box’s surface, and its tail drapes over the edge. The spatial relationship “cat on top of box” is accurately depicted. No conflicting or incorrect positional descriptors are present. VERDICT: PASS

PASS
A red ball underneath a wooden chair

Evaluator (Qwen 3.5 122B)

The image shows a red ball positioned on a wooden floor directly beneath the seat of a wooden chair. The ball is centered between the front legs of the chair, clearly “underneath” it as specified. All spatial relationships described in the prompt are accurately represented. VERDICT: PASS

PASS
A coffee mug next to an open book on a desk

Evaluator (Qwen 3.5 122B)

The image shows a white coffee mug positioned to the left and slightly behind an open book, both resting on a wooden desk surface. The spatial relationship “next to” is accurately represented — the mug is adjacent to the book. Both objects are “on” the desk as specified. No contradictory positional claims (e.g., above, under) are present or violated. All described relationships hold true. VERDICT: PASS

Medium 100%
PASS
A bird perched on a branch, the branch extending from a tree next to a park bench

Evaluator (Qwen 3.5 122B)

The image shows a bird perched on a branch that extends from a tree trunk on the right side of the frame. The branch angles downward toward the left. In the background, slightly out of focus, is a park bench positioned below and to the left of the bird. All spatial relationships described — bird “on” branch, branch “from” tree, tree “next to” bench (in scene context), and bench “below” bird — are accurately represented. VERDICT: PASS

PASS
A cat sitting on a pillow, the pillow placed on a wooden chair

Evaluator (Qwen 3.5 122B)

The image shows a tabby cat sitting upright on a white pillow. The pillow is placed directly on the seat of a wooden chair. The spatial relationships described — “cat on pillow,” “pillow on chair” — are all accurately represented. No positional errors (left/right/above/below) are present. All specified relationships hold true. VERDICT: PASS

PASS
A candle on top of a stack of books, the books next to a potted plant

Evaluator (Qwen 3.5 122B)

The image shows a lit candle placed directly on top of a stack of four books. To the left of the book stack is a potted plant with green leaves. The spatial relationships described — “candle on top of books” and “books next to potted plant” — are accurately represented. All positional descriptors (on, next to) match the visual layout. VERDICT: PASS

Hard 67%
FAIL
Five chess pieces on a board: a king in the center, a queen to its left, a bishop to its right, a rook above the king, and a knight below the king

Evaluator (Qwen 3.5 122B)

The image shows five chess pieces on a board, but the spatial relationships do not match the prompt. The king is central, but there is no queen to its left — instead, a pawn is there. To the king’s right is a dark bishop, which matches. There is no rook above the king; instead, a knight is in front (below) the king. The prompt’s “rook above” and “queen left” are incorrect. VERDICT: FAIL

PASS
A desk scene: laptop in the center, coffee mug to the left, phone to the right, lamp above, and notebook below

Evaluator (Qwen 3.5 122B)

The image shows a desk scene with a laptop centered, a coffee mug to its left, a phone to its right, a lamp positioned above and to the left of the laptop, and an open notebook below the laptop in the foreground. All specified spatial relationships — left, right, above, below — are accurately represented as described in the prompt. VERDICT: PASS

PASS
Five animals in a field: a horse on the far left, a cow to the right of the horse, a sheep in the center, a goat to the right of the sheep, and a dog on the far right

Evaluator (Qwen 3.5 122B)

The image shows five animals in a field arranged left to right: a brown horse, a black-and-white cow, a white sheep, a white goat, and a tricolor dog. The spatial relationships match the prompt exactly — horse on far left, cow to its right, sheep centered, goat to sheep’s right, dog on far right. All positional descriptors are accurate. VERDICT: PASS

Extreme 67%
FAIL
A landscape with a red barn in the foreground at large scale, a windmill in the midground at medium scale, and snow-capped mountains in the background at small scale, all with consistent depth perspective

Evaluator (Qwen 3.5 122B)

The image shows a red barn in the foreground, large and prominent. A windmill stands behind it in the midground, smaller in scale. Snow-capped mountains dominate the background, appearing distant and large due to perspective — not “small scale” as specified. The prompt incorrectly describes mountain scale; they are visually dominant, not small. Spatial relationships (foreground/midground/background) are correct, but the “small scale” descriptor for mountains is inaccurate. VERDICT: FAIL

PASS
A beach scene with a sandcastle in the foreground, a surfer on the waves in the midground, and a cruise ship on the horizon in the background, sizes decreasing with distance

Evaluator (Qwen 3.5 122B)

The image correctly depicts a beach scene with a sandcastle in the foreground, a surfer riding waves in the midground, and a cruise ship on the horizon in the background. The sizes of the objects decrease appropriately with distance, maintaining proper spatial relationships as described. All positional relationships (on, under, left, right, above, below) are accurate. VERDICT: PASS

PASS
A city park with a park bench and pigeons in the foreground, a fountain in the midground, and skyscrapers in the background, each layer progressively smaller

Evaluator (Qwen 3.5 122B)

The image shows a park bench and pigeons in the foreground, a fountain in the midground, and skyscrapers in the background — matching the described spatial layers. However, the prompt specifies “each layer progressively smaller,” which is not visually accurate: the fountain appears larger than the distant skyscrapers due to perspective, but the skyscrapers are actually much larger in reality; the phrase likely intends depth-based scaling, which is correctly rendered. All positional relationships
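Some evaluator responses in this listing are cut off before their closing verdict, so the verdict cannot always be read from the response text alone. The following is a minimal sketch of how a verdict could be pulled out of a response, assuming responses end with `VERDICT: PASS` or `VERDICT: FAIL` as the complete ones here do. The helper name `extract_verdict` is hypothetical; the benchmark's own parsing code is not shown in this report.

```python
import re
from typing import Optional

# Hypothetical pattern: matches the closing marker used by the judge responses in this listing.
_VERDICT_RE = re.compile(r"VERDICT:\s*(PASS|FAIL)\b")


def extract_verdict(evaluator_text: str) -> Optional[str]:
    """Return 'PASS' or 'FAIL' from the last verdict marker, or None if none is present."""
    matches = _VERDICT_RE.findall(evaluator_text)
    return matches[-1] if matches else None


complete = "All specified relationships hold true. VERDICT: PASS"
cut_off = "which is correctly rendered. All positional relationships"

print(extract_verdict(complete))  # PASS
print(extract_verdict(cut_off))   # None
```

Returning `None` for a cut-off response, rather than guessing, keeps a missing verdict distinguishable from a FAIL.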

Scale & Proportions 78%

Scale & Proportions measures whether local/sefi-image-5b-rl keeps people, vehicles, buildings, and everyday objects at believable relative sizes.

Easy 67%
PASS
A person standing next to a yellow school bus, showing realistic relative scale

Evaluator (Qwen 3.5 122B)

The image shows a boy standing in front of a yellow school bus. The relative scale appears realistic — the boy is appropriately sized compared to the bus, neither too large nor too small. His height relative to the bus windows and door aligns with real-world proportions. No obvious distortions or scaling errors are present. VERDICT: PASS

FAIL
A ladybug sitting on a large green leaf, shown at realistic proportions

Evaluator (Qwen 3.5 122B)

The ladybug appears disproportionately large relative to the leaf veins and surface texture. In reality, a ladybug is small enough that leaf veins would appear much larger in comparison. The insect’s scale suggests it’s unnaturally oversized for a typical leaf, breaking realistic proportion expectations. This inconsistency violates the prompt’s requirement for “realistic proportions.” VERDICT: FAIL

PASS
A child holding a basketball, both at correct relative sizes

Evaluator (Qwen 3.5 122B)

The child’s body proportions and the basketball size appear realistic and consistent with a typical young child holding a standard-sized basketball. The ball fits naturally in the hand, and limb-to-torso ratios are plausible. No obvious distortions or scaling errors are present. VERDICT: PASS

Medium 100%
PASS
A tiny person standing on the rim of a coffee cup, looking down into it

Evaluator (Qwen 3.5 122B)

The image shows a miniature figurine of a person standing on the rim of a white coffee cup filled with dark liquid. The scale is consistent with the prompt: the person is tiny relative to the cup, which appears normal-sized. The proportions between the figurine and the cup are believable for a “tiny person” concept — the figure fits naturally on the rim without appearing oversized or undersized. Lighting and perspective support the illusion of scale. VERDICT: PASS

PASS
A giant tabby cat walking between city skyscrapers like a kaiju

Evaluator (Qwen 3.5 122B)

The image depicts a massive tabby cat towering over skyscrapers, consistent with the "kaiju" description. The cat’s paws are larger than entire building sections, and its head reaches mid-rise levels, correctly establishing the intended giant scale. Human figures and street elements appear tiny in comparison, reinforcing the size relationship. Lighting and perspective support the surreal proportionality without visual contradictions. VERDICT: PASS

PASS
A miniature house sitting on the palm of someone's hand

Evaluator (Qwen 3.5 122B)

The image shows a miniature house resting on an open human palm. The house is appropriately scaled to fit comfortably within the hand, with proportions that are realistic for a small model or toy house. The fingers and palm appear natural in size relative to the house, and there are no distortions or inconsistencies in the spatial relationship between the objects. VERDICT: PASS

Hard 67%
FAIL
A group photo of an elephant, a horse, a dog, a cat, and a mouse, all standing in a line with correct real-world proportional sizes

Evaluator (Qwen 3.5 122B)

The image shows an elephant, horse, dog, two cats, and a mouse arranged in a line. The relative sizes are mostly accurate: elephant > horse > dog > cats > mouse. However, there are two cats instead of one as specified, and the second cat is nearly as large as the dog — which is unrealistic for typical domestic cats. This breaks the “correct real-world proportional sizes” requirement. VERDICT: FAIL

PASS
A dinner table scene with a plate, fork, knife, wine glass, salt shaker, and peppercorn, all at realistic relative sizes

Evaluator (Qwen 3.5 122B)

The image displays a table setting with a plate, fork, knife, wine glass, salt shaker, and pepper shaker. The relative sizes appear realistic: the plate is appropriately sized for a dinner setting, the cutlery is proportional to the plate, the wine glass is standard height relative to the other items, and the shakers are small but correctly scaled. No obvious distortions or size inconsistencies are present. VERDICT: PASS

PASS
A farm scene with a barn, tractor, farmer, cow, chicken, and a fence post, all at correct proportional scale

Evaluator (Qwen 3.5 122B)

The image displays a farm scene with a barn, tractor, farmer, cow, chicken, and fence post. The relative sizes are generally consistent with real-world expectations: the barn is the largest structure, the tractor is substantial, the farmer is human-sized, the cow is appropriately large for its species, the chicken is small, and the fence post is of standard height. No obvious scaling errors are present. VERDICT: PASS

Human realism 81%

Faces & Expressions 92%

Faces & Expressions measures whether local/sefi-image-5b-rl creates realistic faces with the requested age, emotion, expression, and identity cues.

Easy 100%
PASS
A close-up portrait of a young woman with a warm natural smile

Evaluator (Qwen 3 VL)

Face is symmetric with natural, correctly proportioned features. Warm, genuine smile is clearly readable, conveying happiness. No significant distortions; eyes, nose, and mouth align naturally. Skin texture and lighting appear realistic. Slight over-smoothness in skin is minor but not distracting. VERDICT: PASS

PASS
A portrait of a middle-aged man with a calm, serious expression

Evaluator (Qwen 3 VL)

Face is symmetric with natural, correctly proportioned features. Expression is calm and serious, with subtle brow lines and neutral mouth. No visible distortions. Skin texture and lighting are realistic. Matches prompt description accurately. VERDICT: PASS

PASS
A headshot of a teenager with a relaxed, neutral face

Evaluator (Qwen 3 VL)

Face is symmetric with correctly proportioned features. Expression is relaxed and neutral, with no strong emotion. Slight asymmetry in eye gaze and minor lighting shadows, but no significant distortions. Skin texture and hair look natural. Meets prompt requirements. VERDICT: PASS

Medium 100%
PASS
A close-up of a person's face showing clear disgust, nose wrinkled and lip curled

Evaluator (Qwen 3 VL)

The face shows strong, symmetrical disgust: nose wrinkled, lips curled, and eyes squinted. Features are anatomically plausible and well-defined. No significant distortions or asymmetries. The emotion is unmistakably clear and matches the prompt precisely. VERDICT: PASS

PASS
A portrait capturing genuine surprise — raised eyebrows, wide eyes, slightly open mouth

Evaluator (Qwen 3 VL)

Face is symmetric with correctly rendered features. Eyes are wide, eyebrows raised, and mouth slightly open — clearly conveying surprise. No significant distortions; anatomy is plausible. The expression is exaggerated but readable and aligns with the prompt. VERDICT: PASS

PASS
A face showing quiet melancholy — downcast eyes, slightly drawn mouth, distant gaze

Evaluator (Qwen 3 VL)

Face is symmetric with natural, correctly rendered features. Eyes are downcast, lips slightly drawn, gaze distant — all aligning with the prompt’s melancholy description. No visible distortions. Lighting and composition enhance the somber mood. VERDICT: PASS

Hard 100%
PASS
An elderly woman in her 80s with deep wrinkles and white hair, smiling warmly with kind eyes

Evaluator (Qwen 3 VL)

Face is symmetric with natural, aged features and deep wrinkles. White hair is present. Warm, genuine smile with crinkled eyes conveys kindness. No significant distortions. Facial structure and expression align well with the prompt. VERDICT: PASS

PASS
A 5-year-old child with round cheeks and large eyes, crying with tears streaming down their face

Evaluator (Qwen 3 VL)

The child’s face is symmetric with natural features: round cheeks, large eyes, and a tear streaming down the right cheek. The crying expression is clearly readable, with furrowed brows, an open mouth, and visible tears. No significant distortions are present. The image accurately matches the prompt’s description. VERDICT: PASS

PASS
An elderly man in his 70s with age spots and thinning gray hair, looking skeptically over reading glasses

Evaluator (Qwen 3 VL)

Face is symmetric with natural wrinkles, age spots, and thinning gray hair. Skeptical expression is clearly readable via furrowed brow and downturned mouth. No distortions; features are anatomically accurate. Glasses are correctly placed. All prompt elements are faithfully rendered. VERDICT: PASS

Extreme 67%
PASS
A group of three friends at a table: one laughing with head tilted back, one looking shocked with wide eyes, and one rolling their eyes with a smirk

Evaluator (Qwen 3 VL)

Faces show clear, symmetrical features with no anatomical distortions. Emotions are vividly readable: left person laughs with head back, center looks shocked with wide eyes, right person rolls eyes with smirk. Expressions are exaggerated but natural and intentional. Lighting and composition support clarity. All three faces meet the prompt’s emotional requirements. VERDICT: PASS

FAIL
A family photo with four people: a father beaming with pride, a mother smiling gently, a teenager looking bored, and a child grinning excitedly

Evaluator (Qwen 3 VL)

Faces show mostly symmetrical features with no major distortions. The father beams proudly, the child grins excitedly, but the two women display neutral or slightly displeased expressions—not the “gentle smile” or “bored” teen intended. The teen is not clearly identifiable as such. Emotions are partially misaligned with prompt. VERDICT: FAIL

PASS
Three coworkers in a meeting: one looking frustrated with furrowed brows, one nodding with a pleased expression, and one looking confused with a tilted head

Evaluator (Qwen 3 VL)

Faces are symmetric with correct anatomical features. Emotions are clearly readable: left man shows frustration (furrowed brows, grimace), right man shows confusion (tilted head, furrowed brow), center woman shows pleasure (smile, hand to cheek). No distortions or artifacts. All three expressions match the prompt. VERDICT: PASS

Full Body 100%

Full Body measures whether local/sefi-image-5b-rl renders full-body people with natural anatomy, clothing, pose, and proportions.

Easy 100%
PASS
A woman standing upright in a relaxed pose, arms at her sides, wearing jeans and a white t-shirt

Evaluator (Qwen 3 VL)

The woman stands upright with arms at her sides, wearing jeans and a white t-shirt as described. Limb count and proportions are anatomically correct. Pose is relaxed and recognizable. No distortions observed. Clothing and posture match the prompt precisely. VERDICT: PASS

PASS
A man standing straight facing the camera, hands in pockets, normal proportions

Evaluator (Qwen 3 VL)

The man stands straight, facing forward, with hands in pockets. Limb count and proportions are anatomically correct. The pose is clearly recognizable as described. No distortions are visible. All elements match the prompt precisely. VERDICT: PASS

PASS
A person standing in a neutral pose in a park, full body visible from head to feet

Evaluator (Qwen 3 VL)

The person’s limbs are correctly counted and proportioned. The neutral standing pose is anatomically accurate and recognizable. No distortions are visible. The full body from head to feet is clearly shown in a park setting. All prompt requirements are met. VERDICT: PASS

Medium 100%
PASS
A woman mid-stride running through a park, hair and clothes showing motion

Evaluator (Qwen 3 VL)

Anatomically correct limb count and proportions. Pose matches running mid-stride with dynamic motion blur in hair and clothing. No distortions. Action is clearly recognizable as running through a park. All elements align with the prompt. VERDICT: PASS

PASS
A man captured mid-jump over a puddle, body in the air with bent knees

Evaluator (Qwen 3 VL)

The man’s limbs are correctly counted and proportioned. His pose—bent knees, arms forward, mid-air over a puddle with splash—is anatomically plausible and matches the prompt. No distortions are visible. The action is clearly recognizable as jumping over water. VERDICT: PASS

PASS
A dancer in a dynamic spin, dress flowing outward with one leg extended

Evaluator (Qwen 3 VL)

The dancer’s limbs are correctly proportioned and anatomically plausible. The pose—dynamic spin with one leg extended and dress flowing—is clearly recognizable and executed with balletic grace. No distortions in limb count or structure are visible. The motion and fabric dynamics are convincingly rendered. VERDICT: PASS

Hard 100%
PASS
A person in a yoga tree pose — standing on one leg with the other foot pressed to the inner thigh, arms raised overhead

Evaluator (Qwen 3 VL)

The person stands on one leg, with the other foot pressed against the inner thigh, arms raised overhead in prayer position. Limb count and proportions are accurate. The pose is clearly recognizable as Tree Pose (Vrksasana). No anatomical distortions are visible. The execution is anatomically correct and matches the prompt. VERDICT: PASS

PASS
A rock climber clinging to a cliff face with arms and legs extended to different holds

Evaluator (Qwen 3 VL)

The climber has two arms and two legs, with proportions and joint angles consistent with human anatomy. The pose—arms and legs extended to different holds—is clearly recognizable as a dynamic climbing position. No distortions are evident; limbs are correctly articulated and positioned for the action. The scene matches the prompt accurately. VERDICT: PASS

PASS
A cellist seated with correct posture, legs around the instrument, bow arm extended

Evaluator (Qwen 3 VL)

The cellist has correct limb count and proportions. The pose is anatomically accurate: seated, legs encircling the cello, bow arm extended. The instrument rests on a stand, not between legs as implied by “legs around the instrument.” The action is recognizable as cello playing. Minor distortion: the bow appears slightly misaligned with the strings. VERDICT: PASS

Extreme 100%
PASS
Two tango dancers in a close embrace, the man dipping the woman backward with intertwined legs

Evaluator (Qwen 3 VL)

The dancers’ limbs are correctly proportioned and anatomically plausible. The woman is dipped backward with intertwined legs, matching the prompt. No distortions are visible; the pose is recognizable as a classic tango lift. All details align with the requested action. VERDICT: PASS

PASS
Two soccer players mid-tackle, one sliding and the other jumping over with the ball at their feet

Evaluator (Qwen 3 VL)

Both players show correct limb count and proportions. The sliding player’s body posture and the jumping player’s mid-air pose are anatomically plausible and match the described action. The ball is at the jumping player’s feet, and the motion is recognizable as a tackle. No significant distortions are visible. VERDICT: PASS

PASS
Two gymnasts performing a synchronized handstand, side by side with identical body alignment

Evaluator (Qwen 3 VL)

Both gymnasts show correct limb count and proportions. Their synchronized handstand pose is anatomically accurate, with straight legs, aligned torsos, and stable hand placement. The action is clearly recognizable as a handstand. No distortions are visible. VERDICT: PASS

Hands 50%

Hands measures whether local/sefi-image-5b-rl draws anatomically plausible hands, fingers, and gestures.

Easy 33%
FAIL
A person standing with both hands visible at their sides, fingers relaxed and naturally spread

Evaluator (Qwen 3.5 122B)

Upon close examination, the left hand (viewer’s left) has 5 fingers, but the right hand (viewer’s right) appears to have 6 fingers — an extra digit is visible between the ring and pinky fingers. This violates anatomical correctness and the prompt’s implied natural human form. The gesture otherwise looks relaxed, but the finger count anomaly is critical. VERDICT: FAIL

PASS
A woman holding her hands out in front of her palms-up, all ten fingers clearly visible

Evaluator (Qwen 3.5 122B)

The image shows a woman with both hands extended forward, palms facing up. Each hand clearly displays five fingers — left hand has thumb, index, middle, ring, and pinky; right hand mirrors this. All ten fingers are visible and anatomically correct in positioning and proportion. No anomalies detected. Gesture is natural and consistent with the prompt. VERDICT: PASS

FAIL
A man with arms crossed, both hands resting on opposite upper arms with fingers visible

Evaluator (Qwen 3.5 122B)

The left hand (on the right side of the image) has 5 fingers, but the right hand (on the left side) has 6 fingers — an anatomical anomaly. The extra finger is clearly visible between the thumb and index finger. This violates the prompt’s requirement for anatomical correctness and accurate finger count. While the gesture is otherwise plausible, the supernumerary finger constitutes a critical failure in realism. VERDICT: FAIL

Medium 33%
PASS
A person holding a white ceramic coffee mug by the handle with their right hand

Evaluator (Qwen 3.5 122B)

The hand gripping the mug shows anatomical correctness: thumb on top of the handle, four fingers wrapped around the opposite side — totaling five visible digits. No extra or missing fingers. Grip posture is natural and consistent with holding a mug. No anomalies detected. VERDICT: PASS

FAIL
A woman writing in a notebook with a pen gripped correctly in her right hand

Evaluator (Qwen 3.5 122B)

The right hand gripping the pen shows 5 fingers, but the thumb and index finger appear fused or malformed at the joint. The left hand resting on the page also has 5 fingers, but they are unnaturally elongated and slightly misaligned. While the overall gesture is recognizable, anatomical inaccuracies in finger structure and grip precision indicate AI generation artifacts. VERDICT: FAIL

FAIL
A person playing an acoustic guitar with left hand on the fretboard and right hand strumming

Evaluator (Qwen 3.5 122B)

The right hand (strumming) shows only four fingers clearly, with the thumb partially visible — but the index finger appears fused or malformed near the strings. The left hand (fretting) has five fingers, but the thumb is positioned unnaturally behind the neck, and the pinky seems to merge with the ring finger. Grip is not anatomically correct; anomalies include fused digits and improper thumb placement. VERDICT: FAIL

Hard 67%
PASS
A person making a peace sign with their right hand, clearly showing two fingers up and three curled

Evaluator (Qwen 3.5 122B)

The image shows a single right hand making a peace sign. Two fingers (index and middle) are extended upward, while the thumb, ring, and pinky fingers are curled inward — totaling five fingers visible and anatomically consistent. No anomalies detected; gesture is correct. VERDICT: PASS

PASS
A hand giving a thumbs-up with four fingers curled and thumb extended upward

Evaluator (Qwen 3.5 122B)

The image shows a single hand making a thumbs-up gesture. The thumb is extended upward, and the other four fingers are curled into a fist — this matches the prompt’s description. Anatomically, the hand appears correct: five fingers total (one thumb + four fingers), with natural skin texture, creases, and proportions. No anomalies or extra/missing digits are visible. The gesture is standard and realistic. VERDICT: PASS

FAIL
A person counting to three on their fingers, with index, middle, and ring fingers extended

Evaluator (Qwen 3.5 122B)

The image shows a single hand with four fingers extended: index, middle, ring, and thumb — not three as specified in the prompt. The gesture is anatomically plausible for counting to four, but it fails the prompt’s requirement of “counting to three” with only index, middle, and ring fingers extended. Additionally, the thumb is also raised, making it four fingers total, not three. No anomalies in anatomy, but incorrect finger count per prompt. VERDICT: FAIL

Extreme 67%
PASS
Two people shaking hands in a professional setting, all fingers of both hands anatomically correct

Evaluator (Qwen 3.5 122B)

Upon close examination, the hand on the left (wearing white cuff) shows 5 fingers clearly. The hand on the right (blue cuff) also displays 5 fingers with natural positioning and grip. No extra, missing, or malformed fingers are visible. The handshake gesture appears anatomically correct and realistic. VERDICT: PASS

FAIL
Two people doing a high-five, palms meeting with all fingers extended on both hands

Evaluator (Qwen 3.5 122B)

Upon close examination, the left hand (on the right side of the image) appears to have 5 fingers. However, the right hand (on the left side of the image) shows only 4 fingers — the thumb is missing or obscured in a way that breaks anatomical correctness for a standard high-five. Additionally, the fingers are not fully extended or properly aligned for a natural high-five gesture; they appear fused or distorted at the contact point. This violates the prompt’s requirement for “all fingers extended

PASS
One person handing a small gift box to another person, both sets of hands gripping the box correctly

Evaluator (Qwen 3.5 122B)

The image shows four hands interacting with a gift box. The left person’s hands (wearing gray sleeves) each show 5 fingers — thumbs and four fingers visible, gripping the box naturally. The right person’s hands (brown sweater) also show 5 fingers per hand — thumbs and fingers positioned correctly to receive or hold the box. No extra, missing, or fused fingers; no anatomical distortions. Grip is realistic and consistent with human anatomy. VERDICT: PASS

Multi-Subject 83%

Multi-Subject measures whether local/sefi-image-5b-rl keeps multiple people distinct while preserving their requested attributes, outfits, and positions.

Easy 100%
PASS
Two women standing side by side: one with short blonde hair wearing a red jacket, the other with long black hair wearing a blue dress

Evaluator (Qwen 3 VL)

Left woman: short blonde hair, wearing red jacket. Right woman: long black hair, wearing blue dress. Both are visually distinct and match descriptions exactly. Features are clear and accurate. VERDICT: PASS

PASS
Two men at a cafe: one with a beard and glasses wearing a green sweater, the other clean-shaven in a black hoodie

Evaluator (Qwen 3 VL)

Person 1: Bearded, wears black-rimmed glasses, green sweater. Person 2: Clean-shaven, short buzz cut, black hoodie. Both visually distinct and match descriptions perfectly. All features accurately rendered. VERDICT: PASS

PASS
Two friends posing: one tall with curly red hair in denim overalls, the other short with straight brown hair in a yellow sundress

Evaluator (Qwen 3 VL)

Tall person: curly red hair, denim overalls over a cream shirt. Short person: straight brown hair, yellow sundress. Both individuals are visually distinct and match described features. All prompt elements are accurately rendered. VERDICT: PASS

Medium 67%
PASS
Four people at a dinner table: a bald man in a suit, a woman with silver hair in a red blouse, a young man with dreadlocks in a denim jacket, and a woman with a hijab in a green dress

Evaluator (Qwen 3 VL)

All four individuals are present and visually distinct. Left: bald man in suit. Second: woman with silver hair in red blouse. Third: young man with dreadlocks in denim jacket. Right: woman with hijab in green dress. Features match prompt exactly. VERDICT: PASS

FAIL
Three musicians on stage: a tall woman with short pink hair playing bass, a man with a beard and bandana on drums, and a slim man with glasses playing keyboard

Evaluator (Qwen 3 VL)

Tall woman: pink short hair, playing bass — matches. Man with beard: behind drums, no visible bandana — mismatch. Slim man: glasses, playing keyboard — matches. Drummer’s bandana is absent. Core prompt requirement (bandana) not met. VERDICT: FAIL

PASS
Three hikers on a trail: a man with a backpack and sunhat, a woman with braided hair and hiking poles, and a teenager with a baseball cap and water bottle

Evaluator (Qwen 3 VL)

Man: wears sunhat, backpack, hiking poles. Woman: braided hair, backpack, hiking poles. Teenager: baseball cap, water bottle, backpack. All three are visually distinct and match described features. VERDICT: PASS
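The percentages reported for Human realism can be cross-checked against the individual verdicts. The sketch below assumes each score is the share of PASS verdicts pooled over prompts, rounded to the nearest whole percent. The report does not state its aggregation rule, and an unweighted mean of the four subcategory percentages would also round to 81% here, so this is only one consistent reading. The dictionary and function names are illustrative, and the verdict strings are transcribed by hand from the listings.

```python
# PASS (P) / FAIL (F) verdicts per difficulty tier, in listing order,
# transcribed from the Human realism section of this report.
HUMAN_REALISM = {
    "Faces & Expressions": {"Easy": "PPP", "Medium": "PPP", "Hard": "PPP", "Extreme": "PFP"},
    "Full Body": {"Easy": "PPP", "Medium": "PPP", "Hard": "PPP", "Extreme": "PPP"},
    "Hands": {"Easy": "FPF", "Medium": "PFF", "Hard": "PPF", "Extreme": "PFP"},
    "Multi-Subject": {"Easy": "PPP", "Medium": "PFP"},
}


def pass_rate(verdicts: str) -> int:
    """Share of 'P' entries as a whole-number percentage (assumed aggregation rule)."""
    return round(100 * verdicts.count("P") / len(verdicts))


def subcategory_rate(tiers: dict) -> int:
    """Pool every verdict in a subcategory, ignoring tier boundaries."""
    return pass_rate("".join(tiers.values()))


def category_rate(subcategories: dict) -> int:
    """Pool every verdict across all subcategories of a category."""
    pooled = "".join(v for tiers in subcategories.values() for v in tiers.values())
    return pass_rate(pooled)


for name, tiers in HUMAN_REALISM.items():
    tier_text = ", ".join(f"{tier} {pass_rate(v)}%" for tier, v in tiers.items())
    print(f"{name}: {subcategory_rate(tiers)}% ({tier_text})")
print(f"Human realism: {category_rate(HUMAN_REALISM)}%")
```

Under this assumption the pooled figures come out to 92%, 100%, 50%, and 83% for the four subcategories and 81% for the category (34 of 42 prompts passing), matching the reported values.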

Truthfulness 74%

Photorealism 100%

Photorealism measures whether local/sefi-image-5b-rl makes materials, surfaces, and real-world objects look visually plausible and distinct.

Medium 100%
PASS
A still life on a wooden table featuring a clear glass of water, a polished silver spoon, a folded linen napkin, and a ripe red apple — each material rendered with accurate texture and light behavior

Evaluator (Qwen 3.6 27B)

The image successfully renders the requested materials with high fidelity. The glass exhibits correct refraction and transparency, distorting the background wood grain. The silver spoon shows sharp, accurate specular highlights. The wooden table displays a natural grain texture, while the linen napkin captures soft fabric folds and matte light absorption. The apple’s skin reflects light realistically with subtle subsurface scattering. All visual properties align with physical reality. VERDICT: PASS

PASS
A workspace desk with a brushed aluminum laptop, a leather-bound notebook, a ceramic coffee mug, and a glass paperweight — all materials visually distinct and realistic

Evaluator (Qwen 3.6 27B)

The image successfully renders the requested materials with high fidelity. The laptop exhibits a convincing brushed aluminum texture with appropriate metallic reflections. The notebook displays realistic leather grain and stitching details. The coffee mug has a matte ceramic finish, and the paperweight shows accurate glass refraction and transparency. All materials are visually distinct and realistic. VERDICT: PASS

PASS
A kitchen counter with a stainless steel pot, a wooden cutting board, a glass bottle of olive oil, and a cotton dish towel — each surface reflecting light correctly

Evaluator (Qwen 3.6 27B)

The image successfully renders the distinct material properties requested. The stainless steel pot exhibits sharp, metallic reflections. The glass bottle shows appropriate translucency and refraction of the oil within, with a clear reflection on the counter. The wooden board displays a natural grain texture, and the cotton towel has a soft, fibrous appearance. The glossy countertop correctly reflects all objects, demonstrating accurate light interaction across different surfaces. VERDICT: PASS
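Each evaluator note on this page ends with a `VERDICT: PASS` or `VERDICT: FAIL` label, and some notes are cut off before the label appears. A minimal sketch of reading that label from a note is shown here; the function name `extract_verdict` is hypothetical and the pattern is inferred from the notes as displayed, not from the ImageBench tooling.

```python
import re
from typing import Optional

# Matches the trailing label seen in the evaluator notes, e.g. "VERDICT: PASS".
VERDICT_RE = re.compile(r"VERDICT:\s*(PASS|FAIL)\b")


def extract_verdict(note: str) -> Optional[str]:
    """Return 'PASS' or 'FAIL', or None when the note has no complete label."""
    matches = VERDICT_RE.findall(note)
    # Use the last match so a quoted label earlier in the note cannot win.
    return matches[-1] if matches else None
```

When the function returns `None`, the note is truncated and the PASS/FAIL label shown next to the prompt is the only verdict available.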

Physics & Reflections 83%

Physics & Reflections measures whether local/sefi-image-5b-rl follows basic physical cues such as shadows, reflections, gravity, transparency, and contact points.

Easy 100%
PASS
A red sphere on a white surface with a single shadow cast to the right, consistent with a light source from the upper left

Evaluator (Qwen 3.6 27B)

The shadow is cast to the right and slightly downward, which is consistent with a light source from the upper left. The shadow’s shape is an elongated oval, appropriately distorted by perspective and the angle of the light. The color is dark and realistic for a shadow on a white surface. The highlight on the sphere also aligns with the stated light direction. All elements are physically plausible. VERDICT: PASS

PASS
A coffee mug on a sunny windowsill casting a sharp shadow to the left

Evaluator (Qwen 3.6 27B)

The lighting originates from the upper right, consistent with sunlight entering through the window. The mug casts a distinct, sharp shadow to the left and slightly downward, which aligns perfectly with the light source direction. The shadow’s shape matches the mug’s silhouette, and the intensity gradient is natural. No distortions or color anomalies are present. The scene is physically coherent. VERDICT: PASS

PASS
A person standing outdoors at midday with a short shadow directly below them

Evaluator (Qwen 3.6 27B)

The shadow is cast directly downward, consistent with a midday sun position. The silhouette accurately reflects the person’s stance and clothing folds, with natural softness at the edges due to ambient light. No distortion or color anomalies are present. The shadow’s length and direction align with the prompt’s description of a short, directly below shadow. VERDICT: PASS

Medium 67%
FAIL
A woman standing in front of a full-length mirror, her reflection accurately showing the back of her outfit and reversed orientation

Evaluator (Qwen 3.6 27B)

The image depicts a woman facing a mirror, but the reflection shows her front view rather than the expected back view or a reversed orientation consistent with her actual position. The reflection appears to be a direct frontal portrait, which contradicts the physics of mirror reflections. Additionally, the lighting and shadows do not align with a realistic reflection scenario. VERDICT: FAIL

PASS
A cat sitting on a polished marble floor with its reflection visible below

Evaluator (Qwen 3.6 27B)

The reflection is vertically inverted and positioned directly beneath the cat, which is physically correct for a polished floor. The orientation matches the subject perfectly. The reflection exhibits appropriate blurring and reduced contrast compared to the real cat, simulating the optical properties of a glossy surface. The lighting and shadows are consistent with the reflection's presence. No distortions or color mismatches are evident. VERDICT: PASS

PASS
A vase of flowers on a glass table with a clear reflection visible on the table surface

Evaluator (Qwen 3.6 27B)

The reflection on the glass table is physically plausible. It correctly mirrors the vase and stems with appropriate vertical inversion and slight distortion due to the glass surface. The reflection’s position aligns with the vase’s base, and its clarity diminishes naturally toward the edges, consistent with real-world optics. No directional or color inconsistencies are present. VERDICT: PASS

Hard 67%
PASS
A row of trees reflected in a calm lake at sunset, with the reflection slightly darker and gently rippled

Evaluator (Qwen 3.6 27B)

The reflections are vertically aligned with the trees, correctly inverted, and exhibit gentle horizontal ripples consistent with a calm water surface. The color gradient of the sunset is mirrored accurately, with the reflection appearing slightly darker due to light absorption by the water. The distortion matches the subtle wave patterns without breaking the silhouette structure. All physical properties—direction, orientation, distortion, and color—are plausible. VERDICT: PASS

FAIL
A city skyline reflected in a rain puddle on a street, with mild distortion and color shift

Evaluator (Qwen 3.6 27B)

The reflection is physically implausible. In a puddle, the reflection should be vertically inverted (upside down) relative to the real objects. Here, the buildings appear upright within the puddle, which defies the laws of optics for a horizontal water surface. Additionally, the transition between the "puddle" and the surrounding asphalt is abrupt and lacks the expected refraction or edge distortion typical of water reflections. VERDICT: FAIL

PASS
A mountain range reflected in a still alpine lake, with the reflection inverted and subtly muted in color

Evaluator (Qwen 3.6 27B)

The reflection is perfectly symmetrical and inverted, matching the mountain range above. The water surface is glassy, creating a mirror-like effect with minimal distortion. The reflected colors are slightly darker and more muted than the direct sunlight on the peaks, which is physically accurate for reflections in deep water. The lighting direction is consistent between the real mountains and their reflection. VERDICT: PASS

Extreme 100%
PASS
A clear glass sphere sitting on a mirror surface near a pool of water, showing refraction through the sphere, reflection on the mirror, and water reflection simultaneously

Evaluator (Qwen 3.6 27B)

The glass sphere exhibits correct optical refraction, inverting the background scene within its volume. The reflection on the mirror surface is sharp and accurately mirrors the sphere’s position and lighting. The water in the background shows natural ripples and reflections consistent with a pool environment. Shadows and highlights align logically with a single light source. All elements—refraction, reflection, and shadow—are physically coherent and well-rendered. VERDICT: PASS

PASS
A chrome sculpture in a room with a glass floor and large windows, showing multiple reflections and environmental mapping on every surface

Evaluator (Qwen 3.6 27B)

The reflections are physically plausible. The chrome sculpture accurately mirrors the surrounding environment, including the windows and ceiling lights, with appropriate distortion based on its curved surface. The glass floor provides a clear, inverted reflection of the sculpture, consistent with the light source direction from the windows. The mirrored wall on the left also correctly reflects the sculpture and room elements. Shadows and highlights align with the lighting setup, enhancing realis

PASS
A crystal wine glass on a polished marble counter next to a window overlooking a lake — showing refraction in the glass, reflection on the marble, and the lake's own reflective surface

Evaluator (Qwen 3.6 27B)

The image displays physically plausible optics. The reflection on the marble counter is sharp and correctly oriented, mirroring the glass stem and base. The refraction through the glass bowl distorts the background lake view appropriately, showing inverted and magnified segments of the scenery. The lighting is consistent, with highlights on the glass rim and stem matching the ambient light source from the window. The lake’s surface also reflects the distant treeline, maintaining environmental co

World Knowledge 58%

World Knowledge measures whether local/sefi-image-5b-rl depicts recognizable places, landmarks, flags, and known objects with accurate real-world details.

Easy 100%
PASS
The Eiffel Tower in Paris on a clear day, architecturally accurate with its iron lattice structure and three levels

Evaluator (Qwen 3.6 27B)

The image depicts the Eiffel Tower with high architectural accuracy, showcasing the intricate iron lattice structure, the three distinct levels (two visible platforms and the top), and the characteristic arches at the base. The perspective and details align well with real-world photographs of the landmark. The background buildings and clear sky further enhance the realism. No significant inaccuracies are observed. VERDICT: PASS

PASS
The Taj Mahal in Agra seen from the reflecting pool, with accurate dome, minarets, and symmetry

Evaluator (Qwen 3.6 27B)

The image depicts the Taj Mahal with high architectural accuracy, including the central dome, four minarets, and symmetrical layout. The reflection in the pool is clear and well-rendered. Minor details like the smaller domes and arches are correctly placed. The overall composition matches real-world photographs of the monument. No significant inaccuracies are present. VERDICT: PASS

PASS
The Statue of Liberty on her island pedestal, with correct torch, crown, tablet, and green patina

Evaluator (Qwen 3.6 27B)

The image depicts the Statue of Liberty with high fidelity. The green patina, spiked crown, raised torch, and tablet are all present and correctly positioned. The pedestal architecture matches the real-world structure. The lighting and sky appear natural. There are no visible artifacts or distortions typical of AI generation. The subject is clearly recognizable and factually accurate. VERDICT: PASS

Medium 33%
FAIL
A traditional Japanese shamisen instrument with accurate three-stringed design, wooden body, and bachi plectrum

Evaluator (Qwen 3.6 27B)

The image depicts a shamisen, but it is not factually accurate. The headstock features four tuning pegs arranged in a cross shape, whereas a traditional shamisen has only three strings and three pegs. Additionally, the bridge and tailpiece assembly appears structurally incorrect for a shamisen. While the round body and general silhouette are recognizable, these specific mechanical inaccuracies make it fail the "factually accurate" criterion. VERDICT: FAIL

FAIL
A medieval astrolabe with accurate rete, tympan, and engraved degree markings on brass

Evaluator (Qwen 3.6 27B)

The image depicts a brass navigational compass (specifically a mariner’s compass with a compass rose), not an astrolabe. An astrolabe is a complex astronomical instrument with a mater, tympan, and rete used for solving problems related to time and the position of the stars. This object lacks those specific components and functions. While the engraving quality is high, the subject matter is factually incorrect relative to the prompt. VERDICT: FAIL

PASS
A Japanese kintsugi bowl — cracked ceramic repaired with visible gold lacquer lines

Evaluator (Qwen 3.6 27B)

The image accurately depicts a Japanese kintsugi bowl. The subject is clearly recognizable as a ceramic vessel with visible cracks repaired using gold lacquer, which aligns perfectly with the traditional art form of kintsugi. The lighting, texture, and composition appear realistic and consistent with high-quality photography of such an object. There are no apparent inaccuracies in the depiction. VERDICT: PASS

Hard 33%
PASS
A detailed diagram of the water cycle showing evaporation from the ocean, condensation forming clouds, precipitation as rain, and collection in rivers flowing back to the sea

Evaluator (Qwen 3.6 27B)

The image is a stylized, cartoon-like diagram of the water cycle. It clearly shows evaporation (arrows rising from ocean/land), condensation (clouds forming), precipitation (rain falling), and collection (river flowing back to sea). While not photorealistic, it accurately represents the scientific process in a simplified, educational format. The components are recognizable and correctly labeled by visual cues. No factual or architectural inaccuracies are present for this type of illustrative dia

FAIL
A cross-section showing tectonic plates at a convergent boundary with subduction, a trench, volcanic arc, and layered mantle

Evaluator (Qwen 3.6 27B)

The image depicts a geological cross-section, but it fails to accurately represent the requested subject. While it shows a subduction-like process, the "tectonic plates" are rendered as abstract, textured layers rather than recognizable lithospheric structures. The "volcanic arc" is missing; instead, there is a single, stylized magma conduit. The mantle is shown as a uniform orange block, lacking the expected convection currents or layered structure. The overall style is artistic and abstract, n

FAIL
An anatomical illustration of the human heart showing four chambers, valves, aorta, pulmonary arteries, and veins with correct placement

Evaluator (Qwen 3.6 27B)

The image depicts a human heart, but it fails the accuracy criteria. The major vessels are incorrectly placed and colored: the aorta is shown as a red vessel arching over the heart, but it should emerge from the left ventricle and arch posteriorly, not sit atop the pulmonary trunk in this manner. The pulmonary arteries (blue) and veins are mislabeled or misplaced — for example, the large blue vessel entering the right atrium is likely meant to be the superior vena cava, but its connection and or

Extreme 67%
PASS
A Morpho menelaus butterfly with accurate iridescent blue dorsal wings, brown ventral wing pattern with eyespots, and correct body morphology

Evaluator (Qwen 3.6 27B)

The image depicts a Morpho butterfly with the characteristic iridescent blue dorsal wings and brown ventral patterns with eyespots. The body morphology, including antennae and wing venation, appears anatomically correct for the species. The coloration and pattern distribution are consistent with real-world Morpho menelaus specimens. No significant inaccuracies are present. VERDICT: PASS

PASS
The Hagia Sophia interior showing the massive central dome, pendentives, semi-domes, and Islamic calligraphy medallions alongside Byzantine mosaics

Evaluator (Qwen 3.6 27B)

The image accurately depicts the interior of the Hagia Sophia, showcasing its iconic central dome, semi-domes, and pendentives. The architectural details, including the windows at the base of the dome and the intricate Islamic calligraphy medallions, are correctly represented. The Byzantine mosaics are also visible, blending harmoniously with the Ottoman additions. The overall composition is factually and architecturally accurate. VERDICT: PASS

FAIL
A specimen of bismuth crystal showing accurate iridescent oxide layers, geometric hopper crystal formation, and characteristic staircase structure

Evaluator (Qwen 3.6 27B)

The image depicts a mineral specimen that is visually inconsistent with the prompt's request for bismuth. While it exhibits geometric hopper crystals and iridescence, the material appears translucent to transparent with internal inclusions, resembling fluorite or quartz rather than the opaque, metallic, and distinctly stepped structure of bismuth. The iridescence here is an optical effect within the crystal lattice, not the oxide layers characteristic of bismuth. VERDICT: FAIL
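The Truthfulness percentages above can be reproduced from the PASS/FAIL labels in this listing. The sketch below transcribes those labels and computes a rounded pass rate per subcategory and for the pooled set. Pooling all 27 verdicts is an inference from the displayed numbers (it gives 74%), not a documented ImageBench formula, and the names `truthfulness` and `pass_rate` are hypothetical.

```python
from typing import Dict, List

# Verdicts transcribed from the Truthfulness listing (True = PASS),
# in the order Easy, Medium, Hard, Extreme where those tiers are shown.
truthfulness: Dict[str, List[bool]] = {
    "Photorealism": [True, True, True],
    "Physics & Reflections": [
        True, True, True,      # Easy
        False, True, True,     # Medium
        True, False, True,     # Hard
        True, True, True,      # Extreme
    ],
    "World Knowledge": [
        True, True, True,      # Easy
        False, False, True,    # Medium
        True, False, False,    # Hard
        True, True, False,     # Extreme
    ],
}


def pass_rate(verdicts: List[bool]) -> int:
    """Percentage of PASS verdicts, rounded to the nearest whole number."""
    return round(100 * sum(verdicts) / len(verdicts))


per_subcategory = {name: pass_rate(v) for name, v in truthfulness.items()}
pooled = pass_rate([v for verdicts in truthfulness.values() for v in verdicts])

print(per_subcategory)
print(pooled)
```

Averaging the three subcategory percentages instead of pooling the verdicts would give a different figure, which is why the pooled reading is the one sketched here.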

Professional Studio 93%

Camera & Lighting 92%

Camera & Lighting measures whether local/sefi-image-5b-rl follows specific lens, angle, focus, exposure, and lighting instructions.

Easy 67%
FAIL
A wide-angle shot of a modern living room interior showing expansive space with slight barrel distortion at the edges

Evaluator (Gemma 4 26B)

The image features a very wide-angle, fisheye-style perspective that creates extreme distortion. While the prompt requested "slight barrel distortion," the image displays extreme curvilinear distortion typical of a fisheye lens, where the edges are heavily bowed and the circular vignette is very pronounced. This exceeds the "slight" requirement. However, the depth of field and lighting (natural light from windows and soft interior glow) are consistent with a modern living room. VERDICT: FAIL

PASS
A close-up macro shot of a single dewdrop on a rose petal, shallow depth of field

Evaluator (Gemma 4 26B)

The image features a close-up macro shot of a pink flower petal with a prominent dewdrop. The perspective is a tight macro view, and the depth of field is shallow, with the background and foreground petals appearing softly blurred. The lighting is soft and even, consistent with a natural macro photography setting. The image successfully meets all the specified criteria. VERDICT: PASS

PASS
An aerial bird's-eye view looking straight down at a traffic roundabout with cars and trees

Evaluator (Gemma 4 26B)

The image perfectly matches the prompt. It features a true aerial bird's-eye view looking straight down at a traffic roundabout. The perspective is top-down, showing the circular road, cars, and surrounding trees from a vertical orientation. The lighting is even and consistent with a high-angle daylight shot, and the depth of field is deep, keeping the entire scene in focus from the center to the edges. VERDICT: PASS

Medium 100%
PASS
A portrait lit with classic Rembrandt lighting — triangle of light on the shadowed cheek, moody atmosphere

Evaluator (Gemma 4 26B)

The image successfully implements Rembrandt lighting. There is a distinct, characteristic triangle of light on the shadowed cheek (the subject's right side), which is the hallmark of this lighting style. The high contrast between the illuminated side of the face and the deep shadows creates the requested moody, dramatic atmosphere. The perspective is a standard portrait shot, and the lighting pattern perfectly matches the prompt's description. VERDICT: PASS

PASS
A landscape photograph bathed in golden hour light — long warm shadows, orange sky, soft glow

Evaluator (Gemma 4 26B)

The image perfectly matches the prompt. It features a landscape of sand dunes bathed in golden hour light. The sky is a vibrant orange/yellow hue with the sun low on the horizon, creating a soft glow. The lighting produces long, warm shadows across the dunes, and the depth of field captures the vastness of the desert landscape effectively. VERDICT: PASS

PASS
A high-key portrait of a person against a bright white background, minimal shadows, airy and luminous

Evaluator (Gemma 4 26B)

The image successfully implements a high-key portrait. The background is a bright, seamless white, and the lighting is extremely bright and airy, resulting in minimal shadows on the subject's face. The overexposed, luminous quality matches the prompt's requirement for a high-key aesthetic perfectly. While the image is very bright, it achieves the intended "luminous" and "airy" effect without losing the subject entirely. VERDICT: PASS

Hard 100%
PASS
A portrait shot at 85mm f/1.4 with creamy bokeh background, backlit with warm rim light separating the subject from a blurred garden

Evaluator (Gemma 4 26B)

The image perfectly matches the prompt. The portrait shot features a shallow depth of field consistent with an 85mm f/1.4 lens, creating a creamy, blurred garden background. The lighting is highly effective, featuring a strong warm rim light (backlighting) that highlights the subject's hair and shoulders, successfully separating her from the background. The golden hour glow and bokeh effects are well-executed. VERDICT: PASS

PASS
A street photograph with shallow depth of field as if shot at 50mm f/1.8, subject sharp with soft city lights in the background

Evaluator (Gemma 4 26B)

The image perfectly matches the prompt. The subject is sharp and in focus, while the background features a very shallow depth of field with soft, circular bokeh from city lights. This creates the intended 50mm f/1.8 look. The lighting is consistent with a street photograph at dusk/night, and the perspective is a standard portrait shot. VERDICT: PASS

PASS
A food photograph shot from above at 35mm f/2.8 with natural window light from the left, slight vignette, warm color temperature

Evaluator (Gemma 4 26B)

The image matches all the specified criteria. The perspective is a top-down shot (from above). The lighting is clearly coming from the window on the left, creating a natural light pattern. The color temperature is warm, and there is a noticeable vignette around the edges. The depth of field is shallow, consistent with an f/2.8 aperture, as the edges of the bowl and the window frame are slightly out of focus. VERDICT: PASS

Extreme 100%
PASS
A tilt-shift photograph of a city street making real-life look like a miniature model — selective focus strip, saturated colors

Evaluator (Gemma 4 26B)

The image perfectly captures the tilt-shift effect. It features a distinct selective focus strip (a narrow band of sharp detail in the middle ground) with heavy blur in the foreground and background, creating the "miniature model" illusion. The colors are highly saturated and vibrant, and the high-angle perspective effectively mimics the look of a small-scale diorama. VERDICT: PASS

PASS
A cinematic frame with blue and orange anamorphic lens flare streaking horizontally across a dark moody scene

Evaluator (Gemma 4 26B)

The image features a dark, moody scene with a prominent horizontal light streak. The streak transitions from a bright blue on the left to an orange/amber hue on the right, effectively capturing the "blue and orange" requirement. The light streaks horizontally across the frame as specified, creating a cinematic anamorphic lens flare effect against the dark background. VERDICT: PASS

PASS
A portrait with dramatic split lighting — half the face brightly lit, the other half in deep shadow, high contrast chiaroscuro

Evaluator (Gemma 4 26B)

The image perfectly matches the prompt's requirements. It features a close-up portrait with dramatic split lighting. One side of the subject's face is brightly illuminated, while the other side is cast in deep, heavy shadow, creating a high-contrast chiaroscuro effect against a black background. The lighting pattern is sharp and follows the vertical division of the face as described. VERDICT: PASS

Color Precision 100%

Color Precision measures whether local/sefi-image-5b-rl renders requested colors clearly and consistently across subjects, products, and backgrounds.

Easy 100%
PASS
A bright red sports car parked on a gray asphalt road — the red is vivid and unambiguous

Evaluator (Gemma 4 26B)

The image features a bright, vivid red sports car parked on a gray asphalt road. The red color of the car is intense, clear, and unambiguous, exactly as described in the prompt. The road is a standard gray asphalt. All color requirements are met perfectly. VERDICT: PASS

PASS
A woman wearing a royal blue evening dress against a neutral background

Evaluator (Gemma 4 26B)

The woman is wearing a vibrant, deep blue dress that perfectly matches the description of "royal blue." The background is a solid, light gray, which qualifies as a neutral background. All color requirements specified in the prompt are accurately reflected in the image. VERDICT: PASS

PASS
A field of pure sunflower yellow flowers under a clear blue sky

Evaluator (Gemma 4 26B)

The image features a vast field of sunflowers with bright, vibrant yellow petals. The sky above is a clear, solid blue. The colors perfectly match the prompt's requirement for "pure sunflower yellow flowers" and a "clear blue sky." VERDICT: PASS

Medium 100%
PASS
A bedroom wall painted in #FF6B35 (burnt orange) with white furniture and natural daylight

Evaluator (Gemma 4 26B)

The image features a bedroom with a prominent accent wall in a vibrant burnt orange color, which closely aligns with the hex code #FF6B35. The furniture, including the bed, nightstands, and lamps, is white as requested. The scene also depicts natural daylight hitting the wall and bed, creating bright highlights. All color and lighting requirements from the prompt are met. VERDICT: PASS

PASS
A forest scene where the foliage is precisely #2D5F2D (dark muted green) with brown tree trunks

Evaluator (Gemma 4 26B)

The image depicts a forest scene with tree trunks and foliage. The foliage consists of a dark, muted green color that aligns well with the hex code #2D5F2D (a deep, desaturated forest green). The tree trunks are a consistent brown color as requested. The overall color palette successfully adheres to the specific color requirements provided in the prompt. VERDICT: PASS

PASS
A room with walls in #4A90D9 (medium blue) with warm wooden floor and white trim

Evaluator (Gemma 4 26B)

The image accurately reflects the requested colors. The walls are a medium blue consistent with the hex code #4A90D9, the floor is a warm wood tone, and the trim is white. All specified color elements are present and correctly rendered. VERDICT: PASS

Hard 100%
PASS
A minimalist room with walls in #F5E6CC (warm cream), a sofa in #2C3E50 (dark navy), and accent pillows in #E74C3C (bright red)

Evaluator (Gemma 4 26B)

The image accurately reflects the requested colors. The walls are a warm cream/off-white color, consistent with #F5E6CC. The sofa is a dark navy blue, matching the #2C3E50 description. The accent pillows are a bright, saturated red, consistent with #E74C3C. All three color requirements are met precisely. VERDICT: PASS

PASS
A product flat-lay on a #1A1A2E (deep dark blue) background, with a #F0C27F (soft gold) watch and #FFFFFF (pure white) earbuds

Evaluator (Gemma 4 26B)

The image accurately reflects the requested colors. The background is a deep, dark navy blue consistent with #1A1A2E. The watch features a soft gold/champagne tone matching #F0C27F, and the earbuds are a clean, pure white (#FFFFFF). All three color components align with the specific hex values and descriptions provided in the prompt. VERDICT: PASS

PASS
A graphic design composition with background #FAFAFA (near-white), main shape #FF4500 (orange-red), and accent lines #333333 (dark charcoal)

Evaluator (Gemma 4 26B)

The image accurately reflects the requested color palette. The background is a near-white (#FAFAFA), the main shape is a vibrant orange-red (#FF4500), and the accent lines are a dark charcoal gray (#333333). The colors are consistent with the hex values provided in the prompt. VERDICT: PASS
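The judges describe colors as "consistent with" a hex code without reporting measured values. A reader who wants a numeric check could compare a pixel sampled from the image against the requested code, as sketched below. The sampled value and the tolerance of 30 per channel are hypothetical placeholders, and a per-channel sRGB difference is only one rough way to compare colors.

```python
from typing import Tuple

RGB = Tuple[int, int, int]


def hex_to_rgb(code: str) -> RGB:
    """Convert a '#RRGGBB' string to an (R, G, B) tuple of 0-255 integers."""
    digits = code.lstrip("#")
    if len(digits) != 6:
        raise ValueError(f"expected 6 hex digits, got {code!r}")
    return (int(digits[0:2], 16), int(digits[2:4], 16), int(digits[4:6], 16))


def max_channel_diff(a: RGB, b: RGB) -> int:
    """Largest absolute difference across the three channels."""
    return max(abs(x - y) for x, y in zip(a, b))


requested = hex_to_rgb("#2C3E50")   # dark navy sofa from the prompt
sampled = (50, 66, 90)              # hypothetical pixel read from the image
TOLERANCE = 30                      # hypothetical per-channel tolerance

print(requested, max_channel_diff(requested, sampled) <= TOLERANCE)
```

Lighting and shadows shift pixel values in a rendered scene, so any tolerance would need to be chosen per image rather than fixed.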

Extreme 100%
PASS
A smooth gradient on a wall going from #FF0000 (red) on the left to #0000FF (blue) on the right, transitioning evenly through purple

Evaluator (Gemma 4 26B)

The image shows a smooth horizontal gradient. On the far left, the color is a vibrant red (#FF0000). As it moves toward the right, it transitions through a deep purple/magenta middle section, ending in a bright blue (#0000FF) on the far right. The transition is even and follows the exact color specifications and hex values provided in the prompt. VERDICT: PASS

PASS
A sunset sky gradient on a flat surface from #FF6B35 (orange) at the bottom to #1B1464 (deep indigo) at the top

Evaluator (Gemma 4 26B)

The image displays a vertical gradient. The bottom color is a vibrant orange, which aligns well with the requested #FF6B35. The top color is a very dark, deep indigo/purple, which matches the description of #1B1464. The transition through the middle colors is smooth and consistent with a sunset sky gradient. VERDICT: PASS

PASS
A horizontal banner with a gradient from #00C9FF (cyan) on the left to #92FE9D (mint green) on the right

Evaluator (Gemma 4 26B)

The image features a horizontal banner with a smooth gradient. On the left side, the color is a bright cyan, which aligns with the specified #00C9FF. On the right side, the color transitions into a light mint green, which matches the description of #92FE9D. The gradient flows correctly from left to right as requested. VERDICT: PASS
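The gradient prompts above name two endpoint colors and ask for an even transition between them. One way to get reference values for a visual comparison is straight linear interpolation between the endpoints in sRGB, sketched below. Treating "even" as linear in sRGB is an assumption, and the helper names are hypothetical.

```python
from typing import Tuple

RGB = Tuple[int, int, int]


def hex_to_rgb(code: str) -> RGB:
    """Convert a '#RRGGBB' string to an (R, G, B) tuple of 0-255 integers."""
    digits = code.lstrip("#")
    return (int(digits[0:2], 16), int(digits[2:4], 16), int(digits[4:6], 16))


def gradient_color(start: RGB, end: RGB, t: float) -> RGB:
    """Linearly interpolate each channel; t=0 gives start, t=1 gives end."""
    if not 0.0 <= t <= 1.0:
        raise ValueError("t must be between 0 and 1")
    r, g, b = (round(s + (e - s) * t) for s, e in zip(start, end))
    return (r, g, b)


red = hex_to_rgb("#FF0000")
blue = hex_to_rgb("#0000FF")

# Reference colors at the left edge, middle, and right edge of the wall.
for t in (0.0, 0.5, 1.0):
    print(t, gradient_color(red, blue, t))
```

The midpoint has equal red and blue with no green, which is the purple the red-to-blue prompt asks the transition to pass through.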

Photorealism 67%

Photorealism measures whether local/sefi-image-5b-rl produces polished product and studio imagery with realistic materials, lighting, shadows, and finish.

Easy 67%
PASS
A single red sneaker on a clean white background with soft studio lighting and gentle shadow — product photography style

Evaluator (Gemma 4 26B)

The image successfully achieves the studio-quality look requested. The lighting is soft and consistent, creating a natural, gentle shadow beneath the shoe. The textures of the fabric and the sole appear realistic and high-resolution. There are no obvious AI artifacts, such as warped laces, nonsensical geometry, or blurred edges. The composition is clean, and the product looks like a genuine professional photograph. VERDICT: PASS

FAIL: A clear glass perfume bottle on a reflective black surface with studio lighting, no background distractions

Evaluator (Gemma 4 26B)

The image fails to meet the criteria for a high-quality product shot. While the lighting is decent, there are significant AI artifacts. Most notably, the straw/dip tube is physically impossible; it curves erratically through the liquid and exits the bottle in a way that defies physics and glass geometry. Additionally, the reflection on the surface is slightly mismatched with the bottle's base, and the glass edges lack the crispness of a real photograph. VERDICT: FAIL

PASS: A ripe green apple on a seamless white background with soft side lighting and subtle shadow — catalog style

Evaluator (Gemma 4 26B)

The image successfully captures a studio-quality look. The lighting is soft and directional, creating a realistic subtle shadow on the white background. The texture of the green apple skin, including the slight imperfections and the way light reflects off its surface, looks highly realistic. There are no obvious AI artifacts, such as warped shapes or unnatural textures, and it passes easily as a real photograph at first glance. VERDICT: PASS

Graphical design 63%

Data Visualisation 0%

Data Visualisation measures whether local/sefi-image-5b-rl produces charts with correct labels, proportions, colors, and readable values.

Easy 0%
FAIL: A bar chart showing monthly sales for January through June with values 12K, 18K, 15K, 22K, 19K, 25K — labeled axes, color-coded bars, and a title 'Monthly Sales 2025'

Evaluator (Qwen 3 VL)

The chart shows a bar graph titled "Monthly Sales 2025" with six bars. Values (12K, 18K, 15K, 22K, 19K, 25K) are correctly labeled atop bars. However, x-axis labels are incorrect (e.g., "Yey" for Feb, "Pes" for Mar, "Be" for Apr, "Uu" for May), and the y-axis scale is inconsistent (e.g., 128 between 100 and 142). The chart type matches, but critical labeling errors invalidate it. VERDICT: FAIL
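
One reading of the y-axis complaint above is that the tick values are not evenly spaced. A minimal sketch of that check follows, assuming a linear axis is what the prompt implies. The `reference` axis is hypothetical and is not taken from the benchmark.

```python
def is_evenly_spaced(ticks: list[float], tolerance: float = 1e-9) -> bool:
    """Return True when consecutive ticks differ by a constant step."""
    if len(ticks) < 3:
        return True  # zero or one interval is trivially uniform
    steps = [b - a for a, b in zip(ticks, ticks[1:])]
    return all(abs(step - steps[0]) <= tolerance for step in steps)


# Tick values quoted by the evaluator for the generated chart.
flagged = [100, 128, 142]

# Hypothetical linear axis (in thousands) covering the requested 12K..25K.
reference = [0, 5, 10, 15, 20, 25]

print(is_evenly_spaced(flagged))    # steps of 28 and 14 are not uniform
print(is_evenly_spaced(reference))  # constant step of 5
```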

FAIL: A pie chart showing market share: Company A 35%, Company B 25%, Company C 20%, Others 20% — each slice a different color with labels and percentages

Evaluator (Qwen 3 VL)

The chart misrepresents the data: Company A is 35% (correct), but Company B and Company C are both shown as 25% (should be 25% and 20% respectively). “Others” is incorrectly split into multiple segments, and the title “Company A Market Sharce” is misspelled. The chart type is a pie chart, but the data is inaccurate. VERDICT: FAIL
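
For reference, the requested shares sum to 100% and map to fixed slice angles. The sketch below works that arithmetic out. `slice_angles` is a hypothetical helper, and nothing here reflects how the judge actually measured the image.

```python
def slice_angles(shares: dict[str, float]) -> dict[str, float]:
    """Map percentage shares to pie-slice angles in degrees."""
    total = sum(shares.values())
    if total != 100:
        raise ValueError(f"shares sum to {total}, expected 100")
    return {label: 360 * pct / 100 for label, pct in shares.items()}


# Shares exactly as given in the prompt.
requested = {"Company A": 35, "Company B": 25, "Company C": 20, "Others": 20}
angles = slice_angles(requested)

for label, degrees in angles.items():
    print(f"{label}: {requested[label]}% -> {degrees:.0f} degrees")

# If B and C were both drawn as 25% next to A at 35%, only this much
# would remain for "Others", so the chart cannot match the prompt.
remaining_for_others = 100 - 35 - 25 - 25
```

A correct chart would therefore show Company B's slice as a quarter circle and Company C and Others as equal slices.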

FAIL: A simple line graph showing daily website visitors over one week (Mon-Sun): 1200, 1500, 1100, 1800, 2000, 1700, 1400 — with grid lines, axis labels, and title 'Weekly Traffic'

Evaluator (Qwen 3 VL)

The chart shows a line graph titled "Weekly Traffic" with grid lines and axis labels. However, the x-axis labels (Mon, 1500, 1100, etc.) are incorrect and misaligned with the data. The y-axis values are also mislabeled (e.g., 1800 appears twice). The data points do not match the requested sequence. The chart type is correct, but critical data and labels are erroneous. VERDICT: FAIL
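
The prompt pairs seven day labels with seven visitor counts, so data values should never appear as x-axis labels. A small sketch of the expected series and a label check is below. The expansion of "Mon-Sun" into individual day names is an assumption, and `unexpected_labels` is a hypothetical helper.

```python
# Assumed expansion of "Mon-Sun" from the prompt.
DAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]
VISITORS = [1200, 1500, 1100, 1800, 2000, 1700, 1400]

# Expected (day, visitors) points in plotting order.
points = list(zip(DAYS, VISITORS))
peak_day, peak_value = max(points, key=lambda p: p[1])
low_day, low_value = min(points, key=lambda p: p[1])


def unexpected_labels(labels: list[str], expected: list[str]) -> list[str]:
    """Return x-axis labels that are not in the expected label set."""
    return [label for label in labels if label not in expected]


# Labels quoted by the evaluator: data values leaked onto the x-axis.
print(unexpected_labels(["Mon", "1500", "1100"], DAYS))
```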

Layout & Design 44%

Layout & Design measures whether local/sefi-image-5b-rl organizes posters, menus, covers, and other designed compositions with the requested hierarchy and elements.

Easy 100%
PASS: A movie night poster with a large popcorn image in the center, the title 'MOVIE NIGHT' in bold at the top, and the tagline 'Every Friday at 8 PM' at the bottom

Evaluator (Qwen 3 VL)

The image features a large popcorn mound centered, with "MOVIE NIGHT" in bold black text at the top and "Every Friday at 8 PM" in white text at the bottom. Text is readable and correctly placed per the prompt. Visual hierarchy is clear: title dominates top, popcorn center, tagline bottom. All required layout elements are present and correctly positioned. VERDICT: PASS

PASS: An event poster featuring a guitar illustration, the headline 'LIVE MUSIC FEST' in large text, and 'Tickets available now' as a tagline below

Evaluator (Qwen 3 VL)

The image displays a guitar illustration centered vertically. "LIVE MUSIC FEST" appears in large, bold text across the guitar’s body, establishing clear visual hierarchy. Below, "Tickets available now" is legible and correctly positioned under the guitar. All required elements are present and correctly placed with appropriate contrast and scale. VERDICT: PASS

PASS: A book club poster with an open book illustration, the title 'JOIN THE BOOK CLUB' prominently displayed, and 'Meets every Wednesday' as the tagline

Evaluator (Qwen 3 VL)

The image features an open book illustration with "JOIN THE BOOK CLUB" prominently displayed in bold, centered text. Below, "Meets every Wednesday" is clearly visible in smaller font. All required elements are present with correct visual hierarchy: main title dominates, tagline is secondary and properly placed. Text is fully readable and correctly positioned. VERDICT: PASS

Medium 33%
PASS: A landing page design with a hero header image at the top, a headline and subtitle in the middle, a 'Sign Up' call-to-action button, and a footer with three social media icons

Evaluator (Qwen 3 VL)

The image shows a hero header image (person at beach), a headline (“Tooplierting the Faone”), a “Sign Up” button, and a footer with three social icons. Text is readable and correctly placed. Visual hierarchy is clear: header → headline → CTA → footer. All required layout zones are present and correctly structured. VERDICT: PASS

FAIL: A newsletter layout with a banner image header, a two-column body section with text on the left and an image on the right, a highlighted quote block, and a footer with unsubscribe link

Evaluator (Qwen 3 VL)

The image shows a newsletter layout with a header banner, two-column body (text left, image right), a highlighted quote block, and a footer unsubscribe link. However, all text is gibberish and unreadable, violating the requirement for readable, correctly placed text. The visual hierarchy is present but meaningless due to illegible content. VERDICT: FAIL

FAIL: A restaurant menu layout with a logo header, three category sections (appetizers, mains, desserts) each with items and prices, and a footer with address and hours

Evaluator (Qwen 3 VL)

The image shows a menu with a logo header, three category sections (Appetizers, Mains, Desserts), and a footer. However, the text is gibberish and unreadable, and the footer contains nonsensical placeholder text. The layout structure is present but the content fails to meet the prompt’s requirement for readable, correctly placed text. VERDICT: FAIL

Hard 0%
FAIL: A magazine double-page spread with a left page containing a large hero photo, and a right page with a two-column article, a pull quote, a sidebar with author bio, and a page number

Evaluator (Qwen 3 VL)

The left page features a large hero photo as required. The right page has a two-column article, a pull quote, and a sidebar with an author bio and page number. However, all text is gibberish placeholder, making it unreadable and violating the requirement for readable content. Layout structure is correct, but text fails the core criterion. VERDICT: FAIL

FAIL: A mobile app UI mockup showing a dashboard with a top navigation bar, a search field, a 2x2 grid of feature cards with icons, a list of recent activity items, and a bottom tab bar

Evaluator (Qwen 3 VL)

The image shows a mobile app UI with a top navigation bar (app icon + search field), a 2x2 grid of feature cards with icons, a recent activity list, and a bottom tab bar — all present. However, all text is garbled Chinese characters, making it unreadable and incorrectly placed. Visual hierarchy is intact, but illegible text violates core requirements. VERDICT: FAIL

FAIL: A fashion magazine spread with a full-bleed left-page photo, right page with three-column layout, headline, body copy, a smaller inset image, and styled page numbers

Evaluator (Qwen 3 VL)

The left page features a full-bleed photo as required. The right page has a three-column layout with a headline, body copy, and a smaller inset image. Page numbers are styled correctly. However, all text is gibberish, unreadable placeholder text, violating the requirement for readable content. Layout structure is correct, but text fails the core criterion. VERDICT: FAIL

Style Diversity 92%

Style Diversity measures whether local/sefi-image-5b-rl can shift between visual styles like oil painting, pixel art, watercolor, and technical illustration.

Easy 67%
PASS: A golden retriever sitting in a garden, painted in the style of an oil painting with visible brushstrokes and rich texture

Evaluator (Qwen 3.5 122B)

The image clearly depicts a golden retriever sitting in a garden setting. The artistic style strongly resembles an oil painting, with visible, textured brushstrokes throughout the composition. The fur, foliage, and background all exhibit the characteristic impasto technique and color blending typical of oil paintings. The rich texture and painterly quality are evident across the entire image, fulfilling the requested style requirements. VERDICT: PASS

PASS: A golden retriever sitting in a garden, rendered in Japanese anime style with flat colors and large expressive eyes

Evaluator (Qwen 3.5 122B)

The image clearly exhibits the requested Japanese anime style. Key characteristics are present: the golden retriever has large, expressive eyes with prominent highlights, simplified facial features, and flat color shading with minimal gradients. The background garden is rendered with stylized, soft-focus flowers and foliage typical of anime backgrounds. The overall aesthetic matches the prompt’s description precisely. VERDICT: PASS

FAIL: A golden retriever sitting in a garden, rendered in pixel art style with a 32x32 grid and limited color palette

Evaluator (Qwen 3.5 122B)

The image clearly depicts a golden retriever in a garden setting. The pixel art style is unmistakable, with visible square pixels forming the entire composition. However, the grid is significantly larger than the requested 32x32 — it appears to be at least 64x64 or higher, given the level of detail and pixel count. While the limited color palette is present, the grid size requirement is not met. VERDICT: FAIL
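
The grid-size judgment above can in principle be checked mechanically on a clean image. The sketch below estimates the logical grid of pixel art by finding the largest square block size at which every block is a single color. It is an illustration on synthetic data only: the benchmark's verdict came from a VLM judge, and real generated images have noise and anti-aliasing that would defeat this exact-equality test. `logical_grid_size` and `upscale` are hypothetical helpers.

```python
def logical_grid_size(pixels: list[list[int]]) -> tuple[int, int]:
    """Return (columns, rows) of the coarsest grid of uniform square blocks."""
    height, width = len(pixels), len(pixels[0])
    for block in range(min(width, height), 0, -1):
        if width % block or height % block:
            continue  # block size must tile the image exactly
        # Every pixel must match the top-left pixel of its own block.
        if all(
            pixels[y][x] == pixels[y - y % block][x - x % block]
            for y in range(height)
            for x in range(width)
        ):
            return width // block, height // block
    return width, height  # not reached: block == 1 always matches


def upscale(logical: list[list[int]], scale: int) -> list[list[int]]:
    """Nearest-neighbor upscale so each logical cell covers scale x scale pixels."""
    rows, cols = len(logical), len(logical[0])
    return [
        [logical[y // scale][x // scale] for x in range(cols * scale)]
        for y in range(rows * scale)
    ]


# Synthetic 32x32 checkerboard rendered at 128x128 pixels.
logical = [[(x + y) % 2 for x in range(32)] for y in range(32)]
image = upscale(logical, 4)
grid = logical_grid_size(image)
print(grid)
```

On a real output, a result well above 32 columns would support the evaluator's reading that the requested 32x32 grid was not respected.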

Medium 100%
PASS: A city street scene in the style of Art Deco — geometric shapes, gold and black palette, ornamental symmetry

Evaluator (Qwen 3.5 122B)

The image strongly exhibits Art Deco characteristics. The color palette is strictly gold and black as requested. Geometric shapes dominate the composition, particularly in the stylized skyscrapers with vertical lines and the ornamental framing border. The scene features perfect ornamental symmetry down the center axis. The visual style is unmistakably Art Deco. VERDICT: PASS

PASS: A woman with a parasol in the style of Impressionism — soft brushstrokes, dappled light, pastel palette

Evaluator (Qwen 3.5 122B)

The image clearly exhibits Impressionist characteristics: soft, visible brushstrokes are evident throughout, especially in the background and dress. Dappled light effects appear on the parasol and foliage, created through broken color patches. The palette is predominantly pastel — lavenders, blues, yellows, and pinks — with gentle transitions. The subject (woman with parasol) matches the prompt. Style is unmistakably Impressionist. VERDICT: PASS

PASS: A portrait of a man in Pop Art style — bold outlines, Ben-Day dots, flat vivid primary colors

Evaluator (Qwen 3.5 122B)

The image clearly exhibits Pop Art characteristics: bold black outlines define the subject, Ben-Day dots are visibly used for shading and texture across the face and clothing, and vivid, saturated colors (neon yellow, magenta, cyan) dominate the palette. The flat color application and stylized rendering align with the requested aesthetic. All key visual elements of the specified style are present and recognizable. VERDICT: PASS

Hard 100%
PASS: A forest rendered in photorealistic watercolor — the detail and lighting of a photo combined with visible watercolor bleeds and paper texture

Evaluator (Qwen 3.5 122B)

The image clearly exhibits the requested photorealistic watercolor style. Key characteristics are present: visible paper texture, soft watercolor bleeds at edges, layered translucent washes, and realistic forest lighting with dappled sunlight. Tree trunks and foliage show photographic detail while retaining painterly fluidity. The blend of realism and medium-specific effects is well-executed. VERDICT: PASS

PASS: A woman in a futuristic city rendered in cyberpunk Art Nouveau — neon colors and tech elements combined with flowing organic lines and floral borders

Evaluator (Qwen 3.5 122B)

The image perfectly embodies the requested "cyberpunk Art Nouveau" style. The central figure wears a dress with flowing, organic lines that mimic Art Nouveau aesthetics, while the background features a neon-lit futuristic city with pink and blue hues characteristic of cyberpunk. The composition is framed by distinct floral borders and vine-like patterns, directly fulfilling the prompt's specific stylistic requirements. All key visual characteristics are clearly present and well-executed. VERDICT: PASS

PASS: A medieval castle depicted in Japanese ukiyo-e woodblock print style combined with photorealistic sky and lighting

Evaluator (Qwen 3.5 122B)

The image successfully combines ukiyo-e woodblock print aesthetics (flat colors, bold outlines, stylized architecture) with a photorealistic sky gradient and lighting effects. The castle’s layered roofs, stone base, and decorative elements match traditional Japanese castle design rendered in ukiyo-e style. Sky transitions smoothly from deep blue to warm sunset tones, mimicking photographic realism. Core stylistic requirements are clearly met. VERDICT: PASS

Extreme 100%
PASS: An interior scene rendered in soft brutalism — raw concrete textures combined with rounded pastel furniture and warm diffused light

Evaluator (Qwen 3.5 122B)

The image perfectly embodies the requested "soft brutalism" style. The raw, unfinished concrete walls, ceiling beams, and floor provide the brutalist foundation. This is contrasted by the large, rounded, pastel pink sofa that dominates the foreground. Warm, diffused light enters from the right, casting a soft glow on the concrete and furniture, fulfilling all key visual characteristics of the prompt. VERDICT: PASS

PASS: A still life painted in anxious minimalism — sparse composition with uncomfortable negative space, muted colors, and subtle visual tension

Evaluator (Qwen 3.5 122B)

The image depicts a solitary wooden chair against a vast, empty gray background. The composition is extremely sparse, with the chair placed low and off-center, creating significant negative space that evokes unease. The color palette is muted and monochromatic, dominated by grays and browns. The lighting is flat and diffuse, contributing to a sense of stillness and subtle tension. These elements align closely with the described style of "anxious minimalism." VERDICT: PASS

PASS: A cityscape rendered in nostalgic futurism — retro-futuristic 1960s space-age aesthetics with a melancholic warm-toned patina

Evaluator (Qwen 3.5 122B)

The image clearly exhibits the requested nostalgic futurism style. Key characteristics are present: the architecture features iconic 1960s space-age elements like saucer-shaped structures, domes, and sleek spires. The color palette successfully achieves a "melancholic warm-toned patina" through the use of dusty pinks, creams, and a hazy, vintage sky. The overall aesthetic is a faithful representation of retro-futurism. VERDICT: PASS
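
The Graphical design percentages shown on this page are consistent with simple pass counts. The sketch below reproduces them under two assumptions that the page does not state: each score is passes divided by total, rounded half up, and the category score pools all items rather than averaging the subcategory scores. The counts are tallied from the verdicts listed in this section, and `pass_rate` is a hypothetical helper.

```python
from decimal import Decimal, ROUND_HALF_UP


def pass_rate(passes: int, total: int) -> int:
    """Percentage of passes, rounded half up to a whole number."""
    pct = Decimal(100 * passes) / Decimal(total)
    return int(pct.quantize(Decimal("1"), rounding=ROUND_HALF_UP))


# (passes, total) tallied from the verdicts listed in this section.
subcategories = {
    "Data Visualisation": (0, 3),
    "Layout & Design": (4, 9),
    "Style Diversity": (11, 12),
}

rates = {name: pass_rate(p, t) for name, (p, t) in subcategories.items()}

# Pool every item in the category instead of averaging subcategory rates.
pooled_passes = sum(p for p, _ in subcategories.values())
pooled_total = sum(t for _, t in subcategories.values())
category_rate = pass_rate(pooled_passes, pooled_total)

print(rates)
print(f"Graphical design: {pooled_passes}/{pooled_total} -> {category_rate}%")
```

Half-up rounding matters here: 15 of 24 is exactly 62.5%, which Python's built-in `round` would take to 62 rather than the 63% displayed.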