ImageBench

ImageBench V1

192 evaluations across 6 categories

Benchmark V1 verdicts are produced by VLM judges and can contain mistakes. Treat PASS/FAIL labels as machine-assisted assessments, and inspect the images yourself.
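Most judge responses in this report end with `VERDICT: PASS` or `VERDICT: FAIL`, while a few are cut off before the verdict. When re-checking results by hand, the label can be pulled from a response with a minimal Python sketch like the one below. The function name `extract_verdict` and the regular expression are illustrative assumptions, not part of ImageBench.

```python
import re
from typing import Optional

# Assumed pattern: the judge closes its response with "VERDICT: PASS" or "VERDICT: FAIL".
VERDICT_RE = re.compile(r"VERDICT:\s*(PASS|FAIL)\b")


def extract_verdict(evaluator_text: str) -> Optional[str]:
    """Return the last verdict label in a judge response, or None if absent."""
    matches = VERDICT_RE.findall(evaluator_text)
    # Use the last match so an earlier mention of the word does not win.
    return matches[-1] if matches else None


if __name__ == "__main__":
    complete = "The word is misspelled. VERDICT: FAIL"
    cut_off = "Text contains multiple misspellings and illegible parts:"
    print(extract_verdict(complete))
    print(extract_verdict(cut_off))
```

A response that yields `None` is one where the visible text was truncated, so its label has to come from the PASS/FAIL tag shown next to the prompt.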

Generation Details

Source-backed model context, size, cost, and request settings for this ImageBench V1 run.

local/sefi-image-5b-turbo

Local

SeFi Image 5B Turbo is a locally fine-tuned text-to-image model produced by the SeFi image-generation fine-tuning pipeline and run on an NVIDIA DGX Spark. It is the ~5B turbo (few-step) variant of the SeFi Image family. It is not a publicly released hosted product; no external model card or citation is disclosed.

Maker: SeFi pipeline
Family: SeFi Image
Model Size: ~5B (estimated)
Cost: local run; no API price (not_applicable)
Run Target: gx10/sefi-image-5b-turbo
Effective Request: fields unknown
Overall: 53.6
Capability: 76%
Est. Preference: 31.2
Pass: 146
Fail: 46
Avg Latency: 5.9s
Min Latency: 5.7s
Max Latency: 6.6s
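The pass and fail counts above can be cross-checked against the headline figures with a short Python sketch. It reproduces the 192 total and the 76% Capability figure. The averaging step for Overall is only an assumption: the page does not state how Overall is computed, and the plain mean of Capability and Est. Preference merely happens to match the reported 53.6.

```python
# Figures copied from the run summary.
passed = 146
failed = 46
est_preference = 31.2

total = passed + failed
capability = 100 * passed / total

# Assumption (not stated by ImageBench): Overall as the plain mean of
# Capability and Est. Preference.
overall_guess = (capability + est_preference) / 2

print(f"total evaluations: {total}")
print(f"capability: {capability:.0f}%")
print(f"mean of capability and preference: {overall_guess:.1f}")
```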
Text Rendering 53%
Spatial Reasoning 77%
Human realism 86%
Truthfulness 74%
Professional Studio 93%
Graphical design 54%
Preference 31%
Latency 48%

All 192 generations

Text Rendering 53%

Typography Style 100%

Typography Style measures whether local/sefi-image-5b-turbo preserves requested lettering styles, font mood, and text texture such as typewriter ink or neon signage.

Easy 100%
PASS: The word 'CHAPTER ONE' typed on aged paper with a vintage typewriter font, complete with slightly uneven ink

Evaluator (Qwen 3 VL)

Text reads “CHAPTER ONE” correctly. Font mimics vintage typewriter style with uneven, slightly smudged ink and distressed edges. Paper texture appears aged and tactile. The visual cues convincingly simulate ink on paper from an old typewriter. All prompt elements are accurately rendered. VERDICT: PASS

PASS: The phrase 'TODAY'S SPECIAL' written in white chalk on a dark green chalkboard with chalk dust

Evaluator (Qwen 3 VL)

Text reads “TODAY’S SPECIAL” correctly. Letters mimic chalk texture with uneven edges and smudges, and chalk dust is visible. Background is dark green chalkboard. Style convincingly matches chalk medium. Minor imperfections in lettering are authentic to chalk writing. VERDICT: PASS

PASS: The word 'OPEN' displayed on a red LED sign with glowing pixel-style letters

Evaluator (Qwen 3 VL)

The word "OPEN" is correctly spelled and rendered in a glowing, pixel-style LED format with distinct red dots forming each letter. The style convincingly mimics a classic LED sign with visible individual light elements and ambient glow. The medium and text match the prompt precisely. VERDICT: PASS

Writing accuracy 42%

Writing accuracy measures whether local/sefi-image-5b-turbo renders the exact requested words, numbers, labels, and short phrases without spelling or layout errors.

Easy 67%
FAIL: A single large word 'HELLO' centered on a white background

Evaluator (Qwen 3 VL)

The word is misspelled: "HELLO" incorrectly shows as "FELLO" with a "F" instead of an "H". The text is fully readable but contains a clear error. The intended word is not correctly rendered. VERDICT: FAIL

PASS: The word 'ADVENTURE' in big bold letters centered on a plain cream background

Evaluator (Qwen 3 VL)

The word "ADVENTURE" is spelled correctly and fully readable. All letters are present and clear against the cream background. The text is centered and bold as intended. No misspellings or illegible parts detected. VERDICT: PASS

PASS: The word 'SERENITY' written large and centered on a solid light blue background

Evaluator (Qwen 3 VL)

The word "SERENITY" is spelled correctly and is fully readable, centered on a solid light blue background. No misspellings, missing words, or illegible parts are present. The text matches the intended specification exactly. VERDICT: PASS

Medium 33%
PASS: The sentence 'The quick brown fox jumps high' in clean black text on a white background

Evaluator (Qwen 3 VL)

All words are spelled correctly and fully readable: "The", "quick", "brown", "fox", "jumps", "high". Text is clean black on white background as intended. No misspellings or illegible parts. VERDICT: PASS

FAIL: 'Every sunset brings a new dawn ahead' written neatly on a pale yellow background

Evaluator (Qwen 3 VL)

The text reads "Every sunset brings a ew dawn ahead". "ew" is a misspelling of "new". The word "a" is correctly placed before "new", but "ew" is illegible as intended. All other words are correct and readable. The misspelling significantly alters the intended message. VERDICT: FAIL

FAIL: 'Stars shine brightest in the dark night' displayed on a clean off-white background

Evaluator (Qwen 3 VL)

Text contains errors: “aα” (incorrect symbol instead of “the”) and “tne” (misspelled “the”). Words “brightest” and “dark night” are correct. The phrase is not fully readable due to these distortions. VERDICT: FAIL

Hard 67%
PASS: A poster with headline 'SUMMER SALE' in large bold text, body text 'Up to 50% off all items this weekend only' in medium size, and fine print 'Terms and conditions apply. See store for details.' at the bottom in small text

Evaluator (Qwen 3 VL)

All text matches the intended content exactly. "SUMMER SALE", "Up to 50% off all items this weekend only", and "Terms and conditions apply. See store for details." are all spelled correctly and fully readable. No misspellings or illegible parts detected. VERDICT: PASS

PASS: A flyer with large title 'GRAND OPENING', subtitle 'Join us for food, music, and fun' in medium text, and small disclaimer 'RSVP required. Limited seating available.' at the bottom

Evaluator (Qwen 3 VL)

All text matches the intended content exactly. "GRAND OPENING" is correct. Subtitle "Join us for food, music, and fun" is fully readable. Disclaimer "RSVP required. Limited seating available." is accurate. No misspellings, missing words, or illegible parts detected. Text is clear and correctly rendered. VERDICT: PASS

FAIL: An event banner with headline 'ANNUAL GALA 2025' at the top in large serif font, description 'An evening of fine dining and live jazz' in medium text, and footnote 'Black tie optional. Doors open at 6 PM.' in small text

Evaluator (Qwen 3 VL)

“ANNUAL” is misspelled as “ANNUA_” (missing ‘L’ and has underscore). “lie jazz” should be “live jazz”. “optalinal” is misspelled. “Black tie optional” is incorrect. All text is illegible or incorrect. VERDICT: FAIL

Extreme 0%
FAIL: A vintage movie poster for 'THE LAST VOYAGE' featuring a tagline 'One ship. One chance. No turning back.' with credits block reading 'Directed by James Harlow. Starring Elena Voss, Marcus Chen, and Sofia Reyes. Music by Thomas Grant. A Paramount Pict

Evaluator (Qwen 3 VL)

Text contains multiple misspellings and illegible parts: “One chance ase.” (missing “t”), “Concs No turturing back.” (incorrect), “Directed by a claste.” (incorrect), “Lchit Fiomnwk.” (incorrect), “Diterted by laganstate” (incorrect), “Maariby Ouria Keys” (incorrect), “Cireteced Irians'” (incorrect), “Staring Vens,” (incorrect), “Elana’ Cosn” (incorrect), “Marcias Rhen,” (incorrect), “Sofia Meyes” (incorrect), “aby Thomas Grant.” (incorrect), “Pararid ony Radks Rteast” (incorrect), “A Paramionc

FAIL: A magazine cover titled 'NATIONAL EXPLORER' with cover lines 'The Hidden Temples of Cambodia — page 34', 'Interview: Dr. Maria Santos on Climate Action', 'Top 10 Trails You Must Hike This Year', and a barcode area with 'Issue 247 | March 2025 | $6.99

Evaluator (Qwen 3 VL)

Text errors: “fs for” (should be “of”), “Maria” (should be “Maria Santos”), “Claction” (should be “Climate Action”), “Issue 20 Trails You Must Hik6” (misspelled “Hik6” for “Hike”), “Issue 247 | March 2025” (incorrect issue number). Multiple misspellings and missing words. VERDICT: FAIL

FAIL: A product label for 'GOLDEN HARVEST ORGANIC HONEY' with subtitle 'Raw & Unfiltered', weight '12 oz (340g)', description 'Pure wildflower honey harvested from sustainable apiaries in the Pacific Northwest. No additives or preservatives.', and small te

Evaluator (Qwen 3 VL)

“ORGANIC” is misspelled as “ORGAN’Y”. “Apiaries” is misspelled as “apjaries”. “Additives” is misspelled as “aadditives” and “diatess dth to” is garbled. “Best before: 06/2026. ~ en.96/2026.” contains an illegible symbol and incorrect date format. Multiple critical errors present. VERDICT: FAIL
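The tier and category percentages in the Text Rendering section follow directly from the listed PASS/FAIL labels. A minimal Python sketch of that tally is below. The dictionary layout and the helper names `pass_rate` and `flatten` are illustrative assumptions; the labels themselves are copied from the listings above in page order.

```python
from typing import Dict, List

# True = PASS, False = FAIL, in the order the prompts appear on the page.
writing_accuracy: Dict[str, List[bool]] = {
    "Easy": [False, True, True],
    "Medium": [True, False, False],
    "Hard": [True, True, False],
    "Extreme": [False, False, False],
}
typography_style: Dict[str, List[bool]] = {"Easy": [True, True, True]}


def pass_rate(verdicts: List[bool]) -> float:
    """Percentage of PASS verdicts in a list."""
    return 100 * sum(verdicts) / len(verdicts)


def flatten(tiers: Dict[str, List[bool]]) -> List[bool]:
    """Concatenate the verdicts of every tier."""
    return [v for tier in tiers.values() for v in tier]


tier_rates = {tier: round(pass_rate(v)) for tier, v in writing_accuracy.items()}
writing_total = round(pass_rate(flatten(writing_accuracy)))
text_rendering = round(
    pass_rate(flatten(writing_accuracy) + flatten(typography_style))
)

print(tier_rates)
print(f"Writing accuracy: {writing_total}%")
print(f"Text Rendering: {text_rendering}%")
```

The result matches the reported 67%, 33%, 67%, and 0% tiers, the 42% for Writing accuracy (5 of 12), and the 53% for Text Rendering (8 of 15).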

Spatial Reasoning 77%

Attributes Binding 89%

Attributes Binding measures whether local/sefi-image-5b-turbo attaches the right colors, materials, and properties to the correct requested objects.

Easy 100%
PASS: A red ball and a blue cube sitting side by side on a white surface

Evaluator (Qwen 3.5 122B)

The image shows two objects on a white surface: a red sphere (ball) on the left and a blue cube on the right. Both objects match the requested color, shape, and spatial arrangement. No attributes are swapped — the ball is red and spherical, the cube is blue and cubic. All criteria are correctly fulfilled. VERDICT: PASS

PASS: A green apple and an orange pumpkin on a wooden cutting board

Evaluator (Qwen 3.5 122B)

The image shows a green apple and an orange pumpkin on a wooden cutting board. The apple is correctly green, round, and has a stem. The pumpkin is correctly orange, round, and has a stem. All attributes (color, shape, object type) are correctly bound with no swaps. The setting matches the prompt. VERDICT: PASS

PASS: A yellow taxi and a black limousine parked next to each other on a street

Evaluator (Qwen 3.5 122B)

The image shows two vehicles parked side by side on a street. The vehicle on the left is a yellow taxi, identifiable by its bright yellow color, taxi roof sign, and New York-style license plate. The vehicle on the right is a black limousine, characterized by its elongated body, tinted windows, and luxury appearance. Both objects match the prompt’s description: correct colors (yellow taxi, black limousine), correct shapes (standard taxi sedan vs. stretched limo), and correct spatial arrangement (

Medium 67%
PASS: A large red circle, a small blue triangle, and a medium green star arranged on a white canvas

Evaluator (Qwen 3.5 122B)

The image shows a large red circle, a small blue triangle, and a medium green star on a white canvas. All attributes (color, size, shape) are correctly bound to their respective objects with no swaps. The red circle is large, the blue triangle is small, and the green star is medium-sized. All elements match the prompt’s specifications. VERDICT: PASS

PASS: A tall thin white candle, a short fat red candle, and a medium spiral yellow candle on a shelf

Evaluator (Qwen 3.5 122B)

The image shows three candles on a shelf: 1. Left: Tall, thin, white candle — matches description. 2. Center: Short, fat, red candle — matches description. 3. Right: Medium height, spiral-shaped, yellow candle — matches description. All attributes (color, size, shape) are correctly bound to each object with no swaps. The arrangement and visual properties align precisely with the prompt. VERDICT: PASS

FAIL: A big brown horse, a small white rabbit, and a medium-sized black cat standing in a meadow

Evaluator (Qwen 3.5 122B)

The image shows a large brown horse standing in a meadow, correctly matching the prompt. A small white rabbit is present in the foreground, also matching. However, the black cat is sitting, not standing as requested. Additionally, the cat has two tails visible — an anatomical impossibility and visual artifact. These errors violate the “no swaps” and “correct attributes” criteria. VERDICT: FAIL

Hard 100%
PASS: A pink fire truck, a blue banana, and a green sun in a surreal landscape

Evaluator (Qwen 3.5 122B)

The image shows a pink fire truck, a large blue banana, and a green sun in a surreal landscape. The colors and objects match the prompt exactly: the fire truck is pink, the banana is blue, and the sun is green. All attributes are correctly bound to their respective objects with no swaps. VERDICT: PASS

PASS: A purple carrot, a red tree with red leaves, and a white watermelon sliced open

Evaluator (Qwen 3.5 122B)

The image contains: 1. A purple carrot — correctly colored and shaped, with green stems. 2. A red tree with red leaves — trunk and foliage are uniformly red, matching description. 3. A white watermelon sliced open — interior is white with black seeds, rind is green-striped; matches “white watermelon” as described (referring to flesh color). All attributes are correctly bound to their respective objects. No swaps or misattributions observed. VERDICT: PASS

PASS: An orange elephant, a yellow penguin, and a silver strawberry arranged on a table

Evaluator (Qwen 3.5 122B)

The image shows three objects on a wooden surface: 1. An orange elephant with white tusks and toenails — matches “orange elephant”. 2. A yellow penguin with black head, wings, and feet — matches “yellow penguin”. 3. A silver strawberry with textured surface and green-silver leaves — matches “silver strawberry”. All attributes (color, shape, object type) are correctly bound. No swaps or mismatches observed. VERDICT: PASS

Compositionality 100%

Compositionality measures whether local/sefi-image-5b-turbo combines subjects, actions, outfits, props, and scene details into one coherent requested image.

Easy 100%
PASS: A tabby cat wearing oversized red sunglasses

Evaluator (Qwen 3.5 122B)

The image contains all described elements: a tabby cat (present), oversized red sunglasses (present), and they are coherently combined (the glasses fit naturally on the cat’s face). The composition is visually consistent and logically assembled. No missing or contradictory elements. VERDICT: PASS

PASS: A golden retriever wearing a tiny cowboy hat

Evaluator (Qwen 3.5 122B)

The image contains: - Golden retriever: Present and clearly visible - Cowboy hat: Present on the dog's head - "Tiny" descriptor: The hat appears proportionally normal-sized for the dog, not notably tiny All core elements are present and coherently combined. While the hat isn't distinctly "tiny," it fits the overall concept appropriately. VERDICT: PASS

PASS: A penguin holding a colorful umbrella

Evaluator (Qwen 3.5 122B)

The image shows a penguin standing on snow, holding a rainbow-colored umbrella with its flipper. The umbrella has distinct panels of red, orange, yellow, green, blue, purple, and pink. The penguin is rendered realistically with black-and-white plumage and an orange beak. All described elements — penguin, colorful umbrella, and the act of holding it — are present and coherently combined. VERDICT: PASS

Medium 100%
PASS: A medieval knight in full armor sitting at a desk typing on a modern laptop

Evaluator (Qwen 3.5 122B)

The image contains all described elements: 1. Medieval knight in full armor - Present (full plate armor with helmet) 2. Sitting at a desk - Present (wooden table/desk) 3. Typing on a modern laptop - Present (Apple MacBook visible) The concepts are coherently combined with the knight positioned naturally at the desk, hands on the keyboard, creating a believable anachronistic scene. The lighting and composition support the unusual combination. VERDICT: PASS

PASS: A Roman gladiator taking a selfie with a smartphone in the Colosseum

Evaluator (Qwen 3.5 122B)

The image shows a man dressed as a Roman gladiator (helmet with red crest, armor, red tunic) standing in front of the Colosseum. He is holding a modern smartphone and taking a selfie, smiling at the camera. All prompt elements — gladiator, selfie, smartphone, Colosseum — are clearly present and coherently combined. The scene blends historical costume with modern technology in a plausible tourist context. VERDICT: PASS

PASS: An astronaut in a spacesuit riding a bicycle through a park

Evaluator (Qwen 3.5 122B)

The image shows an astronaut in a full spacesuit riding a bicycle on a paved path through a park with trees and grass. All described elements are present: astronaut, spacesuit, bicycle, and park setting. The combination is coherent and visually plausible within the surreal context. No missing or contradictory elements. VERDICT: PASS

Hard 100%
PASS: An octopus wearing a top hat and monocle playing chess on the surface of the moon with Earth visible in the background

Evaluator (Qwen 3.5 122B)

The image contains all described elements: - Octopus: Present, central figure. - Top hat: Present, worn by the octopus. - Monocle: Present, over one eye. - Chess: Present, board and pieces in front of the octopus. - Moon surface: Present, textured ground beneath the board. - Earth in background: Present, large planet behind the octopus. All elements are coherently combined in a surreal but visually consistent scene. VERDICT: PASS

PASS: A dinosaur in a business suit giving a PowerPoint presentation in a modern office while drinking coffee

Evaluator (Qwen 3.5 122B)

The image shows a dinosaur (T-Rex) wearing a business suit, standing in a modern office with desks and chairs. It is gesturing toward a projector screen displaying a presentation slide. The dinosaur is holding a white coffee mug in one hand. All elements from the prompt are present and coherently combined: dinosaur, business suit, PowerPoint presentation, modern office, and drinking coffee. VERDICT: PASS

PASS: A robot dressed as a pirate sailing a paper boat on a lake of lava while juggling three kittens

Evaluator (Qwen 3.5 122B)

The image contains all described elements: - Robot dressed as a pirate: Present (wearing tricorn hat, sash, belt) - Paper boat: Present (white origami-style boat) - Lake of lava: Present (glowing red molten surface) - Three kittens: Present (three tabby kittens floating around the robot) - Juggling action: Present (robot’s arms raised, kittens positioned mid-air as if being juggled) All components are coherently combined in a surreal but visually consistent scene. VERDICT: PASS

Counting 33%

Counting measures whether local/sefi-image-5b-turbo creates the exact number of requested objects without adding or missing items.

Easy 100%
PASS: Exactly 3 red apples arranged on a wooden table

Evaluator (Qwen 3.5 122B)

The image displays three red apples placed on a wooden surface. The apples are arranged in a slight arc, with visible stems and natural texture. The count matches the prompt’s request for exactly three red apples on a wooden table. No extra or missing objects are present. VERDICT: PASS

PASS: Exactly 5 yellow rubber ducks sitting in a row on a bathroom shelf

Evaluator (Qwen 3.5 122B)

The image shows exactly five yellow rubber ducks arranged in a row on a white bathroom shelf. All ducks are identical in appearance, with orange beaks and black eyes. The count matches the prompt’s request precisely: five ducks, correct color, correct setting. No extra or missing objects. VERDICT: PASS

PASS: Exactly 4 white candles standing on a dark surface

Evaluator (Qwen 3.5 122B)

The image displays exactly four white candles, all standing upright on a dark surface. Each candle is lit with a visible flame. The count matches the prompt’s request precisely: four candles, white in color, standing on a dark background. No extra or missing objects are present. VERDICT: PASS

Medium 0%
FAIL: 2 orange cats and 3 black dogs sitting together on a grassy lawn

Evaluator (Qwen 3.5 122B)

The image shows 1 orange cat and 4 black dogs sitting on grass. The prompt requested 2 orange cats and 3 black dogs. The counts do not match — there is one fewer cat and one extra dog than specified. VERDICT: FAIL

FAIL: 4 red roses and 2 white daisies in a glass vase on a table

Evaluator (Qwen 3.5 122B)

The image shows 5 red roses and 2 white daisies in a clear glass vase on a white table. The prompt requested exactly 4 red roses, but there are 5 visible. Therefore, the count does not match the specification. VERDICT: FAIL

FAIL: 1 blue butterfly and 3 yellow butterflies resting on a garden fence

Evaluator (Qwen 3.5 122B)

The image shows three butterflies on a fence: one large blue butterfly in the center, one pale blue/white butterfly on the left, and one yellow butterfly on the right. The prompt requested “1 blue butterfly and 3 yellow butterflies.” There is only 1 yellow butterfly, not 3, and the second butterfly is not yellow but pale blue/white. Therefore, the count does not match the request. VERDICT: FAIL

Hard 0%
FAIL: Exactly 9 goldfish swimming in a large round fishbowl

Evaluator (Qwen 3.5 122B)

The image shows a round fishbowl containing goldfish. Counting the fish: there are 10 goldfish visible (not 9). The prompt requested exactly 9, so the count does not match. All other elements (round bowl, swimming fish) are correct, but the numerical requirement is violated. VERDICT: FAIL

FAIL: A fruit bowl containing exactly 3 bananas, 4 oranges, and 2 apples

Evaluator (Qwen 3.5 122B)

The image shows a fruit bowl with 3 bananas, 2 oranges, and 2 apples. The prompt requested 3 bananas, 4 oranges, and 2 apples. While the banana and apple counts are correct, the orange count is incorrect (only 2 visible instead of 4). Therefore, the total object count does not exactly match the request. VERDICT: FAIL

FAIL: Exactly 11 colorful balloons floating against a clear blue sky

Evaluator (Qwen 3.5 122B)

The image displays colorful balloons against a clear blue sky. Counting them: there are 15 balloons total — not 11. Colors include blue, silver, yellow, red, green, pink, purple, gold, and light blue. Since the prompt requested exactly 11 balloons and the image contains 15, the count does not match. VERDICT: FAIL
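To see how far off the failed Counting generations were, the requested counts can be set against the counts the judge reported. The Python sketch below does that for the Medium and Hard prompts. The numbers are taken from the judge texts above and inherit any judge miscounts. The blue butterfly is left out because the judge's description of it is ambiguous. The tuple layout and variable names are illustrative assumptions.

```python
from typing import List, Tuple

# (object, requested count, count reported by the judge)
counts: List[Tuple[str, int, int]] = [
    ("orange cats", 2, 1),
    ("black dogs", 3, 4),
    ("red roses", 4, 5),
    ("white daisies", 2, 2),
    ("yellow butterflies", 3, 1),
    ("goldfish", 9, 10),
    ("bananas", 3, 3),
    ("oranges", 4, 2),
    ("apples", 2, 2),
    ("balloons", 11, 15),
]

# Signed error: positive means the image had too many objects.
errors = {name: seen - asked for name, asked, seen in counts}

over = [name for name, e in errors.items() if e > 0]
under = [name for name, e in errors.items() if e < 0]
exact = [name for name, e in errors.items() if e == 0]

for name, e in errors.items():
    print(f"{name}: {e:+d}")
print(f"over: {len(over)}, under: {len(under)}, exact: {len(exact)}")
```

On these figures the misses go in both directions rather than showing a consistent over- or under-count, with the largest single error on the 11-balloon prompt.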

Negation 78%

Negation measures whether local/sefi-image-5b-turbo respects what should be absent from the image, not only what should be present.

Easy 100%
PASS: A tropical beach scene with no people

Evaluator (Qwen 3.5 122B)

The image depicts a serene tropical beach scene with white sand, turquoise water, and two palm trees framing the view. There are no people visible anywhere in the frame — not on the beach, in the water, or in the sky. The exclusion criterion (“no people”) is fully satisfied. VERDICT: PASS

PASS: A cat lying on a couch without a collar

Evaluator (Qwen 3.5 122B)

The image shows a tabby and white cat lying on a gray couch. The cat is looking directly at the camera with a calm expression. Its fur has distinct stripes, and its paws are tucked neatly under its body. There is no collar visible around the cat’s neck — the prompt explicitly excluded “a collar,” and it is indeed absent. All other elements (cat, couch, lying position) match the prompt’s description. VERDICT: PASS

PASS: A classroom with desks and a chalkboard but no students

Evaluator (Qwen 3.5 122B)

The image shows an empty classroom with rows of wooden desks and chairs facing a large black chalkboard. There are no students visible in the room — the space is completely unoccupied by people. The only elements present are furniture, walls, floor, and the chalkboard. This matches the prompt’s exclusion criteria. VERDICT: PASS

Medium 100%
PASS: A sandy beach with palm trees, seashells, and a beach umbrella but no water or ocean visible

Evaluator (Qwen 3.5 122B)

The image shows a sandy beach with palm trees, seashells in the foreground, and a white beach umbrella. The background features a clear blue sky meeting a flat horizon line. There is no visible water, ocean, waves, or shoreline — the horizon appears to be where sand meets sky, suggesting the ocean is intentionally excluded. This matches the prompt’s exclusion criterion. VERDICT: PASS

PASS: An office workspace with a desk, chair, bookshelf, and lamp but no computers or screens

Evaluator (Qwen 3.5 122B)

The image shows an office workspace with a wooden desk, a black mesh office chair, a bookshelf filled with books, and a desk lamp. There are no computers, monitors, laptops, tablets, or any other screens visible on the desk or elsewhere in the scene. The excluded elements are truly absent. VERDICT: PASS

PASS: A kitchen with pots, pans, a stove, and a cutting board but no food

Evaluator (Qwen 3.5 122B)

The image displays a clean kitchen scene featuring a gas stove with four stainless steel and black pots on the burners. In the foreground, there are two wooden cutting boards resting on the countertop. The stove and oven are visible below. Crucially, there is no food visible in the pots, on the cutting boards, or anywhere else in the scene. The excluded element is absent. VERDICT: PASS

Hard 33%
FAIL: A pepperoni pizza with tomato sauce, basil, and crust but absolutely no cheese

Evaluator (Qwen 3.5 122B)

The image shows a pepperoni pizza with tomato sauce, basil leaves, and a crust. However, there is clearly melted cheese visible across the surface of the pizza, beneath and around the pepperoni slices. Since the prompt explicitly excluded cheese ("absolutely no cheese"), and cheese is present in the image, the core criterion is not met. VERDICT: FAIL

PASS: A car body with windows, doors, and headlights but no wheels

Evaluator (Qwen 3.5 122B)

The image shows a car body with windows, doors, and headlights. The excluded element — wheels — is indeed absent. The car body is shown without any wheels attached, consistent with the prompt’s exclusion. All other specified elements (windows, doors, headlights) are present. No wheels are visible in any part of the image. VERDICT: PASS

FAIL: A human face with eyes, nose, and mouth but no eyebrows

Evaluator (Qwen 3.5 122B)

The image shows a human face with closed eyes, a nose, and lips. However, the prompt explicitly excluded eyebrows, and this image clearly includes faint but visible eyebrows above the closed eyes. Since the excluded element (eyebrows) is present, the image does not meet the exclusion criterion. VERDICT: FAIL

Relative Position 92%

Relative Position measures whether local/sefi-image-5b-turbo places objects in the requested spatial relationship, such as above, below, inside, or next to another object.

Easy 100%
PASS: A cat sitting on top of a cardboard box

Evaluator (Qwen 3.5 122B)

The image shows a tabby cat sitting directly on the top surface of a closed cardboard box. The cat is centered on the box, facing forward. The spatial relationship “on top of” is accurately represented — the cat is physically above and resting upon the box. No other positional relationships are specified or violated. All described spatial relationships are correct. VERDICT: PASS

PASS: A red ball underneath a wooden chair

Evaluator (Qwen 3.5 122B)

The image shows a red ball positioned on the floor directly beneath the seat of a wooden chair. The chair’s legs frame the ball, and the ball is clearly underneath the horizontal wooden slats. All spatial relationships described in the prompt — specifically “underneath” — are accurately represented. No positional errors are present. VERDICT: PASS

PASS: A coffee mug next to an open book on a desk

Evaluator (Qwen 3.5 122B)

The image shows two white coffee mugs and an open book on a light wooden desk. One mug is positioned to the left of the book, and the other is in front of the book, slightly overlapping its edge. The book lies flat on the desk surface. All described spatial relationships — “next to” and “on” — are accurately represented. No positional errors detected. VERDICT: PASS

Medium 100%
PASS: A bird perched on a branch, the branch extending from a tree next to a park bench

Evaluator (Qwen 3.5 122B)

The image shows a bird perched on a thin branch that extends horizontally from a tree trunk on the left side of the frame. In the background, slightly out of focus, is a park bench positioned to the right and below the bird’s perch. The spatial relationships described — “bird on branch,” “branch extending from tree,” and “tree next to park bench” — are all accurately represented in the image. VERDICT: PASS

PASS: A cat sitting on a pillow, the pillow placed on a wooden chair

Evaluator (Qwen 3.5 122B)

The image shows a tabby cat sitting upright on a white pillow. The pillow is placed directly on the seat of a wooden chair. All spatial relationships described in the prompt — “cat on pillow,” “pillow on chair” — are accurately represented. No positional errors are present. VERDICT: PASS

PASS: A candle on top of a stack of books, the books next to a potted plant

Evaluator (Qwen 3.5 122B)

The image shows a lit candle placed directly on top of a stack of books. To the left of the books (from the viewer’s perspective) is a potted plant with green leaves in a brown pot. The spatial relationships described — “candle on top of books” and “books next to potted plant” — are accurately represented. All positional descriptors align with the visual content. VERDICT: PASS

Hard 100%
PASS
Five chess pieces on a board: a king in the center, a queen to its left, a bishop to its right, a rook above the king, and a knight below the king

Evaluator (Qwen 3.5 122B)

The image shows five chess pieces on a board. The king is in the center. To its left is a queen (correct). To its right is a bishop (correct). Above the king is a rook (correct). Below the king is a knight (correct). All spatial relationships described in the prompt are accurately represented in the image. VERDICT: PASS

PASS
A desk scene: laptop in the center, coffee mug to the left, phone to the right, lamp above, and notebook below

Evaluator (Qwen 3.5 122B)

The image shows a desk scene with a laptop centered, a coffee mug to its left, a phone to its right, a lamp above casting light, and a notebook below in front. All specified spatial relationships — left, right, above, below, center — are accurately represented as described in the prompt. VERDICT: PASS

PASS
Five animals in a field: a horse on the far left, a cow to the right of the horse, a sheep in the center, a goat to the right of the sheep, and a dog on the far right

Evaluator (Qwen 3.5 122B)

The image shows five animals in a field arranged left to right: a brown horse, a black-and-white cow, a fluffy sheep, a white goat with horns, and a tricolor dog. The spatial relationships match the prompt exactly — horse on far left, cow to its right, sheep centered, goat to sheep’s right, dog on far right. All positional descriptors are accurate. VERDICT: PASS

Extreme 67%
FAIL
A landscape with a red barn in the foreground at large scale, a windmill in the midground at medium scale, and snow-capped mountains in the background at small scale, all with consistent depth perspective

Evaluator (Qwen 3.5 122B)

The image shows a red barn in the foreground with a windmill positioned behind and slightly to the right of it — not clearly “in the midground at medium scale” as specified. The snow-capped mountains are in the background, appearing large due to atmospheric perspective, contradicting the prompt’s “small scale” requirement. Spatial relationships like “above” or “below” aren’t violated, but scale and depth layering do not match the prompt’s specifications. VERDICT: FAIL

PASS
A beach scene with a sandcastle in the foreground, a surfer on the waves in the midground, and a cruise ship on the horizon in the background, sizes decreasing with distance

Evaluator (Qwen 3.5 122B)

The image correctly shows a sandcastle in the foreground, a surfer on waves in the midground, and a cruise ship on the horizon in the background. The sizes of the objects decrease with distance as specified. All positional relationships (foreground, midground, background) are accurate and consistent with the prompt. VERDICT: PASS

PASS
A city park with a park bench and pigeons in the foreground, a fountain in the midground, and skyscrapers in the background, each layer progressively smaller

Evaluator (Qwen 3.5 122B)

The image correctly depicts a city park scene with a park bench and pigeons in the foreground, a fountain in the midground, and skyscrapers in the background. The spatial arrangement matches the description: the bench is in front of the fountain, which is in front of the buildings. The pigeons are on the ground near the bench. All positional relationships are accurate. VERDICT: PASS

Scale & Proportions 67%

Scale & Proportions measures whether local/sefi-image-5b-turbo keeps people, vehicles, buildings, and everyday objects at believable relative sizes.

Easy 100%
PASS
A person standing next to a yellow school bus, showing realistic relative scale

Evaluator (Qwen 3.5 122B)

The man’s height relative to the bus appears realistic — his head reaches about halfway up the bus’s side window row, which is consistent with real-world proportions. The bus wheels and body scale also match typical school bus dimensions. No obvious distortions or size mismatches are present. VERDICT: PASS

PASS
A ladybug sitting on a large green leaf, shown at realistic proportions

Evaluator (Qwen 3.5 122B)

The ladybug appears appropriately sized relative to the leaf, consistent with real-world proportions. The leaf’s texture and vein structure support a macro-scale view where the insect fits naturally. No exaggerated or distorted sizing is evident. The composition reflects realistic scale between subject and environment. VERDICT: PASS

PASS
A child holding a basketball, both at correct relative sizes

Evaluator (Qwen 3.5 122B)

The image shows a child holding a basketball. The ball appears appropriately sized for the child’s hands and torso — roughly the size of a standard youth or junior basketball relative to the child’s body. No obvious distortions or scale inconsistencies are present. The proportions between the child and the ball are realistic and consistent with real-world expectations. VERDICT: PASS

Medium 100%
PASS
A tiny person standing on the rim of a coffee cup, looking down into it

Evaluator (Qwen 3.5 122B)

The image shows a tiny person standing on the rim of a coffee cup, looking down into it. The relative sizes are consistent with the prompt: the person is miniature compared to the cup, and the cup’s proportions appear realistic for a standard coffee cup. The perspective and scale match the described scenario. VERDICT: PASS

PASS
A giant tabby cat walking between city skyscrapers like a kaiju

Evaluator (Qwen 3.5 122B)

The image depicts a tabby cat with kaiju-like spikes, towering over skyscrapers in a cityscape. The cat’s size is exaggerated to match the “giant” description, and its proportions relative to the buildings are consistent with the prompt’s intent — it looms between them as a monster would. While anatomically impossible, the visual scaling aligns with the requested fantasy concept. VERDICT: PASS

PASS
A miniature house sitting on the palm of someone's hand

Evaluator (Qwen 3.5 122B)

The image shows a miniature house resting on an open human palm. The house is appropriately scaled to fit comfortably within the hand, with its base spanning roughly the width of the palm and its height comparable to the distance from the palm to the fingertips. The proportions between the house and the hand are realistic and consistent with the prompt’s description. VERDICT: PASS

Hard 0%
FAIL
A group photo of an elephant, a horse, a dog, a cat, and a mouse, all standing in a line with correct real-world proportional sizes

Evaluator (Qwen 3.5 122B)

The image shows an elephant, horse, dog, cat, and mouse arranged in a line. The elephant is largest, followed by the horse, then dog, cat, and mouse — which aligns with real-world size proportions. However, the dog appears disproportionately large relative to the cat and mouse, and the cat is nearly as tall as the dog’s shoulder, which is inaccurate. Also, the mouse is too large relative to the cat. These inconsistencies violate “correct real-world proportional sizes.” VERDICT: FAIL

FAIL
A dinner table scene with a plate, fork, knife, wine glass, salt shaker, and peppercorn, all at realistic relative sizes

Evaluator (Qwen 3.5 122B)

The image shows a plate, fork, knife, wine glass, and salt shaker arranged on a tablecloth. The relative sizes appear mostly realistic — the plate is appropriately large compared to the cutlery, the wine glass is proportionate to the plate, and the salt shaker is smaller than all. However, there is no visible peppercorn as specified in the prompt. Since the prompt explicitly requires a peppercorn and it’s absent, the core criteria are not fully met. VERDICT: FAIL

FAIL
A farm scene with a barn, tractor, farmer, cow, chicken, and a fence post, all at correct proportional scale

Evaluator (Qwen 3.5 122B)

The image shows a farm scene with a barn, tractor, farmer, two cows, a chicken, and a fence post. The relative sizes appear mostly consistent: the barn is large in the background, the tractor is appropriately sized next to it, the farmer stands between the cows at a realistic height, and the chicken is small in the foreground. However, the chicken appears slightly oversized relative to the cows and farmer — its body is nearly as tall as the farmer’s knee, which is unrealistic. This breaks propor
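Complete evaluator notes on this page end with a `VERDICT: PASS` or `VERDICT: FAIL` marker, but some notes are cut off before the marker appears (the farm-scene note directly above is one). The following is a minimal sketch of how a reader might pull the verdict out of a note while tolerating truncation. It assumes the marker always sits at the very end of a complete note; `extract_verdict` is a hypothetical helper for illustration, not part of ImageBench.

```python
import re
from typing import Optional

# Assumed format: a complete note ends with "VERDICT: PASS" or "VERDICT: FAIL".
_VERDICT_RE = re.compile(r"VERDICT:\s*(PASS|FAIL)\s*$")


def extract_verdict(note: str) -> Optional[str]:
    """Return 'PASS' or 'FAIL' if the note ends with a verdict marker, else None."""
    match = _VERDICT_RE.search(note.strip())
    return match.group(1) if match else None


# Shortened excerpts of two notes from this section.
complete_note = "the core criteria are not fully met. VERDICT: FAIL"
truncated_note = "which is unrealistic. This breaks propor"

print(extract_verdict(complete_note))
print(extract_verdict(truncated_note))
```

When the helper returns `None`, the PASS/FAIL label shown next to the prompt is the only verdict available for that entry.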

Human realism 86%

Faces & Expressions 100%

Faces & Expressions measures whether local/sefi-image-5b-turbo creates realistic faces with the requested age, emotion, expression, and identity cues.

Easy 100%
PASS
A close-up portrait of a young woman with a warm natural smile

Evaluator (Qwen 3 VL)

Face is symmetric with natural features. Smile appears warm and genuine, clearly conveying happiness. No noticeable distortions. Lighting and detail are consistent with a natural portrait. The prompt’s intent is fully met. VERDICT: PASS

PASS
A portrait of a middle-aged man with a calm, serious expression

Evaluator (Qwen 3 VL)

Face is symmetric with natural, correct features. Expression is calm and serious, with subtle lines conveying maturity. No distortions or unnatural artifacts. Lighting and texture are realistic. The prompt’s intent is fully met. VERDICT: PASS

PASS
A headshot of a teenager with a relaxed, neutral face

Evaluator (Qwen 3 VL)

The face is symmetric with correctly proportioned features. The teenager’s expression is neutral and relaxed, matching the prompt. No distortions are visible; eyes, nose, and mouth are naturally rendered. Lighting and texture appear realistic. VERDICT: PASS

Medium 100%
PASS
A close-up of a person's face showing clear disgust, nose wrinkled and lip curled

Evaluator (Qwen 3 VL)

The face shows strong, symmetrical disgust: nose wrinkled, lips curled, and eyes squinted. Features are anatomically correct with no distortions. The emotion is clearly readable and intense. Skin texture and fine lines are natural. All requested elements are accurately rendered. VERDICT: PASS

PASS
A portrait capturing genuine surprise — raised eyebrows, wide eyes, slightly open mouth

Evaluator (Qwen 3 VL)

Face is symmetric with correct anatomical features. The expression clearly conveys genuine surprise: wide eyes, raised eyebrows, and an open mouth. No distortions or unnatural artifacts are present. The emotion is vividly readable and matches the prompt precisely. VERDICT: PASS

PASS
A face showing quiet melancholy — downcast eyes, slightly drawn mouth, distant gaze

Evaluator (Qwen 3 VL)

Face is symmetric with correctly rendered features. Eyes are downcast, lips slightly drawn, and gaze distant — clearly conveying quiet melancholy. No major distortions; skin texture and lighting are naturalistic. Emotion is unmistakably readable and aligned with the prompt. VERDICT: PASS

Hard 100%
PASS
An elderly woman in her 80s with deep wrinkles and white hair, smiling warmly with kind eyes

Evaluator (Qwen 3 VL)

The face is symmetric with natural, aged features: deep wrinkles, white hair, and a warm smile. Eyes convey kindness. No distortions; details are crisp and realistic. The emotion is clearly readable as joyful and gentle. All prompt elements are accurately rendered. VERDICT: PASS

PASS
A 5-year-old child with round cheeks and large eyes, crying with tears streaming down their face

Evaluator (Qwen 3 VL)

The child’s face is symmetric with natural features: round cheeks, large eyes, and tears streaming down. The crying expression is clearly readable—furrowed brow, open mouth, and visible tears. No distortions or unnatural artifacts are present. The prompt’s requirements are fully met. VERDICT: PASS

PASS
An elderly man in his 70s with age spots and thinning gray hair, looking skeptically over reading glasses

Evaluator (Qwen 3 VL)

Face is symmetric with realistic elderly features: age spots, thinning gray hair, and deep wrinkles. Skeptical expression is clearly readable via furrowed brow and downturned mouth. No significant distortions. Glasses and hand placement are natural. All prompt elements are accurately rendered. VERDICT: PASS

Extreme 100%
PASS
A group of three friends at a table: one laughing with head tilted back, one looking shocked with wide eyes, and one rolling their eyes with a smirk

Evaluator (Qwen 3 VL)

Faces are mostly symmetric with correct anatomical features. Emotions are clearly readable: left person laughs with head back, center looks shocked with wide eyes, right person rolls eyes with smirk. No significant distortions. All three expressions match the prompt accurately. VERDICT: PASS

PASS
A family photo with four people: a father beaming with pride, a mother smiling gently, a teenager looking bored, and a child grinning excitedly

Evaluator (Qwen 3 VL)

Faces show natural symmetry and correct anatomical features. Emotions are clearly readable: father beams, mother smiles gently, teenager looks bored, child grins excitedly. No distortions or unnatural artifacts. All four individuals match the prompt’s emotional descriptions accurately. VERDICT: PASS

PASS
Three coworkers in a meeting: one looking frustrated with furrowed brows, one nodding with a pleased expression, and one looking confused with a tilted head

Evaluator (Qwen 3 VL)

Faces are symmetric with correct features. Emotions are clearly readable: left man shows frustration (furrowed brows), center woman shows pleased expression (smile, clapping), right man shows confusion (tilted head, hand on chin). No significant distortions. All three expressions match the prompt accurately. VERDICT: PASS

Full Body 100%

Full Body measures whether local/sefi-image-5b-turbo renders full-body people with natural anatomy, clothing, pose, and proportions.

Easy 100%
PASS
A woman standing upright in a relaxed pose, arms at her sides, wearing jeans and a white t-shirt

Evaluator (Qwen 3 VL)

The woman stands upright with arms at her sides, wearing jeans and a white t-shirt as described. Limb count and proportions are anatomically correct. Pose is relaxed and recognizable. No distortions are visible. All elements match the prompt precisely. VERDICT: PASS

PASS
A man standing straight facing the camera, hands in pockets, normal proportions

Evaluator (Qwen 3 VL)

The man stands straight, facing forward, with hands in pockets — perfectly matching the prompt. Limb count is correct (two arms, two legs). Proportions are natural and anatomically accurate. No distortions or abnormalities are visible. The pose is clearly recognizable and correctly rendered. VERDICT: PASS

PASS
A person standing in a neutral pose in a park, full body visible from head to feet

Evaluator (Qwen 3 VL)

The person stands full-body in a neutral pose, head to feet, in a park. Limb count and proportions are anatomically correct. Pose is recognizable as standing still, hands relaxed at sides. No distortions observed. All elements match the prompt accurately. VERDICT: PASS

Medium 100%
PASS
A woman mid-stride running through a park, hair and clothes showing motion

Evaluator (Qwen 3 VL)

Limb count and proportions are anatomically correct. Pose is a natural running stride with motion blur in hair and clothing, clearly conveying movement. No distortions or anatomical errors. The action is unmistakably running through a park. All criteria met. VERDICT: PASS

PASS
A man captured mid-jump over a puddle, body in the air with bent knees

Evaluator (Qwen 3 VL)

The man has correct limb count and proportions. His pose—bent knees, mid-air jump over a puddle with visible splash—is anatomically plausible and matches the prompt. No distortions are evident. The action is clearly recognizable as a jump over water. VERDICT: PASS

PASS
A dancer in a dynamic spin, dress flowing outward with one leg extended

Evaluator (Qwen 3 VL)

The dancer’s limbs are correctly proportioned and count is accurate. The pose—spinning with one leg extended en pointe, arms gracefully arched—is anatomically plausible and matches the prompt. The flowing dress captures motion realistically. No distortions in form or anatomy are visible. The action is clearly recognizable as a dynamic spin. VERDICT: PASS

Hard 100%
PASS
A person in a yoga tree pose — standing on one leg with the other foot pressed to the inner thigh, arms raised overhead

Evaluator (Qwen 3 VL)

The figure has two legs, correct proportions, and performs the tree pose: standing on one foot with the other foot’s sole pressed to the inner thigh. Arms are raised overhead. No anatomical distortions are visible. The pose is clearly recognizable and executed accurately. VERDICT: PASS

PASS
A rock climber clinging to a cliff face with arms and legs extended to different holds

Evaluator (Qwen 3 VL)

The climber has two arms and two legs, with proportions and limb placement consistent with human anatomy. The pose is recognizable as dynamic rock climbing, with limbs extended to holds. No significant distortions are visible. The action matches the prompt accurately. VERDICT: PASS

PASS
A cellist seated with correct posture, legs around the instrument, bow arm extended

Evaluator (Qwen 3 VL)

The cellist has correct limb count and proportions. The pose is anatomically accurate: seated, legs around the instrument, bow arm extended. The cello rests on a stand, not held between legs as implied. The action is recognizable as playing. No major distortions. VERDICT: PASS

Extreme 100%
PASS
Two tango dancers in a close embrace, the man dipping the woman backward with intertwined legs

Evaluator (Qwen 3 VL)

The dancers’ limbs are correctly counted and proportioned. The woman’s leg is extended backward, intertwined with the man’s leg, and her body is arched backward in a classic tango dip. The pose is anatomically plausible and recognizable as a tango figure. No distortions are visible. VERDICT: PASS

PASS
Two soccer players mid-tackle, one sliding and the other jumping over with the ball at their feet

Evaluator (Qwen 3 VL)

Both players show correct limb count and proportions. The sliding player’s body posture and the jumping player’s airborne pose are anatomically plausible and match the described action. The ball is near the sliding player’s foot, and motion blur enhances realism. No significant distortions are visible. The scene accurately depicts a mid-tackle moment. VERDICT: PASS

PASS
Two gymnasts performing a synchronized handstand, side by side with identical body alignment

Evaluator (Qwen 3 VL)

Both gymnasts show correct limb count and proportions. Their synchronized handstand pose is anatomically accurate, with straight legs and aligned bodies. The action is clearly recognizable as a handstand. No distortions are visible. VERDICT: PASS

Hands 50%

Hands measures whether local/sefi-image-5b-turbo draws anatomically plausible hands, fingers, and gestures.

Easy 67%
PASS
A person standing with both hands visible at their sides, fingers relaxed and naturally spread

Evaluator (Qwen 3.5 122B)

Upon close examination, the left hand (viewer’s left) clearly shows 5 fingers with natural spacing and relaxed posture. The right hand (viewer’s right) also displays 5 fingers, though the thumb is partially obscured by angle — still anatomically plausible. No extra or missing digits; no fusion or distortion. Gesture matches “relaxed and naturally spread.” Minor occlusion doesn’t violate anatomical correctness. VERDICT: PASS

PASS
A woman holding her hands out in front of her palms-up, all ten fingers clearly visible

Evaluator (Qwen 3.5 122B)

The image shows a woman with both hands extended forward, palms facing up. Each hand clearly displays five fingers — left hand: thumb, index, middle, ring, pinky; right hand: same. All ten fingers are visible and anatomically positioned correctly for an open-palm gesture. No extra, missing, or malformed fingers. The gesture is natural and consistent with the prompt. VERDICT: PASS

FAIL
A man with arms crossed, both hands resting on opposite upper arms with fingers visible

Evaluator (Qwen 3.5 122B)

The image shows a man with arms crossed. On the left hand (resting on right arm), all 5 fingers are visible and anatomically correct. On the right hand (resting on left arm), only 4 fingers are clearly visible — the thumb is obscured or missing from view. While this may be due to angle, the prompt requires “fingers visible” and “exactly 5 per hand.” Since one hand does not show 5 distinct fingers, it fails the criterion. VERDICT: FAIL

Medium 67%
FAIL
A person holding a white ceramic coffee mug by the handle with their right hand

Evaluator (Qwen 3.5 122B)

The hand gripping the mug appears to be a right hand, but it shows only four fingers (thumb, index, middle, ring) — the pinky is missing or obscured. The grip itself is anatomically plausible for holding a mug, but the finger count is incorrect. Since the prompt specifies “exactly 5 per hand” and this hand visibly lacks a fifth finger, it fails the criterion. VERDICT: FAIL

PASS
A woman writing in a notebook with a pen gripped correctly in her right hand

Evaluator (Qwen 3.5 122B)

The right hand gripping the pen shows 5 fingers, but the thumb and index finger positioning is slightly unnatural. The left hand supporting the notebook also shows 5 fingers, though some are partially obscured. Overall, the grip is mostly anatomically correct with no major anomalies. VERDICT: PASS

PASS
A person playing an acoustic guitar with left hand on the fretboard and right hand strumming

Evaluator (Qwen 3.5 122B)

The left hand (fretting) shows four fingers clearly visible on the fretboard, with the thumb likely behind the neck — anatomically plausible. The right hand (strumming/picking) also displays four fingers in a natural position near the soundhole; the thumb is partially obscured but likely present. No obvious extra or missing fingers. Grip and gesture appear anatomically correct for guitar playing. Minor occlusion doesn’t violate the “exactly 5 per hand” requirement if we assume standard anatomy.

Hard 33%
FAIL
A person making a peace sign with their right hand, clearly showing two fingers up and three curled

Evaluator (Qwen 3.5 122B)

The hand in the image shows a peace sign gesture with two fingers extended and three curled. However, upon close inspection, the thumb is not clearly visible or properly positioned — it appears fused or missing from the base of the index finger, which breaks anatomical correctness. The curled fingers also appear slightly malformed or merged. This violates the requirement for exactly 5 distinct, correctly articulated fingers. VERDICT: FAIL

PASS
A hand giving a thumbs-up with four fingers curled and thumb extended upward

Evaluator (Qwen 3.5 122B)

The image shows a single hand making a thumbs-up gesture. The thumb is extended upward, and the other four fingers are curled into a fist — which matches the prompt. Anatomically, this is correct: humans have five fingers per hand (including the thumb), and here we see all five accounted for in their proper positions for this gesture. No anomalies or extra/missing digits are visible. VERDICT: PASS

FAIL
A person counting to three on their fingers, with index, middle, and ring fingers extended

Evaluator (Qwen 3.5 122B)

The image shows one hand making a gesture. The index and middle fingers are extended upward, while the ring and pinky fingers are curled into the palm. The thumb is also visible and extended outward. This is a “V” or peace sign gesture — not counting to three. The prompt asked for index, middle, and ring fingers extended — which is not what’s shown. Additionally, only one hand is visible, so we cannot verify “5 per hand” across both hands. Anatomically, the visible hand is correct, but it does n

Extreme 33%
PASS
Two people shaking hands in a professional setting, all fingers of both hands anatomically correct

Evaluator (Qwen 3.5 122B)

The image shows two hands shaking. The left hand (from viewer’s perspective) has 5 fingers visible and correctly formed. The right hand also displays 5 fingers, with natural curvature and positioning consistent with a handshake. No extra, missing, or malformed fingers are present. Grip appears anatomically plausible for a professional handshake. All criteria met. VERDICT: PASS

FAIL
Two people doing a high-five, palms meeting with all fingers extended on both hands

Evaluator (Qwen 3.5 122B)

The image shows two hands pressed together in a prayer-like gesture (Anjali Mudra), not a high-five. Each hand clearly displays 5 fingers, and the anatomy is correct — no extra or missing digits. However, the gesture does not match the prompt’s description of “palms meeting with all fingers extended on both hands” for a high-five; instead, fingers are aligned vertically and palms are flat against each other. The core criterion — matching the described action — is not met. VERDICT: FAIL

FAIL
One person handing a small gift box to another person, both sets of hands gripping the box correctly

Evaluator (Qwen 3.5 122B)

Upon close examination, the left hand (wearing a grey sleeve) shows 5 fingers gripping the box correctly. The right hand (top) also appears to have 5 fingers with natural positioning. However, the bottom-right hand has an anatomical anomaly: the thumb and index finger appear fused or malformed, and the finger count is ambiguous — possibly only 4 distinct digits visible. This violates the “exactly 5 per hand” and “anatomically correct” criteria. VERDICT: FAIL
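The tier and subcategory percentages shown for Hands are consistent with a plain pass count: three prompts per tier, and twelve prompts overall. The sketch below reproduces those figures from the PASS/FAIL labels listed above. It assumes the page rounds to the nearest whole percent and applies no weighting; the source does not state its aggregation method, and `pass_rate` is a hypothetical helper.

```python
# PASS/FAIL labels for the Hands entries above, grouped by difficulty tier.
hands = {
    "Easy": ["PASS", "PASS", "FAIL"],
    "Medium": ["FAIL", "PASS", "PASS"],
    "Hard": ["FAIL", "PASS", "FAIL"],
    "Extreme": ["PASS", "FAIL", "FAIL"],
}


def pass_rate(verdicts):
    """Share of PASS labels, as a whole-number percentage (assumed rounding)."""
    return round(100 * verdicts.count("PASS") / len(verdicts))


# Per-tier rate, then the rate over all twelve prompts pooled together.
tier_rates = {tier: pass_rate(v) for tier, v in hands.items()}
all_verdicts = [label for v in hands.values() for label in v]
overall = pass_rate(all_verdicts)

print(tier_rates)
print(overall)
```

Under these assumptions the tiers come out at 67, 67, 33, and 33, and the pooled figure at 50, matching the numbers displayed for this subcategory.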

Multi-Subject 100%

Multi-Subject measures whether local/sefi-image-5b-turbo keeps multiple people distinct while preserving their requested attributes, outfits, and positions.

Easy 100%
PASS
Two women standing side by side: one with short blonde hair wearing a red jacket, the other with long black hair wearing a blue dress

Evaluator (Qwen 3 VL)

Left woman: short blonde hair, wearing red jacket. Right woman: long black hair, wearing blue dress. Both are visually distinct and match descriptions exactly. No missing or mismatched elements. VERDICT: PASS

PASS
Two men at a cafe: one with a beard and glasses wearing a green sweater, the other clean-shaven in a black hoodie

Evaluator (Qwen 3 VL)

Person 1: Bearded man with glasses, wearing a green sweater — matches description. Person 2: Clean-shaven man, wearing a black hoodie — matches description. Both are visually distinct and present. All features align precisely with the prompt. VERDICT: PASS

PASS
Two friends posing: one tall with curly red hair in denim overalls, the other short with straight brown hair in a yellow sundress

Evaluator (Qwen 3 VL)

Tall person: curly red hair, denim overalls — matches. Short person: straight brown hair, yellow sundress — matches. Both visually distinct, posing together. All described features accurately represented. VERDICT: PASS

Medium 100%
PASS
Four people at a dinner table: a bald man in a suit, a woman with silver hair in a red blouse, a young man with dreadlocks in a denim jacket, and a woman with a hijab in a green dress

Evaluator (Qwen 3 VL)

All four individuals are present and visually distinct. Left: bald man in suit/tie. Second: woman with silver hair in red blouse. Third: young man with dreadlocks in denim jacket. Right: woman in green hijab and dress. Features match prompt exactly. VERDICT: PASS

PASS
Three musicians on stage: a tall woman with short pink hair playing bass, a man with a beard and bandana on drums, and a slim man with glasses playing keyboard

Evaluator (Qwen 3 VL)

Tall woman: pink bob haircut, playing bass. Man with beard/bandana: behind drums, bearded, wearing patterned bandana. Slim man with glasses: at keyboard, wearing glasses. All three are visually distinct and match descriptions. VERDICT: PASS

PASS
Three hikers on a trail: a man with a backpack and sunhat, a woman with braided hair and hiking poles, and a teenager with a baseball cap and water bottle

Evaluator (Qwen 3 VL)

Man: wears sunhat, backpack, hiking poles. Woman: has braided hair, wears hiking poles. Teenager: wears baseball cap, carries water bottle. All three are visually distinct and match described features. VERDICT: PASS
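The 86% shown for Human realism matches a pooled pass count across its four subcategories, not the average of the four subcategory percentages. The sketch below contrasts the two calculations using the counts visible in this section: 12 prompts each for Faces & Expressions, Full Body, and Hands, and 6 for Multi-Subject. This is an inference from the displayed numbers, not a documented ImageBench formula.

```python
# (passes, total) per Human realism subcategory, counted from the entries above.
subcategories = {
    "Faces & Expressions": (12, 12),
    "Full Body": (12, 12),
    "Hands": (6, 12),
    "Multi-Subject": (6, 6),
}

# Pooled: every prompt counts equally, regardless of subcategory size.
total_pass = sum(p for p, _ in subcategories.values())
total_prompts = sum(n for _, n in subcategories.values())
pooled = round(100 * total_pass / total_prompts)

# Macro average: every subcategory counts equally, regardless of prompt count.
macro = sum(100 * p / n for p, n in subcategories.values()) / len(subcategories)

print(pooled)
print(macro)
```

The pooled figure is 36 of 42, which rounds to 86, while the macro average is 87.5. The gap comes from Multi-Subject contributing only six prompts.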

Truthfulness 74%

Photorealism 100%

Photorealism measures whether local/sefi-image-5b-turbo makes materials, surfaces, and real-world objects look visually plausible and distinct.

Medium 100%
PASS
A still life on a wooden table featuring a clear glass of water, a polished silver spoon, a folded linen napkin, and a ripe red apple — each material rendered with accurate texture and light behavior

Evaluator (Qwen 3.6 27B)

The image displays a still life with distinct materials. The glass shows accurate refraction and transparency. The silver spoon exhibits sharp, realistic specular highlights. The wooden table has a natural grain texture, and the linen napkin shows a soft, woven fabric appearance. The apple’s skin reflects light with a subtle sheen, consistent with a ripe fruit. All materials demonstrate correct visual properties regarding texture and light behavior. VERDICT: PASS

PASS
A workspace desk with a brushed aluminum laptop, a leather-bound notebook, a ceramic coffee mug, and a glass paperweight — all materials visually distinct and realistic

Evaluator (Qwen 3.6 27B)

The image successfully renders distinct materials with high fidelity. The laptop exhibits a convincing brushed aluminum texture with soft, diffuse reflections. The notebook shows realistic leather grain and stitching details. The ceramic mug displays a matte, slightly porous surface with a glossy glaze rim, while the paperweight captures sharp, refractive highlights characteristic of glass. The wooden desk provides a natural, textured base. All materials are visually distinct and physically plau

PASS
A kitchen counter with a stainless steel pot, a wooden cutting board, a glass bottle of olive oil, and a cotton dish towel — each surface reflecting light correctly

Evaluator (Qwen 3.6 27B)

The image accurately renders the distinct material properties requested. The stainless steel pot exhibits sharp, metallic reflections and a brushed texture. The wooden cutting board shows natural grain and matte diffusion. The glass bottle demonstrates correct translucency and refraction of the oil inside, while the cotton towel displays soft, fibrous texture with appropriate light absorption. All surfaces interact with light realistically, maintaining physical consistency. VERDICT: PASS

Physics & Reflections 83%

Physics & Reflections measures whether local/sefi-image-5b-turbo follows basic physical cues such as shadows, reflections, gravity, transparency, and contact points.

Easy 67%
PASS
A red sphere on a white surface with a single shadow cast to the right, consistent with a light source from the upper left

Evaluator (Qwen 3.6 27B)

The shadow is cast to the right and slightly downward, which is consistent with a light source from the upper left. The shadow’s shape is an elongated oval, appropriately distorted by perspective and the sphere’s curvature. The highlight on the sphere’s upper-left surface further confirms the lighting direction. The shadow’s soft edge and gradual fade are physically plausible for a diffused light source. No anomalies in color or orientation are present. VERDICT: PASS

PASS
A coffee mug on a sunny windowsill casting a sharp shadow to the left

Evaluator (Qwen 3.6 27B)

The shadow is cast to the left, which is consistent with light coming from the right (the window). The shadow’s shape accurately reflects the mug and handle, and its sharpness matches the strong, direct sunlight implied by the bright highlight on the windowsill. No distortions or color anomalies are present. The physics of light and shadow are correctly rendered. VERDICT: PASS

FAIL
A person standing outdoors at midday with a short shadow directly below them

Evaluator (Qwen 3.6 27B)

The shadow is long and stretches forward, indicating a low sun angle (morning or evening), which directly contradicts the "midday" and "short shadow" requirements of the prompt. Additionally, the shadow's shape is distorted and does not accurately reflect the person's silhouette. VERDICT: FAIL
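
The judge's reasoning here rests on a simple geometric relation: on level ground, an object of height h casts a shadow of length h / tan(e), where e is the sun's elevation angle. A minimal Python sketch of that relation follows. The helper name and the sample elevations are illustrative assumptions, not part of the benchmark.

```python
import math

def shadow_length(height_m: float, sun_elevation_deg: float) -> float:
    """Shadow length on level ground for a given sun elevation (hypothetical helper)."""
    if not 0 < sun_elevation_deg <= 90:
        raise ValueError("elevation must be in (0, 90] degrees")
    return height_m / math.tan(math.radians(sun_elevation_deg))

# Illustrative values only: a 1.7 m person under a high sun and a low sun.
high_sun = shadow_length(1.7, 75)  # shorter than the person
low_sun = shadow_length(1.7, 15)   # several times the person's height
```

A shadow much longer than the subject therefore implies a low sun, which is why a long shadow conflicts with a "midday" prompt.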

Medium 67%
FAIL
A woman standing in front of a full-length mirror, her reflection accurately showing the back of her outfit and reversed orientation

Evaluator (Qwen 3.6 27B)

The reflection is physically impossible. The woman stands with her back to the mirror, yet the reflection shows her front. Additionally, the reflection is not laterally reversed; her right arm in the reflection corresponds to her right arm in reality, rather than mirroring to the left. This violates the fundamental laws of optics for a flat mirror. VERDICT: FAIL

PASS
A cat sitting on a polished marble floor with its reflection visible below

Evaluator (Qwen 3.6 27B)

The reflection is vertically inverted, correctly mirroring the cat’s posture and orientation. The distortion aligns with a polished floor surface—slightly blurred and softened, consistent with real-world reflections on glossy marble. Colors match the original subject without unnatural shifts. The reflection’s position directly beneath the cat follows expected physics. No directional or structural inconsistencies are present. VERDICT: PASS

PASS
A vase of flowers on a glass table with a clear reflection visible on the table surface

Evaluator (Qwen 3.6 27B)

The reflection on the glass table is physically plausible. It correctly mirrors the vase and flowers with appropriate vertical inversion, slight distortion due to the glass curvature, and reduced brightness consistent with real-world reflections. The stems and waterline are accurately reflected, and the color tones match the original objects. No directional or orientation errors are present. VERDICT: PASS

Hard 100%
PASS
A row of trees reflected in a calm lake at sunset, with the reflection slightly darker and gently rippled

Evaluator (Qwen 3.6 27B)

The reflections are vertically inverted, matching the trees' positions and maintaining correct orientation. The water surface shows gentle ripples that distort the reflection slightly, consistent with a calm lake. The reflected colors mirror the warm sunset tones while appearing subtly darker due to light absorption in water—physically accurate. No directional inconsistencies or unnatural distortions are present. VERDICT: PASS

PASS
A city skyline reflected in a rain puddle on a street, with mild distortion and color shift

Evaluator (Qwen 3.6 27B)

The reflection is vertically inverted, which is physically correct for a puddle. The color shift toward cooler, bluer tones in the reflection mimics the sky’s influence on water surfaces. Mild distortion is present, especially near the edges where the puddle meets the ground, consistent with real-world water ripples and surface tension. The framing by the rough pavement edge enhances realism. Overall, the reflection behaves as expected under natural lighting and water conditions. VERDICT: PASS

PASS
A mountain range reflected in a still alpine lake, with the reflection inverted and subtly muted in color

Evaluator (Qwen 3.6 27B)

The reflection is perfectly inverted and matches the mountain range’s orientation, which is physically plausible for a still lake. The colors in the reflection are subtly muted compared to the actual mountains, consistent with how water reflects light. Shadows align correctly with the light source direction. No distortions or anomalies are present. The scene adheres to real-world optical principles. VERDICT: PASS

Extreme 100%
PASS
A clear glass sphere sitting on a mirror surface near a pool of water, showing refraction through the sphere, reflection on the mirror, and water reflection simultaneously

Evaluator (Qwen 3.6 27B)

The image displays a glass sphere on a reflective surface. The reflection is vertically inverted and correctly positioned directly beneath the sphere, consistent with a mirror plane. The refraction through the sphere inverts the background scene (pool and sky), which is physically accurate for a convex lens effect. The colors and lighting are consistent between the real scene, the refracted view, and the reflection. No artifacts or logical inconsistencies are present in the optical phenomena. V

PASS
A chrome sculpture in a room with a glass floor and large windows, showing multiple reflections and environmental mapping on every surface

Evaluator (Qwen 3.6 27B)

The reflections are physically plausible. The chrome sculpture accurately mirrors the surrounding environment, including the grid of the glass floor and the bright windows, with appropriate distortion based on its curved surface. The reflection on the glass floor is correctly oriented and slightly dimmer than the object itself, consistent with real-world optics. The lighting is coherent, with highlights matching the window positions. VERDICT: PASS

PASS
A crystal wine glass on a polished marble counter next to a window overlooking a lake — showing refraction in the glass, reflection on the marble, and the lake's own reflective surface

Evaluator (Qwen 3.6 27B)

The reflections and refractions are physically plausible. The wine glass shows realistic distortion of the background lake through its curved surface, with appropriate light bending at the bowl’s edges. The marble countertop reflects the glass stem and base accurately, matching the angle and intensity of ambient light. The lake’s surface outside mirrors the sky and distant treeline with natural softness. Shadows fall consistently under the glass, aligned with the window’s light source. No direct

World Knowledge 58%

World Knowledge measures whether local/sefi-image-5b-turbo depicts recognizable places, landmarks, flags, and known objects with accurate real-world details.

Easy 100%
PASS
The Eiffel Tower in Paris on a clear day, architecturally accurate with its iron lattice structure and three levels

Evaluator (Qwen 3.6 27B)

The image depicts the Eiffel Tower with high architectural accuracy. The intricate iron lattice structure, the three distinct levels (including the first platform and the top observation deck), and the characteristic tapering shape are all correctly rendered. The perspective from the base looking up through the arch is realistic. The background shows a clear blue sky and distant city buildings consistent with the Parisian skyline. There are no visible artifacts or structural distortions. VERDIC

PASS
The Taj Mahal in Agra seen from the reflecting pool, with accurate dome, minarets, and symmetry

Evaluator (Qwen 3.6 27B)

The image depicts the Taj Mahal with high architectural accuracy, including the central dome, four minarets, and symmetrical layout. The reflection in the pool is clear and realistic. Minor details like the arched entrances and decorative elements are well-rendered. No significant inaccuracies are present. VERDICT: PASS

PASS
The Statue of Liberty on her island pedestal, with correct torch, crown, tablet, and green patina

Evaluator (Qwen 3.6 27B)

The Statue of Liberty is clearly recognizable and accurately depicted with correct proportions, green patina, torch, crown, and tablet. The pedestal architecture matches the real structure, including stone masonry and base details. Minor imperfections like slight texture smoothing or lighting uniformity are typical of AI generation but do not detract from factual accuracy. Core elements align with reality. VERDICT: PASS

Medium 67%
FAIL
A traditional Japanese shamisen instrument with accurate three-stringed design, wooden body, and bachi plectrum

Evaluator (Qwen 3.6 27B)

The image depicts a shamisen, but it is not factually accurate. The body is a solid wooden bowl rather than the traditional drumhead (skin) stretched over a frame. The bridge and tailpiece assembly appears structurally nonsensical and fused. The bachi (plectrum) is floating in mid-air, disconnected from any hand or context. While recognizable as a shamisen, the construction details are physically impossible and architecturally incorrect for a real-world instrument. VERDICT: FAIL

PASS
A medieval astrolabe with accurate rete, tympan, and engraved degree markings on brass

Evaluator (Qwen 3.6 27B)

The image depicts a complex brass astronomical instrument that closely resembles a historical astrolabe or planisphere. It features recognizable components such as a rete (star map), degree markings, and engraved scales. The metal texture, patina, and mechanical complexity appear realistic and consistent with antique scientific instruments. While specific historical accuracy of every engraving is hard to verify without expert analysis, the overall subject is clearly recognizable and factually pl

PASS
A Japanese kintsugi bowl — cracked ceramic repaired with visible gold lacquer lines

Evaluator (Qwen 3.6 27B)

The image depicts a ceramic bowl with a pale glaze and a dark unglazed foot, consistent with traditional Japanese pottery. The cracks are filled with gold lacquer, accurately representing the kintsugi technique. The lighting, shadows, and texture appear realistic and physically plausible. There are no architectural or factual inaccuracies; the subject is clearly recognizable as a kintsugi bowl. VERDICT: PASS

Hard 0%
FAIL
A detailed diagram of the water cycle showing evaporation from the ocean, condensation forming clouds, precipitation as rain, and collection in rivers flowing back to the sea

Evaluator (Qwen 3.6 27B)

The image depicts the water cycle, but the text labels are nonsensical gibberish (e.g., "PRELGIATION" instead of Precipitation, "EOPR EKCTOIN" instead of Evaporation). This makes the diagram factually inaccurate and fails to meet the requirement for a clear, educational diagram. VERDICT: FAIL

FAIL
A cross-section showing tectonic plates at a convergent boundary with subduction, a trench, volcanic arc, and layered mantle

Evaluator (Qwen 3.6 27B)

The image depicts a geological cross-section with recognizable features like a subducting plate, trench, and volcanic arc. However, the text labels are nonsensical gibberish (e.g., "CONVETOII NT LIRABES", "MROARE"), which is a common AI generation artifact. While the visual structure is roughly accurate to the prompt's request for a convergent boundary, the illegible and incorrect labeling significantly detracts from the factual accuracy and utility of the diagram. VERDICT: FAIL

FAIL
An anatomical illustration of the human heart showing four chambers, valves, aorta, pulmonary arteries, and veins with correct placement

Evaluator (Qwen 3.6 27B)

The image is not factually or anatomically accurate. While it resembles a heart, the major vessels are misshapen and incorrectly connected (e.g., the aorta and pulmonary artery appear fused or malformed). The internal structures shown in the cutaway are nonsensical blobs rather than recognizable chambers or valves. Additionally, there is a stray text artifact ("□ ......") on the left side. It fails to meet the criteria for correct placement and anatomical accuracy. VERDICT: FAIL

Extreme 67%
FAIL
A Morpho menelaus butterfly with accurate iridescent blue dorsal wings, brown ventral wing pattern with eyespots, and correct body morphology

Evaluator (Qwen 3.6 27B)

The image depicts a butterfly with vibrant blue wings and brown edges, resembling a Morpho species. However, the ventral side (underside) is incorrectly shown as having the same bright blue coloration as the dorsal side, which is not accurate for Morpho menelaus—the underside should be predominantly brown with eyespots. Additionally, the body morphology appears slightly off, with an overly elongated abdomen and unusual wing shape. These inaccuracies affect factual correctness. VERDICT: FAIL

PASS
The Hagia Sophia interior showing the massive central dome, pendentives, semi-domes, and Islamic calligraphy medallions alongside Byzantine mosaics

Evaluator (Qwen 3.6 27B)

The image accurately depicts the interior of the Hagia Sophia, showcasing its massive central dome, pendentives, and semi-domes. The presence of both Byzantine mosaics and large Islamic calligraphy medallions is historically and architecturally correct for this site. The structural details, lighting, and ornate decorations align with real-world photographs of the monument. No significant inaccuracies are observed. VERDICT: PASS

PASS
A specimen of bismuth crystal showing accurate iridescent oxide layers, geometric hopper crystal formation, and characteristic staircase structure

Evaluator (Qwen 3.6 27B)

The image accurately depicts a bismuth crystal specimen. It correctly features the characteristic geometric hopper crystal formation with distinct staircase structures. The iridescent oxide layers are rendered with realistic colors and patterns, consistent with real-world bismuth crystals. The overall composition and details align well with the prompt's requirements for accuracy in depicting this mineral. VERDICT: PASS
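
The percentages shown for Truthfulness and its subcategories are consistent with pooling pass counts over all prompts. A minimal Python sketch of that arithmetic follows, using pass counts tallied from the PASS/FAIL labels above. Pooling is an assumption here, since the page does not state how ImageBench aggregates, and the function and variable names are hypothetical.

```python
# (passes, total) per difficulty tier, tallied from the PASS/FAIL labels above.
truthfulness = {
    "Photorealism": {"Medium": (3, 3)},
    "Physics & Reflections": {
        "Easy": (2, 3), "Medium": (2, 3), "Hard": (3, 3), "Extreme": (3, 3),
    },
    "World Knowledge": {
        "Easy": (3, 3), "Medium": (2, 3), "Hard": (0, 3), "Extreme": (2, 3),
    },
}

def pooled_counts(tiers):
    """Sum passes and totals across tiers (assumed aggregation, not confirmed)."""
    passes = sum(p for p, _ in tiers.values())
    total = sum(t for _, t in tiers.values())
    return passes, total

def percent(passes, total):
    return round(100 * passes / total)

# Per-subcategory percentages, then the pooled category figure.
sub_scores = {name: percent(*pooled_counts(t)) for name, t in truthfulness.items()}
all_passes = sum(pooled_counts(t)[0] for t in truthfulness.values())
all_total = sum(pooled_counts(t)[1] for t in truthfulness.values())
category_score = percent(all_passes, all_total)
```

Pooling gives 20 passes out of 27 prompts, which rounds to the 74% displayed. A plain mean of the three subcategory rates would instead give about 81%, so pooling fits the displayed numbers better.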

Professional Studio 93%

Camera & Lighting 92%

Camera & Lighting measures whether local/sefi-image-5b-turbo follows specific lens, angle, focus, exposure, and lighting instructions.

Easy 100%
PASS
A wide-angle shot of a modern living room interior showing expansive space with slight barrel distortion at the edges

Evaluator (Gemma 4 26B)

The image successfully implements the requested wide-angle perspective. It shows an expansive, modern living room with a clear sense of depth. Most importantly, it features a very prominent barrel distortion effect at the edges (the curved black vignetting/warping), which directly matches the prompt's requirement for "slight barrel distortion at the edges" (though the effect here is actually quite heavy/pronounced, it fulfills the intent). The lighting is bright and even. VERDICT: PASS

PASS
A close-up macro shot of a single dewdrop on a rose petal, shallow depth of field

Evaluator (Gemma 4 26B)

The image perfectly matches the prompt. It features a close-up macro shot of a single, clear dewdrop resting on a pink rose petal. The depth of field is extremely shallow, with the dewdrop in sharp focus while the surrounding petal texture and background are softly blurred. The lighting is soft and natural, highlighting the transparency and curvature of the droplet. VERDICT: PASS

PASS
An aerial bird's-eye view looking straight down at a traffic roundabout with cars and trees

Evaluator (Gemma 4 26B)

The image perfectly matches the prompt. It features a true aerial bird's-eye view looking straight down at a circular traffic roundabout. The perspective is top-down, showing the central island, the surrounding road, and the cars within the frame. The lighting is consistent with daylight, and the depth of field is appropriate for a high-altitude aerial shot, keeping the entire scene in relatively sharp focus. VERDICT: PASS

Medium 67%
PASS
A portrait lit with classic Rembrandt lighting — triangle of light on the shadowed cheek, moody atmosphere

Evaluator (Gemma 4 26B)

The image perfectly matches the prompt. The lighting is classic Rembrandt lighting, characterized by a distinct triangle of light on the shadowed cheek (the subject's right cheek). The lighting creates a moody, dramatic atmosphere with deep shadows and a high-contrast, painterly aesthetic. The perspective is a standard portrait, and the depth of field is shallow, focusing entirely on the subject's face against a dark, indistinct background. VERDICT: PASS

FAIL
A landscape photograph bathed in golden hour light — long warm shadows, orange sky, soft glow

Evaluator (Gemma 4 26B)

The image features an orange sky and a warm, golden glow consistent with the prompt. However, the "landscape" is highly abstract and appears to be a textured fabric or a digital artifact rather than a natural landscape. While there are dark shapes that could be interpreted as shadows, the overall composition lacks the realistic depth and perspective of a landscape photograph. VERDICT: FAIL

PASS
A high-key portrait of a person against a bright white background, minimal shadows, airy and luminous

Evaluator (Gemma 4 26B)

The image perfectly matches the prompt. It features a high-key portrait with a bright white background and a luminous, airy quality. The lighting is extremely bright and even, resulting in minimal shadows on the subject's face. The overall aesthetic is soft and ethereal, successfully capturing the "airy and luminous" requirement. VERDICT: PASS

Hard 100%
PASS
A portrait shot at 85mm f/1.4 with creamy bokeh background, backlit with warm rim light separating the subject from a blurred garden

Evaluator (Gemma 4 26B)

The image perfectly matches the prompt. The perspective is a classic portrait shot. The depth of field is shallow, creating a creamy, soft bokeh in the garden background. Most importantly, the lighting features a distinct warm rim light (backlighting) along the subject's hair and shoulders, effectively separating him from the background as requested. VERDICT: PASS

PASS
A street photograph with shallow depth of field as if shot at 50mm f/1.8, subject sharp with soft city lights in the background

Evaluator (Gemma 4 26B)

The image successfully captures the requested aesthetic. The subject is sharp, and the background features a shallow depth of field with soft, circular bokeh from city lights, accurately simulating a 50mm f/1.8 lens. The lighting on the subject's face is natural for a street photograph at night. Despite the presence of unwanted text artifacts over the subject's face, the specific camera/lighting criteria regarding perspective, depth of field, and lighting pattern are met. VERDICT: PASS

PASS
A food photograph shot from above at 35mm f/2.8 with natural window light from the left, slight vignette, warm color temperature

Evaluator (Gemma 4 26B)

The image matches all specified criteria. The perspective is a top-down shot (shot from above). The depth of field is shallow, consistent with an f/2.8 aperture, as the edges of the plate and background are softly blurred. The lighting clearly originates from the window on the left, creating a natural light pattern. The color temperature is warm, and a subtle vignette is visible around the edges of the frame. VERDICT: PASS

Extreme 100%
PASS
A tilt-shift photograph of a city street making real-life look like a miniature model — selective focus strip, saturated colors

Evaluator (Gemma 4 26B)

The image successfully implements the tilt-shift effect. It features a distinct horizontal strip of sharp focus in the middle ground, with heavy blur (shallow depth of field) in both the foreground and the background, creating the "miniature model" look. The colors are highly saturated, and the high-angle perspective enhances the toy-like appearance. The lighting is bright and even, consistent with a daylight street scene. VERDICT: PASS

PASS
A cinematic frame with blue and orange anamorphic lens flare streaking horizontally across a dark moody scene

Evaluator (Gemma 4 26B)

The image features a horizontal light streak that transitions from blue on the left to orange on the right, set against a dark, moody background. This directly matches the prompt's description of "blue and orange anamorphic lens flare streaking horizontally across a dark moody scene." The lighting pattern and color scheme are accurately represented. VERDICT: PASS

PASS
A portrait with dramatic split lighting — half the face brightly lit, the other half in deep shadow, high contrast chiaroscuro

Evaluator (Gemma 4 26B)

The image perfectly matches the prompt's requirements. It features a portrait with dramatic split lighting, where the left side of the face is brightly lit and the right side is cast in deep, heavy shadow. This creates a high-contrast chiaroscuro effect. The perspective is a standard portrait shot, and the lighting pattern is exactly as described. VERDICT: PASS

Color Precision 100%

Color Precision measures whether local/sefi-image-5b-turbo renders requested colors clearly and consistently across subjects, products, and backgrounds.

Easy 100%
PASS
A bright red sports car parked on a gray asphalt road — the red is vivid and unambiguous

Evaluator (Gemma 4 26B)

The image features a bright, vivid red sports car parked on a gray asphalt road. The red color of the car is highly saturated and unambiguous, and the road is a clear gray, perfectly matching the color descriptions provided in the prompt. VERDICT: PASS

PASS
A woman wearing a royal blue evening dress against a neutral background

Evaluator (Gemma 4 26B)

The image features a woman wearing a vibrant, deep blue evening dress. The color of the dress is a true royal blue, and the background is a neutral, light gray. Both the dress color and the background color strictly adhere to the requirements specified in the prompt. VERDICT: PASS

PASS
A field of pure sunflower yellow flowers under a clear blue sky

Evaluator (Gemma 4 26B)

The image perfectly matches the color descriptions provided in the prompt. The flowers are a vibrant, pure sunflower yellow, and the sky is a clear, bright blue. There are no deviations from the specified color palette. VERDICT: PASS

Medium 100%
PASS
A bedroom wall painted in #FF6B35 (burnt orange) with white furniture and natural daylight

Evaluator (Gemma 4 26B)

The image features a bedroom wall painted in a vibrant orange color that closely matches the description of "burnt orange" (#FF6B35). The furniture, including the bed and the two nightstands, is white as requested. The scene is illuminated by bright, natural daylight coming from the left side. All color requirements are accurately met. VERDICT: PASS

PASS
A forest scene where the foliage is precisely #2D5F2D (dark muted green) with brown tree trunks

Evaluator (Gemma 4 26B)

The image features a forest scene with foliage in a dark, muted green shade and brown tree trunks. The green color of the leaves closely aligns with the description of #2D5F2D (a dark, desaturated green), and the tree trunks are clearly brown. The colors accurately reflect the specific color requirements provided in the prompt. VERDICT: PASS

PASS
A room with walls in #4A90D9 (medium blue) with warm wooden floor and white trim

Evaluator (Gemma 4 26B)

The image accurately reflects the prompt's color specifications. The walls are a medium blue consistent with the hex code #4A90D9. The floor is a warm wooden texture, and the baseboard/trim is white. All color elements match the described requirements. VERDICT: PASS

Hard 100%
PASS
A minimalist room with walls in #F5E6CC (warm cream), a sofa in #2C3E50 (dark navy), and accent pillows in #E74C3C (bright red)

Evaluator (Gemma 4 26B)

The image accurately reflects the requested colors. The walls are a warm cream/off-white, the sofa is a dark navy blue, and the accent pillows are a bright, vibrant red. These colors align with the hex codes and descriptions provided in the prompt (#F5E6CC for walls, #2C3E50 for the sofa, and #E74C3C for the pillows). VERDICT: PASS

PASS
A product flat-lay on a #1A1A2E (deep dark blue) background, with a #F0C27F (soft gold) watch and #FFFFFF (pure white) earbuds

Evaluator (Gemma 4 26B)

The image accurately reflects the requested colors. The background is a deep dark blue consistent with #1A1A2E (the hex code is even explicitly written in the image). The watch features a soft gold/beige color matching #F0C27F, and the earbuds are pure white (#FFFFFF). All three color requirements are met precisely. VERDICT: PASS

PASS
A graphic design composition with background #FAFAFA (near-white), main shape #FF4500 (orange-red), and accent lines #333333 (dark charcoal)

Evaluator (Gemma 4 26B)

The image accurately reflects the requested color palette. The background is a near-white/off-white color consistent with #FAFAFA. The large central shape is a vibrant orange-red, matching the description of #FF4500. The accent lines are a dark charcoal/near-black color, consistent with #333333. All three specified colors are correctly represented in the composition. VERDICT: PASS

Extreme 100%
PASS
A smooth gradient on a wall going from #FF0000 (red) on the left to #0000FF (blue) on the right, transitioning evenly through purple

Evaluator (Gemma 4 26B)

The image features a smooth gradient transitioning from a bright red on the left to a deep blue on the right. The middle section transitions through a vibrant purple, as requested. The colors align with the specified hex values: the left side is a pure red (#FF0000), the right side is a pure blue (#0000FF), and the transition through purple is even and smooth. VERDICT: PASS

PASS
A sunset sky gradient on a flat surface from #FF6B35 (orange) at the bottom to #1B1464 (deep indigo) at the top

Evaluator (Gemma 4 26B)

The image features a vertical gradient. The bottom color is a vibrant orange, which aligns well with the specified #FF6B35. The top color is a deep, dark indigo/violet, which matches the description of #1B1464. The transition between the orange and the deep indigo is smooth and follows the requested color scheme accurately. VERDICT: PASS

PASS
A horizontal banner with a gradient from #00C9FF (cyan) on the left to #92FE9D (mint green) on the right

Evaluator (Gemma 4 26B)

The image shows a horizontal banner with a smooth gradient. On the left, the color is a bright cyan, which aligns well with the specified #00C9FF. On the right, the color transitions into a light, bright mint green, which matches the description of #92FE9D. The gradient direction and the color transitions accurately reflect the requested hex values and names. VERDICT: PASS
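
The gradient prompts above name exact hex endpoints, so the expected intermediate colors can be worked out numerically. A minimal Python sketch follows, assuming straight linear interpolation of the 8-bit sRGB channel values. The benchmark does not say how its judges assess gradients, and the helper names are hypothetical.

```python
def hex_to_rgb(code: str) -> tuple[int, int, int]:
    """Parse '#RRGGBB' into a tuple of 0-255 integers."""
    code = code.lstrip("#")
    return tuple(int(code[i:i + 2], 16) for i in (0, 2, 4))

def rgb_to_hex(rgb) -> str:
    """Format a 0-255 RGB triple as '#RRGGBB'."""
    return "#{:02X}{:02X}{:02X}".format(*rgb)

def lerp_color(start: str, end: str, t: float) -> str:
    """Channel-wise linear blend between two hex colors, with t in [0, 1]."""
    a, b = hex_to_rgb(start), hex_to_rgb(end)
    return rgb_to_hex(tuple(round(x + (y - x) * t) for x, y in zip(a, b)))

# Midpoint of the red-to-blue gradient prompt.
midpoint = lerp_color("#FF0000", "#0000FF", 0.5)
```

Under this assumption the red-to-blue midpoint has equal red and blue channels and no green, which is a purple, matching the prompt's "transitioning evenly through purple".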

Photorealism 67%

Photorealism measures whether local/sefi-image-5b-turbo produces polished product and studio imagery with realistic materials, lighting, shadows, and finish.

Easy 67%
PASS
A single red sneaker on a clean white background with soft studio lighting and gentle shadow — product photography style

Evaluator (Gemma 4 26B)

The image successfully meets the prompt requirements. The red sneaker features a consistent texture and realistic soft studio lighting. The shadow underneath the shoe is natural and well-integrated with the white background. There are no obvious AI artifacts, such as warped laces or nonsensical geometry. The image looks like a high-quality, professional product photograph. VERDICT: PASS

A clear glass perfume bottle on a reflective black surface with studio lighting, no background distractions

Evaluator (Gemma 4 26B)

The image fails to meet the standard of a real photograph. There are significant AI artifacts regarding the physics of light and glass. The internal structure of the bottle is nonsensical; the "straw" or tube inside is disconnected and morphs into a vague, blurry line. Furthermore, the reflections on the glass and the surface are inconsistent with a real-world light source, and the cap's geometry is structurally impossible and distorted. VERDICT: FAIL

A ripe green apple on a seamless white background with soft side lighting and subtle shadow — catalog style

Evaluator (Gemma 4 26B)

The image is a high-quality studio shot that perfectly matches the prompt. The green apple features realistic textures, including subtle skin pores and natural color variations. The lighting is soft and directional, creating a realistic soft shadow on the seamless white background. There are no obvious AI artifacts, such as distorted shapes or unnatural textures. It passes as a professional product photograph at first glance. VERDICT: PASS
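Every evaluator note in this report ends with a `VERDICT: PASS` or `VERDICT: FAIL` marker. A minimal sketch of how that marker could be extracted from a note follows; the function name `parse_verdict` and the regular expression are assumptions for illustration, not the benchmark's actual parser. Notes that were cut off before the marker return `None` instead of a guessed verdict.

```python
import re
from typing import Optional

# Assumed format: the note ends with "VERDICT: PASS" or "VERDICT: FAIL".
VERDICT_RE = re.compile(r"VERDICT:\s*(PASS|FAIL)\s*$")


def parse_verdict(note: str) -> Optional[str]:
    """Return 'PASS' or 'FAIL' if the note ends with a verdict marker, else None."""
    match = VERDICT_RE.search(note.strip())
    return match.group(1) if match else None


print(parse_verdict("The shadow is natural and well-integrated. VERDICT: PASS"))
print(parse_verdict("The cap's geometry is distorted. VERDICT: FAIL"))
print(parse_verdict("All key characteristics are present. VE"))  # truncated note
```

Returning `None` for a truncated note keeps the missing verdict visible, so it can be resolved by looking at the image rather than inferred.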

Graphical design 54%

Data Visualisation 0%

Data Visualisation measures whether local/sefi-image-5b-turbo produces charts with correct labels, proportions, colors, and readable values.

Easy 0%
A bar chart showing monthly sales for January through June with values 12K, 18K, 15K, 22K, 19K, 25K — labeled axes, color-coded bars, and a title 'Monthly Sales 2025'

Evaluator (Qwen 3 VL)

The chart shows a bar chart titled "Monthly Sales 2025" with six bars. However, the x-axis labels are garbled (e.g., "1ank", "Jaak"), and the y-axis is mislabeled as "CNPPA1KQ". The values on bars (12K, 25K, etc.) do not match the requested sequence (12K, 18K, 15K, 22K, 19K, 25K). The chart type is correct, but critical data and labels are corrupted. VERDICT: FAIL

A pie chart showing market share: Company A 35%, Company B 25%, Company C 20%, Others 20% — each slice a different color with labels and percentages

Evaluator (Qwen 3 VL)

The chart is a pie chart labeled "Market Share" but mislabels segments: "Company A" is 25% (should be 35%), "Others" is 20% (should be 20% total, but labeled as "Ouhlers" and "fOthclx"), and "Company B" is missing. Percentages and labels are incorrect and unreadable. Chart type matches, but data is wrong. VERDICT: FAIL
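For reference, the proportions requested in the pie chart prompt can be turned into expected slice angles. This is a small sketch of that arithmetic only; the dictionary name `requested_shares` is hypothetical and the benchmark is not known to perform this check.

```python
# Shares requested in the prompt, in percent.
requested_shares = {
    "Company A": 35,
    "Company B": 25,
    "Company C": 20,
    "Others": 20,
}

# A valid pie chart specification must account for the whole circle.
total = sum(requested_shares.values())

# Each percentage point corresponds to 360 / 100 degrees of arc.
angles = {label: share * 360 / 100 for label, share in requested_shares.items()}

for label, angle in angles.items():
    print(f"{label}: {requested_shares[label]}% -> {angle:.0f} degrees")
```

A correct rendering would therefore show Company A as the largest slice, Company B as a quarter circle, and two equal smaller slices.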

A simple line graph showing daily website visitors over one week (Mon-Sun): 1200, 1500, 1100, 1800, 2000, 1700, 1400 — with grid lines, axis labels, and title 'Weekly Traffic'

Evaluator (Qwen 3 VL)

The chart shows a line graph titled "Weekly Traffic" with grid lines and axis labels. However, the x-axis labels ("Mono", "1570", "11C", etc.) are illegible and do not represent days Mon-Sun. The y-axis is mislabeled with "Visitors" rotated incorrectly. Data values shown (e.g., "118001", "27.00J") are corrupted and inconsistent with the requested values. Chart type matches, but critical elements are unreadable and incorrect. VERDICT: FAIL

Layout & Design 33%

Layout & Design measures whether local/sefi-image-5b-turbo organizes posters, menus, covers, and other designed compositions with the requested hierarchy and elements.

Easy 100%
A movie night poster with a large popcorn image in the center, the title 'MOVIE NIGHT' in bold at the top, and the tagline 'Every Friday at 8 PM' at the bottom

Evaluator (Qwen 3 VL)

The image features a large popcorn bucket centered, with "MOVIE NIGHT" in bold white text at the top and "Every Friday at 8 PM" at the bottom. Text is readable and correctly placed per the prompt. Visual hierarchy is clear, with the popcorn as the focal point and text framing it appropriately. All required layout elements are present and correctly positioned. VERDICT: PASS

An event poster featuring a guitar illustration, the headline 'LIVE MUSIC FEST' in large text, and 'Tickets available now' as a tagline below

Evaluator (Qwen 3 VL)

The image displays a guitar illustration with "LIVE MUSIC FEST" in large, layered text above it, and "Tickets available now" below the guitar. Text is readable and correctly placed per the prompt. Visual hierarchy is clear, with headline dominating, tagline subordinate. All required elements are present and correctly positioned. VERDICT: PASS

A book club poster with an open book illustration, the title 'JOIN THE BOOK CLUB' prominently displayed, and 'Meets every Wednesday' as the tagline

Evaluator (Qwen 3 VL)

The image displays an open book illustration with "JOIN THE BOOK CLUB" in large, bold text above it, establishing correct visual hierarchy. The tagline "Meets every Wednesday" is clearly readable and positioned below the book. All required layout elements are present and correctly placed. VERDICT: PASS

Medium 0%
A landing page design with a hero header image at the top, a headline and subtitle in the middle, a 'Sign Up' call-to-action button, and a footer with three social media icons

Evaluator (Qwen 3 VL)

The image shows a hero header with a background image, a headline and subtitle (though text is garbled/unreadable), a "Sign Up" button, and a footer with three social icons. The layout structure is present with correct visual hierarchy. However, the text is illegible and appears corrupted, failing the readability requirement. VERDICT: FAIL

A newsletter layout with a banner image header, a two-column body section with text on the left and an image on the right, a highlighted quote block, and a footer with unsubscribe link

Evaluator (Qwen 3 VL)

The image shows a newsletter layout with a header banner, two-column body (text left, image right), a highlighted quote block, and a footer with an unsubscribe link. However, all text is gibberish and unreadable, violating the requirement for readable, correctly placed text. The visual hierarchy is present but meaningless due to illegible content. VERDICT: FAIL

A restaurant menu layout with a logo header, three category sections (appetizers, mains, desserts) each with items and prices, and a footer with address and hours

Evaluator (Qwen 3 VL)

The image shows a menu with a header logo, three main sections (Appetizers, Mains, Desserts), and a footer with address/hours. However, all text is gibberish, unreadable, and incorrectly placed. The layout structure is present but visually unusable due to illegible content. VERDICT: FAIL

Hard 0%
A magazine double-page spread with a left page containing a large hero photo, and a right page with a two-column article, a pull quote, a sidebar with author bio, and a page number

Evaluator (Qwen 3 VL)

The left page features a large hero photo as required. The right page has a two-column article, a pull quote, and a sidebar with “Author” and bio text. Page number “3” is visible. However, all text is gibberish, unreadable placeholder text, violating the requirement for readable content. Layout structure is correct, but text illegibility fails the core criteria. VERDICT: FAIL

A mobile app UI mockup showing a dashboard with a top navigation bar, a search field, a 2x2 grid of feature cards with icons, a list of recent activity items, and a bottom tab bar

Evaluator (Qwen 3 VL)

The image shows a mobile app UI with a top navigation bar, search field, 2x2 grid of feature cards, recent activity list, and bottom tab bar — all present. However, all text is garbled and unreadable, violating the requirement for readable, correctly placed text. Visual hierarchy is mostly intact, but illegible text renders the UI non-functional per prompt. VERDICT: FAIL

A fashion magazine spread with a full-bleed left-page photo, right page with three-column layout, headline, body copy, a smaller inset image, and styled page numbers

Evaluator (Qwen 3 VL)

The left page has a full-bleed photo as required. The right page features a three-column layout with a headline, body copy, and a smaller inset image. Page numbers are styled correctly. However, all text is gibberish and unreadable, violating the requirement for readable text. Layout structure is correct, but content fails. VERDICT: FAIL

Style Diversity 83%

Style Diversity measures whether local/sefi-image-5b-turbo can shift between visual styles like oil painting, pixel art, watercolor, and technical illustration.

Easy 67%
A golden retriever sitting in a garden, painted in the style of an oil painting with visible brushstrokes and rich texture

Evaluator (Qwen 3.5 122B)

The image clearly depicts a golden retriever sitting in a garden. The style is unmistakably that of an oil painting, with visible, expressive brushstrokes throughout the composition. The texture is rich, particularly in the dog’s fur and the surrounding foliage, where dabs and sweeps of paint create depth and movement. The color palette and blending techniques further reinforce the impression of traditional oil painting. All key visual characteristics requested are present and well-executed. VERDICT: PASS

A golden retriever sitting in a garden, rendered in Japanese anime style with flat colors and large expressive eyes

Evaluator (Qwen 3.5 122B)

The image clearly exhibits Japanese anime style characteristics: the golden retriever has large, expressive eyes with prominent highlights, simplified facial features, and clean outlines. The colors are vibrant yet relatively flat with minimal gradient shading, consistent with anime aesthetics. The background flora is stylized rather than photorealistic, reinforcing the anime look. All key visual traits requested are present and recognizable. VERDICT: PASS

A golden retriever sitting in a garden, rendered in pixel art style with a 32x32 grid and limited color palette

Evaluator (Qwen 3.5 122B)

The image clearly depicts a golden retriever in a garden setting. The pixel art style is unmistakable, with visible blocky pixels forming the dog and background. However, the grid appears much finer than 32x32 — likely 64x64 or higher — and the color palette is not particularly limited, showing many shades of green, pink, blue, and yellow. While stylistically correct, it fails the specific technical constraints. VERDICT: FAIL

Medium 100%
A city street scene in the style of Art Deco — geometric shapes, gold and black palette, ornamental symmetry

Evaluator (Qwen 3.5 122B)

The image perfectly embodies the requested Art Deco style. It features strong geometric shapes, sharp lines, and ornamental symmetry throughout the composition. The color palette is strictly gold and black, creating the luxurious aesthetic typical of the movement. The city street scene is framed within these decorative elements, making the style clearly recognizable and the key visual characteristics fully present. VERDICT: PASS

A woman with a parasol in the style of Impressionism — soft brushstrokes, dappled light, pastel palette

Evaluator (Qwen 3.5 122B)

The image clearly exhibits Impressionist characteristics: soft, visible brushstrokes, dappled light effects, and a pastel color palette. The woman with the parasol is rendered in a style that closely mimics late 19th-century French Impressionism, particularly reminiscent of Renoir. All requested visual elements are present and well-executed. VERDICT: PASS

A portrait of a man in Pop Art style — bold outlines, Ben-Day dots, flat vivid primary colors

Evaluator (Qwen 3.5 122B)

The image clearly exhibits Pop Art characteristics: bold black outlines define the figure, flat vivid colors (especially the blue shirt and yellow skin tones) are used, and Ben-Day dots are visibly applied throughout the face and background to create shading and texture. The stylized, graphic quality matches the requested aesthetic precisely. All key visual elements of the specified style are present and recognizable. VERDICT: PASS

Hard 100%
A forest rendered in photorealistic watercolor — the detail and lighting of a photo combined with visible watercolor bleeds and paper texture

Evaluator (Qwen 3.5 122B)

The image clearly exhibits the requested photorealistic watercolor style. Key characteristics are present: visible paper texture throughout, characteristic watercolor bleeds and blooms (especially in the foreground foliage), and a blend of photographic lighting/detail with translucent, layered washes. The trees and light filtering through the canopy show both realism and the medium’s fluid nature. VERDICT: PASS

A woman in a futuristic city rendered in cyberpunk Art Nouveau — neon colors and tech elements combined with flowing organic lines and floral borders

Evaluator (Qwen 3.5 122B)

The image perfectly embodies the requested "cyberpunk Art Nouveau" style. The central figure and background feature neon-lit futuristic skyscrapers (cyberpunk), while the composition is framed by intricate, flowing vines and large flowers (Art Nouveau). The color palette utilizes vibrant neons (pinks, teals, purples) alongside organic, swirling lines. All key visual characteristics are clearly present and harmoniously integrated. VERDICT: PASS

A medieval castle depicted in Japanese ukiyo-e woodblock print style combined with photorealistic sky and lighting

Evaluator (Qwen 3.5 122B)

The image successfully combines Japanese ukiyo-e woodblock print aesthetics (stylized castle architecture, flat color planes, textured paper effect) with photorealistic sky and lighting. The clouds show realistic depth and gradient, while the castle retains traditional Japanese features like curved roofs and stone walls. The fusion is clear and intentional. VERDICT: PASS

Extreme 67%
An interior scene rendered in soft brutalism — raw concrete textures combined with rounded pastel furniture and warm diffused light

Evaluator (Qwen 3.5 122B)

The image perfectly embodies the requested "soft brutalism" style. The raw, unfinished concrete walls and floor provide the brutalist foundation, while the rounded, pastel pink and cream furniture introduces the soft, organic contrast. The warm, diffused sunlight streaming from the left creates the specified lighting atmosphere. All key visual characteristics are clearly present and well-executed. VERDICT: PASS

A still life painted in anxious minimalism — sparse composition with uncomfortable negative space, muted colors, and subtle visual tension

Evaluator (Qwen 3.5 122B)

The image displays a minimalist composition with two horizontal bands of muted, grayish-green tones. While it fits the "minimalism" and "muted colors" criteria, it lacks the specific "anxious" quality or "visual tension" requested. The composition is static and serene rather than uncomfortable or tense. The "still life" aspect is entirely absent, as there are no objects depicted. The core stylistic requirements are not met. VERDICT: FAIL

A cityscape rendered in nostalgic futurism — retro-futuristic 1960s space-age aesthetics with a melancholic warm-toned patina

Evaluator (Qwen 3.5 122B)

The image strongly exhibits the requested nostalgic futurism style. The sepia/warm-toned patina creates a melancholic, aged atmosphere. Architectural elements like the domed buildings and spires reflect 1960s retro-futuristic design. The presence of planets in the sky reinforces the space-age theme. The overall aesthetic aligns well with the prompt’s description of retro-futurism and warm-toned nostalgia. VERDICT: PASS
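The percentages shown for Graphical design are consistent with simple pass counts divided by totals and rounded to the nearest whole percent. The sketch below reproduces them under that assumption; the scoring rule is inferred from the listed verdicts, not taken from the ImageBench methodology, and the names `pass_rate` and `rollup` are hypothetical.

```python
def pass_rate(passed: int, total: int) -> int:
    """Pass percentage rounded to the nearest whole number."""
    return round(100 * passed / total)


def rollup(tiers: dict) -> tuple:
    """Sum (passed, total) pairs across difficulty tiers or subcategories."""
    passed = sum(p for p, _ in tiers.values())
    total = sum(t for _, t in tiers.values())
    return passed, total


# (passed, total) per difficulty tier, counted from the verdicts listed above.
data_visualisation = {"Easy": (0, 3)}
layout_and_design = {"Easy": (3, 3), "Medium": (0, 3), "Hard": (0, 3)}
style_diversity = {"Easy": (2, 3), "Medium": (3, 3), "Hard": (3, 3), "Extreme": (2, 3)}

subcategories = {
    "Data Visualisation": rollup(data_visualisation),
    "Layout & Design": rollup(layout_and_design),
    "Style Diversity": rollup(style_diversity),
}

for name, (passed, total) in subcategories.items():
    print(f"{name}: {passed}/{total} = {pass_rate(passed, total)}%")

category = rollup(subcategories)
print(f"Graphical design: {category[0]}/{category[1]} = {pass_rate(*category)}%")
```

Under this assumption the category score weights each prompt equally, so Style Diversity, with twelve prompts, contributes more to the 54% than Data Visualisation, with three.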