vs

192 evaluations across 6 categories

Pass Rate: 64% vs 95%

Avg Latency: 39.1s vs 45.3s

Text Rendering: 60% vs 100%

Typography Style: 100% vs 100%

Easy: 100% vs 100%

local/nucleus-image-17b-a2b: PASS

The word 'CHAPTER ONE' typed on aged paper with a vintage typewriter font, complete with slightly uneven ink

Evaluator (Qwen 3 VL)

The text “CHAPTER ONE” is spelled correctly. The font style is consistent with a vintage typewriter font — bold, blocky, and evenly spaced, which matches the aesthetic of mechanical typewriters. The paper appears aged with a slightly yellowed tone, and the ink is uniform and crisp, though the prompt requested “slightly uneven ink.” While the ink is clean and consistent, it does not convincingly show the slight imperfections or variations typical of real typewriter ink (e.g., smudges, uneven pres

local/nucleus-image-17b-a2b: PASS

The phrase 'TODAY'S SPECIAL' written in white chalk on a dark green chalkboard with chalk dust

Evaluator (Qwen 3 VL)

The text “TODAY’S SPECIAL” is rendered in a style that convincingly mimics white chalk on a dark green chalkboard. The lettering shows the characteristic uneven, textured strokes and slight smudges of chalk writing, with visible chalk dust residue around the letters. The background is a dark green chalkboard with a wooden frame, and chalk pieces are visible at the top left, reinforcing the medium. The spelling is correct: “TODAY’S SPECIAL” (with proper apostrophe in “TODAY’S”). VERDICT: PASS

local/nucleus-image-17b-a2b: PASS

The word 'OPEN' displayed on a red LED sign with glowing pixel-style letters

Evaluator (Qwen 3 VL)

The text “OPEN” is spelled correctly. The style of the text matches the described medium: it is rendered in glowing red pixel-style letters, consistent with an LED sign. The individual dots forming the letters are clearly visible, and the overall appearance convincingly emulates an LED display. VERDICT: PASS
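Each complete evaluator response above closes with `VERDICT: PASS` or `VERDICT: FAIL`, while some responses on this page are cut off before reaching a verdict. A minimal sketch of how that marker could be extracted from the free-text response, assuming the closing convention holds for every response; the function name `parse_verdict` is hypothetical and is not taken from the evaluation harness:

```python
import re
from typing import Optional

# Matches the closing "VERDICT: PASS" / "VERDICT: FAIL" marker seen in the evaluator text.
_VERDICT_RE = re.compile(r"VERDICT:\s*(PASS|FAIL)\b")

def parse_verdict(response: str) -> Optional[bool]:
    """Return True for PASS, False for FAIL, None when no verdict marker is present."""
    matches = _VERDICT_RE.findall(response)
    if not matches:
        return None  # e.g. a response truncated before its verdict
    # Use the last marker, since the verdict closes the response.
    return matches[-1] == "PASS"
```

Returning `None` instead of guessing keeps truncated responses distinguishable from genuine failures.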

Writing accuracy: 50% vs 100%

Easy: 100% vs 100%

local/nucleus-image-17b-a2b: PASS

A single large word 'HELLO' centered on a white background

Evaluator (Qwen 3 VL)

The word displayed is "HELLO", spelled correctly with all letters clearly visible: H, E, L, L, O. The text is fully readable, centered on a white background, and matches the intended specification exactly. No misspellings, missing words, or illegible parts are present. VERDICT: PASS

local/nucleus-image-17b-a2b: PASS

The word 'ADVENTURE' in big bold letters centered on a plain cream background

Evaluator (Qwen 3 VL)

The word "ADVENTURE" is spelled correctly and is fully readable. All letters are clearly visible and intact. There are no misspellings, missing words, or illegible parts. VERDICT: PASS

local/nucleus-image-17b-a2b: PASS

The word 'SERENITY' written large and centered on a solid light blue background

Evaluator (Qwen 3 VL)

The word displayed is "SERENITY", which is spelled correctly. All letters are clearly visible and fully readable against the solid light blue background. There are no misspellings, missing words, or illegible parts. VERDICT: PASS

Medium: 100% vs 100%

local/nucleus-image-17b-a2b: PASS

The sentence 'The quick brown fox jumps high' in clean black text on a white background

Evaluator (Qwen 3 VL)

The text reads: "The quick brown fox jumps high" All words are spelled correctly: - The - quick - brown - fox - jumps - high All words are fully readable and clearly rendered in clean black text on a white background, matching the intended description. VERDICT: PASS

local/nucleus-image-17b-a2b: PASS

'Every sunset brings a new dawn ahead' written neatly on a pale yellow background

Evaluator (Qwen 3 VL)

The text in the image reads: "Every sunset brings a new dawn ahead". - All words are spelled correctly. - All words are fully readable. - No words are missing. - No characters are illegible. The text matches the intended phrase exactly. VERDICT: PASS

local/nucleus-image-17b-a2b: PASS

'Stars shine brightest in the dark night' displayed on a clean off-white background

Evaluator (Qwen 3 VL)

The text in the image reads: "Stars shine brightest in the dark night". - All words are spelled correctly. - All words are fully readable. - The background is a dark night sky, not a clean off-white background as specified, but the text itself matches the intended phrase exactly. The only deviation from the intended description is the background color, but the question specifically asks whether the text is spelled correctly and readable — not whether the background matches. VERDICT: PASS

Hard: 0% vs 100%

local/nucleus-image-17b-a2b: FAIL

A poster with headline 'SUMMER SALE' in large bold text, body text 'Up to 50% off all items this weekend only' in medium size, and fine print 'Terms and conditions apply. See store for details.' at the bottom in small text

Evaluator (Qwen 3 VL)

The text in the image contains the following errors: - “all this items” should be “all these items” — “this” is incorrect; “these” is the correct plural pronoun to refer to “items”. - The fine print reads “See store for details.” — this is missing the word “the” before “store”, making it grammatically incorrect. It should be “See the store for details.” These are both spelling/grammar errors that make the text inaccurate compared to the intended text. VERDICT: FAIL

local/nucleus-image-17b-a2b: FAIL

A flyer with large title 'GRAND OPENING', subtitle 'Join us for food, music, and fun' in medium text, and small disclaimer 'RSVP required. Limited seating available.' at the bottom

Evaluator (Qwen 3 VL)

The text in the image contains multiple errors compared to the intended message: 1. “Grand open 4” — “open 4” is incorrect; should be “Grand Opening”. 2. “for fouk, food music, and fun,” — “fouk” is a misspelling of “food”. 3. “Join-m us, musking” — “Join-m us” is a misspelling of “Join us”; “musking” is a misspelling of “music”. 4. The intended subtitle “Join us for food, music, and fun” is partially obscured by the hand and pen, but the visible text is incorrect. The disclaimer “RSVP require

local/nucleus-image-17b-a2b: FAIL

An event banner with headline 'ANNUAL GALA 2025' at the top in large serif font, description 'An evening of fine dining and live jazz' in medium text, and footnote 'Black tie optional. Doors open at 6 PM.' in small text

Evaluator (Qwen 3 VL)

The text on the banner is mostly correct, but there is one misspelling: - "Doors opet 6 PM." — The word "opet" is misspelled; it should be "open". All other words are spelled correctly and fully readable. VERDICT: FAIL
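The failures in this tier come down to word-level differences between the intended and the rendered text. As an illustration only, the sketch below compares the intended footnote sentence from the prompt with the rendering quoted by the evaluator, using Python's standard `difflib`. The helper name `word_mismatches` is hypothetical, and this is not how the evaluator model itself reached its verdict:

```python
import difflib

def word_mismatches(intended: str, rendered: str):
    """List (op, intended_words, rendered_words) for every non-matching word span."""
    a, b = intended.split(), rendered.split()
    matcher = difflib.SequenceMatcher(a=a, b=b, autojunk=False)
    return [
        (op, a[i1:i2], b[j1:j2])
        for op, i1, i2, j1, j2 in matcher.get_opcodes()
        if op != "equal"
    ]

# Intended sentence from the prompt vs. the rendering quoted in the evaluator response.
diff = word_mismatches("Doors open at 6 PM.", "Doors opet 6 PM.")
```

Here `diff` holds a single `replace` span in which "open at" became "opet", so the quoted rendering differs from the prompt by a dropped word as well as the misspelling the evaluator flagged.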

Extreme: 0% vs 100%

local/nucleus-image-17b-a2b: FAIL

A vintage movie poster for 'THE LAST VOYAGE' featuring a tagline 'One ship. One chance. No turning back.' with credits block reading 'Directed by James Harlow. Starring Elena Voss, Marcus Chen, and Sofia Reyes. Music by Thomas Grant. A Paramount Pict

Evaluator (Qwen 3 VL)

The text on the poster contains multiple errors and is not fully readable: 1. Tagline: “ONE FANLOW. APANS MOURIRING BUCK...” — This is garbled and does not match the intended “One ship. One chance. No turning back.” The words are misspelled and nonsensical. 2. Credits: - “DIRECTED BYHAISE” — Should be “Directed by James Harlow” — misspelled and incomplete. - “STARRING CHEN - SINGEVS” — Should be “Starring Elena Voss, Marcus Chen, and Sofia Reyes” — “SINGEVS” is not “Sofia Reyes”; “Elena

local/nucleus-image-17b-a2b: FAIL

A magazine cover titled 'NATIONAL EXPLORER' with cover lines 'The Hidden Temples of Cambodia — page 34', 'Interview: Dr. Maria Santos on Climate Action', 'Top 10 Trails You Must Hike This Year', and a barcode area with 'Issue 247 | March 2025 | $6.99

Evaluator (Qwen 3 VL)

The intended text was: - Magazine title: "NATIONAL EXPLORER" - Cover lines: 1. "The Hidden Temples of Cambodia — page 34" 2. "Interview: Dr. Maria Santos on Climate Action" 3. "Top 10 Trails You Must Hike This Year" - Barcode area: "Issue 247 | March 2025 | $6.99" Actual text in the image: - Magazine title: "NADU" (incorrect spelling; should be "NATIONAL EXPLORER") - Cover lines: 1. "Netial Narial of Cambonnda - 344" → "The Hidden Temples of Cambodia — page 34" → misspelled: "Netial" (

local/nucleus-image-17b-a2b: FAIL

A product label for 'GOLDEN HARVEST ORGANIC HONEY' with subtitle 'Raw & Unfiltered', weight '12 oz (340g)', description 'Pure wildflower honey harvested from sustainable apiaries in the Pacific Northwest. No additives or preservatives.', and small te

Evaluator (Qwen 3 VL)

The text on the label contains multiple errors: 1. On the lid: “ORGANIC” is misspelled as “ORGANICI”. 2. On the main label: - “12 oz before” is incorrect — should be “12 oz (340g)”. - “626 willflower harvanted” — “willflower” is misspelled (should be “wildflower”), “harvanted” is misspelled (should be “harvested”). - “in the 0 Sustainable apiraiey fm bate” — “0” is not a word, “apiraiey” is misspelled (should be “apiaries”), “fm bate” is illegible and incorrect (should be “from the Pac
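The Text Rendering figures can be cross-checked against the individual verdicts. The sketch below tallies the PASS/FAIL results listed in this section for local/nucleus-image-17b-a2b, assuming the left-hand percentage in each header belongs to that model (which matches the verdicts shown). The dictionary layout and helper names are illustrative and are not taken from the benchmark's own code:

```python
# Pass/total counts per difficulty tier for local/nucleus-image-17b-a2b,
# tallied from the PASS/FAIL verdicts listed in this section.
text_rendering = {
    "Typography Style": {"Easy": (3, 3)},
    "Writing accuracy": {
        "Easy": (3, 3),
        "Medium": (3, 3),
        "Hard": (0, 3),
        "Extreme": (0, 3),
    },
}

def pooled(tiers):
    """Sum (passed, total) pairs across difficulty tiers."""
    passed = sum(p for p, _ in tiers.values())
    total = sum(t for _, t in tiers.values())
    return passed, total

def percent(passed, total):
    """Pass rate as a whole-number percentage."""
    return round(100 * passed / total)

sub_counts = {name: pooled(tiers) for name, tiers in text_rendering.items()}
sub_rates = {name: percent(*counts) for name, counts in sub_counts.items()}

# Category rate when every prompt is weighted equally.
cat_passed = sum(p for p, _ in sub_counts.values())
cat_total = sum(t for _, t in sub_counts.values())
category_rate = percent(cat_passed, cat_total)

# Alternative: unweighted mean of the subcategory rates.
mean_of_rates = sum(sub_rates.values()) / len(sub_rates)
```

Pooling gives 9 of 15 prompts, or 60%, which agrees with the Text Rendering header. Averaging the two subcategory rates would give 75%, so the displayed category figure appears to weight every prompt equally.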

Spatial Reasoning: 61% vs 97%

Attributes Binding: 78% vs 100%

Easy: 100% vs 100%

local/nucleus-image-17b-a2b: PASS

A red ball and a blue cube sitting side by side on a white surface

Evaluator (Qwen 3.5 122B)

The image contains a red sphere (ball) on the left and a blue cube on the right. Both objects are positioned side by side on a white surface. The colors and shapes are correctly assigned to their respective objects as requested. VERDICT: PASS

local/nucleus-image-17b-a2b: PASS

A green apple and an orange pumpkin on a wooden cutting board

Evaluator (Qwen 3.5 122B)

The image displays a green apple on the left and an orange pumpkin on the right, both resting on a wooden cutting board. The colors and identities of the objects match the prompt perfectly without any attribute swapping. * **Green Apple:** Correctly identified as green and apple-shaped. * **Orange Pumpkin:** Correctly identified as orange and pumpkin-shaped. * **Wooden Cutting Board:** Correctly identified as the surface. VERDICT: PASS

local/nucleus-image-17b-a2b: PASS

A yellow taxi and a black limousine parked next to each other on a street

Evaluator (Qwen 3.5 122B)

The image displays two vehicles on a street. 1. **Yellow Taxi:** In the foreground, there is a yellow sedan with a taxi roof sign and typical taxi markings. 2. **Black Limousine:** In the background, there is a long, black stretched vehicle, clearly a limousine. The attributes are correctly bound: the yellow vehicle is the taxi, and the black vehicle is the limousine. They are positioned next to each other on the street as requested. VERDICT: PASS

Medium: 100% vs 100%

local/nucleus-image-17b-a2b: PASS

A large red circle, a small blue triangle, and a medium green star arranged on a white canvas

Evaluator (Qwen 3.5 122B)

The image contains exactly what was requested: 1. **Large red circle:** Present at the top. 2. **Small blue triangle:** Present at the bottom left. 3. **Medium green star:** Present at the bottom right. 4. **White canvas:** The background is white. All attributes (color, size, shape) are correctly bound to the correct objects with no swaps. VERDICT: PASS

local/nucleus-image-17b-a2b: PASS

A tall thin white candle, a short fat red candle, and a medium spiral yellow candle on a shelf

Evaluator (Qwen 3.5 122B)

The image contains three candles on a shelf matching the description perfectly: 1. **Left:** A tall, thin, white candle. 2. **Middle:** A short, fat, red candle. 3. **Right:** A medium-height, spiral-shaped, yellow candle. All attributes (color, size, shape) are correctly bound to the respective objects with no swaps. VERDICT: PASS

local/nucleus-image-17b-a2b: PASS

A big brown horse, a small white rabbit, and a medium-sized black cat standing in a meadow

Evaluator (Qwen 3.5 122B)

The image contains a large brown horse, a small white rabbit, and a black cat in a grassy meadow. All attributes match the prompt perfectly. - **Horse:** Big, brown. - **Rabbit:** Small, white. - **Cat:** Medium-sized, black. - **Setting:** Meadow. VERDICT: PASS

Hard: 33% vs 100%

local/nucleus-image-17b-a2b: FAIL

A pink fire truck, a blue banana, and a green sun in a surreal landscape

Evaluator (Qwen 3.5 122B)

Based on my examination of the image and the provided prompt ("A pink fire truck, a blue banana, and a green sun in a surreal landscape"): **Object Analysis:** 1. **Fire Truck:** The truck is clearly **pink**. This matches the prompt. 2. **Banana:** There is a large, prominent banana in the foreground which is **green**. The prompt requested a **blue** banana. While there is a blue, banana-shaped object attached to the ladder on the truck, the primary "banana" object in the scene is green.

local/nucleus-image-17b-a2b: FAIL

A purple carrot, a red tree with red leaves, and a white watermelon sliced open

Evaluator (Qwen 3.5 122B)

The image contains a purple carrot and a red tree with red leaves, which match the prompt. However, the prompt requested a "white watermelon sliced open." The image shows a whole, uncut white watermelon with green stripes. The "sliced open" attribute is missing. VERDICT: FAIL

local/nucleus-image-17b-a2b: PASS

An orange elephant, a yellow penguin, and a silver strawberry arranged on a table

Evaluator (Qwen 3.5 122B)

The image displays three distinct objects on a table. 1. **Left:** An orange elephant. 2. **Middle:** A penguin with a bright yellow belly (matching "yellow penguin"). 3. **Right:** A strawberry that is silver/metallic in color. All objects match the requested colors and types without any swaps. VERDICT: PASS

Compositionality: 89% vs 100%

Easy: 100% vs 100%

local/nucleus-image-17b-a2b: PASS

A tabby cat wearing oversized red sunglasses

Evaluator (Qwen 3.5 122B)

- **Tabby cat:** Present (The cat has distinct tabby striping). - **Oversized:** Present (The glasses are large relative to the cat's face). - **Red sunglasses:** Present (The frames are bright red with dark lenses). All elements are clearly visible and combined coherently. VERDICT: PASS

local/nucleus-image-17b-a2b: PASS

A golden retriever wearing a tiny cowboy hat

Evaluator (Qwen 3.5 122B)

- **Golden Retriever:** Present. The dog clearly exhibits the physical characteristics of a Golden Retriever (golden coat, floppy ears, facial structure). - **Wearing:** Present. The hat is positioned on the dog's head. - **Tiny cowboy hat:** Present. The dog is wearing a brown cowboy hat that is proportionate to its head size. All described elements are present and combined coherently. VERDICT: PASS

local/nucleus-image-17b-a2b: PASS

A penguin holding a colorful umbrella

Evaluator (Qwen 3.5 122B)

The image clearly depicts a penguin standing in a snowy environment. The penguin is holding a large umbrella with distinct rainbow-colored panels (red, yellow, green, blue, etc.). All elements from the prompt are present and combined coherently. * **Penguin:** Present. * **Holding:** Present. * **Colorful umbrella:** Present. VERDICT: PASS

Medium: 100% vs 100%

local/nucleus-image-17b-a2b: PASS

A medieval knight in full armor sitting at a desk typing on a modern laptop

Evaluator (Qwen 3.5 122B)

- **Medieval knight:** Present. The figure is clearly dressed as a knight. - **Full armor:** Present. The figure is wearing a complete suit of plate armor including a helmet and gauntlets. - **Sitting at a desk:** Present. The knight is seated on a wooden chair at a wooden table. - **Typing on a modern laptop:** Present. A modern-looking laptop is on the desk, and the knight's armored hands are positioned on the keyboard as if typing. All described elements are clearly visible and integrated

local/nucleus-image-17b-a2b: PASS

A Roman gladiator taking a selfie with a smartphone in the Colosseum

Evaluator (Qwen 3.5 122B)

- **Roman gladiator:** Present (man in armor with helmet and plume). - **Taking a selfie:** Present (holding phone up, looking at it). - **Smartphone:** Present (black smartphone held in hand). - **Colosseum:** Present (distinctive arched architecture in background). All described elements are clearly visible and combined into a single, coherent scene. VERDICT: PASS

local/nucleus-image-17b-a2b: PASS

An astronaut in a spacesuit riding a bicycle through a park

Evaluator (Qwen 3.5 122B)

- **Astronaut:** Present (The figure is wearing a full spacesuit). - **Spacesuit:** Present (Detailed white suit with patches and helmet). - **Riding a bicycle:** Present (The figure is seated on and operating a black bicycle). - **Park:** Present (Background features green grass, trees, benches, and a paved path). All elements are clearly visible and combined coherently in a single scene. VERDICT: PASS

Hard: 67% vs 100%

local/nucleus-image-17b-a2b: PASS

An octopus wearing a top hat and monocle playing chess on the surface of the moon with Earth visible in the background

Evaluator (Qwen 3.5 122B)

The image contains all the requested elements: - **Octopus:** Present (large pink octopus). - **Top hat:** Present (black hat with red band). - **Monocle:** Present (gold-rimmed lens over one eye). - **Playing chess:** Present (chessboard with pieces). - **Surface of the moon:** Present (cratered grey ground). - **Earth visible in the background:** Present (blue planet in the sky). All elements are clearly depicted and coherently combined. VERDICT: PASS

local/nucleus-image-17b-a2b: PASS

A dinosaur in a business suit giving a PowerPoint presentation in a modern office while drinking coffee

Evaluator (Qwen 3.5 122B)

- **Dinosaur:** Present (Green T-Rex). - **Business suit:** Present (Dark jacket, white shirt, blue tie). - **PowerPoint presentation:** Present (Screen with charts/graphs in background). - **Modern office:** Present (Conference table, laptop, screen). - **Drinking coffee:** Present (Holding cup, liquid pouring into mouth). All elements are clearly visible and integrated into a coherent scene. VERDICT: PASS

local/nucleus-image-17b-a2b: FAIL

A robot dressed as a pirate sailing a paper boat on a lake of lava while juggling three kittens

Evaluator (Qwen 3.5 122B)

The image contains most elements but fails on the specific count and action regarding the kittens. * **Robot dressed as a pirate:** Present (wearing a pirate hat and chest emblem). * **Paper boat:** Present (white origami boat). * **Lake of lava:** Present (glowing red liquid). * **Juggling three kittens:** **FAIL**. There are six kittens visible (two held in hands, four floating/jumping in the air), not three. Additionally, the action resembles floating or jumping rather than a clear

Counting: 11% vs 89%

Easy: 33% vs 100%

local/nucleus-image-17b-a2b: FAIL

Exactly 3 red apples arranged on a wooden table

Evaluator (Qwen 3.5 122B)

I see 4 red apples arranged on the wooden table. The prompt requested exactly 3 red apples. Since the actual count is 4, the count does not match the request. VERDICT: FAIL

local/nucleus-image-17b-a2b: FAIL

Exactly 5 yellow rubber ducks sitting in a row on a bathroom shelf

Evaluator (Qwen 3.5 122B)

I count 6 yellow rubber ducks sitting in a row on the shelf. The prompt requested exactly 5. Since the actual count (6) does not match the requested count (5), the image fails the specific constraint. VERDICT: FAIL

local/nucleus-image-17b-a2b: PASS

Exactly 4 white candles standing on a dark surface

Evaluator (Qwen 3.5 122B)

I see exactly 4 white candles standing on a dark surface. The count matches the request perfectly. VERDICT: PASS

Medium: 0% vs 100%

local/nucleus-image-17b-a2b: FAIL

2 orange cats and 3 black dogs sitting together on a grassy lawn

Evaluator (Qwen 3.5 122B)

The image shows 2 orange cats, which matches the request. However, there are only 2 black dogs visible, whereas the prompt requested 3 black dogs. Since the count of dogs is incorrect, the image does not fully satisfy the prompt. VERDICT: FAIL

local/nucleus-image-17b-a2b: FAIL

4 red roses and 2 white daisies in a glass vase on a table

Evaluator (Qwen 3.5 122B)

I count 6 red roses and 3 white daisies in the image. The prompt requested 4 red roses and 2 white daisies. Since the actual counts (6 roses, 3 daisies) do not match the requested counts (4 roses, 2 daisies), the image does not meet the specific criteria. VERDICT: FAIL

local/nucleus-image-17b-a2b: FAIL

1 blue butterfly and 3 yellow butterflies resting on a garden fence

Evaluator (Qwen 3.5 122B)

The image contains 2 blue butterflies (one on the wood, one on the wire) and 3 yellow butterflies (two on the wood, one on the wire). The prompt requested exactly 1 blue butterfly. Since there is an extra blue butterfly present, the count does not match the request. VERDICT: FAIL

Hard: 0% vs 67%

local/nucleus-image-17b-a2b: FAIL

Exactly 9 goldfish swimming in a large round fishbowl

Evaluator (Qwen 3.5 122B)

I count 11 goldfish in the image.
1. Top left
2. Top middle (small)
3. Top right
4. Middle left
5. Middle center-left
6. Middle center
7. Middle right
8. Bottom left
9. Bottom center
10. Bottom right
11. Far right edge (small)
The prompt requested exactly 9 goldfish. The actual count is 11. VERDICT: FAIL

local/nucleus-image-17b-a2b: FAIL

A fruit bowl containing exactly 3 bananas, 4 oranges, and 2 apples

Evaluator (Qwen 3.5 122B)

The image contains 3 bananas, 3 oranges, and 2 apples. The prompt requested 4 oranges, but only 3 are visible in the bowl. Therefore, the count does not exactly match the request. VERDICT: FAIL

local/nucleus-image-17b-a2b: FAIL

Exactly 11 colorful balloons floating against a clear blue sky

Evaluator (Qwen 3.5 122B)

I count 12 balloons in the image:
1. Cyan (top left)
2. Orange (top center)
3. Green (top right)
4. Yellow (middle left)
5. Yellow (middle center)
6. Purple (middle)
7. Dark Blue (top right)
8. Blue (middle right)
9. Red (far right)
10. Pink (bottom left)
11. Dark purple/grey (center, partially hidden)
12. Green (bottom right)
The prompt requested exactly 11 balloons, but there are 12. VERDICT: FAIL
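Every counting verdict above rests on the same rule: each requested object count must match the observed count exactly. A minimal sketch of that comparison follows, using the counts from the fruit bowl prompt and the evaluator's response to it. The function name `count_mismatches` and the dictionary representation are assumptions made for illustration, since the evaluator states its counts in free text:

```python
def count_mismatches(requested, observed):
    """Return {object: (requested, observed)} for every object whose count differs."""
    names = sorted(set(requested) | set(observed))
    return {
        name: (requested.get(name, 0), observed.get(name, 0))
        for name in names
        if requested.get(name, 0) != observed.get(name, 0)
    }

# Counts from the fruit bowl prompt and from the evaluator's response to it.
requested = {"bananas": 3, "oranges": 4, "apples": 2}
observed = {"bananas": 3, "oranges": 3, "apples": 2}

mismatches = count_mismatches(requested, observed)
passed = not mismatches
```

With these inputs only the oranges differ (4 requested, 3 observed), so `passed` is `False`, in line with the FAIL verdict for that prompt.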

Negation: 44% vs 89%

Easy: 67% vs 100%

local/nucleus-image-17b-a2b: PASS

A tropical beach scene with no people

Evaluator (Qwen 3.5 122B)

The image depicts a serene tropical beach scene featuring palm trees, blue ocean water, and a sandy shore. I have carefully scanned the entire image, including the beach, the water, and the background foliage. There are absolutely no human figures, swimmers, or people visible anywhere in the scene. The exclusion criteria is fully met. VERDICT: PASS

local/nucleus-image-17b-a2b: FAIL

A cat lying on a couch without a collar

Evaluator (Qwen 3.5 122B)

The image shows a tabby cat lying on a grey couch. However, contrary to the exclusion constraint, the cat is clearly wearing a black collar with a gold buckle around its neck. Since the excluded element (a collar) is present, the image does not meet the specific negative constraint. VERDICT: FAIL

local/nucleus-image-17b-a2b: PASS

A classroom with desks and a chalkboard but no students

Evaluator (Qwen 3.5 122B)

The image depicts an empty classroom setting. I see rows of wooden desks and chairs facing a large green chalkboard mounted on the wall. The room is completely devoid of people; there are absolutely no students or teachers visible in the frame. The exclusion criteria "no students" is fully met. VERDICT: PASS

Medium: 33% vs 100%

local/nucleus-image-17b-a2b: FAIL

A sandy beach with palm trees, seashells, and a beach umbrella but no water or ocean visible

Evaluator (Qwen 3.5 122B)

The image displays a sandy beach with palm trees, seashells, and a yellow beach umbrella as requested. However, contrary to the exclusion criteria, there is clearly a blue ocean visible in the background, stretching across the horizon behind the umbrella and sand. The water is a prominent feature of the scene. VERDICT: FAIL

local/nucleus-image-17b-a2b: PASS

An office workspace with a desk, chair, bookshelf, and lamp but no computers or screens

Evaluator (Qwen 3.5 122B)

The image shows a clean office workspace with a desk, a black office chair, a white desk lamp, a potted plant, stacks of paper, an open notebook, a green mug, and a filing cabinet. There are no computers (laptops or desktops) or screens (monitors) visible in the scene. The black object on the papers appears to be a smartphone or a small case, but it does not constitute a computer or screen in the context of the exclusion. VERDICT: PASS

local/nucleus-image-17b-a2b: FAIL

A kitchen with pots, pans, a stove, and a cutting board but no food

Evaluator (Qwen 3.5 122B)

The image shows a kitchen scene with pots, pans, a stove, and a cutting board as requested. However, on the cutting board in the foreground, there is a small bowl containing a white substance that appears to be chopped food (likely onions, garlic, or cheese). Since the prompt explicitly excluded food, this element violates the constraint. VERDICT: FAIL

Hard: 33% vs 67%

local/nucleus-image-17b-a2b: FAIL

A pepperoni pizza with tomato sauce, basil, and crust but absolutely no cheese

Evaluator (Qwen 3.5 122B)

The image shows a pizza covered in a significant layer of melted, yellowish-white cheese with browned spots. The prompt explicitly requested "absolutely no cheese," but the image is dominated by it. VERDICT: FAIL

local/nucleus-image-17b-a2b: PASS

A car body with windows, doors, and headlights but no wheels

Evaluator (Qwen 3.5 122B)

The image displays a motion-blurred view of a vehicle, likely a car, characterized by horizontal streaks of blue and white light. The blur obscures fine details, but the general shape suggests a car body with horizontal lines indicating windows or door panels. The bottom section is dark and indistinct. Crucially, there are no visible wheels; the lower area is just a dark shadow or blur, consistent with the "no wheels" constraint. The excluded element (wheels) is absent. VERDICT: PASS

local/nucleus-image-17b-a2b: FAIL

A human face with eyes, nose, and mouth but no eyebrows

Evaluator (Qwen 3.5 122B)

The image shows a human face with eyes, nose, and mouth. However, the prompt explicitly requested "no eyebrows." The image clearly displays eyebrows—specifically, there are realistic eyebrows above the eyes, and additionally, sketch-like drawn eyebrows are visible just above the natural ones. Since the excluded element (eyebrows) is present, the image fails the criteria. VERDICT: FAIL

Relative Position: 83% vs 100%

Easy: 67% vs 100%

local/nucleus-image-17b-a2b: FAIL

A cat sitting on top of a cardboard box

Evaluator (Qwen 3.5 122B)

The image shows a cat sitting **inside** an open cardboard box, with its front paws resting on the rim. The prompt specifies the cat is "on top of" the box. While the cat is elevated, it is clearly positioned within the interior volume of the box rather than on the upper surface of a closed box or balanced on top. The spatial relationship "on top of" is inaccurate compared to "in" or "inside". VERDICT: FAIL

local/nucleus-image-17b-a2b: PASS

A red ball underneath a wooden chair

Evaluator (Qwen 3.5 122B)

The image displays a wooden chair with a red ball positioned directly beneath its seat, resting on the floor between the legs. This arrangement perfectly matches the prompt's description of the ball being "underneath" the chair. VERDICT: PASS

local/nucleus-image-17b-a2b: PASS

A coffee mug next to an open book on a desk

Evaluator (Qwen 3.5 122B)

The image shows a grey coffee mug positioned to the left and an open book positioned to the right. Both objects are resting on a flat, light-colored surface, consistent with a desk. The spatial arrangement matches the description "next to" and "on a desk" perfectly. VERDICT: PASS

Medium: 100% vs 100%

local/nucleus-image-17b-a2b: PASS

A bird perched on a branch, the branch extending from a tree next to a park bench

Evaluator (Qwen 3.5 122B)

The image shows a bird perched on a branch that extends from a tree trunk on the right side. In the lower left foreground, there is a park bench. The spatial arrangement matches the description: the bird is on the branch, the branch is part of the tree, and the tree is situated next to the bench in the park setting. VERDICT: PASS

local/nucleus-image-17b-a2b: PASS

A cat sitting on a pillow, the pillow placed on a wooden chair

Evaluator (Qwen 3.5 122B)

The image shows a tabby and white cat sitting upright on a white cushion. This cushion is placed directly on the seat of a light wooden chair. The spatial relationships described in the prompt are accurate: the cat is on the pillow, and the pillow is on the chair. VERDICT: PASS

local/nucleus-image-17b-a2b: PASS

A candle on top of a stack of books, the books next to a potted plant

Evaluator (Qwen 3.5 122B)

The image shows a lit candle sitting directly on top of a vertical stack of books. To the right of this stack, there is a green potted plant. The spatial relationships described in the prompt ("candle on top of a stack of books" and "books next to a potted plant") are accurately represented in the image. VERDICT: PASS

Hard: 67% vs 100%

local/nucleus-image-17b-a2b: FAIL

Five chess pieces on a board: a king in the center, a queen to its left, a bishop to its right, a rook above the king, and a knight below the king

Evaluator (Qwen 3.5 122B)

The image shows a central golden King, but the surrounding pieces do not match the prompt's description. To the left is a golden Knight, not a Queen. To the right is a brown Knight, not a Bishop. In front (below) the King is a brown Pawn, not a Knight. Behind (above) the King are other pieces, but not a Rook directly aligned. Additionally, there are more than five pieces visible (at least seven). The spatial relationships are incorrect. VERDICT: FAIL

local/nucleus-image-17b-a2b: PASS

A desk scene: laptop in the center, coffee mug to the left, phone to the right, lamp above, and notebook below

Evaluator (Qwen 3.5 122B)

The image perfectly matches the described spatial arrangement. The laptop is centrally placed on the desk. To its left sits a coffee mug, and to its right stands a phone. A lamp hangs directly above the setup, casting light downwards. In the foreground, positioned "below" the laptop from the viewer's perspective, lies an open notebook with a pen. All positional relationships are accurate. VERDICT: PASS

local/nucleus-image-17b-a2b: PASS

Five animals in a field: a horse on the far left, a cow to the right of the horse, a sheep in the center, a goat to the right of the sheep, and a dog on the far right

Evaluator (Qwen 3.5 122B)

The image displays five animals standing in a row in a grassy field. From left to right, the arrangement is: a brown horse on the far left, a black and white cow immediately to its right, a sheep in the center, a white goat to the right of the sheep, and a golden dog on the far right. This perfectly matches the specified spatial relationships. VERDICT: PASS

Extreme: 100% vs 100%

local/nucleus-image-17b-a2b: PASS

A landscape with a red barn in the foreground at large scale, a windmill in the midground at medium scale, and snow-capped mountains in the background at small scale, all with consistent depth perspective

Evaluator (Qwen 3.5 122B)

The image displays a red barn situated in the foreground on the right side, appearing large. To its left, a windmill stands in the midground area, appearing slightly smaller in visual mass but tall. Behind both structures, a range of snow-capped mountains spans the background, appearing smaller due to distance. The depth perspective is consistent, with the field in the immediate front, the structures in the middle, and the mountains in the far back. VERDICT: PASS

local/nucleus-image-17b-a2b: PASS

A beach scene with a sandcastle in the foreground, a surfer on the waves in the midground, and a cruise ship on the horizon in the background, sizes decreasing with distance

Evaluator (Qwen 3.5 122B)

The image displays a clear depth progression matching the prompt. A large sandcastle sits in the immediate foreground at the bottom. A surfer is positioned in the middle distance on the waves. A cruise ship is visible on the horizon line in the background. The relative sizes and positions (foreground to background) are accurate. VERDICT: PASS

local/nucleus-image-17b-a2bPASS

A city park with a park bench and pigeons in the foreground, a fountain in the midground, and skyscrapers in the background, each layer progressively smaller

Evaluator (Qwen 3.5 122B)

The image displays a clear depth progression. In the immediate foreground, a park bench sits on the left with numerous pigeons scattered on the pavement. In the midground, a large fountain is centrally located. In the background, tall skyscrapers rise above the tree line. The spatial arrangement of foreground (bench/pigeons), midground (fountain), and background (skyscrapers) is accurate. VERDICT: PASS

Scale & Proportions 56%vs100%

Easy 67%vs100%

local/nucleus-image-17b-a2bPASS

A person standing next to a yellow school bus, showing realistic relative scale

Evaluator (Qwen 3.5 122B)

The image shows a person standing next to a large yellow school bus. The person's height relative to the bus appears realistic; their head reaches roughly the lower section of the bus's side, well below the roofline, which is consistent with the actual dimensions of a school bus (approx. 10-11 feet tall) versus an adult human. The wheels and bumper sizes also align correctly with the person's scale. VERDICT: PASS

local/nucleus-image-17b-a2bFAIL

A ladybug sitting on a large green leaf, shown at realistic proportions

Evaluator (Qwen 3.5 122B)

The image shows a ladybug that appears disproportionately large relative to the leaf. In a realistic scenario, a ladybug (typically 4-8mm) would appear much smaller on a "large" leaf. Here, the ladybug occupies a significant portion of the visible leaf area, suggesting the scale is off or the leaf is actually quite small, contradicting the "large" descriptor and "realistic proportions" requirement. VERDICT: FAIL

local/nucleus-image-17b-a2bPASS

A child holding a basketball, both at correct relative sizes

Evaluator (Qwen 3.5 122B)

The image shows a young boy holding a basketball. The size of the basketball relative to the child's torso and hands appears realistic and consistent with real-world proportions for a child of that age holding a standard or youth-sized basketball. The ball is large but not disproportionately huge, and the grip looks natural. VERDICT: PASS

Medium 100%vs100%

local/nucleus-image-17b-a2bPASS

A tiny person standing on the rim of a coffee cup, looking down into it

Evaluator (Qwen 3.5 122B)

The image depicts a miniature figurine standing on the rim of a white coffee cup filled with dark liquid. The figure is significantly smaller than the cup, accurately reflecting the "tiny person" description in the prompt. The relative scale is consistent and realistic for a miniature photography concept. VERDICT: PASS

local/nucleus-image-17b-a2bPASS

A giant tabby cat walking between city skyscrapers like a kaiju

Evaluator (Qwen 3.5 122B)

The image successfully depicts a tabby cat that is enormous in scale relative to its surroundings. The cat's paws are larger than the cars visible in the background, and its body height rivals the lower stories of the surrounding skyscrapers. This accurately reflects the "giant" and "kaiju" size relationships requested in the prompt. VERDICT: PASS

local/nucleus-image-17b-a2bPASS

A miniature house sitting on the palm of someone's hand

Evaluator (Qwen 3.5 122B)

The image shows a small model house resting comfortably in the center of an open human palm. The scale is consistent with a "miniature house," as it fits entirely within the hand without looking disproportionately large or tiny. The proportions are logical for the described scene. VERDICT: PASS

Hard 0%vs100%

local/nucleus-image-17b-a2bFAIL

A group photo of an elephant, a horse, a dog, a cat, and a mouse, all standing in a line with correct real-world proportional sizes

Evaluator (Qwen 3.5 122B)

The image fails to meet the prompt's requirements in two main ways. First, the prompt requested a "dog," but the image contains three cats instead. Second, and more critically regarding proportions, the mouse is disproportionately large. In reality, a mouse is significantly smaller than a cat (roughly 1/4 to 1/5 the size), but in the image, the mouse is nearly as tall as the sitting cats. VERDICT: FAIL

local/nucleus-image-17b-a2bFAIL

A dinner table scene with a plate, fork, knife, wine glass, salt shaker, and peppercorn, all at realistic relative sizes

Evaluator (Qwen 3.5 122B)

The image is missing the salt shaker and peppercorn entirely, so the requirement for "all" objects to be present with realistic sizes cannot be met. Additionally, the fork in the foreground is disproportionately large compared to the plate and cutlery in the background, indicating inconsistent scaling and perspective. VERDICT: FAIL

local/nucleus-image-17b-a2bFAIL

A farm scene with a barn, tractor, farmer, cow, chicken, and a fence post, all at correct proportional scale

Evaluator (Qwen 3.5 122B)

The image contains several objects with incorrect relative sizes. Specifically, the chickens are disproportionately large. The chicken perched on the fence post is nearly as tall as the farmer standing in the background, which is physically impossible in reality. The other chickens are also oversized relative to the cow and the grass. Therefore, the proportional scale is not correct. VERDICT: FAIL
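
The 56% headline for Scale & Proportions is consistent with simply pooling the nine verdicts listed above (2 of 3 Easy, 3 of 3 Medium, 0 of 3 Hard). A minimal sketch of that arithmetic follows, assuming the dashboard rounds to the nearest whole percent. The `pass_rate` helper and the data layout are hypothetical illustrations, not the dashboard's actual code.

```python
from typing import Dict, List


def pass_rate(verdicts: List[str]) -> int:
    """Percentage of PASS verdicts, rounded to the nearest whole percent."""
    if not verdicts:
        raise ValueError("no verdicts to aggregate")
    passed = sum(1 for v in verdicts if v == "PASS")
    return round(100 * passed / len(verdicts))


# Verdicts transcribed from the Scale & Proportions entries above
# (hypothetical layout: difficulty tier -> list of verdicts).
scale_and_proportions: Dict[str, List[str]] = {
    "Easy": ["PASS", "FAIL", "PASS"],
    "Medium": ["PASS", "PASS", "PASS"],
    "Hard": ["FAIL", "FAIL", "FAIL"],
}

# Per-tier rates, then the pooled rate over all nine verdicts.
per_tier = {tier: pass_rate(v) for tier, v in scale_and_proportions.items()}
overall = pass_rate([v for tier in scale_and_proportions.values() for v in tier])

print(per_tier)
print(overall)
```

Pooling the verdicts and averaging the tier percentages give the same figure here only because every tier has three prompts; with uneven tiers the two would differ.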

Human realism 79%vs93%

Faces & Expressions 92%vs100%

Easy 100%vs100%

local/nucleus-image-17b-a2bPASS

A close-up portrait of a young woman with a warm natural smile

Evaluator (Qwen 3 VL)

The face in the image is largely symmetrical with correctly proportioned features: eyes, nose, and mouth are well-placed and balanced. The smile appears natural and warm, with visible teeth and slight crinkling around the eyes that conveys genuine happiness. There are no obvious distortions—no asymmetrical facial structures, unnatural eye shapes, or misaligned features. The lighting and skin texture look realistic, and the emotion is clearly readable as joyful and sincere. VERDICT: PASS

local/nucleus-image-17b-a2bPASS

A portrait of a middle-aged man with a calm, serious expression

Evaluator (Qwen 3 VL)

The face in the image is largely symmetrical with correctly rendered features: the eyes, nose, and mouth are balanced, and the hairline and ear placement are proportionate. The man appears middle-aged, with visible signs of aging such as wrinkles around the eyes and forehead, and gray streaks in his hair — consistent with the prompt. His expression is calm and serious, conveyed through slightly narrowed eyes, a neutral mouth, and a focused gaze — all of which are clearly readable. There are no s

local/nucleus-image-17b-a2bPASS

A headshot of a teenager with a relaxed, neutral face

Evaluator (Qwen 3 VL)

The face in the image is symmetric and features are proportionally correct — eyes, nose, mouth, and ears are well-placed and balanced. The teenager’s expression is neutral and relaxed, with no strong emotion conveyed, matching the prompt. There are no visible distortions; the facial structure is natural and well-rendered. The lighting and composition support a clear, clean headshot. VERDICT: PASS

Medium 100%vs100%

local/nucleus-image-17b-a2bPASS

A close-up of a person's face showing clear disgust, nose wrinkled and lip curled

Evaluator (Qwen 3 VL)

The face in the image is a close-up showing a strong expression of disgust. The nose is visibly wrinkled, and the lips are curled downward and outward, consistent with the prompt. The eyebrows are drawn together and furrowed, enhancing the expression. The facial features are largely symmetric and anatomically plausible, with no major distortions or unnatural artifacts. The emotion is clearly readable — the expression is unmistakably disgust. VERDICT: PASS

local/nucleus-image-17b-a2bPASS

A portrait capturing genuine surprise — raised eyebrows, wide eyes, slightly open mouth

Evaluator (Qwen 3 VL)

The face is largely symmetric with correctly rendered features: the eyes, eyebrows, nose, and mouth are well-proportioned and aligned. The intended emotion — genuine surprise — is clearly readable through the wide eyes, raised eyebrows, and slightly open mouth, all of which are accurately depicted. There are no significant distortions; the facial structure is natural and the expression is coherent and convincing. VERDICT: PASS

local/nucleus-image-17b-a2bPASS

A face showing quiet melancholy — downcast eyes, slightly drawn mouth, distant gaze

Evaluator (Qwen 3 VL)

The face in the image is largely symmetric with correctly rendered features — eyes, nose, and mouth are proportionally aligned and anatomically plausible. The lighting and shadows are consistent with a somber, low-key mood. The intended emotion — quiet melancholy — is clearly readable: the eyes are downcast, the mouth is slightly drawn and downturned, and the gaze is distant and introspective. There are no significant distortions; the facial structure is natural, and the expression is coherent

Hard 100%vs100%

local/nucleus-image-17b-a2bPASS

An elderly woman in her 80s with deep wrinkles and white hair, smiling warmly with kind eyes

Evaluator (Qwen 3 VL)

The face is largely symmetrical, with both eyes, eyebrows, and the curve of the smile appearing balanced. The features are anatomically plausible: the nose, ears, and mouth are correctly proportioned and positioned. The deep wrinkles around the eyes and mouth are natural and consistent with an elderly person’s facial aging. The white hair is realistic in texture and distribution. The emotion is clearly readable — the warm, genuine smile with crinkled eyes conveys kindness and joy. There are no s

local/nucleus-image-17b-a2bPASS

A 5-year-old child with round cheeks and large eyes, crying with tears streaming down their face

Evaluator (Qwen 3 VL)

The face is largely symmetric with correct anatomical features: the eyes, nose, and mouth are centered and proportionate, and the tear streams are symmetrically falling from each eye. The child’s expression clearly conveys crying — eyebrows are furrowed, eyes are squeezed shut, and the mouth is open in a wail. The “round cheeks” and “large eyes” are also well-represented. There are no significant distortions; the features are natural and emotionally expressive without unnatural warping. VERDICT

local/nucleus-image-17b-a2bPASS

An elderly man in his 70s with age spots and thinning gray hair, looking skeptically over reading glasses

Evaluator (Qwen 3 VL)

The face is largely symmetric with correct anatomical features — the eyes, nose, and mouth are well-proportioned and aligned. The age spots are abundant and concentrated on the forehead and cheeks, consistent with the prompt. Thinning gray hair is visible, especially on the sides and crown. The expression is clearly skeptical or disapproving — the brow is furrowed, lips are downturned, and the gaze is intense — all of which convey the intended emotion. There are no major distortions; the facial

Extreme 67%vs100%

local/nucleus-image-17b-a2bFAIL

A group of three friends at a table: one laughing with head tilted back, one looking shocked with wide eyes, and one rolling their eyes with a smirk

Evaluator (Qwen 3 VL)

The image shows five people, not three, and the facial expressions do not fully match the prompt’s requested emotions. - The man on the far left is laughing with his head tilted back — this matches the first requested emotion. - The woman in the center has wide eyes and an open mouth — this matches the “shocked” expression. - The man on the far right is covering his eyes with his hand and has a pouty, annoyed expression — this does not clearly convey “rolling eyes with a smirk”; it looks more l

local/nucleus-image-17b-a2bPASS

A family photo with four people: a father beaming with pride, a mother smiling gently, a teenager looking bored, and a child grinning excitedly

Evaluator (Qwen 3 VL)

All four faces in the image are symmetric with correct anatomical features — eyes, nose, mouth, and ears are properly aligned and proportioned without noticeable distortions. Emotions are clearly readable and match the prompt: - The father (left) is beaming with pride — wide smile, crinkled eyes. - The mother (right) is smiling gently — soft, closed-mouth smile. - The teenager (back center) looks bored — neutral to slightly downturned mouth, lack of engagement. - The child (front center) is gri

local/nucleus-image-17b-a2bPASS

Three coworkers in a meeting: one looking frustrated with furrowed brows, one nodding with a pleased expression, and one looking confused with a tilted head

Evaluator (Qwen 3 VL)

All three faces are symmetric with correct, clearly defined features appropriate for cartoon-style illustration. The intended emotions are clearly readable: - Left face: Frustrated with furrowed brows, downturned mouth, and tense expression — correctly conveyed. - Center face: Nodding with a pleased, smirking expression, hand near chin — clearly shows satisfaction. - Right face: Confused with tilted head, wide eyes, and hand on cheek — accurately depicts bewilderment. No distortions are presen

Full Body 92%vs100%

Easy 100%vs100%

local/nucleus-image-17b-a2bPASS

A woman standing upright in a relaxed pose, arms at her sides, wearing jeans and a white t-shirt

Evaluator (Qwen 3 VL)

The image shows a full-body view of a woman standing upright with her arms at her sides, wearing a white t-shirt and blue jeans — exactly matching the prompt. - Limb count: Correct — two arms, two legs, visible and properly attached. - Proportions: Anatomically accurate — head, torso, limbs, and joints appear proportionate and natural. - Pose: The woman is standing upright in a relaxed, neutral pose with arms at her sides — fully recognizable and consistent with the prompt. - Distortions: None

local/nucleus-image-17b-a2bPASS

A man standing straight facing the camera, hands in pockets, normal proportions

Evaluator (Qwen 3 VL)

The image shows a full-body shot of a man standing straight, facing the camera, with both hands in his pockets. His proportions appear normal: head, torso, and limbs are in proportion to each other without obvious exaggeration or distortion. He has two arms, two legs, and the correct number of fingers — no limb count anomalies are visible. The pose is clearly recognizable as “standing straight with hands in pockets,” and the posture is natural and anatomically plausible. There are no significan

local/nucleus-image-17b-a2bPASS

A person standing in a neutral pose in a park, full body visible from head to feet

Evaluator (Qwen 3 VL)

The image shows a full-body view of a person standing in a park, viewed from behind. The person is in a neutral, upright stance with feet shoulder-width apart, hands resting at the sides, and head straight. The pose is recognizable as a standard standing posture, commonly used in photography or AI-generated imagery to convey calmness or observation. Anatomical inspection: - Limb count: Two arms, two legs — correct. - Proportions: The torso, arms, and legs appear proportionate relative to each o

Medium 100%vs100%

local/nucleus-image-17b-a2bPASS

A woman mid-stride running through a park, hair and clothes showing motion

Evaluator (Qwen 3 VL)

The image shows a woman in mid-stride running through a park, with her hair and clothes exhibiting motion blur consistent with movement. Her limbs are correctly proportioned — two arms, two legs, with natural joint articulation. The pose is anatomically plausible for running: one leg forward, the other back, arms bent and swinging opposite to legs, torso slightly forward. The motion is captured with a shallow depth of field and background blur, enhancing the sense of speed. There are no obvious

local/nucleus-image-17b-a2bPASS

A man captured mid-jump over a puddle, body in the air with bent knees

Evaluator (Qwen 3 VL)

The image shows a man mid-air, seemingly jumping over a puddle. His body is in a dynamic pose with bent knees and arms extended for balance — consistent with the described action. Limb count is correct: two arms, two legs, and the torso are all present. Proportions appear anatomically plausible — the torso-to-leg ratio, arm length, and joint angles look natural for a human in mid-air. The pose is recognizable as a jump, and the splash in the puddle confirms the action is occurring. There are no

local/nucleus-image-17b-a2bPASS

A dancer in a dynamic spin, dress flowing outward with one leg extended

Evaluator (Qwen 3 VL)

The dancer in the image is captured mid-spin, with one leg extended outward and the other supporting her weight en pointe — a pose that is both dynamic and recognizable as a classical ballet turn. The dress flows outward in a way that suggests motion, consistent with the spinning action. Limb count is correct: two arms, two legs, and the torso are all intact and properly formed. Proportions appear anatomically accurate — the dancer’s body is slender but proportionate, with no obvious distortions

Hard 100%vs100%

local/nucleus-image-17b-a2bPASS

A person in a yoga tree pose — standing on one leg with the other foot pressed to the inner thigh, arms raised overhead

Evaluator (Qwen 3 VL)

The image shows a person performing the yoga tree pose (Vrksasana). The pose is recognizable: one leg is standing firmly on the ground, the other foot is pressed against the inner thigh of the standing leg, and the arms are raised overhead with palms together in prayer position. Limb count is correct — two arms, two legs, two feet, and two hands are clearly visible and properly formed. Proportions appear anatomically accurate — the torso, limbs, and head are proportionally rendered without obvi

local/nucleus-image-17b-a2bPASS

A rock climber clinging to a cliff face with arms and legs extended to different holds

Evaluator (Qwen 3 VL)

The climber in the image has two arms and two legs — limb count is correct. Proportions appear anatomically plausible: the torso, limbs, and head are proportionally scaled relative to each other, with no obvious skeletal or muscular distortions. The pose is recognizable as a dynamic climbing stance — one arm reaching upward to a hold, the other gripping a rope, legs positioned for balance and leverage on footholds. The body is angled appropriately for climbing, and the use of climbing shoes and

local/nucleus-image-17b-a2bPASS

A cellist seated with correct posture, legs around the instrument, bow arm extended

Evaluator (Qwen 3 VL)

The image shows a cellist seated with the cello positioned between their legs, supported by a stand. The bow is extended horizontally, held in the right hand, and appears to be in motion across the strings. The left hand is placed on the fingerboard, consistent with playing. The posture is upright and appears correct for a seated cellist. Limb count is normal: two arms, two legs, and the torso are intact and proportionally rendered. The proportions of the body and instrument are realistic, with

Extreme 67%vs100%

local/nucleus-image-17b-a2bPASS

Two tango dancers in a close embrace, the man dipping the woman backward with intertwined legs

Evaluator (Qwen 3 VL)

The image depicts two tango dancers in a close embrace. The woman is arched backward, supported by the man, with her legs intertwined with his — one leg wrapped around his waist, the other extended back. This is a classic tango pose known as a “dip” or “back dip,” and the action is recognizable and stylistically appropriate for tango. Anatomical inspection: - Limb count: Both dancers have all limbs present — two arms, two legs each — no missing or extra limbs. - Proportions: The proportions app

local/nucleus-image-17b-a2bPASS

Two soccer players mid-tackle, one sliding and the other jumping over with the ball at their feet

Evaluator (Qwen 3 VL)

The image depicts two soccer players in a dynamic, mid-action scene. One player is sliding on the ground, while the other is airborne, seemingly jumping over the sliding player. The ball is visible near the sliding player’s foot, and there is dirt kicked up, suggesting motion and contact. Anatomical analysis: - Limb count: Both players have all limbs intact — two arms and two legs per player. - Proportions: The players’ proportions appear realistic — heads, torsos, limbs are in proportion to ea

local/nucleus-image-17b-a2bFAIL

Two gymnasts performing a synchronized handstand, side by side with identical body alignment

Evaluator (Qwen 3 VL)

The image shows two gymnasts performing handstands side by side. However, their body alignment is not identical — the gymnast on the right has her legs slightly bent at the knees and her feet not fully extended upward, while the gymnast on the left has straight legs and a more classic handstand posture. Additionally, their torsos are not perfectly aligned; the right gymnast’s head is turned slightly, and her body is angled differently. Anatomically, both figures have correct limb count (two arm

Hands 42%vs75%

Easy 0%vs0%

local/nucleus-image-17b-a2bFAIL

A person standing with both hands visible at their sides, fingers relaxed and naturally spread

Evaluator (Qwen 3.5 122B)

Upon close inspection:
- Left hand (viewer’s right): 5 fingers visible, but the thumb appears fused or malformed with the index finger, which is not anatomically natural.
- Right hand (viewer’s left): Only 4 distinct fingers clearly visible; the pinky is either missing or merged.
Gestures are not naturally relaxed: fingers appear stiff and unnaturally positioned. Anomalies include missing/merged digits and unnatural spacing. VERDICT: FAIL

local/nucleus-image-17b-a2bFAIL

A woman holding her hands out in front of her palms-up, all ten fingers clearly visible

Evaluator (Qwen 3.5 122B)

The image shows two hands: one palm-up with five fingers clearly visible, and another hand resting on top, also showing five fingers. However, the top hand’s fingers appear to merge or overlap unnaturally with the bottom hand — particularly the thumb and index finger of the top hand seem fused or misaligned with the palm below. The gesture is not anatomically correct; it looks like a single hand with extra fingers or a glitched overlay. Core criteria (anatomically correct, 5 fingers per hand) ar

local/nucleus-image-17b-a2bFAIL

A man with arms crossed, both hands resting on opposite upper arms with fingers visible

Evaluator (Qwen 3.5 122B)

The image shows a man with arms crossed. On the left hand (viewer’s right), all 5 fingers are visible and appear anatomically correct, resting on the right upper arm. On the right hand (viewer’s left), only 4 fingers are clearly visible — the thumb is obscured or missing from view. The grip gesture is otherwise natural. Since one hand does not show exactly 5 visible fingers as required by the prompt, the core criterion is not fully met. VERDICT: FAIL

Medium 67%vs100%

local/nucleus-image-17b-a2bFAIL

A person holding a white ceramic coffee mug by the handle with their right hand

Evaluator (Qwen 3.5 122B)

The hand gripping the mug appears to be a right hand, as described. However, upon close inspection, the fingers are not clearly distinguishable — there are no visible individual fingers or thumb separation; instead, the grip looks like a fused, amorphous mass of skin with no anatomical detail. This suggests an AI-generated artifact where finger structure was not properly rendered. No clear count of 5 fingers is possible, and the gesture lacks realism. VERDICT: FAIL

local/nucleus-image-17b-a2bPASS

A woman writing in a notebook with a pen gripped correctly in her right hand

Evaluator (Qwen 3.5 122B)

The right hand gripping the pen shows 5 fingers, and the grip appears anatomically correct — thumb, index, and middle fingers holding the pen naturally. The left hand is not visible in the image, so we cannot verify its finger count or posture. Since the prompt specifies “each visible hand” and only one is visible (and it’s correct), the core criterion is met. VERDICT: PASS

local/nucleus-image-17b-a2bPASS

A person playing an acoustic guitar with left hand on the fretboard and right hand strumming

Evaluator (Qwen 3.5 122B)

The left hand (on the fretboard) shows 4 fingers clearly visible, with the thumb likely behind the neck — anatomically plausible. The right hand (strumming area) also shows 4 fingers; the thumb is not clearly visible but may be obscured or positioned normally. No obvious extra or missing fingers. Grip and posture appear natural for guitar playing. Minor occlusion doesn’t indicate anomaly. VERDICT: PASS

Hard 33%vs100%

local/nucleus-image-17b-a2bFAIL

A person making a peace sign with their right hand, clearly showing two fingers up and three curled

Evaluator (Qwen 3.5 122B)

The image shows a single right hand making a gesture. Three fingers (index, middle, ring) are extended upward; the thumb and pinky are curled. This contradicts the prompt’s requirement of “two fingers up and three curled.” Anatomically, the hand is correctly formed with 5 fingers total, but the gesture does not match the requested peace sign (which typically uses index and middle fingers only). The extra extended finger violates the prompt’s core instruction. VERDICT: FAIL

local/nucleus-image-17b-a2bPASS

A hand giving a thumbs-up with four fingers curled and thumb extended upward

Evaluator (Qwen 3.5 122B)

The image shows one hand giving a thumbs-up gesture. The thumb is extended upward, and the other four fingers are curled into a fist — consistent with the prompt. Anatomically, this is correct for a standard thumbs-up. There are no visible anomalies: the hand has 5 fingers total (thumb + 4 curled), and the proportions and positioning appear natural for a cartoon-style illustration. No extra or missing digits. VERDICT: PASS

local/nucleus-image-17b-a2bFAIL

A person counting to three on their fingers, with index, middle, and ring fingers extended

Evaluator (Qwen 3.5 122B)

The image shows a single hand with three fingers extended: index, middle, and ring — matching the prompt’s gesture. However, counting visible fingers: thumb is curled (partially visible), index, middle, ring are extended, and pinky is not visible — so only 4 fingers are clearly present or implied. The numbers “1”, “2”, “1” are drawn on knuckles, which is an anomaly not requested. Anatomically, the hand structure appears plausible, but the numbering and incomplete finger visibility violate the co

Extreme 67%vs100%

local/nucleus-image-17b-a2bPASS

Two people shaking hands in a professional setting, all fingers of both hands anatomically correct

Evaluator (Qwen 3.5 122B)

Upon close examination of the handshake: - Left hand (from viewer’s perspective): 5 fingers are visible — thumb, index, middle, ring, and pinky. All appear anatomically proportioned and correctly positioned in the grip. - Right hand: Also shows 5 fingers — thumb wrapped around, four fingers interlaced or resting naturally. No extra or missing digits; joint alignment and curvature look natural for a firm handshake. No anomalies detected. Grip is realistic and anatomically plausible. VERDICT: P

local/nucleus-image-17b-a2bPASS

Two people doing a high-five, palms meeting with all fingers extended on both hands

Evaluator (Qwen 3.5 122B)

Upon close inspection: The left hand (in black sleeve) shows 5 fingers — thumb, index, middle, ring, pinky — all extended and visible. The right hand (gray sleeve) also displays 5 fingers, though the thumb is partially obscured by the other hand’s palm; however, its position and curvature suggest it’s correctly placed. No extra or missing digits. The gesture is anatomically plausible for a high-five, with palms meeting squarely. Minor occlusion doesn’t invalidate correctness. VERDICT: PASS

local/nucleus-image-17b-a2bFAIL

One person handing a small gift box to another person, both sets of hands gripping the box correctly

Evaluator (Qwen 3.5 122B)

Upon close examination: The left person’s top hand shows 4 fingers clearly; the bottom hand shows 5. The right person’s top hand shows 4 fingers; bottom hand shows 5. One finger on each top hand appears merged or missing — likely an AI artifact. Grips are plausible but not perfectly anatomical due to finger anomalies. Core criterion (“exactly 5 per hand”) is violated. VERDICT: FAIL
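
Every complete evaluator response above ends with a `VERDICT: PASS` or `VERDICT: FAIL` marker, while several responses are cut off before or partway through it. One way such a marker could be read from a response is sketched below, treating a truncated or missing marker as unknown rather than guessing. The `extract_verdict` function is a hypothetical illustration and is not taken from the evaluation harness.

```python
import re
from typing import Optional

# Matches only a complete marker; a cut-off "VERDICT: P" will not match.
_VERDICT = re.compile(r"VERDICT:\s*(PASS|FAIL)\b")


def extract_verdict(response: str) -> Optional[str]:
    """Return "PASS" or "FAIL" from the last complete marker, else None."""
    matches = _VERDICT.findall(response)
    return matches[-1] if matches else None


print(extract_verdict("Core criterion is violated. VERDICT: FAIL"))
print(extract_verdict("Grip is realistic and anatomically plausible. VERDICT: P"))
```

Taking the last match rather than the first guards against a response that mentions the marker while reasoning before stating its final verdict.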

Multi-Subject 100%vs100%

Easy 100%vs100%

local/nucleus-image-17b-a2bPASS

Two women standing side by side: one with short blonde hair wearing a red jacket, the other with long black hair wearing a blue dress

Evaluator (Qwen 3 VL)

Person 1:
- Hair: Short, blonde, styled in a bob cut.
- Clothing: Wearing a bright red zip-up jacket.
- Position: Standing on the left side of the image.
Person 2:
- Hair: Long, straight, black hair.
- Clothing: Wearing a royal blue V-neck dress with a flared skirt.
- Position: Standing on the right side of the image, side by side with Person 1.
Both individuals are visually distinct and match the described features exactly. VERDICT: PASS

local/nucleus-image-17b-a2bPASS

Two men at a cafe: one with a beard and glasses wearing a green sweater, the other clean-shaven in a black hoodie

Evaluator (Qwen 3 VL)

Person 1 (left): - Has a full, thick beard (matches "beard"). - Wears black-rimmed glasses (matches "glasses"). - Wearing a green sweater (matches "green sweater"). - Appears to be a man (matches "one man"). Person 2 (right): - Clean-shaven (no facial hair). - Wearing a black hoodie (matches "black hoodie"). - Appears to be a man (matches "other man"). - Note: The man is bald, which is a more specific trait than "clean-shaven," but "clean-shaven" is still accurately represented as he has no fac

local/nucleus-image-17b-a2bPASS

Two friends posing: one tall with curly red hair in denim overalls, the other short with straight brown hair in a yellow sundress

Evaluator (Qwen 3 VL)

Person 1 (tall): - Hair: Curly, vibrant red, voluminous, and framing the face. - Clothing: Denim overalls over a white tank top. - Height: Appears tall, standing significantly above the other person. Person 2 (short): - Hair: Straight, brown, cut in a bob with bangs. - Clothing: Yellow sundress with thin straps and a gathered waist. - Height: Clearly shorter than the first person, visually confirming the “short” descriptor. Both individuals are present, visually distinct, and match the descr

Medium 100%vs100%

local/nucleus-image-17b-a2bPASS

Four people at a dinner table: a bald man in a suit, a woman with silver hair in a red blouse, a young man with dreadlocks in a denim jacket, and a woman with a hijab in a green dress

Evaluator (Qwen 3 VL)

- Bald man in a suit: Present. He is on the far left, bald, wearing a dark suit with a white shirt and tie. Visually distinct. - Woman with silver hair in a red blouse: Present. She is second from the left, has short silver hair, and wears a red blouse. Visually distinct. - Young man with dreadlocks in a denim jacket: Present. He is third from the left, has dark dreadlocks, and wears a denim jacket over a white shirt. Visually distinct. - Woman with a hijab in a green dress: Present. She is on t

local/nucleus-image-17b-a2bPASS

Three musicians on stage: a tall woman with short pink hair playing bass, a man with a beard and bandana on drums, and a slim man with glasses playing keyboard

Evaluator (Qwen 3 VL)

- Tall woman with short pink hair playing bass: Present. She has bright pink, short, straight hair with bangs, is standing, and is holding a double bass (not an electric bass, but still a bass instrument). She is visually distinct and matches the description of playing bass. - Man with a beard and bandana on drums: Present. He has a full beard and is wearing a blue patterned bandana on his head. He is seated behind a drum kit and is visually distinct. He matches the description. - Slim man wit

local/nucleus-image-17b-a2bPASS

Three hikers on a trail: a man with a backpack and sunhat, a woman with braided hair and hiking poles, and a teenager with a baseball cap and water bottle

Evaluator (Qwen 3 VL)

Person 1 (left): - Observed traits: Man, wearing a beige sunhat, carrying a large dark backpack with a water bottle attached, hiking boots, and khaki shorts. - Matches description: Yes — man with backpack and sunhat. Person 2 (middle): - Observed traits: Woman with long dark hair in two braids, wearing a light gray t-shirt and shorts, carrying a gray backpack, and using hiking poles. - Matches description: Yes — woman with braided hair and hiking poles. Person 3 (right): - Observed traits

Truthfulness 56%vs93%

Photorealism 67%vs100%

Medium 67%vs100%

local/nucleus-image-17b-a2bPASS

A still life on a wooden table featuring a clear glass of water, a polished silver spoon, a folded linen napkin, and a ripe red apple — each material rendered with accurate texture and light behavior

Evaluator (Qwen 3.6 27B)

The image accurately renders the requested materials. The glass shows proper refraction and transparency; the spoon exhibits sharp, realistic metallic reflections; the napkin displays a convincing woven fabric texture; and the apples have a smooth, glossy skin with appropriate specular highlights. The wooden table surface also shows realistic grain and lighting. VERDICT: PASS

local/nucleus-image-17b-a2bFAIL

A workspace desk with a brushed aluminum laptop, a leather-bound notebook, a ceramic coffee mug, and a glass paperweight — all materials visually distinct and realistic

Evaluator (Qwen 3.6 27B)

The image fails to render the requested materials correctly. The laptop appears to be matte plastic rather than brushed aluminum, lacking metallic reflections. The notebook is a standard paper spiral-bound type, not leather-bound. The central object is a glass tea infuser, not a ceramic mug or glass paperweight as requested. The textures are generic and do not exhibit the distinct visual properties (like translucency or specific grain) required for the prompt. VERDICT: FAIL

local/nucleus-image-17b-a2bPASS

A kitchen counter with a stainless steel pot, a wooden cutting board, a glass bottle of olive oil, and a cotton dish towel — each surface reflecting light correctly

Evaluator (Qwen 3.6 27B)

The image accurately renders the requested materials. The stainless steel pot shows sharp, metallic reflections. The wooden cutting boards display natural grain and matte texture. The glass bottle exhibits correct translucency and refraction of the olive oil inside. The cotton towel has a soft, fibrous texture with appropriate light absorption. All surfaces interact with light realistically. VERDICT: PASS
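
Each complete evaluator response above closes with a `VERDICT: PASS` or `VERDICT: FAIL` marker. The following is a minimal sketch of how such a marker could be pulled out of the free-text response, assuming it always takes this exact form. The name `extract_verdict` is hypothetical and is not taken from the benchmark's own tooling.

```python
import re
from typing import Optional

# Matches the closing marker seen in the evaluator transcripts.
VERDICT_RE = re.compile(r"VERDICT:\s*(PASS|FAIL)\b")

def extract_verdict(response: str) -> Optional[str]:
    """Return 'PASS' or 'FAIL', or None if the response has no verdict marker.

    The last match wins, so a response that quotes the marker earlier in its
    reasoning is still judged by its final line.
    """
    matches = VERDICT_RE.findall(response)
    return matches[-1] if matches else None

if __name__ == "__main__":
    sample = "All surfaces interact with light realistically. VERDICT: PASS"
    print(extract_verdict(sample))
```

A response cut off before its verdict (as several of the transcripts here are) would return `None` under this sketch, so the PASS/FAIL label shown next to the model name remains the authoritative result.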

Physics & Reflections 75%vs92%

Easy 67%vs100%

local/nucleus-image-17b-a2bPASS

A red sphere on a white surface with a single shadow cast to the right, consistent with a light source from the upper left

Evaluator (Qwen 3.6 27B)

The shadow is cast to the right and slightly downward, consistent with a light source from the upper left. The shadow’s shape is an oval, appropriately distorted by perspective and surface contact. Its soft edges suggest diffuse lighting, which matches the gentle gradient on the sphere. No reflections are expected on a matte white surface, and none appear — correct. Color of the shadow is neutral gray, not tinted, which is plausible for a white surface under neutral light. VERDICT: PASS

local/nucleus-image-17b-a2bPASS

A coffee mug on a sunny windowsill casting a sharp shadow to the left

Evaluator (Qwen 3.6 27B)

The light source is coming from the window on the right, which correctly casts the shadow to the left. The shadow’s shape accurately matches the mug and handle, with appropriate sharpness for direct sunlight. The lighting and reflections on the mug’s surface are consistent with a glossy ceramic material under bright natural light. No distortions or implausible elements are present. VERDICT: PASS

local/nucleus-image-17b-a2bFAIL

A person standing outdoors at midday with a short shadow directly below them

Evaluator (Qwen 3.6 27B)

The shadow is elongated and stretches far down the frame, which contradicts the “midday” condition (where shadows should be short). The shadow’s shape also appears distorted and inconsistent with the person’s posture. VERDICT: FAIL
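
The evaluator's midday argument rests on simple geometry: on flat ground, an upright object of height h casts a shadow of length h / tan(e), where e is the sun's elevation angle. The sketch below illustrates this under those assumptions. The helper `shadow_length` and the sample elevations are illustrative, not part of the benchmark.

```python
import math

def shadow_length(height_m: float, sun_elevation_deg: float) -> float:
    """Length of the shadow cast on flat ground by an upright object.

    Assumes a point-like sun and level ground; elevation must lie in (0, 90].
    """
    if not 0 < sun_elevation_deg <= 90:
        raise ValueError("sun elevation must be in (0, 90] degrees")
    return height_m / math.tan(math.radians(sun_elevation_deg))

if __name__ == "__main__":
    person = 1.7  # metres, an illustrative height
    for elevation in (15, 45, 75):
        print(f"{elevation:>2} deg -> {shadow_length(person, elevation):.2f} m")
```

A high sun gives a shadow much shorter than the person, while a low sun gives one several times longer, which is why an elongated shadow reads as inconsistent with "midday".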

Medium 67%vs67%

local/nucleus-image-17b-a2bFAIL

A woman standing in front of a full-length mirror, her reflection accurately showing the back of her outfit and reversed orientation

Evaluator (Qwen 3.6 27B)

The reflection is physically impossible. The woman faces the mirror, so her reflection should show her front. Instead, the reflection shows her back, while the real figure’s back is visible to the viewer — a logical contradiction. Shadows and lighting are consistent, but the core reflection error invalidates plausibility. VERDICT: FAIL

local/nucleus-image-17b-a2bPASS

A cat sitting on a polished marble floor with its reflection visible below

Evaluator (Qwen 3.6 27B)

The reflection is vertically aligned beneath the cat, matching its posture and orientation. The distortion follows the curvature of the polished floor, with slight blurring consistent with a glossy surface. Color and brightness are appropriately muted in the reflection, as expected from light absorption and angle. Directionality is correct — no lateral or inverted errors. Overall, the reflection behaves physically plausibly for a marble floor. VERDICT: PASS

local/nucleus-image-17b-a2bPASS

A vase of flowers on a glass table with a clear reflection visible on the table surface

Evaluator (Qwen 3.6 27B)

The reflection on the glass table is physically plausible. It correctly mirrors the vase and flowers with appropriate orientation, slight distortion due to the glass surface, and accurate color representation. The reflection aligns with the object’s position and lighting conditions. VERDICT: PASS

Hard 100%vs100%

local/nucleus-image-17b-a2bPASS

A row of trees reflected in a calm lake at sunset, with the reflection slightly darker and gently rippled

Evaluator (Qwen 3.6 27B)

The reflections are vertically inverted, matching the trees’ orientation. They exhibit gentle ripples consistent with a calm lake surface, and appear slightly darker than the originals — all physically plausible. The color gradient in the sky is mirrored accurately in the water. No directional or distortion errors detected. VERDICT: PASS

local/nucleus-image-17b-a2bPASS

A city skyline reflected in a rain puddle on a street, with mild distortion and color shift

Evaluator (Qwen 3.6 27B)

The reflection in the puddle is inverted, which is physically correct. The distortion is mild and consistent with a shallow puddle on asphalt. The color shift toward cooler tones in the reflection is plausible due to water absorption and ambient light. The background skyline matches the reflected shapes, confirming spatial consistency. No obvious artifacts or implausible elements are present. VERDICT: PASS

local/nucleus-image-17b-a2bPASS

A mountain range reflected in a still alpine lake, with the reflection inverted and subtly muted in color

Evaluator (Qwen 3.6 27B)

The reflection is physically plausible: it is correctly inverted, aligned with the mountain’s base at the shoreline, and shows slight softening and muted color consistent with water surface diffusion. No directional or orientation errors are present. VERDICT: PASS

Extreme 67%vs100%

local/nucleus-image-17b-a2bPASS

A clear glass sphere sitting on a mirror surface near a pool of water, showing refraction through the sphere, reflection on the mirror, and water reflection simultaneously

Evaluator (Qwen 3.6 27B)

The reflections and refractions are physically plausible. The sphere correctly refracts the background, inverting the scene as expected through a convex lens. The reflection on the mirror surface is sharp and accurately mirrors the sphere’s position and lighting. The water reflection in the background is soft and consistent with a pool environment. Colors and distortions align with real-world optics. VERDICT: PASS

local/nucleus-image-17b-a2bFAIL

A chrome sculpture in a room with a glass floor and large windows, showing multiple reflections and environmental mapping on every surface

Evaluator (Qwen 3.6 27B)

The reflections are physically implausible. The floor tiles show reflections of the ceiling and windows, but the geometry is distorted and inconsistent with the room’s layout. The figures’ reflections do not align correctly with their positions, and some reflections appear in surfaces that should not reflect them (e.g., opaque walls). Color and orientation are also mismatched in several areas. VERDICT: FAIL

local/nucleus-image-17b-a2bPASS

A crystal wine glass on a polished marble counter next to a window overlooking a lake — showing refraction in the glass, reflection on the marble, and the lake's own reflective surface

Evaluator (Qwen 3.6 27B)

The reflections and refractions are physically plausible. The crystal glass shows realistic distortion of the background through its facets, with light bending correctly. The marble surface reflects the base of the glass with appropriate softness for a polished but not mirror-like material. The lake’s surface reflects the sky and distant trees with natural blurring due to water movement. Shadows are subtle and consistent with ambient daylight from the window. No directional or color inconsistenc

World Knowledge 33%vs92%

Easy 33%vs67%

local/nucleus-image-17b-a2bPASS

The Eiffel Tower in Paris on a clear day, architecturally accurate with its iron lattice structure and three levels

Evaluator (Qwen 3.6 27B)

The image depicts the Eiffel Tower with high architectural accuracy. The iron lattice structure, the three distinct levels (including the two main observation decks and the top platform), and the iconic arched base are all correctly rendered. The perspective and surrounding environment (trees, clear sky) are consistent with a real-world photograph. No significant inaccuracies are present. VERDICT: PASS

local/nucleus-image-17b-a2bFAIL

The Taj Mahal in Agra seen from the reflecting pool, with accurate dome, minarets, and symmetry

Evaluator (Qwen 3.6 27B)

The image depicts the Taj Mahal with recognizable features: the central dome, four minarets, and the reflecting pool. However, there are notable inaccuracies. The minarets are too short and lack the correct proportion and detailing. The side domes (chhatris) are oversized and misshapen. The reflection in the pool is overly simplified and lacks realistic distortion. The overall symmetry is approximate but not precise. These deviations from architectural accuracy disqualify it as factually correct

local/nucleus-image-17b-a2bFAIL

The Statue of Liberty on her island pedestal, with correct torch, crown, tablet, and green patina

Evaluator (Qwen 3.6 27B)

The image depicts two identical Statues of Liberty standing side-by-side on separate pedestals. This is factually incorrect, as there is only one Statue of Liberty. The duplication makes the subject architecturally inaccurate and unrealistic. VERDICT: FAIL

Medium 33%vs100%

local/nucleus-image-17b-a2bFAIL

A traditional Japanese shamisen instrument with accurate three-stringed design, wooden body, and bachi plectrum

Evaluator (Qwen 3.6 27B)

The image depicts a shamisen, but it contains significant factual inaccuracies. The instrument shown has four strings instead of the traditional three. Additionally, the bridge is incorrectly positioned on the soundboard rather than being a separate piece resting on it, and the tuning pegs are arranged in an unusual, non-standard configuration. These errors make the depiction factually inaccurate. VERDICT: FAIL

local/nucleus-image-17b-a2bFAIL

A medieval astrolabe with accurate rete, tympan, and engraved degree markings on brass

Evaluator (Qwen 3.6 27B)

The subject is clearly recognizable as an astrolabe, with appropriate brass material and general structure. However, the engraved markings are not accurate — they consist of gibberish symbols and nonsensical characters rather than real degree markings, zodiac signs, or Arabic/Latin inscriptions typical of medieval astrolabes. The rete and tympan also lack authentic geometric precision and historical detail. While visually plausible at first glance, it fails factual accuracy. VERDICT: FAIL

local/nucleus-image-17b-a2bPASS

A Japanese kintsugi bowl — cracked ceramic repaired with visible gold lacquer lines

Evaluator (Qwen 3.6 27B)

The image accurately depicts a Japanese kintsugi bowl. The ceramic vessel features visible cracks repaired with gold lacquer, which is the defining characteristic of the kintsugi art form. The bowl’s shape, glaze, and the application of gold along the fracture lines are realistic and consistent with real-world examples. No architectural or factual inaccuracies are present. VERDICT: PASS

Hard 0%vs100%

local/nucleus-image-17b-a2bFAIL

A detailed diagram of the water cycle showing evaporation from the ocean, condensation forming clouds, precipitation as rain, and collection in rivers flowing back to the sea

Evaluator (Qwen 3.6 27B)

The image depicts a stylized diagram of the water cycle, which is recognizable in concept. However, the text labels are nonsensical gibberish (e.g., "EVAVATION," "FRAFRILATION CYAL," "ECLÉPOTION"), making it factually inaccurate as a diagram. The prompt asked for a detailed diagram showing specific processes; since the labels are unreadable and incorrect, it fails to convey the required information accurately. VERDICT: FAIL

local/nucleus-image-17b-a2bFAIL

A cross-section showing tectonic plates at a convergent boundary with subduction, a trench, volcanic arc, and layered mantle

Evaluator (Qwen 3.6 27B)

The image depicts a surface-level landscape of cracked earth with a river, rather than a geological cross-section. It completely fails to show the requested internal structures: tectonic plates, subduction zones, trenches, volcanic arcs, or the mantle. The subject is not recognizable as the requested scientific diagram. VERDICT: FAIL

local/nucleus-image-17b-a2bFAIL

An anatomical illustration of the human heart showing four chambers, valves, aorta, pulmonary arteries, and veins with correct placement

Evaluator (Qwen 3.6 27B)

The image is not factually accurate. While it resembles a heart, the anatomy is distorted and incorrect. The chambers are misshapen, the valves are not clearly or correctly depicted, and the major vessels (aorta, pulmonary arteries/veins) are in wrong positions or have unnatural shapes. For example, the aorta appears to merge incorrectly with other structures, and the pulmonary veins are not properly shown entering the left atrium. These errors make it unsuitable as an anatomical illustration.

Extreme 67%vs100%

local/nucleus-image-17b-a2bPASS

A Morpho menelaus butterfly with accurate iridescent blue dorsal wings, brown ventral wing pattern with eyespots, and correct body morphology

Evaluator (Qwen 3.6 27B)

The image depicts a Morpho butterfly with the characteristic iridescent blue dorsal wings and brown ventral patterns with eyespots. The body morphology, including antennae and wing venation, appears anatomically correct. The coloration and pattern are consistent with the species Morpho menelaus. No significant inaccuracies are observed. VERDICT: PASS

local/nucleus-image-17b-a2bFAIL

The Hagia Sophia interior showing the massive central dome, pendentives, semi-domes, and Islamic calligraphy medallions alongside Byzantine mosaics

Evaluator (Qwen 3.6 27B)

The image depicts the interior of the Hagia Sophia with recognizable architectural features such as the central dome, pendentives, and Islamic calligraphy medallions. However, the Byzantine mosaics appear distorted and inaccurately rendered, with some figures and details blending into the background or appearing nonsensical. The overall composition is plausible but lacks precise architectural accuracy in the mosaic details. VERDICT: FAIL

local/nucleus-image-17b-a2bPASS

A specimen of bismuth crystal showing accurate iridescent oxide layers, geometric hopper crystal formation, and characteristic staircase structure

Evaluator (Qwen 3.6 27B)

The image depicts a bismuth crystal specimen with high fidelity. It accurately captures the characteristic geometric hopper crystal formation, the distinct "staircase" structure on the crystal faces, and the vibrant, iridescent oxide layers in correct colors (blues, greens, golds, purples). The base rock matrix is also rendered realistically. The subject is clearly recognizable and factually accurate. VERDICT: PASS
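
The Truthfulness percentages for local/nucleus-image-17b-a2b can be reproduced from the PASS/FAIL labels listed above. The sketch below tallies those labels per subcategory and assumes the category figure pools all evaluations rather than averaging the subcategory percentages. That assumption is consistent with the displayed 56%, since averaging the three subcategory rates would give 58% instead. The function names are hypothetical.

```python
from typing import Dict, Tuple

# (passed, total) per subcategory, counted from the PASS/FAIL labels above.
TRUTHFULNESS: Dict[str, Tuple[int, int]] = {
    "Photorealism": (2, 3),
    "Physics & Reflections": (9, 12),
    "World Knowledge": (4, 12),
}

def pass_rate(passed: int, total: int) -> int:
    """Pass rate as a whole-number percentage."""
    return round(100 * passed / total)

def pooled_rate(groups: Dict[str, Tuple[int, int]]) -> int:
    """Pass rate over all evaluations in the category, pooled across groups."""
    passed = sum(p for p, _ in groups.values())
    total = sum(t for _, t in groups.values())
    return pass_rate(passed, total)

if __name__ == "__main__":
    for name, (passed, total) in TRUTHFULNESS.items():
        print(f"{name}: {pass_rate(passed, total)}%")   # 67%, 75%, 33%
    print(f"Truthfulness: {pooled_rate(TRUTHFULNESS)}%")  # 56%
```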

Professional Studio 74%vs93%

Camera & Lighting 83%vs100%

Easy 100%vs100%

local/nucleus-image-17b-a2bPASS

A wide-angle shot of a modern living room interior showing expansive space with slight barrel distortion at the edges

Evaluator (Gemma 4 26B)

The image shows a modern living room interior with a wide-angle perspective. 1. **Perspective/Wide-angle/Distortion**: The image exhibits significant barrel distortion, particularly visible in the curved lines of the floor and the way the walls and ceiling meet. This creates the "expansive space" and "slight barrel distortion" requested in the prompt. 2. **Depth of Field**: The depth of field is deep, with the entire room from the foreground floor to the background windows being in sharp focus

local/nucleus-image-17b-a2bPASS

A close-up macro shot of a single dewdrop on a rose petal, shallow depth of field

Evaluator (Gemma 4 26B)

The image shows a close-up macro shot of a single, prominent dewdrop resting on a pink rose petal. 1. **Perspective/Shot Type**: It is a macro shot, focusing on the detail of the water droplet and the texture of the petal. 2. **Depth of Field**: The image features a shallow depth of field. The central dewdrop is in sharp focus, while the foreground and background (the rest of the rose petal) are heavily blurred (bokeh effect), which aligns perfectly with the prompt. 3. **Lighting**: The lighti

local/nucleus-image-17b-a2bPASS

An aerial bird's-eye view looking straight down at a traffic roundabout with cars and trees

Evaluator (Gemma 4 26B)

The image depicts an aerial bird's-eye view looking straight down at a traffic roundabout. 1. **Perspective**: The perspective is a true top-down (nadir) view, looking straight down at the center of the roundabout. 2. **Depth of Field**: The entire scene is in sharp focus from the center to the edges, which is consistent with a high-altitude aerial shot. 3. **Lighting**: The lighting is even and consistent across the scene, typical of daylight, with no harsh shadows or directional light source

Medium 100%vs100%

local/nucleus-image-17b-a2bPASS

A portrait lit with classic Rembrandt lighting — triangle of light on the shadowed cheek, moody atmosphere

Evaluator (Gemma 4 26B)

The image depicts a portrait with a strong emphasis on light and shadow. 1. **Lighting Pattern:** The image features a clear "Rembrandt lighting" pattern. There is a distinct, small triangle of light on the shadowed cheek (the viewer's left, the subject's right), which is the hallmark of this technique. 2. **Atmosphere:** The lighting is moody and dramatic, with deep shadows and a dark background, creating a high-contrast, chiaroscuro effect. 3. **Perspective and Depth of Field:** The persp

local/nucleus-image-17b-a2bPASS

A landscape photograph bathed in golden hour light — long warm shadows, orange sky, soft glow

Evaluator (Gemma 4 26B)

The image depicts a landscape with a distinct orange sky and a warm, golden color palette. The lighting pattern shows long, dark shadows stretching across the undulating terrain, consistent with a low sun angle. The light appears soft and creates a "glow" effect across the hills. The perspective shows a wide landscape view with a sense of depth created by the layered hills and the light/shadow transitions. The image successfully matches the description of "golden hour light," "long warm shadows

local/nucleus-image-17b-a2bPASS

A high-key portrait of a person against a bright white background, minimal shadows, airy and luminous

Evaluator (Gemma 4 26B)

The image matches the prompt's requirements for a high-key portrait: 1. **Lighting Pattern:** The image is extremely bright, featuring a high-key aesthetic with a bright white background. The lighting is airy and luminous, with very minimal shadows, creating a "washed out" or ethereal effect. 2. **Perspective:** It is a standard portrait perspective (head and shoulders). 3. **Depth of Field:** The image has a very shallow depth of field, with the facial features appearing soft and slightly o

Hard 33%vs100%

local/nucleus-image-17b-a2bPASS

A portrait shot at 85mm f/1.4 with creamy bokeh background, backlit with warm rim light separating the subject from a blurred garden

Evaluator (Gemma 4 26B)

The image matches the prompt's specifications: 1. **Perspective/Lens (85mm):** The image shows a tight portrait with a shallow depth of field, characteristic of an 85mm lens used for portraits. The subject is the clear focus, and the background is heavily blurred. 2. **Depth of Field (f/1.4/Creamy Bokeh):** The background is extremely soft and blurred, creating a "creamy bokeh" effect where the garden elements are indistinct shapes of light and color. 3. **Lighting (Backlit/Warm Rim Light):*

local/nucleus-image-17b-a2bFAIL

A street photograph with shallow depth of field as if shot at 50mm f/1.8, subject sharp with soft city lights in the background

Evaluator (Gemma 4 26B)

The provided image is a solid-colored, featureless grey square. It does not contain a street photograph, a subject, city lights, or any discernible depth of field or perspective. It fails to meet any of the criteria specified in the prompt. VERDICT: FAIL

local/nucleus-image-17b-a2bFAIL

A food photograph shot from above at 35mm f/2.8 with natural window light from the left, slight vignette, warm color temperature

Evaluator (Gemma 4 26B)

The image fails to meet several key criteria specified in the prompt: 1. **Perspective/Angle:** The prompt specifies a shot "from above" (top-down/flat lay). The image provided is a side-angle/eye-level shot looking into the bowl. 2. **Lighting:** The prompt specifies "natural window light from the left." While there is a window visible on the left, the light hitting the food is very flat and lacks the directional shadows or high-contrast modeling typically associated with natural window ligh

Extreme 100%vs100%

local/nucleus-image-17b-a2bPASS

A tilt-shift photograph of a city street making real-life look like a miniature model — selective focus strip, saturated colors

Evaluator (Gemma 4 26B)

The image exhibits the core characteristics of a tilt-shift photograph. 1. **Perspective/Tilt-Shift Effect:** The image uses a very shallow depth of field to create the "miniature" effect. There is a distinct "selective focus strip" where the foreground/midground (the road surface and the curb) is in sharp focus, while the foreground (the bottom left corner) and the background (the buildings and cars) are heavily blurred. 2. **Depth of Field:** The depth of field is extremely shallow, which i

local/nucleus-image-17b-a2bPASS

A cinematic frame with blue and orange anamorphic lens flare streaking horizontally across a dark moody scene

Evaluator (Gemma 4 26B)

The image features a prominent horizontal light streak across a dark, moody scene. 1. **Lighting Pattern:** The streak transitions from a bright blue on the left to a bright orange/amber on the right, matching the "blue and orange" description. 2. **Anamorphic Lens Flare:** The light is a sharp, horizontal streak characteristic of an anamorphic lens flare. 3. **Scene/Mood:** The background is dark and moody with a soft, out-of-focus texture at the bottom, suggesting a cinematic frame. The ima

local/nucleus-image-17b-a2bPASS

A portrait with dramatic split lighting — half the face brightly lit, the other half in deep shadow, high contrast chiaroscuro

Evaluator (Gemma 4 26B)

The image shows a portrait of a person where the lighting is a classic example of split lighting. 1. **Lighting Pattern:** The face is divided vertically by light and shadow. The left side of the subject's face (from the viewer's perspective) is brightly illuminated, while the right side is in deep, dark shadow. This perfectly matches the "dramatic split lighting" and "high contrast chiaroscuro" described in the prompt. 2. **Perspective:** The perspective is a standard close-up portrait, looki

Color Precision 83%vs92%

Easy 100%vs100%

local/nucleus-image-17b-a2bPASS

A bright red sports car parked on a gray asphalt road — the red is vivid and unambiguous

Evaluator (Gemma 4 26B)

The image shows a sports car that is a bright, vivid, and unambiguous red. The car is parked on a gray asphalt road. The color of the car matches the prompt's requirement for a "bright red" and "vivid" color perfectly. VERDICT: PASS

local/nucleus-image-17b-a2bPASS

A woman wearing a royal blue evening dress against a neutral background

Evaluator (Gemma 4 26B)

The image shows a woman wearing a dress that is a vibrant, deep blue color. This color matches the description of "royal blue" specified in the prompt. The background is a solid, light beige/tan color, which qualifies as a "neutral background." The colors in the image accurately reflect the prompt's requirements. VERDICT: PASS

local/nucleus-image-17b-a2bPASS

A field of pure sunflower yellow flowers under a clear blue sky

Evaluator (Gemma 4 26B)

The image contains a field of sunflowers and a sky. 1. **Sky Color**: The sky is a bright, clear blue, which matches the prompt's requirement for a "clear blue sky." 2. **Flower Color**: The sunflowers are a vibrant, bright yellow, which matches the prompt's requirement for "pure sunflower yellow flowers." The colors in the image accurately reflect the specific colors requested in the prompt. VERDICT: PASS

Medium 33%vs67%

local/nucleus-image-17b-a2bPASS

A bedroom wall painted in #FF6B35 (burnt orange) with white furniture and natural daylight

Evaluator (Gemma 4 26B)

The image shows a bedroom wall painted in a vibrant, saturated orange color. The hex code #FF6B35 represents a "burnt orange" or a bright reddish-orange. The color in the image is a deep, warm orange that aligns well with the description of "burnt orange" and the specific hex value provided. The furniture shown is white, which also matches the prompt. VERDICT: PASS

local/nucleus-image-17b-a2bFAIL

A forest scene where the foliage is precisely #2D5F2D (dark muted green) with brown tree trunks

Evaluator (Gemma 4 26B)

The image depicts a forest scene with a very dark, monochromatic green color palette. The foliage and background are various shades of deep green, and the tree trunks are also rendered in a very dark green/black shade. The prompt specifically requested: 1. Foliage color: **#2D5F2D (dark muted green)**. 2. Tree trunks: **Brown**. The image fails on both counts: - The color used for the foliage is a much darker, more saturated green than the requested #2D5F2D. - The tree trunks are dark green/bl

local/nucleus-image-17b-a2bFAIL

A room with walls in #4A90D9 (medium blue) with warm wooden floor and white trim

Evaluator (Gemma 4 26B)

The prompt specifies a wall color of **#4A90D9**, which is a medium-toned, slightly desaturated sky blue. Looking at the image, the walls are a very vibrant, highly saturated, bright cyan/cerulean blue. The color in the image is significantly more luminous and "electric" than the more muted, professional-grade blue represented by the hex code #4A90D9. While it is a "medium blue" in a general sense, it lacks the specific tonal quality and saturation level of the requested hex code. VERDICT: FA
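
The hex-code judgments above are made by eye. As a rough cross-check, a hex value can be converted to hue, saturation, and lightness with Python's standard `colorsys` module. This is only a sketch: the helpers `hex_to_rgb` and `hex_to_hsl` are hypothetical, and nothing in the source indicates that the evaluator measures pixel values this way.

```python
import colorsys
from typing import Tuple

def hex_to_rgb(code: str) -> Tuple[int, int, int]:
    """Convert '#RRGGBB' to an (R, G, B) tuple of 0-255 integers."""
    digits = code.lstrip("#")
    if len(digits) != 6:
        raise ValueError("expected a six-digit hex colour")
    r, g, b = (int(digits[i:i + 2], 16) for i in (0, 2, 4))
    return r, g, b

def hex_to_hsl(code: str) -> Tuple[int, int, int]:
    """Return (hue in degrees, saturation %, lightness %), rounded."""
    r, g, b = (channel / 255 for channel in hex_to_rgb(code))
    # colorsys uses HLS ordering: hue, lightness, saturation.
    h, l, s = colorsys.rgb_to_hls(r, g, b)
    return round(h * 360), round(s * 100), round(l * 100)

if __name__ == "__main__":
    for code in ("#FF6B35", "#2D5F2D", "#4A90D9"):
        print(code, hex_to_rgb(code), hex_to_hsl(code))
```

Comparing these values against the average colour of the relevant image region would make a "too saturated" or "too dark" judgment quantitative, though how large a deviation should count as a failure is a separate choice.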

Hard 100%vs100%

local/nucleus-image-17b-a2bPASS

A minimalist room with walls in #F5E6CC (warm cream), a sofa in #2C3E50 (dark navy), and accent pillows in #E74C3C (bright red)

Evaluator (Gemma 4 26B)

To evaluate the color accuracy, I will compare the image colors to the specified hex values and names: 1. **Walls (#F5E6CC - Warm Cream):** The walls in the image are a pale, yellowish cream color. This matches the "warm cream" description and the hex code #F5E6CC well. 2. **Sofa (#2C3E50 - Dark Navy):** The sofa is a very dark, deep navy blue. This aligns perfectly with the hex code #2C3E50. 3. **Accent Pillows (#E74C3C - Bright Red):** The pillows are a vibrant, saturated red. This matches

local/nucleus-image-17b-a2bPASS

A product flat-lay on a #1A1A2E (deep dark blue) background, with a #F0C27F (soft gold) watch and #FFFFFF (pure white) earbuds

Evaluator (Gemma 4 26B)

The image contains the following elements with their corresponding colors: 1. **Background:** The background is a very dark, deep blue. The hex code #1A1A2E represents a deep navy/dark blue, which matches the visual appearance of the background in the image. 2. **Watch:** The watch is a soft gold color. The hex code #F0C27F is a pale, yellowish-gold, which matches the soft gold appearance of the watch in the image. 3. **Earbuds:** The earbuds are pure white. The hex code #FFFFFF is pure whit

local/nucleus-image-17b-a2bPASS

A graphic design composition with background #FAFAFA (near-white), main shape #FF4500 (orange-red), and accent lines #333333 (dark charcoal)

Evaluator (Gemma 4 26B)

The image contains the following colors: 1. **Background:** The background is a near-white/off-white color. While it is not a pure white, it is very close to the requested `#FAFAFA`. 2. **Main Shape:** The central shape is a vibrant orange-red. This aligns well with the requested `#FF4500` (Orange Red). 3. **Accent Lines:** The lines are a dark charcoal/gray color, which matches the requested `#333333`. The colors in the image accurately reflect the hex codes and descriptions provided in the

Extreme 100%vs100%

local/nucleus-image-17b-a2bPASS

A smooth gradient on a wall going from #FF0000 (red) on the left to #0000FF (blue) on the right, transitioning evenly through purple

Evaluator (Gemma 4 26B)

The image shows a smooth gradient transition. On the far left, the color is a vibrant red, which corresponds to #FF0000. On the far right, the color is a vibrant blue, which corresponds to #0000FF. The transition between these two colors passes through a purple/magenta hue in the center, as requested. The gradient is even and follows the specified color progression. VERDICT: PASS
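
For reference, a straight linear blend between the two endpoint colours does pass through purple at its midpoint. The sketch below assumes per-channel linear interpolation in 8-bit sRGB values, which is one common reading of "transitioning evenly" and not necessarily what the generator or the evaluator had in mind. The helper `lerp_color` is hypothetical.

```python
from typing import Tuple

RGB = Tuple[int, int, int]

def lerp_color(start: RGB, end: RGB, t: float) -> RGB:
    """Linearly interpolate each channel between start (t=0) and end (t=1)."""
    if not 0.0 <= t <= 1.0:
        raise ValueError("t must be in [0, 1]")
    r, g, b = (round(a + (b_ - a) * t) for a, b_ in zip(start, end))
    return r, g, b

RED: RGB = (255, 0, 0)    # #FF0000
BLUE: RGB = (0, 0, 255)   # #0000FF

if __name__ == "__main__":
    for t in (0.0, 0.25, 0.5, 0.75, 1.0):
        r, g, b = lerp_color(RED, BLUE, t)
        print(f"t={t:.2f} -> #{r:02X}{g:02X}{b:02X}")
```

At t = 0.5 this gives (128, 0, 128), i.e. #800080, so a purple band in the centre of the image is what an even red-to-blue gradient should show.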

local/nucleus-image-17b-a2bPASS

A sunset sky gradient on a flat surface from #FF6B35 (orange) at the bottom to #1B1464 (deep indigo) at the top

Evaluator (Gemma 4 26B)

The image displays a vertical color gradient. 1. **Top Color:** The top of the image is a very dark, deep indigo/navy blue. This aligns well with the requested hex code **#1B1464**. 2. **Bottom Color:** The bottom of the image is a vibrant orange. While the very bottom edge appears slightly darker/shadowed, the primary orange hue matches the requested **#FF6B35**. 3. **Gradient Transition:** The transition from the deep indigo at the top to the orange at the bottom follows the requested struct

local/nucleus-image-17b-a2bPASS

A horizontal banner with a gradient from #00C9FF (cyan) on the left to #92FE9D (mint green) on the right

Evaluator (Gemma 4 26B)

The image shows a horizontal banner with a gradient. 1. **Left side color:** The color on the left is a bright cyan/light blue. The hex code #00C9FF is a vibrant cyan, which matches the visual appearance of the left side of the image. 2. **Right side color:** The color on the right is a light, pale green. The hex code #92FE9D is a light mint green, which matches the visual appearance of the right side of the image. 3. **Gradient transition:** The gradient flows smoothly from the cyan on the

Photorealism 0%vs67%

Easy 0%vs67%

local/nucleus-image-17b-a2bFAIL

A single red sneaker on a clean white background with soft studio lighting and gentle shadow — product photography style

Evaluator (Gemma 4 26B)

The image shows a single red sneaker on a white background. While the overall composition mimics a product shot, there are several significant issues that indicate it is not a real photograph: 1. **Anatomical/Structural Errors:** The shape of the sneaker is nonsensical. The heel area and the opening where the foot would go are distorted and lack a logical structure. The laces and the eyelets are poorly defined and appear to melt into the body of the shoe. 2. **Lighting and Shadow Inconsistenc

local/nucleus-image-17b-a2bFAIL

A clear glass perfume bottle on a reflective black surface with studio lighting, no background distractions

Evaluator (Gemma 4 26B)

The image shows a clear glass perfume bottle on a reflective black surface. Upon examination, there are several significant AI artifacts and physical inconsistencies: 1. **Impossible Geometry/Reflections:** Inside the bottle, the liquid and the internal structure do not align with the external shape. There is a strange, white, curved line/shape inside the bottle that looks like a distorted reflection or a structural element that doesn't make physical sense. 2. **Distorted Cap/Nozzle:** The to

local/nucleus-image-17b-a2bFAIL

A ripe green apple on a seamless white background with soft side lighting and subtle shadow — catalog style

Evaluator (Gemma 4 26B)

The image depicts a single green apple on a white background. **Analysis:** 1. **Subject Matter:** The apple is a bright, uniform green. While it looks clean, the texture is unnaturally smooth and lacks the organic imperfections (tiny pores, speckles, or slight color variations) typically found in a real Granny Smith apple. 2. **Lighting and Shadows:** The lighting is very soft, but the shadow at the base of the apple is extremely blurry and lacks a defined contact point. The transition from t

Graphical design 46%vs100%

Data Visualisation 0%vs100%

Easy 0%vs100%

local/nucleus-image-17b-a2bFAIL

A bar chart showing monthly sales for January through June with values 12K, 18K, 15K, 22K, 19K, 25K — labeled axes, color-coded bars, and a title 'Monthly Sales 2025'

Evaluator (Qwen 3 VL)

The requested bar chart should show monthly sales for January through June with values 12K, 18K, 15K, 22K, 19K, 25K — labeled axes, color-coded bars, and a title 'Monthly Sales 2025'. In the provided image: - The chart type is a bar chart, which matches the request. - The bars are color-coded (blue gradient), which is correct. - The title 'Monthly Sales 2025' is not present — it is missing. - The x-axis labels are incorrect: they show “12 2025”, “19K”, “25K”, which do not represent months Janua

local/nucleus-image-17b-a2bFAIL

A pie chart showing market share: Company A 35%, Company B 25%, Company C 20%, Others 20% — each slice a different color with labels and percentages

Evaluator (Qwen 3 VL)

The requested pie chart should show: - Company A: 35% - Company B: 25% - Company C: 20% - Others: 20% In the provided image: - The largest slice (35%) is labeled “A5%” and “35%” — this appears to be Company A, but the label “A5%” is incorrect and confusing. - The 25% slice is labeled “25% B%” — this is inconsistent and poorly formatted. - The 20% slice is labeled “20%” and “20% 100%” — the “100%” is incorrect and misleading. - The “Others” slice is labeled “Others 25%” — this contradicts the re

local/nucleus-image-17b-a2bFAIL

A simple line graph showing daily website visitors over one week (Mon-Sun): 1200, 1500, 1100, 1800, 2000, 1700, 1400 — with grid lines, axis labels, and title 'Weekly Traffic'

Evaluator (Qwen 3 VL)

The requested visualization is a simple line graph showing daily website visitors over one week (Mon-Sun) with specific values: 1200, 1500, 1100, 1800, 2000, 1700, 1400. In the provided image: - The chart type is a line graph, which matches the request. - However, the x-axis labels are incorrect: “Mon”, “Sinp”, “T800”, “1400” — these are not readable day labels (e.g., “Tue”, “Wed”, etc.), and “Sinp” and “T800” are nonsensical. - The y-axis is labeled “Litn” (likely a misrendering of “Visitors”

Layout & Design 33%vs100%

Easy 67%vs100%

local/nucleus-image-17b-a2bPASS

A movie night poster with a large popcorn image in the center, the title 'MOVIE NIGHT' in bold at the top, and the tagline 'Every Friday at 8 PM' at the bottom

Evaluator (Qwen 3 VL)

The image features a large, centrally placed popcorn bucket, which fulfills the “large popcorn image in the center” requirement. The title “MOVIE NIGHT” is prominently displayed at the top in bold red text, matching the prompt. The tagline “Every Friday at 8 PM” is correctly placed at the bottom in white text and is fully readable. The visual hierarchy is appropriate: the title is bold and large at the top, the popcorn is the central focal point, and the tagline is clearly visible at the bottom

local/nucleus-image-17b-a2bPASS

An event poster featuring a guitar illustration, the headline 'LIVE MUSIC FEST' in large text, and 'Tickets available now' as a tagline below

Evaluator (Qwen 3 VL)

All described layout zones/elements are present: - A guitar illustration is on the left side. - The headline “LIVE MUSIC FEST” is in large, bold text on the right, with clear visual hierarchy (largest element). - The tagline “Tickets available now” is correctly placed below the headline, with “now” on a separate line and underlined for emphasis. - All text is readable, correctly positioned, and matches the prompt’s description. VERDICT: PASS

local/nucleus-image-17b-a2bFAIL

A book club poster with an open book illustration, the title 'JOIN THE BOOK CLUB' prominently displayed, and 'Meets every Wednesday' as the tagline

Evaluator (Qwen 3 VL)

The prompt specified three required elements: 1. An open book illustration — not present. 2. The title “JOIN THE BOOK CLUB” — present and prominently displayed at the top with correct visual hierarchy. 3. The tagline “Meets every Wednesday” — not present. While the title is correctly placed and readable, the other two required elements are missing. The absence of the open book illustration and the tagline means the poster does not fulfill the full layout specification. VERDICT: FAIL
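
The per-difficulty figures appear to follow directly from the evaluator verdicts: the three Easy samples above are PASS, PASS and FAIL, giving 67%. A minimal sketch of that tally, assuming each evaluator response ends with a `VERDICT: PASS` or `VERDICT: FAIL` line as in the complete responses shown here (the function names are hypothetical, not the harness's own):

```python
import re

# Matches a trailing "VERDICT: PASS" or "VERDICT: FAIL" at the end of a response.
VERDICT_RE = re.compile(r"VERDICT:\s*(PASS|FAIL)\s*$")


def parse_verdict(response):
    """Return 'PASS' or 'FAIL', or None if the response has no final verdict."""
    match = VERDICT_RE.search(response.strip())
    return match.group(1) if match else None


def pass_rate(verdicts):
    """Percentage of PASS verdicts, rounded to the nearest whole percent."""
    if not verdicts:
        return 0
    passed = sum(1 for v in verdicts if v == "PASS")
    return round(100 * passed / len(verdicts))


# The three Easy "Layout & Design" verdicts for local/nucleus-image-17b-a2b.
easy_layout = ["PASS", "PASS", "FAIL"]
easy_rate = pass_rate(easy_layout)  # 67
```

Responses that are cut off before the verdict line, as many on this page are, return `None` from `parse_verdict`, so the badge shown next to the model name is the more reliable source for those samples.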

Medium 0%vs100%

local/nucleus-image-17b-a2bFAIL

A landing page design with a hero header image at the top, a headline and subtitle in the middle, a 'Sign Up' call-to-action button, and a footer with three social media icons

Evaluator (Qwen 3 VL)

The image shows a landing page with a blue header bar at the top, a central “Sign Up” headline, a paragraph of placeholder text, and two buttons (“Sign Up” and “SOCIALS”) below. However, the prompt specifically requested: 1. A hero header image at the top — not present; instead, there’s a blue navigation bar. 2. A headline and subtitle in the middle — the headline “Sign Up Uke” is present, but the subtitle is missing or illegible (the text below is gibberish placeholder). 3. A “Sign Up” call-to

local/nucleus-image-17b-a2bFAIL

A newsletter layout with a banner image header, a two-column body section with text on the left and an image on the right, a highlighted quote block, and a footer with unsubscribe link

Evaluator (Qwen 3 VL)

The image does not meet the required layout specifications. - There is no banner image header — the top section is a title (“Nisirreset erki”) with no accompanying banner or image. - The two-column body section is present, with text on the left and an image on the right — but the image is a stylized drawing of a flower, not a “banner image” as specified. - There is no clearly defined “highlighted quote block” — the text blocks are not visually emphasized as quotes. - The footer contains an “Uns

local/nucleus-image-17b-a2bFAIL

A restaurant menu layout with a logo header, three category sections (appetizers, mains, desserts) each with items and prices, and a footer with address and hours

Evaluator (Qwen 3 VL)

The image depicts a restaurant menu with a header logo (“LEALI MENSI”), three category sections (Appetizers, Mains, Desserts), and a footer with address and hours. However, the text within all sections — including item names, prices, and footer information — is rendered as gibberish or nonsensical characters. The layout structure is visually present, but the content is unreadable and does not convey any meaningful information. The prompt required readable, correctly placed text for all describe

Hard 33%vs100%

local/nucleus-image-17b-a2bPASS

A magazine double-page spread with a left page containing a large hero photo, and a right page with a two-column article, a pull quote, a sidebar with author bio, and a page number

Evaluator (Qwen 3 VL)

The image shows a magazine double-page spread with the following layout elements: - Left page: Contains a large hero photo of a woman’s face, which dominates the page — this matches the prompt. - Right page: Contains a two-column article layout, a pull quote (in a distinct box near the top), a sidebar with what appears to be an author bio (in a box on the right side), and a page number (bottom right corner). Visual hierarchy is generally correct: the hero image on the left is prominent, the ar

local/nucleus-image-17b-a2bFAIL

A mobile app UI mockup showing a dashboard with a top navigation bar, a search field, a 2x2 grid of feature cards with icons, a list of recent activity items, and a bottom tab bar

Evaluator (Qwen 3 VL)

The image shows a mobile app UI mockup that includes: - A top navigation bar with status indicators (time, signal, Wi-Fi, battery) — ✅ present. - A search field with placeholder text “Search” and a magnifying glass icon — ✅ present. - A 2x2 grid of feature cards with icons — ✅ present. - A “Recent Activity” list section — ✅ present. - A bottom tab bar with five icons — ✅ present. However, the text on the feature cards is illegible or incorrectly rendered (“Featur” instead of “Feature” on one c

local/nucleus-image-17b-a2bFAIL

A fashion magazine spread with a full-bleed left-page photo, right page with three-column layout, headline, body copy, a smaller inset image, and styled page numbers

Evaluator (Qwen 3 VL)

The image shows an open magazine spread. The left page features a full-bleed photograph of a model, which matches the prompt’s description. The right page contains text arranged in multiple columns, a headline, body copy, and a smaller inset image near the bottom. Page numbers are visible at the bottom of both pages, styled appropriately. However, the text on both pages is not readable — it appears to be gibberish or placeholder text (possibly due to AI generation or encoding issues), and while

Style Diversity 67%vs100%

Easy 100%vs100%

local/nucleus-image-17b-a2bPASS

A golden retriever sitting in a garden, painted in the style of an oil painting with visible brushstrokes and rich texture

Evaluator (Qwen 3 VL)

The image depicts a golden retriever sitting in a garden, and it is rendered in a style that strongly resembles an oil painting. Visible brushstrokes are present throughout the composition — especially in the grass, foliage, and the dog’s fur — giving the image texture and a painterly quality. The colors are rich and saturated, consistent with the description of “rich texture.” The overall effect mimics traditional oil painting techniques, including blending, impasto-like texture in some areas,

local/nucleus-image-17b-a2bPASS

A golden retriever sitting in a garden, rendered in Japanese anime style with flat colors and large expressive eyes

Evaluator (Qwen 3 VL)

The image depicts a golden retriever sitting in a garden, which matches the subject matter of the prompt. The style is clearly Japanese anime-inspired, featuring large, expressive eyes, simplified facial features, and a clean, stylized aesthetic. The color palette is flat with minimal shading, consistent with the requested “flat colors” characteristic. While there are some subtle gradients and soft shadows to suggest form, the overall look remains stylized and cartoonish, fitting the anime style

local/nucleus-image-17b-a2bPASS

A golden retriever sitting in a garden, rendered in pixel art style with a 32x32 grid and limited color palette

Evaluator (Qwen 3 VL)

The image depicts a golden retriever sitting in a garden, rendered in a pixel art style. The visual characteristics of pixel art are clearly present: the image is composed of distinct, blocky pixels, and the overall composition is defined by a visible grid overlay, which strongly suggests a 32x32 grid structure. The color palette is limited, using a few distinct shades to represent the dog, grass, flowers, and background foliage — consistent with the constraints of pixel art. However, the grid

Medium 100%vs100%

local/nucleus-image-17b-a2bPASS

A city street scene in the style of Art Deco — geometric shapes, gold and black palette, ornamental symmetry

Evaluator (Qwen 3 VL)

The image clearly depicts a city street scene rendered in the Art Deco style. Key characteristics are unmistakably present: - **Geometric shapes**: The buildings are composed of sharp angles, vertical lines, stepped forms, and stylized motifs — all hallmarks of Art Deco architecture. - **Gold and black palette**: The entire scene is rendered in a striking contrast of gold and black, which is a classic Art Deco color scheme. - **Ornamental symmetry**: The composition is highly symmetrical, with

local/nucleus-image-17b-a2bPASS

A woman with a parasol in the style of Impressionism — soft brushstrokes, dappled light, pastel palette

Evaluator (Qwen 3 VL)

The image clearly embodies the requested Impressionist style. Key characteristics are present: - Soft, visible brushstrokes are evident throughout the composition, especially in the background foliage and the woman’s dress. - Dappled light is captured in the way sunlight filters through the trees, creating patches of light and shadow in the background. - A pastel palette is used consistently — soft pinks, yellows, greens, and blues — which is typical of Impressionism. - The focus on capturing a

local/nucleus-image-17b-a2bPASS

A portrait of a man in Pop Art style — bold outlines, Ben-Day dots, flat vivid primary colors

Evaluator (Qwen 3 VL)

The image clearly embodies the requested Pop Art style. It features: - Bold, black outlines defining the figure and clothing. - Ben-Day dots (halftone patterns) visible in the background and subtly in the skin tones. - Flat, vivid primary colors: bright blue suit, red tie, yellow background, and orange skin tone. All key visual characteristics of the specified Pop Art style are present and unmistakable. VERDICT: PASS

Hard 33%vs100%

local/nucleus-image-17b-a2bFAIL

A forest rendered in photorealistic watercolor — the detail and lighting of a photo combined with visible watercolor bleeds and paper texture

Evaluator (Qwen 3 VL)

The image depicts a forest scene rendered in a watercolor style, with visible paper texture, watercolor bleeds, and soft, diffused edges characteristic of watercolor painting. However, the prompt requested a “photorealistic watercolor” style — meaning it should combine the high detail and lighting realism of a photograph with the watercolor medium. While the image has watercolor qualities, it lacks photorealistic detail: the trees and foliage are stylized, not sharply defined, and the lighting

local/nucleus-image-17b-a2bPASS

A woman in a futuristic city rendered in cyberpunk Art Nouveau — neon colors and tech elements combined with flowing organic lines and floral borders

Evaluator (Qwen 3 VL)

The image successfully embodies the requested “cyberpunk Art Nouveau” style. Key characteristics are clearly present: - **Neon colors**: The scene is saturated with glowing neon blues, pinks, and purples, typical of cyberpunk aesthetics. - **Tech elements**: The woman’s outfit features glowing circuitry, mechanical details, and futuristic design, blending technology with the human form. - **Flowing organic lines**: The Art Nouveau influence is evident in the curvilinear, decorative framing with

local/nucleus-image-17b-a2bFAIL

A medieval castle depicted in Japanese ukiyo-e woodblock print style combined with photorealistic sky and lighting

Evaluator (Qwen 3 VL)

The image attempts to merge a medieval castle with ukiyo-e woodblock print aesthetics, but the result is visually inconsistent and fails to meet the prompt’s requirements. Key issues: - The structure is not a medieval European castle; it is a Japanese castle (yamashiro-style), which is architecturally incompatible with the “medieval castle” request. - The ukiyo-e style is present: flat planes of color, bold outlines, stylized clouds, and vertical composition — but these are applied to a Japanes

Extreme 33%vs100%

local/nucleus-image-17b-a2bFAIL

An interior scene rendered in soft brutalism — raw concrete textures combined with rounded pastel furniture and warm diffused light

Evaluator (Qwen 3 VL)

The image presents an interior scene with geometric, block-like forms in soft pastel colors (pink, yellow, mint, peach), which aligns with the “rounded pastel furniture” and “warm diffused light” aspects of the prompt. The surfaces appear smooth and minimalist, lacking the raw, textured concrete typically associated with brutalism. While the composition has a clean, modern aesthetic with strong geometric shapes, the “raw concrete textures” element is notably absent — the materials look painted o

local/nucleus-image-17b-a2bFAIL

A still life painted in anxious minimalism — sparse composition with uncomfortable negative space, muted colors, and subtle visual tension

Evaluator (Qwen 3 VL)

The requested style — “anxious minimalism” — emphasizes sparse composition, uncomfortable negative space, muted colors, and subtle visual tension. The image presents a nearly monochromatic, minimalist composition with two horizontal planes of slightly different gray tones. While this achieves a sparse, minimalist aesthetic and uses muted colors, it lacks any visual tension or subject matter to evoke anxiety. The “uncomfortable negative space” is not meaningfully present — the space is empty but

local/nucleus-image-17b-a2bPASS

A cityscape rendered in nostalgic futurism — retro-futuristic 1960s space-age aesthetics with a melancholic warm-toned patina

Evaluator (Qwen 3 VL)

The image clearly embodies the requested “nostalgic futurism — retro-futuristic 1960s space-age aesthetics with a melancholic warm-toned patina.” Key visual characteristics are present: - **Retro-futuristic architecture**: The cityscape features sleek, tapered towers, flying saucers as both vehicles and architectural elements, and streamlined, chrome-like surfaces — all hallmarks of 1960s sci-fi design. - **Warm-toned patina**: The entire scene is bathed in a sepia-toned, golden-orange glow, ev
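
The 46% shown for Graphical design is consistent with pooling every sample in its three subcategories rather than averaging the subcategory percentages. The sketch below uses pass counts tallied from the PASS/FAIL badges listed above for local/nucleus-image-17b-a2b; the pooling rule itself is an inference from the numbers, not something the page states, and the names are illustrative.

```python
from typing import NamedTuple


class Tally(NamedTuple):
    passed: int
    total: int


def pooled_pass_rate(tallies):
    """Pass rate over all samples combined, rounded to a whole percent."""
    tallies = list(tallies)
    passed = sum(t.passed for t in tallies)
    total = sum(t.total for t in tallies)
    return round(100 * passed / total) if total else 0


# Counts taken from the verdict badges in this section.
graphical_design = {
    "Data Visualisation": Tally(passed=0, total=3),
    "Layout & Design": Tally(passed=3, total=9),
    "Style Diversity": Tally(passed=8, total=12),
}

pooled = pooled_pass_rate(graphical_design.values())  # 11 of 24 samples

# For comparison: the unweighted mean of the three subcategory percentages.
unweighted = round(
    sum(100 * t.passed / t.total for t in graphical_design.values())
    / len(graphical_design)
)
```

Pooling gives 46%, matching the category header, whereas the unweighted mean of 0%, 33% and 67% would give 33%.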