192 evaluations across 6 categories

Prompt: The word 'CHAPTER ONE' typed on aged paper with a vintage typewriter font, complete with slightly uneven ink
Text Rendering 60% vs 100%
Typography Style 100% vs 100%
local/nucleus-image-17b-a2b | PASS | Evaluator (Qwen 3 VL)
The text “CHAPTER ONE” is spelled correctly. The font style is consistent with a vintage typewriter font — bold, blocky, and evenly spaced, which matches the aesthetic of mechanical typewriters. The paper appears aged with a slightly yellowed tone, and the ink is uniform and crisp, though the prompt requested “slightly uneven ink.” While the ink is clean and consistent, it does not convincingly show the slight imperfections or variations typical of real typewriter ink (e.g., smudges, uneven pres
local/nucleus-image-17b-a2b | PASS | Evaluator (Qwen 3 VL)
The text “TODAY’S SPECIAL” is rendered in a style that convincingly mimics white chalk on a dark green chalkboard. The lettering shows the characteristic uneven, textured strokes and slight smudges of chalk writing, with visible chalk dust residue around the letters. The background is a dark green chalkboard with a wooden frame, and chalk pieces are visible at the top left, reinforcing the medium. The spelling is correct: “TODAY’S SPECIAL” (with proper apostrophe in “TODAY’S”). VERDICT: PASS
local/nucleus-image-17b-a2b | PASS | Evaluator (Qwen 3 VL)
The text “OPEN” is spelled correctly. The style of the text matches the described medium: it is rendered in glowing red pixel-style letters, consistent with an LED sign. The individual dots forming the letters are clearly visible, and the overall appearance convincingly emulates an LED display. VERDICT: PASS
Writing accuracy 50% vs 100%
local/nucleus-image-17b-a2b | PASS | Evaluator (Qwen 3 VL)
The word displayed is "HELLO", spelled correctly with all letters clearly visible: H, E, L, L, O. The text is fully readable, centered on a white background, and matches the intended specification exactly. No misspellings, missing words, or illegible parts are present. VERDICT: PASS
local/nucleus-image-17b-a2b | PASS | Evaluator (Qwen 3 VL)
The word "ADVENTURE" is spelled correctly and is fully readable. All letters are clearly visible and intact. There are no misspellings, missing words, or illegible parts. VERDICT: PASS
local/nucleus-image-17b-a2b | PASS | Evaluator (Qwen 3 VL)
The word displayed is "SERENITY", which is spelled correctly. All letters are clearly visible and fully readable against the solid light blue background. There are no misspellings, missing words, or illegible parts. VERDICT: PASS
local/nucleus-image-17b-a2b | PASS | Evaluator (Qwen 3 VL)
The text reads: "The quick brown fox jumps high" All words are spelled correctly: - The - quick - brown - fox - jumps - high All words are fully readable and clearly rendered in clean black text on a white background, matching the intended description. VERDICT: PASS
local/nucleus-image-17b-a2b | PASS | Evaluator (Qwen 3 VL)
The text in the image reads: "Every sunset brings a new dawn ahead". - All words are spelled correctly. - All words are fully readable. - No words are missing. - No characters are illegible. The text matches the intended phrase exactly. VERDICT: PASS
local/nucleus-image-17b-a2b | PASS | Evaluator (Qwen 3 VL)
The text in the image reads: "Stars shine brightest in the dark night". - All words are spelled correctly. - All words are fully readable. - The background is a dark night sky, not a clean off-white background as specified, but the text itself matches the intended phrase exactly. The only deviation from the intended description is the background color, but the question specifically asks whether the text is spelled correctly and readable — not whether the background matches. VERDICT: PASS
local/nucleus-image-17b-a2b | FAIL | Evaluator (Qwen 3 VL)
The text in the image contains the following errors: - “all this items” should be “all these items” — “this” is incorrect; “these” is the correct plural pronoun to refer to “items”. - The fine print reads “See store for details.” — this is missing the word “the” before “store”, making it grammatically incorrect. It should be “See the store for details.” These are both spelling/grammar errors that make the text inaccurate compared to the intended text. VERDICT: FAIL
local/nucleus-image-17b-a2b | FAIL | Evaluator (Qwen 3 VL)
The text in the image contains multiple errors compared to the intended message: 1. “Grand open 4” — “open 4” is incorrect; should be “Grand Opening”. 2. “for fouk, food music, and fun,” — “fouk” is a misspelling of “food”. 3. “Join-m us, musking” — “Join-m us” is a misspelling of “Join us”; “musking” is a misspelling of “music”. 4. The intended subtitle “Join us for food, music, and fun” is partially obscured by the hand and pen, but the visible text is incorrect. The disclaimer “RSVP require
local/nucleus-image-17b-a2b | FAIL | Evaluator (Qwen 3 VL)
The text on the banner is mostly correct, but there is one misspelling: - "Doors opet 6 PM." — The word "opet" is misspelled; it should be "open". All other words are spelled correctly and fully readable. VERDICT: FAIL
local/nucleus-image-17b-a2b | FAIL | Evaluator (Qwen 3 VL)
The text on the poster contains multiple errors and is not fully readable: 1. Tagline: “ONE FANLOW. APANS MOURIRING BUCK...” — This is garbled and does not match the intended “One ship. One chance. No turning back.” The words are misspelled and nonsensical. 2. Credits: - “DIRECTED BYHAISE” — Should be “Directed by James Harlow” — misspelled and incomplete. - “STARRING CHEN - SINGEVS” — Should be “Starring Elena Voss, Marcus Chen, and Sofia Reyes” — “SINGEVS” is not “Sofia Reyes”; “Elena
local/nucleus-image-17b-a2b | FAIL | Evaluator (Qwen 3 VL)
The intended text was: - Magazine title: "NATIONAL EXPLORER" - Cover lines: 1. "The Hidden Temples of Cambodia — page 34" 2. "Interview: Dr. Maria Santos on Climate Action" 3. "Top 10 Trails You Must Hike This Year" - Barcode area: "Issue 247 | March 2025 | $6.99" Actual text in the image: - Magazine title: "NADU" (incorrect spelling; should be "NATIONAL EXPLORER") - Cover lines: 1. "Netial Narial of Cambonnda - 344" → "The Hidden Temples of Cambodia — page 34" → misspelled: "Netial" (
local/nucleus-image-17b-a2b | FAIL | Evaluator (Qwen 3 VL)
The text on the label contains multiple errors: 1. On the lid: “ORGANIC” is misspelled as “ORGANICI”. 2. On the main label: - “12 oz before” is incorrect — should be “12 oz (340g)”. - “626 willflower harvanted” — “willflower” is misspelled (should be “wildflower”), “harvanted” is misspelled (should be “harvested”). - “in the 0 Sustainable apiraiey fm bate” — “0” is not a word, “apiraiey” is misspelled (should be “apiaries”), “fm bate” is illegible and incorrect (should be “from the Pac
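The complete evaluator responses above end with a `VERDICT: PASS` or `VERDICT: FAIL` marker, while several responses are cut off before reaching it. A minimal sketch of how such responses could be tallied, assuming the last marker in the text is the one that counts; the name `parse_verdict` and the abbreviated sample strings are illustrative and are not taken from the evaluation harness:

```python
import re
from typing import Optional

# Marker that closes each complete evaluator response.
VERDICT_RE = re.compile(r"VERDICT:\s*(PASS|FAIL)\b")

def parse_verdict(response: str) -> Optional[str]:
    """Return "PASS" or "FAIL", or None when no verdict marker is present."""
    matches = VERDICT_RE.findall(response)
    # Use the last marker in case the response quotes one earlier on.
    return matches[-1] if matches else None

# Abbreviated samples in the style of the responses above (hypothetical inputs).
responses = [
    'The word "ADVENTURE" is spelled correctly and is fully readable. VERDICT: PASS',
    'The word "opet" is misspelled; it should be "open". VERDICT: FAIL',
    "The text on the label contains multiple errors: 1. On the lid",  # truncated
]

verdicts = [parse_verdict(r) for r in responses]
print(verdicts)
```

Returning `None` for a truncated response keeps a missing marker distinct from a real FAIL, so a cut-off response is not silently counted either way.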
Spatial Reasoning 61% vs 97%
Attributes Binding 78% vs 100%
local/nucleus-image-17b-a2b | PASS | Evaluator (Qwen 3.5 122B)
The image contains a red sphere (ball) on the left and a blue cube on the right. Both objects are positioned side by side on a white surface. The colors and shapes are correctly assigned to their respective objects as requested. VERDICT: PASS
local/nucleus-image-17b-a2b | PASS | Evaluator (Qwen 3.5 122B)
The image displays a green apple on the left and an orange pumpkin on the right, both resting on a wooden cutting board. The colors and identities of the objects match the prompt perfectly without any attribute swapping. * **Green Apple:** Correctly identified as green and apple-shaped. * **Orange Pumpkin:** Correctly identified as orange and pumpkin-shaped. * **Wooden Cutting Board:** Correctly identified as the surface. VERDICT: PASS
local/nucleus-image-17b-a2b | PASS | Evaluator (Qwen 3.5 122B)
The image displays two vehicles on a street. 1. **Yellow Taxi:** In the foreground, there is a yellow sedan with a taxi roof sign and typical taxi markings. 2. **Black Limousine:** In the background, there is a long, black stretched vehicle, clearly a limousine. The attributes are correctly bound: the yellow vehicle is the taxi, and the black vehicle is the limousine. They are positioned next to each other on the street as requested. VERDICT: PASS
local/nucleus-image-17b-a2b | PASS | Evaluator (Qwen 3.5 122B)
The image contains exactly what was requested: 1. **Large red circle:** Present at the top. 2. **Small blue triangle:** Present at the bottom left. 3. **Medium green star:** Present at the bottom right. 4. **White canvas:** The background is white. All attributes (color, size, shape) are correctly bound to the correct objects with no swaps. VERDICT: PASS
local/nucleus-image-17b-a2b | PASS | Evaluator (Qwen 3.5 122B)
The image contains three candles on a shelf matching the description perfectly: 1. **Left:** A tall, thin, white candle. 2. **Middle:** A short, fat, red candle. 3. **Right:** A medium-height, spiral-shaped, yellow candle. All attributes (color, size, shape) are correctly bound to the respective objects with no swaps. VERDICT: PASS
local/nucleus-image-17b-a2b | PASS | Evaluator (Qwen 3.5 122B)
The image contains a large brown horse, a small white rabbit, and a black cat in a grassy meadow. All attributes match the prompt perfectly. - **Horse:** Big, brown. - **Rabbit:** Small, white. - **Cat:** Medium-sized, black. - **Setting:** Meadow. VERDICT: PASS
local/nucleus-image-17b-a2b | FAIL | Evaluator (Qwen 3.5 122B)
Based on my examination of the image and the provided prompt ("A pink fire truck, a blue banana, and a green sun in a surreal landscape"): **Object Analysis:** 1. **Fire Truck:** The truck is clearly **pink**. This matches the prompt. 2. **Banana:** There is a large, prominent banana in the foreground which is **green**. The prompt requested a **blue** banana. While there is a blue, banana-shaped object attached to the ladder on the truck, the primary "banana" object in the scene is green.
local/nucleus-image-17b-a2b | FAIL | Evaluator (Qwen 3.5 122B)
The image contains a purple carrot and a red tree with red leaves, which match the prompt. However, the prompt requested a "white watermelon sliced open." The image shows a whole, uncut white watermelon with green stripes. The "sliced open" attribute is missing. VERDICT: FAIL
local/nucleus-image-17b-a2b | PASS | Evaluator (Qwen 3.5 122B)
The image displays three distinct objects on a table. 1. **Left:** An orange elephant. 2. **Middle:** A penguin with a bright yellow belly (matching "yellow penguin"). 3. **Right:** A strawberry that is silver/metallic in color. All objects match the requested colors and types without any swaps. VERDICT: PASS
Compositionality 89% vs 100%
local/nucleus-image-17b-a2b | PASS | Evaluator (Qwen 3.5 122B)
- **Tabby cat:** Present (The cat has distinct tabby striping). - **Oversized:** Present (The glasses are large relative to the cat's face). - **Red sunglasses:** Present (The frames are bright red with dark lenses). All elements are clearly visible and combined coherently. VERDICT: PASS
local/nucleus-image-17b-a2b | PASS | Evaluator (Qwen 3.5 122B)
- **Golden Retriever:** Present. The dog clearly exhibits the physical characteristics of a Golden Retriever (golden coat, floppy ears, facial structure). - **Wearing:** Present. The hat is positioned on the dog's head. - **Tiny cowboy hat:** Present. The dog is wearing a brown cowboy hat that is proportionate to its head size. All described elements are present and combined coherently. VERDICT: PASS
local/nucleus-image-17b-a2b | PASS | Evaluator (Qwen 3.5 122B)
The image clearly depicts a penguin standing in a snowy environment. The penguin is holding a large umbrella with distinct rainbow-colored panels (red, yellow, green, blue, etc.). All elements from the prompt are present and combined coherently. * **Penguin:** Present. * **Holding:** Present. * **Colorful umbrella:** Present. VERDICT: PASS
local/nucleus-image-17b-a2b | PASS | Evaluator (Qwen 3.5 122B)
- **Medieval knight:** Present. The figure is clearly dressed as a knight. - **Full armor:** Present. The figure is wearing a complete suit of plate armor including a helmet and gauntlets. - **Sitting at a desk:** Present. The knight is seated on a wooden chair at a wooden table. - **Typing on a modern laptop:** Present. A modern-looking laptop is on the desk, and the knight's armored hands are positioned on the keyboard as if typing. All described elements are clearly visible and integrated
local/nucleus-image-17b-a2b | PASS | Evaluator (Qwen 3.5 122B)
- **Roman gladiator:** Present (man in armor with helmet and plume). - **Taking a selfie:** Present (holding phone up, looking at it). - **Smartphone:** Present (black smartphone held in hand). - **Colosseum:** Present (distinctive arched architecture in background). All described elements are clearly visible and combined into a single, coherent scene. VERDICT: PASS
local/nucleus-image-17b-a2b | PASS | Evaluator (Qwen 3.5 122B)
- **Astronaut:** Present (The figure is wearing a full spacesuit). - **Spacesuit:** Present (Detailed white suit with patches and helmet). - **Riding a bicycle:** Present (The figure is seated on and operating a black bicycle). - **Park:** Present (Background features green grass, trees, benches, and a paved path). All elements are clearly visible and combined coherently in a single scene. VERDICT: PASS
local/nucleus-image-17b-a2b | PASS | Evaluator (Qwen 3.5 122B)
The image contains all the requested elements: - **Octopus:** Present (large pink octopus). - **Top hat:** Present (black hat with red band). - **Monocle:** Present (gold-rimmed lens over one eye). - **Playing chess:** Present (chessboard with pieces). - **Surface of the moon:** Present (cratered grey ground). - **Earth visible in the background:** Present (blue planet in the sky). All elements are clearly depicted and coherently combined. VERDICT: PASS
local/nucleus-image-17b-a2b | PASS | Evaluator (Qwen 3.5 122B)
- **Dinosaur:** Present (Green T-Rex). - **Business suit:** Present (Dark jacket, white shirt, blue tie). - **PowerPoint presentation:** Present (Screen with charts/graphs in background). - **Modern office:** Present (Conference table, laptop, screen). - **Drinking coffee:** Present (Holding cup, liquid pouring into mouth). All elements are clearly visible and integrated into a coherent scene. VERDICT: PASS
local/nucleus-image-17b-a2b | FAIL | Evaluator (Qwen 3.5 122B)
The image contains most elements but fails on the specific count and action regarding the kittens. * **Robot dressed as a pirate:** Present (wearing a pirate hat and chest emblem). * **Paper boat:** Present (white origami boat). * **Lake of lava:** Present (glowing red liquid). * **Juggling three kittens:** **FAIL**. There are six kittens visible (two held in hands, four floating/jumping in the air), not three. Additionally, the action resembles floating or jumping rather than a clear
Counting 11% vs 89%
local/nucleus-image-17b-a2b | FAIL | Evaluator (Qwen 3.5 122B)
I see 4 red apples arranged on the wooden table. The prompt requested exactly 3 red apples. Since the actual count is 4, the count does not match the request. VERDICT: FAIL
local/nucleus-image-17b-a2b | FAIL | Evaluator (Qwen 3.5 122B)
I count 6 yellow rubber ducks sitting in a row on the shelf. The prompt requested exactly 5. Since the actual count (6) does not match the requested count (5), the image fails the specific constraint. VERDICT: FAIL
local/nucleus-image-17b-a2b | PASS | Evaluator (Qwen 3.5 122B)
I see exactly 4 white candles standing on a dark surface. The count matches the request perfectly. VERDICT: PASS
local/nucleus-image-17b-a2b | FAIL | Evaluator (Qwen 3.5 122B)
The image shows 2 orange cats, which matches the request. However, there are only 2 black dogs visible, whereas the prompt requested 3 black dogs. Since the count of dogs is incorrect, the image does not fully satisfy the prompt. VERDICT: FAIL
local/nucleus-image-17b-a2b | FAIL | Evaluator (Qwen 3.5 122B)
I count 6 red roses and 3 white daisies in the image. The prompt requested 4 red roses and 2 white daisies. Since the actual counts (6 roses, 3 daisies) do not match the requested counts (4 roses, 2 daisies), the image does not meet the specific criteria. VERDICT: FAIL
local/nucleus-image-17b-a2b | FAIL | Evaluator (Qwen 3.5 122B)
The image contains 2 blue butterflies (one on the wood, one on the wire) and 3 yellow butterflies (two on the wood, one on the wire). The prompt requested exactly 1 blue butterfly. Since there is an extra blue butterfly present, the count does not match the request. VERDICT: FAIL
local/nucleus-image-17b-a2b | FAIL | Evaluator (Qwen 3.5 122B)
I count 11 goldfish in the image. 1. Top left 2. Top middle (small) 3. Top right 4. Middle left 5. Middle center-left 6. Middle center 7. Middle right 8. Bottom left 9. Bottom center 10. Bottom right 11. Far right edge (small) The prompt requested exactly 9 goldfish. The actual count is 11. VERDICT: FAIL
local/nucleus-image-17b-a2b | FAIL | Evaluator (Qwen 3.5 122B)
The image contains 3 bananas, 3 oranges, and 2 apples. The prompt requested 4 oranges, but only 3 are visible in the bowl. Therefore, the count does not exactly match the request. VERDICT: FAIL
local/nucleus-image-17b-a2b | FAIL | Evaluator (Qwen 3.5 122B)
I count 12 balloons in the image: 1. Cyan (top left) 2. Orange (top center) 3. Green (top right) 4. Yellow (middle left) 5. Yellow (middle center) 6. Purple (middle) 7. Dark Blue (top right) 8. Blue (middle right) 9. Red (far right) 10. Pink (bottom left) 11. Dark purple/grey (center, partially hidden) 12. Green (bottom right) The prompt requested exactly 11 balloons, but there are 12. VERDICT: FAIL
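The counting verdicts above all follow one rule: the image passes only when every requested object type appears in exactly the requested quantity. A small sketch of that rule, assuming strict equality on each requested type. The function `counts_match` is a hypothetical helper for illustration, since the actual judgment here is made by the evaluator model rather than by code:

```python
def counts_match(requested: dict[str, int], observed: dict[str, int]) -> bool:
    """True only if every requested object type appears exactly as often as asked."""
    return all(observed.get(name, 0) == n for name, n in requested.items())

# Counts as reported in three of the evaluator responses above.
roses_daisies = counts_match(
    {"red rose": 4, "white daisy": 2},
    {"red rose": 6, "white daisy": 3},
)
cats_dogs = counts_match(
    {"orange cat": 2, "black dog": 3},
    {"orange cat": 2, "black dog": 2},
)
candles = counts_match({"white candle": 4}, {"white candle": 4})

print(roses_daisies, cats_dogs, candles)
```

Under this rule a single off-by-one in any object type fails the whole image, which matches the cats-and-dogs case where the cat count was correct but the dog count was not.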
Negation 44% vs 89%
local/nucleus-image-17b-a2b | PASS | Evaluator (Qwen 3.5 122B)
The image depicts a serene tropical beach scene featuring palm trees, blue ocean water, and a sandy shore. I have carefully scanned the entire image, including the beach, the water, and the background foliage. There are absolutely no human figures, swimmers, or people visible anywhere in the scene. The exclusion criteria is fully met. VERDICT: PASS
local/nucleus-image-17b-a2b | FAIL | Evaluator (Qwen 3.5 122B)
The image shows a tabby cat lying on a grey couch. However, contrary to the exclusion constraint, the cat is clearly wearing a black collar with a gold buckle around its neck. Since the excluded element (a collar) is present, the image does not meet the specific negative constraint. VERDICT: FAIL
local/nucleus-image-17b-a2b | PASS | Evaluator (Qwen 3.5 122B)
The image depicts an empty classroom setting. I see rows of wooden desks and chairs facing a large green chalkboard mounted on the wall. The room is completely devoid of people; there are absolutely no students or teachers visible in the frame. The exclusion criteria "no students" is fully met. VERDICT: PASS
local/nucleus-image-17b-a2b | FAIL | Evaluator (Qwen 3.5 122B)
The image displays a sandy beach with palm trees, seashells, and a yellow beach umbrella as requested. However, contrary to the exclusion criteria, there is clearly a blue ocean visible in the background, stretching across the horizon behind the umbrella and sand. The water is a prominent feature of the scene. VERDICT: FAIL
local/nucleus-image-17b-a2b | PASS | Evaluator (Qwen 3.5 122B)
The image shows a clean office workspace with a desk, a black office chair, a white desk lamp, a potted plant, stacks of paper, an open notebook, a green mug, and a filing cabinet. There are no computers (laptops or desktops) or screens (monitors) visible in the scene. The black object on the papers appears to be a smartphone or a small case, but it does not constitute a computer or screen in the context of the exclusion. VERDICT: PASS
local/nucleus-image-17b-a2b | FAIL | Evaluator (Qwen 3.5 122B)
The image shows a kitchen scene with pots, pans, a stove, and a cutting board as requested. However, on the cutting board in the foreground, there is a small bowl containing a white substance that appears to be chopped food (likely onions, garlic, or cheese). Since the prompt explicitly excluded food, this element violates the constraint. VERDICT: FAIL
local/nucleus-image-17b-a2b | FAIL | Evaluator (Qwen 3.5 122B)
The image shows a pizza covered in a significant layer of melted, yellowish-white cheese with browned spots. The prompt explicitly requested "absolutely no cheese," but the image is dominated by it. VERDICT: FAIL
local/nucleus-image-17b-a2b | PASS | Evaluator (Qwen 3.5 122B)
The image displays a motion-blurred view of a vehicle, likely a car, characterized by horizontal streaks of blue and white light. The blur obscures fine details, but the general shape suggests a car body with horizontal lines indicating windows or door panels. The bottom section is dark and indistinct. Crucially, there are no visible wheels; the lower area is just a dark shadow or blur, consistent with the "no wheels" constraint. The excluded element (wheels) is absent. VERDICT: PASS
local/nucleus-image-17b-a2b | FAIL | Evaluator (Qwen 3.5 122B)
The image shows a human face with eyes, nose, and mouth. However, the prompt explicitly requested "no eyebrows." The image clearly displays eyebrows—specifically, there are realistic eyebrows above the eyes, and additionally, sketch-like drawn eyebrows are visible just above the natural ones. Since the excluded element (eyebrows) is present, the image fails the criteria. VERDICT: FAIL
Relative Position 83% vs 100%
local/nucleus-image-17b-a2b | FAIL | Evaluator (Qwen 3.5 122B)
The image shows a cat sitting **inside** an open cardboard box, with its front paws resting on the rim. The prompt specifies the cat is "on top of" the box. While the cat is elevated, it is clearly positioned within the interior volume of the box rather than on the upper surface of a closed box or balanced on top. The spatial relationship "on top of" is inaccurate compared to "in" or "inside". VERDICT: FAIL
local/nucleus-image-17b-a2b | PASS | Evaluator (Qwen 3.5 122B)
The image displays a wooden chair with a red ball positioned directly beneath its seat, resting on the floor between the legs. This arrangement perfectly matches the prompt's description of the ball being "underneath" the chair. VERDICT: PASS
local/nucleus-image-17b-a2b | PASS | Evaluator (Qwen 3.5 122B)
The image shows a grey coffee mug positioned to the left and an open book positioned to the right. Both objects are resting on a flat, light-colored surface, consistent with a desk. The spatial arrangement matches the description "next to" and "on a desk" perfectly. VERDICT: PASS
local/nucleus-image-17b-a2b | PASS | Evaluator (Qwen 3.5 122B)
The image shows a bird perched on a branch that extends from a tree trunk on the right side. In the lower left foreground, there is a park bench. The spatial arrangement matches the description: the bird is on the branch, the branch is part of the tree, and the tree is situated next to the bench in the park setting. VERDICT: PASS
local/nucleus-image-17b-a2b | PASS | Evaluator (Qwen 3.5 122B)
The image shows a tabby and white cat sitting upright on a white cushion. This cushion is placed directly on the seat of a light wooden chair. The spatial relationships described in the prompt are accurate: the cat is on the pillow, and the pillow is on the chair. VERDICT: PASS
local/nucleus-image-17b-a2b | PASS | Evaluator (Qwen 3.5 122B)
The image shows a lit candle sitting directly on top of a vertical stack of books. To the right of this stack, there is a green potted plant. The spatial relationships described in the prompt ("candle on top of a stack of books" and "books next to a potted plant") are accurately represented in the image. VERDICT: PASS
local/nucleus-image-17b-a2b | FAIL | Evaluator (Qwen 3.5 122B)
The image shows a central golden King, but the surrounding pieces do not match the prompt's description. To the left is a golden Knight, not a Queen. To the right is a brown Knight, not a Bishop. In front (below) the King is a brown Pawn, not a Knight. Behind (above) the King are other pieces, but not a Rook directly aligned. Additionally, there are more than five pieces visible (at least seven). The spatial relationships are incorrect. VERDICT: FAIL
local/nucleus-image-17b-a2b | PASS | Evaluator (Qwen 3.5 122B)
The image perfectly matches the described spatial arrangement. The laptop is centrally placed on the desk. To its left sits a coffee mug, and to its right stands a phone. A lamp hangs directly above the setup, casting light downwards. In the foreground, positioned "below" the laptop from the viewer's perspective, lies an open notebook with a pen. All positional relationships are accurate. VERDICT: PASS
local/nucleus-image-17b-a2b | PASS | Evaluator (Qwen 3.5 122B)
The image displays five animals standing in a row in a grassy field. From left to right, the arrangement is: a brown horse on the far left, a black and white cow immediately to its right, a sheep in the center, a white goat to the right of the sheep, and a golden dog on the far right. This perfectly matches the specified spatial relationships. VERDICT: PASS
local/nucleus-image-17b-a2b | PASS | Evaluator (Qwen 3.5 122B)
The image displays a red barn situated in the foreground on the right side, appearing large. To its left, a windmill stands in the midground area, appearing slightly smaller in visual mass but tall. Behind both structures, a range of snow-capped mountains spans the background, appearing smaller due to distance. The depth perspective is consistent, with the field in the immediate front, the structures in the middle, and the mountains in the far back. VERDICT: PASS
local/nucleus-image-17b-a2b | PASS | Evaluator (Qwen 3.5 122B)
The image displays a clear depth progression matching the prompt. A large sandcastle sits in the immediate foreground at the bottom. A surfer is positioned in the middle distance on the waves. A cruise ship is visible on the horizon line in the background. The relative sizes and positions (foreground to background) are accurate. VERDICT: PASS
local/nucleus-image-17b-a2b | PASS | Evaluator (Qwen 3.5 122B)
The image displays a clear depth progression. In the immediate foreground, a park bench sits on the left with numerous pigeons scattered on the pavement. In the midground, a large fountain is centrally located. In the background, tall skyscrapers rise above the tree line. The spatial arrangement of foreground (bench/pigeons), midground (fountain), and background (skyscrapers) is accurate. VERDICT: PASS
Scale & Proportions 56% vs 100%
local/nucleus-image-17b-a2b | PASS | Evaluator (Qwen 3.5 122B)
The image shows a person standing next to a large yellow school bus. The person's height relative to the bus appears realistic; their head reaches roughly the lower section of the bus's side, well below the roofline, which is consistent with the actual dimensions of a school bus (approx. 10-11 feet tall) versus an adult human. The wheels and bumper sizes also align correctly with the person's scale. VERDICT: PASS
local/nucleus-image-17b-a2b | FAIL | Evaluator (Qwen 3.5 122B)
The image shows a ladybug that appears disproportionately large relative to the leaf. In a realistic scenario, a ladybug (typically 4-8mm) would appear much smaller on a "large" leaf. Here, the ladybug occupies a significant portion of the visible leaf area, suggesting the scale is off or the leaf is actually quite small, contradicting the "large" descriptor and "realistic proportions" requirement. VERDICT: FAIL
local/nucleus-image-17b-a2b | PASS | Evaluator (Qwen 3.5 122B)
The image shows a young boy holding a basketball. The size of the basketball relative to the child's torso and hands appears realistic and consistent with real-world proportions for a child of that age holding a standard or youth-sized basketball. The ball is large but not disproportionately huge, and the grip looks natural. VERDICT: PASS
local/nucleus-image-17b-a2b | PASS | Evaluator (Qwen 3.5 122B)
The image depicts a miniature figurine standing on the rim of a white coffee cup filled with dark liquid. The figure is significantly smaller than the cup, accurately reflecting the "tiny person" description in the prompt. The relative scale is consistent and realistic for a miniature photography concept. VERDICT: PASS
local/nucleus-image-17b-a2b | PASS | Evaluator (Qwen 3.5 122B)
The image successfully depicts a tabby cat that is enormous in scale relative to its surroundings. The cat's paws are larger than the cars visible in the background, and its body height rivals the lower stories of the surrounding skyscrapers. This accurately reflects the "giant" and "kaiju" size relationships requested in the prompt. VERDICT: PASS
local/nucleus-image-17b-a2b | PASS | Evaluator (Qwen 3.5 122B)
The image shows a small model house resting comfortably in the center of an open human palm. The scale is consistent with a "miniature house," as it fits entirely within the hand without looking disproportionately large or tiny. The proportions are logical for the described scene. VERDICT: PASS
local/nucleus-image-17b-a2b | FAIL | Evaluator (Qwen 3.5 122B)
The image fails to meet the prompt's requirements in two main ways. First, the prompt requested a "dog," but the image contains three cats instead. Second, and more critically regarding proportions, the mouse is disproportionately large. In reality, a mouse is significantly smaller than a cat (roughly 1/4 to 1/5 the size), but in the image, the mouse is nearly as tall as the sitting cats. VERDICT: FAIL
local/nucleus-image-17b-a2b | FAIL | Evaluator (Qwen 3.5 122B)
The image is missing the salt shaker and peppercorn entirely, so the requirement for "all" objects to be present with realistic sizes cannot be met. Additionally, the fork in the foreground is disproportionately large compared to the plate and cutlery in the background, indicating inconsistent scaling and perspective. VERDICT: FAIL
local/nucleus-image-17b-a2b | FAIL | Evaluator (Qwen 3.5 122B)
The image contains several objects with incorrect relative sizes. Specifically, the chickens are disproportionately large. The chicken perched on the fence post is nearly as tall as the farmer standing in the background, which is physically impossible in reality. The other chickens are also oversized relative to the cow and the grass. Therefore, the proportional scale is not correct. VERDICT: FAIL
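The Spatial Reasoning figure of 61% appears to be a pooled pass rate over all items in its six sub-categories rather than a mean of the six sub-category percentages. The sketch below checks that reading; the (passed, total) tallies are counted from the verdicts listed above, and the pooling interpretation is an inference from the numbers, not something the page states:

```python
# (passed, total) per sub-category, tallied from the verdicts listed above.
tallies = {
    "Attributes Binding": (7, 9),
    "Compositionality": (8, 9),
    "Counting": (1, 9),
    "Negation": (4, 9),
    "Relative Position": (10, 12),
    "Scale & Proportions": (5, 9),
}

def pct(passed: int, total: int) -> int:
    """Pass rate as a whole-number percentage."""
    return round(100 * passed / total)

per_category = {name: pct(p, t) for name, (p, t) in tallies.items()}

# Pooled: every item weighs the same regardless of its sub-category.
pooled = pct(sum(p for p, _ in tallies.values()),
             sum(t for _, t in tallies.values()))

# Macro: every sub-category weighs the same regardless of its size.
macro = round(sum(100 * p / t for p, t in tallies.values()) / len(tallies))

print(per_category)
print("pooled:", pooled, "macro:", macro)
```

The pooled figure reproduces the reported 61% (35 of 57 items), while the macro average comes out one point lower, since Relative Position contributes 12 items where the other sub-categories contribute 9.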
Human realism 79% vs 93%
Faces & Expressions 92% vs 100%
local/nucleus-image-17b-a2b | PASS | Evaluator (Qwen 3 VL)
The face in the image is largely symmetrical with correctly proportioned features: eyes, nose, and mouth are well-placed and balanced. The smile appears natural and warm, with visible teeth and slight crinkling around the eyes that conveys genuine happiness. There are no obvious distortions—no asymmetrical facial structures, unnatural eye shapes, or misaligned features. The lighting and skin texture look realistic, and the emotion is clearly readable as joyful and sincere. VERDICT: PASS
local/nucleus-image-17b-a2b | PASS | Evaluator (Qwen 3 VL)
The face in the image is largely symmetrical with correctly rendered features: the eyes, nose, and mouth are balanced, and the hairline and ear placement are proportionate. The man appears middle-aged, with visible signs of aging such as wrinkles around the eyes and forehead, and gray streaks in his hair — consistent with the prompt. His expression is calm and serious, conveyed through slightly narrowed eyes, a neutral mouth, and a focused gaze — all of which are clearly readable. There are no s
local/nucleus-image-17b-a2b | PASS | Evaluator (Qwen 3 VL)
The face in the image is symmetric and features are proportionally correct — eyes, nose, mouth, and ears are well-placed and balanced. The teenager’s expression is neutral and relaxed, with no strong emotion conveyed, matching the prompt. There are no visible distortions; the facial structure is natural and well-rendered. The lighting and composition support a clear, clean headshot. VERDICT: PASS
local/nucleus-image-17b-a2b | PASS | Evaluator (Qwen 3 VL)
The face in the image is a close-up showing a strong expression of disgust. The nose is visibly wrinkled, and the lips are curled downward and outward, consistent with the prompt. The eyebrows are drawn together and furrowed, enhancing the expression. The facial features are largely symmetric and anatomically plausible, with no major distortions or unnatural artifacts. The emotion is clearly readable — the expression is unmistakably disgust. VERDICT: PASS
local/nucleus-image-17b-a2bPASSEvaluator (Qwen 3 VL)
The face is largely symmetric with correctly rendered features: the eyes, eyebrows, nose, and mouth are well-proportioned and aligned. The intended emotion — genuine surprise — is clearly readable through the wide eyes, raised eyebrows, and slightly open mouth, all of which are accurately depicted. There are no significant distortions; the facial structure is natural and the expression is coherent and convincing. VERDICT: PASS
local/nucleus-image-17b-a2bPASSEvaluator (Qwen 3 VL)
The face in the image is largely symmetric with correctly rendered features — eyes, nose, and mouth are proportionally aligned and anatomically plausible. The lighting and shadows are consistent with a somber, low-key mood. The intended emotion — quiet melancholy — is clearly readable: the eyes are downcast, the mouth is slightly drawn and downturned, and the gaze is distant and introspective. There are no significant distortions; the facial structure is natural, and the expression is coherent
local/nucleus-image-17b-a2bPASSEvaluator (Qwen 3 VL)
The face is largely symmetrical, with both eyes, eyebrows, and the curve of the smile appearing balanced. The features are anatomically plausible: the nose, ears, and mouth are correctly proportioned and positioned. The deep wrinkles around the eyes and mouth are natural and consistent with an elderly person’s facial aging. The white hair is realistic in texture and distribution. The emotion is clearly readable — the warm, genuine smile with crinkled eyes conveys kindness and joy. There are no s
local/nucleus-image-17b-a2bPASSEvaluator (Qwen 3 VL)
The face is largely symmetric with correct anatomical features: the eyes, nose, and mouth are centered and proportionate, and the tear streams are symmetrically falling from each eye. The child’s expression clearly conveys crying — eyebrows are furrowed, eyes are squeezed shut, and the mouth is open in a wail. The “round cheeks” and “large eyes” are also well-represented. There are no significant distortions; the features are natural and emotionally expressive without unnatural warping. VERDICT
local/nucleus-image-17b-a2bPASSEvaluator (Qwen 3 VL)
The face is largely symmetric with correct anatomical features — the eyes, nose, and mouth are well-proportioned and aligned. The age spots are abundant and concentrated on the forehead and cheeks, consistent with the prompt. Thinning gray hair is visible, especially on the sides and crown. The expression is clearly skeptical or disapproving — the brow is furrowed, lips are downturned, and the gaze is intense — all of which convey the intended emotion. There are no major distortions; the facial
local/nucleus-image-17b-a2bFAILEvaluator (Qwen 3 VL)
The image shows five people, not three, and the facial expressions do not fully match the prompt’s requested emotions. - The man on the far left is laughing with his head tilted back — this matches the first requested emotion. - The woman in the center has wide eyes and an open mouth — this matches the “shocked” expression. - The man on the far right is covering his eyes with his hand and has a pouty, annoyed expression — this does not clearly convey “rolling eyes with a smirk”; it looks more l
local/nucleus-image-17b-a2bPASSEvaluator (Qwen 3 VL)
All four faces in the image are symmetric with correct anatomical features — eyes, nose, mouth, and ears are properly aligned and proportioned without noticeable distortions. Emotions are clearly readable and match the prompt: - The father (left) is beaming with pride — wide smile, crinkled eyes. - The mother (right) is smiling gently — soft, closed-mouth smile. - The teenager (back center) looks bored — neutral to slightly downturned mouth, lack of engagement. - The child (front center) is gri
local/nucleus-image-17b-a2bPASSEvaluator (Qwen 3 VL)
All three faces are symmetric with correct, clearly defined features appropriate for cartoon-style illustration. The intended emotions are clearly readable: - Left face: Frustrated with furrowed brows, downturned mouth, and tense expression — correctly conveyed. - Center face: Nodding with a pleased, smirking expression, hand near chin — clearly shows satisfaction. - Right face: Confused with tilted head, wide eyes, and hand on cheek — accurately depicts bewilderment. No distortions are presen
Full Body 92%vs100%
local/nucleus-image-17b-a2bPASSEvaluator (Qwen 3 VL)
The image shows a full-body view of a woman standing upright with her arms at her sides, wearing a white t-shirt and blue jeans — exactly matching the prompt. - Limb count: Correct — two arms, two legs, visible and properly attached. - Proportions: Anatomically accurate — head, torso, limbs, and joints appear proportionate and natural. - Pose: The woman is standing upright in a relaxed, neutral pose with arms at her sides — fully recognizable and consistent with the prompt. - Distortions: None
local/nucleus-image-17b-a2bPASSEvaluator (Qwen 3 VL)
The image shows a full-body shot of a man standing straight, facing the camera, with both hands in his pockets. His proportions appear normal: head, torso, and limbs are in proportion to each other without obvious exaggeration or distortion. He has two arms, two legs, and the correct number of fingers — no limb count anomalies are visible. The pose is clearly recognizable as “standing straight with hands in pockets,” and the posture is natural and anatomically plausible. There are no significan
local/nucleus-image-17b-a2bPASSEvaluator (Qwen 3 VL)
The image shows a full-body view of a person standing in a park, viewed from behind. The person is in a neutral, upright stance with feet shoulder-width apart, hands resting at the sides, and head straight. The pose is recognizable as a standard standing posture, commonly used in photography or AI-generated imagery to convey calmness or observation. Anatomical inspection: - Limb count: Two arms, two legs — correct. - Proportions: The torso, arms, and legs appear proportionate relative to each o
local/nucleus-image-17b-a2bPASSEvaluator (Qwen 3 VL)
The image shows a woman in mid-stride running through a park, with her hair and clothes exhibiting motion blur consistent with movement. Her limbs are correctly proportioned — two arms, two legs, with natural joint articulation. The pose is anatomically plausible for running: one leg forward, the other back, arms bent and swinging opposite to legs, torso slightly forward. The motion is captured with a shallow depth of field and background blur, enhancing the sense of speed. There are no obvious
local/nucleus-image-17b-a2bPASSEvaluator (Qwen 3 VL)
The image shows a man mid-air, seemingly jumping over a puddle. His body is in a dynamic pose with bent knees and arms extended for balance — consistent with the described action. Limb count is correct: two arms, two legs, and the torso are all present. Proportions appear anatomically plausible — the torso-to-leg ratio, arm length, and joint angles look natural for a human in mid-air. The pose is recognizable as a jump, and the splash in the puddle confirms the action is occurring. There are no
local/nucleus-image-17b-a2bPASSEvaluator (Qwen 3 VL)
The dancer in the image is captured mid-spin, with one leg extended outward and the other supporting her weight en pointe — a pose that is both dynamic and recognizable as a classical ballet turn. The dress flows outward in a way that suggests motion, consistent with the spinning action. Limb count is correct: two arms, two legs, and the torso are all intact and properly formed. Proportions appear anatomically accurate — the dancer’s body is slender but proportionate, with no obvious distortions
local/nucleus-image-17b-a2bPASSEvaluator (Qwen 3 VL)
The image shows a person performing the yoga tree pose (Vrksasana). The pose is recognizable: one leg is standing firmly on the ground, the other foot is pressed against the inner thigh of the standing leg, and the arms are raised overhead with palms together in prayer position. Limb count is correct — two arms, two legs, two feet, and two hands are clearly visible and properly formed. Proportions appear anatomically accurate — the torso, limbs, and head are proportionally rendered without obvi
local/nucleus-image-17b-a2bPASSEvaluator (Qwen 3 VL)
The climber in the image has two arms and two legs — limb count is correct. Proportions appear anatomically plausible: the torso, limbs, and head are proportionally scaled relative to each other, with no obvious skeletal or muscular distortions. The pose is recognizable as a dynamic climbing stance — one arm reaching upward to a hold, the other gripping a rope, legs positioned for balance and leverage on footholds. The body is angled appropriately for climbing, and the use of climbing shoes and
local/nucleus-image-17b-a2bPASSEvaluator (Qwen 3 VL)
The image shows a cellist seated with the cello positioned between their legs, supported by a stand. The bow is extended horizontally, held in the right hand, and appears to be in motion across the strings. The left hand is placed on the fingerboard, consistent with playing. The posture is upright and appears correct for a seated cellist. Limb count is normal: two arms, two legs, and the torso are intact and proportionally rendered. The proportions of the body and instrument are realistic, with
local/nucleus-image-17b-a2bPASSEvaluator (Qwen 3 VL)
The image depicts two tango dancers in a close embrace. The woman is arched backward, supported by the man, with her legs intertwined with his — one leg wrapped around his waist, the other extended back. This is a classic tango pose known as a “dip” or “back dip,” and the action is recognizable and stylistically appropriate for tango. Anatomical inspection: - Limb count: Both dancers have all limbs present — two arms, two legs each — no missing or extra limbs. - Proportions: The proportions app
local/nucleus-image-17b-a2bPASSEvaluator (Qwen 3 VL)
The image depicts two soccer players in a dynamic, mid-action scene. One player is sliding on the ground, while the other is airborne, seemingly jumping over the sliding player. The ball is visible near the sliding player’s foot, and there is dirt kicked up, suggesting motion and contact. Anatomical analysis: - Limb count: Both players have all limbs intact — two arms and two legs per player. - Proportions: The players’ proportions appear realistic — heads, torsos, limbs are in proportion to ea
local/nucleus-image-17b-a2bFAILEvaluator (Qwen 3 VL)
The image shows two gymnasts performing handstands side by side. However, their body alignment is not identical — the gymnast on the right has her legs slightly bent at the knees and her feet not fully extended upward, while the gymnast on the left has straight legs and a more classic handstand posture. Additionally, their torsos are not perfectly aligned; the right gymnast’s head is turned slightly, and her body is angled differently. Anatomically, both figures have correct limb count (two arm
Hands 42%vs75%
local/nucleus-image-17b-a2bFAILEvaluator (Qwen 3.5 122B)
Upon close inspection: - Left hand (viewer’s right): 5 fingers visible, but the thumb appears fused or malformed with the index finger — not anatomically natural. - Right hand (viewer’s left): Only 4 distinct fingers clearly visible; the pinky is either missing or merged. Gestures are not naturally relaxed — fingers appear stiff and unnaturally positioned. Anomalies include missing/merged digits and unnatural spacing. VERDICT: FAIL
local/nucleus-image-17b-a2bFAILEvaluator (Qwen 3.5 122B)
The image shows two hands: one palm-up with five fingers clearly visible, and another hand resting on top, also showing five fingers. However, the top hand’s fingers appear to merge or overlap unnaturally with the bottom hand — particularly the thumb and index finger of the top hand seem fused or misaligned with the palm below. The gesture is not anatomically correct; it looks like a single hand with extra fingers or a glitched overlay. Core criteria (anatomically correct, 5 fingers per hand) ar
local/nucleus-image-17b-a2bFAILEvaluator (Qwen 3.5 122B)
The image shows a man with arms crossed. On the left hand (viewer’s right), all 5 fingers are visible and appear anatomically correct, resting on the right upper arm. On the right hand (viewer’s left), only 4 fingers are clearly visible — the thumb is obscured or missing from view. The grip gesture is otherwise natural. Since one hand does not show exactly 5 visible fingers as required by the prompt, the core criterion is not fully met. VERDICT: FAIL
local/nucleus-image-17b-a2bFAILEvaluator (Qwen 3.5 122B)
The hand gripping the mug appears to be a right hand, as described. However, upon close inspection, the fingers are not clearly distinguishable — there are no visible individual fingers or thumb separation; instead, the grip looks like a fused, amorphous mass of skin with no anatomical detail. This suggests an AI-generated artifact where finger structure was not properly rendered. No clear count of 5 fingers is possible, and the gesture lacks realism. VERDICT: FAIL
local/nucleus-image-17b-a2bPASSEvaluator (Qwen 3.5 122B)
The right hand gripping the pen shows 5 fingers, and the grip appears anatomically correct — thumb, index, and middle fingers holding the pen naturally. The left hand is not visible in the image, so we cannot verify its finger count or posture. Since the prompt specifies “each visible hand” and only one is visible (and it’s correct), the core criterion is met. VERDICT: PASS
local/nucleus-image-17b-a2bPASSEvaluator (Qwen 3.5 122B)
The left hand (on the fretboard) shows 4 fingers clearly visible, with the thumb likely behind the neck — anatomically plausible. The right hand (strumming area) also shows 4 fingers; the thumb is not clearly visible but may be obscured or positioned normally. No obvious extra or missing fingers. Grip and posture appear natural for guitar playing. Minor occlusion doesn’t indicate anomaly. VERDICT: PASS
local/nucleus-image-17b-a2bFAILEvaluator (Qwen 3.5 122B)
The image shows a single right hand making a gesture. Three fingers (index, middle, ring) are extended upward; the thumb and pinky are curled. This contradicts the prompt’s requirement of “two fingers up and three curled.” Anatomically, the hand is correctly formed with 5 fingers total, but the gesture does not match the requested peace sign (which typically uses index and middle fingers only). The extra extended finger violates the prompt’s core instruction. VERDICT: FAIL
local/nucleus-image-17b-a2bPASSEvaluator (Qwen 3.5 122B)
The image shows one hand giving a thumbs-up gesture. The thumb is extended upward, and the other four fingers are curled into a fist — consistent with the prompt. Anatomically, this is correct for a standard thumbs-up. There are no visible anomalies: the hand has 5 fingers total (thumb + 4 curled), and the proportions and positioning appear natural for a cartoon-style illustration. No extra or missing digits. VERDICT: PASS
local/nucleus-image-17b-a2bFAILEvaluator (Qwen 3.5 122B)
The image shows a single hand with three fingers extended: index, middle, and ring — matching the prompt’s gesture. However, counting visible fingers: thumb is curled (partially visible), index, middle, ring are extended, and pinky is not visible — so only 4 fingers are clearly present or implied. The numbers “1”, “2”, “1” are drawn on knuckles, which is an anomaly not requested. Anatomically, the hand structure appears plausible, but the numbering and incomplete finger visibility violate the co
local/nucleus-image-17b-a2bPASSEvaluator (Qwen 3.5 122B)
Upon close examination of the handshake: - Left hand (from viewer’s perspective): 5 fingers are visible — thumb, index, middle, ring, and pinky. All appear anatomically proportioned and correctly positioned in the grip. - Right hand: Also shows 5 fingers — thumb wrapped around, four fingers interlaced or resting naturally. No extra or missing digits; joint alignment and curvature look natural for a firm handshake. No anomalies detected. Grip is realistic and anatomically plausible. VERDICT: P
local/nucleus-image-17b-a2bPASSEvaluator (Qwen 3.5 122B)
Upon close inspection: The left hand (in black sleeve) shows 5 fingers — thumb, index, middle, ring, pinky — all extended and visible. The right hand (gray sleeve) also displays 5 fingers, though the thumb is partially obscured by the other hand’s palm; however, its position and curvature suggest it’s correctly placed. No extra or missing digits. The gesture is anatomically plausible for a high-five, with palms meeting squarely. Minor occlusion doesn’t invalidate correctness. VERDICT: PASS
local/nucleus-image-17b-a2bFAILEvaluator (Qwen 3.5 122B)
Upon close examination: The left person’s top hand shows 4 fingers clearly; the bottom hand shows 5. The right person’s top hand shows 4 fingers; bottom hand shows 5. One finger on each top hand appears merged or missing — likely an AI artifact. Grips are plausible but not perfectly anatomical due to finger anomalies. Core criterion (“exactly 5 per hand”) is violated. VERDICT: FAIL
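As a rough check on how the category score relates to the verdicts listed above, the sketch below tallies the twelve Hands verdicts for local/nucleus-image-17b-a2b and converts them to a whole-number percentage. The function name `pass_rate` and the round-half-up rule are assumptions made for illustration; the dashboard's actual scoring code is not shown in the source.

```python
from typing import List


def pass_rate(verdicts: List[str]) -> int:
    """Return the share of PASS verdicts as a whole-number percentage.

    Hypothetical helper: rounding half up is an assumption, not a
    documented rule of the dashboard.
    """
    if not verdicts:
        raise ValueError("no verdicts to score")
    passes = sum(1 for v in verdicts if v.strip().upper() == "PASS")
    return int(passes * 100 / len(verdicts) + 0.5)


# Verdicts for the Hands category, in the order they appear above.
hands = [
    "FAIL", "FAIL", "FAIL", "FAIL", "PASS", "PASS",
    "FAIL", "PASS", "FAIL", "PASS", "PASS", "FAIL",
]

print(pass_rate(hands))  # 5 of 12 verdicts are PASS
```

Under these assumptions, 5 passes out of 12 evaluations gives 42, which is consistent with the first figure in the Hands header.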
Multi-Subject 100%vs100%
local/nucleus-image-17b-a2bPASSEvaluator (Qwen 3 VL)
Person 1: - Hair: Short, blonde, styled in a bob cut. - Clothing: Wearing a bright red zip-up jacket. - Position: Standing on the left side of the image. Person 2: - Hair: Long, straight, black hair. - Clothing: Wearing a royal blue V-neck dress with a flared skirt. - Position: Standing on the right side of the image, side by side with Person 1. Both individuals are visually distinct and match the described features exactly. VERDICT: PASS
local/nucleus-image-17b-a2bPASSEvaluator (Qwen 3 VL)
Person 1 (left): - Has a full, thick beard (matches "beard"). - Wears black-rimmed glasses (matches "glasses"). - Wearing a green sweater (matches "green sweater"). - Appears to be a man (matches "one man"). Person 2 (right): - Clean-shaven (no facial hair). - Wearing a black hoodie (matches "black hoodie"). - Appears to be a man (matches "other man"). - Note: The man is bald, which is a more specific trait than "clean-shaven," but "clean-shaven" is still accurately represented as he has no fac
local/nucleus-image-17b-a2bPASSEvaluator (Qwen 3 VL)
Person 1 (tall): - Hair: Curly, vibrant red, voluminous, and framing the face. - Clothing: Denim overalls over a white tank top. - Height: Appears tall, standing significantly above the other person. Person 2 (short): - Hair: Straight, brown, cut in a bob with bangs. - Clothing: Yellow sundress with thin straps and a gathered waist. - Height: Clearly shorter than the first person, visually confirming the “short” descriptor. Both individuals are present, visually distinct, and match the descr
local/nucleus-image-17b-a2bPASSEvaluator (Qwen 3 VL)
- Bald man in a suit: Present. He is on the far left, bald, wearing a dark suit with a white shirt and tie. Visually distinct. - Woman with silver hair in a red blouse: Present. She is second from the left, has short silver hair, and wears a red blouse. Visually distinct. - Young man with dreadlocks in a denim jacket: Present. He is third from the left, has dark dreadlocks, and wears a denim jacket over a white shirt. Visually distinct. - Woman with a hijab in a green dress: Present. She is on t
local/nucleus-image-17b-a2bPASSEvaluator (Qwen 3 VL)
- Tall woman with short pink hair playing bass: Present. She has bright pink, short, straight hair with bangs, is standing, and is holding a double bass (not an electric bass, but still a bass instrument). She is visually distinct and matches the description of playing bass. - Man with a beard and bandana on drums: Present. He has a full beard and is wearing a blue patterned bandana on his head. He is seated behind a drum kit and is visually distinct. He matches the description. - Slim man wit
local/nucleus-image-17b-a2bPASSEvaluator (Qwen 3 VL)
Person 1 (left): - Observed traits: Man, wearing a beige sunhat, carrying a large dark backpack with a water bottle attached, hiking boots, and khaki shorts. - Matches description: Yes — man with backpack and sunhat. Person 2 (middle): - Observed traits: Woman with long dark hair in two braids, wearing a light gray t-shirt and shorts, carrying a gray backpack, and using hiking poles. - Matches description: Yes — woman with braided hair and hiking poles. Person 3 (right): - Observed traits
Truthfulness 56%vs93%
Photorealism 67%vs100%
local/nucleus-image-17b-a2bPASSEvaluator (Qwen 3.6 27B)
The image accurately renders the requested materials. The glass shows proper refraction and transparency; the spoon exhibits sharp, realistic metallic reflections; the napkin displays a convincing woven fabric texture; and the apples have a smooth, glossy skin with appropriate specular highlights. The wooden table surface also shows realistic grain and lighting. VERDICT: PASS
local/nucleus-image-17b-a2bFAILEvaluator (Qwen 3.6 27B)
The image fails to render the requested materials correctly. The laptop appears to be matte plastic rather than brushed aluminum, lacking metallic reflections. The notebook is a standard paper spiral-bound type, not leather-bound. The central object is a glass tea infuser, not a ceramic mug or glass paperweight as requested. The textures are generic and do not exhibit the distinct visual properties (like translucency or specific grain) required for the prompt. VERDICT: FAIL
local/nucleus-image-17b-a2bPASSEvaluator (Qwen 3.6 27B)
The image accurately renders the requested materials. The stainless steel pot shows sharp, metallic reflections. The wooden cutting boards display natural grain and matte texture. The glass bottle exhibits correct translucency and refraction of the olive oil inside. The cotton towel has a soft, fibrous texture with appropriate light absorption. All surfaces interact with light realistically. VERDICT: PASS
Physics & Reflections 75%vs92%
local/nucleus-image-17b-a2bPASSEvaluator (Qwen 3.6 27B)
The shadow is cast to the right and slightly downward, consistent with a light source from the upper left. The shadow’s shape is an oval, appropriately distorted by perspective and surface contact. Its soft edges suggest diffuse lighting, which matches the gentle gradient on the sphere. No reflections are expected on a matte white surface, and none appear — correct. Color of the shadow is neutral gray, not tinted, which is plausible for a white surface under neutral light. VERDICT: PASS
local/nucleus-image-17b-a2bPASSEvaluator (Qwen 3.6 27B)
The light source is coming from the window on the right, which correctly casts the shadow to the left. The shadow’s shape accurately matches the mug and handle, with appropriate sharpness for direct sunlight. The lighting and reflections on the mug’s surface are consistent with a glossy ceramic material under bright natural light. No distortions or implausible elements are present. VERDICT: PASS
local/nucleus-image-17b-a2bFAILEvaluator (Qwen 3.6 27B)
The shadow is elongated and stretches far down the frame, which contradicts the “midday” condition (where shadows should be short). The shadow’s shape also appears distorted and inconsistent with the person’s posture. VERDICT: FAIL
local/nucleus-image-17b-a2bFAILEvaluator (Qwen 3.6 27B)
The reflection is physically impossible. The woman faces the mirror, so her reflection should show her front. Instead, the reflection shows her back, while the real figure’s back is visible to the viewer — a logical contradiction. Shadows and lighting are consistent, but the core reflection error invalidates plausibility. VERDICT: FAIL
local/nucleus-image-17b-a2bPASSEvaluator (Qwen 3.6 27B)
The reflection is vertically aligned beneath the cat, matching its posture and orientation. The distortion follows the curvature of the polished floor, with slight blurring consistent with a glossy surface. Color and brightness are appropriately muted in the reflection, as expected from light absorption and angle. Directionality is correct — no lateral or inverted errors. Overall, the reflection behaves physically plausibly for a marble floor. VERDICT: PASS
local/nucleus-image-17b-a2bPASSEvaluator (Qwen 3.6 27B)
The reflection on the glass table is physically plausible. It correctly mirrors the vase and flowers with appropriate orientation, slight distortion due to the glass surface, and accurate color representation. The reflection aligns with the object’s position and lighting conditions. VERDICT: PASS
local/nucleus-image-17b-a2bPASSEvaluator (Qwen 3.6 27B)
The reflections are vertically inverted, matching the trees’ orientation. They exhibit gentle ripples consistent with a calm lake surface, and appear slightly darker than the originals — all physically plausible. The color gradient in the sky is mirrored accurately in the water. No directional or distortion errors detected. VERDICT: PASS
local/nucleus-image-17b-a2bPASSEvaluator (Qwen 3.6 27B)
The reflection in the puddle is inverted, which is physically correct. The distortion is mild and consistent with a shallow puddle on asphalt. The color shift toward cooler tones in the reflection is plausible due to water absorption and ambient light. The background skyline matches the reflected shapes, confirming spatial consistency. No obvious artifacts or implausible elements are present. VERDICT: PASS
local/nucleus-image-17b-a2bPASSEvaluator (Qwen 3.6 27B)
The reflection is physically plausible: it is correctly inverted, aligned with the mountain’s base at the shoreline, and shows slight softening and muted color consistent with water surface diffusion. No directional or orientation errors are present. VERDICT: PASS
local/nucleus-image-17b-a2bPASSEvaluator (Qwen 3.6 27B)
The reflections and refractions are physically plausible. The sphere correctly refracts the background, inverting the scene as expected through a convex lens. The reflection on the mirror surface is sharp and accurately mirrors the sphere’s position and lighting. The water reflection in the background is soft and consistent with a pool environment. Colors and distortions align with real-world optics. VERDICT: PASS
local/nucleus-image-17b-a2bFAILEvaluator (Qwen 3.6 27B)
The reflections are physically implausible. The floor tiles show reflections of the ceiling and windows, but the geometry is distorted and inconsistent with the room’s layout. The figures’ reflections do not align correctly with their positions, and some reflections appear in surfaces that should not reflect them (e.g., opaque walls). Color and orientation are also mismatched in several areas. VERDICT: FAIL
local/nucleus-image-17b-a2bPASSEvaluator (Qwen 3.6 27B)
The reflections and refractions are physically plausible. The crystal glass shows realistic distortion of the background through its facets, with light bending correctly. The marble surface reflects the base of the glass with appropriate softness for a polished but not mirror-like material. The lake’s surface reflects the sky and distant trees with natural blurring due to water movement. Shadows are subtle and consistent with ambient daylight from the window. No directional or color inconsistenc
World Knowledge 33%vs92%
local/nucleus-image-17b-a2bPASSEvaluator (Qwen 3.6 27B)
The image depicts the Eiffel Tower with high architectural accuracy. The iron lattice structure, the three distinct levels (including the two main observation decks and the top platform), and the iconic arched base are all correctly rendered. The perspective and surrounding environment (trees, clear sky) are consistent with a real-world photograph. No significant inaccuracies are present. VERDICT: PASS
local/nucleus-image-17b-a2bFAILEvaluator (Qwen 3.6 27B)
The image depicts the Taj Mahal with recognizable features: the central dome, four minarets, and the reflecting pool. However, there are notable inaccuracies. The minarets are too short and lack the correct proportion and detailing. The side domes (chhatris) are oversized and misshapen. The reflection in the pool is overly simplified and lacks realistic distortion. The overall symmetry is approximate but not precise. These deviations from architectural accuracy disqualify it as factually correct
local/nucleus-image-17b-a2bFAILEvaluator (Qwen 3.6 27B)
The image depicts two identical Statues of Liberty standing side-by-side on separate pedestals. This is factually incorrect, as there is only one Statue of Liberty. The duplication makes the subject architecturally inaccurate and unrealistic. VERDICT: FAIL
local/nucleus-image-17b-a2bFAILEvaluator (Qwen 3.6 27B)
The image depicts a shamisen, but it contains significant factual inaccuracies. The instrument shown has four strings instead of the traditional three. Additionally, the bridge is incorrectly positioned on the soundboard rather than being a separate piece resting on it, and the tuning pegs are arranged in an unusual, non-standard configuration. These errors make the depiction factually inaccurate. VERDICT: FAIL
local/nucleus-image-17b-a2bFAILEvaluator (Qwen 3.6 27B)
The subject is clearly recognizable as an astrolabe, with appropriate brass material and general structure. However, the engraved markings are not accurate — they consist of gibberish symbols and nonsensical characters rather than real degree markings, zodiac signs, or Arabic/Latin inscriptions typical of medieval astrolabes. The rete and tympan also lack authentic geometric precision and historical detail. While visually plausible at first glance, it fails factual accuracy. VERDICT: FAIL
local/nucleus-image-17b-a2bPASSEvaluator (Qwen 3.6 27B)
The image accurately depicts a Japanese kintsugi bowl. The ceramic vessel features visible cracks repaired with gold lacquer, which is the defining characteristic of the kintsugi art form. The bowl’s shape, glaze, and the application of gold along the fracture lines are realistic and consistent with real-world examples. No architectural or factual inaccuracies are present. VERDICT: PASS
local/nucleus-image-17b-a2bFAILEvaluator (Qwen 3.6 27B)
The image depicts a stylized diagram of the water cycle, which is recognizable in concept. However, the text labels are nonsensical gibberish (e.g., "EVAVATION," "FRAFRILATION CYAL," "ECLÉPOTION"), making it factually inaccurate as a diagram. The prompt asked for a detailed diagram showing specific processes; since the labels are unreadable and incorrect, it fails to convey the required information accurately. VERDICT: FAIL
local/nucleus-image-17b-a2bFAILEvaluator (Qwen 3.6 27B)
The image depicts a surface-level landscape of cracked earth with a river, rather than a geological cross-section. It completely fails to show the requested internal structures: tectonic plates, subduction zones, trenches, volcanic arcs, or the mantle. The subject is not recognizable as the requested scientific diagram. VERDICT: FAIL
local/nucleus-image-17b-a2bFAILEvaluator (Qwen 3.6 27B)
The image is not factually accurate. While it resembles a heart, the anatomy is distorted and incorrect. The chambers are misshapen, the valves are not clearly or correctly depicted, and the major vessels (aorta, pulmonary arteries/veins) are in wrong positions or have unnatural shapes. For example, the aorta appears to merge incorrectly with other structures, and the pulmonary veins are not properly shown entering the left atrium. These errors make it unsuitable as an anatomical illustration.
local/nucleus-image-17b-a2bPASSEvaluator (Qwen 3.6 27B)
The image depicts a Morpho butterfly with the characteristic iridescent blue dorsal wings and brown ventral patterns with eyespots. The body morphology, including antennae and wing venation, appears anatomically correct. The coloration and pattern are consistent with the species Morpho menelaus. No significant inaccuracies are observed. VERDICT: PASS
local/nucleus-image-17b-a2bFAILEvaluator (Qwen 3.6 27B)
The image depicts the interior of the Hagia Sophia with recognizable architectural features such as the central dome, pendentives, and Islamic calligraphy medallions. However, the Byzantine mosaics appear distorted and inaccurately rendered, with some figures and details blending into the background or appearing nonsensical. The overall composition is plausible but lacks precise architectural accuracy in the mosaic details. VERDICT: FAIL
local/nucleus-image-17b-a2b PASS Evaluator (Qwen 3.6 27B)
The image depicts a bismuth crystal specimen with high fidelity. It accurately captures the characteristic geometric hopper crystal formation, the distinct "staircase" structure on the crystal faces, and the vibrant, iridescent oxide layers in correct colors (blues, greens, golds, purples). The base rock matrix is also rendered realistically. The subject is clearly recognizable and factually accurate. VERDICT: PASS
Professional Studio 74% vs 93%
Camera & Lighting 83% vs 100%
local/nucleus-image-17b-a2b PASS Evaluator (Gemma 4 26B)
The image shows a modern living room interior with a wide-angle perspective. 1. **Perspective/Wide-angle/Distortion**: The image exhibits significant barrel distortion, particularly visible in the curved lines of the floor and the way the walls and ceiling meet. This creates the "expansive space" and "slight barrel distortion" requested in the prompt. 2. **Depth of Field**: The depth of field is deep, with the entire room from the foreground floor to the background windows being in sharp focus
local/nucleus-image-17b-a2b PASS Evaluator (Gemma 4 26B)
The image shows a close-up macro shot of a single, prominent dewdrop resting on a pink rose petal. 1. **Perspective/Shot Type**: It is a macro shot, focusing on the detail of the water droplet and the texture of the petal. 2. **Depth of Field**: The image features a shallow depth of field. The central dewdrop is in sharp focus, while the foreground and background (the rest of the rose petal) are heavily blurred (bokeh effect), which aligns perfectly with the prompt. 3. **Lighting**: The lighti
local/nucleus-image-17b-a2b PASS Evaluator (Gemma 4 26B)
The image depicts an aerial bird's-eye view looking straight down at a traffic roundabout. 1. **Perspective**: The perspective is a true top-down (nadir) view, looking straight down at the center of the roundabout. 2. **Depth of Field**: The entire scene is in sharp focus from the center to the edges, which is consistent with a high-altitude aerial shot. 3. **Lighting**: The lighting is even and consistent across the scene, typical of daylight, with no harsh shadows or directional light source
local/nucleus-image-17b-a2b PASS Evaluator (Gemma 4 26B)
The image depicts a portrait with a strong emphasis on light and shadow. 1. **Lighting Pattern:** The image features a clear "Rembrandt lighting" pattern. There is a distinct, small triangle of light on the shadowed cheek (the viewer's left, the subject's right), which is the hallmark of this technique. 2. **Atmosphere:** The lighting is moody and dramatic, with deep shadows and a dark background, creating a high-contrast, chiaroscuro effect. 3. **Perspective and Depth of Field:** The persp
local/nucleus-image-17b-a2b PASS Evaluator (Gemma 4 26B)
The image depicts a landscape with a distinct orange sky and a warm, golden color palette. The lighting pattern shows long, dark shadows stretching across the undulating terrain, consistent with a low sun angle. The light appears soft and creates a "glow" effect across the hills. The perspective shows a wide landscape view with a sense of depth created by the layered hills and the light/shadow transitions. The image successfully matches the description of "golden hour light," "long warm shadows
local/nucleus-image-17b-a2b PASS Evaluator (Gemma 4 26B)
The image matches the prompt's requirements for a high-key portrait: 1. **Lighting Pattern:** The image is extremely bright, featuring a high-key aesthetic with a bright white background. The lighting is airy and luminous, with very minimal shadows, creating a "washed out" or ethereal effect. 2. **Perspective:** It is a standard portrait perspective (head and shoulders). 3. **Depth of Field:** The image has a very shallow depth of field, with the facial features appearing soft and slightly o
local/nucleus-image-17b-a2b PASS Evaluator (Gemma 4 26B)
The image matches the prompt's specifications: 1. **Perspective/Lens (85mm):** The image shows a tight portrait with a shallow depth of field, characteristic of an 85mm lens used for portraits. The subject is the clear focus, and the background is heavily blurred. 2. **Depth of Field (f/1.4/Creamy Bokeh):** The background is extremely soft and blurred, creating a "creamy bokeh" effect where the garden elements are indistinct shapes of light and color. 3. **Lighting (Backlit/Warm Rim Light):*
local/nucleus-image-17b-a2b FAIL Evaluator (Gemma 4 26B)
The provided image is a solid-colored, featureless grey square. It does not contain a street photograph, a subject, city lights, or any discernible depth of field or perspective. It fails to meet any of the criteria specified in the prompt. VERDICT: FAIL
local/nucleus-image-17b-a2b FAIL Evaluator (Gemma 4 26B)
The image fails to meet several key criteria specified in the prompt: 1. **Perspective/Angle:** The prompt specifies a shot "from above" (top-down/flat lay). The image provided is a side-angle/eye-level shot looking into the bowl. 2. **Lighting:** The prompt specifies "natural window light from the left." While there is a window visible on the left, the light hitting the food is very flat and lacks the directional shadows or high-contrast modeling typically associated with natural window ligh
local/nucleus-image-17b-a2b PASS Evaluator (Gemma 4 26B)
The image exhibits the core characteristics of a tilt-shift photograph. 1. **Perspective/Tilt-Shift Effect:** The image uses a very shallow depth of field to create the "miniature" effect. There is a distinct "selective focus strip" where the foreground/midground (the road surface and the curb) is in sharp focus, while the foreground (the bottom left corner) and the background (the buildings and cars) are heavily blurred. 2. **Depth of Field:** The depth of field is extremely shallow, which i
local/nucleus-image-17b-a2b PASS Evaluator (Gemma 4 26B)
The image features a prominent horizontal light streak across a dark, moody scene. 1. **Lighting Pattern:** The streak transitions from a bright blue on the left to a bright orange/amber on the right, matching the "blue and orange" description. 2. **Anamorphic Lens Flare:** The light is a sharp, horizontal streak characteristic of an anamorphic lens flare. 3. **Scene/Mood:** The background is dark and moody with a soft, out-of-focus texture at the bottom, suggesting a cinematic frame. The ima
local/nucleus-image-17b-a2b PASS Evaluator (Gemma 4 26B)
The image shows a portrait of a person where the lighting is a classic example of split lighting. 1. **Lighting Pattern:** The face is divided vertically by light and shadow. The left side of the subject's face (from the viewer's perspective) is brightly illuminated, while the right side is in deep, dark shadow. This perfectly matches the "dramatic split lighting" and "high contrast chiaroscuro" described in the prompt. 2. **Perspective:** The perspective is a standard close-up portrait, looki
Color Precision 83% vs 92%
local/nucleus-image-17b-a2b PASS Evaluator (Gemma 4 26B)
The image shows a sports car that is a bright, vivid, and unambiguous red. The car is parked on a gray asphalt road. The color of the car matches the prompt's requirement for a "bright red" and "vivid" color perfectly. VERDICT: PASS
local/nucleus-image-17b-a2b PASS Evaluator (Gemma 4 26B)
The image shows a woman wearing a dress that is a vibrant, deep blue color. This color matches the description of "royal blue" specified in the prompt. The background is a solid, light beige/tan color, which qualifies as a "neutral background." The colors in the image accurately reflect the prompt's requirements. VERDICT: PASS
local/nucleus-image-17b-a2b PASS Evaluator (Gemma 4 26B)
The image contains a field of sunflowers and a sky. 1. **Sky Color**: The sky is a bright, clear blue, which matches the prompt's requirement for a "clear blue sky." 2. **Flower Color**: The sunflowers are a vibrant, bright yellow, which matches the prompt's requirement for "pure sunflower yellow flowers." The colors in the image accurately reflect the specific colors requested in the prompt. VERDICT: PASS
local/nucleus-image-17b-a2b PASS Evaluator (Gemma 4 26B)
The image shows a bedroom wall painted in a vibrant, saturated orange color. The hex code #FF6B35 represents a "burnt orange" or a bright reddish-orange. The color in the image is a deep, warm orange that aligns well with the description of "burnt orange" and the specific hex value provided. The furniture shown is white, which also matches the prompt. VERDICT: PASS
local/nucleus-image-17b-a2b FAIL Evaluator (Gemma 4 26B)
The image depicts a forest scene with a very dark, monochromatic green color palette. The foliage and background are various shades of deep green, and the tree trunks are also rendered in a very dark green/black shade. The prompt specifically requested: 1. Foliage color: **#2D5F2D (dark muted green)**. 2. Tree trunks: **Brown**. The image fails on both counts: - The color used for the foliage is a much darker, more saturated green than the requested #2D5F2D. - The tree trunks are dark green/bl
local/nucleus-image-17b-a2b FAIL Evaluator (Gemma 4 26B)
The prompt specifies a wall color of **#4A90D9**, which is a medium-toned, slightly desaturated sky blue. Looking at the image, the walls are a very vibrant, highly saturated, bright cyan/cerulean blue. The color in the image is significantly more luminous and "electric" than the more muted, professional-grade blue represented by the hex code #4A90D9. While it is a "medium blue" in a general sense, it lacks the specific tonal quality and saturation level of the requested hex code. VERDICT: FA
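The evaluator's reading of #4A90D9 can be cross-checked numerically. The sketch below converts a hex code to hue, saturation, and lightness and measures a plain RGB distance between two codes. It is not the evaluator's method, which the page does not describe. The helper names are hypothetical, and `#00B4FF` is an assumed stand-in for a "vibrant cyan", not a color sampled from the image.

```python
import colorsys
import math


def hex_to_rgb(code: str) -> tuple:
    """Parse '#RRGGBB' into an (r, g, b) tuple of 0-255 integers."""
    code = code.lstrip("#")
    return tuple(int(code[i:i + 2], 16) for i in (0, 2, 4))


def hsl_summary(code: str) -> tuple:
    """Return (hue in degrees, saturation %, lightness %), rounded."""
    r, g, b = (c / 255 for c in hex_to_rgb(code))
    # colorsys uses HLS ordering: hue, lightness, saturation.
    h, l, s = colorsys.rgb_to_hls(r, g, b)
    return round(h * 360), round(s * 100), round(l * 100)


def rgb_distance(a: str, b: str) -> float:
    """Euclidean distance in RGB space; a crude, non-perceptual measure."""
    return math.dist(hex_to_rgb(a), hex_to_rgb(b))


requested = "#4A90D9"            # from the prompt
assumed_observed = "#00B4FF"     # hypothetical stand-in, not measured

print(hsl_summary(requested))
print(hsl_summary(assumed_observed))
print(rgb_distance(requested, assumed_observed))
```

Euclidean RGB distance is only a rough proxy for perceived difference; a perceptual metric such as a CIELAB delta-E would be a more defensible choice for a pass/fail threshold.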
local/nucleus-image-17b-a2b PASS Evaluator (Gemma 4 26B)
To evaluate the color accuracy, I will compare the image colors to the specified hex values and names: 1. **Walls (#F5E6CC - Warm Cream):** The walls in the image are a pale, yellowish cream color. This matches the "warm cream" description and the hex code #F5E6CC well. 2. **Sofa (#2C3E50 - Dark Navy):** The sofa is a very dark, deep navy blue. This aligns perfectly with the hex code #2C3E50. 3. **Accent Pillows (#E74C3C - Bright Red):** The pillows are a vibrant, saturated red. This matches
local/nucleus-image-17b-a2b PASS Evaluator (Gemma 4 26B)
The image contains the following elements with their corresponding colors: 1. **Background:** The background is a very dark, deep blue. The hex code #1A1A2E represents a deep navy/dark blue, which matches the visual appearance of the background in the image. 2. **Watch:** The watch is a soft gold color. The hex code #F0C27F is a pale, yellowish-gold, which matches the soft gold appearance of the watch in the image. 3. **Earbuds:** The earbuds are pure white. The hex code #FFFFFF is pure whit
local/nucleus-image-17b-a2b PASS Evaluator (Gemma 4 26B)
The image contains the following colors: 1. **Background:** The background is a near-white/off-white color. While it is not a pure white, it is very close to the requested `#FAFAFA`. 2. **Main Shape:** The central shape is a vibrant orange-red. This aligns well with the requested `#FF4500` (Orange Red). 3. **Accent Lines:** The lines are a dark charcoal/gray color, which matches the requested `#333333`. The colors in the image accurately reflect the hex codes and descriptions provided in the
local/nucleus-image-17b-a2b PASS Evaluator (Gemma 4 26B)
The image shows a smooth gradient transition. On the far left, the color is a vibrant red, which corresponds to #FF0000. On the far right, the color is a vibrant blue, which corresponds to #0000FF. The transition between these two colors passes through a purple/magenta hue in the center, as requested. The gradient is even and follows the specified color progression. VERDICT: PASS
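The expectation that a gradient from #FF0000 to #0000FF passes through purple at its center can be illustrated with linear interpolation. This is a minimal sketch that assumes the gradient is a straight per-channel blend in RGB; the prompt does not say which color space was intended, and `lerp_hex` is a hypothetical helper.

```python
def hex_to_rgb(code: str) -> tuple:
    """Parse '#RRGGBB' into an (r, g, b) tuple of 0-255 integers."""
    code = code.lstrip("#")
    return tuple(int(code[i:i + 2], 16) for i in (0, 2, 4))


def lerp_hex(start: str, end: str, t: float) -> str:
    """Blend two hex colors channel by channel; t=0 gives start, t=1 gives end."""
    a, b = hex_to_rgb(start), hex_to_rgb(end)
    mixed = (round(x + (y - x) * t) for x, y in zip(a, b))
    return "#" + "".join(f"{c:02X}" for c in mixed)


# Sample five evenly spaced stops across the gradient.
stops = [lerp_hex("#FF0000", "#0000FF", i / 4) for i in range(5)]
print(stops)
```

Under this assumption the midpoint is `#800080`, the color CSS names `purple`, which agrees with the evaluator's description of the center of the gradient.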
local/nucleus-image-17b-a2b PASS Evaluator (Gemma 4 26B)
The image displays a vertical color gradient. 1. **Top Color:** The top of the image is a very dark, deep indigo/navy blue. This aligns well with the requested hex code **#1B1464**. 2. **Bottom Color:** The bottom of the image is a vibrant orange. While the very bottom edge appears slightly darker/shadowed, the primary orange hue matches the requested **#FF6B35**. 3. **Gradient Transition:** The transition from the deep indigo at the top to the orange at the bottom follows the requested struct
local/nucleus-image-17b-a2b PASS Evaluator (Gemma 4 26B)
The image shows a horizontal banner with a gradient. 1. **Left side color:** The color on the left is a bright cyan/light blue. The hex code #00C9FF is a vibrant cyan, which matches the visual appearance of the left side of the image. 2. **Right side color:** The color on the right is a light, pale green. The hex code #92FE9D is a light mint green, which matches the visual appearance of the right side of the image. 3. **Gradient transition:** The gradient flows smoothly from the cyan on the
Photorealism 0% vs 67%
local/nucleus-image-17b-a2b FAIL Evaluator (Gemma 4 26B)
The image shows a single red sneaker on a white background. While the overall composition mimics a product shot, there are several significant issues that indicate it is not a real photograph: 1. **Anatomical/Structural Errors:** The shape of the sneaker is nonsensical. The heel area and the opening where the foot would go are distorted and lack a logical structure. The laces and the eyelets are poorly defined and appear to melt into the body of the shoe. 2. **Lighting and Shadow Inconsistenc
local/nucleus-image-17b-a2b FAIL Evaluator (Gemma 4 26B)
The image shows a clear glass perfume bottle on a reflective black surface. Upon examination, there are several significant AI artifacts and physical inconsistencies: 1. **Impossible Geometry/Reflections:** Inside the bottle, the liquid and the internal structure do not align with the external shape. There is a strange, white, curved line/shape inside the bottle that looks like a distorted reflection or a structural element that doesn't make physical sense. 2. **Distorted Cap/Nozzle:** The to
local/nucleus-image-17b-a2b FAIL Evaluator (Gemma 4 26B)
The image depicts a single green apple on a white background. **Analysis:** 1. **Subject Matter:** The apple is a bright, uniform green. While it looks clean, the texture is unnaturally smooth and lacks the organic imperfections (tiny pores, speckles, or slight color variations) typically found in a real Granny Smith apple. 2. **Lighting and Shadows:** The lighting is very soft, but the shadow at the base of the apple is extremely blurry and lacks a defined contact point. The transition from t
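The Professional Studio percentages for local/nucleus-image-17b-a2b are consistent with simple pass counts. The sketch below tallies the PASS/FAIL labels listed under the three subcategories and assumes the category figure is the pooled pass rate rounded to the nearest percent; the page does not state its aggregation rule, so treat that as an assumption.

```python
# PASS/FAIL labels for local/nucleus-image-17b-a2b, in the order listed on the page.
verdicts = {
    "Camera & Lighting": ["PASS"] * 7 + ["FAIL"] * 2 + ["PASS"] * 3,
    "Color Precision": ["PASS"] * 4 + ["FAIL"] * 2 + ["PASS"] * 6,
    "Photorealism": ["FAIL"] * 3,
}


def pass_rate(labels: list) -> int:
    """Percentage of PASS labels, rounded to the nearest whole percent."""
    return round(100 * sum(v == "PASS" for v in labels) / len(labels))


per_subcategory = {name: pass_rate(v) for name, v in verdicts.items()}

# Pooled rate: every evaluation counts equally, regardless of subcategory.
pooled = pass_rate([v for labels in verdicts.values() for v in labels])

print(per_subcategory)
print(pooled)
```

Pooling gives 20 passes out of 27 evaluations, which rounds to the 74% shown for the category; averaging the three subcategory rates instead would give a lower figure, so pooling is the reading that matches the page.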
Graphical design 46% vs 100%
Data Visualisation 0% vs 100%
local/nucleus-image-17b-a2b FAIL Evaluator (Qwen 3 VL)
The requested bar chart should show monthly sales for January through June with values 12K, 18K, 15K, 22K, 19K, 25K — labeled axes, color-coded bars, and a title 'Monthly Sales 2025'. In the provided image: - The chart type is a bar chart, which matches the request. - The bars are color-coded (blue gradient), which is correct. - The title 'Monthly Sales 2025' is not present — it is missing. - The x-axis labels are incorrect: they show “12 2025”, “19K”, “25K”, which do not represent months Janua
local/nucleus-image-17b-a2b FAIL Evaluator (Qwen 3 VL)
The requested pie chart should show: - Company A: 35% - Company B: 25% - Company C: 20% - Others: 20% In the provided image: - The largest slice (35%) is labeled “A5%” and “35%” — this appears to be Company A, but the label “A5%” is incorrect and confusing. - The 25% slice is labeled “25% B%” — this is inconsistent and poorly formatted. - The 20% slice is labeled “20%” and “20% 100%” — the “100%” is incorrect and misleading. - The “Others” slice is labeled “Others 25%” — this contradicts the re
local/nucleus-image-17b-a2b FAIL Evaluator (Qwen 3 VL)
The requested visualization is a simple line graph showing daily website visitors over one week (Mon-Sun) with specific values: 1200, 1500, 1100, 1800, 2000, 1700, 1400. In the provided image: - The chart type is a line graph, which matches the request. - However, the x-axis labels are incorrect: “Mon”, “Sinp”, “T800”, “1400” — these are not readable day labels (e.g., “Tue”, “Wed”, etc.), and “Sinp” and “T800” are nonsensical. - The y-axis is labeled “Litn” (likely a misrendering of “Visitors”
Layout & Design 33% vs 100%
local/nucleus-image-17b-a2b PASS Evaluator (Qwen 3 VL)
The image features a large, centrally placed popcorn bucket, which fulfills the “large popcorn image in the center” requirement. The title “MOVIE NIGHT” is prominently displayed at the top in bold red text, matching the prompt. The tagline “Every Friday at 8 PM” is correctly placed at the bottom in white text and is fully readable. The visual hierarchy is appropriate: the title is bold and large at the top, the popcorn is the central focal point, and the tagline is clearly visible at the bottom
local/nucleus-image-17b-a2b PASS Evaluator (Qwen 3 VL)
All described layout zones/elements are present: - A guitar illustration is on the left side. - The headline “LIVE MUSIC FEST” is in large, bold text on the right, with clear visual hierarchy (largest element). - The tagline “Tickets available now” is correctly placed below the headline, with “now” on a separate line and underlined for emphasis. - All text is readable, correctly positioned, and matches the prompt’s description. VERDICT: PASS
local/nucleus-image-17b-a2b FAIL Evaluator (Qwen 3 VL)
The prompt specified three required elements: 1. An open book illustration — not present. 2. The title “JOIN THE BOOK CLUB” — present and prominently displayed at the top with correct visual hierarchy. 3. The tagline “Meets every Wednesday” — not present. While the title is correctly placed and readable, the other two required elements are missing. The absence of the open book illustration and the tagline means the poster does not fulfill the full layout specification. VERDICT: FAIL
local/nucleus-image-17b-a2b FAIL Evaluator (Qwen 3 VL)
The image shows a landing page with a blue header bar at the top, a central “Sign Up” headline, a paragraph of placeholder text, and two buttons (“Sign Up” and “SOCIALS”) below. However, the prompt specifically requested: 1. A hero header image at the top — not present; instead, there’s a blue navigation bar. 2. A headline and subtitle in the middle — the headline “Sign Up Uke” is present, but the subtitle is missing or illegible (the text below is gibberish placeholder). 3. A “Sign Up” call-to
local/nucleus-image-17b-a2b FAIL Evaluator (Qwen 3 VL)
The image does not meet the required layout specifications. - There is no banner image header — the top section is a title (“Nisirreset erki”) with no accompanying banner or image. - The two-column body section is present, with text on the left and an image on the right — but the image is a stylized drawing of a flower, not a “banner image” as specified. - There is no clearly defined “highlighted quote block” — the text blocks are not visually emphasized as quotes. - The footer contains an “Uns
local/nucleus-image-17b-a2b FAIL Evaluator (Qwen 3 VL)
The image depicts a restaurant menu with a header logo (“LEALI MENSI”), three category sections (Appetizers, Mains, Desserts), and a footer with address and hours. However, the text within all sections — including item names, prices, and footer information — is rendered as gibberish or nonsensical characters. The layout structure is visually present, but the content is unreadable and does not convey any meaningful information. The prompt required readable, correctly placed text for all describe
local/nucleus-image-17b-a2b PASS Evaluator (Qwen 3 VL)
The image shows a magazine double-page spread with the following layout elements: - Left page: Contains a large hero photo of a woman’s face, which dominates the page — this matches the prompt. - Right page: Contains a two-column article layout, a pull quote (in a distinct box near the top), a sidebar with what appears to be an author bio (in a box on the right side), and a page number (bottom right corner). Visual hierarchy is generally correct: the hero image on the left is prominent, the ar
local/nucleus-image-17b-a2b FAIL Evaluator (Qwen 3 VL)
The image shows a mobile app UI mockup that includes: - A top navigation bar with status indicators (time, signal, Wi-Fi, battery) — ✅ present. - A search field with placeholder text “Search” and a magnifying glass icon — ✅ present. - A 2x2 grid of feature cards with icons — ✅ present. - A “Recent Activity” list section — ✅ present. - A bottom tab bar with five icons — ✅ present. However, the text on the feature cards is illegible or incorrectly rendered (“Featur” instead of “Feature” on one c
local/nucleus-image-17b-a2b FAIL Evaluator (Qwen 3 VL)
The image shows an open magazine spread. The left page features a full-bleed photograph of a model, which matches the prompt’s description. The right page contains text arranged in multiple columns, a headline, body copy, and a smaller inset image near the bottom. Page numbers are visible at the bottom of both pages, styled appropriately. However, the text on both pages is not readable — it appears to be gibberish or placeholder text (possibly due to AI generation or encoding issues), and while
Style Diversity 67% vs 100%
local/nucleus-image-17b-a2b PASS Evaluator (Qwen 3 VL)
The image depicts a golden retriever sitting in a garden, and it is rendered in a style that strongly resembles an oil painting. Visible brushstrokes are present throughout the composition — especially in the grass, foliage, and the dog’s fur — giving the image texture and a painterly quality. The colors are rich and saturated, consistent with the description of “rich texture.” The overall effect mimics traditional oil painting techniques, including blending, impasto-like texture in some areas,
local/nucleus-image-17b-a2b PASS Evaluator (Qwen 3 VL)
The image depicts a golden retriever sitting in a garden, which matches the subject matter of the prompt. The style is clearly Japanese anime-inspired, featuring large, expressive eyes, simplified facial features, and a clean, stylized aesthetic. The color palette is flat with minimal shading, consistent with the requested “flat colors” characteristic. While there are some subtle gradients and soft shadows to suggest form, the overall look remains stylized and cartoonish, fitting the anime style
local/nucleus-image-17b-a2b PASS Evaluator (Qwen 3 VL)
The image depicts a golden retriever sitting in a garden, rendered in a pixel art style. The visual characteristics of pixel art are clearly present: the image is composed of distinct, blocky pixels, and the overall composition is defined by a visible grid overlay, which strongly suggests a 32x32 grid structure. The color palette is limited, using a few distinct shades to represent the dog, grass, flowers, and background foliage — consistent with the constraints of pixel art. However, the grid
local/nucleus-image-17b-a2b PASS Evaluator (Qwen 3 VL)
The image clearly depicts a city street scene rendered in the Art Deco style. Key characteristics are unmistakably present: - **Geometric shapes**: The buildings are composed of sharp angles, vertical lines, stepped forms, and stylized motifs — all hallmarks of Art Deco architecture. - **Gold and black palette**: The entire scene is rendered in a striking contrast of gold and black, which is a classic Art Deco color scheme. - **Ornamental symmetry**: The composition is highly symmetrical, with
local/nucleus-image-17b-a2b PASS Evaluator (Qwen 3 VL)
The image clearly embodies the requested Impressionist style. Key characteristics are present: - Soft, visible brushstrokes are evident throughout the composition, especially in the background foliage and the woman’s dress. - Dappled light is captured in the way sunlight filters through the trees, creating patches of light and shadow in the background. - A pastel palette is used consistently — soft pinks, yellows, greens, and blues — which is typical of Impressionism. - The focus on capturing a
local/nucleus-image-17b-a2b PASS Evaluator (Qwen 3 VL)
The image clearly embodies the requested Pop Art style. It features: - Bold, black outlines defining the figure and clothing. - Ben-Day dots (halftone patterns) visible in the background and subtly in the skin tones. - Flat, vivid primary colors: bright blue suit, red tie, yellow background, and orange skin tone. All key visual characteristics of the specified Pop Art style are present and unmistakable. VERDICT: PASS
local/nucleus-image-17b-a2b FAIL Evaluator (Qwen 3 VL)
The image depicts a forest scene rendered in a watercolor style, with visible paper texture, watercolor bleeds, and soft, diffused edges characteristic of watercolor painting. However, the prompt requested a “photorealistic watercolor” style — meaning it should combine the high detail and lighting realism of a photograph with the watercolor medium. While the image has watercolor qualities, it lacks photorealistic detail: the trees and foliage are stylized, not sharply defined, and the lighting
local/nucleus-image-17b-a2b PASS Evaluator (Qwen 3 VL)
The image successfully embodies the requested “cyberpunk Art Nouveau” style. Key characteristics are clearly present: - **Neon colors**: The scene is saturated with glowing neon blues, pinks, and purples, typical of cyberpunk aesthetics. - **Tech elements**: The woman’s outfit features glowing circuitry, mechanical details, and futuristic design, blending technology with the human form. - **Flowing organic lines**: The Art Nouveau influence is evident in the curvilinear, decorative framing with
local/nucleus-image-17b-a2b FAIL Evaluator (Qwen 3 VL)
The image attempts to merge a medieval castle with ukiyo-e woodblock print aesthetics, but the result is visually inconsistent and fails to meet the prompt’s requirements. Key issues: - The structure is not a medieval European castle; it is a Japanese castle (yamashiro-style), which is architecturally incompatible with the “medieval castle” request. - The ukiyo-e style is present: flat planes of color, bold outlines, stylized clouds, and vertical composition — but these are applied to a Japanes
local/nucleus-image-17b-a2b FAIL Evaluator (Qwen 3 VL)
The image presents an interior scene with geometric, block-like forms in soft pastel colors (pink, yellow, mint, peach), which aligns with the “rounded pastel furniture” and “warm diffused light” aspects of the prompt. The surfaces appear smooth and minimalist, lacking the raw, textured concrete typically associated with brutalism. While the composition has a clean, modern aesthetic with strong geometric shapes, the “raw concrete textures” element is notably absent — the materials look painted o
local/nucleus-image-17b-a2b FAIL Evaluator (Qwen 3 VL)
The requested style — “anxious minimalism” — emphasizes sparse composition, uncomfortable negative space, muted colors, and subtle visual tension. The image presents a nearly monochromatic, minimalist composition with two horizontal planes of slightly different gray tones. While this achieves a sparse, minimalist aesthetic and uses muted colors, it lacks any visual tension or subject matter to evoke anxiety. The “uncomfortable negative space” is not meaningfully present — the space is empty but
local/nucleus-image-17b-a2b PASS Evaluator (Qwen 3 VL)
The image clearly embodies the requested “nostalgic futurism — retro-futuristic 1960s space-age aesthetics with a melancholic warm-toned patina.” Key visual characteristics are present: - **Retro-futuristic architecture**: The cityscape features sleek, tapered towers, flying saucers as both vehicles and architectural elements, and streamlined, chrome-like surfaces — all hallmarks of 1960s sci-fi design. - **Warm-toned patina**: The entire scene is bathed in a sepia-toned, golden-orange glow, ev