ImageBench V1
192 evaluations across 6 categories

Prompt: The word 'CHAPTER ONE' typed on aged paper with a vintage typewriter font, complete with slightly uneven ink
Text Rendering 93%
Typography Style 100%
PASS (Evaluator: Qwen 3 VL)
Text reads “CHAPTER ONE” correctly. Font mimics vintage typewriter style with blocky, slightly uneven ink strokes and subtle smudges. Paper texture appears aged and slightly creased. Ink variation and character spacing convincingly simulate typewriter output. All elements align with the prompt’s description. VERDICT: PASS
PASS (Evaluator: Qwen 3 VL)
Text reads "TODAY'S SPECIAL" correctly. Letters show white chalk texture with rough, uneven edges and smudges, convincingly mimicking chalk on a dark green board. Chalk dust effect is subtly implied in texture. Background is a plain chalkboard with minor wear. All elements align with the prompt. VERDICT: PASS
PASS (Evaluator: Qwen 3 VL)
The word "OPEN" is correctly spelled and rendered in a glowing, pixel-style LED matrix with distinct red light-emitting diodes. The style convincingly matches the described medium, showing individual illuminated dots forming each letter. The text is clear, properly formed, and visually consistent with an LED sign. VERDICT: PASS
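Each evaluator response in this listing ends with a `VERDICT: PASS` or `VERDICT: FAIL` marker, which is presumably where the PASS/FAIL label comes from. The following is a minimal sketch of how such a marker could be extracted; the function name `parse_verdict` and the rule of taking the last marker in the text are assumptions for illustration, not the benchmark's documented parser. Responses that are cut off before the marker yield `None`.

```python
import re
from typing import Optional

# Assumed convention: the verdict is the last "VERDICT: PASS|FAIL" marker in the text.
_VERDICT_RE = re.compile(r"VERDICT:\s*(PASS|FAIL)\b")


def parse_verdict(evaluator_text: str) -> Optional[str]:
    """Return "PASS" or "FAIL", or None if the text has no verdict marker."""
    matches = _VERDICT_RE.findall(evaluator_text)
    return matches[-1] if matches else None


# Excerpts copied from evaluator responses in this listing.
complete = "Text matches the intended phrase precisely. VERDICT: PASS"
truncated = "The dinosaur is holding a coffee mug and wearing a"

print(parse_verdict(complete))
print(parse_verdict(truncated))
```

Returning `None` instead of guessing keeps truncated responses distinguishable from genuine failures.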
Writing accuracy 92%
PASS (Evaluator: Qwen 3 VL)
The word "HELLO" is fully spelled correctly and is clearly legible in bold, uppercase letters. No misspellings, missing letters, or illegible parts are present. The text is centered on a white background as intended. All criteria are met. VERDICT: PASS
PASS (Evaluator: Qwen 3 VL)
The word "ADVENTURE" is spelled correctly and fully readable in bold, centered text on a plain cream background. All letters are distinct and legible. No misspellings or missing parts. The design matches the intended specification exactly. VERDICT: PASS
PASS (Evaluator: Qwen 3 VL)
The word "SERENITY" is spelled correctly and fully readable, centered on a solid light blue background. All letters are distinct and legible. No misspellings or missing words. The text matches the intended specification exactly. VERDICT: PASS
PASS (Evaluator: Qwen 3 VL)
Text reads exactly: "The quick brown fox jumps high". All words are spelled correctly and fully legible against the white background. No misspellings or missing words detected. Text matches the intended phrase precisely. VERDICT: PASS
PASS (Evaluator: Qwen 3 VL)
Text reads: "Every sunset brings a new dawn ahead". All words are spelled correctly and fully legible against the pale yellow background. No misspellings or missing words detected. The phrase matches the intended text exactly. VERDICT: PASS
PASS (Evaluator: Qwen 3 VL)
Text reads exactly: "Stars shine brightest in the dark night". All words are spelled correctly, fully present, and clearly legible against the off-white background. No misspellings or missing words detected. VERDICT: PASS
PASS (Evaluator: Qwen 3 VL)
All text matches the intended content exactly. "SUMMER SALE", "Up to 50% off all items this weekend only", and "Terms and conditions apply. See store for details." are all spelled correctly and fully readable. No misspellings or illegible parts detected. VERDICT: PASS
PASS (Evaluator: Qwen 3 VL)
All text matches the intended content exactly: “GRAND OPENING”, “Join us for food, music, and fun”, and “RSVP required. Limited seating available.”. Every word is spelled correctly and fully readable, with no missing or illegible parts. Font sizes and hierarchy are appropriate. VERDICT: PASS
PASS (Evaluator: Qwen 3 VL)
All text matches the intended content exactly: "ANNUAL GALA 2025", "An evening of fine dining and live jazz", and "Black tie optional. Doors open at 6 PM.". No misspellings, missing words, or illegible parts are present. Font sizes and hierarchy are appropriate. VERDICT: PASS
FAIL (Evaluator: Qwen 3 VL)
Text contains multiple errors: “Elena Voss” → “Elenca Voxxs”, “Marcus Chen” → “vazcus Chen, Chen,”, “Thomas Grant” → “Thotmes Grant”, “Paramount Pictures Release” → “Plasyoc Reslanes”, “Music by Thomas Grant” → “Musit by Thotmes Grant”, “Coming Summer 2025” → “Coming Summer 2025.” Several names and credits are garbled or missing. VERDICT: FAIL
PASS (Evaluator: Qwen 3 VL)
All text matches the intended content exactly: "NATIONAL EXPLORER", "The Hidden Temples of Cambodia — page 34", "Interview: Dr. Maria Santos on Climate Action", "Top 10 Trails You Must Hike This Year", and "Issue 247 | March 2025 | $6.99". All words are spelled correctly and fully readable. No misspellings or illegible parts detected. VERDICT: PASS
PASS (Evaluator: Qwen 3 VL)
All text matches the intended content exactly. Spelling, punctuation, and wording are correct and fully readable. No misspellings, missing words, or illegible parts detected. The layout and typography are clear and consistent with the specification. VERDICT: PASS
Spatial Reasoning 77%
Attributes Binding 89%
PASS (Evaluator: Qwen 3.5 122B)
The image shows two objects on a white surface: a red sphere (ball) and a blue cube. The red ball is on the left, the blue cube on the right — matching the prompt’s description. Colors, shapes, and spatial arrangement are correctly bound with no swaps. Size appears comparable and appropriate for “side by side” placement. VERDICT: PASS
PASS (Evaluator: Qwen 3.5 122B)
The image shows a green apple and an orange pumpkin placed on a wooden cutting board. The apple is correctly colored green and has the typical round shape of an apple. The pumpkin is correctly colored orange and has the characteristic ribbed, rounded shape of a pumpkin. Both objects are appropriately sized relative to each other and the board. No attributes are swapped or misassigned. VERDICT: PASS
PASS (Evaluator: Qwen 3.5 122B)
The image shows a yellow taxi on the left and a black limousine on the right, parked next to each other on a street. The taxi is correctly colored yellow and has taxi markings. The limousine is correctly colored black and has the elongated shape of a limousine. All attributes (color, size, shape) are correctly bound to the right objects with no swaps. VERDICT: PASS
PASS (Evaluator: Qwen 3.5 122B)
The image contains three distinct geometric shapes on a white background. The large red circle is present in the upper left. The small blue triangle is located on the right side. The medium green star is positioned at the bottom center. All colors, sizes, and shapes are correctly bound to their respective objects as described in the prompt. VERDICT: PASS
PASS (Evaluator: Qwen 3.5 122B)
The image shows three candles on a shelf. From left to right: 1. Tall, thin, white candle — matches description. 2. Short, fat, red candle — matches description. 3. Medium-height, spiral-shaped, yellow candle — matches description. All attributes (color, size, shape) are correctly bound to the right objects with no swaps. VERDICT: PASS
PASS (Evaluator: Qwen 3.5 122B)
The image shows a large brown horse with black lower legs and tail, standing in a grassy meadow. In front of it are a small white rabbit and a medium-sized black cat sitting on the grass. All objects match their described attributes: horse is big and brown, rabbit is small and white, cat is medium and black. No attribute swaps or misidentifications observed. VERDICT: PASS
PASS (Evaluator: Qwen 3.5 122B)
The image contains: - A pink fire truck (correct color and object) - A blue banana (correct color and object) - A green sun (correct color and object) All attributes are correctly bound to their respective objects with no swaps. The surreal landscape is also present with purple hills and other whimsical elements. VERDICT: PASS
FAIL (Evaluator: Qwen 3.5 122B)
The image contains a purple carrot with green leaves, correctly matching the prompt. The tree has red leaves and a brown trunk, also matching. However, the watermelon is not white — it’s green on the outside and pink/red inside, which contradicts the “white watermelon” requirement. Additionally, there’s a small orange fruit on top of the watermelon slice, which wasn’t requested. Attributes are mostly bound correctly except for the watermelon color. VERDICT: FAIL
PASS (Evaluator: Qwen 3.5 122B)
The image shows three objects on a blue surface: 1. A yellow penguin with black feet and beak, white chest — matches “yellow penguin” 2. An orange elephant with raised trunk — matches “orange elephant” 3. A silver strawberry with textured surface and green leaves — matches “silver strawberry” All attributes (color, shape, object type) are correctly bound. No swaps or mismatches observed. VERDICT: PASS
Compositionality 100%
PASS (Evaluator: Qwen 3.5 122B)
The image contains all required elements: - Tabby cat: Present, with clear tabby markings and typical feline features - Oversized red sunglasses: Present, large red-framed glasses covering the cat's eyes - Coherent combination: The sunglasses fit naturally on the cat's face without distortion All described elements are present and properly combined in a realistic manner. VERDICT: PASS
PASS (Evaluator: Qwen 3.5 122B)
The image contains all required elements: 1. Golden retriever - Present (clearly visible, golden fur, correct breed features) 2. Tiny cowboy hat - Present (small brown hat with characteristic brim and crown) 3. Coherent combination - Present (hat sits naturally on dog's head, proportions are appropriate) All described elements are accurately represented and logically combined in the image. VERDICT: PASS
PASS (Evaluator: Qwen 3.5 122B)
The image shows a penguin standing on snow/ice. The penguin is holding a colorful umbrella (red, blue, yellow panels) with its flipper. All described elements — penguin, colorful umbrella, and the act of holding it — are present and coherently combined. No contradictions or missing components. VERDICT: PASS
PASS (Evaluator: Qwen 3.5 122B)
The image shows a medieval knight in full armor sitting at a wooden desk, typing on a modern laptop. All described elements are present and coherently combined: - Medieval knight in full armor: Present - Sitting at a desk: Present - Typing on a modern laptop: Present The scene is well-integrated, with the knight’s hands positioned as if typing on the laptop, and the setting (stone walls, chandelier) complements the medieval theme. VERDICT: PASS
PASS (Evaluator: Qwen 3.5 122B)
The image contains all required elements: 1. Roman gladiator - Present (wearing armor, skirt, greaves) 2. Selfie with smartphone - Present (holding phone up in selfie pose) 3. Colosseum - Present (ancient Roman amphitheater structure visible) All three core concepts are clearly depicted and coherently combined in a single scene. The anachronistic combination of ancient gladiator with modern smartphone is intentionally portrayed as requested. VERDICT: PASS
PASS (Evaluator: Qwen 3.5 122B)
The image contains all required elements: 1. Astronaut in a spacesuit - PRESENT (full white suit with helmet, American flag patch, and backpack) 2. Riding a bicycle - PRESENT (astronaut is seated on and pedaling a bicycle) 3. Through a park - PRESENT (green grass, trees, paved path, and park setting visible) All elements are coherently combined in a single scene. VERDICT: PASS
PASS (Evaluator: Qwen 3.5 122B)
The image contains all described elements: - Octopus: Present (center, orange) - Top hat: Present (on octopus head) - Monocle: Present (on octopus eye) - Chessboard with pieces: Present (in front of octopus) - Moon surface: Present (gray, cratered terrain) - Earth in background: Present (large, blue/white planet) All elements are coherently combined in a surreal but visually consistent composition. VERDICT: PASS
PASS (Evaluator: Qwen 3.5 122B)
Analysis of elements: 1. Dinosaur: Present (T-Rex). 2. Business suit: Present (grey suit, tie, shirt). 3. Giving a PowerPoint presentation: Partially present. There is a screen showing a chart titled "Q3 GROWTH PROJECTED", but the dinosaur is not actively presenting (e.g., pointing or gesturing at it); he is standing away from it. 4. Modern office: Present (glass walls, city view, wood floors). 5. Drinking coffee: Present (holding a white mug). The dinosaur is holding a coffee mug and wearing a
PASS (Evaluator: Qwen 3.5 122B)
The image contains all described elements: - Robot dressed as a pirate: Present (metallic robot with pirate hat and sash) - Paper boat: Present (white origami-style boat) - Lake of lava: Present (glowing red-orange molten surface) - Three kittens: Present (orange, black-and-white, and gray tabby kittens appearing to float/juggle around the robot) All elements are coherently combined in a surreal but visually consistent composition. VERDICT: PASS
Counting 22%
PASS (Evaluator: Qwen 3.5 122B)
The image displays exactly three red apples positioned on a wooden surface. The count matches the requested quantity precisely. The apples are arranged in a triangular formation, and the wooden texture is consistent with a table. No extra or missing objects are present. VERDICT: PASS
FAIL (Evaluator: Qwen 3.5 122B)
The image shows six yellow rubber ducks arranged in a row on a white shelf. The prompt requested exactly five ducks. Since there are six visible ducks (including one partially cut off on the far left), the count does not match the requested number. VERDICT: FAIL
FAIL (Evaluator: Qwen 3.5 122B)
The image shows exactly 4 white candles, which matches the requested count. However, they are standing on a transparent/checkered background (not a dark surface), and there is no visible surface — only reflections suggesting one. The core object count is correct, but the surface condition fails the prompt’s requirement. VERDICT: FAIL
PASS (Evaluator: Qwen 3.5 122B)
The image shows 2 orange cats and 3 black dogs sitting together on a grassy lawn. The count exactly matches the requested objects: two orange cats (one on each end) and three black dogs (in the center). All animals are present and correctly identified. VERDICT: PASS
FAIL (Evaluator: Qwen 3.5 122B)
The image shows a bouquet with 6 red roses and at least 3 white daisies (possibly more partially hidden). The prompt requested exactly 4 red roses and 2 white daisies. The counts do not match — there are more roses and daisies than specified. Therefore, the core requirement of exact object count is not met. VERDICT: FAIL
FAIL (Evaluator: Qwen 3.5 122B)
The image shows 1 blue butterfly on the top rail. Below, there is 1 yellow/black swallowtail, 1 yellow butterfly, and 2 pale yellow/green butterflies on the lower rail. This totals 1 blue and 4 yellow butterflies. The prompt requested 1 blue and 3 yellow butterflies. The count does not exactly match. VERDICT: FAIL
FAIL (Evaluator: Qwen 3.5 122B)
The image shows a round fishbowl with water and green plants. Counting the goldfish: there are 10 visible fish (not 9). The prompt requested exactly 9 goldfish, so the count does not match. Even though the bowl is large and round, the fish count is incorrect. VERDICT: FAIL
FAIL (Evaluator: Qwen 3.5 122B)
The image shows a fruit bowl with 3 bananas, 3 oranges, and 2 apples. The prompt requested exactly 3 bananas, 4 oranges, and 2 apples. While the banana and apple counts are correct, the orange count is off by one (3 instead of 4). Therefore, the total object count does not exactly match the request. VERDICT: FAIL
FAIL (Evaluator: Qwen 3.5 122B)
The image contains 20 colorful balloons floating against a clear blue sky. The prompt requested exactly 11 balloons, but the actual count is nearly double that. Since the number of objects does not match the specified quantity, the core criterion is not met. VERDICT: FAIL
Negation 56%
PASS (Evaluator: Qwen 3.5 122B)
The image depicts a serene tropical beach with white sand, turquoise water, and palm trees under a clear blue sky. There are no visible people in the scene — no individuals on the beach, in the water, or in the background foliage. The exclusion criterion (“no people”) is fully satisfied. VERDICT: PASS
FAIL (Evaluator: Qwen 3.5 122B)
The image shows a tabby cat lying on a beige couch. Upon close inspection of the cat’s neck, there is a thin, dark band visible — consistent with a collar. Since the prompt explicitly excluded “a cat lying on a couch without a collar,” and the cat in the image appears to be wearing one, the excluded element is present. VERDICT: FAIL
PASS (Evaluator: Qwen 3.5 122B)
The image shows an empty classroom with wooden desks and chairs arranged in rows. A large chalkboard is mounted on the front wall with "WELCOME BACK TO SCHOOL" written on it. There are maps on the right wall, windows on the left letting in sunlight, and a speaker mounted high on the wall. No students are visible anywhere in the scene. VERDICT: PASS
PASS (Evaluator: Qwen 3.5 122B)
The image shows a sandy beach with several palm trees, a pile of large seashells at the base of the trees, and a yellow-and-white striped beach umbrella on the left. The background consists of more sand and a clear sky — there is no visible water, ocean, or sea horizon line. The excluded element (water/ocean) is truly absent. VERDICT: PASS
PASS (Evaluator: Qwen 3.5 122B)
The image shows a traditional office workspace featuring a wooden desk, a leather chair, a tall bookshelf filled with books, and a green banker’s lamp. There are no computers, monitors, tablets, or any electronic screens visible in the scene. The excluded elements are fully absent. VERDICT: PASS
PASS (Evaluator: Qwen 3.5 122B)
The image shows a kitchen with pots, pans, a stove, and a cutting board. The excluded element — “no food” — is correctly absent: there are no visible food items on the counter, stove, or cutting board. Bottles of oil/vinegar, spice jars, and kitchen tools are present, but none constitute food. The scene adheres to the constraint. VERDICT: PASS
FAIL (Evaluator: Qwen 3.5 122B)
The image shows a pepperoni pizza with tomato sauce, basil leaves, and a crust. However, there is clearly melted white cheese visible beneath and around the pepperoni slices and basil leaves. The prompt explicitly excluded cheese, but cheese is present in the image. VERDICT: FAIL
FAIL (Evaluator: Qwen 3.5 122B)
The image shows a gray sedan with windows, doors, and headlights clearly visible. However, the wheels are not fully absent — both front and rear wheel wells expose brake rotors and suspension components, indicating wheels have been removed. Since the prompt explicitly excluded “no wheels,” and the image still shows wheel assemblies (rotors, hubs), the excluded element is partially present. The core intent of “no wheels” is violated. VERDICT: FAIL
FAIL (Evaluator: Qwen 3.5 122B)
The image shows a close-up of a human face with clearly visible eyes, nose, and mouth. Importantly, the person has well-defined eyebrows above each eye — they are dark, shaped, and natural-looking. Since the prompt explicitly excluded eyebrows (“no eyebrows”), and they are present in the image, the core criterion is violated. VERDICT: FAIL
Relative Position 92%
PASS (Evaluator: Qwen 3.5 122B)
The image shows an orange tabby cat sitting directly on top of a brown cardboard box. The box is placed on a wooden floor, and the cat is centered on the box’s upper surface. All spatial relationships described in the prompt — specifically “on top of” — are accurately represented. No positional errors are present. VERDICT: PASS
PASS (Evaluator: Qwen 3.5 122B)
The image shows a red ball positioned on the floor directly beneath the seat of a wooden chair. The ball is centered between the front legs of the chair, clearly “underneath” it as specified. All spatial relationships described in the prompt are accurately represented. VERDICT: PASS
PASS (Evaluator: Qwen 3.5 122B)
The image shows a white coffee mug filled with dark liquid, positioned to the right of an open book on a wooden desk. The book is open and lies flat on the desk surface. The spatial relationship described — “a coffee mug next to an open book on a desk” — is accurately represented: the mug is adjacent (to the right) of the book, and both rest on the desk. No positional errors are present. VERDICT: PASS
PASS (Evaluator: Qwen 3.5 122B)
The image shows a bird perched on a branch that extends from a tree. The branch is positioned above and slightly in front of a park bench, which sits on grass below. The spatial relationships described — bird “on” branch, branch “next to” bench (with implied proximity and adjacency), and bench “under” the branch — are all visually accurate. No contradictions exist. VERDICT: PASS
PASS (Evaluator: Qwen 3.5 122B)
The image shows an orange tabby cat sitting directly on a floral-patterned pillow. The pillow is placed on the seat of a wooden chair. All spatial relationships described in the prompt — “cat on pillow,” “pillow on chair” — are accurately represented. The cat is above the pillow, the pillow is above the chair seat, and no conflicting left/right/under/above relationships are violated. VERDICT: PASS
PASS (Evaluator: Qwen 3.5 122B)
The image shows a candle placed directly on top of a stack of three books. To the right of the books is a potted plant in a terracotta pot. All specified spatial relationships — “candle on top of books” and “books next to potted plant” — are accurately represented. The arrangement matches the prompt’s description precisely. VERDICT: PASS
PASS (Evaluator: Qwen 3.5 122B)
The image shows five light-colored wooden chess pieces on a checkered board. The king is centrally located. To its left is a queen. To its right is a bishop. Above the king (in the background) is a rook. Below the king (in the foreground) is a knight. All described spatial relationships — left, right, above, below, and center — are accurately represented in the image. VERDICT: PASS
PASS (Evaluator: Qwen 3.5 122B)
The image shows a top-down view of a desk with a laptop centered. To the left of the laptop is a coffee mug (and a small container of coffee beans). To the right is a smartphone. Above the laptop is a black desk lamp. Below the laptop is a dark notebook. All specified spatial relationships — left, right, above, below, and center — are accurately represented. VERDICT: PASS
PASS (Evaluator: Qwen 3.5 122B)
The image shows five animals in a field arranged left to right: a brown horse, a black-and-white cow, a white sheep, a light-brown goat, and a sitting German Shepherd dog. The spatial relationships described — horse on far left, cow to its right, sheep in center, goat to sheep’s right, dog on far right — are all accurately reflected in the image. All positional descriptors match the visual layout. VERDICT: PASS
PASS (Evaluator: Qwen 3.5 122B)
The image shows a red barn in the foreground at large scale, a windmill slightly behind and to the left (midground, medium scale), and snow-capped mountains in the background at smaller scale — consistent with depth perspective. All positional relationships (foreground/midground/background) are correctly rendered. No contradictions in spatial layout. VERDICT: PASS
PASS (Evaluator: Qwen 3.5 122B)
The image shows a beach scene with a detailed sandcastle in the foreground, two surfers riding waves in the midground, and a cruise ship on the horizon in the background. The spatial arrangement matches the prompt: sandcastle closest, surfers further back on waves, ship farthest away. Sizes decrease appropriately with distance. All positional relationships are correctly represented. VERDICT: PASS
FAIL (Evaluator: Qwen 3.5 122B)
The image shows a city park with a bench and pigeons in the foreground, a fountain in the midground, and skyscrapers in the background — matching the prompt’s spatial layering. However, the prompt specifies “each layer progressively smaller,” which is not visually accurate: the fountain appears larger than the distant skyscrapers due to perspective, but the skyscrapers themselves are massive structures that would be larger than the fountain if viewed at same scale. The phrase “progressively smal
Scale & Proportions 100%
PASS (Evaluator: Qwen 3.5 122B)
The image shows a young boy standing next to a large yellow school bus. The relative scale appears realistic: the bus is significantly larger than the child, with its front wheel alone being nearly as tall as the boy. The perspective and proportions of both the person and the vehicle are consistent with real-world expectations. VERDICT: PASS
PASS (Evaluator: Qwen 3.5 122B)
The ladybug appears appropriately sized relative to the leaf surface it rests on, consistent with real-world proportions. The leaf’s vein structure and texture further support realistic scaling. No exaggerated or distorted size relationships are evident. VERDICT: PASS
PASS (Evaluator: Qwen 3.5 122B)
The child’s hands are holding the basketball, and the ball appears appropriately sized relative to the child’s torso and arms — consistent with a real-world basketball held by a young boy. No anatomical or scale distortions are visible. The proportions are natural and physically plausible. VERDICT: PASS
PASS (Evaluator: Qwen 3.5 122B)
The image shows a tiny person standing on the rim of a coffee cup, looking down into it. The person is appropriately scaled relative to the cup — small enough to be “tiny” in context, yet clearly visible and proportionally consistent with the cup’s dimensions. The perspective and lighting support the surreal scale without visual contradictions. Core size relationship is accurately represented. VERDICT: PASS
PASS (Evaluator: Qwen 3.5 122B)
The image shows a tabby cat that is disproportionately large compared to the surrounding skyscrapers, fitting the "giant" and "kaiju" description. The cat towers over the buildings, with its head reaching near the top of the frame while its paws are positioned at street level. The perspective and scale are intentionally exaggerated to match the prompt's requirement for a giant creature among city structures. VERDICT: PASS
PASS (Evaluator: Qwen 3.5 122B)
The image shows a small wooden house model resting in the palm of a human hand. The house is appropriately sized to fit comfortably within the palm, with its dimensions appearing consistent with a miniature or toy-scale object relative to the hand’s anatomy. The proportions between the house and the hand are realistic and visually coherent. VERDICT: PASS
PASS (Evaluator: Qwen 3.5 122B)
The image displays an elephant, horse, dog, cat, and mouse arranged in a line from largest to smallest. The elephant is the largest, followed by the horse, then the dog, cat, and finally the mouse as the smallest. The relative sizes are consistent with real-world proportions. VERDICT: PASS
PASS (Evaluator: Qwen 3.5 122B)
The image displays a top-down view of a table setting. The plate is centrally located with a fork to its left and a knife to its right. A salt shaker is positioned above the fork, and a wine glass is above the knife. Scattered peppercorns are near the top center. All objects appear to be at realistic relative sizes for a dinner table scene. The proportions between the plate, cutlery, glass, and shaker are consistent with real-world expectations. VERDICT: PASS
PASS (Evaluator: Qwen 3.5 122B)
The image displays a farm scene with a red barn, green tractor, farmer, cows, chickens, and fence posts. The relative sizes appear consistent: the barn is largest, followed by the tractor and cows, then the farmer, with chickens being smallest. Fence posts are appropriately scaled relative to the animals and structures. No obvious distortions or scale inconsistencies are visible. VERDICT: PASS
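The Spatial Reasoning percentages can be checked against the PASS/FAIL labels listed in this section. The sketch below assumes each score is a pass rate rounded to the nearest whole percent, and that the category score pools all evaluations rather than averaging the subcategory rates; the source does not state the aggregation rule, but pooling reproduces the displayed 77% while averaging does not. The variable names are illustrative only.

```python
# (passes, total) per subcategory, counted from the PASS/FAIL labels in this section.
spatial = {
    "Attributes Binding": (8, 9),
    "Compositionality": (9, 9),
    "Counting": (2, 9),
    "Negation": (5, 9),
    "Relative Position": (11, 12),
    "Scale & Proportions": (9, 9),
}


def pct(passes: int, total: int) -> int:
    """Pass rate as a whole-number percentage."""
    return round(100 * passes / total)


sub_scores = {name: pct(p, t) for name, (p, t) in spatial.items()}

# Pooled: every evaluation counts equally (44 passes out of 57).
total_passes = sum(p for p, _ in spatial.values())
total_evals = sum(t for _, t in spatial.values())
pooled = pct(total_passes, total_evals)

# Macro: unweighted mean of the six subcategory rates.
macro = round(sum(100 * p / t for p, t in spatial.values()) / len(spatial))

print(sub_scores)
print("pooled:", pooled, "macro:", macro)  # pooled: 77 macro: 76
```

The two rules differ here because Relative Position has 12 evaluations while the other subcategories have 9, so it carries more weight when pooled.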
Human realism 86%
Faces & Expressions 100%
PASS (Evaluator: Qwen 3 VL)
Face is symmetric with natural features, no distortions. Warm, genuine smile clearly conveys friendliness. Lighting and composition enhance the natural look. All elements align with the prompt’s request for a young woman with a warm natural smile. VERDICT: PASS
PASS (Evaluator: Qwen 3 VL)
Face is symmetric with natural, correctly proportioned features. Expression is calm and serious, with neutral lips and steady gaze. No visible distortions or artifacts. Lighting and texture appear realistic. Meets all criteria for the prompt. VERDICT: PASS
PASS (Evaluator: Qwen 3 VL)
Face is symmetric with natural, correctly proportioned features. Expression is neutral and relaxed, matching the prompt. No visible distortions or artifacts. Lighting and focus are professional. The teenager’s gaze is steady and composed. VERDICT: PASS
PASS (Evaluator: Qwen 3 VL)
Face is symmetric with natural features. Disgust is clearly conveyed: nose wrinkled, lips curled, brows furrowed. No significant distortions. Skin texture and lighting are realistic. Emotion is unambiguous and well-executed. VERDICT: PASS
PASS (Evaluator: Qwen 3 VL)
Face is symmetric with natural features. Eyes are wide, eyebrows raised, mouth slightly open — clearly conveying genuine surprise. No visible distortions. Skin texture and hair look authentic. All key elements of the prompt are accurately rendered. VERDICT: PASS
PASS (Evaluator: Qwen 3 VL)
Face is symmetric with natural, correctly rendered features. Eyes are downcast, lips slightly drawn, gaze distant — clearly conveying quiet melancholy. No visible distortions. Lighting and texture are realistic. Emotion is unmistakable and aligns perfectly with the prompt. VERDICT: PASS
PASS (Evaluator: Qwen 3 VL)
The face is symmetric with natural, aged features: deep wrinkles, white hair, and warm, kind eyes. The smile is clearly readable and genuine, conveying warmth. No significant distortions are present. Facial structure and proportions are realistic and accurate. VERDICT: PASS
PASS (Evaluator: Qwen 3 VL)
The child’s face is nearly symmetrical with natural features: round cheeks, large eyes, and a tear streaming down the right cheek. The crying expression is clearly readable through furrowed brows, downturned mouth, and tear. No significant distortions are present. The image accurately fulfills the prompt’s requirements. VERDICT: PASS
PASS (Evaluator: Qwen 3 VL)
Face is symmetric with natural wrinkles, age spots, and thinning gray hair. Reading glasses are present. Skeptical expression is clearly readable via furrowed brow and downturned lips. No significant distortions. Features align well with prompt. VERDICT: PASS
PASS (Evaluator: Qwen 3 VL)
Faces are symmetric with correct features. Emotions are clearly readable: left woman laughs with head back, center man looks shocked, right woman smirks while rolling eyes. No distortions. All three expressions match the prompt. VERDICT: PASS
PASS (Evaluator: Qwen 3 VL)
Faces are symmetric with correct anatomical features. Emotions are clearly readable: father beams, mother smiles gently, teenager looks bored, child grins excitedly. No distortions or unnatural artifacts. All four individuals match the described expressions and roles. VERDICT: PASS
PASS (Evaluator: Qwen 3 VL)
Faces are symmetric with correct anatomical features. Emotions are clearly readable: left man shows frustration (furrowed brows, clenched fist), center woman shows pleased (smiling, nodding), right man shows confusion (tilted head, raised eyebrows). No distortions or artifacts. All three expressions match the prompt accurately. VERDICT: PASS
Full Body 100%
PASS (Evaluator: Qwen 3 VL)
The woman stands upright with arms at her sides, wearing jeans and a white t-shirt as described. Limb count and proportions are anatomically correct. Pose is relaxed and recognizable. No distortions or anomalies are visible. All elements match the prompt precisely. VERDICT: PASS
PASS (Evaluator: Qwen 3 VL)
The man stands straight, facing forward, with hands in pockets. Limb count and proportions are anatomically correct. The pose is clearly recognizable as described. No distortions are visible. All elements match the prompt precisely. VERDICT: PASS
PASS (Evaluator: Qwen 3 VL)
The person’s full body is visible from head to feet in a neutral standing pose. Limb count and proportions are anatomically correct. The pose is recognizable as standing still in a park. No distortions are apparent. All elements align with the prompt. VERDICT: PASS
PASS - Evaluator (Qwen 3 VL)
The woman’s limbs are correctly proportioned and anatomically accurate. Her running pose is dynamic and recognizable, with arms bent and legs in mid-stride. Hair and clothing show motion blur consistent with running. No distortions or anatomical errors are visible. The scene matches the prompt’s description. VERDICT: PASS
PASS - Evaluator (Qwen 3 VL)
The man’s limbs are correctly counted and proportioned. His bent-knee jump over the puddle is anatomically plausible and recognizable. No distortions are visible. The pose matches the prompt accurately. VERDICT: PASS
PASS - Evaluator (Qwen 3 VL)
The dancer’s limbs are correctly proportioned and anatomically plausible. The pose captures a dynamic spin with one leg extended and the dress flowing outward as described. No distortions in limb count or proportions are visible. The action is clearly recognizable as a spinning motion. VERDICT: PASS
PASS - Evaluator (Qwen 3 VL)
The person stands on one leg, with the other foot pressed against the inner thigh, arms raised overhead — accurately depicting the tree pose. Limb count and proportions are correct. No anatomical distortions are visible. The pose is clearly recognizable and executed with proper form. VERDICT: PASS
PASS - Evaluator (Qwen 3 VL)
The climber has correct limb count and proportions. His arms and legs are extended to different holds, matching the prompt. The pose is anatomically plausible for rock climbing, showing tension and balance. No significant distortions are visible. The action is clearly recognizable as dynamic bouldering or sport climbing. VERDICT: PASS
PASS - Evaluator (Qwen 3 VL)
The cellist’s posture and limb count are anatomically correct. Legs are wrapped around the instrument, bow arm is extended, and the cello rests on a stand. Proportions are natural, and the playing pose is clearly recognizable. No distortions are visible. VERDICT: PASS
PASS - Evaluator (Qwen 3 VL)
The dancers’ limbs are correctly counted and proportioned. The man dips the woman backward with their legs intertwined, matching the prompt. The pose is anatomically plausible and recognizable as a tango move. No distortions are visible. VERDICT: PASS
PASS - Evaluator (Qwen 3 VL)
Both players show correct limb count and proportions. The sliding player’s pose is anatomically plausible, and the jumping player’s posture is dynamic and recognizable as a tackle evasion. The ball is at the jumping player’s feet as described. No significant distortions are visible. The action matches the prompt accurately. VERDICT: PASS
PASS - Evaluator (Qwen 3 VL)
Both gymnasts show correct limb count and proportions. Their handstands are anatomically plausible, though slight asymmetry exists—right gymnast’s legs are slightly more spread. Pose is recognizable as synchronized handstands. Minor distortions: one gymnast’s head is slightly tilted, and limbs appear subtly elongated. Overall, the core requirements are met. VERDICT: PASS
Hands 58%
FAIL - Evaluator (Qwen 3.5 122B)
The left hand (viewer’s right) shows 5 fingers, all appearing anatomically correct and relaxed. The right hand (viewer’s left) also displays 5 fingers, but the thumb is partially obscured or merged with the palm — not clearly separated or naturally spread. While not severely distorted, it fails the “naturally spread” criterion for that hand. Core requirement of 5 clearly visible, relaxed fingers per hand is not fully met. VERDICT: FAIL
PASS - Evaluator (Qwen 3.5 122B)
The image shows a woman with both hands extended forward, palms facing up. Each hand clearly displays five fingers — left hand: thumb, index, middle, ring, pinky; right hand: same. All ten fingers are visible and anatomically correct in positioning and structure. No anomalies such as extra, missing, or fused fingers. Gesture is natural and consistent with the prompt. VERDICT: PASS
FAIL - Evaluator (Qwen 3.5 122B)
The left hand (on the right side of image) shows 5 fingers correctly positioned. The right hand (on left side of image) also displays 5 fingers, but the thumb appears unnaturally fused or missing its distinct separation from the index finger — a subtle anatomical anomaly. Overall gesture is mostly correct, but minor distortion in thumb anatomy reduces realism. VERDICT: FAIL
FAIL - Evaluator (Qwen 3.5 122B)
The visible right hand gripping the mug handle shows only four fingers — thumb, index, middle, and ring — with no pinky visible or implied. The grip appears anatomically plausible for holding a small mug, but the missing fifth finger violates the “exactly 5 per hand” criterion. No other hands are visible to evaluate. Since the prompt requires exactly 5 fingers per hand and this is not met, it fails. VERDICT: FAIL
PASS - Evaluator (Qwen 3.5 122B)
The woman’s right hand gripping the pen shows 5 fingers, and the left hand resting on the notebook also shows 5 fingers. The grip appears anatomically correct — thumb and index finger hold the pen, middle finger supports it, ring and pinky are curled naturally. No visible anomalies in finger count or positioning. Both hands look realistic and consistent with natural writing posture. VERDICT: PASS
PASS - Evaluator (Qwen 3.5 122B)
The left hand (on the fretboard) shows 5 fingers with a natural grip. The right hand (strumming area) also shows 5 fingers, though the thumb is partially obscured by the guitar body — still anatomically plausible. No obvious deformities or extra/missing digits. Grip and posture are consistent with real guitar playing. Minor occlusion doesn’t invalidate anatomy. VERDICT: PASS
PASS - Evaluator (Qwen 3.5 122B)
The image shows a single right hand making a peace sign. The index and middle fingers are extended upward, while the thumb, ring, and pinky fingers are curled inward — consistent with the gesture described. All five fingers are present and anatomically positioned correctly for this gesture. No anomalies or extra/missing digits are visible. The grip is natural and matches the prompt’s description. VERDICT: PASS
PASS - Evaluator (Qwen 3.5 122B)
The image shows one hand making a thumbs-up gesture. The thumb is extended upward, and the other four fingers are curled into a fist — this matches the prompt exactly. Anatomically, the hand appears correct: five fingers total (one thumb + four curled fingers), with natural skin texture, knuckles, and proportions. No anomalies or extra/missing digits are visible. The gesture is standard and realistic. VERDICT: PASS
FAIL - Evaluator (Qwen 3.5 122B)
The image shows one hand with four fingers extended (index, middle, ring, and pinky) — not three as prompted. The thumb is curled inward, making it appear as if only four fingers are visible. Anatomically, the hand has five digits total, but the gesture does not match “counting to three” with index, middle, and ring fingers extended — the pinky is also up. This misalignment with the prompt’s specific finger count constitutes a failure. VERDICT: FAIL
PASS - Evaluator (Qwen 3.5 122B)
The handshake shows two hands clasped. The left hand (in the suit sleeve) has 5 fingers visible: thumb, index, middle, ring, and pinky — all appear anatomically correct. The right hand (in blue shirt) also shows 5 fingers: thumb on top, and four fingers wrapped around — all present and correctly positioned. No extra or missing fingers; grip is natural and professional. Anomalies: none detected. VERDICT: PASS
FAIL - Evaluator (Qwen 3.5 122B)
The image shows two hands meeting in a high-five gesture. The left hand (on the blue sleeve) has 5 fingers clearly visible and anatomically correct. The right hand (on the white sleeve) also has 5 fingers, but the thumb is positioned unusually — it appears to be fused or merged with the index finger, lacking clear separation. This is an anatomical anomaly. While the gesture is recognizable, the hand structure is not fully correct. VERDICT: FAIL
PASS - Evaluator (Qwen 3.5 122B)
The image shows four hands interacting with a gift box. The left hand (with ring) has 5 fingers visible and anatomically correct grip. The right hand also has 5 fingers and proper grip. However, the two background hands are partially obscured — one appears to have only 3–4 visible fingers due to occlusion, not necessarily missing digits. No clear anatomical anomalies or extra/missing fingers are evident in fully visible hands. Grip is natural and correct. VERDICT: PASS
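The 58% shown for Hands is consistent with simply counting verdicts: 7 of the 12 evaluations above pass. A minimal sketch of that arithmetic follows, assuming the displayed score is the pass fraction rounded to the nearest whole percent. The page does not state its formula, and `pass_rate` is a hypothetical helper, not part of the benchmark.

```python
def pass_rate(verdicts):
    """Return the share of PASS verdicts as a whole-number percentage."""
    if not verdicts:
        raise ValueError("no verdicts to score")
    passed = sum(1 for v in verdicts if v == "PASS")
    # Round half up explicitly; Python's built-in round() rounds half to even.
    return int(100 * passed / len(verdicts) + 0.5)

# Verdicts of the 12 Hands evaluations, in the order listed above.
hands = ["FAIL", "PASS", "FAIL", "FAIL", "PASS", "PASS",
         "PASS", "PASS", "FAIL", "PASS", "FAIL", "PASS"]

print(pass_rate(hands))  # 58
```

The same count reproduces the other subcategory figures in this section, for example 5 of 6 for Multi-Subject (83%).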
Multi-Subject 83%
PASS - Evaluator (Qwen 3 VL)
Left woman: short blonde hair, wearing red blazer over white shirt. Right woman: long dark hair, wearing navy blue dress. Both visually distinct and match prompt descriptions precisely. No mismatches or omissions. VERDICT: PASS
PASS - Evaluator (Qwen 3 VL)
Person 1: Bearded, wears glasses, green sweater. Person 2: Clean-shaven, wears black hoodie. Both individuals are visually distinct and match described features exactly. All prompt details are accurately represented. VERDICT: PASS
FAIL - Evaluator (Qwen 3 VL)
Image shows three women, not two. Central woman has curly red hair and denim overalls (matches tall description). Two others wear yellow dresses; one has straight brown hair (matches short description). However, the second person in yellow dress is not clearly short or tall relative to the central figure, and the third person’s hair is not distinctly straight brown. The prompt’s specific pairings are not fully realized. VERDICT: FAIL
PASS - Evaluator (Qwen 3 VL)
All four individuals are present and visually distinct. Left: young man with dreadlocks in denim jacket. Next: woman with silver hair in red blouse. Center: bald man in suit. Right: woman in green hijab and dress. Features match prompt exactly. VERDICT: PASS
PASS - Evaluator (Qwen 3 VL)
Left: Tall woman with short pink hair, playing bass guitar. Center: Man with beard and bandana, on drums. Right: Slim man with glasses, playing keyboard. All three are visually distinct and match described features. VERDICT: PASS
PASS - Evaluator (Qwen 3 VL)
Man: wears sunhat and backpack. Woman: has braided hair, uses hiking poles. Teenager: wears red baseball cap, carries water bottle. All three are visually distinct and match descriptions. VERDICT: PASS
Truthfulness 78%
Photorealism 100%
PASS - Evaluator (Qwen 3.6 27B)
The image successfully renders the requested materials with high fidelity. The glass exhibits correct refraction and transparency, distorting the background elements behind it. The silver spoon displays sharp, accurate specular highlights consistent with polished metal. The wooden table shows a convincing grain texture and matte finish, while the linen napkin captures the soft, woven texture of fabric. The apple’s skin reflects light naturally. All visual properties align with the prompt’s requi
PASS - Evaluator (Qwen 3.6 27B)
The image successfully renders the requested materials with high fidelity. The brushed aluminum laptop shows appropriate metallic sheen and subtle reflections. The leather notebook displays realistic texture and stitching. The ceramic mug has a matte, opaque finish, while the glass paperweight exhibits correct transparency, refraction, and specular highlights. The wooden desk features natural grain patterns and surface texture. All materials are visually distinct and realistic. VERDICT: PASS
PASS - Evaluator (Qwen 3.6 27B)
The image accurately renders the distinct material properties requested. The stainless steel pot exhibits sharp, metallic reflections. The wooden cutting board shows realistic grain and end-grain texture. The glass bottle demonstrates correct translucency and refraction of the oil within, while the cotton towel displays appropriate fabric folds and matte texture. The lighting interacts naturally with each surface, creating believable highlights and shadows. VERDICT: PASS
Physics & Reflections 92%
PASS - Evaluator (Qwen 3.6 27B)
The image displays a red sphere on a white surface. The shadow is cast to the right and slightly behind the object, which is consistent with a light source positioned to the upper left. The shadow's softness and gradient are physically plausible for a studio lighting setup. Additionally, there is a subtle reflection on the surface beneath the sphere, indicating a slightly glossy floor, which adds to the realism. The lighting and shadows are coherent. VERDICT: PASS
PASS - Evaluator (Qwen 3.6 27B)
The lighting originates from the right side, consistent with the window frame visible on the right. Consequently, the mug casts a distinct, sharp shadow to the left, which aligns perfectly with the prompt's requirement. The shadow’s intensity and angle are physically plausible for direct sunlight. The reflections on the coffee surface and the subtle highlights on the mug’s rim are also consistent with the light source. VERDICT: PASS
PASS - Evaluator (Qwen 3.6 27B)
The shadow is cast directly beneath the subject, consistent with a midday sun position. The shape accurately mirrors the person’s stance and clothing folds, showing no distortion or misalignment. The shadow’s darkness and soft edges are realistic for bright outdoor lighting. No reflections are expected in this scene, and none appear. All shadow characteristics align with physical plausibility. VERDICT: PASS
FAIL - Evaluator (Qwen 3.6 27B)
The reflection is physically impossible. The woman is standing with her back to the mirror, yet the reflection shows her front. A mirror cannot reflect the front of a person when their back is facing it. Additionally, the reflection’s orientation and lighting do not match the real-world geometry. These are fundamental violations of optical physics. VERDICT: FAIL
PASS - Evaluator (Qwen 3.6 27B)
The reflection is physically plausible. It correctly mirrors the cat’s orientation and position, appearing directly beneath it on the polished marble floor. The slight blur and distortion align with real-world reflections on glossy surfaces. The color matches the cat’s dark fur, and the lighting direction is consistent—no contradictory shadows or misaligned highlights. The reflection fades naturally toward the edges, mimicking how light interacts with reflective flooring. Overall, it adheres to
PASS - Evaluator (Qwen 3.6 27B)
The reflection on the glass table is physically plausible. It correctly mirrors the vase and stems with appropriate vertical inversion and slight blurring due to the glass surface. The orientation matches the object above, and the distortion is consistent with a reflective plane. Shadows are soft and ambient, fitting the bright indoor lighting. No directional inconsistencies or unnatural color shifts are present. VERDICT: PASS
PASS - Evaluator (Qwen 3.6 27B)
The reflections are vertically inverted and correctly positioned below the waterline. The gentle ripples distort the reflection slightly, consistent with a calm lake surface. The colors in the reflection match the sky and trees above, appearing slightly darker as expected due to light absorption by the water. The orientation and distortion are physically plausible for this scene. VERDICT: PASS
PASS - Evaluator (Qwen 3.6 27B)
The reflections are physically plausible. The orientation is correctly inverted, mirroring the buildings and lights above. The distortion is consistent with a wet, uneven surface, showing slight waviness and elongation typical of puddle reflections. The color shift is accurate, with warm yellow and cool blue lights reflecting appropriately. The intensity and spread of the light streaks match the physics of light on water. VERDICT: PASS
PASS - Evaluator (Qwen 3.6 27B)
The reflection is perfectly inverted and mirrors the mountain range accurately, consistent with a still alpine lake. The colors in the reflection are slightly darker and more muted than the actual mountains, which is physically plausible due to light absorption and scattering in water. The orientation and lack of distortion align with calm water conditions. Shadows and highlights match the source lighting. VERDICT: PASS
PASS - Evaluator (Qwen 3.6 27B)
The reflections and refractions are physically plausible. The mirror surface correctly inverts the spheres vertically, maintaining proper orientation and perspective. The water pool shows realistic distortion and refraction effects consistent with liquid surfaces. Light paths through the glass spheres demonstrate accurate bending and focusing. The dark shadow beneath the spheres aligns with the light source direction. Color rendering appears natural with appropriate transparency effects. VERDICT: PASS
PASS - Evaluator (Qwen 3.6 27B)
The reflections on the chrome sculpture are physically plausible, showing distorted environmental mapping of the windows and cityscape consistent with its curved geometry. The glass floor reflection is sharp and correctly oriented, mirroring the sculpture’s position. Shadows are minimal but consistent with bright, diffuse daylight from large windows. No directional or color inconsistencies are evident. VERDICT: PASS
PASS - Evaluator (Qwen 3.6 27B)
The wine glass casts a shadow to the right, consistent with light coming from the left window. The reflection on the marble is vertically inverted and slightly blurred, which is physically accurate for a polished surface. The refraction through the glass distorts the background view appropriately. The colors in the reflection match the scene’s lighting. All elements align with real-world optics. VERDICT: PASS
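Evaluator responses in this section are meant to close with a `VERDICT:` line, and a few are cut off before reaching it. The sketch below shows one way such a response could be mapped to a PASS/FAIL label while treating truncated text as undecided. How the page actually derives its labels is not stated, so `extract_verdict` and its regular expression are assumptions for illustration only.

```python
import re

# Matches a verdict only when it is the last thing in the response.
VERDICT_RE = re.compile(r"VERDICT:\s*(PASS|FAIL)\s*$")

def extract_verdict(response):
    """Return 'PASS' or 'FAIL' if the response ends with a verdict, else None."""
    match = VERDICT_RE.search(response.strip())
    return match.group(1) if match else None

print(extract_verdict("The lighting and shadows are coherent. VERDICT: PASS"))  # PASS
print(extract_verdict("These are fundamental violations. VERDICT: FAIL"))       # FAIL
print(extract_verdict("Overall, it adheres to"))                                 # None
```

Returning `None` for a cut-off response keeps a missing verdict distinct from a failure, so truncated outputs can be re-run or reviewed rather than silently counted.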
World Knowledge 58%
PASS - Evaluator (Qwen 3.6 27B)
The image depicts the Eiffel Tower with high architectural accuracy. The iconic iron lattice structure, the four arched legs, and the three distinct levels (including the top spire) are clearly visible and correctly proportioned. The background shows a clear blue sky and Parisian buildings, matching the prompt's description of a clear day. There are no obvious distortions or hallucinations in the structure. VERDICT: PASS
PASS - Evaluator (Qwen 3.6 27B)
The image depicts the Taj Mahal with high architectural accuracy, including the central dome, four minarets, and symmetrical layout. The perspective from the reflecting pool is correct, showing the iconic reflection. Minor details like the smaller domes and arches are well-rendered. The presence of visitors adds realism. No significant inaccuracies are observed in the structure or composition. VERDICT: PASS
PASS - Evaluator (Qwen 3.6 27B)
The Statue of Liberty is clearly recognizable and accurately depicted with correct proportions, green patina, raised torch, spiked crown, and tablet. The pedestal and base structure align with real-world architecture. Minor details like the American flag and surrounding island environment are plausible. No significant factual or architectural inaccuracies are present. VERDICT: PASS
FAIL - Evaluator (Qwen 3.6 27B)
The depicted instrument is not a shamisen. It appears to be a Chinese erhu or similar bowed string instrument, characterized by its long neck, two tuning pegs, and small square resonator. A shamisen has three strings, a larger body, and is played with a plectrum (bachi), none of which are present here. The bachi shown is unrelated to the instrument’s actual design. This is a clear factual inaccuracy. VERDICT: FAIL
FAIL - Evaluator (Qwen 3.6 27B)
The image depicts a brass instrument resembling an armillary sphere rather than a traditional flat astrolabe. While the materials and general aesthetic fit the "medieval" theme, the specific components mentioned in the prompt are inaccurate. A real astrolabe consists of a flat disk (mater) with interchangeable plates (tympans) and a star map (rete) that rotates on a pivot; this object is a 3D wireframe sphere. The engraved markings are nonsensical gibberish, lacking the precise mathematical accu
PASS - Evaluator (Qwen 3.6 27B)
The image depicts a ceramic bowl with visible cracks repaired using gold lacquer, which is the defining characteristic of kintsugi. The bowl’s shape, glaze texture, and the application of gold along the fracture lines are consistent with traditional Japanese kintsugi aesthetics. The subject is clearly recognizable and factually accurate to the prompt. VERDICT: PASS
FAIL - Evaluator (Qwen 3.6 27B)
The image depicts the water cycle with recognizable elements like mountains, ocean, and clouds. However, it contains significant factual and typographical errors. The label "PREOIPITATION" is misspelled (should be Precipitation). Furthermore, the arrows for "EVAPORATION" and "CONDENSATION" are confusingly placed and overlapping, making the diagram scientifically inaccurate and difficult to interpret. VERDICT: FAIL
PASS - Evaluator (Qwen 3.6 27B)
The image accurately depicts the geological features requested: an oceanic plate subducting beneath a continental plate, forming a trench and a volcanic arc. The labels are correctly placed, and the cross-section clearly illustrates the layered mantle and crustal structures. The visual representation aligns with scientific understanding of convergent boundaries. No significant inaccuracies are present. VERDICT: PASS
FAIL - Evaluator (Qwen 3.6 27B)
The image is not factually accurate. The text labels are nonsensical gibberish (e.g., "ONDINNE," "FPRCINA," "LINGIFAGT") rather than correct anatomical terms. Furthermore, the anatomical placement is incorrect; for instance, the label "AORTA" points to the left atrium/ventricle area, and the "PULMONARY ARCTERY" label points to the coronary arteries on the ventricle surface. The structure of the heart itself appears distorted and unrealistic. VERDICT: FAIL
FAIL - Evaluator (Qwen 3.6 27B)
The image depicts a butterfly with blue and brown wings, but it is not a Morpho menelaus. Morpho butterflies have solid, iridescent blue dorsal wings without the complex brown patterns, eyespots, or white markings seen here. The depicted subject resembles a different species (possibly a Blue Morpho hybrid or another Nymphalid), failing the specific accuracy requirement for Morpho menelaus morphology. VERDICT: FAIL
PASS - Evaluator (Qwen 3.6 27B)
The image accurately depicts the interior of the Hagia Sophia, showcasing its iconic central dome, semi-domes, and pendentives. The massive Islamic calligraphy medallions are correctly positioned on the piers, coexisting with the visible Byzantine mosaics in the upper arches. The architectural details, including the marble columns, galleries, and lighting, are factually consistent with the real-world structure. VERDICT: PASS
PASS - Evaluator (Qwen 3.6 27B)
The image depicts a bismuth crystal with high fidelity. It accurately captures the characteristic geometric hopper crystal formation, the distinct staircase structure on the faces, and the vibrant, multi-colored iridescent oxide layers resulting from thin-film interference. The lighting and texture appear realistic, matching the physical properties of real bismuth specimens. VERDICT: PASS
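The category figure of 78% for Truthfulness matches pooling all of its verdicts rather than averaging the three subcategory percentages: 3 of 3, 11 of 12, and 7 of 12 passes give 21 of 27. The sketch below compares the two calculations. The aggregation rule is inferred from the numbers on the page, not documented there, and the function names are illustrative.

```python
# (passes, total) per subcategory, counted from the verdicts listed above.
truthfulness = {
    "Photorealism": (3, 3),
    "Physics & Reflections": (11, 12),
    "World Knowledge": (7, 12),
}

def pooled_percent(counts):
    """Pass rate over every evaluation in the category, pooled across subcategories."""
    passed = sum(p for p, _ in counts.values())
    total = sum(t for _, t in counts.values())
    return 100 * passed / total

def mean_of_percents(counts):
    """Unweighted average of the per-subcategory pass rates, for comparison."""
    return sum(100 * p / t for p, t in counts.values()) / len(counts)

print(round(pooled_percent(truthfulness)))    # 78, matches the category header
print(round(mean_of_percents(truthfulness)))  # 83, does not match
```

Under pooling, a subcategory with only three evaluations carries a quarter of the weight of one with twelve, which is why the perfect Photorealism score lifts the category total only slightly.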
Professional Studio 93%
Camera & Lighting 100%
PASS - Evaluator (Gemma 4 26B)
The image successfully implements a wide-angle shot of a modern living room. The perspective shows an expansive space with a clear sense of depth. Most importantly, the image features significant barrel distortion at the edges (visible in the curved lines of the walls and ceiling), which directly matches the prompt's specific requirement. The lighting is bright and natural, consistent with the large windows shown. VERDICT: PASS
PASS - Evaluator (Gemma 4 26B)
The image perfectly matches the prompt. It features a close-up macro shot of a single, clear dewdrop resting on a red rose petal. The depth of field is extremely shallow, with the dewdrop and the immediate edge of the petal in sharp focus, while the background and the rest of the petal are heavily blurred. The lighting is soft and natural, highlighting the transparency and reflections within the droplet. VERDICT: PASS
PASS - Evaluator (Gemma 4 26B)
The image perfectly matches the prompt. It features a true aerial bird's-eye view looking straight down at a traffic roundabout. The perspective is top-down, showing the circular road, the central island with trees, and several cars. The lighting is even and consistent with a high-angle shot, and the depth of field is deep, keeping the entire scene in sharp focus. VERDICT: PASS
PASS - Evaluator (Gemma 4 26B)
The image successfully demonstrates Rembrandt lighting. There is a clear light source from the side, creating a distinct triangle of light on the shadowed cheek (the subject's right cheek, viewer's left). The lighting creates a moody, high-contrast atmosphere with deep shadows. The perspective is a standard portrait shot, and the depth of field is shallow, with the background being a soft, out-of-focus neutral tone. VERDICT: PASS
PASS - Evaluator (Gemma 4 26B)
The image perfectly matches the prompt. The lighting is a classic golden hour, featuring a vibrant orange sky and a soft, warm glow across the landscape. The low angle of the light creates long, soft shadows across the rolling hills and highlights the texture of the grass. The depth of field is wide, maintaining clarity from the foreground to the distant horizon, which is appropriate for a landscape photograph. VERDICT: PASS
PASS - Evaluator (Gemma 4 26B)
The image perfectly matches the prompt. It features a high-key portrait with a bright white background. The lighting is soft, even, and luminous, creating a bright, airy feel with minimal shadows on the subject's face and clothing. The perspective is a standard eye-level portrait, and the depth of field is appropriately shallow, keeping the focus on the person. VERDICT: PASS
PASS - Evaluator (Gemma 4 26B)
The image perfectly matches the prompt. The depth of field is shallow, creating a creamy bokeh background with soft, circular light orbs. The subject features a distinct warm rim light along her hair and shoulders, effectively separating her from the blurred garden background. The perspective is a classic portrait shot consistent with an 85mm lens. VERDICT: PASS
PASS - Evaluator (Gemma 4 26B)
The image perfectly matches the prompt's technical requirements. The subject is sharp and in focus, while the background features a shallow depth of field with soft, circular bokeh from city lights, characteristic of a 50mm f/1.8 lens. The perspective and lighting pattern accurately reflect a street photograph taken at night with a wide aperture. VERDICT: PASS
PASS - Evaluator (Gemma 4 26B)
The image matches the prompt's requirements. The perspective is a high-angle shot (from above) consistent with a 35mm lens. The depth of field is shallow (f/2.8), with the avocado toast in sharp focus and the juice carafe and background slightly blurred. The lighting clearly originates from the left, creating soft shadows to the right. The color temperature is warm, and a subtle vignette is present around the edges. VERDICT: PASS
PASS - Evaluator (Gemma 4 26B)
The image successfully implements the tilt-shift effect. It features a high-angle perspective and a distinct selective focus strip (shallow depth of field) that blurs the foreground and background, creating the "miniature model" illusion. The colors are highly saturated, and the lighting is bright and even, consistent with the prompt's requirements for a tilt-shift aesthetic. VERDICT: PASS
PASS - Evaluator (Gemma 4 26B)
The image perfectly matches the prompt. It features a dark, moody, and cinematic scene with a strong sense of depth. The lighting pattern consists of prominent blue and orange light streaks that stretch horizontally across the frame, accurately simulating the look of anamorphic lens flares. The perspective creates a sense of a deep, dark corridor or alleyway, and the light streaks are the primary focus of the lighting design. VERDICT: PASS
PASS - Evaluator (Gemma 4 26B)
The image perfectly matches the prompt. It features a close-up portrait with a clear split lighting pattern: the left side of the subject's face is brightly illuminated, while the right side is cast in deep, heavy shadow. This creates the high-contrast chiaroscuro effect requested. The perspective is a direct frontal portrait, and the lighting successfully divides the face into light and dark halves. VERDICT: PASS
Color Precision 92%
PASS - Evaluator (Gemma 4 26B)
The image features a sports car in a bright, vivid, and unambiguous red color. The car is parked on a gray asphalt road, which matches the prompt's description perfectly. The colors are accurate to the specified requirements. VERDICT: PASS
PASS - Evaluator (Gemma 4 26B)
The image features two women wearing evening dresses in a deep, saturated blue that accurately matches the description of "royal blue." The background is a light, neutral grey/off-white, which also aligns with the prompt's requirement for a neutral background. VERDICT: PASS
PASS - Evaluator (Gemma 4 26B)
The image perfectly matches the color descriptions provided in the prompt. The field is filled with vibrant, pure sunflower yellow flowers, and the sky above is a clear, solid blue. There are no deviations in color or hue from the specified requirements. VERDICT: PASS
PASS - Evaluator (Gemma 4 26B)
The wall is a vibrant, saturated orange that aligns well with the description of #FF6B35 (burnt orange). The furniture, including the nightstand and the tall shelving unit, is clearly white. The lighting appears to be natural daylight, creating soft shadows and highlights on the wall and floor. All color requirements specified in the prompt are met. VERDICT: PASS
FAIL - Evaluator (Gemma 4 26B)
The foliage in the image consists of a variety of green tones. While there are darker areas, the overall color palette is quite diverse, featuring many lighter, more vibrant greens and yellowish-greens that do not strictly adhere to the specific, muted dark green color #2D5F2D. The colors are more natural and varied than the single, precise hex value requested. VERDICT: FAIL
PASS - Evaluator (Gemma 4 26B)
The walls are a medium blue color that closely aligns with the specified hex code #4A90D9. The floor is a warm-toned wood, and the trim is white, matching all the color descriptions provided in the prompt. VERDICT: PASS
PASS - Evaluator (Gemma 4 26B)
The image accurately reflects the requested colors. The walls are a warm cream (#F5E6CC), the sofa is a dark navy (#2C3E50), and the accent pillows are a bright red (#E74C3C). All three specific color requirements are met precisely as described in the prompt. VERDICT: PASS
PASS - Evaluator (Gemma 4 26B)
The image accurately reflects the requested colors. The background is a deep, dark blue consistent with #1A1A2E. The watch features a soft gold tone (#F0C27F) rather than a bright yellow gold. The earbuds are a clean, pure white (#FFFFFF). All three color requirements are met. VERDICT: PASS
PASS - Evaluator (Gemma 4 26B)
The image accurately reflects the requested color palette. The background is a near-white color consistent with #FAFAFA. The central square is a vibrant orange-red, matching the #FF4500 specification. The thin intersecting lines are a dark charcoal color, consistent with #333333. All three specified colors are present and correctly applied to the designated elements. VERDICT: PASS
PASS - Evaluator (Gemma 4 26B)
The image shows a smooth horizontal gradient. On the far left, the color is a vibrant red (#FF0000). On the far right, the color is a bright blue (#0000FF). The transition between these two colors passes through a purple/magenta middle section, exactly as described in the prompt. The colors and the transition are accurate to the specified hex values and names. VERDICT: PASS
PASS - Evaluator (Gemma 4 26B)
The image shows a vertical gradient. The bottom color is a vibrant orange, which closely matches the requested #FF6B35. The top color is a very dark, deep indigo/navy, which aligns with the requested #1B1464. The transition between the two colors is smooth and follows the specified direction (orange at the bottom, deep indigo at the top). VERDICT: PASS
PASS - Evaluator (Gemma 4 26B)
The image shows a horizontal banner with a gradient. On the left, the color is a bright cyan, which matches the description of #00C9FF. On the right, the color transitions into a light mint green, which matches the description of #92FE9D. The gradient flows smoothly from the cyan color to the mint green color as specified. VERDICT: PASS
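The hex codes in the Color Precision prompts are 8-bit sRGB triplets, and the evaluator's note that the red-to-blue gradient passes through purple can be checked numerically. The sketch below assumes a plain per-channel linear interpolation in sRGB, which is one common way to define a gradient. The benchmark does not say how gradients were specified or measured beyond the evaluator's visual judgment, and both helper functions are hypothetical.

```python
def hex_to_rgb(code):
    """Convert '#RRGGBB' to an (r, g, b) tuple of ints in 0..255."""
    code = code.lstrip("#")
    if len(code) != 6:
        raise ValueError(f"expected 6 hex digits, got {code!r}")
    return tuple(int(code[i:i + 2], 16) for i in (0, 2, 4))

def lerp_color(start, end, t):
    """Interpolate two hex colors channel by channel; t=0 is start, t=1 is end."""
    a, b = hex_to_rgb(start), hex_to_rgb(end)
    return tuple(round(x + (y - x) * t) for x, y in zip(a, b))

print(hex_to_rgb("#FF6B35"))                  # (255, 107, 53)
print(lerp_color("#FF0000", "#0000FF", 0.5))  # (128, 0, 128)
```

The midpoint (128, 0, 128) is the standard web color purple, which agrees with the purple/magenta middle section the evaluator describes.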
Photorealism 67%
PASS Evaluator (Gemma 4 26B)
The image successfully captures the requested studio aesthetic. The red canvas texture, white laces, and rubber sole look highly realistic. The lighting is soft, and the gentle shadow beneath the shoe is well-rendered, creating a convincing sense of depth. There are no obvious AI artifacts, such as warped textures or nonsensical geometry, in the shoe's structure or the background. It passes as a high-quality professional product photograph. VERDICT: PASS
FAIL Evaluator (Gemma 4 26B)
The image fails to meet the quality of a real photograph. There are significant AI artifacts regarding the physics of light and glass. The liquid inside the bottle has nonsensical, floating white lines that do not correspond to any realistic refraction or straw. Additionally, the reflection on the surface is inconsistent with the bottle's shape, and the glass edges appear warped and unnatural. VERDICT: FAIL
PASS Evaluator (Gemma 4 26B)
The image successfully captures a studio-quality look. The lighting is soft and directional, creating a natural, subtle shadow on the white seamless background. The texture of the apple skin, including the slight color variations and the stem, appears realistic and free of obvious AI artifacts or "uncanny" textures. It passes as a professional product photograph for a catalog. VERDICT: PASS
Graphical design 75%
Data Visualisation 33%
FAIL Evaluator (Qwen 3 VL)
The chart shows correct months and values (12K, 18K, 15K, 22K, 19K, 25K) with a title and labeled axes. However, the Y-axis label "Thousands" is misaligned with the scale, and "S/per colde" is a garbled X-axis label. The June value is incorrectly labeled as "25%" (should be "25K"). VERDICT: FAIL
PASS Evaluator (Qwen 3 VL)
The pie chart correctly displays all four categories with exact percentages: Company A (35%), Company B (25%), Company C (20%), and Others (20%). Each slice has a distinct color and clear, readable labels with percentages. The chart type matches the requested pie chart visualization. All elements are accurately rendered and legible. VERDICT: PASS
FAIL Evaluator (Qwen 3 VL)
The chart is a line graph titled "Weekly Traffic" with correct axis labels (Mon-Sun) and grid lines. However, the y-axis values do not match the requested data (1200, 1500, 1100, 1800, 2000, 1700, 1400). The plotted points visually approximate the data but lack exact numerical labels for each point, making precise verification impossible. The chart type is correct. VERDICT: FAIL
Layout & Design 67%
PASS Evaluator (Qwen 3 VL)
The image shows a popcorn bucket centered, with "MOVIE NIGHT" in bold yellow at the top and "Every Friday at 8 PM" at the bottom. Text is readable and correctly placed per the prompt. Visual hierarchy is clear, with the popcorn as the focal point and text framing it appropriately. All required layout elements are present and correctly positioned. VERDICT: PASS
PASS Evaluator (Qwen 3 VL)
The image displays a guitar illustration centered on a blue background. The headline "LIVE MUSIC FEST" is prominently placed above the guitar in large white text, and "Tickets available now" appears below it in smaller text. All required elements are present with correct visual hierarchy and readable placement. VERDICT: PASS
PASS Evaluator (Qwen 3 VL)
The image displays an open book illustration centered below the title "JOIN THE BOOK CLUB," which is prominently placed at the top. The tagline "Meets every Wednesday" is correctly positioned at the bottom. All text is readable, and the visual hierarchy is clear: title > illustration > tagline. Layout and text placement match the prompt exactly. VERDICT: PASS
PASS Evaluator (Qwen 3 VL)
Image shows hero header image, headline “Unlock Your Potential,” subtitle, “Sign Up” button, and three social icons (Instagram, Twitter, LinkedIn) at bottom. All elements are present, correctly placed, and text is readable. Visual hierarchy is clear with proper spacing and contrast. Layout matches prompt exactly. VERDICT: PASS
FAIL Evaluator (Qwen 3 VL)
The image shows a newsletter layout with a header banner, two-column body (text left, image right), a highlighted quote block, and a footer with unsubscribe link. However, all text is gibberish placeholder text, unreadable and incorrectly placed. Visual hierarchy is present but content is invalid. VERDICT: FAIL
PASS Evaluator (Qwen 3 VL)
The image shows a clean menu with a logo header (“THE FORAGER KITCHEN”), three clearly labeled sections (Appetizers, Mains, Desserts) with items and prices, and a footer with address and hours. Text is readable, well-spaced, and correctly placed with appropriate visual hierarchy. All required layout elements are present and accurately rendered. VERDICT: PASS
FAIL Evaluator (Qwen 3 VL)
The left page features a large hero photo as required. The right page has a two-column article, a pull quote, and a sidebar with author bio, all correctly placed. The page number “42” is visible. However, all text is gibberish placeholder text, making it unreadable and violating the requirement for readable content. VERDICT: FAIL
FAIL Evaluator (Qwen 3 VL)
The image shows a mobile app UI with a top navigation bar, search field, 2x2 feature cards with icons, "RECENT ACTIVITY" list, and bottom tab bar — all present. However, placeholder text is garbled and unreadable, violating the requirement for readable, correctly placed text. Visual hierarchy is mostly correct, but illegible text undermines functionality. VERDICT: FAIL
PASS Evaluator (Qwen 3 VL)
Left page has full-bleed photo. Right page features headline, body copy, inset image, and page number — all correctly placed with clear visual hierarchy. Text is readable, though minor typos exist (e.g., “geomtric”). Layout matches prompt description precisely. VERDICT: PASS
Style Diversity 92%
PASS Evaluator (Qwen 3 VL)
The image clearly depicts a golden retriever in a garden, rendered with visible brushstrokes and textured paint application characteristic of oil painting. Colors are rich and layered, with impasto effects in fur and foliage. The style matches the requested oil painting aesthetic precisely. VERDICT: PASS
PASS Evaluator (Qwen 3 VL)
The image shows a golden retriever in a sitting pose, rendered with large, expressive eyes and simplified, flat color shading typical of Japanese anime style. While fur texture is somewhat detailed, the overall aesthetic matches the requested style. The garden setting is absent, but the core visual characteristics are present. VERDICT: PASS
PASS Evaluator (Qwen 3 VL)
The image displays a golden retriever in a garden, rendered in a clear 32x32 pixel grid with a limited color palette, matching the pixel art style requested. The dog is sitting, and background elements like flowers are also pixelated. However, the image contains nonsensical text overlay, which is not part of the requested style. The core visual style is present. VERDICT: PASS
PASS Evaluator (Qwen 3 VL)
The image clearly depicts an Art Deco city street scene with geometric forms, symmetrical composition, and a gold-and-black palette. Ornamental details, stylized skyscrapers, and the word “METROPOLIS” in Art Deco font reinforce the requested style. The central lamppost adds period-appropriate detail. All key visual characteristics are present and well-executed. VERDICT: PASS
PASS Evaluator (Qwen 3 VL)
The image clearly exhibits Impressionism: visible brushstrokes, dappled light effects, and a soft pastel palette. The woman holds a parasol, and the scene captures fleeting outdoor moments with atmospheric color. These core stylistic elements are unmistakably present and well-executed. VERDICT: PASS
PASS Evaluator (Qwen 3 VL)
The image clearly exhibits Pop Art style: bold black outlines, Ben-Day dot patterns on skin tones, and flat, vivid primary colors (blue shirt, yellow skin). Facial features are simplified and stylized, consistent with the requested aesthetic. All key visual characteristics are present and unmistakable. VERDICT: PASS
PASS Evaluator (Qwen 3 VL)
The image blends photorealistic forest detail with visible watercolor bleeds and paper texture, especially around edges and in background haze. Lighting mimics a photo with sunbeams and depth. Watercolor effects are present but not uniformly applied, sometimes obscuring realism. Core style elements are recognizable, though execution is inconsistent. VERDICT: PASS
PASS Evaluator (Qwen 3 VL)
The image clearly depicts a cyberpunk scene with neon lighting and tech elements. It features flowing, organic Art Nouveau floral borders framing the composition and ornate, curvilinear mechanical designs on the woman’s armor. The color palette uses vibrant neon tones against dark backgrounds. All requested stylistic elements are visibly present and well-integrated. VERDICT: PASS
FAIL Evaluator (Qwen 3 VL)
The image blends ukiyo-e style (bold outlines, flat areas of color, woodblock texture) with photorealistic sky and lighting. The castle’s architecture is Japanese, not medieval, which misrepresents the prompt. However, the requested stylistic fusion is visually present. The core stylistic elements are recognizable despite the architectural mismatch. VERDICT: FAIL
PASS Evaluator (Qwen 3 VL)
The image clearly shows raw concrete walls and floor, matching soft brutalism. Rounded pastel furniture (pink sofa, pale green table, pink mirror) and warm, diffused light from the arched window are present. The style is unmistakably recognizable and aligns with the prompt’s key visual elements. VERDICT: PASS
PASS Evaluator (Qwen 3 VL)
Sparse composition with a single object, muted gray tones, and significant negative space create an anxious minimalism. The dark pitcher contrasts subtly with the plain background, and its isolated placement evokes visual tension. The red mark adds a small, unsettling detail. All key elements of the requested style are clearly present. VERDICT: PASS
PASS Evaluator (Qwen 3 VL)
The image clearly depicts retro-futuristic 1960s space-age aesthetics with domed structures, flying saucer-like buildings, and elevated transit. Warm, melancholic lighting and a patina effect are present, evoking nostalgia. The style is unmistakably recognizable and aligns with the prompt’s core visual requirements. VERDICT: PASS
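The page does not state how the percentages beside each heading are derived. The sketch below tallies the verdicts listed above and shows that the figures are consistent with a simple pass rate rounded to the nearest whole percent, with the Graphical design figure pooled over every evaluation in its three subcategories. That reading is an assumption, and the verdict strings are transcribed by hand from the listing (P = PASS, F = FAIL).

```python
# Verdicts transcribed from the listing above, in order of appearance.
verdicts = {
    "Data Visualisation": "FPF",
    "Layout & Design": "PPPPFPFFP",
    "Style Diversity": "PPPPPPPPFPPP",
}


def pass_rate(marks: str) -> int:
    """Return the share of 'P' marks as a whole-number percentage."""
    return round(100 * marks.count("P") / len(marks))


# Pass rate for each subcategory on its own.
per_subcategory = {name: pass_rate(marks) for name, marks in verdicts.items()}

# Pooled pass rate: all evaluations counted once, regardless of subcategory.
pooled = pass_rate("".join(verdicts.values()))

print(per_subcategory)
print(pooled)
```

The pooled figure comes from 18 passes out of 24 evaluations. Averaging the three subcategory percentages instead would give about 64%, which does not match the 75% shown, so pooling appears to be the better fit.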