192 evaluations across 6 categories

Prompt: The word 'CHAPTER ONE' typed on aged paper with a vintage typewriter font, complete with slightly uneven ink
Text Rendering 100% vs 87%
▸ Typography Style 100% vs 100%
openai/gpt-image-2 PASS Evaluator
Text reads “CHAPTER ONE” correctly. Font mimics vintage typewriter style with uneven, slightly smudged ink and dot-matrix appearance. Background is aged paper with authentic texture and discoloration. Matches prompt’s medium and style convincingly. VERDICT: PASS
openai/gpt-image-2 PASS Evaluator
The text “OPEN” is correctly spelled and rendered in a convincing pixel-style LED format with glowing red dots, matching the prompt’s description. The medium’s visual characteristics — dot matrix, brightness, and casing — are accurately portrayed. VERDICT: PASS
openai/gpt-image-2 PASS Evaluator
Text is rendered in a convincing chalk-on-chalkboard style with visible texture and dust. Spelling is correct: “TODAY’S SPECIAL”. The medium matches the prompt precisely. VERDICT: PASS
▸ Writing accuracy 100% vs 83%
openai/gpt-image-2 PASS Evaluator
The word "HELLO" is fully spelled correctly and is clearly readable. All letters are intact and legible against the white background. No misspellings, missing letters, or illegible parts are present. VERDICT: PASS
openai/gpt-image-2 PASS Evaluator
The word "ADVENTURE" is spelled correctly and fully readable in bold, centered text on a cream background. No misspellings, missing words, or illegible parts are present. All elements match the intended description. VERDICT: PASS
openai/gpt-image-2 PASS Evaluator
The word “SERENITY” is spelled correctly and is fully readable in large, centered white letters against a solid light blue background. No misspellings, missing words, or illegible parts are present. The text matches the intended specification exactly. VERDICT: PASS
openai/gpt-image-2 PASS Evaluator
All words are spelled correctly and fully readable: "The", "quick", "brown", "fox", "jumps", "high". No misspellings, missing words, or illegible parts. Text matches the intended phrase exactly. VERDICT: PASS
openai/gpt-image-2 PASS Evaluator
All words are spelled correctly and fully readable. No misspellings, missing words, or illegible parts. Text matches the intended phrase exactly. VERDICT: PASS
openai/gpt-image-2 PASS Evaluator
All words are spelled correctly and fully readable: "Stars", "shine", "brightest", "in", "the", "dark", "night". No misspellings or missing words. Text is clear against the off-white background. VERDICT: PASS
openai/gpt-image-2 PASS Evaluator
All text is spelled correctly and fully readable. Headline: “SUMMER SALE”. Body: “Up to 50% off all items this weekend only”. Fine print: “Terms and conditions apply. See store for details.” No misspellings or illegible parts. VERDICT: PASS
openai/gpt-image-2 PASS Evaluator
All text is spelled correctly and fully readable. “GRAND OPENING” matches the title. The subtitle “Join us for food, music, and fun” is accurate. The disclaimer “RSVP required. Limited seating available.” is present and legible. No misspellings, missing words, or illegible parts detected. VERDICT: PASS
openai/gpt-image-2 PASS Evaluator
All text matches the intended message exactly: “ANNUAL GALA 2025”, “An evening of fine dining and live jazz”, and “Black tie optional. Doors open at 6 PM.”. Every word is spelled correctly and fully readable in its respective font size and placement. No misspellings or illegible parts detected. VERDICT: PASS
openai/gpt-image-2 PASS Evaluator
All text matches the intended content exactly: “GOLDEN HARVEST ORGANIC HONEY”, “Raw & Unfiltered”, “12 oz (340g)”, “Pure wildflower honey harvested from sustainable apiaries in the Pacific Northwest. No additives or preservatives.”, and “Best before: 06/2026. Store in cool, dry place. Product of USA.”. No misspellings or illegible parts. All words are fully readable and accurate. VERDICT: PASS
openai/gpt-image-2 PASS Evaluator
All text matches the intended content exactly: “NATIONAL EXPLORER”, “The Hidden Temples of Cambodia — page 34”, “Interview: Dr. Maria Santos on Climate Action”, “Top 10 Trails You Must Hike This Year”, and “Issue 247 | March 2025 | $6.99”. No misspellings, missing words, or illegible parts. All text is fully readable and accurate. VERDICT: PASS
openai/gpt-image-2 PASS Evaluator
All text matches the intended content exactly: tagline, credits, director, cast, music composer, studio, and release date are fully spelled correctly and clearly readable. No misspellings or illegible parts detected. VERDICT: PASS
Spatial Reasoning 97% vs 97%
▸ Attributes Binding 100% vs 100%
openai/gpt-image-2 PASS Evaluator
The image displays a green apple on the left and an orange pumpkin on the right, both resting on a wooden cutting board. The colors, shapes, and relative sizes are correctly assigned to their respective objects as requested.
- **Green Apple:** Observed as green, round, and smaller.
- **Orange Pumpkin:** Observed as orange, ribbed, and larger.
- **Surface:** Observed as a wooden cutting board.
VERDICT: PASS
openai/gpt-image-2 PASS Evaluator
The image displays a red sphere (ball) on the left and a blue cube on the right, placed on a white background. The colors, shapes, and relative positions match the prompt perfectly.
* **Object 1:** Red ball (Sphere, Red color).
* **Object 2:** Blue cube (Cube, Blue color).
VERDICT: PASS
openai/gpt-image-2 PASS Evaluator
The image contains a yellow taxi (Toyota Camry) in the foreground and a black limousine (Lincoln Town Car stretch) parked directly behind/next to it along the curb. The colors and vehicle types match the prompt perfectly with no attribute swaps. VERDICT: PASS
openai/gpt-image-2 PASS Evaluator
The image contains three distinct shapes on a white background.
1. **Left Object:** A large red circle.
2. **Middle Object:** A small blue triangle.
3. **Right Object:** A medium green star.
All attributes (color, size, shape) are correctly bound to the corresponding objects as requested in the prompt. VERDICT: PASS
openai/gpt-image-2 PASS Evaluator
The image contains a large brown horse, a small white rabbit, and a black cat in a grassy meadow. All attributes (color, size, object type) match the prompt perfectly without any swaps. VERDICT: PASS
openai/gpt-image-2 PASS Evaluator
The image contains exactly what was requested:
1. **Tall thin white candle:** Present on the left in a holder.
2. **Short fat red candle:** Present in the middle.
3. **Medium spiral yellow candle:** Present on the right.
All colors, sizes, and shapes match the prompt perfectly with no attribute swapping. VERDICT: PASS
openai/gpt-image-2 PASS Evaluator
The image correctly depicts all requested objects with their specific attributes bound correctly:
1. **Fire Truck:** It is pink, matching the prompt "pink fire truck".
2. **Banana:** It is blue, matching the prompt "blue banana".
3. **Sun:** It is green, matching the prompt "green sun".
There are no attribute swaps (e.g., the banana is not pink, the truck is not blue). The surreal landscape context is also present. VERDICT: PASS
openai/gpt-image-2 PASS Evaluator
The image contains three distinct objects arranged on a wooden surface.
1. **Elephant:** It is orange in color and has the correct shape.
2. **Penguin:** It is yellow (with a white belly) and has the correct shape.
3. **Strawberry:** It is silver/metallic in color and has the correct shape.
All requested attributes (color, object type) are correctly bound to the respective objects with no swaps. VERDICT: PASS
openai/gpt-image-2 PASS Evaluator
The image contains all the requested objects with the correct attributes bound to them:
1. **Purple carrot:** A carrot with green tops and a distinct purple root is visible on the left.
2. **Red tree with red leaves:** A large tree with vibrant red foliage dominates the background.
3. **White watermelon sliced open:** A white-fleshed watermelon with black seeds is shown whole and sliced on the right.
VERDICT: PASS
▸ Compositionality 100% vs 100%
openai/gpt-image-2 PASS Evaluator
- **Tabby cat:** Present. The image clearly shows a cat with distinct tabby markings (stripes and an 'M' on the forehead).
- **Oversized red sunglasses:** Present. The cat is wearing large, bright red sunglasses that fit the description perfectly.
All described elements are present and coherently combined. VERDICT: PASS
openai/gpt-image-2 PASS Evaluator
- **Golden retriever:** Present. The main subject is clearly a Golden Retriever with characteristic golden fur and facial features.
- **Wearing:** Present. The dog has the hat positioned on its head.
- **Tiny cowboy hat:** Present. The dog is wearing a brown cowboy hat with a star-studded band, sized appropriately for the animal.
All described elements are present and combined coherently within a western-themed setting. VERDICT: PASS
openai/gpt-image-2 PASS Evaluator
- **Penguin:** Present (a Gentoo penguin standing on rocks).
- **Holding:** Present (the penguin's flipper is gripping the umbrella handle).
- **Colorful umbrella:** Present (a rainbow-colored umbrella with visible raindrops).
All elements are clearly visible and combined in a coherent, realistic style despite the unusual subject matter. VERDICT: PASS
openai/gpt-image-2 PASS Evaluator
- **Medieval knight:** Present.
- **Full armor:** Present (helmet, chainmail, plate armor).
- **Sitting at a desk:** Present (wooden table).
- **Typing:** Present (gauntleted hands on keyboard).
- **Modern laptop:** Present (silver laptop).
All elements are clearly visible and combined coherently to create the requested anachronistic scene. VERDICT: PASS
openai/gpt-image-2 PASS Evaluator
- **Roman gladiator:** Present (man in armor).
- **Taking a selfie:** Present (holding phone up, looking at lens).
- **Smartphone:** Present (modern black phone visible).
- **Colosseum:** Present (background architecture matches the Roman amphitheater).
All elements are clearly visible and combined to depict the requested anachronistic scene. VERDICT: PASS
openai/gpt-image-2 PASS Evaluator
- **Astronaut in a spacesuit:** Present. The figure is wearing a full white NASA-style suit with a helmet and backpack.
- **Riding a bicycle:** Present. The astronaut is seated on a black bicycle, holding the handlebars.
- **Through a park:** Present. The background features green trees, a paved path, a body of water, and a sign explicitly reading "Riverside Park."
All described elements are clearly visible and coherently combined in the scene. VERDICT: PASS
openai/gpt-image-2 PASS Evaluator
- **Octopus:** Present (central figure).
- **Top hat:** Present (black hat on head).
- **Monocle:** Present (gold-rimmed eyewear with chain).
- **Playing chess:** Present (chessboard with pieces, holding a piece).
- **Surface of the moon:** Present (grey, cratered terrain).
- **Earth visible in background:** Present (large planet in the sky).
All described elements are clearly visible and combined coherently in a surreal scene. VERDICT: PASS
openai/gpt-image-2 PASS Evaluator
- **Dinosaur:** Present (A T-Rex character).
- **Business suit:** Present (Blue suit, white shirt, tie).
- **Giving a PowerPoint presentation:** Present (Pointing at a screen labeled "Q2 BUSINESS REVIEW" with charts).
- **Modern office:** Present (Conference room, city view, laptops).
- **Drinking coffee:** Present (Holding a mug labeled "CAFFEINATED & FOCUSED").
All described elements are clearly visible and integrated into a coherent scene. VERDICT: PASS
openai/gpt-image-2 PASS Evaluator
The image contains all the requested elements:
1. **Robot:** Present (central figure).
2. **Dressed as a pirate:** Present (pirate hat, eye patch, coat, flag).
3. **Paper boat:** Present (robot is sitting inside one).
4. **Lake of lava:** Present (background and foreground show flowing lava).
5. **Juggling three kittens:** Present (three kittens are floating/jumping in the air above the robot's outstretched hands).
The concepts are combined coherently into a single scene. VERDICT: PASS
▸ Counting 89% vs 89%
openai/gpt-image-2 PASS Evaluator
I see exactly 3 red apples arranged on a wooden surface. The count matches the prompt's request perfectly. VERDICT: PASS
openai/gpt-image-2 PASS Evaluator
I see exactly 4 white candles standing in a row on a dark surface. The count matches the request perfectly. VERDICT: PASS
openai/gpt-image-2 PASS Evaluator
I see exactly 5 yellow rubber ducks lined up in a row on the white shelf. The count matches the request perfectly. VERDICT: PASS
openai/gpt-image-2 PASS Evaluator
I see exactly 2 orange cats and 3 black dogs sitting on the grass. The counts match the prompt's request perfectly. VERDICT: PASS
openai/gpt-image-2 PASS Evaluator
The image contains exactly 4 red roses (one top center, one left, one center, one right) and 2 white daisies (one left, one right). These are arranged in a clear glass vase sitting on a wooden table. The counts and objects perfectly match the requested description. VERDICT: PASS
openai/gpt-image-2 PASS Evaluator
I see exactly 1 blue butterfly on the far left and 3 yellow butterflies to its right, all resting on the wooden fence posts. The count matches the requested description perfectly. VERDICT: PASS
openai/gpt-image-2 PASS Evaluator
The image shows a fruit bowl with the following counts:
- **Bananas:** 3
- **Oranges:** 4
- **Apples:** 2
The counts exactly match the requested specifications. VERDICT: PASS
openai/gpt-image-2 FAIL Evaluator
I count 10 balloons in the image: one purple, one red, one orange, one blue, one yellow, one green, one dark pink, one teal, one light pink, and one white. The prompt requested exactly 11 balloons. Since there are only 10, the count does not match the request. VERDICT: FAIL
openai/gpt-image-2 PASS Evaluator
I count 9 goldfish in the image.
1. Top left
2. Top middle
3. Top right
4. Middle left
5. Middle center
6. Middle right
7. Bottom left
8. Bottom center
9. Bottom right
The count is exactly 9, matching the request. VERDICT: PASS
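The 89% shown for Counting is consistent with the nine verdicts listed in this subcategory: eight PASS and one FAIL (8/9 ≈ 88.9%). The sketch below is only an illustration of that arithmetic, not the benchmark's actual scoring code; the `pass_rate` helper and the placeholder evaluator strings are hypothetical, and it assumes each evaluator text contains a `VERDICT: PASS` or `VERDICT: FAIL` marker as in the entries above.

```python
import re

# Assumption: every evaluator text carries a "VERDICT: PASS" or "VERDICT: FAIL" marker.
VERDICT_RE = re.compile(r"VERDICT:\s*(PASS|FAIL)")


def pass_rate(evaluations):
    """Return the percentage of evaluator texts whose verdict is PASS.

    Texts without a recognizable verdict marker are ignored.
    """
    verdicts = []
    for text in evaluations:
        match = VERDICT_RE.search(text)
        if match:
            verdicts.append(match.group(1))
    if not verdicts:
        raise ValueError("no verdicts found")
    passed = sum(1 for verdict in verdicts if verdict == "PASS")
    return 100 * passed / len(verdicts)


# Placeholder texts standing in for the nine Counting evaluations (8 PASS, 1 FAIL).
counting = ["... VERDICT: PASS"] * 8 + ["... VERDICT: FAIL"]
print(round(pass_rate(counting)))  # 8 of 9 is about 88.9, displayed as 89
```

Whether the page rounds or truncates its percentages is not stated; rounding to the nearest integer is assumed here because it reproduces the displayed 89%.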
▸ Negation 89% vs 89%
openai/gpt-image-2 PASS Evaluator
The image shows a tabby cat lying on a grey couch. I have examined the cat's neck area closely. There is no collar visible around its neck; the fur is uninterrupted. The excluded element is absent. VERDICT: PASS
openai/gpt-image-2 PASS Evaluator
The image depicts a pristine tropical beach scene featuring a large leaning palm tree, white sand, turquoise water, and a blue sky with clouds. I have carefully scanned the beach, the water, the rocks, and the distant shoreline. There are absolutely no people visible anywhere in the scene. The exclusion criterion is met. VERDICT: PASS
openai/gpt-image-2 PASS Evaluator
The image depicts a tidy, empty classroom featuring rows of wooden desks and blue chairs facing a green chalkboard with inspirational writing. There are educational posters on the walls, a teacher's desk, and bookshelves. Crucially, there are no people visible in the room—no students and no teacher. The excluded element is completely absent. VERDICT: PASS
openai/gpt-image-2 PASS Evaluator
The image displays a sandy beach scene featuring palm trees, scattered seashells in the foreground, and a blue-and-white striped beach umbrella with a lounge chair. The background consists of dense green vegetation and more palm trees under a blue sky. There is absolutely no water, ocean, or sea visible in the frame; the horizon is blocked by the tropical foliage. VERDICT: PASS
openai/gpt-image-2 PASS Evaluator
The image displays a kitchen scene with pots, pans, a stove, and a cutting board as requested. The cutting board is empty. The pots on the stove are covered, hiding their contents. While there are condiments (oil bottle, pepper grinder) and a potted herb plant on the counter, there is no prepared food or loose ingredients visible. The excluded element (food) is effectively absent in the context of a meal or cooking ingredients. VERDICT: PASS
openai/gpt-image-2 PASS Evaluator
The image depicts a traditional office workspace featuring a wooden desk, a black leather executive chair, a beige guest chair, a tall bookshelf filled with books and decor, and a lamp on a side cabinet. I have carefully scanned the desk surface, shelves, and walls, and there are absolutely no computers, monitors, laptops, or digital screens present. The workspace is entirely analog. VERDICT: PASS
openai/gpt-image-2 PASS Evaluator
The image displays a pizza with a golden-brown crust, a base of red tomato sauce, numerous pepperoni slices, and fresh basil leaves scattered on top. Crucially, there is no visible layer of melted cheese (like mozzarella) covering the sauce or toppings; the pepperoni and basil sit directly on the sauce. The excluded element is absent. VERDICT: PASS
openai/gpt-image-2 PASS Evaluator
The image displays a silver car body featuring windows, doors, and headlights as requested. Crucially, looking at the wheel arches, the wheels are completely missing; the wheel wells are empty, showing only the internal structure where tires would normally be. The excluded element (wheels) is truly absent. VERDICT: PASS
openai/gpt-image-2 FAIL Evaluator
The image displays a human face with eyes, nose, and mouth. However, contrary to the exclusion criteria, the face clearly has eyebrows. They are visible above the eyes, appearing natural and well-defined. Since the excluded element (eyebrows) is present, the image does not meet the specific constraint. VERDICT: FAIL
▸ Relative Position 100% vs 100%
openai/gpt-image-2 PASS Evaluator
The image shows a tabby cat sitting directly on the upper surface of a brown cardboard box. The cat's paws are resting on the box, confirming the "on top of" relationship described in the prompt. No other specific spatial relationships were requested, and the primary relationship is accurate. VERDICT: PASS
openai/gpt-image-2 PASS Evaluator
The image shows a wooden chair standing on a hardwood floor. A red ball is resting on the floor directly beneath the seat of the chair, positioned centrally between the chair's legs. The spatial relationship described in the prompt ("underneath") is accurately depicted in the image. VERDICT: PASS
openai/gpt-image-2 PASS Evaluator
The image shows a white coffee mug positioned to the left of an open book. Both objects are resting on a wooden surface, which appears to be a desk. The spatial arrangement matches the description perfectly: the mug is next to the book, and both are on the desk. VERDICT: PASS
openai/gpt-image-2 PASS Evaluator
The image shows a tabby cat sitting directly on top of a beige pillow. This pillow is resting on the seat of a wooden chair. The spatial hierarchy described in the prompt (cat on pillow, pillow on chair) is accurately represented in the visual content. VERDICT: PASS
openai/gpt-image-2 PASS Evaluator
The image shows a lit candle sitting directly on top of a stack of three books. To the right of this stack, there is a potted plant with green leaves. The spatial arrangement perfectly matches the description: the candle is on the books, and the books are next to the plant. VERDICT: PASS
openai/gpt-image-2 PASS Evaluator
The image shows a bird perched on a branch that extends from a large tree trunk on the right side. Directly below and next to this tree is a park bench. The spatial relationships described in the prompt—bird on branch, branch from tree, tree next to bench—are all accurately represented in the scene. VERDICT: PASS
openai/gpt-image-2 PASS Evaluator
The image displays five white chess pieces arranged in a cross formation on a wooden chessboard.
- **Center:** A King piece is positioned in the middle.
- **Left:** A Queen piece is to the left of the King.
- **Right:** A Bishop piece is to the right of the King.
- **Above:** A Rook piece is directly above the King.
- **Below:** A Knight piece is directly below the King.
All specified spatial relationships are correctly depicted. VERDICT: PASS
openai/gpt-image-2 PASS Evaluator
The image displays five animals standing in a row in a grassy field. From left to right, the arrangement is: a brown horse on the far left, a black and white cow immediately to its right, a white sheep in the center, a brown and white goat to the right of the sheep, and a golden retriever dog on the far right. All specified positional relationships are correct. VERDICT: PASS
openai/gpt-image-2 PASS Evaluator
The image displays a desk scene with a laptop centrally positioned. To the left of the laptop sits a black coffee mug. To the right lies a smartphone. A desk lamp is positioned above the scene (specifically top-left), illuminating the area. In the foreground, below the laptop, rests a spiral notebook. All specified spatial relationships (center, left, right, above, below) are accurately represented. VERDICT: PASS
openai/gpt-image-2 PASS Evaluator
The image displays a red barn prominently in the foreground on the left side. To its right, slightly further back in the midground, stands a windmill. In the background, behind both structures, are snow-capped mountains. The scale and positioning (barn large/close, windmill medium/mid, mountains small/far) create a consistent depth perspective. All described spatial relationships are accurate. VERDICT: PASS
openai/gpt-image-2 PASS Evaluator
The image perfectly matches the described spatial arrangement. The sandcastle is clearly positioned in the immediate foreground on the sand. The surfer is riding a wave in the midground, situated behind the castle but in front of the horizon. The cruise ship is located in the background near the horizon line. The visual sizes correspond to the depth, with the foreground object appearing largest. VERDICT: PASS
openai/gpt-image-2 PASS Evaluator
The image perfectly matches the described spatial arrangement. In the immediate foreground on the left, there is a park bench with several pigeons scattered on the pavement nearby. In the midground, centered in the frame, is a large fountain surrounded by greenery. In the background, rising above the tree line, is a skyline of skyscrapers. The layers are distinct and correctly ordered from front to back. VERDICT: PASS
▸ Scale & Proportions 100% vs 100%
openai/gpt-image-2 PASS Evaluator
The image shows a young man standing next to a standard yellow school bus. The relative scale is realistic: the bus is significantly larger and taller than the person, consistent with real-world dimensions where a school bus is roughly 10-12 feet tall. The person's height relative to the bus door and windows appears accurate. VERDICT: PASS
openai/gpt-image-2 PASS Evaluator
The image shows a ladybug resting on a green leaf. The ladybug appears small relative to the leaf, which occupies most of the frame. This size relationship is consistent with real-world proportions, where a ladybug is a small insect compared to a typical leaf. The visual scale is natural and realistic. VERDICT: PASS
openai/gpt-image-2 PASS Evaluator
The image shows a young boy standing on a basketball court holding a basketball. The size of the basketball relative to the child's hands and torso appears realistic and consistent with real-world proportions. The ball is large enough to be a standard youth basketball but fits naturally in his grip. The background elements like the hoop and door also maintain correct scale. VERDICT: PASS
openai/gpt-image-2 PASS Evaluator
The image accurately depicts a miniature figure standing on the rim of a coffee cup, looking down into the liquid. The scale is consistent, with the person appearing appropriately tiny relative to the cup, saucer, and background objects like the book and coffee beans. The proportions match the prompt's description perfectly. VERDICT: PASS
openai/gpt-image-2 PASS Evaluator
The image depicts a massive tabby cat towering over skyscrapers, consistent with the "kaiju" description. The cat's head is higher than the surrounding buildings, and its paws are enormous compared to the street level and vehicles. Helicopters flying nearby appear tiny relative to the cat, reinforcing the correct scale. The proportions are consistent with the prompt's request for a giant cat in a city setting. VERDICT: PASS
openai/gpt-image-2 PASS Evaluator
The image clearly shows a small, detailed model house resting comfortably on an open human palm. The scale is consistent with a "miniature" object, fitting the description perfectly without looking like a full-sized building or an impossibly tiny speck. The proportions between the hand and the house are realistic for a miniature model. VERDICT: PASS
openai/gpt-image-2 PASS Evaluator
The image displays the animals in a line from left to right: elephant, horse, dog, cat, and mouse. Their relative sizes are accurate to real-world proportions. The elephant is the largest, followed by the horse, then the dog, the cat, and finally the tiny mouse. The scaling is consistent and realistic. VERDICT: PASS
openai/gpt-image-2 PASS Evaluator
The image displays a standard dinner table setting. The plate is central and appropriately sized. The fork and knife flanking it are standard cutlery sizes relative to the plate. The wine glass is a typical size for a wine glass, sitting comfortably next to the plate. The salt shaker and pepper grinder (filled with peppercorns) are positioned to the left and have realistic proportions relative to each other and the other items. All objects maintain consistent and realistic relative sizes. VERDICT: PASS
openai/gpt-image-2 PASS Evaluator
The image displays a farm scene where the relative sizes of the objects are consistent with reality. The tractor is appropriately large compared to the farmer, the cow is a realistic size relative to the farmer and barn, and the chicken is small in the foreground. The barn and silo are large structures in the background, fitting the scale of the scene. VERDICT: PASS
Human realism 98% vs 98%
▸ Faces & Expressions 100% vs 100%
openai/gpt-image-2 PASS Evaluator
The face is symmetric with natural, correctly proportioned features. The expression is calm and serious, with subtle facial cues (slightly downturned mouth, focused gaze) conveying the intended emotion. No distortions are visible. Lighting and texture are realistic. VERDICT: PASS
openai/gpt-image-2 PASS Evaluator
The face is symmetric with natural, correctly proportioned features. The teenager’s expression is relaxed and neutral, with no strong emotion. No distortions are visible. The lighting and composition suit a professional headshot. VERDICT: PASS
openai/gpt-image-2 PASS Evaluator
Face is symmetric with natural features; smile appears warm and genuine, conveying happiness. No obvious distortions. Lighting and detail look authentic. The portrait successfully captures the requested emotion and aesthetic. VERDICT: PASS
openai/gpt-image-2 PASS Evaluator
The face is symmetric with natural features. Disgust is clearly conveyed: nose is wrinkled, lips are curled, and eyebrows are furrowed. No significant distortions; expression is authentic and readable. VERDICT: PASS
openai/gpt-image-2 PASS Evaluator
Face is symmetric with natural features. Eyes are wide, eyebrows raised, mouth slightly open — all clearly conveying genuine surprise. No noticeable distortions. Lighting and detail are realistic. Emotion is unmistakable and well-executed. VERDICT: PASS
openai/gpt-image-2 PASS Evaluator
Face is symmetric with natural, correctly rendered features. Downcast eyes, slightly downturned mouth, and distant gaze clearly convey quiet melancholy. No noticeable distortions. Lighting and composition enhance the emotional tone. VERDICT: PASS
openai/gpt-image-2 PASS Evaluator
The face is symmetric with natural, aged features and deep wrinkles. White, curly hair frames the face appropriately. The warm smile and kind eyes are clearly visible, conveying genuine warmth. No distortions are present. The image accurately fulfills the prompt’s request. VERDICT: PASS
openai/gpt-image-2 PASS Evaluator
The child’s face is symmetric with natural, proportionate features including round cheeks and large eyes. Tears are clearly streaming down both cheeks, and the crying expression—furrowed brow, open mouth, and visible tears—is unmistakable. No distortions are present. The image accurately fulfills the prompt’s requirements. VERDICT: PASS
openai/gpt-image-2 PASS Evaluator
Face is symmetric with natural aging features: age spots, thinning gray hair, and pronounced wrinkles. Expression conveys skepticism via furrowed brow and narrowed eyes behind reading glasses. No distortions or unnatural artifacts. Matches prompt precisely. VERDICT: PASS
openai/gpt-image-2 PASS Evaluator
All three faces show clear, distinct emotions matching the prompt: left woman laughing with head back, center woman shocked with wide eyes, right woman smirking while rolling eyes. Facial features are symmetrical and anatomically correct with no distortions. Expressions are vivid and easily readable. VERDICT: PASS
openai/gpt-image-2 PASS Evaluator
Faces are symmetric with natural features. Emotions are clearly readable: father beams, mother smiles gently, teenager looks bored (hand on cheek, neutral expression), child grins excitedly. No distortions or unnatural artifacts. All four individuals match the prompt’s emotional descriptions. VERDICT: PASS
openai/gpt-image-2 PASS Evaluator
All three faces show clear, symmetric features with no distortions. The man on the left displays frustration (furrowed brows, tense mouth). The woman in the center shows a pleased expression (smile, relaxed eyes). The woman on the right shows confusion (tilted head, furrowed brow). Emotions are clearly readable and match the prompt. VERDICT: PASS
▸ Full Body 100% vs 100%
openai/gpt-image-2 PASS Evaluator
The woman stands upright with arms at her sides, wearing jeans and a white t-shirt as described. Limb count and proportions are anatomically correct. Pose is relaxed and recognizable. No distortions are visible. All elements match the prompt accurately. VERDICT: PASS
openai/gpt-image-2 PASS Evaluator
The man stands straight, facing forward, with hands in pockets — perfectly matching the prompt. Limb count is correct (two arms, two legs). Proportions are natural and anatomically accurate. No distortions are visible. The pose is clearly recognizable and typical for a casual portrait. All criteria are satisfied. VERDICT: PASS
openai/gpt-image-2 PASS Evaluator
The person’s limbs are correctly counted and proportioned. The neutral standing pose is anatomically accurate, with hands in pockets and slight weight distribution. The park setting and full-body framing match the prompt. No distortions are visible. The pose is clearly recognizable as a relaxed, casual stance. VERDICT: PASS
openai/gpt-image-2 PASS Evaluator
Limb count and proportions are correct. The pose—bent knees, mid-air jump over a puddle—is anatomically plausible and clearly recognizable. No significant distortions; muscles, joints, and clothing fold naturally. The jump’s timing and posture align with the prompt. VERDICT: PASS
openai/gpt-image-2 PASS Evaluator
Limb count and proportions are anatomically correct. Pose is a natural, recognizable running stride with dynamic motion in hair and clothing. No significant distortions. The scene matches the prompt’s description of a woman running through a park. VERDICT: PASS
openai/gpt-image-2 PASS Evaluator
The dancer’s pose is a recognizable ballet turn (fouetté or similar), with one leg extended and the other en pointe. Limb count and proportions are anatomically correct. The dress flows dynamically with motion, matching the prompt. No distortions are visible; musculature, posture, and balance appear natural. The action is clearly a dynamic spin with extended leg. VERDICT: PASS
openai/gpt-image-2 PASS Evaluator
The person has two legs and correct proportions. The tree pose is accurately depicted: standing on one leg, the other foot pressed into the inner thigh, arms raised overhead in prayer position. No anatomical distortions are visible. The pose is clearly recognizable and executed correctly. VERDICT: PASS
openai/gpt-image-2 PASS Evaluator
Limb count and proportions are correct. The pose is anatomically plausible for rock climbing — arms and legs extended to different holds, with body tension and gear visible. The action is clearly recognizable as dynamic climbing. No significant distortions; musculature and joint angles appear natural. Gear placement and rope are realistic. VERDICT: PASS
openai/gpt-image-2PASSEvaluator
The cellist has correct limb count and proportions. Her posture is anatomically plausible for playing: seated, legs around the instrument, bow arm extended. The cello is properly positioned with the bridge and strings visible. No major distortions; the pose is recognizable as standard cello playing. Minor artifacts: the bow appears slightly unnaturally thin near the frog, and the cello’s tailpiece seems misaligned with the strings. VERDICT: PASS
openai/gpt-image-2PASSEvaluator
Limb count and proportions are accurate. The pose is a recognizable tango dip with intertwined legs — the woman’s leg hooks over the man’s thigh, and their bodies are in close embrace. Anatomical structure is plausible, though slight exaggeration in the woman’s leg curvature and the man’s torso angle adds dramatic flair. No major distortions. VERDICT: PASS
openai/gpt-image-2PASSEvaluator
Both players show correct limb count and proportions. The sliding player’s body is low with realistic motion blur and grass interaction. The jumping player’s pose is dynamic and anatomically plausible, with arms extended for balance. The ball is at the sliding player’s feet, matching the prompt. No distortions; action is clearly recognizable as a mid-tackle. VERDICT: PASS
openai/gpt-image-2PASSEvaluator
Both gymnasts show correct limb count and proportions. Their synchronized handstand pose is anatomically accurate, with legs straight, feet together, and hands supporting weight. Alignment is nearly identical, with minor natural asymmetries. No distortions; muscles and joints appear realistic. The action is clearly recognizable as a synchronized handstand. VERDICT: PASS
▸Hands 92%vs92%
openai/gpt-image-2PASSEvaluator
Both hands clearly show five fingers each — thumb, index, middle, ring, and pinky — with natural, relaxed positioning. The anatomy is accurate: no extra or missing digits, no unnatural deformities. Fingers are slightly spread, consistent with a relaxed stance. No anomalies detected. VERDICT: PASS
openai/gpt-image-2PASSEvaluator
Both hands show all ten fingers clearly — five per hand. The gesture is anatomically correct: palms up, fingers spread naturally. No anomalies like extra or missing fingers, fused digits, or unnatural deformities. Fingernails and skin texture appear normal. The pose matches the prompt precisely. VERDICT: PASS
openai/gpt-image-2PASSEvaluator
Both hands show five distinct fingers each. The gesture is anatomically plausible: hands rest on opposite upper arms, fingers slightly spread, nails visible. No anomalies like extra or missing fingers, or unnatural joint angles. Fingers are correctly positioned for the pose. VERDICT: PASS
openai/gpt-image-2PASSEvaluator
The visible right hand holds the mug with a natural, anatomically correct grip: thumb on top, fingers curled around the handle. All five fingers are clearly visible and intact. No anomalies. The gesture matches the prompt perfectly. VERDICT: PASS
openai/gpt-image-2PASSEvaluator
Right hand shows all five fingers gripping the pen in a natural, anatomically correct writing posture. Left hand is partially visible, resting on the table, also showing five fingers. No anomalies detected. Grip matches standard pen-holding technique. VERDICT: PASS
openai/gpt-image-2PASSEvaluator
Left hand on fretboard shows 5 fingers clearly (thumb, index, middle, ring, pinky) in proper pressing position. Right hand strumming shows 5 fingers (thumb, index, middle, ring, pinky) — though thumb is partially obscured, all 5 are anatomically present and positioned correctly for strumming. No anomalies. Fingers are proportionate and joints look natural. VERDICT: PASS
openai/gpt-image-2PASSEvaluator
The right hand clearly shows five fingers: index and middle extended (peace sign), ring, pinky, and thumb curled. The gesture is anatomically correct with natural finger positioning. No anomalies or missing fingers. The left hand is not visible, but the prompt only required evaluation of the right hand. VERDICT: PASS
openai/gpt-image-2PASSEvaluator
The hand shows five fingers: thumb extended upward, and four fingers curled into a fist. The gesture is anatomically correct — thumb opposes the curled fingers naturally. No anomalies in finger count or structure. Skin texture and joint alignment appear realistic. VERDICT: PASS
openai/gpt-image-2FAILEvaluator
The visible hand shows exactly three extended fingers (index, middle, ring), with thumb and pinky folded. No other hand is visible. The gesture is anatomically correct for counting “three.” The prompt asked for “five per hand,” which is not met. Only one hand is shown, and it has three fingers extended, not five. VERDICT: FAIL
openai/gpt-image-2PASSEvaluator
Both hands show five clearly defined fingers each, with all fingers extended as requested. The high-five gesture is anatomically correct — palms meet squarely, fingers aligned naturally, no unnatural bending or extra digits. No anomalies detected. The image accurately fulfills the prompt’s requirements. VERDICT: PASS
openai/gpt-image-2PASSEvaluator
Both hands show five clearly defined fingers each. The handshake is anatomically correct: palms facing slightly inward, fingers interlocked, thumbs resting on the opposite hand’s palm. No anomalies like extra or missing fingers, malformed joints, or unnatural positioning. The gesture is natural and professional. VERDICT: PASS
openai/gpt-image-2PASSEvaluator
Both hands show five fingers each, with no anomalies. The grip is anatomically correct: thumbs support the sides, fingers cradle the box securely. The gesture is natural for handing over an object. All fingers are clearly defined and positioned realistically. VERDICT: PASS
▸Multi-Subject 100%vs100%
openai/gpt-image-2PASSEvaluator
Person 1: Short blonde hair, wearing a red blazer over a black top — matches description. Person 2: Long black hair, wearing a blue dress — matches description. Both are visually distinct and correctly portrayed. VERDICT: PASS
openai/gpt-image-2PASSEvaluator
Person 1: Bearded, wears glasses, green sweater — matches description. Person 2: Clean-shaven, wears black hoodie — matches description. Both individuals are visually distinct and clearly identifiable. VERDICT: PASS
openai/gpt-image-2PASSEvaluator
Person 1: Tall, curly red hair, denim overalls — matches description. Person 2: Short, straight brown hair, yellow sundress — matches description. Both are visually distinct and pose together. VERDICT: PASS
openai/gpt-image-2PASSEvaluator
Tall woman: pink short hair, playing bass, tattoos visible — matches. Bearded man: bandana, playing drums — matches. Slim man: glasses, playing keyboard — matches. All three are visually distinct and correctly positioned. VERDICT: PASS
openai/gpt-image-2PASSEvaluator
Man: Sunhat, backpack — matches. Woman: Braided hair, hiking poles, backpack — matches. Teenager: Baseball cap, water bottle — matches. All three are visually distinct and accurately represent the described features. VERDICT: PASS
openai/gpt-image-2PASSEvaluator
All four individuals are present and visually distinct, matching the prompt:
- Bald man: wears suit and tie.
- Woman with silver hair: red blouse, earrings.
- Young man with dreadlocks: denim jacket over white tee.
- Woman with hijab: green dress with embroidery.
VERDICT: PASS
Truthfulness 89%vs93%
▸Photorealism 100%vs100%
openai/gpt-image-2PASSEvaluator
The image successfully renders all requested materials with high fidelity. The laptop exhibits the matte, metallic sheen of brushed aluminum. The notebook displays a convincing leather texture with visible grain and stitching. The coffee mug has the speckled, matte finish characteristic of ceramic stoneware. The paperweight is a clear glass sphere showing realistic refraction, internal bubbles, and reflections of the surrounding environment. The wooden desk surface also shows appropriate grain
openai/gpt-image-2PASSEvaluator
The image successfully renders the requested materials with accurate visual properties. The stainless steel pot exhibits sharp, realistic reflections of the window and surrounding environment. The wooden cutting board shows natural grain and appropriate specular highlights on its surface. The glass bottle demonstrates correct translucency and refraction, with the olive oil visible inside and distinct highlights on the glass. The cotton dish towel displays a soft, woven texture with realistic f
openai/gpt-image-2PASSEvaluator
The image successfully renders all requested materials with high fidelity. The glass shows accurate refraction and transparency. The silver spoon exhibits sharp, distorted reflections consistent with polished metal. The wooden table features realistic grain and texture. The linen napkin displays a convincing woven fabric texture with natural folds. The apple has a realistic waxy skin with appropriate specular highlights. VERDICT: PASS
▸Physics & Reflections 83%vs92%
openai/gpt-image-2PASSEvaluator
The image displays a red sphere with a specular highlight on the upper-left side, indicating a light source from that direction. Consequently, a soft, diffuse shadow is cast to the right and slightly downward, which is physically consistent with the lighting. The shadow's gradient and color are realistic for a white surface. VERDICT: PASS
openai/gpt-image-2PASSEvaluator
The shadow is short and falls directly beneath and slightly to the side of the subject, consistent with a high sun position at midday. The lighting on the subject's face and clothing aligns with this overhead light source. The shadow's color and opacity are realistic for a sunny day on concrete. VERDICT: PASS
openai/gpt-image-2PASSEvaluator
The lighting source is clearly the window on the right, illuminating the right side of the mug. Consequently, the shadow is cast to the left, which is physically correct. The shadow is sharp and distinct, consistent with direct sunlight. The shape of the shadow accurately reflects the mug's cylindrical form and handle. The overall lighting and shadow interaction are realistic and plausible. VERDICT: PASS
openai/gpt-image-2FAILEvaluator
The reflection is physically implausible. The woman in the foreground is facing towards the mirror (showing the front of her dress), yet the reflection shows her back. In a real mirror, if a person faces the mirror, their reflection faces them (showing the front). To see the back of the outfit in the mirror, the woman would have to be facing away from the mirror, which contradicts her pose in the foreground. The image attempts to show both views simultaneously in a way that violates optical ph
openai/gpt-image-2PASSEvaluator
The reflection is physically plausible. It is correctly oriented as an inverted vertical mirror image directly beneath the cat. The distortion increases with distance from the contact point, consistent with a polished marble surface. The brightness and color saturation are slightly reduced in the reflection, which is realistic. The reflection of the background elements (door, plant) also aligns correctly with the perspective. VERDICT: PASS
openai/gpt-image-2PASSEvaluator
The reflection of the vase and the stack of books on the glass table is physically plausible. The reflection is correctly inverted vertically, aligned with the base of the objects, and exhibits appropriate transparency and slight distortion consistent with a glass surface. The lighting and shadows are consistent with the window light source on the left. VERDICT: PASS
openai/gpt-image-2PASSEvaluator
The reflections are physically plausible. They are correctly inverted vertically and aligned with the trees above. The "gently rippled" effect is accurately depicted through horizontal distortion of the vertical tree trunks and canopy. The color matches the sunset sky, and the reflections are appropriately slightly darker and less sharp than the direct view, consistent with light interacting with a water surface. VERDICT: PASS
openai/gpt-image-2PASSEvaluator
The reflection is physically plausible. It is a vertically inverted mirror image of the mountain range and treeline, aligning perfectly with the objects above. The water surface is calm enough to produce a sharp reflection, though slightly darker and less saturated than the actual landscape, which accurately mimics how light behaves when reflecting off water. The lighting on the peaks is consistent in both the real scene and the reflection. VERDICT: PASS
openai/gpt-image-2PASSEvaluator
The reflections are physically plausible. The buildings are correctly inverted, pointing downwards into the water. The distortion is consistent with a rippling puddle surface, breaking up the vertical lines of the skyscrapers. The colors accurately reflect the warm city lights and the cool, twilight sky tones, with a natural shift caused by the water's surface. VERDICT: PASS
openai/gpt-image-2ERROR
openai/gpt-image-2PASSEvaluator
The reflections on the chrome sculpture are physically plausible, showing distorted environmental mapping of the windows, sky, and room consistent with the object's curvature. The glass floor correctly reflects the sculpture's base and lower form. Shadows are cast in a direction consistent with the strong natural light coming from the large windows on the left, and self-shadowing within the sculpture's folds aligns with this light source. VERDICT: PASS
openai/gpt-image-2PASSEvaluator
The shadows and reflections are physically plausible. The shadow cast by the wine glass falls to the left, consistent with the light source coming from the window on the right. The reflection on the polished marble is directly beneath the glass, maintaining correct orientation and distortion. The refraction through the cut crystal glass distorts the background view realistically, and the lake outside shows a natural reflection of the sky and clouds. VERDICT: PASS
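The 83% shown for this subcategory can be reproduced from the twelve verdicts listed above. The following is a minimal sketch, assuming the score is the share of PASS verdicts over all entries and that the ERROR entry stays in the denominator. That rule is an inference from the numbers, not something the page states, and `pass_rate` is a hypothetical helper name.

```python
from collections import Counter

# Verdicts of the 12 Physics & Reflections entries, in page order.
verdicts = [
    "PASS", "PASS", "PASS", "FAIL", "PASS", "PASS",
    "PASS", "PASS", "PASS", "ERROR", "PASS", "PASS",
]

def pass_rate(verdicts):
    """Percentage of PASS verdicts, rounded to a whole number.

    Assumption: ERROR entries count as non-passing and remain
    in the denominator.
    """
    counts = Counter(verdicts)
    return round(100 * counts["PASS"] / len(verdicts))

print(pass_rate(verdicts))  # 10 of 12 entries pass
```

Under this assumption, 10 passes out of 12 entries round to 83. Dropping the ERROR entry from the denominator would give 10 of 11 instead, which does not match the displayed figure.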
▸World Knowledge 92%vs92%
openai/gpt-image-2PASSEvaluator
The image depicts the Eiffel Tower accurately, featuring its iconic iron lattice structure, three distinct viewing platforms, and the characteristic tapering shape. The setting includes the Seine river with a tour boat and a stone bridge (likely the Pont d'Iéna) in the foreground, consistent with a view from Paris. The sky is blue with scattered clouds, fitting the "clear day" description. The architectural details are faithful to the real-world landmark. VERDICT: PASS
openai/gpt-image-2PASSEvaluator
The image depicts the Statue of Liberty with high accuracy. The statue features the correct green patina, a golden torch held high, a spiked crown, and a tablet held against the left arm. The architectural details of the pedestal, including the central tower and the surrounding star-fort walls, are factually correct. The surrounding environment, including the water, trees, and American flag, is consistent with the real-world location. VERDICT: PASS
openai/gpt-image-2FAILEvaluator
The image depicts the Taj Mahal, but it contains significant architectural inaccuracies. While the symmetry and general layout (reflecting pool, cypress trees) are correct, the minarets are wrong. The Taj Mahal has four minarets at the corners of the plinth. This image shows two large minarets in the foreground and two smaller minarets flanking the main structure where smaller domes (chhatris) should be. Additionally, the finial on the main dome is simplified and lacks the traditional crescent
openai/gpt-image-2PASSEvaluator
The image depicts a shamisen that is highly recognizable and largely accurate to the prompt. It features the correct three-stringed design, a wooden body (dai) covered with a white membrane, and the traditional tuning pegs (kama) at the headstock. The bachi (plectrum) is present and accurately shaped with a tortoiseshell pattern. While the body shape is slightly more rectangular than some traditional curved styles, it remains a faithful representation of the instrument's core components. VERD
openai/gpt-image-2PASSEvaluator
The image depicts a highly realistic brass astrolabe. It features the key components requested: a central rete with star pointers (labeled with names like Lyra, Cygnus, Draco), a tympan underneath with concentric circles, and an outer mater with clear engraved degree markings. The inclusion of Arabic script alongside Latin/English star names is historically plausible for such instruments. The lighting, texture, and mechanical details (like the alidade rule) are convincing and accurate to the r
openai/gpt-image-2PASSEvaluator
The image depicts a ceramic bowl with visible fractures repaired using gold lacquer, perfectly matching the description of a Japanese kintsugi bowl. The texture of the glaze, the natural-looking fracture patterns, and the application of the gold repair are all realistic and factually accurate to the art form. VERDICT: PASS
openai/gpt-image-2PASSEvaluator
The image is a clear, accurate, and well-labeled diagram of the water cycle. It correctly depicts all requested stages: evaporation from the ocean (Step 1), condensation into clouds (Step 2), precipitation as rain (Step 3), collection in rivers (Step 4), and the return flow to the sea (Step 5). The visual elements, including the sun, clouds, mountains, and water bodies, are realistic and appropriately arranged to illustrate the scientific process. VERDICT: PASS
openai/gpt-image-2PASSEvaluator
The image is a highly accurate and detailed educational diagram depicting an oceanic-continental convergent boundary. It clearly labels all requested elements: the subducting oceanic plate, the trench, the volcanic arc on the continental plate, and the distinct layers of the Earth (lithosphere, asthenosphere, lower mantle, outer core). The geological processes shown, such as water release lowering the melting point to generate magma, are scientifically correct. VERDICT: PASS
openai/gpt-image-2PASSEvaluator
The image is a highly accurate and standard anatomical illustration of the human heart. It correctly depicts the four chambers (Right/Left Atrium and Ventricle) with appropriate color coding for deoxygenated (blue) and oxygenated (red) blood. The major vessels (Aorta, Pulmonary Arteries/Veins, Vena Cavae) and valves (Tricuspid, Pulmonary, Mitral, Aortic) are labeled and positioned correctly according to anatomical standards. VERDICT: PASS
openai/gpt-image-2PASSEvaluator
The image accurately depicts a *Morpho menelaus* butterfly. The top and bottom-right panels show the characteristic iridescent blue dorsal wings with dark borders and white spots. The bottom-left panel correctly displays the brown ventral wing pattern featuring prominent eyespots, which is the underside of the same species. The body morphology, including antennae and wing venation, appears anatomically correct. VERDICT: PASS
openai/gpt-image-2PASSEvaluator
The image accurately depicts the interior of the Hagia Sophia. It correctly features the massive central dome with its characteristic windows, the pendentives adorned with large angel figures (cherubim), and the semi-domes extending outward. The large Islamic calligraphy medallions hanging from the piers are present and correctly styled, alongside visible Byzantine mosaics in the apse area. The architectural layout and decorative elements are factually consistent with the real-world structure.
openai/gpt-image-2PASSEvaluator
The image depicts a highly realistic bismuth crystal. It features the classic cubic hopper formation with hollowed-out centers and distinct, stepped "staircase" edges. The surface displays vibrant, shifting iridescent colors (gold, green, blue, purple) caused by the thin oxide layer, which is characteristic of real bismuth specimens. The geometry and texture are factually accurate to the mineral. VERDICT: PASS
Professional Studio 93%vs89%
▸Camera & Lighting 100%vs100%
openai/gpt-image-2PASSEvaluator
The image successfully matches the prompt's requirements. It features a wide-angle perspective that captures an expansive living room area. Most notably, the image exhibits significant barrel distortion, visible in the curved edges of the ceiling and walls, which creates the "fisheye" effect requested. The lighting is even and natural, and the depth of field is deep, keeping both the foreground seating and the background staircase in clear focus. VERDICT: PASS
openai/gpt-image-2PASSEvaluator
The image perfectly matches the prompt. It features a close-up macro shot of a single, clear dewdrop resting on a red rose petal. The depth of field is extremely shallow, with the background and foreground heavily blurred, focusing entirely on the droplet and the immediate texture of the petal. The lighting is soft and natural, highlighting the transparency and refraction within the dewdrop. VERDICT: PASS
openai/gpt-image-2PASSEvaluator
The image perfectly matches the prompt. It provides a true aerial bird's-eye view, looking straight down at a traffic roundabout. The perspective is top-down, and the lighting is even, consistent with daylight. The scene includes the requested elements: a roundabout, cars, and surrounding trees. The depth of field is consistent with a high-altitude aerial shot, maintaining clarity across the entire scene. VERDICT: PASS
openai/gpt-image-2PASSEvaluator
The image perfectly matches the prompt. The lighting is a classic Rembrandt style, characterized by a distinct, small triangle of light on the shadowed cheek (the subject's right cheek, viewer's left). The atmosphere is dark and moody, with a shallow depth of field that keeps the focus on the subject's face against a dark, indistinct background. The perspective is a standard portrait angle. VERDICT: PASS
openai/gpt-image-2PASSEvaluator
The image perfectly matches the prompt. The lighting is a classic "golden hour" scene, featuring a bright, low sun that creates a warm orange sky and a soft glow across the landscape. Long, warm shadows are visible along the cliffs and the path, and the light reflecting off the water enhances the atmospheric depth. The perspective and lighting patterns are consistent with a high-quality landscape photograph. VERDICT: PASS
openai/gpt-image-2PASSEvaluator
The image perfectly matches the prompt. It features a high-key portrait with a bright, seamless white background. The lighting is even and luminous, creating a soft, airy feel with virtually no harsh shadows on the subject's face or the background. The perspective is a standard close-up portrait, and the depth of field is shallow, keeping the focus entirely on the subject's expression. VERDICT: PASS
openai/gpt-image-2PASSEvaluator
The image perfectly matches the prompt. The perspective is a classic portrait shot. The depth of field is shallow, creating a very creamy, soft bokeh background that obscures the garden details. Most importantly, the lighting features a strong, warm rim light (backlighting) that creates a glowing halo around the subject's hair and shoulders, effectively separating her from the background as requested. VERDICT: PASS
openai/gpt-image-2PASSEvaluator
The image successfully matches the prompt's technical requirements. The subject is sharp and in focus, while the background features a shallow depth of field with soft, out-of-focus city lights (bokeh). The lighting pattern is consistent with a night-time street photograph, and the perspective effectively mimics a 50mm lens at a wide aperture (f/1.8) by isolating the subject from the background. VERDICT: PASS
openai/gpt-image-2PASSEvaluator
The image is a top-down (flat lay) shot, which matches the "shot from above" instruction. The lighting comes from the upper left, creating soft shadows and highlights consistent with "natural window light from the left." The color temperature is warm, and there is a subtle vignette around the edges. The depth of field is relatively shallow, consistent with an f/2.8 aperture. All technical specifications are met. VERDICT: PASS
openai/gpt-image-2PASSEvaluator
The image perfectly captures the tilt-shift effect. It features a strong selective focus strip (shallow depth of field) that blurs the top and bottom of the frame, creating the miniature model illusion. The colors are highly saturated and vibrant, and the high-angle perspective enhances the toy-like appearance. The lighting is bright and even, consistent with a daylight street scene. VERDICT: PASS
openai/gpt-image-2PASSEvaluator
The image perfectly matches the prompt. It features a cinematic, dark, and moody scene with a shallow depth of field focusing on the subject. Most importantly, it displays prominent blue and orange anamorphic lens flares streaking horizontally across the frame, exactly as described. The lighting and perspective align with the requested cinematic aesthetic. VERDICT: PASS
openai/gpt-image-2PASSEvaluator
The image perfectly matches the prompt. The lighting is a textbook example of dramatic split lighting/chiaroscuro: one side of the face is brightly illuminated while the other is cast in deep, heavy shadow. The high contrast between the light and dark areas is sharp, and the close-up portrait perspective creates an intimate, dramatic effect. The depth of field is shallow, keeping the focus on the facial features. VERDICT: PASS
▸Color Precision 92%vs100%
openai/gpt-image-2PASSEvaluator
The image features a bright, vivid red sports car parked on a gray asphalt road. The red color is intense, saturated, and unambiguous, perfectly matching the prompt's description. The road is a neutral gray, and the surrounding environment maintains color accuracy without any deviation from the specified hues. VERDICT: PASS
openai/gpt-image-2PASSEvaluator
The woman is wearing a vibrant, deep blue evening dress that perfectly matches the "royal blue" description. The background is a solid, light gray/beige tone, which qualifies as a "neutral background." All color requirements specified in the prompt are accurately represented in the image. VERDICT: PASS
openai/gpt-image-2PASSEvaluator
The image features a vast field of bright, vibrant yellow sunflowers under a solid, clear blue sky. The colors perfectly match the prompt's requirement for "pure sunflower yellow flowers" and a "clear blue sky." There are no deviations in color saturation or hue from the described natural tones. VERDICT: PASS
openai/gpt-image-2PASSEvaluator
The wall color is a vibrant, saturated orange. The hex code #FF6B35 represents a bright, slightly reddish-orange (often called "burnt orange" or "sunset orange"). The image accurately reflects this specific hue and intensity, matching the prompt's requirement for a burnt orange wall. The white furniture and natural daylight are also correctly depicted. VERDICT: PASS
openai/gpt-image-2FAILEvaluator
The foliage in the image consists of various shades of green, but it does not strictly adhere to the specific hex code #2D5F2D. The colors are more vibrant and lighter in several areas than the requested "dark muted green." While the scene is a forest, the lack of precision regarding the specific color value requested in the prompt results in a failure to meet the exact color specification. VERDICT: FAIL
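The evaluator judges hex adherence visually. A numeric check is one way such a comparison could be made concrete. The sketch below is an illustration only: the sample pixel, the tolerance of 40, and the helper names are all hypothetical, and plain Euclidean distance in RGB is a simplification rather than a perceptual color metric.

```python
import math

def hex_to_rgb(code):
    """Convert '#RRGGBB' into an (R, G, B) tuple of 0-255 integers."""
    code = code.lstrip("#")
    return tuple(int(code[i:i + 2], 16) for i in (0, 2, 4))

def rgb_distance(a, b):
    """Euclidean distance between two RGB triples."""
    return math.dist(a, b)

def matches_target(sample, target_hex, tolerance=40.0):
    """True if the sample lies within `tolerance` of the target color.

    The tolerance value is an arbitrary placeholder, not a standard.
    """
    return rgb_distance(sample, hex_to_rgb(target_hex)) <= tolerance

target = hex_to_rgb("#2D5F2D")          # the dark muted green from the prompt
hypothetical_sample = (90, 160, 70)     # made-up lighter, more vibrant green

print(target)
print(matches_target(hypothetical_sample, "#2D5F2D"))
```

A brighter green like the made-up sample falls well outside the placeholder tolerance, which mirrors the evaluator's complaint that the foliage is lighter and more vibrant than requested.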
openai/gpt-image-2PASSEvaluator
The walls in the image are a vibrant, medium blue that aligns well with the description of #4A90D9 (a medium blue). The floor is a warm wooden tone, and the trim around the window, door, and ceiling is clearly white, matching the prompt's requirements for color accuracy. VERDICT: PASS
openai/gpt-image-2PASSEvaluator
The image accurately reflects the requested colors. The walls are a warm cream color (#F5E6CC), the sofa is a dark navy blue (#2C3E50), and the accent pillows are a bright, vibrant red (#E74C3C). All elements align perfectly with the specified hex values and color names provided in the prompt. VERDICT: PASS
openai/gpt-image-2PASSEvaluator
The image features a deep dark blue background, which aligns with #1A1A2E. The watch is a soft gold color, matching the #F0C27F description. The earbuds are pure white, consistent with #FFFFFF. All specified colors and their corresponding hex values are accurately represented in the flat-lay composition. VERDICT: PASS
openai/gpt-image-2PASSEvaluator
The image matches the specified colors accurately. The background is a near-white color (#FAFAFA), the large main shape is a vibrant orange-red (#FF4500), and the text and accent lines are a dark charcoal color (#333333). The color palette is consistent with the hex values provided in the prompt. VERDICT: PASS
openai/gpt-image-2PASSEvaluator
The image shows a smooth gradient starting with a vibrant red on the left and transitioning through a purple middle section to a deep blue on the right. The colors align perfectly with the specified hex values: #FF0000 (pure red) on the left and #0000FF (pure blue) on the right, with an even transition through purple in between. VERDICT: PASS
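The purple midsection described above is what a straight blend of the two endpoints would produce. A small sketch, assuming simple per-channel linear interpolation in RGB (the prompt does not say which color space or easing the gradient should use, and `lerp_hex` is a hypothetical helper):

```python
def hex_to_rgb(code):
    """Convert '#RRGGBB' into an (R, G, B) tuple of 0-255 integers."""
    code = code.lstrip("#")
    return tuple(int(code[i:i + 2], 16) for i in (0, 2, 4))

def lerp_hex(start_hex, end_hex, t):
    """Blend two hex colors per channel; t=0 gives start, t=1 gives end."""
    start, end = hex_to_rgb(start_hex), hex_to_rgb(end_hex)
    blended = [round(s + (e - s) * t) for s, e in zip(start, end)]
    return "#{:02X}{:02X}{:02X}".format(*blended)

print(lerp_hex("#FF0000", "#0000FF", 0.0))  # left edge
print(lerp_hex("#FF0000", "#0000FF", 0.5))  # midpoint
print(lerp_hex("#FF0000", "#0000FF", 1.0))  # right edge
```

Under this assumption the midpoint comes out as #800080, a purple with equal red and blue components.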
openai/gpt-image-2PASSEvaluator
The image displays a vertical gradient. The bottom color is a bright orange, which is a close approximation of #FF6B35. The top color is a deep, dark indigo/purple, which aligns with the description of #1B1464. The transition between the two colors is smooth and follows the specified gradient direction from orange at the bottom to deep indigo at the top. VERDICT: PASS
openai/gpt-image-2PASSEvaluator
The image displays a horizontal gradient transitioning from a bright cyan color on the left to a light mint green on the right. The left side closely matches the #00C9FF (cyan) hue, and the right side matches the #92FE9D (mint green) hue. The transition is smooth and follows the specified color direction and hex values accurately. VERDICT: PASS
▸Photorealism 67%vs0%
openai/gpt-image-2PASSEvaluator
The image is a high-quality product shot that successfully mimics professional studio photography. The lighting is soft, the shadows are realistic, and the texture of the fabric and suede appears consistent. There are no obvious AI artifacts, such as warped geometry, nonsensical stitching, or melting textures. The white background is clean, and the single red sneaker looks like a legitimate commercial photograph. VERDICT: PASS
openai/gpt-image-2FAILEvaluator
The image fails to meet the criteria for a real photograph due to significant AI artifacts. The most glaring issue is the "glitch" or distorted character at the bottom right of the reflection, which appears as a nonsensical, jagged shape. Additionally, the internal straw (dip tube) is unnaturally thick and lacks the physical realism of a real object. The overall geometry of the glass edges is slightly inconsistent, indicating it is AI-generated. VERDICT: FAIL
openai/gpt-image-2PASSEvaluator
The image successfully meets the criteria for a studio-quality product shot. The lighting is soft and natural, creating a subtle shadow that grounds the object. The texture of the green apple is realistic, and the seamless white background is clean. There are no obvious AI artifacts, such as unnatural warping or distorted textures, and the stem appears anatomically correct. It looks like a professional catalog photograph. VERDICT: PASS
Graphical design 100%vs92%
▸Layout & Design 100%vs89%
openai/gpt-image-2PASSEvaluator
All described elements are present: “MOVIE NIGHT” at the top in bold, a large popcorn bucket centered, and “EVERY FRIDAY AT 8 PM” at the bottom. Text is readable and correctly placed with clear visual hierarchy. The layout matches the prompt precisely. VERDICT: PASS
openai/gpt-image-2PASSEvaluator
The poster includes a prominent guitar illustration, the headline “LIVE MUSIC FEST” in large, bold text with clear visual hierarchy, and “TICKETS AVAILABLE NOW” as a tagline below, correctly placed. All text is readable and properly positioned. The layout matches the prompt’s requirements. VERDICT: PASS
openai/gpt-image-2PASSEvaluator
All required elements are present: open book illustration, title “JOIN THE BOOK CLUB” with correct visual hierarchy (large, centered), and tagline “Meets every Wednesday” on a banner at the bottom. Text is readable and correctly placed. Layout is cohesive and follows the prompt’s structure. VERDICT: PASS
openai/gpt-image-2 PASS Evaluator
Hero header image present at top. Headline “Discover. Learn. Grow.” and subtitle below are centered and readable. “Sign Up” button is clearly placed below. Footer contains three social media icons (Facebook, Twitter, Instagram). Visual hierarchy is correct: image → text → CTA → footer. All required elements are present and correctly positioned. VERDICT: PASS
openai/gpt-image-2 PASS Evaluator
All specified layout zones are present: logo header, three category sections (appetizers, mains, desserts) with items and prices, and a footer with address and hours. Visual hierarchy is clear, with consistent typography and spacing. Text is fully readable and correctly placed. No elements are missing or misaligned. VERDICT: PASS
openai/gpt-image-2 PASS Evaluator
All layout zones are present: banner header, two-column body (text left, image right), quote block, and footer with unsubscribe link. Visual hierarchy is clear, text is readable and correctly placed. The quote block is appropriately styled and attributed. All elements align with the prompt’s structure and intent. VERDICT: PASS
openai/gpt-image-2 PASS Evaluator
Left page: Large hero photo with headline and subhead — present. Right page: Two-column article, pull quote, author bio sidebar, page number — all correctly placed with clear visual hierarchy. Text is readable and properly formatted. Layout matches prompt exactly. VERDICT: PASS
openai/gpt-image-2 PASS Evaluator
All specified UI elements are present: top navigation bar (with time, menu, and notifications), search field, 2x2 feature cards with icons and text, recent activity list, and bottom tab bar. Text is readable and correctly placed. Visual hierarchy is clear and consistent with mobile app conventions. Layout matches the prompt precisely. VERDICT: PASS
openai/gpt-image-2 PASS Evaluator
Full-bleed left-page photo present. Right page has three-column layout, headline “New Directions,” body copy, inset image, and styled page numbers (42/43). Text is readable and correctly placed with clear visual hierarchy. All specified layout zones and elements are accurately rendered. VERDICT: PASS
▸ Data Visualisation 100% vs 100%
openai/gpt-image-2 PASS Evaluator
All requested elements are present and accurate: correct percentages (35%, 25%, 20%, 20%), proper labels, distinct colors, and readable text. The chart type is a pie chart as requested. No errors in data or presentation. VERDICT: PASS
openai/gpt-image-2 PASS Evaluator
The chart correctly displays a line graph with the requested title "Weekly Traffic", axis labels ("Visitors" and "Day of the Week"), and grid lines. Data points align with the specified values: Mon (1200), Tue (1500), Wed (1100), Thu (1800), Fri (2000), Sat (1700), Sun (1400). All elements are clearly readable and the chart type matches the request. VERDICT: PASS
openai/gpt-image-2 PASS Evaluator
All requested elements are present and accurate: bar chart type, labeled axes (Month and Sales in Thousands of USD), title "Monthly Sales 2025", color-coded bars, and exact sales values (12K, 18K, 15K, 22K, 19K, 25K) clearly labeled on bars. Readability is high. VERDICT: PASS
▸ Style Diversity 100% vs 92%
openai/gpt-image-2 PASS Evaluator
The image clearly depicts a golden retriever in a garden, matching the subject. It features large, expressive eyes and a stylized, clean aesthetic consistent with Japanese anime. However, the fur has subtle shading and texture, and the background has depth and lighting that deviate from strict flat-color anime. The style is recognizable but not perfectly aligned with “flat colors.” VERDICT: PASS
openai/gpt-image-2 PASS Evaluator
The image clearly depicts a golden retriever in a garden setting. It exhibits visible brushstrokes, textured surfaces, and a painterly quality consistent with oil painting. The style matches the prompt’s request for rich texture and brushwork. VERDICT: PASS
openai/gpt-image-2 PASS Evaluator
The image clearly depicts a golden retriever in a garden, rendered in a distinct pixel art style with visible 32x32 grid blocks and a limited, blocky color palette. Key characteristics — blocky forms, low-resolution textures, and simplified shapes — are present. The scene matches the prompt’s requirements. VERDICT: PASS
openai/gpt-image-2 PASS Evaluator
The image perfectly embodies Art Deco: geometric facades, bold gold-and-black palette, ornamental symmetry, sunburst motifs, and stylized signage. Classic elements like the Chrysler Building-inspired skyscraper, vintage cars, and period attire reinforce the style. All requested characteristics are unmistakably present. VERDICT: PASS
openai/gpt-image-2 PASS Evaluator
The image clearly embodies Impressionism: visible brushstrokes, dappled light, and a pastel palette dominate. The woman holds a parasol, and the scene captures fleeting outdoor moments with soft focus and atmospheric color. All requested stylistic elements are unmistakably present. VERDICT: PASS
openai/gpt-image-2 PASS Evaluator
The image clearly embodies Pop Art: bold black outlines, Ben-Day dot patterns, and flat primary colors (red, yellow, blue). The subject’s stylized face and comic-book background (starburst) reinforce the style. All requested visual characteristics are unmistakably present. VERDICT: PASS
openai/gpt-image-2 PASS Evaluator
The image blends photorealistic detail with visible watercolor bleeds and paper texture, matching the prompt. Lighting and depth resemble photography, while edges show soft watercolor washes and texture. Key characteristics are clearly present. VERDICT: PASS
openai/gpt-image-2 PASS Evaluator
The image successfully blends cyberpunk’s neon-lit, tech-heavy cityscape with Art Nouveau’s flowing organic lines and floral motifs. The woman’s ornate, floral-adorned attire and the decorative border frame echo Art Nouveau, while glowing neon signs, cybernetic elements, and flying vehicles embody cyberpunk. Key visual characteristics are clearly recognizable and well-integrated. VERDICT: PASS
openai/gpt-image-2 PASS Evaluator
The image successfully blends ukiyo-e style (flat planes, bold outlines, woodblock texture, vertical text) with photorealistic sky and lighting (detailed clouds, atmospheric depth, realistic reflections). The castle’s structure is medieval, and the composition mirrors traditional ukiyo-e framing. Key characteristics are clearly present and harmoniously combined. VERDICT: PASS
openai/gpt-image-2 PASS Evaluator
The image clearly embodies soft brutalism: raw concrete walls, ceiling, and staircase contrast with rounded, pastel pink furniture. Warm, diffused light from the skylight and window enhances the serene, minimalist atmosphere. Textures are visible, and the composition balances industrial rawness with organic curves. All key elements of the prompt are faithfully rendered. VERDICT: PASS
openai/gpt-image-2 PASS Evaluator
The image delivers “anxious minimalism” with sparse composition: only two objects on a table, vast negative space, and a dark, muted palette. The wall’s stark, empty rectangle adds visual tension. Subtle textures and restrained lighting enhance unease without overt drama. All key elements of the prompt are clearly present. VERDICT: PASS
openai/gpt-image-2 PASS Evaluator
The image clearly embodies retro-futurism with 1960s space-age aesthetics: flying saucers, domed architecture, and sleek towers. Warm, aged tones and a melancholic patina enhance the nostalgic feel. Figures in period attire and vintage signage (e.g., “CONTINENTAL HOTEL”) reinforce the era. Visual cues are unmistakably aligned with the prompt. VERDICT: PASS