ImageBench

sana-local/sana-1.5-1.6b vs fal/fal-ai/nano-banana-2

192 evaluations across 6 categories

Pass Rate: 53% vs 94%
Avg Latency: 11.1s vs 28.1s

Text Rendering: 20% vs 87%
Spatial Reasoning: 49% vs 97%
Human realism: 67% vs 98%
Truthfulness: 48% vs 93%
Professional Studio: 74% vs 89%
Graphical design: 38% vs 92%
Latency: 29% vs 2%
Text Rendering · Typography Style · Easy · fal/fal-ai/nano-banana-2

Prompt: The word 'CHAPTER ONE' typed on aged paper with a vintage typewriter font, complete with slightly uneven ink

Text Rendering 20% vs 87%

Typography Style 67% vs 100%
Easy 67% vs 100%
sana-local/sana-1.5-1.6b: FAIL
The word 'CHAPTER ONE' typed on aged paper with a vintage typewriter font, complete with slightly uneven ink

Evaluator

The text “Chaper Ne” is not spelled correctly — it should be “CHAPTER ONE.” The word “caper.” is also misspelled and incorrectly placed on the typewriter’s carriage, which should show “CHAPTER ONE” as typed text. The font style resembles a typewriter font, and the paper has a vintage, aged appearance with uneven ink effects. However, the spelling errors significantly undermine the accuracy of the prompt. VERDICT: FAIL

sana-local/sana-1.5-1.6b: PASS
The phrase 'TODAY'S SPECIAL' written in white chalk on a dark green chalkboard with chalk dust

Evaluator

The text “Today’s Special” is rendered in a style that convincingly mimics white chalk on a dark green chalkboard. The lettering shows the characteristic uneven, textured strokes and slight smudges of chalk writing, and there are subtle chalk dust particles visible, especially around the bottom left. The spelling is correct: “Today’s Special” (with the apostrophe correctly placed before the ‘s’). VERDICT: PASS

sana-local/sana-1.5-1.6b: PASS
The word 'OPEN' displayed on a red LED sign with glowing pixel-style letters

Evaluator

The text "OPEN" is spelled correctly. The style convincingly matches an LED sign with glowing pixel-style letters — the letters are composed of a grid of illuminated dots, emit a red glow consistent with LED signage, and cast a realistic reflection on the surface below. The design and lighting effects are appropriate for the medium described. VERDICT: PASS
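
Each complete evaluator response above ends with a `VERDICT: PASS` or `VERDICT: FAIL` marker. A minimal sketch of how a harness might pull that marker out of free-form evaluator text follows; the function name `parse_verdict` and the choice to use the last marker are assumptions for illustration, not ImageBench's actual code.

```python
import re
from typing import Optional

# Matches the "VERDICT: PASS" / "VERDICT: FAIL" marker seen in the evaluator text.
_VERDICT_RE = re.compile(r"VERDICT:\s*(PASS|FAIL)\b")


def parse_verdict(evaluator_text: str) -> Optional[str]:
    """Return "PASS" or "FAIL" from the last VERDICT marker, or None if absent.

    Hypothetical helper: taking the last marker is an assumption, chosen so
    that a verdict restated at the end of a response wins.
    """
    matches = _VERDICT_RE.findall(evaluator_text)
    return matches[-1] if matches else None


print(parse_verdict('The text "OPEN" is spelled correctly. VERDICT: PASS'))
print(parse_verdict("The text is readable, but no"))  # truncated response, no marker
```

A response that is cut off before its marker yields `None`, so the caller has to decide how to record it instead of silently treating it as a pass or a fail.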

Writing accuracy 8% vs 83%
Easy 33% vs 100%
sana-local/sana-1.5-1.6b: FAIL
A single large word 'HELLO' centered on a white background

Evaluator

The image displays the word "Hello" in a stylized, lowercase font, centered on a white background. The intended text was "HELLO" (all uppercase), but the image shows "hello" (all lowercase). While the spelling is correct ("h-e-l-l-o"), the case does not match the intended text. Additionally, the word is not fully uppercase as specified. Since the intended text was "HELLO" and the image shows "hello" (lowercase), this constitutes a deviation from the intended format. The text is readable, but no

sana-local/sana-1.5-1.6b: PASS
The word 'ADVENTURE' in big bold letters centered on a plain cream background

Evaluator

The word displayed in the image is "Adventure", spelled correctly with all letters present: A-D-V-E-N-T-U-R-E. The text is centered on a plain cream background, as intended. All letters are clearly legible and bold, with no missing or distorted characters. VERDICT: PASS

sana-local/sana-1.5-1.6b: FAIL
The word 'SERENITY' written large and centered on a solid light blue background

Evaluator

The intended text is "SERENITY", but the image displays "serienty". - The word is misspelled: the correct spelling has an 'N' as the 6th letter, but the image shows a 'T' instead. - The word is also not fully capitalized as intended — it appears in lowercase. - The letter 'N' is missing, and the letter 'T' is incorrectly placed. The text is fully readable, but it is misspelled. VERDICT: FAIL
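
The three Easy verdicts above do not apply a single letter-case policy: lowercase "hello" fails against 'HELLO', while "Adventure" passes against 'ADVENTURE'. A deterministic comparison makes the policy an explicit switch. The sketch below is only an illustration; `normalize` and `text_matches` are hypothetical names, and it assumes the rendered text has already been transcribed from the image by some other step.

```python
def normalize(text: str) -> str:
    """Map curly apostrophes to straight ones and collapse whitespace runs."""
    return " ".join(text.replace("\u2019", "'").split())


def text_matches(expected: str, rendered: str, case_sensitive: bool = True) -> bool:
    """Compare the intended text with the transcribed text under one case policy."""
    a, b = normalize(expected), normalize(rendered)
    if not case_sensitive:
        a, b = a.casefold(), b.casefold()
    return a == b


# Strict policy: case differences count as errors.
print(text_matches("HELLO", "hello"))                             # False
# Lenient policy: only spelling matters.
print(text_matches("HELLO", "hello", case_sensitive=False))       # True
print(text_matches("SERENITY", "serienty", case_sensitive=False)) # False
```

Under the strict policy both "hello" and "Adventure" would fail; under the lenient one both would pass, and "serienty" fails either way.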

Medium 0% vs 100%
sana-local/sana-1.5-1.6b: FAIL
The sentence 'The quick brown fox jumps high' in clean black text on a white background

Evaluator

Misspellings: "QUICICWN" (should be "QUICK BROWN"), "jumb EX" (should be "jumps high"), "fisl" (should be "fish"). Words are partially or fully illegible due to distortion. Text does not match intended phrase. VERDICT: FAIL

sana-local/sana-1.5-1.6b: FAIL
'Every sunset brings a new dawn ahead' written neatly on a pale yellow background

Evaluator

Text contains multiple errors: “BERT” instead of “SUNSET”, “BURES” instead of “SUNSET”, “SEWI” instead of “SUN”, “DOINT” instead of “DAWN”, “DOON” instead of “DAWN”. Words are misspelled and not fully readable. The intended phrase is not accurately rendered. VERDICT: FAIL

sana-local/sana-1.5-1.6b: FAIL
'Stars shine brightest in the dark night' displayed on a clean off-white background

Evaluator

There is no text visible in the image. The image depicts a starfield against a dark background, with no readable words or lettering present. The intended text — “'Stars shine brightest in the dark night' displayed on a clean off-white background” — does not appear anywhere in the image. Therefore, it is impossible to verify spelling or readability, as the text is not present. VERDICT: FAIL

Hard 0% vs 33%
sana-local/sana-1.5-1.6b: FAIL
A poster with headline 'SUMMER SALE' in large bold text, body text 'Up to 50% off all items this weekend only' in medium size, and fine print 'Terms and conditions apply. See store for details.' at the bottom in small text

Evaluator

The text in the image contains multiple misspellings and illegible parts: - Headline: “Summer Salle” — “Salle” is misspelled (should be “Sale”). - Above headline: “Summme hul ay” — misspelled and unclear. - Body text: “UP% 00% tuis AM 5 OF 50TF & ontern & e eromen” — contains numerous errors: - “UP%” instead of “UP TO” - “00%” instead of “50%” - “tuis AM” instead of “this weekend” - “5 OF” instead of “all items” - “50TF” instead of “50% off” - “& ontern & e eromen” — illegible, like

sana-local/sana-1.5-1.6b: FAIL
A flyer with large title 'GRAND OPENING', subtitle 'Join us for food, music, and fun' in medium text, and small disclaimer 'RSVP required. Limited seating available.' at the bottom

Evaluator

The intended text was: - Large title: "GRAND OPENING" - Subtitle: "Join us for food, music, and fun" - Small disclaimer: "RSVP required. Limited seating available." Actual text in the image: - Large title: "GRAND ONENING" — misspelled ("ONENING" instead of "OPENING") - Subtitle: "JON & OR FUN," — misspelled ("JON" instead of "Join", "OR" instead of "us for", "FUN" instead of "fun") - Small disclaimer: "RUP ROOREUS, LOUSD EF FUN," — misspelled ("RUP ROOREUS" instead of "RSVP required", "LOUSD EF

sana-local/sana-1.5-1.6b: FAIL
An event banner with headline 'ANNUAL GALA 2025' at the top in large serif font, description 'An evening of fine dining and live jazz' in medium text, and footnote 'Black tie optional. Doors open at 6 PM.' in small text

Evaluator

The text in the image contains multiple misspellings and inaccuracies compared to the intended text: 1. Headline: “Antriall Gal.” — Incorrect. Should be “ANNUAL GALA 2025”. - “Antriall” is misspelled (should be “Annual”). - “Gal.” is an abbreviation for “Gala”, but the intended text uses “GALA” fully spelled out. - “2025” is misspelled as “SAL / 25” — “SAL” is not “2025”, and the slash is incorrect. 2. Description: “Fildi naisinn” — Incorrect. Should be “An evening of fine dining and

Extreme 0% vs 100%
sana-local/sana-1.5-1.6b: FAIL
A vintage movie poster for 'THE LAST VOYAGE' featuring a tagline 'One ship. One chance. No turning back.' with credits block reading 'Directed by James Harlow. Starring Elena Voss, Marcus Chen, and Sofia Reyes. Music by Thomas Grant. A Paramount Pict

Evaluator

The text on the poster is heavily distorted and illegible. Here’s a breakdown of the issues: Top line: “THE ELLA&. AARG AN TIIHEGOL. PANO,” — This is not “THE LAST VOYAGE” and contains numerous misspellings and nonsensical fragments. Tagline: “O’KE ONE E. A. AU4” — This is not “One ship. One chance. No turning back.” — it’s garbled and unrecognizable. Ship name: “EAPTEIN SLBEN.” — This is not “Captain Siben” or any coherent name; it’s a clear misspelling. Credits block: - “-OMING VOSS HLEEN”

sana-local/sana-1.5-1.6b: FAIL
A magazine cover titled 'NATIONAL EXPLORER' with cover lines 'The Hidden Temples of Cambodia — page 34', 'Interview: Dr. Maria Santos on Climate Action', 'Top 10 Trails You Must Hike This Year', and a barcode area with 'Issue 247 | March 2025 | $6.99

Evaluator

The intended text was: - Main title: "NATIONAL EXPLORER" - Cover lines: 1. "The Hidden Temples of Cambodia — page 34" 2. "Interview: Dr. Maria Santos on Climate Action" 3. "Top 10 Trails You Must Hike This Year" - Barcode area: "Issue 247 | March 2025 | $6.99" Actual text in the image: - Main title: "NATION BE DELPORER" → misspelled (should be "NATIONAL EXPLORER") - Left column: - "TRP 14" → likely intended as "TOP 10" or "TRP 14" (but not matching) - "THE HILNEEN 12,

sana-local/sana-1.5-1.6b: FAIL
A product label for 'GOLDEN HARVEST ORGANIC HONEY' with subtitle 'Raw & Unfiltered', weight '12 oz (340g)', description 'Pure wildflower honey harvested from sustainable apiaries in the Pacific Northwest. No additives or preservatives.', and small te

Evaluator

The text on the label contains multiple errors: - "GOLDEN" is misspelled as "GOLLDEN" - "HARVEST" is misspelled as "HARMEST" - "HONEY" is misspelled as "ORNEEY" - The subtitle "Raw & Unfiltered" is missing or illegible - The weight "12 oz (340g)" is not readable or is gibberish - The description "Pure wildflower honey harvested from sustainable apiaries in the Pacific Northwest. No additives or preservatives." is not readable or is gibberish - The small text "Best before: 06/2026. Store in cool

Spatial Reasoning 49% vs 97%

Attributes Binding 44% vs 100%
Easy 100% vs 100%
sana-local/sana-1.5-1.6b: PASS
A red ball and a blue cube sitting side by side on a white surface

Evaluator

The image displays a red sphere (ball) on the left and a blue cube on the right. Both objects are positioned side by side on a white surface. The colors and shapes are correctly assigned to their respective objects with no swaps. VERDICT: PASS

sana-local/sana-1.5-1.6b: PASS
A green apple and an orange pumpkin on a wooden cutting board

Evaluator

The image displays a green apple on the left and an orange pumpkin on the right, both resting on a wooden cutting board. The colors and shapes correspond perfectly to the prompt descriptions with no attribute swaps. - **Apple:** Green color, round shape, stem visible. - **Pumpkin:** Orange color, ribbed round shape, stem visible. - **Surface:** Wooden cutting board. VERDICT: PASS

sana-local/sana-1.5-1.6b: PASS
A yellow taxi and a black limousine parked next to each other on a street

Evaluator

The image shows a bright yellow taxi in the foreground and a long black limousine parked directly next to it on a city street. The yellow color is correctly bound to the taxi, and the black color is correctly bound to the limousine. The shapes and sizes correspond to their respective vehicle types. VERDICT: PASS

Medium 33% vs 100%
sana-local/sana-1.5-1.6b: FAIL
A large red circle, a small blue triangle, and a medium green star arranged on a white canvas

Evaluator

The image contains a large red circle (correct), a blue triangle (correct shape, though arguably medium/large rather than small), and a green diamond/rhombus (incorrect shape; the prompt requested a star). The green shape is clearly a four-sided polygon (diamond), not a five-pointed star. VERDICT: FAIL

sana-local/sana-1.5-1.6b: PASS
A tall thin white candle, a short fat red candle, and a medium spiral yellow candle on a shelf

Evaluator

The image contains three distinct candle-like objects with the requested attributes correctly bound: 1. **Tall thin white candle:** Observed on the left. It is tall, thin, white, and has vertical ridges. 2. **Short fat red candle:** Observed on the right (bottom). It is short, fat, and red. 3. **Medium spiral yellow candle:** Observed on the right (top). It is medium height, yellow, and has a spiral shape. There are no attribute swaps (e.g., the red object is not spiral, the yellow object

sana-local/sana-1.5-1.6b: FAIL
A big brown horse, a small white rabbit, and a medium-sized black cat standing in a meadow

Evaluator

The image contains a big brown horse (correct), a small white rabbit (correct), but also a medium-sized white rabbit (extra object not requested). Crucially, the black animal has long rabbit ears, making it a black rabbit, not the requested "medium-sized black cat". The attributes are not correctly bound. VERDICT: FAIL

Hard 0% vs 100%
sana-local/sana-1.5-1.6b: FAIL
A pink fire truck, a blue banana, and a green sun in a surreal landscape

Evaluator

The image contains a pink fire truck, which matches the prompt. However, the prompt requested a "blue banana," but the bananas in the foreground are clearly green. The sun is yellow with a green center, which is close enough to "green sun," but the color swap on the banana is a clear error. VERDICT: FAIL

sana-local/sana-1.5-1.6b: FAIL
A purple carrot, a red tree with red leaves, and a white watermelon sliced open

Evaluator

The image contains a purple carrot, a red tree with red leaves, and a sliced watermelon. - **Purple carrot:** The carrot is purple with green leaves. This matches the prompt. - **Red tree with red leaves:** The tree has red foliage. This matches the prompt. - **White watermelon:** The prompt asks for a "white watermelon". The watermelon in the image has red flesh and a green/white rind. It is a standard red watermelon, not a white one. The color attribute "white" is incorrectly bound to the wa

sana-local/sana-1.5-1.6b: FAIL
An orange elephant, a yellow penguin, and a silver strawberry arranged on a table

Evaluator

The image contains an orange elephant, which matches the prompt. However, the penguin is black and white (not yellow), and the strawberries are red (not silver). The color attributes for the penguin and strawberry are incorrect. VERDICT: FAIL

Compositionality 89% vs 100%
Easy 67% vs 100%
sana-local/sana-1.5-1.6b: PASS
A tabby cat wearing oversized red sunglasses

Evaluator

The image clearly depicts a tabby cat with distinct brown and black stripes. It is wearing a pair of large, bright red sunglasses that fit the description of "oversized." All elements from the prompt are present and combined coherently. * Tabby cat: Present * Oversized red sunglasses: Present VERDICT: PASS

sana-local/sana-1.5-1.6b: FAIL
A golden retriever wearing a tiny cowboy hat

Evaluator

The image contains a golden retriever wearing a cowboy hat. However, the hat is not "tiny"; it appears to be a standard or even large-sized hat relative to the dog's head, dominating the composition rather than being a small accessory. * **Golden retriever:** Present * **Wearing:** Present * **Cowboy hat:** Present * **Tiny:** Not present (The hat is large/standard size) VERDICT: FAIL

sana-local/sana-1.5-1.6b: PASS
A penguin holding a colorful umbrella

Evaluator

- **Penguin:** Present. A cartoon-style penguin is the central figure. - **Holding:** Present. The penguin is gripping the handle of the umbrella with its flipper. - **Colorful umbrella:** Present. The umbrella features a rainbow spectrum of colors (purple, blue, green, yellow, orange, red). All described elements are clearly visible and logically combined in the illustration. VERDICT: PASS

Medium 100% vs 100%
sana-local/sana-1.5-1.6b: PASS
A medieval knight in full armor sitting at a desk typing on a modern laptop

Evaluator

- **Medieval knight:** Present. The figure is clearly a knight. - **Full armor:** Present. The figure wears a helmet, breastplate, gauntlets, and chainmail. - **Sitting at a desk:** Present. The figure is seated at a table/desk surface. - **Typing on a modern laptop:** Present. The knight's gauntleted hands are on the keyboard of a laptop. All elements are present and coherently combined. VERDICT: PASS

sana-local/sana-1.5-1.6b: PASS
A Roman gladiator taking a selfie with a smartphone in the Colosseum

Evaluator

- **Roman gladiator:** Present (wearing helmet, armor, and red cape). - **Taking a selfie:** Present (holding phone up, looking at screen, smiling). - **Smartphone:** Present (modern device held in hand). - **Colosseum:** Present (background architecture matches the famous amphitheater). All elements are clearly visible and combined to match the prompt's specific, anachronistic request. VERDICT: PASS

sana-local/sana-1.5-1.6b: PASS
An astronaut in a spacesuit riding a bicycle through a park

Evaluator

- **Astronaut in a spacesuit:** Present. The central figure is wearing a full white spacesuit with a helmet and backpack. - **Riding a bicycle:** Present. The astronaut is straddling a bicycle and holding the handlebars in a riding position. - **Through a park:** Present. The background features trees with autumn foliage, a paved path, and grass, clearly depicting a park setting. All described elements are present and coherently combined. VERDICT: PASS

Hard 100% vs 100%
sana-local/sana-1.5-1.6b: PASS
An octopus wearing a top hat and monocle playing chess on the surface of the moon with Earth visible in the background

Evaluator

- **Octopus:** Present (Central figure). - **Top hat:** Present (Black top hat on head). - **Monocle:** Present (Interpreted as round steampunk glasses/goggles, though technically two lenses). - **Chess:** Present (Chessboard and pieces in foreground). - **Moon surface:** Present (Grey, cratered ground). - **Earth:** Present (Visible in the upper right sky). The image successfully combines all the surreal elements requested. While the eyewear is technically a pair of round glasses rather than

sana-local/sana-1.5-1.6b: PASS
A dinosaur in a business suit giving a PowerPoint presentation in a modern office while drinking coffee

Evaluator

The image successfully combines all the requested elements. * **Dinosaur:** Present (A T-Rex head and body). * **Business suit:** Present (Dark suit jacket, white shirt, and tie). * **PowerPoint presentation:** Present (A screen in the background displaying charts and graphs). * **Modern office:** Present (Office lighting, furniture, and layout). * **Drinking coffee:** Present (The dinosaur is holding two coffee cups). VERDICT: PASS

sana-local/sana-1.5-1.6b: PASS
A robot dressed as a pirate sailing a paper boat on a lake of lava while juggling three kittens

Evaluator

The image contains all the requested elements: 1. **Robot:** Present (central figure). 2. **Dressed as a pirate:** Present (wearing a pirate hat with skull and crossbones). 3. **Paper boat:** Present (the robot is sitting in a folded paper boat). 4. **Lake of lava:** Present (the water is molten orange/red with a fiery background). 5. **Juggling three kittens:** Present (there are three kittens total; one is sitting in the boat, and two are mid-air, with the robot's pose suggesting a jugg

Counting 11% vs 89%
Easy 0% vs 100%
sana-local/sana-1.5-1.6b: FAIL
Exactly 3 red apples arranged on a wooden table

Evaluator

I see 4 red apples arranged on the wooden table. The prompt requested exactly 3 apples. Since the actual count is 4, the image does not match the specific quantity requested. VERDICT: FAIL

sana-local/sana-1.5-1.6b: FAIL
Exactly 5 yellow rubber ducks sitting in a row on a bathroom shelf

Evaluator

The image shows a row of 6 yellow rubber ducks on the lower shelf, plus one blurry duck on the shelf above. The prompt requested exactly 5 ducks in a row. Since there are 6 ducks in the row, the count is incorrect. VERDICT: FAIL

sana-local/sana-1.5-1.6b: FAIL
Exactly 4 white candles standing on a dark surface

Evaluator

I see 3 white candles standing on a dark surface. The prompt requested exactly 4 white candles. Since the count in the image (3) does not match the requested count (4), the criteria are not met. VERDICT: FAIL

Medium 33% vs 100%
sana-local/sana-1.5-1.6b: FAIL
2 orange cats and 3 black dogs sitting together on a grassy lawn

Evaluator

The image shows 1 orange cat, 1 black cat, and 1 black dog. The prompt requested 2 orange cats and 3 black dogs. The counts are significantly different from the request. VERDICT: FAIL

sana-local/sana-1.5-1.6b: FAIL
4 red roses and 2 white daisies in a glass vase on a table

Evaluator

The image contains 5 red roses (top left, top center, top right/back, bottom left, bottom center) and 2 white daisies. The prompt requested exactly 4 red roses. Since the count of roses is incorrect (5 instead of 4), the image does not match the specific count requested. VERDICT: FAIL

sana-local/sana-1.5-1.6b: PASS
1 blue butterfly and 3 yellow butterflies resting on a garden fence

Evaluator

The image contains exactly **1 blue butterfly** (center) and **3 yellow butterflies** (one left, one top center, one right). All are positioned around or on a blue wooden garden fence. The counts match the prompt's request perfectly. VERDICT: PASS

Hard 0% vs 67%
sana-local/sana-1.5-1.6b: ERROR
Exactly 9 goldfish swimming in a large round fishbowl

sana-local/sana-1.5-1.6b: FAIL
A fruit bowl containing exactly 3 bananas, 4 oranges, and 2 apples

Evaluator

The image contains 3 bananas, 4 oranges (including one cut in half), and 3 apples (one red/green at the top, one green at the bottom, and one red at the bottom right). The prompt requested exactly 2 apples, but there are 3. VERDICT: FAIL
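
The counting verdicts above come down to comparing requested and observed quantities per object type. A minimal sketch of that comparison, using the fruit bowl numbers reported by the evaluator; the function `count_mismatches` and the dictionary layout are assumptions for illustration, and the observed counts would have to come from the evaluator or a detector.

```python
def count_mismatches(
    requested: dict[str, int], observed: dict[str, int]
) -> dict[str, tuple[int, int]]:
    """Map each object whose count differs to (requested, observed).

    Objects missing from either side are treated as a count of 0.
    """
    names = sorted(set(requested) | set(observed))
    return {
        name: (requested.get(name, 0), observed.get(name, 0))
        for name in names
        if requested.get(name, 0) != observed.get(name, 0)
    }


requested = {"banana": 3, "orange": 4, "apple": 2}
observed = {"banana": 3, "orange": 4, "apple": 3}  # counts as reported by the evaluator
print(count_mismatches(requested, observed))  # {'apple': (2, 3)}
```

An empty result corresponds to a PASS on the counting criterion; any entry identifies which object type broke the "exactly N" constraint.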

sana-local/sana-1.5-1.6b: FAIL
Exactly 11 colorful balloons floating against a clear blue sky

Evaluator

I count approximately 16 balloons in the image (yellow, red, pink, green, orange, blue, purple varieties). The prompt requested exactly 11 balloons. The count does not match. VERDICT: FAIL
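
The Counting figures for sana-local/sana-1.5-1.6b are consistent with the ERROR run staying in the denominator as a non-pass: 1 PASS out of 9 runs gives 11%, whereas dropping the ERROR would give 1 of 8, or 12.5%. This is an inference from the numbers shown, not a documented ImageBench rule. A sketch of the arithmetic, with `pass_rate` as a hypothetical helper:

```python
def pass_rate(verdicts: list[str]) -> int:
    """Percentage of PASS verdicts, rounded to the nearest integer.

    Assumption: ERROR entries stay in the denominator and count as non-passes.
    """
    if not verdicts:
        raise ValueError("no verdicts to aggregate")
    passes = sum(1 for v in verdicts if v == "PASS")
    return round(100 * passes / len(verdicts))


# Counting verdicts for sana-local/sana-1.5-1.6b, in page order.
easy = ["FAIL", "FAIL", "FAIL"]
medium = ["FAIL", "FAIL", "PASS"]
hard = ["ERROR", "FAIL", "FAIL"]

print(pass_rate(easy), pass_rate(medium), pass_rate(hard))  # 0 33 0
print(pass_rate(easy + medium + hard))                       # 11
```

The same arithmetic matches the other sana subcategory figures listed in full on the page, for example 4 of 9 for Attributes Binding (44%) and 2 of 9 for Negation (22%).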

Negation 22% vs 89%
Easy 67% vs 100%
sana-local/sana-1.5-1.6b: PASS
A tropical beach scene with no people

Evaluator

The image depicts a vibrant, illustrated tropical beach scene featuring palm trees, lush green vegetation, a sandy shore, and blue ocean waters under a cloudy sky. I have carefully scanned the entire image, including the foreground, background, and distant shoreline, and there are absolutely no human figures or people visible anywhere. The exclusion criteria is fully met. VERDICT: PASS

sana-local/sana-1.5-1.6b: FAIL
A cat lying on a couch without a collar

Evaluator

The image shows a tabby cat sleeping on a couch. Upon close inspection of the neck area, there is a visible silver ring or clasp attached, which indicates the presence of a collar. Since the prompt explicitly required the cat to be "without a collar," the excluded element is present. VERDICT: FAIL

sana-local/sana-1.5-1.6b: PASS
A classroom with desks and a chalkboard but no students

Evaluator

The image depicts an empty classroom featuring rows of wooden desks, a large green chalkboard at the front, windows on the left, and ceiling lights. The room is completely devoid of people. There are no students visible anywhere in the scene, confirming the exclusion criteria is met. VERDICT: PASS

Medium 0% vs 100%
sana-local/sana-1.5-1.6b: FAIL
A sandy beach with palm trees, seashells, and a beach umbrella but no water or ocean visible

Evaluator

The image clearly depicts a large body of blue water with waves stretching across the middle ground, representing an ocean or sea. This directly contradicts the prompt's explicit exclusion of "no water or ocean visible." While the beach, palm trees, shells, and umbrella are present, the presence of the ocean violates the negative constraint. VERDICT: FAIL

sana-local/sana-1.5-1.6b: FAIL
An office workspace with a desk, chair, bookshelf, and lamp but no computers or screens

Evaluator

The image depicts a modern office workspace featuring a desk, an ergonomic chair, and large built-in bookshelves filled with binders. There are lamps providing lighting. However, directly in the center of the desk, there is a prominent computer monitor (a screen with a black display area and white bezel), which violates the exclusion criteria. VERDICT: FAIL

sana-local/sana-1.5-1.6b: FAIL
A kitchen with pots, pans, a stove, and a cutting board but no food

Evaluator

The image contains several instances of food, violating the "no food" constraint. Specifically, there are sliced vegetables (likely peppers or tomatoes) and a whole vegetable on the cutting board in the foreground. Additionally, there is a bowl of green leafy vegetables on the counter to the right and a potted plant with greens on the shelf to the left. VERDICT: FAIL

Hard 0% vs 67%
sana-local/sana-1.5-1.6b: FAIL
A pepperoni pizza with tomato sauce, basil, and crust but absolutely no cheese

Evaluator

The image shows a pepperoni pizza with a crust, tomato sauce, basil leaves, and pepperoni slices. However, there is clearly a layer of melted white/yellowish cheese visible underneath the toppings and between the pepperoni slices. Since the prompt explicitly excluded cheese ("absolutely no cheese"), the presence of cheese means the constraint was not met. VERDICT: FAIL

sana-local/sana-1.5-1.6b: FAIL
A car body with windows, doors, and headlights but no wheels

Evaluator

The image displays a black car illustration featuring windows, doors, and glowing yellow headlights. However, contrary to the exclusion criteria ("no wheels"), the image clearly depicts wheels on both the front and rear axles with visible rims. The excluded element is present. VERDICT: FAIL

sana-local/sana-1.5-1.6b: FAIL
A human face with eyes, nose, and mouth but no eyebrows

Evaluator

The image clearly shows a human face with eyes, nose, and mouth. However, contrary to the exclusion criteria ("no eyebrows"), distinct, dark eyebrows are clearly visible above the eyes. Since the excluded element is present, the image does not meet the specific constraint. VERDICT: FAIL

Relative Position 75% vs 100%
Easy 100% vs 100%
sana-local/sana-1.5-1.6b: PASS
A cat sitting on top of a cardboard box

Evaluator

The image shows a tabby cat resting its body on the upper rim of a cardboard box. The cat is positioned above the box, with its paws hanging over the side, effectively sitting on top of it. The spatial relationship described in the prompt ("on top of") accurately reflects the visual evidence. VERDICT: PASS

sana-local/sana-1.5-1.6b: PASS
A red ball underneath a wooden chair

Evaluator

The image displays a wooden chair with its legs extending down to the floor. A bright red ball is resting on the carpet directly beneath the seat of the chair, positioned between the front legs. This arrangement perfectly matches the prompt's description of the ball being "underneath" the chair. VERDICT: PASS

sana-local/sana-1.5-1.6b: PASS
A coffee mug next to an open book on a desk

Evaluator

The image shows a large open book in the foreground resting on a wooden desk. Behind it and slightly to the right, there is a white coffee mug filled with coffee, also resting on the desk. The spatial relationship "next to" is appropriate as the objects are adjacent on the surface, and "on a desk" is correct for both items. VERDICT: PASS

Medium 100% vs 100%
sana-local/sana-1.5-1.6b: PASS
A bird perched on a branch, the branch extending from a tree next to a park bench

Evaluator

The image shows a bird perched on a branch that extends from a large tree trunk on the left side. In the background, slightly out of focus, there is a park bench situated next to the tree. The spatial arrangement matches the prompt: the bird is on the branch, the branch comes from the tree, and the tree is next to the bench. VERDICT: PASS

sana-local/sana-1.5-1.6b: PASS
A cat sitting on a pillow, the pillow placed on a wooden chair

Evaluator

The image shows a tabby cat resting directly on top of a light-colored pillow. This pillow is situated on the seat of a wooden chair. The spatial hierarchy is clear: the cat is on the pillow, and the pillow is on the chair, matching the prompt's description perfectly. VERDICT: PASS

sana-local/sana-1.5-1.6b: PASS
A candle on top of a stack of books, the books next to a potted plant

Evaluator

The image shows a stack of books with a lit candle resting directly on top of the uppermost book. To the left of the book stack, there is a potted plant with green leaves. The spatial arrangement perfectly matches the description: the candle is on the books, and the books are next to the plant. VERDICT: PASS

Hard 0% vs 100%
sana-local/sana-1.5-1.6b: FAIL
Five chess pieces on a board: a king in the center, a queen to its left, a bishop to its right, a rook above the king, and a knight below the king

Evaluator

The image shows a central King flanked by two Knights (one on the left, one on the right). Behind the King are two Pawns. The prompt describes a Queen to the left, a Bishop to the right, a Rook above, and a Knight below. None of these specific positional relationships or piece identifications match the visual evidence. The pieces are Knights, not a Queen or Bishop, and there are no Rooks or pieces below the King. VERDICT: FAIL

sana-local/sana-1.5-1.6b: FAIL
A desk scene: laptop in the center, coffee mug to the left, phone to the right, lamp above, and notebook below

Evaluator

The image shows a laptop in the center and a coffee mug to the left, which matches the prompt. However, the other relationships are incorrect. The phone is actually on the left side (resting on a notebook), not to the right. The object to the right appears to be a mouse and a closed laptop or notebook. Additionally, the items "below" the laptop are pens, not a notebook. VERDICT: FAIL

sana-local/sana-1.5-1.6b: FAIL
Five animals in a field: a horse on the far left, a cow to the right of the horse, a sheep in the center, a goat to the right of the sheep, and a dog on the far right

Evaluator

The image shows a horse on the left, but the other animals do not match the prompt. Instead of a cow, there is another sheep behind the horse. In the center is a sheep. To the right is a goat. There is a small lamb sitting down, but no dog is visible on the far right. The specific animals (cow, dog) requested are missing or misidentified. VERDICT: FAIL

Extreme 100% vs 100%
sana-local/sana-1.5-1.6b: PASS
A landscape with a red barn in the foreground at large scale, a windmill in the midground at medium scale, and snow-capped mountains in the background at small scale, all with consistent depth perspective

Evaluator

The image displays a clear depth hierarchy consistent with the prompt. The red barn is positioned in the foreground on the left, appearing largest. The windmill is situated in the midground on the right, appearing smaller than the barn but larger than background elements. The snow-capped mountains occupy the background, appearing furthest away. The spatial arrangement of foreground (barn), midground (windmill), and background (mountains) is accurate and maintains consistent perspective. VERDI

sana-local/sana-1.5-1.6b: PASS
A beach scene with a sandcastle in the foreground, a surfer on the waves in the midground, and a cruise ship on the horizon in the background, sizes decreasing with distance

Evaluator

The image displays a clear beach scene with distinct layers of depth. In the immediate foreground, a large sandcastle sits on the sand. In the midground, a surfer stands in the breaking waves. In the background, a large cruise ship sails on the horizon, along with another vessel to the left. The objects follow the rule of sizes decreasing with distance, with the sandcastle being the largest visual element and the ships appearing smaller due to perspective. All described positional relationship

sana-local/sana-1.5-1.6b: PASS
A city park with a park bench and pigeons in the foreground, a fountain in the midground, and skyscrapers in the background, each layer progressively smaller

Evaluator

The image accurately depicts the described spatial arrangement. In the immediate foreground, a wooden park bench sits on a paved area with numerous pigeons perched on it and walking on the ground. Directly behind the bench, in the midground, is a large, active fountain surrounded by green grass and trees. In the background, a dense cluster of skyscrapers rises against the sky, clearly positioned behind the park elements. The layering from foreground (bench/pigeons) to midground (fountain) to b
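
Each evaluator note on this page closes with a `VERDICT: PASS` or `VERDICT: FAIL` marker, and some notes are cut off before reaching it. As a rough sketch (the regex and the function name `extract_verdict` are illustrative assumptions, not the benchmark's actual parser), a verdict could be read from a note like this, with truncated notes reported as undetermined:

```python
import re
from typing import Optional

# Hypothetical pattern: the verdict marker is expected at the very end of a note.
VERDICT_RE = re.compile(r"VERDICT:\s*(PASS|FAIL)\s*$")

def extract_verdict(note: str) -> Optional[str]:
    """Return "PASS" or "FAIL" if the note ends with a verdict marker, else None."""
    match = VERDICT_RE.search(note.strip())
    return match.group(1) if match else None

# Endings copied from two notes above: one complete, one cut off mid-marker.
complete_note = "the candle is on the books, and the books are next to the plant. VERDICT: PASS"
truncated_note = "is accurate and maintains consistent perspective. VERDI"

complete_result = extract_verdict(complete_note)
truncated_result = extract_verdict(truncated_note)
```

Returning `None` for a truncated note keeps a missing marker distinct from a real FAIL. For those cards the PASS or FAIL label on the card line is the only verdict available.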

Scale & Proportions 44% vs 100%
Easy 67% vs 100%
sana-local/sana-1.5-1.6b: PASS
A person standing next to a yellow school bus, showing realistic relative scale

Evaluator

The relative scale between the person and the school bus appears realistic. The person's height is approximately half the height of the bus, which is consistent with real-world proportions (an average adult vs. a ~10-12 foot tall bus). The door height and wheel size also align correctly with the person's stature. VERDICT: PASS

sana-local/sana-1.5-1.6b: PASS
A ladybug sitting on a large green leaf, shown at realistic proportions

Evaluator

The image depicts a ladybug resting on a large, broad green leaf. The scale of the ladybug relative to the leaf is accurate; the insect appears small and delicate against the substantial surface area of the leaf, consistent with real-world biology. The leaf itself has the characteristics of a large tropical leaf (like a Monstera), further supporting the "large green leaf" description. The proportions are realistic. VERDICT: PASS

sana-local/sana-1.5-1.6b: FAIL
A child holding a basketball, both at correct relative sizes

Evaluator

The image shows a stylized child holding a basketball. However, the relative sizes are incorrect. The basketball is disproportionately large, appearing wider than the child's shoulders and torso. In reality, a basketball (approx. 9.5 inches diameter) should be slightly narrower than a toddler's shoulder width, but here it dominates the child's upper body, indicating a scaling error. VERDICT: FAIL

Medium 67% vs 100%
sana-local/sana-1.5-1.6b: FAIL
A tiny person standing on the rim of a coffee cup, looking down into it

Evaluator

The image shows a tiny person standing on the surface of the coffee (the latte art) inside the cup, rather than standing on the rim as described in the prompt. While the scale of the tiny person relative to the cup is consistent with the concept, the specific spatial relationship ("on the rim") is incorrect. VERDICT: FAIL

sana-local/sana-1.5-1.6b: PASS
A giant tabby cat walking between city skyscrapers like a kaiju

Evaluator

The image depicts a massive tabby cat towering over skyscrapers, consistent with the "kaiju" description. The cat's head reaches near the top of the frame, dwarfing the surrounding buildings, which establishes the correct giant scale relative to the city environment. The proportions are consistent with the prompt's request for a giant cat. VERDICT: PASS

sana-local/sana-1.5-1.6b: PASS
A miniature house sitting on the palm of someone's hand

Evaluator

The image shows a highly detailed, two-story miniature house resting comfortably on an open human palm. The scale is consistent with the description; the house is small enough to be held in one hand, with the fingers providing a clear reference for its miniature size. The accompanying palm tree and landscaping details are also proportionally correct relative to the house and the hand. VERDICT: PASS

Hard 0% vs 100%
sana-local/sana-1.5-1.6b: FAIL
A group photo of an elephant, a horse, a dog, a cat, and a mouse, all standing in a line with correct real-world proportional sizes

Evaluator

The image displays a lineup of animals, but the relative sizes are incorrect compared to real-world proportions. Specifically, the horse is depicted as taller than the elephant, whereas in reality, an elephant is significantly larger and taller at the shoulder. Additionally, the cat on the left appears almost as tall as the dog, which is disproportionate. The scale relationships are inconsistent with reality. VERDICT: FAIL

sana-local/sana-1.5-1.6b: FAIL
A dinner table scene with a plate, fork, knife, wine glass, salt shaker, and peppercorn, all at realistic relative sizes

Evaluator

The image displays a dinner setting, but the relative sizes of the objects are not realistic. The wine glasses are disproportionately large and tall compared to the dinner plate and cutlery, with unnaturally elongated stems. The salt shaker is also quite large relative to the other items. Additionally, a specific "peppercorn" is not clearly visible as a distinct object. The scale inconsistencies violate the prompt's requirement for realistic relative sizes. VERDICT: FAIL

sana-local/sana-1.5-1.6b: FAIL
A farm scene with a barn, tractor, farmer, cow, chicken, and a fence post, all at correct proportional scale

Evaluator

The image fails to meet the criteria for correct proportional scale. The tractor in the foreground is disproportionately tiny, appearing smaller than the chickens and fence posts, whereas a real tractor would be significantly larger. Additionally, the prompt specifies a "farmer" and a "cow," neither of which are present in the scene. The scale relationships are inconsistent with reality. VERDICT: FAIL
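
The tier and subcategory percentages appear to be plain pass fractions over the cards listed. A minimal Python sketch, assuming three prompts per tier and equal weighting per prompt (the page does not state its aggregation rule), reproduces the Scale & Proportions figures for sana-local/sana-1.5-1.6b from the verdicts above. The names `tiers`, `pass_rate`, and `as_percent` are illustrative only.

```python
from fractions import Fraction

# Verdicts transcribed from the Scale & Proportions cards above (True = PASS).
tiers = {
    "Easy": [True, True, False],
    "Medium": [False, True, True],
    "Hard": [False, False, False],
}

def pass_rate(verdicts):
    """Fraction of passing verdicts in a list of booleans."""
    return Fraction(sum(verdicts), len(verdicts))

def as_percent(rate):
    """Format a fraction as a whole-number percentage string."""
    return f"{round(float(rate) * 100)}%"

# Per-tier rates: 2/3, 2/3, and 0/3.
per_tier = {name: as_percent(pass_rate(v)) for name, v in tiers.items()}

# Subcategory rate pools every prompt: 4 passes out of 9.
all_verdicts = [v for verdicts in tiers.values() for v in verdicts]
overall = as_percent(pass_rate(all_verdicts))
```

Under these assumptions `per_tier` gives 67%, 67%, and 0%, and `overall` gives 44%, matching the headers shown for this subcategory.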

Human realism 67% vs 98%

Faces & Expressions 58% vs 100%
Easy 100% vs 100%
sana-local/sana-1.5-1.6b: PASS
A close-up portrait of a young woman with a warm natural smile

Evaluator

The face in the image is largely symmetrical with correctly proportioned features: the eyes, eyebrows, nose, and mouth are well-aligned and natural-looking. The smile is wide and warm, clearly conveying joy and approachability, matching the prompt’s request for a “warm natural smile.” There are no obvious distortions—no unnatural asymmetry, misshapen features, or AI artifacts that would detract from realism. The lighting and skin texture are consistent with a high-quality portrait. VERDICT: PAS

sana-local/sana-1.5-1.6b: PASS
A portrait of a middle-aged man with a calm, serious expression

Evaluator

The face in the image is largely symmetric and features are anatomically correct: eyes, nose, mouth, and ears are proportionally aligned and natural. The man appears middle-aged, with visible wrinkles and gray hair consistent with that age group. His expression is calm and serious — the lips are closed, brows are neutral, and gaze is steady — making the intended emotion clearly readable. There are no significant distortions; the lighting and texture look natural, and no facial elements appear wa

sana-local/sana-1.5-1.6b: PASS
A headshot of a teenager with a relaxed, neutral face

Evaluator

The face in the image is largely symmetric with correctly proportioned features: the eyes, nose, and mouth are well-aligned, and the ears are symmetrically placed. The eyebrows are well-defined and balanced, and the skin texture, including freckles and rosy cheeks, appears natural. The expression is a relaxed, neutral smile — not overtly emotional — which matches the prompt’s request for a “relaxed, neutral face.” There are no obvious distortions or anatomical anomalies; the facial structure is

Medium 33% vs 100%
sana-local/sana-1.5-1.6b: FAIL
A close-up of a person's face showing clear disgust, nose wrinkled and lip curled

Evaluator

The face is not symmetric — the left side (from viewer’s perspective) shows more pronounced wrinkling and distortion than the right. The nose is visibly distorted, with asymmetrical creasing and a slightly misshapen tip. The lips are curled downward, but the expression is more aggressive or angry than disgust — the brow is furrowed, eyes are narrowed, and the overall expression leans toward anger or contempt rather than clear disgust. The nose is wrinkled, but the lip curl is not as pronounced o

sana-local/sana-1.5-1.6b: PASS
A portrait capturing genuine surprise — raised eyebrows, wide eyes, slightly open mouth

Evaluator

The face is largely symmetric with correctly rendered features: the eyes, eyebrows, nose, and mouth are well-proportioned and centered. The intended emotion — genuine surprise — is clearly readable through raised eyebrows, wide eyes, and a slightly open mouth. There are minor distortions: the eyes have an unnatural, overly bright and stylized appearance (one eye appears slightly more saturated than the other), and the skin texture is unnaturally smooth and plastic-like, lacking fine pores or nat

sana-local/sana-1.5-1.6b: FAIL
A face showing quiet melancholy — downcast eyes, slightly drawn mouth, distant gaze

Evaluator

The face is not symmetric and exhibits significant distortions. The left eye (viewer’s left) is heavily shadowed and appears to be partially closed or squinting, while the right eye is more open but both are rendered with heavy, unnatural shading and texture. The eyebrows are asymmetrical, with the left brow appearing more arched and the right brow more furrowed. The nose is slightly misshapen and the lips are unevenly textured and slightly pulled downward on the left side. The skin texture is o

Hard 67% vs 100%
sana-local/sana-1.5-1.6b: PASS
An elderly woman in her 80s with deep wrinkles and white hair, smiling warmly with kind eyes

Evaluator

The face is largely symmetric with correct anatomical features — the eyes, nose, and mouth are centered and proportionate. The deep wrinkles around the eyes and mouth are rendered naturally and are consistent with an elderly woman in her 80s. The white, voluminous hair is detailed and realistic. The emotion is clearly readable: the warm, genuine smile, crinkled eyes, and relaxed facial muscles convey kindness and joy. There are no significant distortions; the facial structure is coherent, and th

sana-local/sana-1.5-1.6b: PASS
A 5-year-old child with round cheeks and large eyes, crying with tears streaming down their face

Evaluator

The face in the image is largely symmetric, with balanced placement of eyes, nose, and mouth. The child has large, expressive eyes and round cheeks, consistent with the prompt. Tears are clearly streaming down both cheeks, and the mouth is open in a crying expression, making the emotion unmistakably readable as distress or sadness. There are no major distortions — the facial structure is anatomically plausible, though the tears are rendered with a slightly stylized, glossy effect. The overall co

sana-local/sana-1.5-1.6b: FAIL
An elderly man in his 70s with age spots and thinning gray hair, looking skeptically over reading glasses

Evaluator

The face is highly asymmetrical — the left side (viewer’s right) of the face is noticeably more pronounced in wrinkles and sagging, and the right eye (viewer’s left) appears slightly larger and more sunken. The nose is also asymmetrically angled. The skin texture is overly exaggerated with unnatural, almost granular “age spots” and deep, inconsistent wrinkles that look digitally rendered rather than organic. The eyebrows are uneven, and the mouth is pulled downward asymmetrically, contributing t

Extreme 33% vs 100%
sana-local/sana-1.5-1.6b: FAIL
A group of three friends at a table: one laughing with head tilted back, one looking shocked with wide eyes, and one rolling their eyes with a smirk

Evaluator

The image shows three people at a table, but all three faces express the same shocked or surprised emotion — wide eyes, open mouths, and hands on cheeks (for the woman in the center). None of them show the distinct emotions requested in the prompt: one laughing with head tilted back, one looking shocked (which is partially present), and one rolling their eyes with a smirk. Instead, all three appear to be reacting with identical shock or disbelief. Additionally, while the faces are mostly symmet

sana-local/sana-1.5-1.6b: PASS
A family photo with four people: a father beaming with pride, a mother smiling gently, a teenager looking bored, and a child grinning excitedly

Evaluator

The faces in the image are mostly symmetrical with correct anatomical features — eyes, noses, mouths, and ears are properly aligned and proportioned. There are no major distortions or unnatural deformities. Emotionally: - The father on the left is beaming with pride — his wide smile and crinkled eyes convey genuine joy. - The mother in the center is smiling gently — her soft expression and slight smile match the prompt. - The teenager (on the right) is grinning excitedly — her bright, open smil

sana-local/sana-1.5-1.6b: FAIL
Three coworkers in a meeting: one looking frustrated with furrowed brows, one nodding with a pleased expression, and one looking confused with a tilted head

Evaluator

The image shows three individuals seated at a table, each displaying distinct facial expressions. - The person on the left (woman) has furrowed brows and wide, startled eyes, conveying confusion or concern — not the “pleased expression” requested. - The person in the center has a highly distorted, exaggerated expression with bulging eyes and a grimace — this reads as extreme frustration or panic, not a neutral or clear emotion. - The person on the right (man) has a furrowed brow and a downturne

Full Body 58% vs 100%
Easy 100% vs 100%
sana-local/sana-1.5-1.6b: PASS
A woman standing upright in a relaxed pose, arms at her sides, wearing jeans and a white t-shirt

Evaluator

The image shows a full-body view of a woman standing upright with her arms at her sides, wearing a white t-shirt and blue jeans — matching the prompt exactly. - Limb count: Correct — two arms, two legs, both present and fully visible. - Proportions: Anatomically correct — head, torso, limbs, and feet are proportionally balanced and natural. - Pose: The woman is standing upright in a relaxed pose with arms at her sides — fully recognizable and matches the prompt. - Distortions: None — no visible

sana-local/sana-1.5-1.6b: PASS
A man standing straight facing the camera, hands in pockets, normal proportions

Evaluator

The image shows a full-body view of a man standing straight, facing the camera, with both hands in his pockets. His posture is upright and symmetrical, and his limbs (two arms, two legs) are clearly visible and correctly positioned. The proportions appear normal: head-to-torso, torso-to-legs ratios are consistent with human anatomy, and no obvious limb duplication or missing limbs are present. The pose is clearly recognizable as “standing with hands in pockets,” which matches the prompt exactly.

sana-local/sana-1.5-1.6b: PASS
A person standing in a neutral pose in a park, full body visible from head to feet

Evaluator

The image shows a full-body view of a person standing in a park, viewed from behind. The person is in a neutral, upright stance with arms relaxed at their sides, feet parallel and shoulder-width apart — consistent with a “neutral pose” as requested. The limbs are present and appear to be in correct anatomical proportion: two arms, two legs, head, torso, and feet are all visible and structurally plausible. There are no obvious distortions in limb count or proportions — no extra limbs, missing lim

Medium 67% vs 100%
sana-local/sana-1.5-1.6b: PASS
A woman mid-stride running through a park, hair and clothes showing motion

Evaluator

The image depicts a woman in mid-stride running through a park, with her hair and clothing showing motion blur consistent with the action. Her limbs are present and correctly proportioned: two arms, two legs, with natural joint articulation. The pose is anatomically plausible for running — one leg forward, the other back, arms bent and swinging, torso slightly forward. The motion blur enhances the sense of speed and movement, and the hair is dynamically flung backward, matching the running motio

sana-local/sana-1.5-1.6b: FAIL
A man captured mid-jump over a puddle, body in the air with bent knees

Evaluator

The image shows a man in mid-air, seemingly jumping over a puddle. His body is angled forward, knees bent, and arms slightly extended — consistent with the described pose. Limb count is correct: two arms, two legs, and the torso are all present. Proportions appear anatomically plausible — head, torso, and limbs are in reasonable scale relative to each other. The pose is recognizable as a jump, and the reflection in the puddle confirms he is airborne and has just disturbed the water. However, su

sana-local/sana-1.5-1.6b: PASS
A dancer in a dynamic spin, dress flowing outward with one leg extended

Evaluator

The image depicts a dancer in mid-spin, with a flowing orange dress and hair caught in motion. The pose is dynamic and expressive, conveying rotation and movement. Limb count and proportions: The dancer has two arms, two legs, and a torso — all anatomically present. Proportions appear natural and balanced, with no obvious limb duplication or severe distortion. The arms are extended outward, and one leg is slightly lifted, suggesting a spin — though not explicitly “one leg extended” as in a full

Hard 33% vs 100%
sana-local/sana-1.5-1.6b: FAIL
A person in a yoga tree pose — standing on one leg with the other foot pressed to the inner thigh, arms raised overhead

Evaluator

The image shows a person in a yoga pose viewed from behind. The pose is intended to be the “tree pose” (Vrksasana), which requires standing on one leg with the other foot pressed against the inner thigh, arms raised overhead in prayer position. In this image: - The person is standing on one leg, which is correct. - The other foot is not pressed against the inner thigh — instead, it appears to be resting on the back of the thigh or hip area, which is anatomically incorrect for the standard tree

sana-local/sana-1.5-1.6b: PASS
A rock climber clinging to a cliff face with arms and legs extended to different holds

Evaluator

The climber in the image has two arms and two legs — limb count is correct. Proportions appear anatomically plausible: the torso, limbs, and head are in proportion relative to each other, with no obvious exaggerations or missing limbs. The pose is recognizable as a dynamic rock climbing stance — one arm extended upward gripping a hold, the other arm likely supporting (though partially obscured), legs positioned for balance and leverage, feet on footholds. The body is tensed appropriately for exe

sana-local/sana-1.5-1.6b: FAIL
A cellist seated with correct posture, legs around the instrument, bow arm extended

Evaluator

The image depicts a cellist seated with the cello between their legs, bow arm extended horizontally across the strings — a standard playing posture. The figure has two arms, two legs, and a torso consistent with human anatomy. Proportions appear generally correct, though the image has a stylized, painterly quality that slightly exaggerates musculature and limb definition for dramatic effect. However, the pose is not anatomically accurate for playing the cello. In correct cello posture, the inst

Extreme 33% vs 100%
sana-local/sana-1.5-1.6b: FAIL
Two tango dancers in a close embrace, the man dipping the woman backward with intertwined legs

Evaluator

The image depicts two tango dancers in a close embrace. The woman is arched backward, her body leaning into the man’s chest, and her left leg is extended behind her, while her right leg is bent and lifted. The man’s right arm is wrapped around her waist, and his left hand holds her right hand. Her left leg appears to be behind his, but the pose does not clearly show “intertwined legs” as described in the prompt — the legs are not visibly entwined or crossed in a way that suggests mutual locking

sana-local/sana-1.5-1.6b: PASS
Two soccer players mid-tackle, one sliding and the other jumping over with the ball at their feet

Evaluator

The image depicts two soccer players in a dynamic, mid-action tackle. The player on the left is sliding on the ground, while the player on the right is airborne, seemingly jumping over the sliding player with the ball near their feet — matching the prompt’s description. Anatomical analysis: - Limb count: Both players have all limbs present — two arms and two legs per player — no missing or extra limbs. - Proportions: The figures appear proportionally correct — heads, torsos, limbs are in realis

sana-local/sana-1.5-1.6b: FAIL
Two gymnasts performing a synchronized handstand, side by side with identical body alignment

Evaluator

The image depicts two female gymnasts in a pose that appears to be a synchronized handstand, but upon close inspection, the pose is anatomically incorrect and not a true handstand. Key observations: - The gymnasts are standing on their feet, not their hands; this is not a handstand but a standing pose with hands raised. - Their feet are on the ground (one is barefoot, the other appears to be on a small platform or mat), so they are not inverted. - The pose is not a synchronized handstand a

Hands 92% vs 92%
Easy 100% vs 100%
sana-local/sana-1.5-1.6b: PASS
A person standing with both hands visible at their sides, fingers relaxed and naturally spread

Evaluator

Each hand clearly shows five fingers: thumb, index, middle, ring, and pinky. The fingers are naturally spread and relaxed, consistent with the prompt’s description. The hands are positioned palms-up, slightly cupped, which is a natural and anatomically correct gesture. No anomalies are present — no extra or missing fingers, no unnatural deformities, and the proportions and joint alignment are realistic. VERDICT: PASS

sana-local/sana-1.5-1.6b: PASS
A woman holding her hands out in front of her palms-up, all ten fingers clearly visible

Evaluator

Both hands are clearly visible, palms up, with all ten fingers distinctly shown. Each hand displays five fingers: thumb, index, middle, ring, and pinky — no fingers are missing or duplicated. The gesture is anatomically correct: the hands are held open with palms facing upward, fingers slightly spread, which is a natural and common pose. The thumbs are oriented outward, and the fingers are aligned in a way that is consistent with human anatomy. There are no anomalies — no extra fingers, fused

sana-local/sana-1.5-1.6b: PASS
A man with arms crossed, both hands resting on opposite upper arms with fingers visible

Evaluator

Both hands are visible in the image, resting on opposite upper arms as described in the prompt. Left hand (resting on right upper arm): 5 fingers clearly visible — thumb, index, middle, ring, and pinky. No extra or missing fingers. Right hand (resting on left upper arm): 5 fingers clearly visible — thumb, index, middle, ring, and pinky. No extra or missing fingers. Anatomical correctness: - The hands are positioned naturally on the upper arms. - The fingers are slightly curled, which is typic

Medium 100% vs 100%
sana-local/sana-1.5-1.6b: PASS
A person holding a white ceramic coffee mug by the handle with their right hand

Evaluator

The right hand, which is holding the mug by the handle, clearly shows five fingers: thumb, index, middle, ring, and pinky. The grip is anatomically correct — the thumb and fingers wrap around the handle in a natural, functional manner. The left hand is also visible, gripping the body of the mug. It also clearly shows five fingers, though they are less distinct due to the angle and lighting. The fingers are curled around the mug’s side in a natural, supportive position. There are no anomalies i

sana-local/sana-1.5-1.6b: PASS
A woman writing in a notebook with a pen gripped correctly in her right hand

Evaluator

The visible right hand gripping the pen has five fingers: thumb, index, middle, ring, and pinky — all clearly present and anatomically positioned. The grip appears to be a standard tripod grip, with the thumb and index finger holding the pen while the middle finger provides support — this is anatomically correct for writing. The left hand is partially visible in the background, resting near the notebook. Although blurred, it appears to have five fingers as well, though not fully detailed. No a

sana-local/sana-1.5-1.6b: PASS
A person playing an acoustic guitar with left hand on the fretboard and right hand strumming

Evaluator

Left hand (on fretboard): Visible fingers are thumb, index, middle, ring, and pinky — that’s 5 fingers. The thumb is wrapped around the back of the neck, and the other four fingers are pressing down on strings — this is anatomically correct for fretting. Right hand (strumming): Only the thumb and index finger are clearly visible. The other fingers are obscured by motion blur and the angle. However, the thumb is positioned near the soundhole, and the index finger is near the strings — this is co

Hard 67% vs 67%
sana-local/sana-1.5-1.6b: PASS
A person making a peace sign with their right hand, clearly showing two fingers up and three curled

Evaluator

The visible hand in the image is making a peace sign with the index and middle fingers extended upward, while the ring, pinky, and thumb are curled or positioned in a way that suggests they are not fully visible or are obscured by the gesture. However, upon close inspection, the thumb is clearly visible and curled inward, and the ring and pinky fingers are also visible, curled toward the palm. This gives a total of five fingers — index, middle, ring, pinky, and thumb — all present and anatomical

sana-local/sana-1.5-1.6b: PASS
A hand giving a thumbs-up with four fingers curled and thumb extended upward

Evaluator

The image shows a single hand performing a thumbs-up gesture. The thumb is extended upward, and the other four fingers are curled into a fist. Counting the fingers on the visible hand: - Thumb: 1 - Index finger: curled - Middle finger: curled - Ring finger: curled - Pinky finger: curled That’s exactly 5 fingers per hand — the thumb plus the four curled fingers. Anatomically, this is a standard thumbs-up gesture. The thumb is extended, and the other four fingers are flexed into a fist — which

sana-local/sana-1.5-1.6b: FAIL
A person counting to three on their fingers, with index, middle, and ring fingers extended

Evaluator

Each hand shows exactly 5 fingers: thumb, index, middle, ring, and pinky — all clearly visible and anatomically complete. The gesture is not anatomically correct for “counting to three” with index, middle, and ring fingers extended. In the image, both hands are showing only the index finger extended, while the middle and ring fingers are curled inward. The gesture resembles “two” (if counting on one hand) or a “V” sign, not “three.” The prompt specifically requested “index, middle, and ring fi

Extreme 100% vs 100%
sana-local/sana-1.5-1.6b: PASS
Two people shaking hands in a professional setting, all fingers of both hands anatomically correct

Evaluator

Both hands are visible in the handshake. Counting the fingers on each hand:
- The hand on the left (wearing the brown plaid sleeve) clearly shows 5 fingers: thumb, index, middle, ring, and pinky.
- The hand on the right (wearing the dark pinstripe sleeve) also clearly shows 5 fingers, though some are partially obscured by the grip.
The grip is a standard professional handshake: palms facing each other, fingers interlocked, thumbs resting on top. The posture and alignment of the fingers and thu

sana-local/sana-1.5-1.6b: PASS
Two people doing a high-five, palms meeting with all fingers extended on both hands

Evaluator

Each visible hand clearly shows five fingers: thumb, index, middle, ring, and pinky. All fingers are extended as requested in the prompt. The gesture is a high-five, with palms meeting and fingers aligned naturally. There are no anatomical anomalies — no extra fingers, missing fingers, or unnatural deformities. The positioning is realistic and consistent with a high-five gesture. VERDICT: PASS

sana-local/sana-1.5-1.6b: PASS
One person handing a small gift box to another person, both sets of hands gripping the box correctly

Evaluator

Examining the hands in the image:
- The hand on the left (receiving) shows 5 fingers clearly: thumb, index, middle, ring, and pinky. The thumb is slightly curled under the box, and the other fingers are wrapped around the sides — a natural, anatomically correct grip for receiving an object.
- The hand on the right (giving) also shows 5 fingers: thumb, index, middle, ring, and pinky. The thumb is on the side, and the other fingers are wrapped around the box — a natural, anatomically correct grip

Multi-Subject 50%vs100%
Easy 33%vs100%
sana-local/sana-1.5-1.6b: PASS
Two women standing side by side: one with short blonde hair wearing a red jacket, the other with long black hair wearing a blue dress

Evaluator

Person 1 (left):
- Hair: Short, blonde, styled in a wavy bob with volume at the top.
- Clothing: Wearing a red jacket with visible zippers and pockets, over a black top.
- Visual distinction: Clearly separate from the other person, with distinct facial features and attire.
Person 2 (right):
- Hair: Long, black, wavy, cascading down her shoulders.
- Clothing: Wearing a blue dress with thin straps and a fitted waist.
- Visual distinction: Clearly separate from the other person, with distinct faci

sana-local/sana-1.5-1.6b: FAIL
Two men at a cafe: one with a beard and glasses wearing a green sweater, the other clean-shaven in a black hoodie

Evaluator

Person 1 (left):
- Has a full beard.
- Wears glasses.
- Wears a green sweater.
- Visually distinct from Person 2.
Person 2 (right):
- Has a beard (not clean-shaven).
- Wears a black hoodie.
- Visually distinct from Person 1.
The prompt specified one clean-shaven individual, but both individuals have beards. Therefore, the description of the second person as “clean-shaven” is not visually accurate. VERDICT: FAIL
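
Every complete evaluator note on this page closes with a marker of the form VERDICT: PASS or VERDICT: FAIL. As a rough sketch of how such a note could be reduced to a boolean, the snippet below assumes the final decision is the last VERDICT marker in the text. The helper name parse_verdict and the regular expression are illustrative assumptions, not part of ImageBench.

```python
import re
from typing import Optional

# Hypothetical helper, not taken from ImageBench: assumes the evaluator's
# final decision is the last "VERDICT: PASS" or "VERDICT: FAIL" marker.
VERDICT_RE = re.compile(r"VERDICT:\s*(PASS|FAIL)")

def parse_verdict(note: str) -> Optional[bool]:
    """Return True for PASS, False for FAIL, None if no verdict is present."""
    matches = VERDICT_RE.findall(note)
    if not matches:
        return None  # e.g. a note that was cut off before its verdict
    return matches[-1] == "PASS"

note = ("The prompt specified one clean-shaven individual, but both "
        "individuals have beards. VERDICT: FAIL")
result = parse_verdict(note)  # False
```

Returning None for a note without a marker keeps truncated notes from being silently counted as failures.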

sana-local/sana-1.5-1.6b: FAIL
Two friends posing: one tall with curly red hair in denim overalls, the other short with straight brown hair in a yellow sundress

Evaluator

Person 1 (tall):
- Hair: Curly, red (appears reddish-orange, voluminous, and textured)
- Clothing: Denim overalls (blue, with visible stitching, pockets, and metal clasps)
- Height: Appears taller than the other person, consistent with “tall”
- Skin: Has freckles
Person 2 (short):
- Hair: Straight, reddish-brown (not brown, but a shade close to the first person’s hair)
- Clothing: Yellow sundress (thin straps, ruffled bodice, flowing skirt)
- Height: Appears shorter than the other person, consi

Medium 67%vs100%
sana-local/sana-1.5-1.6b: FAIL
Four people at a dinner table: a bald man in a suit, a woman with silver hair in a red blouse, a young man with dreadlocks in a denim jacket, and a woman with a hijab in a green dress

Evaluator

- Bald man in a suit: Present. He is on the far left, bald, wearing a dark suit with a white shirt. ✅
- Woman with silver hair in a red blouse: Present. She is seated next to the bald man, has silver hair, and is wearing a red blouse. ✅
- Young man with dreadlocks in a denim jacket: Present. He is in the center, has white dreadlocks, and wears a blue denim jacket over a red shirt. ✅
- Woman with a hijab in a green dress: Present. She is on the far right, wearing a green hijab and a red patterned

sana-local/sana-1.5-1.6b: PASS
Three musicians on stage: a tall woman with short pink hair playing bass, a man with a beard and bandana on drums, and a slim man with glasses playing keyboard

Evaluator

Person 1: Tall woman with short pink hair playing bass
- Observed: Yes, a tall woman with short, bright pink hair is on the left, holding a bass guitar. She matches the description.
Person 2: Man with a beard and bandana on drums
- Observed: Yes, a man in the center with a beard and a red bandana (or headwrap) is behind a drum kit. He matches the description.
Person 3: Slim man with glasses playing keyboard
- Observed: Yes, a slim man on the right wearing glasses and a black shirt is seated at

sana-local/sana-1.5-1.6b: PASS
Three hikers on a trail: a man with a backpack and sunhat, a woman with braided hair and hiking poles, and a teenager with a baseball cap and water bottle

Evaluator

Person 1 (left): Woman with braided hair and hiking poles — Observed: She has long, dark braided hair, is wearing a straw sunhat (not a baseball cap), holding hiking poles, and has a backpack. Matches description except hat type.
Person 2 (center): Man with backpack and sunhat — Observed: He has a beard, wears a straw sunhat, carries a large backpack, and holds hiking poles. Matches description.
Person 3 (right): Teenager with baseball cap and water bottle — Observed: He wears a red baseball c

Truthfulness 48%vs93%

Photorealism 100%vs100%
Medium 100%vs100%
sana-local/sana-1.5-1.6b: PASS
A still life on a wooden table featuring a clear glass of water, a polished silver spoon, a folded linen napkin, and a ripe red apple — each material rendered with accurate texture and light behavior

Evaluator

The image successfully renders the requested materials with high fidelity. The glass exhibits accurate refraction and transparency, the apple shows realistic skin texture and specular highlights, the spoon reflects its surroundings correctly, and the linen napkin displays appropriate fabric folds and softness. The wooden table also has a convincing grain and surface sheen. VERDICT: PASS

sana-local/sana-1.5-1.6b: PASS
A workspace desk with a brushed aluminum laptop, a leather-bound notebook, a ceramic coffee mug, and a glass paperweight — all materials visually distinct and realistic

Evaluator

The image successfully renders the requested materials with high fidelity. The laptop exhibits a convincing brushed aluminum finish with appropriate metallic reflections. The notebook displays a realistic leather texture with visible grain and stitching details. The ceramic mug has a matte, slightly porous surface that contrasts well with the smoother objects. While the glass paperweight is not clearly distinct (possibly merged with the phone or absent), the core materials—metal, leather, and ce

sana-local/sana-1.5-1.6b: PASS
A kitchen counter with a stainless steel pot, a wooden cutting board, a glass bottle of olive oil, and a cotton dish towel — each surface reflecting light correctly

Evaluator

The image accurately renders the distinct material properties requested. The stainless steel pots exhibit sharp, metallic reflections and highlights. The glass bottles show appropriate translucency and refraction of the olive oil inside. The wooden cutting board displays a natural grain texture with soft specular highlights, while the cotton towel presents a realistic woven texture with diffuse lighting. All surfaces interact with light in a physically plausible manner. VERDICT: PASS

Physics & Reflections 75%vs92%
Easy 67%vs100%
sana-local/sana-1.5-1.6b: PASS
A red sphere on a white surface with a single shadow cast to the right, consistent with a light source from the upper left

Evaluator

The shadow is correctly cast to the right, consistent with a light source from the upper left. The highlight on the sphere also aligns with this lighting direction. The shadow’s soft edges and color bleed (red tint near the contact point) are physically plausible for a glossy red object on a white surface. No distortions or inconsistencies detected. VERDICT: PASS

sana-local/sana-1.5-1.6b: PASS
A coffee mug on a sunny windowsill casting a sharp shadow to the left

Evaluator

The shadow is cast to the left, which is consistent with sunlight coming from the right (as seen on the window frame). The shadow’s shape matches the mug’s silhouette, including the handle. The lighting and reflections on the mug are coherent with a single strong light source. No distortions or color anomalies are present. VERDICT: PASS

sana-local/sana-1.5-1.6b: FAIL
A person standing outdoors at midday with a short shadow directly below them

Evaluator

The shadow is elongated and cast to the side, which contradicts the “midday” condition (where shadows should be short and directly beneath). The shadow’s shape also appears distorted and inconsistent with the person’s posture. Additionally, the lighting suggests low-angle sun, not midday. These issues make the shadows physically implausible for the described scenario. VERDICT: FAIL

Medium 67%vs67%
sana-local/sana-1.5-1.6b: FAIL
A woman standing in front of a full-length mirror, her reflection accurately showing the back of her outfit and reversed orientation

Evaluator

The reflection is physically implausible. The woman is facing the mirror, so her reflection should show her front, not her back. Additionally, the reflection is not a mirror image; it is nearly identical in orientation rather than reversed. Shadows and lighting also lack consistency with a realistic mirror setup. VERDICT: FAIL

sana-local/sana-1.5-1.6b: PASS
A cat sitting on a polished marble floor with its reflection visible below

Evaluator

The reflection is vertically inverted, correctly oriented, and matches the cat’s pose and lighting. The glossy floor produces a realistic specular reflection with slight distortion near the contact point, consistent with polished marble. Shadows are soft and aligned with ambient light direction. No implausible artifacts detected. VERDICT: PASS

sana-local/sana-1.5-1.6b: PASS
A vase of flowers on a glass table with a clear reflection visible on the table surface

Evaluator

The reflection is physically plausible. It correctly mirrors the vase and flowers with appropriate vertical inversion and slight distortion consistent with a curved glass surface. The lighting and shadows align with the light source direction, and the colors in the reflection match the objects above. The transparency of the glass allows both the reflection and the background to be visible, which is realistic. VERDICT: PASS

Hard 67%vs100%
sana-local/sana-1.5-1.6b: PASS
A row of trees reflected in a calm lake at sunset, with the reflection slightly darker and gently rippled

Evaluator

The reflection is vertically inverted and aligned with the tree line, which is physically correct. The color matches the warm sunset tones, and the slight ripple distortion is consistent with a calm lake surface. The reflection appears slightly darker than the actual trees, as expected due to light absorption and scattering in water. No directional or orientation errors are present. VERDICT: PASS

sana-local/sana-1.5-1.6b: PASS
A city skyline reflected in a rain puddle on a street, with mild distortion and color shift

Evaluator

The reflections in the puddle are physically plausible. The orientation is correctly inverted, matching the city skyline above. The distortion mimics realistic water ripples and surface tension, with slight warping that aligns with the “mild distortion” prompt. Color shifts are consistent with wet surfaces reflecting ambient light — warm oranges from streetlights and cool blues from sky/buildings blend naturally. No directional inconsistencies or impossible geometry detected. VERDICT: PASS

sana-local/sana-1.5-1.6b: FAIL
A mountain range reflected in a still alpine lake, with the reflection inverted and subtly muted in color

Evaluator

The reflection is inverted and correctly oriented, matching the mountain range above. The color is subtly muted, as requested. However, the reflection shows unnatural distortion and texture inconsistencies — particularly in the lower half, where the reflected peaks appear warped and fragmented in ways not typical of a still lake surface. This breaks physical plausibility. VERDICT: FAIL

Extreme 100%vs100%
sana-local/sana-1.5-1.6b: PASS
A clear glass sphere sitting on a mirror surface near a pool of water, showing refraction through the sphere, reflection on the mirror, and water reflection simultaneously

Evaluator

The image displays a glass sphere resting on a reflective surface that appears to be water, given the caustic patterns and ripples. The reflection of the sphere is visible directly beneath it, consistent with a mirror-like water surface. The refraction through the sphere shows an inverted and distorted view of the background sky and horizon, which is physically accurate for a glass sphere. The lighting and highlights are coherent, with the sun’s position reflected in both the sphere and its refl

sana-local/sana-1.5-1.6b: PASS
A chrome sculpture in a room with a glass floor and large windows, showing multiple reflections and environmental mapping on every surface

Evaluator

The reflections are physically plausible. The chrome sculpture accurately mirrors the surrounding environment, including the windows and ceiling, with appropriate distortion based on its curved surface. The glass floor reflects the sculpture and room structure correctly, maintaining consistent orientation and perspective. Shadows align with the light source from the windows, and color mapping is realistic. VERDICT: PASS

sana-local/sana-1.5-1.6b: PASS
A crystal wine glass on a polished marble counter next to a window overlooking a lake — showing refraction in the glass, reflection on the marble, and the lake's own reflective surface

Evaluator

The reflections and refractions are physically plausible. The glass exhibits realistic distortion of the background landscape through its facets, consistent with crystal refraction. The reflection on the polished marble counter is sharp and correctly oriented, mirroring the glass’s shape and the light sources. The lake’s surface shows gentle ripples and reflects the sky and hills appropriately. Lighting direction is consistent across all elements — warm interior light from the left, cool dayligh

World Knowledge 8%vs92%
Easy 0%vs100%
sana-local/sana-1.5-1.6b: FAIL
The Eiffel Tower in Paris on a clear day, architecturally accurate with its iron lattice structure and three levels

Evaluator

The Eiffel Tower is recognizable, but the image is not architecturally accurate. The structure appears distorted and stylized, with exaggerated proportions and an unnatural texture that resembles a digital painting rather than a realistic depiction. The iron lattice structure is visible but lacks the precise detail expected in an architecturally accurate representation. Additionally, the surrounding environment seems overly simplified and cartoonish, detracting from realism. VERDICT: FAIL

sana-local/sana-1.5-1.6b: FAIL
The Taj Mahal in Agra seen from the reflecting pool, with accurate dome, minarets, and symmetry

Evaluator

The image depicts a structure resembling the Taj Mahal, but it is not architecturally accurate. The minarets are excessively tall and thin with unusual banding. The central dome lacks the characteristic finial and shape of the real monument. The surrounding gardens and trees appear artificial and stylized rather than realistic. The reflection in the pool is distorted and does not match the structure above. These inaccuracies make the depiction factually incorrect. VERDICT: FAIL

sana-local/sana-1.5-1.6b: FAIL
The Statue of Liberty on her island pedestal, with correct torch, crown, tablet, and green patina

Evaluator

The subject is recognizable as the Statue of Liberty. However, it is not factually or architecturally accurate. The pedestal is depicted as a simple rectangular block with windows, which does not match the actual neoclassical design of the statue’s base. Additionally, the tablet in her left hand shows illegible scribbles instead of the correct inscription “JULY IV MDCCLXXVI.” The torch and crown are stylized but not precise. These inaccuracies disqualify it from being considered factually accura

Medium 33%vs100%
sana-local/sana-1.5-1.6b: FAIL
A traditional Japanese shamisen instrument with accurate three-stringed design, wooden body, and bachi plectrum

Evaluator

The image depicts a stringed instrument with a pear-shaped body, but it is not a shamisen. A shamisen has three strings, a long neck, and a square or rectangular soundbox covered with skin. This instrument has six strings and a rounded wooden body resembling a lute or mandolin. The headstock and tuning pegs are also inconsistent with traditional shamisen design. Therefore, the subject is not factually accurate. VERDICT: FAIL

sana-local/sana-1.5-1.6b: FAIL
A medieval astrolabe with accurate rete, tympan, and engraved degree markings on brass

Evaluator

The image depicts a circular brass instrument with concentric rings and a central grid, resembling an astrolabe. However, the engraved markings are nonsensical gibberish rather than accurate degree markings or zodiac symbols. The structure lacks the distinct, recognizable components of a real astrolabe (such as a proper rete with star pointers or specific tympan plates). It appears to be a generic fantasy artifact rather than a factually accurate medieval instrument. VERDICT: FAIL

sana-local/sana-1.5-1.6b: PASS
A Japanese kintsugi bowl — cracked ceramic repaired with visible gold lacquer lines

Evaluator

The image accurately depicts a Japanese kintsugi bowl. The ceramic base is visible, and the cracks are filled with gold lacquer, which is the defining characteristic of kintsugi. The lighting and texture appear realistic, and the subject is clearly recognizable as requested. There are no factual or architectural inaccuracies. VERDICT: PASS

Hard 0%vs100%
sana-local/sana-1.5-1.6b: FAIL
A detailed diagram of the water cycle showing evaporation from the ocean, condensation forming clouds, precipitation as rain, and collection in rivers flowing back to the sea

Evaluator

The image depicts a stylized diagram of the water cycle with recognizable elements: ocean, clouds, rain, and underground flow. However, the text is entirely nonsensical gibberish — not real language — which undermines factual accuracy. The diagram’s structure is conceptually correct but lacks precise scientific labeling or realistic geographic/architectural detail (e.g., no rivers shown flowing to sea; underground layers are abstract). Since the prompt requires a “detailed diagram” with specific

sana-local/sana-1.5-1.6b: FAIL
A cross-section showing tectonic plates at a convergent boundary with subduction, a trench, volcanic arc, and layered mantle

Evaluator

The image depicts a stylized, artistic cross-section of geological layers with a mountain and trees on top. However, it fails to accurately represent the specific tectonic features requested: there is no visible subduction zone, oceanic trench, or clear volcanic arc structure. The layered mantle is shown in an abstract, non-geological manner. The subject is not factually accurate for the prompt. VERDICT: FAIL

sana-local/sana-1.5-1.6b: FAIL
An anatomical illustration of the human heart showing four chambers, valves, aorta, pulmonary arteries, and veins with correct placement

Evaluator

The image is a stylized, cartoon-like illustration rather than a scientifically accurate anatomical diagram. While the heart and major vessels are recognizable, key anatomical features such as the four chambers and valves are not clearly depicted or correctly placed. The vessels (aorta, pulmonary arteries/veins) are present but simplified and not anatomically precise in structure or spatial relationship. Therefore, it fails to meet the criteria of factual/architectural accuracy for an anatomical

Extreme 0%vs67%
sana-local/sana-1.5-1.6b: FAIL
A Morpho menelaus butterfly with accurate iridescent blue dorsal wings, brown ventral wing pattern with eyespots, and correct body morphology

Evaluator

The image depicts a butterfly with vibrant blue wings and eyespots, but it fails to accurately represent the Morpho menelaus. The wing shape is incorrect (resembling a swallowtail rather than the rounded Morpho shape), and the ventral patterns are stylized and inaccurate. The body morphology is also distorted. VERDICT: FAIL

sana-local/sana-1.5-1.6b: FAIL
The Hagia Sophia interior showing the massive central dome, pendentives, semi-domes, and Islamic calligraphy medallions alongside Byzantine mosaics

Evaluator

The image depicts a highly stylized, abstract interpretation of an interior space rather than a factual representation of the Hagia Sophia. While it includes domes and calligraphy, the architectural structure is distorted and surreal, lacking the recognizable layout of the Hagia Sophia’s central dome, pendentives, and semi-domes. The mosaics and calligraphy are not accurately placed or rendered as they appear in reality. VERDICT: FAIL

sana-local/sana-1.5-1.6b: FAIL
A specimen of bismuth crystal showing accurate iridescent oxide layers, geometric hopper crystal formation, and characteristic staircase structure

Evaluator

The image depicts a cluster of elongated, hexagonal prisms with pointed terminations, which is characteristic of quartz or similar silicate minerals, not bismuth. Bismuth crystals are known for their cubic hopper structure and distinct "staircase" faces, which are absent here. The iridescence is present but applied to the wrong crystal morphology. VERDICT: FAIL
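
The World Knowledge percentages for sana-local/sana-1.5-1.6b can be checked against the twelve verdicts listed in the cards above. The sketch below is only an illustration: the dictionary layout and the pass_rate helper are assumptions, and it assumes the page rounds each rate to the nearest whole percent.

```python
# Verdicts for sana-local/sana-1.5-1.6b as listed in the World Knowledge
# cards above (True = PASS). The layout and names are illustrative only.
world_knowledge = {
    "Easy":    [False, False, False],
    "Medium":  [False, False, True],   # only the kintsugi bowl passed
    "Hard":    [False, False, False],
    "Extreme": [False, False, False],
}

def pass_rate(verdicts):
    """Percentage of passing verdicts, rounded to the nearest integer."""
    return round(100 * sum(verdicts) / len(verdicts))

# Per-difficulty rates, then the subcategory rate over all twelve verdicts.
per_tier = {tier: pass_rate(v) for tier, v in world_knowledge.items()}
overall = pass_rate([v for tier in world_knowledge.values() for v in tier])

print(per_tier)
print(overall)
```

Under these assumptions the per-difficulty rates come out as 0, 33, 0, and 0, and the overall rate is 1 of 12, or 8 after rounding, which agrees with the headers shown for this model.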

Professional Studio 74%vs89%

Camera & Lighting 75%vs100%
Easy 100%vs100%
sana-local/sana-1.5-1.6b: PASS
A wide-angle shot of a modern living room interior showing expansive space with slight barrel distortion at the edges

Evaluator

The image exhibits several characteristics that align with the prompt, but also some significant distortions.
1. **Perspective/Wide-Angle/Barrel Distortion:** The prompt specifically requested "slight barrel distortion at the edges." The image actually exhibits **extreme** barrel distortion (or a fisheye effect), particularly visible in the curvature of the floor, the ceiling, and the walls. The distortion is not "slight"; it is a heavy, circular warping that makes the room look like it was sh

sana-local/sana-1.5-1.6b: PASS
A close-up macro shot of a single dewdrop on a rose petal, shallow depth of field

Evaluator

The image depicts a close-up macro shot of a single, large, prominent dewdrop resting on a rose petal.
1. **Perspective/Shot Type**: It is a macro shot focusing on the detail of the dewdrop and the texture of the petal.
2. **Depth of Field**: The image features a very shallow depth of field. The central dewdrop is in sharp focus, while the background and the foreground petals are heavily blurred (bokeh effect), which is consistent with the prompt.
3. **Lighting**: The lighting is soft and natu

sana-local/sana-1.5-1.6b: PASS
An aerial bird's-eye view looking straight down at a traffic roundabout with cars and trees

Evaluator

The image depicts an aerial bird's-eye view looking straight down at a traffic roundabout. The perspective is a top-down view, and the lighting is consistent across the scene, showing a bright, daylight environment. The presence of cars within the roundabout and trees surrounding the perimeter matches the prompt's description. VERDICT: PASS

Medium 33%vs100%
sana-local/sana-1.5-1.6b: FAIL
A portrait lit with classic Rembrandt lighting — triangle of light on the shadowed cheek, moody atmosphere

Evaluator

The image shows a profile view of a person's face.
**Lighting Analysis:** The prompt specifies "classic Rembrandt lighting," which is characterized by a small triangle of light appearing on the shadowed cheek. In this image, the light is a sharp, bright line or streak running vertically down the forehead and temple area. The cheek area is in deep shadow, and there is no distinct "Rembrandt triangle" on the shadowed side of the face. Instead, the light is a harsh, linear light source that does

sana-local/sana-1.5-1.6b: PASS
A landscape photograph bathed in golden hour light — long warm shadows, orange sky, soft glow

Evaluator

The image matches the prompt's description of camera and lighting:
1. **Lighting Pattern:** The image is dominated by a warm, orange-toned sky and a "soft glow" across the landscape. The light appears to be coming from a low angle, creating the "long warm shadows" and light rays (crepuscular rays) visible across the hills.
2. **Perspective and Depth of Field:** The image shows a vast landscape with receding layers of hills, providing a sense of depth. The focus is consistent across the landsc

sana-local/sana-1.5-1.6b: FAIL
A high-key portrait of a person against a bright white background, minimal shadows, airy and luminous

Evaluator

The image fails to meet the specified prompt requirements regarding lighting.
1. **Lighting Pattern:** The prompt specifies a "high-key portrait" with "minimal shadows" and an "airy and luminous" feel. However, the image features a very strong, harsh, and high-contrast shadow cast across the subject's face and forehead. This creates a dramatic, low-key or mid-key effect rather than the bright, shadowless, high-key aesthetic requested.
2. **Background:** While the background is bright, the pre

Hard 67%vs100%
sana-local/sana-1.5-1.6b: PASS
A portrait shot at 85mm f/1.4 with creamy bokeh background, backlit with warm rim light separating the subject from a blurred garden

Evaluator

The image matches the prompt's specifications:
1. **Perspective/Lens (85mm):** The image is a tight portrait shot with a shallow depth of field, consistent with an 85mm lens.
2. **Depth of Field (f/1.4/Creamy Bokeh):** The background is heavily blurred with large, soft bokeh circles, accurately reflecting a wide aperture like f/1.4.
3. **Lighting (Backlit/Warm Rim Light):** There is a distinct, bright warm light tracing the edges of the subject's hair and shoulders, creating a clear rim ligh

sana-local/sana-1.5-1.6b: FAIL
A street photograph with shallow depth of field as if shot at 50mm f/1.8, subject sharp with soft city lights in the background

Evaluator

The image provided is extremely blurry and lacks any discernible subject or detail.
1. **Perspective/Subject:** The prompt specifies a "subject sharp," but there is no sharp subject in this image. The entire image is out of focus, making it impossible to identify a subject.
2. **Depth of Field:** While the prompt asks for a shallow depth of field (50mm f/1.8) with a sharp subject and a soft background, this image has a "bokeh" effect applied to the entire frame. There is no area of focus; the

sana-local/sana-1.5-1.6b: PASS
A food photograph shot from above at 35mm f/2.8 with natural window light from the left, slight vignette, warm color temperature

Evaluator

To evaluate the image against the prompt's specifications:
1. **Perspective (Shot from above):** The image is a top-down (overhead) shot, which matches the "shot from above" requirement.
2. **Depth of Field (35mm f/2.8):** An f/2.8 aperture at 35mm typically produces a shallow depth of field with a soft blur in the background. The image shows a relatively deep focus across the bowl, though there is a slight softening towards the edges. However, the perspective and plane of focus are consisten

Extreme 100%vs100%
sana-local/sana-1.5-1.6b: PASS
A tilt-shift photograph of a city street making real-life look like a miniature model — selective focus strip, saturated colors

Evaluator

The image successfully implements the requested tilt-shift effect.
1. **Perspective/Tilt-Shift Effect**: The image uses a very strong selective focus (shallow depth of field) that creates a "strip" of focus in the middle ground, while the foreground and background are heavily blurred. This is the hallmark of the tilt-shift technique used to make real-world scenes look like miniature models.
2. **Lighting/Color**: The colors are highly saturated, with vibrant oranges and blues, which enhances

sana-local/sana-1.5-1.6b: PASS
A cinematic frame with blue and orange anamorphic lens flare streaking horizontally across a dark moody scene

Evaluator

The image features a central bright light source with light streaks radiating outwards.
1. **Lighting Pattern:** The prompt specifies "blue and orange anamorphic lens flare streaking horizontally". The image does contain a prominent horizontal streak of orange/white light and blue-toned light rays. However, the light rays are not just horizontal; they radiate in a starburst pattern from a central point in all directions (radial/spherical), rather than being a distinct horizontal anamorphic str

sana-local/sana-1.5-1.6b: PASS
A portrait with dramatic split lighting — half the face brightly lit, the other half in deep shadow, high contrast chiaroscuro

Evaluator

The image depicts a portrait of a man with a very strong lighting pattern. 1. **Lighting Pattern:** The prompt requested "dramatic split lighting — half the face brightly lit, the other half in deep shadow, high contrast chiaroscuro." The image follows this perfectly. The left side of the subject's face (from the viewer's perspective) is completely lost in black shadow, while the right side is brightly illuminated, creating a sharp, high-contrast divide. 2. **Perspective/Depth of Field:** The

Color Precision 92% vs 100%
Easy 100% vs 100%
sana-local/sana-1.5-1.6b: PASS
A bright red sports car parked on a gray asphalt road — the red is vivid and unambiguous

Evaluator

The image features a sports car that is a bright, vivid, and unambiguous red. The color of the car is consistent throughout the body and matches the description of "bright red" perfectly. The background consists of a gray asphalt road and a blurred urban setting, which provides a neutral backdrop that makes the red of the car stand out as requested. VERDICT: PASS

sana-local/sana-1.5-1.6b: PASS
A woman wearing a royal blue evening dress against a neutral background

Evaluator

The image shows a woman wearing a dress that is a deep, vibrant blue. The color is a true "royal blue," which is a bright, saturated shade of blue. The background is a neutral, light gray/beige color. The colors in the image perfectly match the description provided in the prompt. VERDICT: PASS

sana-local/sana-1.5-1.6b: PASS
A field of pure sunflower yellow flowers under a clear blue sky

Evaluator

The image shows a field of sunflowers and a sky. 1. **Sky Color:** The sky is a bright, clear blue, which matches the "clear blue sky" requirement. 2. **Flower Color:** The sunflowers are a vibrant, bright yellow. This matches the "pure sunflower yellow" description. The colors in the image accurately reflect the specific colors requested in the prompt. VERDICT: PASS

Medium 67% vs 100%
sana-local/sana-1.5-1.6b: PASS
A bedroom wall painted in #FF6B35 (burnt orange) with white furniture and natural daylight

Evaluator

The image shows a bedroom with a wall painted in a vibrant, saturated orange color. The hex code #FF6B35 represents a bright, slightly reddish-orange (burnt orange). The color of the wall in the image is a very close match to this description and the specific hex value provided. The furniture (bedding and chair) is white, and the lighting appears to be natural daylight. VERDICT: PASS
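
The evaluator's reading of #FF6B35 as a bright, slightly reddish orange can be checked by decoding the hex code into its channels. The snippet below is a minimal sketch and not part of the benchmark; the helper names `hex_to_rgb` and `hex_to_hsv` are hypothetical, and only Python's standard `colorsys` module is used.

```python
import colorsys


def hex_to_rgb(hex_code):
    """Decode '#RRGGBB' into an (R, G, B) tuple of 0-255 integers."""
    digits = hex_code.lstrip("#")
    return tuple(int(digits[i:i + 2], 16) for i in (0, 2, 4))


def hex_to_hsv(hex_code):
    """Return (hue in degrees, saturation %, value %), each rounded."""
    r, g, b = (channel / 255 for channel in hex_to_rgb(hex_code))
    h, s, v = colorsys.rgb_to_hsv(r, g, b)
    return round(h * 360), round(s * 100), round(v * 100)


print(hex_to_rgb("#FF6B35"))  # (255, 107, 53)
print(hex_to_hsv("#FF6B35"))  # (16, 79, 100)
```

A hue near 16 degrees sits between red (0) and orange (about 30), with full brightness and high saturation, which is consistent with the evaluator's description.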

sana-local/sana-1.5-1.6b: FAIL
A forest scene where the foliage is precisely #2D5F2D (dark muted green) with brown tree trunks

Evaluator

The image shows a forest scene with various shades of green and brown. The foliage consists of multiple tones, including lighter sage greens, medium greens, and darker forest greens. The tree trunks are a reddish-brown color. The prompt specifically requested the foliage to be precisely **#2D5F2D (dark muted green)**. While the image contains dark green tones, it also includes a wide variety of other greens (lighter, more muted, and more vibrant shades) that deviate from a single precise hex co

sana-local/sana-1.5-1.6b: PASS
A room with walls in #4A90D9 (medium blue) with warm wooden floor and white trim

Evaluator

The image shows a room with walls that are a medium blue color, a warm wooden floor, and white trim. The color of the walls is a medium blue that closely aligns with the description of #4A90D9 (a medium blue shade). The floor is a warm wood, and the trim is white, matching all the specific color requirements mentioned in the prompt. VERDICT: PASS
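
The tier percentages appear to be the rounded share of PASS verdicts in each group. That rule is an assumption, since the page does not state how it aggregates, but it reproduces the figures shown for sana-local/sana-1.5-1.6b in Color Precision. A minimal sketch, with the hypothetical helper `pass_rate`:

```python
def pass_rate(verdicts):
    """Percentage of PASS verdicts, rounded to the nearest integer."""
    passed = sum(1 for verdict in verdicts if verdict == "PASS")
    return round(100 * passed / len(verdicts))


# Verdicts as listed on this page for the Color Precision category.
color_precision = {
    "Easy": ["PASS", "PASS", "PASS"],
    "Medium": ["PASS", "FAIL", "PASS"],
    "Hard": ["PASS", "PASS", "PASS"],
    "Extreme": ["PASS", "PASS", "PASS"],
}

for tier, verdicts in color_precision.items():
    print(tier, pass_rate(verdicts))

all_verdicts = [v for verdicts in color_precision.values() for v in verdicts]
print("Color Precision", pass_rate(all_verdicts))
```

Under this assumption the Medium tier comes out at 67% (2 of 3) and the category at 92% (11 of 12), matching the headers.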

Hard 100% vs 100%
sana-local/sana-1.5-1.6b: PASS
A minimalist room with walls in #F5E6CC (warm cream), a sofa in #2C3E50 (dark navy), and accent pillows in #E74C3C (bright red)

Evaluator

The image contains the following colors: 1. **Walls:** The walls are a warm cream/beige color. While not an exact match to the hex code #F5E6CC, it is a very close representation of a "warm cream." 2. **Sofa:** The sofa is a dark navy blue, which aligns well with the hex code #2C3E50. 3. **Accent Pillows:** The pillows are a bright red, matching the description of #E74C3C. The colors in the image accurately reflect the specific hex values and names provided in the prompt. VERDICT: PASS
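
The evaluator accepts a wall color that is "not an exact match" but "very close". One way to make that judgment repeatable is a distance threshold in RGB space. The sketch below is illustrative only: the evaluator here is a vision model rather than a pixel comparison, the tolerance of 40 is an arbitrary assumption, and the sampled color `#F0E0C8` is a made-up example rather than a measurement from the image.

```python
import math


def hex_to_rgb(hex_code):
    """Decode '#RRGGBB' into an (R, G, B) tuple of 0-255 integers."""
    digits = hex_code.lstrip("#")
    return tuple(int(digits[i:i + 2], 16) for i in (0, 2, 4))


def rgb_distance(hex_a, hex_b):
    """Euclidean distance between two colors in plain RGB space."""
    return math.dist(hex_to_rgb(hex_a), hex_to_rgb(hex_b))


def is_close(target, sample, tolerance=40.0):
    """True when the sample lies within the (assumed) tolerance of the target."""
    return rgb_distance(target, sample) <= tolerance


# Hypothetical sampled wall color compared with the requested warm cream.
print(is_close("#F5E6CC", "#F0E0C8"))  # True
# The requested sofa color is nowhere near the wall color.
print(is_close("#F5E6CC", "#2C3E50"))  # False
```

Plain RGB distance is not perceptually uniform, so a stricter check would compare colors in a space such as CIELAB.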

sana-local/sana-1.5-1.6b: PASS
A product flat-lay on a #1A1A2E (deep dark blue) background, with a #F0C27F (soft gold) watch and #FFFFFF (pure white) earbuds

Evaluator

The image contains the following elements and colors: 1. **Background:** The background is a deep, dark blue. It appears to be a dark navy/midnight blue, which aligns well with the requested `#1A1A2E` (deep dark blue). 2. **Watch:** The watch and its strap are a soft, metallic gold color. This matches the requested `#F0C27F` (soft gold). 3. **Earbuds:** The earbuds are pure white, matching the requested `#FFFFFF` (pure white). All colors in the image accurately reflect the specific hex code

sana-local/sana-1.5-1.6b: PASS
A graphic design composition with background #FAFAFA (near-white), main shape #FF4500 (orange-red), and accent lines #333333 (dark charcoal)

Evaluator

The image contains the following colors: - **Background:** The background is a near-white/off-white color, which aligns with the requested **#FAFAFA**. - **Main Shape:** The large central shape is a vibrant orange-red, which aligns with the requested **#FF4500**. - **Accent Lines:** There are dark charcoal/black lines present in the composition, which aligns with the requested **#333333**. The colors in the image accurately reflect the hex codes and descriptions provided in the prompt. VERDICT

Extreme 100% vs 100%
sana-local/sana-1.5-1.6b: PASS
A smooth gradient on a wall going from #FF0000 (red) on the left to #0000FF (blue) on the right, transitioning evenly through purple

Evaluator

The image shows a smooth gradient transition. On the far left, the color is a vibrant red, which aligns with the hex code #FF0000. On the far right, the color is a deep blue, which aligns with the hex code #0000FF. The transition through the middle shows a clear progression through shades of magenta and purple, fulfilling the requirement of an even transition through purple. VERDICT: PASS
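
For reference, the midpoint the prompt implies can be computed directly. The sketch below assumes that "transitioning evenly" means linear interpolation of the RGB channels, which the prompt does not specify; `lerp_hex` is a hypothetical helper.

```python
def hex_to_rgb(hex_code):
    """Decode '#RRGGBB' into an (R, G, B) tuple of 0-255 integers."""
    digits = hex_code.lstrip("#")
    return tuple(int(digits[i:i + 2], 16) for i in (0, 2, 4))


def lerp_hex(start, end, t):
    """Linearly interpolate between two hex colors; t=0 is start, t=1 is end."""
    a, b = hex_to_rgb(start), hex_to_rgb(end)
    mixed = tuple(round(x + (y - x) * t) for x, y in zip(a, b))
    return "#{:02X}{:02X}{:02X}".format(*mixed)


print(lerp_hex("#FF0000", "#0000FF", 0.0))  # #FF0000
print(lerp_hex("#FF0000", "#0000FF", 0.5))  # #800080
print(lerp_hex("#FF0000", "#0000FF", 1.0))  # #0000FF
```

The halfway color #800080 is the standard web color named purple, so a linear red-to-blue gradient does pass through purple at its center, in line with the evaluator's observation.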

sana-local/sana-1.5-1.6b: PASS
A sunset sky gradient on a flat surface from #FF6B35 (orange) at the bottom to #1B1464 (deep indigo) at the top

Evaluator

The image displays a vertical color gradient. 1. **Top Color:** The top of the image is a deep indigo/dark blue. This aligns with the requested color `#1B1464`. 2. **Bottom Color:** The bottom of the image is a bright orange/yellow. The prompt requested `#FF6B35` (a vibrant orange) at the bottom. While the bottom area is a bright, warm color, it leans heavily into a bright yellow/light orange rather than the specific saturated orange requested. However, the overall transition from a deep indig

sana-local/sana-1.5-1.6b: PASS
A horizontal banner with a gradient from #00C9FF (cyan) on the left to #92FE9D (mint green) on the right

Evaluator

The image shows a horizontal banner with a gradient. 1. **Left side color:** The color on the left is a bright cyan/aqua. This aligns well with the hex code #00C9FF (a vibrant cyan). 2. **Right side color:** The color on the right is a pale, light mint green. This aligns well with the hex code #92FE9D (a light mint/pastel green). 3. **Gradient transition:** The transition from the bright cyan on the left to the light mint green on the right is smooth and follows the direction specified in the

Photorealism 0% vs 0%
Easy 0% vs 0%
sana-local/sana-1.5-1.6b: FAIL
A single red sneaker on a clean white background with soft studio lighting and gentle shadow — product photography style

Evaluator

The image shows a single red sneaker on a light gray/white background. While the lighting and shadows appear relatively smooth, there are several significant AI artifacts and structural issues that prevent it from looking like a real photograph: 1. **Anatomical/Structural Errors:** The shoe's structure is nonsensical. The eyelets (holes for the laces) are inconsistently shaped and placed. The way the laces interact with the eyelets is physically impossible; the laces appear to float or merge i

sana-local/sana-1.5-1.6b: FAIL
A clear glass perfume bottle on a reflective black surface with studio lighting, no background distractions

Evaluator

The image attempts to depict a high-end perfume bottle, but it contains several significant AI-generated artifacts and physical impossibilities that prevent it from passing as a real photograph: 1. **Impossible Geometry/Refraction:** The most glaring error is the white line/streak running through the center of the bottle. It appears to be a "glitch" or a misplaced reflection that does not follow the laws of physics or light refraction. It looks like a stray digital artifact rather than a light

sana-local/sana-1.5-1.6b: FAIL
A ripe green apple on a seamless white background with soft side lighting and subtle shadow — catalog style

Evaluator

The image depicts a single green apple on a white background. While the lighting and texture appear high-quality at first glance, there is a significant structural artifact at the base of the apple. **Observations:** 1. **The Base:** The bottom of the apple where it meets the surface is distorted. Instead of a natural curve or a soft shadow, there is a strange, jagged, and asymmetrical indentation that looks like a digital glitch or a "melted" texture. 2. **The Shadow:** The shadow/reflection a

Graphical design 38% vs 92%

Data Visualisation 0% vs 100%
Easy 0% vs 100%
sana-local/sana-1.5-1.6b: FAIL
A bar chart showing monthly sales for January through June with values 12K, 18K, 15K, 22K, 19K, 25K — labeled axes, color-coded bars, and a title 'Monthly Sales 2025'

Evaluator

The chart is visually striking but fails multiple critical requirements for a proper bar chart: 1. **Title**: The title reads “mmolahysahSles 2025” — this is clearly garbled and not “Monthly Sales 2025” as requested. 2. **Axis Labels**: The x-axis labels are illegible and appear to be random characters or corrupted text (e.g., “25K”, “15K”, “25K”, “225”, “20Z”, “2K”, “25”, “15”). They do not correspond to months (January–June) or readable values. 3. **Y-Axis**: The y-axis labels are also unre

sana-local/sana-1.5-1.6b: FAIL
A pie chart showing market share: Company A 35%, Company B 25%, Company C 20%, Others 20% — each slice a different color with labels and percentages

Evaluator

The requested pie chart should show: - Company A: 35% - Company B: 25% - Company C: 20% - Others: 20% In the image: - The green slice shows “35” — likely Company A — but the label is illegible and appears to be a stylized “Q” or “O”. - The blue slice shows “35” — likely another 35% slice — but this contradicts the prompt (only one 35% slice should exist). - The purple slice shows “550%” — which is invalid and not requested. - The red slice shows “207” — which is not a percentage and is invalid.

sana-local/sana-1.5-1.6b: FAIL
A simple line graph showing daily website visitors over one week (Mon-Sun): 1200, 1500, 1100, 1800, 2000, 1700, 1400 — with grid lines, axis labels, and title 'Weekly Traffic'

Evaluator

The image is a stylized line graph with grid lines and a title, but it fails on multiple critical criteria: - The title reads “WORLY TRIIF1E TVISFEINS,” which is clearly a misspelling of “Weekly Traffic.” It is not readable or correct. - The data points are not labeled or shown numerically. The requested values (1200, 1500, 1100, 1800, 2000, 1700, 1400) are not displayed. - The x-axis lacks day labels (Mon-Sun), making it impossible to map the data to the correct days. - The y-axis lacks numeri

Layout & Design 0% vs 89%
Easy 0% vs 100%
sana-local/sana-1.5-1.6b: FAIL
A movie night poster with a large popcorn image in the center, the title 'MOVIE NIGHT' in bold at the top, and the tagline 'Every Friday at 8 PM' at the bottom

Evaluator

The image includes a large popcorn image in the center, which matches the prompt. The title "MOVIE NIIGHT" is at the top in bold, and the tagline "EVERY AT T FRIEDAYA PPM" is at the bottom — though both contain errors (misspelled words and incorrect time format). However, the prompt specifically required the tagline to be "Every Friday at 8 PM", which is not present. The text is also not correctly spelled or formatted. Additionally, while the visual hierarchy is generally correct (title at top

sana-local/sana-1.5-1.6b: FAIL
An event poster featuring a guitar illustration, the headline 'LIVE MUSIC FEST' in large text, and 'Tickets available now' as a tagline below

Evaluator

The poster includes a guitar illustration centered prominently, which matches the prompt. The headline “LIVE MUSIC FEST” is visible, though the word “MUSIC” is partially obscured and stylized in a way that makes it less clear. The tagline “Tickets available now” is present, but it is garbled — “Tickets” appears as “ituks” and “available” as “aibbbly,” which renders it unreadable. The visual hierarchy is somewhat maintained, with the headline being the largest element and the tagline below it, bu

sana-local/sana-1.5-1.6b: FAIL
A book club poster with an open book illustration, the title 'JOIN THE BOOK CLUB' prominently displayed, and 'Meets every Wednesday' as the tagline

Evaluator

The prompt specified: - An open book illustration → Present (centered, stylized open book). - Title “JOIN THE BOOK CLUB” prominently displayed → Present, but the text is malformed: “JOIN THE” is above the book, and “BOOK CLUB” is inside it — this is visually prominent, but the phrase is not correctly formed (e.g., “JOIN THE BOOK CLUB” should be one cohesive title, not split with “THE” awkwardly placed). - Tagline “Meets every Wednesday” → Not present. Instead, there is a banner with gibberish te

Medium 0% vs 67%
sana-local/sana-1.5-1.6b: FAIL
A landing page design with a hero header image at the top, a headline and subtitle in the middle, a 'Sign Up' call-to-action button, and a footer with three social media icons

Evaluator

The image depicts a landing page with a hero header image at the top, a “Sign Up” button in the middle, and a footer with social media icons at the bottom — matching the described layout structure. However, all text is rendered in an unreadable, gibberish-like font, making it impossible to verify if headlines, subtitles, or any content are correctly placed or readable. The visual hierarchy is present (header → CTA → footer), but the illegibility of text renders the layout functionally unusable

sana-local/sana-1.5-1.6b: FAIL
A newsletter layout with a banner image header, a two-column body section with text on the left and an image on the right, a highlighted quote block, and a footer with unsubscribe link

Evaluator

The image depicts a newsletter layout that attempts to follow the described structure: a header banner, a two-column body (text on left, image on right), a highlighted quote block, and a footer with an unsubscribe link. However, upon close inspection: - The text is not readable — it appears to be gibberish or placeholder text (e.g., “b̲a̲r̲r̲a̲n̲d̲a̲l̲” at the top, and nonsensical lines throughout). - The visual hierarchy is present in structure (columns, blocks, footer), but the content lacks

sana-local/sana-1.5-1.6b: FAIL
A restaurant menu layout with a logo header, three category sections (appetizers, mains, desserts) each with items and prices, and a footer with address and hours

Evaluator

The image depicts a stylized, artistic restaurant menu with decorative elements, but it fails to meet the specified layout requirements. - There is a logo/header at the top (“SKETSUJ” in a stylized bowl), which is present. - There are three main sections: “APPETIZERS” (left), “MAIN” (center), and “DESSERTS” (right), which correspond to the requested categories. - However, the text within these sections is illegible — it appears as gibberish or stylized, unreadable script, not actual menu items

Hard 0% vs 100%
sana-local/sana-1.5-1.6b: FAIL
A magazine double-page spread with a left page containing a large hero photo, and a right page with a two-column article, a pull quote, a sidebar with author bio, and a page number

Evaluator

The image shows a magazine double-page spread. The left page features a large hero photo of a man’s face, which matches the prompt. The right page contains multiple text blocks: a large headline at the top, a main article in two columns, a pull quote (boxed text), and a sidebar with what appears to be an author bio. A page number is also visible in the top right corner. However, all text is rendered in a gibberish, nonsensical font that is not readable. The layout structure is correct — the ele

sana-local/sana-1.5-1.6b: FAIL
A mobile app UI mockup showing a dashboard with a top navigation bar, a search field, a 2x2 grid of feature cards with icons, a list of recent activity items, and a bottom tab bar

Evaluator

The image displays three smartphone screens, each showing a different UI layout, but none of them clearly and consistently match the described layout of a single dashboard with all specified elements. Specifically: - There is no clear top navigation bar with consistent placement across all screens. - The search field is not visible or clearly defined in any screen. - The 2x2 grid of feature cards is present in the leftmost screen, but the other two screens do not have this structure. - The list

sana-local/sana-1.5-1.6b: FAIL
A fashion magazine spread with a full-bleed left-page photo, right page with three-column layout, headline, body copy, a smaller inset image, and styled page numbers

Evaluator

The image depicts a fashion magazine spread with the following layout elements: - Left page: A full-bleed photo of a model’s face and upper torso, which dominates the page — this matches the prompt. - Right page: Contains a three-column layout with text blocks, a headline, body copy, and a smaller inset image (a full-bleed photo of a model’s full body at the bottom left). The page numbers are styled and placed at the bottom right. - Visual hierarchy: The full-bleed photo on the left page is cle

Style Diversity 75% vs 92%
Easy 33% vs 100%
sana-local/sana-1.5-1.6b: PASS
A golden retriever sitting in a garden, painted in the style of an oil painting with visible brushstrokes and rich texture

Evaluator

The image depicts a golden retriever sitting in a garden setting, rendered with thick, textured brushwork that strongly evokes the style of an oil painting. Visible brushstrokes are abundant, especially in the foliage and fur, and the texture is rich and layered — characteristic of impasto techniques often used in oil painting. The color palette is vibrant and painterly, with expressive strokes defining both the dog and the surrounding environment. All key visual characteristics of the requeste

sana-local/sana-1.5-1.6b: FAIL
A golden retriever sitting in a garden, rendered in Japanese anime style with flat colors and large expressive eyes

Evaluator

The image depicts a golden retriever sitting in a garden, which matches the subject matter of the prompt. However, the style is not clearly Japanese anime. While the dog has large, expressive eyes — a common anime trait — the overall rendering is more illustrative or cartoonish with soft shading and realistic proportions, not the flat colors and exaggerated features typical of Japanese anime. The background has some stylized elements, but the dog’s fur and anatomy are rendered with more realism

sana-local/sana-1.5-1.6b: FAIL
A golden retriever sitting in a garden, rendered in pixel art style with a 32x32 grid and limited color palette

Evaluator

The image depicts a golden retriever sitting in a garden, rendered in a pixelated, blocky style that strongly resembles pixel art. The visual composition is composed of distinct, square-like pixels, which is consistent with a 32x32 grid aesthetic. However, the color palette is not limited — it features a wide range of vibrant, saturated hues (including multiple shades of green, yellow, pink, red, purple, and blue) that go beyond a typical limited palette. While the pixel art style is recognizabl

Medium 100% vs 100%
sana-local/sana-1.5-1.6b: PASS
A city street scene in the style of Art Deco — geometric shapes, gold and black palette, ornamental symmetry

Evaluator

The image clearly embodies the requested Art Deco style. It features: - **Geometric shapes**: Buildings are composed of sharp angles, verticals, horizontals, and stylized forms, with strong emphasis on symmetry and clean lines. - **Gold and black palette**: The dominant color scheme is monochromatic gold and black, with metallic gold highlights contrasting against deep black shadows and structures. - **Ornamental symmetry**: The composition is highly symmetrical, with buildings mirroring each o

sana-local/sana-1.5-1.6b: PASS
A woman with a parasol in the style of Impressionism — soft brushstrokes, dappled light, pastel palette

Evaluator

The image depicts a woman holding a parasol, rendered with visible, textured brushstrokes that suggest an Impressionist approach. The palette is dominated by soft pastels — pinks, lavenders, light blues, and greens — with dappled light effects created through the interplay of color patches and highlights. The composition captures the essence of Impressionism: focus on light, atmosphere, and the fleeting moment rather than sharp detail. Key characteristics requested — soft brushstrokes, dappled

sana-local/sana-1.5-1.6b: PASS
A portrait of a man in Pop Art style — bold outlines, Ben-Day dots, flat vivid primary colors

Evaluator

The image clearly embodies the requested Pop Art style. It features: - Bold, high-contrast outlines defining the figure and facial features. - Use of Ben-Day dots (halftone patterns) for shading and texture, especially visible in the background and on the skin. - Flat, vivid primary colors (yellow, pink, blue, black) applied in large areas with minimal gradients. - A stylized, graphic representation consistent with Pop Art aesthetics. All key visual characteristics from the prompt are present

Hard 67% vs 67%
sana-local/sana-1.5-1.6b: PASS
A forest rendered in photorealistic watercolor — the detail and lighting of a photo combined with visible watercolor bleeds and paper texture

Evaluator

The image depicts a forest scene with a distinct watercolor aesthetic. It features visible watercolor bleeds — especially in the foliage and ground areas — and a textured appearance that mimics paper grain. The lighting is soft and atmospheric, with realistic depth and shadows, suggesting a photorealistic quality in terms of composition and illumination. The trees and undergrowth are rendered with fine, naturalistic detail, blending the realism of photography with the expressive, fluid qualities

sana-local/sana-1.5-1.6b: PASS
A woman in a futuristic city rendered in cyberpunk Art Nouveau — neon colors and tech elements combined with flowing organic lines and floral borders

Evaluator

The image successfully embodies the requested “cyberpunk Art Nouveau” style. It features: - **Neon colors**: Vibrant, glowing hues (pink, cyan, purple, green) dominate the scene, characteristic of cyberpunk aesthetics. - **Tech elements**: The woman has cybernetic enhancements (e.g., glowing implants, circuit-like facial details), and the background depicts a futuristic city with glowing signage and digital architecture. - **Flowing organic lines**: The composition uses sweeping, curvilinear fo

sana-local/sana-1.5-1.6b: FAIL
A medieval castle depicted in Japanese ukiyo-e woodblock print style combined with photorealistic sky and lighting

Evaluator

The image attempts to merge a medieval castle with ukiyo-e woodblock print aesthetics, but the result is visually inconsistent and fails to meet the requested style. Key issues: - The castle architecture is fundamentally European (towers, battlements, crenellations), not Japanese. Ukiyo-e typically depicts Japanese architecture (e.g., pagodas, wooden houses with curved roofs), not medieval European castles. - While the image uses ukiyo-e stylistic elements — flat areas of color, bold outlines,

Extreme 100% vs 100%
sana-local/sana-1.5-1.6b: PASS
An interior scene rendered in soft brutalism — raw concrete textures combined with rounded pastel furniture and warm diffused light

Evaluator

The image clearly embodies the requested “soft brutalism” style. Key visual characteristics are present: - **Raw concrete textures**: The walls, ceiling, and floor are rendered in unpolished, textured concrete, capturing the raw, industrial essence of brutalism. - **Rounded pastel furniture**: The furniture — sofas, ottomans, and tables — is organically curved and upholstered in soft pastel tones (peach, pink, beige), contrasting with the harshness of the concrete. - **Warm diffused light**: Li

sana-local/sana-1.5-1.6b: PASS
A still life painted in anxious minimalism — sparse composition with uncomfortable negative space, muted colors, and subtle visual tension

Evaluator

The image presents a still life with a sparse composition: a single, partially rendered spherical object (possibly a fruit or orb) sits on a flat surface, accompanied by a small, abstracted rectangular form behind it. The background is dominated by large, empty areas of muted beige and gray, creating significant negative space. The color palette is restrained — mostly desaturated grays, off-whites, and a single small patch of yellow — fitting the “muted colors” requirement. The brushwork is visi

sana-local/sana-1.5-1.6b: PASS
A cityscape rendered in nostalgic futurism — retro-futuristic 1960s space-age aesthetics with a melancholic warm-toned patina

Evaluator

The image successfully embodies the requested “nostalgic futurism — retro-futuristic 1960s space-age aesthetics with a melancholic warm-toned patina.” Key visual characteristics are clearly present: - **Retro-futuristic architecture**: The cityscape features tall, slender towers, domed structures, and sleek, angular forms reminiscent of 1960s sci-fi design (e.g., “Space Age” aesthetics from films like *2001: A Space Odyssey* or *The Jetsons*). - **Warm-toned patina**: The entire scene is bathed