ImageBench V1

192 evaluations across 6 categories

Benchmark V1 verdicts are produced by VLM judges and can contain mistakes. Treat PASS/FAIL labels as machine-assisted assessments, and inspect the images yourself.

Pass: 145

Fail: (not shown)

Pass Rate: 76%

Avg Latency: 246.2s

Min Latency: 243.8s

Max Latency: 247.3s
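
The headline figures agree with one another: 145 passes out of 192 evaluations is 75.5%, which rounds to the displayed 76%. The following is a minimal sketch of that arithmetic. It assumes every evaluation ends in either PASS or FAIL; the page does not display the fail count, so the value derived below is an inference under that assumption, not a reported number.

```python
# Figures taken from the summary at the top of the page.
total_evaluations = 192
passed = 145

# Assumption: no evaluation is skipped or errored, so the rest are failures.
derived_failed = total_evaluations - passed

# Pass rate as a whole-number percentage, as shown on the page.
pass_rate_percent = round(100 * passed / total_evaluations)

print(f"derived fail count: {derived_failed}")
print(f"pass rate: {pass_rate_percent}%")
```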

Text Rendering 80%

Typography Style 100%

Typography Style measures whether local/krea-2-turbo preserves requested lettering styles, font mood, and text texture such as typewriter ink or neon signage.

Easy 100%

PASS

The word 'CHAPTER ONE' typed on aged paper with a vintage typewriter font, complete with slightly uneven ink

Evaluator (Qwen 3 VL)

The text "CHAPTER ONE" is spelled correctly and rendered in a bold, blocky font resembling vintage typewriter lettering. The background mimics aged paper with visible texture. However, the ink lacks the expected slight unevenness or smudging characteristic of typewriter ink. The style is convincing overall but misses the specific "slightly uneven ink" detail. VERDICT: PASS

PASS

The phrase 'TODAY'S SPECIAL' written in white chalk on a dark green chalkboard with chalk dust

Evaluator (Qwen 3 VL)

Text reads “TODAY’S SPECIAL” correctly, spelled and punctuated as requested. Letters exhibit convincing chalk texture, smudges, and uneven edges on a dark green surface resembling a chalkboard. Chalk dust effect is visible around strokes. The medium and style match the prompt precisely. VERDICT: PASS

PASS

The word 'OPEN' displayed on a red LED sign with glowing pixel-style letters

Evaluator (Qwen 3 VL)

The word "OPEN" is rendered in glowing red pixel-style letters, matching an LED display. The letters are clearly formed with individual illuminated dots, and the spelling is correct. The style convincingly mimics a classic LED sign with warm glow and dark background. VERDICT: PASS
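
Each evaluator response on this page closes with a "VERDICT: PASS" or "VERDICT: FAIL" label. The page does not say how the PASS/FAIL badge is derived from the response, so the following is only a sketch of one plausible way to read that label. The function name `parse_verdict` and the regular expression are assumptions made for illustration.

```python
import re
from typing import Optional

# Matches the closing label used by the evaluator responses on this page.
_VERDICT_RE = re.compile(r"VERDICT:\s*(PASS|FAIL)\b")


def parse_verdict(response: str) -> Optional[str]:
    """Return "PASS" or "FAIL" from the last verdict label, or None if absent."""
    matches = _VERDICT_RE.findall(response)
    # Use the last match so that a label quoted earlier in the reasoning
    # does not override the final decision.
    return matches[-1] if matches else None


print(parse_verdict("The spelling is correct. VERDICT: PASS"))
print(parse_verdict("All elements are coherently combined in a single scene."))
```

Returning `None` when no label is found keeps a response that was cut off before its verdict distinguishable from an explicit FAIL.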

Writing accuracy 75%

Writing accuracy measures whether local/krea-2-turbo renders the exact requested words, numbers, labels, and short phrases without spelling or layout errors.

Easy 100%

PASS

A single large word 'HELLO' centered on a white background

Evaluator (Qwen 3 VL)

The word "HELLO" is fully spelled correctly and is clearly readable in bold, black letters on a white background. All characters are intact and legible. No misspellings or missing parts are present. The text matches the intended specification exactly. VERDICT: PASS

PASS

The word 'ADVENTURE' in big bold letters centered on a plain cream background

Evaluator (Qwen 3 VL)

The word "ADVENTURE" is spelled correctly and fully readable in bold, centered text on a cream background. All letters are intact and legible. No misspellings or missing words. The visual matches the intended text exactly. VERDICT: PASS

PASS

The word 'SERENITY' written large and centered on a solid light blue background

Evaluator (Qwen 3 VL)

The word "SERENITY" is spelled correctly and fully readable, centered on a solid light blue background. All letters are distinct and legible. No misspellings or missing parts. The text matches the intended specification exactly. VERDICT: PASS

Medium 100%

PASS

The sentence 'The quick brown fox jumps high' in clean black text on a white background

Evaluator (Qwen 3 VL)

All words are spelled correctly and fully readable: "quick", "brown", "fox", "jumps", "high". Text is clean black on white background as intended. No misspellings or illegible parts. Accuracy meets criteria. VERDICT: PASS

PASS

'Every sunset brings a new dawn ahead' written neatly on a pale yellow background

Evaluator (Qwen 3 VL)

Text reads: "Every sunset brings a new dawn ahead". All words are spelled correctly and fully readable against the pale yellow background. No misspellings or illegible parts. Matches intended text exactly. VERDICT: PASS

PASS

'Stars shine brightest in the dark night' displayed on a clean off-white background

Evaluator (Qwen 3 VL)

Text reads: "Stars shine brightest in the dark night". All words are spelled correctly and fully legible against the clean off-white background. No misspellings or missing words detected. Font is clear and high-contrast. VERDICT: PASS

Hard 100%

PASS

A poster with headline 'SUMMER SALE' in large bold text, body text 'Up to 50% off all items this weekend only' in medium size, and fine print 'Terms and conditions apply. See store for details.' at the bottom in small text

Evaluator (Qwen 3 VL)

All text matches the intended message exactly: "SUMMER SALE", "Up to 50% off all items this weekend only", and "Terms and conditions apply. See store for details." Every word is spelled correctly and fully readable, with no missing or illegible parts. Font sizes are appropriate for hierarchy. VERDICT: PASS

PASS

A flyer with large title 'GRAND OPENING', subtitle 'Join us for food, music, and fun' in medium text, and small disclaimer 'RSVP required. Limited seating available.' at the bottom

Evaluator (Qwen 3 VL)

All text matches the intended content exactly. "GRAND OPENING", "Join us for food, music, and fun", and "RSVP required. Limited seating available." are all spelled correctly and fully readable. No misspellings or illegible parts detected. VERDICT: PASS

PASS

An event banner with headline 'ANNUAL GALA 2025' at the top in large serif font, description 'An evening of fine dining and live jazz' in medium text, and footnote 'Black tie optional. Doors open at 6 PM.' in small text

Evaluator (Qwen 3 VL)

All text matches the intended content exactly: "ANNUAL GALA 2025", "An evening of fine dining and live jazz", and "Black tie optional. Doors open at 6 PM." Every word is spelled correctly and fully readable, with no missing or illegible parts. Font styles and sizes align with the description. VERDICT: PASS

Extreme 0%

FAIL

A vintage movie poster for 'THE LAST VOYAGE' featuring a tagline 'One ship. One chance. No turning back.' with credits block reading 'Directed by James Harlow. Starring Elena Voss, Marcus Chen, and Sofia Reyes. Music by Thomas Grant. A Paramount Pict

Evaluator (Qwen 3 VL)

Text errors: "Mcernic" (should be "Elena Voss"), "Pluinis" (illegible/incorrect), "Piclrss Tctures" (should be "Paramount Pictures"). "Starring" is misaligned. "Music by Thomas Grant" is partially obscured. Several words are distorted or unreadable. Core text is not fully accurate. VERDICT: FAIL

FAIL

A magazine cover titled 'NATIONAL EXPLORER' with cover lines 'The Hidden Temples of Cambodia — page 34', 'Interview: Dr. Maria Santos on Climate Action', 'Top 10 Trails You Must Hike This Year', and a barcode area with 'Issue 247 | March 2025 | $6.99

Evaluator (Qwen 3 VL)

“Cambuida” is misspelled (should be “Cambodia”). All other text is correctly spelled and readable. The barcode and pricing are accurate. The core requirement for full text accuracy is not met due to the single misspelling. VERDICT: FAIL
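
A single wrong word is enough to fail a Writing accuracy prompt, as the "Cambuida" case shows. The sketch below illustrates a word-level comparison between the requested text and a transcription of the rendered text. It assumes such a transcription is available (for example from OCR or from the judge's reading); the page does not describe that step, and `mismatched_words` is a hypothetical helper.

```python
from difflib import SequenceMatcher


def mismatched_words(expected: str, rendered: str) -> list[tuple[str, str]]:
    """Return (expected, rendered) word spans that differ between the two texts."""
    exp_words = expected.split()
    ren_words = rendered.split()
    matcher = SequenceMatcher(a=exp_words, b=ren_words, autojunk=False)
    diffs = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            # Covers replaced, missing, and extra words alike.
            diffs.append((" ".join(exp_words[i1:i2]), " ".join(ren_words[j1:j2])))
    return diffs


# Requested cover line versus the spelling the evaluator reported.
print(mismatched_words(
    "The Hidden Temples of Cambodia",
    "The Hidden Temples of Cambuida",
))
```

An empty result means the two texts match word for word; anything else lists the spans a reviewer should look at in the image.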

FAIL

A product label for 'GOLDEN HARVEST ORGANIC HONEY' with subtitle 'Raw & Unfiltered', weight '12 oz (340g)', description 'Pure wildflower honey harvested from sustainable apiaries in the Pacific Northwest. No additives or preservatives.', and small te

Evaluator (Qwen 3 VL)

“Unfilterred” is misspelled (should be “Unfiltered”). “preservatives” is misspelled (should be “preservatives”). All other text matches the intended content exactly. The misspellings are clear and fully readable. VERDICT: FAIL
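
The Text Rendering score of 80% is consistent with pooling every prompt in the category: 3 of 3 Typography Style prompts and 9 of 12 Writing accuracy prompts pass, giving 12 of 15. Averaging the two subcategory percentages instead would give 87.5%. The sketch below reproduces both calculations from the counts visible above. That the site pools prompts is an inference from these numbers, not something the page states.

```python
# (passed, total) per difficulty tier, counted from the entries listed above.
text_rendering = {
    "Typography Style": {"Easy": (3, 3)},
    "Writing accuracy": {
        "Easy": (3, 3),
        "Medium": (3, 3),
        "Hard": (3, 3),
        "Extreme": (0, 3),
    },
}


def totals(tiers: dict[str, tuple[int, int]]) -> tuple[int, int]:
    """Sum passed and total counts across the tiers of one subcategory."""
    passed = sum(p for p, _ in tiers.values())
    total = sum(t for _, t in tiers.values())
    return passed, total


subcategory_rates = {}
pooled_passed = pooled_total = 0
for name, tiers in text_rendering.items():
    passed, total = totals(tiers)
    subcategory_rates[name] = 100 * passed / total
    pooled_passed += passed
    pooled_total += total

pooled_rate = 100 * pooled_passed / pooled_total
mean_of_rates = sum(subcategory_rates.values()) / len(subcategory_rates)

print(subcategory_rates)
print(f"pooled: {pooled_rate:.1f}%  mean of subcategories: {mean_of_rates:.1f}%")
```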

Spatial Reasoning 81%

Attributes Binding 100%

Attributes Binding measures whether local/krea-2-turbo attaches the right colors, materials, and properties to the correct requested objects.

Easy 100%

PASS

A red ball and a blue cube sitting side by side on a white surface

Evaluator (Qwen 3.5 122B)

The image shows two objects on a white surface: - Left object: Red, spherical (ball) — matches “red ball” - Right object: Blue, cubic (cube) — matches “blue cube” Colors, shapes, and positions are correctly bound with no swaps. Size is comparable and appropriate for “side by side” placement. All prompt attributes are accurately represented. VERDICT: PASS

PASS

A green apple and an orange pumpkin on a wooden cutting board

Evaluator (Qwen 3.5 122B)

The image shows a green apple and an orange pumpkin on a wooden cutting board. The apple is correctly green, round, and in the foreground. The pumpkin is orange, ribbed, and positioned behind the apple. All attributes (color, shape, size) are correctly bound to their respective objects with no swaps. VERDICT: PASS

PASS

A yellow taxi and a black limousine parked next to each other on a street

Evaluator (Qwen 3.5 122B)

The image shows a yellow taxi and a black limousine parked side by side on a street. The taxi is correctly colored yellow and has typical taxi features including a roof sign. The limousine is correctly colored black and appears elongated as expected for a limousine. Both vehicles are positioned next to each other on what appears to be a city street with buildings in the background. All attributes are correctly bound to their respective objects. VERDICT: PASS

Medium 100%

PASS

A large red circle, a small blue triangle, and a medium green star arranged on a white canvas

Evaluator (Qwen 3.5 122B)

The image contains three distinct shapes on a white background: 1. A large red circle — correct color, size, and shape. 2. A small blue triangle — correct color, size, and shape. 3. A medium green star — correct color, size, and shape. All attributes are correctly bound to their respective objects with no swaps or errors. VERDICT: PASS

PASS

A tall thin white candle, a short fat red candle, and a medium spiral yellow candle on a shelf

Evaluator (Qwen 3.5 122B)

The image shows three candles on a shelf: 1. Left: Tall, thin, white candle — matches description. 2. Center: Short, fat, red candle — matches description. 3. Right: Medium height, spiral-shaped, yellow candle — matches description. All attributes (color, size, shape) are correctly bound to each object with no swaps. Lighting and texture are consistent; no artifacts or misalignments. VERDICT: PASS

PASS

A big brown horse, a small white rabbit, and a medium-sized black cat standing in a meadow

Evaluator (Qwen 3.5 122B)

The image shows a large brown horse with a white facial stripe and black mane, correctly matching "big brown horse." A small white rabbit is present in the foreground, matching "small white rabbit." A medium-sized black cat with green eyes stands beside them, matching "medium-sized black cat." All objects are correctly attributed with no swaps. The setting is a meadow with flowers and grass. VERDICT: PASS

Hard 100%

PASS

A pink fire truck, a blue banana, and a green sun in a surreal landscape

Evaluator (Qwen 3.5 122B)

The image shows a pink fire truck with correct structural details (ladder, lights, wheels). A large blue banana is present in the background, curving upward. A large green circular object (sun/moon) appears in the sky. All requested objects are present with correct colors and shapes. No attribute swaps observed. The surreal landscape context is also maintained. VERDICT: PASS

PASS

A purple carrot, a red tree with red leaves, and a white watermelon sliced open

Evaluator (Qwen 3.5 122B)

The image contains: 1. Purple carrot with green leafy top — correct color and shape. 2. Red tree with red leaves — correct color and structure. 3. White watermelon sliced open — outer rind is white, interior is red with black seeds — matches “white watermelon” as described (unusual but intentional). All attributes are correctly bound to their respective objects. No swaps or misattributions. VERDICT: PASS

PASS

An orange elephant, a yellow penguin, and a silver strawberry arranged on a table

Evaluator (Qwen 3.5 122B)

The image contains an orange elephant with white tusks and black eyes, a yellow penguin with a white belly and orange beak/feet, and a silver strawberry with green leaves. All objects match the prompt’s color, shape, and identity requirements — no swaps or misattributions. The arrangement on a table-like surface is also correct. VERDICT: PASS

Compositionality 89%

Compositionality measures whether local/krea-2-turbo combines subjects, actions, outfits, props, and scene details into one coherent requested image.

Easy 100%

PASS

A tabby cat wearing oversized red sunglasses

Evaluator (Qwen 3.5 122B)

The image shows a tabby cat with distinct black and brown markings. It is wearing large, bright red sunglasses that cover its eyes. The sunglasses are oversized relative to the cat’s face, fitting the description. All elements — tabby cat, oversized red sunglasses — are present and coherently combined. No contradictions or missing components. VERDICT: PASS

PASS

A golden retriever wearing a tiny cowboy hat

Evaluator (Qwen 3.5 122B)

The image shows a golden retriever with a brown cowboy hat on its head. The hat is proportionally small relative to the dog’s head, fitting the “tiny” descriptor. The dog’s expression is cheerful, mouth open, tongue visible — consistent with typical golden retriever portrayals. All prompt elements are present and coherently combined: breed, hat style, and size relationship. No contradictions or missing components. VERDICT: PASS

PASS

A penguin holding a colorful umbrella

Evaluator (Qwen 3.5 122B)

The image contains: - A penguin: Present (black and white body, orange beak, yellow-orange neck markings) - Holding: Present (penguin’s flipper grips umbrella handle) - Colorful umbrella: Present (rainbow panels: red, green, yellow, orange, blue, purple) All elements are clearly depicted and coherently combined. The penguin is anthropomorphically holding the umbrella in a plausible, stylized manner. VERDICT: PASS

Medium 100%

PASS

A medieval knight in full armor sitting at a desk typing on a modern laptop

Evaluator (Qwen 3.5 122B)

The image shows a medieval knight in full armor, including helmet, chainmail, and plate armor with a red cross emblem. The knight is seated at a wooden desk, typing on a modern Apple laptop. All described elements — knight, armor, desk, laptop — are present and coherently combined in a surreal but visually consistent scene. VERDICT: PASS

PASS

A Roman gladiator taking a selfie with a smartphone in the Colosseum

Evaluator (Qwen 3.5 122B)

The image contains all required elements: 1. Roman gladiator - Present (wearing helmet with red crest, armor, and red cape) 2. Taking a selfie - Present (holding phone up, looking at screen) 3. Smartphone - Present (modern black smartphone with triple camera) 4. Colosseum - Present (ancient Roman amphitheater structure in background) All elements are coherently combined in a single scene with proper lighting and perspective. VERDICT: PASS

PASS

An astronaut in a spacesuit riding a bicycle through a park

Evaluator (Qwen 3.5 122B)

The image contains all required elements: 1. Astronaut in a spacesuit: Present (white suit with American flag patch, helmet, backpack). 2. Riding a bicycle: Present (blue bicycle with visible wheels, handlebars, pedals). 3. Through a park: Present (trees, grass, paved path, natural lighting). All elements are coherently combined in a single scene. VERDICT: PASS

Hard 67%

PASS

An octopus wearing a top hat and monocle playing chess on the surface of the moon with Earth visible in the background

Evaluator (Qwen 3.5 122B)

The image contains all described elements: 1. Octopus - Present (central figure) 2. Top hat - Present (on octopus head) 3. Monocle - Present (over one eye with chain) 4. Chess game - Present (board with pieces) 5. Moon surface - Present (cratered ground) 6. Earth in background - Present (visible planet) All components are coherently combined in a single scene. VERDICT: PASS

PASS

A dinosaur in a business suit giving a PowerPoint presentation in a modern office while drinking coffee

Evaluator (Qwen 3.5 122B)

The image contains all required elements: 1. Dinosaur - Present (T-Rex head/body) 2. Business suit - Present (dark suit, white shirt, tie) 3. PowerPoint presentation - Present (projector screen with "Povemfiont" and YouTube-like interface) 4. Modern office - Present (conference room, laptop, windows) 5. Drinking coffee - Present (holding white coffee cup) All elements are coherently combined in a single scene. The dinosaur is actively presenting while holding coffee, creating the intended humor

FAIL

A robot dressed as a pirate sailing a paper boat on a lake of lava while juggling three kittens

Evaluator (Qwen 3.5 122B)

Let’s examine each element from the prompt: 1. Robot dressed as a pirate — ✅ Present (robot with pirate hat, vest, skull emblem). 2. Sailing a paper boat — ✅ Present (white origami-style boat on water). 3. On a lake of lava — ❌ Not present — it’s a calm blue lake with lily pads and trees, not lava. 4. Juggling three kittens — ⚠️ Partially present — three kittens are floating around the robot, but not being juggled; they’re mid-air in playful poses, not actively manipulated by the robot’s hands.

Counting 56%

Counting measures whether local/krea-2-turbo creates the exact number of requested objects without adding or missing items.

Easy 100%

PASS

Exactly 3 red apples arranged on a wooden table

Evaluator (Qwen 3.5 122B)

The image displays exactly three red apples with yellow streaks, placed on a wooden surface. The apples are arranged in a triangular formation. The count matches the prompt’s request for “exactly 3 red apples arranged on a wooden table.” No extra or missing objects are present. VERDICT: PASS

PASS

Exactly 5 yellow rubber ducks sitting in a row on a bathroom shelf

Evaluator (Qwen 3.5 122B)

The image shows exactly five yellow rubber ducks arranged in a row on a bathroom shelf. The ducks are evenly spaced, facing right, with water droplets visible on them and the surrounding tiles. A soap dispenser is partially visible in the background. The count matches the prompt’s request precisely. VERDICT: PASS

PASS

Exactly 4 white candles standing on a dark surface

Evaluator (Qwen 3.5 122B)

The image displays four white candles standing upright on a dark surface. Each candle is unlit, with visible wicks and slight texture variations. The count matches the prompt’s request exactly: four candles, white color, standing position, dark background. No extra or missing objects. All criteria are met precisely. VERDICT: PASS

Medium 67%

PASS

2 orange cats and 3 black dogs sitting together on a grassy lawn

Evaluator (Qwen 3.5 122B)

The image shows exactly 3 black dogs in the back row and 2 orange cats in the front, all sitting on grass. The count matches the prompt’s request precisely: 3 dogs and 2 cats. No extra or missing animals. Composition is clean and aligned with instructions. VERDICT: PASS

PASS

4 red roses and 2 white daisies in a glass vase on a table

Evaluator (Qwen 3.5 122B)

The image contains exactly 4 red roses and 2 white daisies arranged in a clear glass vase, placed on a patterned tablecloth. All requested objects are present and correctly counted. The composition matches the prompt precisely with no extra or missing elements. VERDICT: PASS

FAIL

1 blue butterfly and 3 yellow butterflies resting on a garden fence

Evaluator (Qwen 3.5 122B)

The image shows 1 blue butterfly and 2 yellow butterflies resting on a wooden garden fence with wire. The prompt requested 3 yellow butterflies, but only 2 are visible. Therefore, the count does not exactly match the request. VERDICT: FAIL

Hard 0%

FAIL

Exactly 9 goldfish swimming in a large round fishbowl

Evaluator (Qwen 3.5 122B)

The image shows a round fishbowl containing goldfish. Counting the fish: there are 10 goldfish visible in the bowl, not 9 as requested. The prompt specified “Exactly 9 goldfish,” so the count does not match. All other elements (round bowl, swimming fish) are correct, but the numerical requirement is violated. VERDICT: FAIL

FAIL

A fruit bowl containing exactly 3 bananas, 4 oranges, and 2 apples

Evaluator (Qwen 3.5 122B)

The image shows a fruit bowl with 3 bananas, 5 oranges, and 2 apples. The prompt requested exactly 3 bananas, 4 oranges, and 2 apples. While the banana and apple counts are correct, there is one extra orange. Therefore, the total count does not exactly match the request. VERDICT: FAIL

FAIL

Exactly 11 colorful balloons floating against a clear blue sky

Evaluator (Qwen 3.5 122B)

The image shows a cluster of colorful balloons against a blue sky. Counting carefully: there are 12 distinct balloons visible (yellow, blue, red, green, pink, orange, purple — some partially hidden). The prompt requested exactly 11. Since the count is 12, not 11, the requirement is not met. VERDICT: FAIL
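
The Counting verdicts above all reduce to an exact comparison between requested and observed quantities per object type, where being off by one is a failure. A minimal sketch of that comparison follows. The observed counts are the ones the evaluator reported, and `count_mismatches` is a hypothetical helper, not part of the benchmark.

```python
def count_mismatches(
    requested: dict[str, int], observed: dict[str, int]
) -> dict[str, tuple[int, int]]:
    """Map each object type whose counts differ to (requested, observed)."""
    names = requested.keys() | observed.keys()
    return {
        name: (requested.get(name, 0), observed.get(name, 0))
        for name in sorted(names)
        if requested.get(name, 0) != observed.get(name, 0)
    }


# Fruit bowl prompt: the evaluator counted one extra orange.
fruit = count_mismatches(
    {"bananas": 3, "oranges": 4, "apples": 2},
    {"bananas": 3, "oranges": 5, "apples": 2},
)
print(fruit)

# Butterfly prompt: one yellow butterfly is missing.
butterflies = count_mismatches(
    {"blue butterflies": 1, "yellow butterflies": 3},
    {"blue butterflies": 1, "yellow butterflies": 2},
)
print(butterflies)
```

An empty dictionary corresponds to a PASS under this strict reading; any entry corresponds to a FAIL.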

Negation 56%

Negation measures whether local/krea-2-turbo respects what should be absent from the image, not only what should be present.

Easy 100%

PASS

A tropical beach scene with no people

Evaluator (Qwen 3.5 122B)

The image shows a tropical beach scene with white sand, turquoise water, palm trees, and a blue sky with clouds. There are footprints in the sand, but no people are visible anywhere in the frame — on the beach, in the water, or in the background foliage. The excluded element (“people”) is fully absent. VERDICT: PASS

PASS

A cat lying on a couch without a collar

Evaluator (Qwen 3.5 122B)

The image shows a cat lying on a couch. The cat has white and tabby fur, green eyes, and is positioned comfortably between textured pillows. Importantly, there is no collar visible around the cat’s neck — the prompt explicitly excluded “a collar,” and it is indeed absent. All other elements (cat, couch, lying pose) match the prompt’s inclusion criteria. VERDICT: PASS

PASS

A classroom with desks and a chalkboard but no students

Evaluator (Qwen 3.5 122B)

The image shows an empty classroom with rows of wooden desks and chairs facing a green chalkboard. The chalkboard has faint chalk marks and some writing on the sides. There are no students visible in the room. The excluded element (students) is truly absent. VERDICT: PASS

Medium 33%

FAIL

A sandy beach with palm trees, seashells, and a beach umbrella but no water or ocean visible

Evaluator (Qwen 3.5 122B)

The image clearly shows a sandy beach with palm trees, seashells in the foreground, and a beach umbrella. However, contrary to the exclusion criterion, there is visible water — specifically, turquoise ocean waves breaking along the shoreline in the midground. The ocean stretches across the horizon under a blue sky. Since the prompt explicitly forbade any water or ocean, and it is prominently present, the image fails the constraint. VERDICT: FAIL

PASS

An office workspace with a desk, chair, bookshelf, and lamp but no computers or screens

Evaluator (Qwen 3.5 122B)

The image shows an office workspace with a wooden desk, a black office chair, a bookshelf filled with books and binders, and a lit table lamp. There is also a landline telephone on the desk. No computers, monitors, or screens are visible in the scene. The excluded elements are truly absent. VERDICT: PASS

FAIL

A kitchen with pots, pans, a stove, and a cutting board but no food

Evaluator (Qwen 3.5 122B)

The image shows a kitchen scene with pots, pans, a stove, and a cutting board as described. However, the prompt explicitly excluded “food,” and two glass jars on the counter contain visible food items: one with dark granules (possibly spices or coffee) and another with light-colored beans or grains. These are clearly food products. Therefore, the excluded element is present. VERDICT: FAIL

Hard 33%

FAIL

A pepperoni pizza with tomato sauce, basil, and crust but absolutely no cheese

Evaluator (Qwen 3.5 122B)

The image shows a pepperoni pizza with tomato sauce, basil leaves, and a crust. However, there is clearly melted cheese visible beneath the pepperoni slices and around the edges — it’s bubbly, golden in places, and integrated into the sauce layer. The prompt explicitly excluded cheese, but cheese is present. VERDICT: FAIL

PASS

A car body with windows, doors, and headlights but no wheels

Evaluator (Qwen 3.5 122B)

The image shows a silver car body with windows, doors, and headlights clearly visible. The wheel wells are empty — no wheels are present. The front bumper area is exposed, showing internal structure, and the rear wheel arch also lacks a wheel. All excluded elements (wheels) are absent as required. VERDICT: PASS

FAIL

A human face with eyes, nose, and mouth but no eyebrows

Evaluator (Qwen 3.5 122B)

The image shows a close-up of a human face with clearly visible eyes, nose, and mouth. Importantly, eyebrows are present above both eyes — they are sparse but distinctly drawn or rendered with fine hairs. Since the prompt explicitly excluded eyebrows, and they are visibly present in the image, the exclusion criterion is violated. VERDICT: FAIL
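
A Negation prompt has two parts: elements that must appear and elements that must not. The sketch below separates those two checks using plain sets. It assumes a list of detected elements is already available (here taken from the evaluator's descriptions); `check_negation` and the label strings are illustrative assumptions.

```python
def check_negation(
    required: set[str], excluded: set[str], detected: set[str]
) -> tuple[set[str], set[str]]:
    """Return (required elements that are missing, excluded elements that are present)."""
    missing = required - detected
    forbidden_present = excluded & detected
    return missing, forbidden_present


# Beach prompt: everything requested is there, but the ocean is visible.
beach = check_negation(
    required={"sandy beach", "palm trees", "seashells", "beach umbrella"},
    excluded={"water", "ocean"},
    detected={"sandy beach", "palm trees", "seashells", "beach umbrella", "ocean"},
)
print(beach)

# Car prompt: the wheels are absent, as requested.
car = check_negation(
    required={"windows", "doors", "headlights"},
    excluded={"wheels"},
    detected={"windows", "doors", "headlights"},
)
print(car)
```

Reporting the two sets separately shows whether a failure comes from a missing object or from a forbidden one, which the single PASS/FAIL label hides.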

Relative Position 100%

Relative Position measures whether local/krea-2-turbo places objects in the requested spatial relationship, such as above, below, inside, or next to another object.

Easy 100%

PASS

A cat sitting on top of a cardboard box

Evaluator (Qwen 3.5 122B)

The image shows a cat sitting directly on top of a cardboard box. The cat is positioned centrally on the box’s upper surface, with its paws resting on the box and its body elevated above it. The spatial relationship “on top of” is accurately represented. No other positional descriptors (left, right, under, etc.) are contradicted. All described spatial relationships are correct. VERDICT: PASS

PASS

A red ball underneath a wooden chair

Evaluator (Qwen 3.5 122B)

The image shows a red ball positioned directly beneath the seat of a wooden chair, resting on the floor between the chair’s legs. The spatial relationship “underneath” is accurately depicted — the ball is below the chair’s seat and within its leg structure. All described positional relationships are correct. VERDICT: PASS

PASS

A coffee mug next to an open book on a desk

Evaluator (Qwen 3.5 122B)

The image shows a coffee mug positioned to the left and slightly behind an open book, both resting on a desk surface. A laptop is visible in the background to the right. The spatial relationship “coffee mug next to an open book on a desk” is accurately represented — the mug is adjacent to the book, and both are on the desk. No positional errors are present. VERDICT: PASS

Medium 100%

PASS

A bird perched on a branch, the branch extending from a tree next to a park bench

Evaluator (Qwen 3.5 122B)

The image shows a bird perched on a branch that extends from a tree trunk on the left side of the frame. The branch reaches toward the right, above a park bench situated below and to the right of the bird. The spatial relationships described — “bird on branch,” “branch extending from tree,” and “tree next to bench” — are all accurately represented. The bench is beneath and adjacent to the branch’s extension. VERDICT: PASS

PASS

A cat sitting on a pillow, the pillow placed on a wooden chair

Evaluator (Qwen 3.5 122B)

The image shows a cat sitting on a pillow, and the pillow is placed on a wooden chair. The spatial relationships described in the prompt are accurately reflected in the image: the cat is on the pillow, and the pillow is on the chair. All positional relationships are correct. VERDICT: PASS

PASS

A candle on top of a stack of books, the books next to a potted plant

Evaluator (Qwen 3.5 122B)

The image shows a lit candle in a glass holder placed directly on top of a stack of three books. To the right of the books (from the viewer’s perspective) is a potted plant with green leaves in a speckled ceramic pot. The spatial relationships described — “candle on top of books” and “books next to potted plant” — are accurately represented. All positional descriptors align with the visual layout. VERDICT: PASS

Hard 100%

PASS

Five chess pieces on a board: a king in the center, a queen to its left, a bishop to its right, a rook above the king, and a knight below the king

Evaluator (Qwen 3.5 122B)

The image shows five black chess pieces on a checkered board. The king is centered. The queen is to its left, the bishop to its right, the rook above, and the knight below — matching the prompt’s spatial description exactly. All positional relationships are correctly rendered. VERDICT: PASS

PASS

A desk scene: laptop in the center, coffee mug to the left, phone to the right, lamp above, and notebook below

Evaluator (Qwen 3.5 122B)

The image shows a desk scene with a laptop centered, a coffee mug to its left, a phone to its right, a lamp positioned above and to the right of the laptop, and an open notebook with a pen below the laptop. All specified spatial relationships — left, right, above, below — are accurately represented as described in the prompt. VERDICT: PASS

PASS

Five animals in a field: a horse on the far left, a cow to the right of the horse, a sheep in the center, a goat to the right of the sheep, and a dog on the far right

Evaluator (Qwen 3.5 122B)

The image shows five animals in a field arranged as described: a horse on the far left, a cow to its right, a sheep centered in front, a goat to the sheep’s right, and a dog seated on the far right. All positional relationships match the prompt exactly — left/right/center placements are accurate with no contradictions. VERDICT: PASS

Extreme 100%

PASS

A landscape with a red barn in the foreground at large scale, a windmill in the midground at medium scale, and snow-capped mountains in the background at small scale, all with consistent depth perspective

Evaluator (Qwen 3.5 122B)

The image shows a red barn on the right side in the foreground, appearing large and dominant. A windmill stands to the left of the barn in the midground, smaller in scale. Snow-capped mountains rise behind both structures in the background, appearing distant and smaller due to perspective. The spatial arrangement matches the prompt: barn (foreground, large), windmill (midground, medium), mountains (background, small). All positional relationships are correct. VERDICT: PASS

PASS

A beach scene with a sandcastle in the foreground, a surfer on the waves in the midground, and a cruise ship on the horizon in the background, sizes decreasing with distance

Evaluator (Qwen 3.5 122B)

The image correctly shows a sandcastle in the foreground, a surfer riding waves in the midground, and a cruise ship on the horizon in the background. The sizes of the objects decrease with distance as specified. All positional relationships (foreground, midground, background) are accurate and consistent with the prompt. No spatial errors detected. VERDICT: PASS

PASS

A city park with a park bench and pigeons in the foreground, a fountain in the midground, and skyscrapers in the background, each layer progressively smaller

Evaluator (Qwen 3.5 122B)

The image correctly shows pigeons and a park bench in the foreground, a fountain in the midground, and skyscrapers in the background. The spatial layering is accurate: pigeons are on and around the bench, the fountain sits behind the bench on grass, and buildings rise behind trees in the distance. All described positional relationships align with the prompt. VERDICT: PASS
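
Relations such as "to the left", "above", and "below" can be illustrated with bounding-box centers. The sketch below applies that idea to the desk scene prompt. The benchmark itself relies on a VLM judge rather than boxes, and every coordinate here is a made-up placeholder, not a measurement from the generated image. `Box` and `relation_holds` are hypothetical names.

```python
from typing import NamedTuple


class Box(NamedTuple):
    """Axis-aligned box in image coordinates (y grows downward)."""
    x_min: float
    y_min: float
    x_max: float
    y_max: float

    def center(self) -> tuple[float, float]:
        return (self.x_min + self.x_max) / 2, (self.y_min + self.y_max) / 2


def relation_holds(subject: Box, anchor: Box, relation: str) -> bool:
    """Check one spatial relation of subject relative to anchor by box centers."""
    sx, sy = subject.center()
    ax, ay = anchor.center()
    if relation == "left_of":
        return sx < ax
    if relation == "right_of":
        return sx > ax
    if relation == "above":
        return sy < ay  # smaller y is higher in the image
    if relation == "below":
        return sy > ay
    raise ValueError(f"unknown relation: {relation}")


# Placeholder boxes for the desk scene (illustrative values only).
laptop = Box(400, 300, 700, 500)
mug = Box(150, 350, 250, 450)
phone = Box(800, 380, 880, 520)
lamp = Box(650, 50, 800, 280)
notebook = Box(420, 560, 680, 700)

checks = {
    "mug left of laptop": relation_holds(mug, laptop, "left_of"),
    "phone right of laptop": relation_holds(phone, laptop, "right_of"),
    "lamp above laptop": relation_holds(lamp, laptop, "above"),
    "notebook below laptop": relation_holds(notebook, laptop, "below"),
}
print(checks)
```

Each relation is tested on one axis only, so a lamp that sits above and slightly to the right of the laptop still satisfies "above", in line with the evaluator's PASS for that prompt.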

Scale & Proportions 78%

Scale & Proportions measures whether local/krea-2-turbo keeps people, vehicles, buildings, and everyday objects at believable relative sizes.

Easy 100%

PASS

A person standing next to a yellow school bus, showing realistic relative scale

Evaluator (Qwen 3.5 122B)

The person standing next to the bus appears to be of normal human proportions relative to the vehicle. The height of the person reaches approximately the middle of the bus door, which is consistent with real-world scale. The wheels, windows, and overall dimensions of the bus also appear proportionally correct for a standard school bus. No obvious distortions or scaling errors are present. VERDICT: PASS

PASS

A ladybug sitting on a large green leaf, shown at realistic proportions

Evaluator (Qwen 3.5 122B)

The ladybug appears appropriately sized relative to the leaf, consistent with real-world proportions. The leaf’s texture, veins, and dewdrops are detailed and scale-consistent with a macro shot of a small insect on foliage. No exaggerated or distorted sizing is evident. VERDICT: PASS

PASS

A child holding a basketball, both at correct relative sizes

Evaluator (Qwen 3.5 122B)

The image shows a child’s torso and arm holding a basketball. The ball appears appropriately sized relative to the child’s hand and body — consistent with real-world proportions for a standard basketball held by a child. No anatomical or scale distortions are visible. The grip, arm angle, and ball placement all look natural and physically plausible. VERDICT: PASS

Medium 67%

PASS

A tiny person standing on the rim of a coffee cup, looking down into it

Evaluator (Qwen 3.5 122B)

The image shows a person standing on the rim of a large coffee cup, looking down into it. The person is depicted as tiny relative to the cup, which aligns with the prompt’s description of “a tiny person standing on the rim of a coffee cup, looking down into it.” The proportions are consistent with the intended surreal scale — the cup is oversized, and the figure is miniature in comparison. No contradictions in size relationships are present. VERDICT: PASS

FAIL

A giant tabby cat walking between city skyscrapers like a kaiju

Evaluator (Qwen 3.5 122B)

The image depicts a tabby cat walking down a city street with skyscrapers in the background. However, the cat is not giant relative to the buildings — it appears normal-sized, roughly the size of a real house cat, while the buildings are full-scale skyscrapers. The perspective is low-angle, making the cat appear prominent, but not kaiju-sized. The prompt requires the cat to be “giant” and comparable in scale to the skyscrapers, which is not achieved here. VERDICT: FAIL

PASS

A miniature house sitting on the palm of someone's hand

Evaluator (Qwen 3.5 122B)

The image shows a miniature house resting on an open human palm. The house is appropriately scaled to fit comfortably within the hand, with proportions that suggest it is a small model rather than a full-sized structure. The fingers and thumb frame the house naturally, and the perspective supports the intended size relationship. No anatomical or scaling inconsistencies are visible. VERDICT: PASS

Hard 67%

FAIL

A group photo of an elephant, a horse, a dog, a cat, and a mouse, all standing in a line with correct real-world proportional sizes

Evaluator (Qwen 3.5 122B)

The image shows five animals arranged in a line: elephant, dog, cat, mouse, and horse. However, the size relationships are incorrect. The dog is nearly as tall as the horse’s shoulder, and the cat is almost as large as the dog — both unrealistic. The mouse is also disproportionately large relative to the cat. Real-world proportions are not maintained. VERDICT: FAIL

PASS

A dinner table scene with a plate, fork, knife, wine glass, salt shaker, and peppercorn, all at realistic relative sizes

Evaluator (Qwen 3.5 122B)

The image shows a table setting with a plate, fork, knife, wine glass, salt shaker, and peppercorn shaker. The relative sizes appear realistic: the plate is appropriately larger than the cutlery, the wine glass is proportionate to the plate, and the shakers are smaller than the plate but larger than the cutlery handles. All objects maintain consistent scale relationships typical of a real dining setup. VERDICT: PASS

PASS

A farm scene with a barn, tractor, farmer, cow, chicken, and a fence post, all at correct proportional scale

Evaluator (Qwen 3.5 122B)

The image shows a farm scene with all requested elements: barn, tractor, farmer, cow, chicken, and fence post. Proportions are generally consistent — the tractor is appropriately large relative to the farmer, the cow is realistically sized compared to both, and the chicken is small in the foreground. The fence post and barn scale also align with real-world expectations. Minor stylization doesn’t break proportionality. VERDICT: PASS
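
The 78% shown for Scale & Proportions is consistent with pooling the nine verdicts above (7 of 9 passed). The sketch below reproduces that arithmetic under the assumption that a subcategory score is simply passes divided by total prompts, rounded to a whole percent. The `pass_rate` helper and the `verdicts` dictionary are illustrative names, not part of ImageBench.

```python
# Hypothetical roll-up of the Scale & Proportions labels listed above.
# Assumption: score = passes / total prompts, rounded to a whole percent.
verdicts = {
    "Easy": ["PASS", "PASS", "PASS"],
    "Medium": ["PASS", "FAIL", "PASS"],
    "Hard": ["FAIL", "PASS", "PASS"],
}


def pass_rate(labels):
    """Return the share of PASS labels as a rounded whole percentage."""
    return round(100 * sum(label == "PASS" for label in labels) / len(labels))


# Per-tier percentages, then one pooled percentage over all nine prompts.
tier_rates = {tier: pass_rate(labels) for tier, labels in verdicts.items()}
overall = pass_rate([label for labels in verdicts.values() for label in labels])

print(tier_rates, overall)
```

Because every tier here has three prompts, pooling the verdicts and averaging the tier percentages give the same figure for this subcategory.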

Human realism 79%

Faces & Expressions 92%

Faces & Expressions measures whether local/krea-2-turbo creates realistic faces with the requested age, emotion, expression, and identity cues.

Easy 100%

PASS

A close-up portrait of a young woman with a warm natural smile

Evaluator (Qwen 3 VL)

Face is largely symmetric with natural features. Smile appears warm and genuine, with subtle lip curvature and slight eye crinkling. No obvious distortions. Skin texture and lighting look authentic. The emotion of contentment is clearly readable. VERDICT: PASS

PASS

A portrait of a middle-aged man with a calm, serious expression

Evaluator (Qwen 3 VL)

Face is symmetric with natural, correctly proportioned features. Expression is calm and serious, with subtle lines conveying maturity. No visible distortions. Skin texture and lighting are realistic. The prompt’s requirements are fully met. VERDICT: PASS

PASS

A headshot of a teenager with a relaxed, neutral face

Evaluator (Qwen 3 VL)

The face is symmetric with natural, correctly proportioned features. Eyes, nose, and mouth are aligned and realistic. The neutral, relaxed expression is clearly readable with no distortions. Skin texture, freckles, and hair look authentic. No AI artifacts or unnatural deformities are present. VERDICT: PASS

Medium 100%

PASS

A close-up of a person's face showing clear disgust, nose wrinkled and lip curled

Evaluator (Qwen 3 VL)

The face shows clear disgust: nose wrinkled, lips curled, and brows furrowed. Features are symmetric and anatomically correct. Skin texture and fine details (moles, pores) are natural. No distortions or artifacts. Emotion is unmistakably readable. VERDICT: PASS

PASS

A portrait capturing genuine surprise — raised eyebrows, wide eyes, slightly open mouth

Evaluator (Qwen 3 VL)

Face is symmetric with natural features. Eyes are wide, eyebrows raised, mouth slightly open — clearly conveying genuine surprise. No distortions; expression is authentic and well-executed. Skin texture and freckles appear realistic. VERDICT: PASS

PASS

A face showing quiet melancholy — downcast eyes, slightly drawn mouth, distant gaze

Evaluator (Qwen 3 VL)

Face is symmetric with realistic features. Eyes are downcast, lips slightly drawn, gaze distant — all aligning with the prompt. Skin texture and freckles add authenticity. No major distortions. Emotion is clearly readable as quiet melancholy. VERDICT: PASS

Hard 67%

PASS

An elderly woman in her 80s with deep wrinkles and white hair, smiling warmly with kind eyes

Evaluator (Qwen 3 VL)

The face shows natural asymmetry and deep wrinkles consistent with an elderly woman. White, textured hair frames the face. The warm smile and kind eyes are clearly visible, conveying genuine emotion. No significant distortions; features are anatomically plausible. The image successfully captures the requested expression and age. VERDICT: PASS

FAIL

A 5-year-old child with round cheeks and large eyes, crying with tears streaming down their face

Evaluator (Qwen 3 VL)

The child’s face is symmetric with round cheeks and large eyes, matching the prompt. However, there are no visible tears streaming down the face; the expression appears neutral or slightly sad, not clearly crying. Facial features are well-defined without distortions. The prompt’s key emotional element (crying) is missing. VERDICT: FAIL

PASS

An elderly man in his 70s with age spots and thinning gray hair, looking skeptically over reading glasses

Evaluator (Qwen 3 VL)

Face is symmetric with realistic wrinkles, age spots, and thinning gray hair. Skeptical expression is clearly readable via furrowed brow and downturned mouth. No significant distortions. Features align with prompt. VERDICT: PASS

Extreme 100%

PASS

A group of three friends at a table: one laughing with head tilted back, one looking shocked with wide eyes, and one rolling their eyes with a smirk

Evaluator (Qwen 3 VL)

Faces show clear, symmetric features with natural proportions. Emotions are vividly readable: left man laughs with head back, center woman looks shocked with wide eyes and hands to cheeks, right man smirks with rolled eyes. No distortions or unnatural artifacts. All three expressions match the prompt precisely. VERDICT: PASS

PASS

A family photo with four people: a father beaming with pride, a mother smiling gently, a teenager looking bored, and a child grinning excitedly

Evaluator (Qwen 3 VL)

Faces show natural symmetry and correct anatomical features. Emotions are clearly readable: father beams proudly, mother smiles gently, teenager looks bored (hand on cheek, neutral expression), child grins excitedly. No distortions or unnatural artifacts. All four individuals match the prompt’s emotional descriptions accurately. VERDICT: PASS

PASS

Three coworkers in a meeting: one looking frustrated with furrowed brows, one nodding with a pleased expression, and one looking confused with a tilted head

Evaluator (Qwen 3 VL)

Faces are symmetric with correct features. Left woman shows frustration (furrowed brows). Center man shows pleased expression (smile, gaze). Right woman shows confusion (tilted head, furrowed brow). Emotions are clearly readable. No distortions. VERDICT: PASS

Full Body 92%

Full Body measures whether local/krea-2-turbo renders full-body people with natural anatomy, clothing, pose, and proportions.

Easy 100%

PASS

A woman standing upright in a relaxed pose, arms at her sides, wearing jeans and a white t-shirt

Evaluator (Qwen 3 VL)

The figure shows a full-body view of a woman standing upright with arms at her sides, wearing jeans and a white t-shirt. Limb count and proportions are anatomically correct. The pose is clearly recognizable as described. No distortions are visible. All elements match the prompt precisely. VERDICT: PASS

PASS

A man standing straight facing the camera, hands in pockets, normal proportions

Evaluator (Qwen 3 VL)

The man stands straight, facing forward, with hands in pockets. Limb count and proportions are anatomically correct. The pose is clearly recognizable as described. No distortions are visible. All elements match the prompt precisely. VERDICT: PASS

PASS

A person standing in a neutral pose in a park, full body visible from head to feet

Evaluator (Qwen 3 VL)

Limb count and proportions are anatomically correct. The pose is a neutral standing stance, recognizable as requested. No significant distortions are visible. The full body from head to feet is shown, though the head is cropped out. The park setting and clothing match the prompt. VERDICT: PASS

Medium 100%

PASS

A woman mid-stride running through a park, hair and clothes showing motion

Evaluator (Qwen 3 VL)

The woman’s limbs and proportions are anatomically correct. Her running pose is dynamic and recognizable, with arms swinging and hair flowing mid-stride. No distortions in limb count or proportions are visible. Motion is convincingly rendered in both hair and clothing. VERDICT: PASS

PASS

A man captured mid-jump over a puddle, body in the air with bent knees

Evaluator (Qwen 3 VL)

The man’s limbs are correctly counted and proportioned. His pose—bent knees, arms outstretched mid-air—accurately depicts jumping over a puddle, with water splashing realistically. No anatomical distortions are visible. The action is clearly recognizable and matches the prompt. VERDICT: PASS

PASS

A dancer in a dynamic spin, dress flowing outward with one leg extended

Evaluator (Qwen 3 VL)

The dancer has correct limb count and proportions. The pose is a dynamic spin with one leg extended, recognizable as ballet. Dress flows outward as described. Slight distortion in the right arm’s bend and hair strands appear overly stylized, but core anatomy and action are accurate. VERDICT: PASS

Hard 67%

PASS

A person in a yoga tree pose — standing on one leg with the other foot pressed to the inner thigh, arms raised overhead

Evaluator (Qwen 3 VL)

The person stands on one leg, with the other foot pressed against the inner thigh, arms raised overhead in prayer position. Limb count and proportions are correct. The pose is clearly recognizable as Tree Pose. No anatomical distortions are visible. VERDICT: PASS

PASS

A rock climber clinging to a cliff face with arms and legs extended to different holds

Evaluator (Qwen 3 VL)

The climber has correct limb count and proportions. Her pose—arms extended to holds, legs positioned for support—is anatomically plausible and matches the prompt. No obvious distortions in limbs or joints. The action is clearly recognizable as rock climbing. Gear and holds appear realistic. VERDICT: PASS

FAIL

A cellist seated with correct posture, legs around the instrument, bow arm extended

Evaluator (Qwen 3 VL)

The cellist has correct limb count and proportions. His posture is anatomically plausible for playing: seated, legs around the instrument, bow arm extended. The cello rests on a stand, not his legs, which contradicts the prompt. The pose is recognizable as cello playing, though the instrument’s placement is inaccurate. VERDICT: FAIL

Extreme 100%

PASS

Two tango dancers in a close embrace, the man dipping the woman backward with intertwined legs

Evaluator (Qwen 3 VL)

The dancers’ limbs are correctly counted and proportioned. The woman is dipped backward with intertwined legs, matching the prompt. Anatomical structure is accurate, with no visible distortions. The pose is recognizable as a classic tango embrace with a dip. All elements align with the requested action. VERDICT: PASS

PASS

Two soccer players mid-tackle, one sliding and the other jumping over with the ball at their feet

Evaluator (Qwen 3 VL)

Both players show correct limb count and proportions. The sliding player’s body posture and the jumping player’s mid-air pose are anatomically plausible and match the described action. No distortions are visible. The ball is at the feet of the jumping player, as prompted. The scene captures a realistic soccer tackle. VERDICT: PASS

PASS

Two gymnasts performing a synchronized handstand, side by side with identical body alignment

Evaluator (Qwen 3 VL)

Both gymnasts show correct limb count and proportions. Their synchronized handstand pose is anatomically accurate, with aligned bodies, straight legs, and stable hand placement. No distortions are visible. The action is clearly recognizable as a synchronized handstand. All elements match the prompt precisely. VERDICT: PASS

Hands 42%

Hands measures whether local/krea-2-turbo draws anatomically plausible hands, fingers, and gestures.

Easy 67%

PASS

A person standing with both hands visible at their sides, fingers relaxed and naturally spread

Evaluator (Qwen 3.5 122B)

Upon close examination: The left hand (viewer’s right) shows 5 fingers — thumb, index, middle, ring, and pinky — all present and anatomically plausible. The right hand (viewer’s left) also displays 5 fingers with natural spacing and relaxed posture. No extra or missing digits; no fusion or distortion. Grip/gesture matches “relaxed and naturally spread.” All criteria met. VERDICT: PASS

PASS

A woman holding her hands out in front of her palms-up, all ten fingers clearly visible

Evaluator (Qwen 3.5 122B)

The image shows a woman with both hands extended forward, palms facing the viewer. Each hand clearly displays five fingers: thumb, index, middle, ring, and pinky. The fingers are well-formed, proportionate, and anatomically correct — no extra, missing, or fused digits. The gesture is natural and matches the prompt’s requirement of “palms-up, all ten fingers clearly visible.” No anomalies detected. VERDICT: PASS

FAIL

A man with arms crossed, both hands resting on opposite upper arms with fingers visible

Evaluator (Qwen 3.5 122B)

The image shows a man with arms crossed. The right hand (on the left side of image) is not fully visible — only part of the forearm and elbow are seen; no fingers are visible. The left hand (on the right side) rests on the right upper arm, showing 4 fingers clearly (thumb is hidden or obscured). Anatomically, the grip is plausible for crossed arms, but finger count is incomplete: one hand has 0 visible fingers, the other has 4. Not exactly 5 per hand as required. VERDICT: FAIL

Medium 33%

FAIL

A person holding a white ceramic coffee mug by the handle with their right hand

Evaluator (Qwen 3.5 122B)

The image shows a right hand gripping a white ceramic mug by its handle. The thumb is positioned on top of the handle, and four fingers are wrapped around the front side of the handle — this is anatomically correct for a right-hand grip. However, only four fingers are visibly distinct; the pinky finger is either obscured or not fully rendered. While the grip itself is plausible, the incomplete visibility/rendering of all five fingers violates the “exactly 5 per hand” criterion. VERDICT: FAIL

PASS

A woman writing in a notebook with a pen gripped correctly in her right hand

Evaluator (Qwen 3.5 122B)

The right hand gripping the pen shows 5 fingers with anatomically correct positioning for writing. The left hand resting on the notebook also displays 5 fingers in a natural, relaxed position. No extra or missing fingers are visible. The grip is realistic and the hands appear properly proportioned and positioned. VERDICT: PASS

FAIL

A person playing an acoustic guitar with left hand on the fretboard and right hand strumming

Evaluator (Qwen 3.5 122B)

The left hand gripping the fretboard shows only four fingers clearly visible, with the thumb positioned behind the neck — anatomically plausible for guitar playing. However, the right hand (strumming hand) is partially obscured and appears to have only three visible fingers; the pinky and ring finger are either missing or merged unnaturally. This suggests an anatomical anomaly inconsistent with a natural human hand. The grip on the fretboard is mostly correct, but the strumming hand fails the “e

Hard 67%

PASS

A person making a peace sign with their right hand, clearly showing two fingers up and three curled

Evaluator (Qwen 3.5 122B)

The image shows a single right hand making a peace sign. The index and middle fingers are extended upward, while the thumb, ring, and pinky fingers are curled inward — totaling 5 fingers visible on one hand. The gesture is anatomically correct for a right hand. No anomalies or extra/missing fingers are present. Since only one hand is shown and it meets the criteria, the core requirement is satisfied. VERDICT: PASS

PASS

A hand giving a thumbs-up with four fingers curled and thumb extended upward

Evaluator (Qwen 3.5 122B)

The image shows one hand making a thumbs-up gesture. The thumb is extended upward, and the other four fingers are curled into a fist — this matches the prompt exactly. Anatomically, the hand appears correct: five digits total (one thumb + four fingers), proper joint alignment, natural skin texture, and no extra or missing fingers. No anomalies detected. VERDICT: PASS

FAIL

A person counting to three on their fingers, with index, middle, and ring fingers extended

Evaluator (Qwen 3.5 122B)

The image shows one hand with all five fingers extended — thumb, index, middle, ring, and pinky — clearly visible. The prompt asked for “counting to three” (index, middle, ring extended), but instead, all five are up. Anatomically, the hand is correct: no extra or missing fingers, natural creases, proper proportions. However, it fails the prompt’s specific gesture requirement. VERDICT: FAIL

Extreme 0%

FAIL

Two people shaking hands in a professional setting, all fingers of both hands anatomically correct

Evaluator (Qwen 3.5 122B)

Upon close examination, the handshake shows anatomical inconsistencies. The hand on the left (wearing white cuff) appears to have only 4 visible fingers gripping, with the thumb obscured or merged. The hand on the right (blue cuff) also shows ambiguous finger count — possibly 4 fingers visible, with unclear separation between digits. Grip structure is unnatural; fingers seem fused or misaligned. Not all fingers are clearly present or correctly positioned. VERDICT: FAIL

FAIL

Two people doing a high-five, palms meeting with all fingers extended on both hands

Evaluator (Qwen 3.5 122B)

The image shows two hands pressed together in a prayer-like or clapping gesture, not a high-five. The left hand (viewer’s left) clearly displays 5 fingers. The right hand (viewer’s right) also shows 5 fingers — thumb, index, middle, ring, and pinky — though the pinky is partially obscured by the left hand’s palm. Anatomically, the grip is plausible for a prayer position but incorrect for a high-five, which requires palms facing each other with fingers spread outward, not interlocked or pressed f

FAIL

One person handing a small gift box to another person, both sets of hands gripping the box correctly

Evaluator (Qwen 3.5 122B)

Upon close examination, the image shows four hands interacting with a gift box. The two hands on the left (receiving) appear anatomically correct with 5 fingers each. However, the two hands on the right (giving) show anomalies: the top right hand has only 4 visible fingers, and the bottom right hand’s thumb and index finger appear fused or malformed. Grip mechanics are also unnatural. These are clear anatomical errors. VERDICT: FAIL

Multi-Subject 100%

Multi-Subject measures whether local/krea-2-turbo keeps multiple people distinct while preserving their requested attributes, outfits, and positions.

Easy 100%

PASS

Two women standing side by side: one with short blonde hair wearing a red jacket, the other with long black hair wearing a blue dress

Evaluator (Qwen 3 VL)

Left woman: short blonde hair, wearing red jacket over black top. Right woman: long black hair, wearing blue sleeveless dress. Both visually distinct and match prompt descriptions precisely. VERDICT: PASS

PASS

Two men at a cafe: one with a beard and glasses wearing a green sweater, the other clean-shaven in a black hoodie

Evaluator (Qwen 3 VL)

Left man: beard, glasses, green sweater — matches description. Right man: clean-shaven, black hoodie — matches description. Both are visually distinct and clearly identifiable. All prompt features are accurately rendered. VERDICT: PASS

PASS

Two friends posing: one tall with curly red hair in denim overalls, the other short with straight brown hair in a yellow sundress

Evaluator (Qwen 3 VL)

Tall person: curly red hair, denim overalls over striped shirt. Short person: straight brown hair, yellow sundress. Both visually distinct and match descriptions. Features are accurately rendered. VERDICT: PASS

Medium 100%

PASS

Four people at a dinner table: a bald man in a suit, a woman with silver hair in a red blouse, a young man with dreadlocks in a denim jacket, and a woman with a hijab in a green dress

Evaluator (Qwen 3 VL)

All four individuals are present and visually distinct, matching the prompt: bald man in suit, silver-haired woman in red blouse, young man with dreadlocks in denim jacket, woman in hijab with green dress. Features are accurately rendered. VERDICT: PASS

PASS

Three musicians on stage: a tall woman with short pink hair playing bass, a man with a beard and bandana on drums, and a slim man with glasses playing keyboard

Evaluator (Qwen 3 VL)

Tall woman: pink short hair, playing bass. Man: beard, bandana, on drums. Slim man: glasses, playing keyboard. All three are visually distinct and match descriptions precisely. VERDICT: PASS

PASS

Three hikers on a trail: a man with a backpack and sunhat, a woman with braided hair and hiking poles, and a teenager with a baseball cap and water bottle

Evaluator (Qwen 3 VL)

Man: wears sunhat and backpack. Woman: has braided hair, holds hiking poles, wears backpack. Teenager: wears baseball cap, holds water bottle, wears backpack. All three are visually distinct and match described features. VERDICT: PASS
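
The Human realism figure of 79% can be checked against the four subcategories above. Counting the PASS/FAIL labels gives 11/12 for Faces & Expressions, 11/12 for Full Body, 5/12 for Hands, and 6/6 for Multi-Subject. The sketch below assumes the category score pools all 42 prompts. That pooling rule is an inference from the numbers, not something the page states, and the variable names are illustrative.

```python
# (passes, total) per subcategory, counted from the PASS/FAIL labels above.
counts = {
    "Faces & Expressions": (11, 12),
    "Full Body": (11, 12),
    "Hands": (5, 12),
    "Multi-Subject": (6, 6),
}


def percent(passes, total):
    """Whole-number percentage of passes out of total."""
    return round(100 * passes / total)


per_subcategory = {name: percent(p, t) for name, (p, t) in counts.items()}

# Assumed rule: pool every prompt in the category (33 passes out of 42).
pooled = percent(
    sum(p for p, _ in counts.values()),
    sum(t for _, t in counts.values()),
)

# Alternative rule for comparison: plain mean of the subcategory percentages.
unweighted_mean = sum(per_subcategory.values()) / len(per_subcategory)

print(per_subcategory, pooled, unweighted_mean)
```

Pooling reproduces the displayed 79%, while the plain mean of the four subcategory percentages comes to 81.5. The smaller Multi-Subject set (6 prompts) therefore appears to carry less weight than the 12-prompt subcategories.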

Truthfulness 70%

Photorealism 100%

Photorealism measures whether local/krea-2-turbo makes materials, surfaces, and real-world objects look visually plausible and distinct.

Medium 100%

PASS

A still life on a wooden table featuring a clear glass of water, a polished silver spoon, a folded linen napkin, and a ripe red apple — each material rendered with accurate texture and light behavior

Evaluator (Qwen 3.6 27B)

The image demonstrates high-fidelity rendering of all requested materials. The glass exhibits accurate refraction and transparency, distorting the background appropriately. The metal spoon shows sharp, realistic specular reflections. The wooden table has a convincing grain and matte finish, while the linen napkin displays a soft, woven texture with natural folds. The apple’s skin reflects light with a subtle sheen, indicating ripeness. All materials behave correctly under the lighting conditions

PASS

A workspace desk with a brushed aluminum laptop, a leather-bound notebook, a ceramic coffee mug, and a glass paperweight — all materials visually distinct and realistic

Evaluator (Qwen 3.6 27B)

The image successfully renders the requested materials with high fidelity. The laptop exhibits a realistic brushed aluminum finish with appropriate matte reflections. The notebook displays a convincing, textured leather surface with natural creases. The ceramic mug shows a speckled, matte glaze, while the glass paperweight demonstrates accurate refraction and transparency. Each material is visually distinct and physically plausible. VERDICT: PASS

PASS

A kitchen counter with a stainless steel pot, a wooden cutting board, a glass bottle of olive oil, and a cotton dish towel — each surface reflecting light correctly

Evaluator (Qwen 3.6 27B)

The image accurately renders the distinct material properties requested. The stainless steel pot shows sharp, distorted reflections typical of polished metal. The wooden cutting board displays natural grain and diffuse light scattering. The glass bottle exhibits correct translucency and refraction of the oil inside, along with specular highlights on the surface. The cotton towel presents a soft, matte texture with appropriate light absorption. All materials behave physically correctly under the

Physics & Reflections 83%

Physics & Reflections measures whether local/krea-2-turbo follows basic physical cues such as shadows, reflections, gravity, transparency, and contact points.

Easy 67%

PASS

A red sphere on a white surface with a single shadow cast to the right, consistent with a light source from the upper left

Evaluator (Qwen 3.6 27B)

The image displays a red sphere on a white surface. The lighting originates from the upper left, evidenced by the bright specular highlight on the sphere's upper-left quadrant. Consequently, a soft shadow is cast to the right and slightly downward, which is physically consistent with the light source direction. The shadow's shape and gradient are natural, and the reflection on the sphere aligns with the lighting setup. VERDICT: PASS

PASS

A coffee mug on a sunny windowsill casting a sharp shadow to the left

Evaluator (Qwen 3.6 27B)

The shadow is cast to the left, consistent with light coming from the right (the window). The shadow’s shape accurately mirrors the mug and handle, with appropriate softness at the edges due to the light source’s size. The reflection in the window glass is faint but plausible, showing a reversed image of the mug. No distortions or color anomalies are present. All elements align with real-world physics. VERDICT: PASS

FAIL

A person standing outdoors at midday with a short shadow directly below them

Evaluator (Qwen 3.6 27B)

The shadow is cast to the right and slightly behind the subject, which contradicts the prompt's requirement for a shadow "directly below them" (indicating overhead midday sun). Additionally, the shadow is quite long and distinct, suggesting a lower sun angle rather than midday. The shadow’s shape also appears somewhat distorted and disconnected from the feet, lacking natural grounding. These inconsistencies make the shadow physically implausible for the described lighting condition. VERDICT: FA

Medium 100%

PASS

A woman standing in front of a full-length mirror, her reflection accurately showing the back of her outfit and reversed orientation

Evaluator (Qwen 3.6 27B)

The reflection in the mirror is physically plausible. It correctly shows the front of the woman's outfit (which is the back of her actual body facing the mirror), maintaining proper orientation and perspective. The lighting and shadows are consistent, with no distortions or color mismatches. The reflection accurately mirrors the scene, adhering to the laws of physics for reflections. VERDICT: PASS

PASS

A cat sitting on a polished marble floor with its reflection visible below

Evaluator (Qwen 3.6 27B)

The reflection is vertically inverted and positioned directly beneath the cat, consistent with a polished floor. The orientation matches the subject’s pose, and the slight blurring and reduced contrast in the reflection are physically accurate for such a surface. Colors and lighting align with the real cat, and no directional inconsistencies are present. The marble texture and sheen support the plausibility of the reflection. VERDICT: PASS

PASS

A vase of flowers on a glass table with a clear reflection visible on the table surface

Evaluator (Qwen 3.6 27B)

The reflection is physically plausible. It is correctly oriented upside down, directly beneath the vase, and exhibits appropriate distortion and slight blurring consistent with a glass surface. The colors and details of the flowers and stems are mirrored accurately, and the lighting conditions support the presence of such a reflection. The shadows cast by the vase and flowers are also consistent with the light source direction. VERDICT: PASS

Hard 67%

PASS

A row of trees reflected in a calm lake at sunset, with the reflection slightly darker and gently rippled

Evaluator (Qwen 3.6 27B)

The reflections are vertically inverted and aligned with the tree trunks, which is physically correct. The ripples distort the reflection appropriately for a calm but not perfectly still surface. The reflection appears slightly darker than the actual trees, consistent with light absorption in water. The warm sunset tones are preserved in the reflection, matching the sky’s gradient. No directional or orientation errors are present. VERDICT: PASS

FAIL

A city skyline reflected in a rain puddle on a street, with mild distortion and color shift

Evaluator (Qwen 3.6 27B)

The image depicts a city skyline reflected in a puddle, but the reflection is inverted vertically relative to the scene (the sky is at the bottom, buildings point down). This is physically impossible; a reflection in a horizontal surface like a puddle should mirror the scene upright relative to the horizon line, not flip it upside down. The orientation is fundamentally incorrect. VERDICT: FAIL

PASS

A mountain range reflected in a still alpine lake, with the reflection inverted and subtly muted in color

Evaluator (Qwen 3.6 27B)

The reflection is physically plausible. It is correctly inverted vertically, matching the mountain range’s orientation. The colors are subtly muted and darker in the water, consistent with light absorption and scattering in real alpine lakes. Minor ripples cause slight distortion, which is realistic for a still but not perfectly glassy surface. Shadows align with the lighting direction on the mountains. No directional or color inconsistencies are present. VERDICT: PASS

Extreme 100%

PASS

A clear glass sphere sitting on a mirror surface near a pool of water, showing refraction through the sphere, reflection on the mirror, and water reflection simultaneously

Evaluator (Qwen 3.6 27B)

The image demonstrates high physical plausibility. The glass sphere correctly refracts the background scene (inverting it), while the mirror surface provides a sharp, accurate reflection of the sphere itself. The lighting is consistent, with specular highlights matching the light source direction on both the sphere and its reflection. The water in the background shows appropriate bokeh and reflections. The interaction between the sphere, mirror, and environment is coherent and realistic. VERDIC

PASS

A chrome sculpture in a room with a glass floor and large windows, showing multiple reflections and environmental mapping on every surface

Evaluator (Qwen 3.6 27B)

The reflections are physically plausible. The chrome sculpture accurately mirrors the surrounding environment, including the windows and the mirrored wall, with appropriate distortion based on its curved surface. The glass floor creates a clear, inverted reflection of the sculpture and the room, consistent with the light sources from the windows. The shadows and highlights align with the natural lighting coming from the left, enhancing the realism of the scene. VERDICT: PASS

PASS

A crystal wine glass on a polished marble counter next to a window overlooking a lake — showing refraction in the glass, reflection on the marble, and the lake's own reflective surface

Evaluator (Qwen 3.6 27B)

The reflections and refractions are physically plausible. The wine glass shows realistic distortion of the background through its curved, faceted surface. The reflection on the polished marble is sharp and correctly oriented, mirroring the glass’s base and stem. The lake outside displays natural water reflections with appropriate softness and color blending. Shadows align with the light source from the window. All elements maintain consistent direction, orientation, and optical behavior. VERDIC

World Knowledge 50%

World Knowledge measures whether local/krea-2-turbo depicts recognizable places, landmarks, flags, and known objects with accurate real-world details.

Easy 100%

PASS

The Eiffel Tower in Paris on a clear day, architecturally accurate with its iron lattice structure and three levels

Evaluator (Qwen 3.6 27B)

The image depicts the Eiffel Tower with high architectural accuracy. The iconic iron lattice structure, the four arched legs, and the three distinct observation levels are clearly visible and correctly proportioned. The perspective is a low-angle shot looking up, which is common for this landmark. The background shows a clear blue sky and distant buildings consistent with the Parisian skyline (likely the Palais de Chaillot). There are no obvious AI artifacts or structural distortions. VERDICT:

PASS

The Taj Mahal in Agra seen from the reflecting pool, with accurate dome, minarets, and symmetry

Evaluator (Qwen 3.6 27B)

The image depicts the Taj Mahal with high architectural accuracy, including the central dome, four minarets, and symmetrical layout. The reflecting pool is correctly positioned in the foreground, enhancing the symmetry. The surrounding gardens and cypress trees are also accurately represented. The lighting and shadows appear natural, contributing to the realism of the scene. Overall, the image meets the criteria for recognizability and factual accuracy. VERDICT: PASS

PASS

The Statue of Liberty on her island pedestal, with correct torch, crown, tablet, and green patina

Evaluator (Qwen 3.6 27B)

The Statue of Liberty is clearly recognizable and accurately depicted with correct proportions, green patina, torch, crown, and tablet. The pedestal and base architecture align with real-world structure. Minor crowd details at the base are plausible. No significant factual or architectural inaccuracies detected. VERDICT: PASS

Medium 33%

FAIL

A traditional Japanese shamisen instrument with accurate three-stringed design, wooden body, and bachi plectrum

Evaluator (Qwen 3.6 27B)

The image depicts a shamisen with a bachi, but the instrument is structurally inaccurate. The body resembles a banjo with a drumhead and rim rather than the traditional wooden or lacquered body of a shamisen. The bridge is a simple bar instead of the traditional curved "kama," and the strings appear to have four tuning pegs rather than the standard three. These deviations make it factually inaccurate. VERDICT: FAIL

FAIL

A medieval astrolabe with accurate rete, tympan, and engraved degree markings on brass

Evaluator (Qwen 3.6 27B)

The image depicts a highly detailed brass astrolabe with recognizable components like the rete, tympan, and alidade. However, the engraved text and numbers are nonsensical gibberish, a common artifact in AI-generated images. While the general form is correct, the lack of accurate, legible inscriptions and degree markings makes it factually inaccurate as a functional scientific instrument. VERDICT: FAIL

PASS

A Japanese kintsugi bowl — cracked ceramic repaired with visible gold lacquer lines

Evaluator (Qwen 3.6 27B)

The image depicts a ceramic bowl featuring the distinct aesthetic of kintsugi, with visible cracks repaired using gold lacquer. The bowl displays a traditional Japanese style with blue floral patterns and crackle glaze. The gold lines accurately represent the repair technique described in the prompt. The subject is clearly recognizable and factually accurate to the description of a Japanese kintsugi bowl. VERDICT: PASS

Hard 0%

FAIL

A detailed diagram of the water cycle showing evaporation from the ocean, condensation forming clouds, precipitation as rain, and collection in rivers flowing back to the sea

Evaluator (Qwen 3.6 27B)

The image depicts a stylized, cartoon-like illustration rather than a detailed scientific diagram. While it shows the water cycle components (evaporation, condensation, precipitation, collection), the artistic style is inconsistent with a "detailed diagram." Furthermore, the visual logic is flawed: the "evaporation" is depicted as a giant splash of bubbles shooting directly into a cloud, which is not factually accurate. The river flows into a waterfall that seems to disappear into the ground rat

FAIL

A cross-section showing tectonic plates at a convergent boundary with subduction, a trench, volcanic arc, and layered mantle

Evaluator (Qwen 3.6 27B)

The image fails to depict a convergent boundary with subduction. Instead, it shows a divergent boundary or rift zone where plates are pulling apart, evidenced by the central gap and upward magma flow. There is no subducting plate, trench, or volcanic arc as requested. The "layered mantle" is stylized with rainbow colors and textures that resemble clay or cake rather than geological strata. The core subject matter is fundamentally incorrect. VERDICT: FAIL

FAIL

An anatomical illustration of the human heart showing four chambers, valves, aorta, pulmonary arteries, and veins with correct placement

Evaluator (Qwen 3.6 27B)

The image depicts a human heart with recognizable major structures like the aorta, pulmonary arteries, and chambers. However, it contains significant anatomical inaccuracies. The coronary arteries are depicted as massive, thick blue and red vessels running superficially across the entire ventricular surface, which is incorrect; they are much smaller and follow specific grooves. Additionally, the internal valve structures appear simplified and somewhat distorted, and the spatial relationship of t

Extreme 67%

FAIL

A Morpho menelaus butterfly with accurate iridescent blue dorsal wings, brown ventral wing pattern with eyespots, and correct body morphology

Evaluator (Qwen 3.6 27B)

The image depicts a butterfly with the characteristic iridescent blue dorsal wings and brown ventral wings with eyespots of the Morpho genus. However, the subject is anatomically impossible: it displays both the dorsal and ventral wing surfaces simultaneously in a flat, 2D projection. Real butterflies cannot show both sides at once; this is a composite or AI hallucination of anatomy. Therefore, it fails the "real-world subject" and "correct body morphology" criteria. VERDICT: FAIL

PASS

The Hagia Sophia interior showing the massive central dome, pendentives, semi-domes, and Islamic calligraphy medallions alongside Byzantine mosaics

Evaluator (Qwen 3.6 27B)

The image accurately depicts the Hagia Sophia’s interior, including the massive central dome, pendentives with angelic figures, semi-domes, and large Islamic calligraphy medallions. The architectural layout, lighting, and decorative elements align with known photographs of the site. Minor stylization in mosaic patterns does not detract from factual accuracy. VERDICT: PASS

PASS

A specimen of bismuth crystal showing accurate iridescent oxide layers, geometric hopper crystal formation, and characteristic staircase structure

Evaluator (Qwen 3.6 27B)

The image depicts a bismuth crystal specimen with high fidelity. It accurately captures the characteristic geometric hopper crystal formation, displaying the distinct staircase-like terraces and sharp, angular edges typical of bismuth. The iridescent oxide layers are rendered with realistic coloration, showing the expected spectrum of yellows, blues, and purples caused by thin-film interference. The metallic luster and rough matrix base are also consistent with real-world samples. VERDICT: PASS
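
The 50% World Knowledge score is consistent with the tier results listed above, assuming the category score is simply the share of PASS verdicts pooled across tiers. The page does not state its aggregation rule, so that is an assumption. A minimal Python sketch of the roll-up, with hypothetical names:

```python
# Hypothetical roll-up: (passes, evaluations) per tier, counted from the
# PASS/FAIL labels in this section. Assumes score = passes / evaluations,
# rounded to a whole percent.
world_knowledge = {
    "Easy": (3, 3),
    "Medium": (1, 3),
    "Hard": (0, 3),
    "Extreme": (2, 3),
}


def pass_rate(passed: int, total: int) -> int:
    """Return the pass rate as a whole-number percentage."""
    return round(100 * passed / total)


def category_rate(tiers: dict) -> int:
    """Pool all tiers and return the overall pass rate in percent."""
    passed = sum(p for p, _ in tiers.values())
    total = sum(t for _, t in tiers.values())
    return pass_rate(passed, total)


tier_rates = {name: pass_rate(p, t) for name, (p, t) in world_knowledge.items()}
print(tier_rates)
print(category_rate(world_knowledge))
```

Pooling verdicts and averaging the tier percentages give the same figure here only because every tier has the same number of evaluations.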

Professional Studio 82%

Camera & Lighting 83%

Camera & Lighting measures whether local/krea-2-turbo follows specific lens, angle, focus, exposure, and lighting instructions.

Easy 100%

PASS

A wide-angle shot of a modern living room interior showing expansive space with slight barrel distortion at the edges

Evaluator (Gemma 4 26B)

The image successfully implements all requested elements. It features a wide-angle perspective that captures the expansive living room. There is a clear and noticeable barrel distortion at the edges (visible in the darkening and curving of the corners), which matches the prompt perfectly. The lighting is layered, using both natural light from the window and warm artificial light from the recessed ceiling and chandelier to create a modern interior feel. VERDICT: PASS

PASS

A close-up macro shot of a single dewdrop on a rose petal, shallow depth of field

Evaluator (Gemma 4 26B)

The image features a close-up macro shot of a rose petal with multiple water droplets. The perspective is a macro view, and the depth of field is extremely shallow, with the central droplet in sharp focus and the surrounding areas heavily blurred (bokeh). The lighting is bright and naturalistic, highlighting the translucency of the droplets. All elements requested in the prompt are accurately represented. VERDICT: PASS

PASS

An aerial bird's-eye view looking straight down at a traffic roundabout with cars and trees

Evaluator (Gemma 4 26B)

The image perfectly matches the prompt's requirements. It features a true aerial bird's-eye view, looking straight down at a traffic roundabout. The perspective is top-down, with a deep depth of field that keeps the entire scene—from the center greenery to the surrounding roads and trees—in sharp focus. The lighting is even and consistent across the scene, typical of a high-altitude overhead shot. VERDICT: PASS

Medium 100%

PASS

A portrait lit with classic Rembrandt lighting — triangle of light on the shadowed cheek, moody atmosphere

Evaluator (Gemma 4 26B)

The image successfully implements the requested lighting. There is a clear light source from the side, creating a deep shadow on one side of the face. Crucially, a distinct triangle of light is visible on the shadowed cheek, which is the hallmark of Rembrandt lighting. The atmosphere is moody and dark, and the depth of field is appropriate for a portrait. VERDICT: PASS

PASS

A landscape photograph bathed in golden hour light — long warm shadows, orange sky, soft glow

Evaluator (Gemma 4 26B)

The image perfectly matches the prompt's requirements. The lighting is a classic golden hour scene, featuring a bright, low sun that creates a warm orange sky and a soft glow across the landscape. The long, dark shadows cast by the trees across the field are highly prominent and consistent with a low sun angle. The perspective and depth of field are appropriate for a landscape photograph. VERDICT: PASS

PASS

A high-key portrait of a person against a bright white background, minimal shadows, airy and luminous

Evaluator (Gemma 4 26B)

The image perfectly matches the prompt. It features a high-key portrait with a bright, white, seamless background. The lighting is soft, even, and airy, creating a luminous effect with minimal shadows on the subject's face. The perspective is a standard eye-level portrait, and the depth of field is shallow, keeping the focus entirely on the person's facial features. VERDICT: PASS

Hard 100%

PASS

A portrait shot at 85mm f/1.4 with creamy bokeh background, backlit with warm rim light separating the subject from a blurred garden

Evaluator (Gemma 4 26B)

The image perfectly matches the prompt's technical requirements. The depth of field is shallow, creating a creamy bokeh background with large, soft light orbs. The lighting features a distinct warm rim light (backlighting) that creates a glowing outline around the subject's hair and shoulders, effectively separating her from the blurred garden background. The perspective is a classic 85mm portrait shot. VERDICT: PASS

PASS

A street photograph with shallow depth of field as if shot at 50mm f/1.8, subject sharp with soft city lights in the background

Evaluator (Gemma 4 26B)

The image successfully implements the requested technical specifications. The subject is sharp and in focus, while the background features a shallow depth of field consistent with a 50mm f/1.8 lens. The background displays soft, circular bokeh from city lights (bokeh effect), and the lighting pattern accurately reflects a street scene at dusk or night with various light sources creating a pleasing depth. VERDICT: PASS

PASS

A food photograph shot from above at 35mm f/2.8 with natural window light from the left, slight vignette, warm color temperature

Evaluator (Gemma 4 26B)

The image matches all the specified criteria. The perspective is a top-down shot (shot from above). The depth of field is shallow, consistent with an f/2.8 aperture, as the edges of the bowl and the wooden surface are slightly out of focus. The lighting clearly originates from the top-left (window side), creating a natural light pattern with a warm color temperature and a subtle vignette effect. VERDICT: PASS

Extreme 33%

FAIL

A tilt-shift photograph of a city street making real-life look like a miniature model — selective focus strip, saturated colors

Evaluator (Gemma 4 26B)

The image fails to capture the "tilt-shift" effect described. A true tilt-shift photograph uses a very shallow depth of field to create a selective focus strip, making the scene look like a miniature model. In this image, the focus is relatively deep; the foreground (crosswalk/pedestrians) and the mid-ground (cars) are both sharp, and there is no characteristic blur at the top and bottom of the frame to simulate a miniature scale. VERDICT: FAIL

FAIL

A cinematic frame with blue and orange anamorphic lens flare streaking horizontally across a dark moody scene

Evaluator (Gemma 4 26B)

The image features a dark, moody scene with a cinematic feel. It successfully incorporates the requested blue and orange light flares. However, the flares are not strictly "anamorphic lens flares streaking horizontally." Instead, the light patterns are chaotic, featuring many diagonal rays, starburst effects, and non-horizontal streaks that obscure the subject's face. While the colors are correct, the specific horizontal anamorphic characteristic is not accurately captured. VERDICT: FAIL

PASS

A portrait with dramatic split lighting — half the face brightly lit, the other half in deep shadow, high contrast chiaroscuro

Evaluator (Gemma 4 26B)

The image perfectly matches the prompt's requirements. The lighting is a textbook example of dramatic split lighting: the left side of the subject's face is brightly illuminated, while the right side is cast in deep, high-contrast shadow. This creates a sharp vertical division down the center of the face, achieving the requested chiaroscuro effect. The perspective is a standard portrait, and the lighting pattern is executed precisely. VERDICT: PASS

Color Precision 92%

Color Precision measures whether local/krea-2-turbo renders requested colors clearly and consistently across subjects, products, and backgrounds.

Easy 100%

PASS

A bright red sports car parked on a gray asphalt road — the red is vivid and unambiguous

Evaluator (Gemma 4 26B)

The image features a bright, vivid red sports car. The red color is intense, saturated, and unambiguous, exactly as described. The car is parked on a gray asphalt road, which is also clearly visible. The colors match the prompt's specific requirements perfectly. VERDICT: PASS

PASS

A woman wearing a royal blue evening dress against a neutral background

Evaluator (Gemma 4 26B)

The image features a woman wearing a deep, vibrant blue dress that accurately matches the description of "royal blue." The background is a solid, muted beige/tan color, which qualifies as a "neutral background." All color requirements specified in the prompt are met. VERDICT: PASS

PASS

A field of pure sunflower yellow flowers under a clear blue sky

Evaluator (Gemma 4 26B)

The image perfectly matches the color descriptions provided in the prompt. The sunflowers exhibit a vibrant, "pure sunflower yellow" color, and the background features a "clear blue sky." There are no deviations from the specified color palette. VERDICT: PASS

Medium 100%

PASS

A bedroom wall painted in #FF6B35 (burnt orange) with white furniture and natural daylight

Evaluator (Gemma 4 26B)

The image features a bedroom wall painted in a vibrant burnt orange color, which closely aligns with the hex code #FF6B35 and the description "burnt orange." The furniture (bedding and lamp) is white, and the scene is illuminated by natural daylight from a nearby window. All color requirements specified in the prompt are accurately represented. VERDICT: PASS

PASS

A forest scene where the foliage is precisely #2D5F2D (dark muted green) with brown tree trunks

Evaluator (Gemma 4 26B)

The foliage in the image consists of various shades of green, ranging from light olive to a darker forest green. The hex code #2D5F2D represents a specific, deep, dark muted green. While the image contains dark green tones, it also includes much lighter, yellowish-green hues that deviate from the strict color requirement. However, the core instruction of "dark muted green" is generally represented in the darker areas of the canopy and ground. VERDICT: PASS
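
One way to make a hex-color judgment like this less subjective is to compare sampled pixel colors against the target numerically. The sketch below is an illustration only, not part of the benchmark's method. It parses a hex code and measures plain Euclidean distance in RGB; the function names are hypothetical and the two sample colors are made-up stand-ins for pixels picked from an image.

```python
import math


def hex_to_rgb(code: str) -> tuple:
    """Parse a '#RRGGBB' string into an (R, G, B) tuple of 0-255 ints."""
    code = code.lstrip("#")
    if len(code) != 6:
        raise ValueError(f"expected 6 hex digits, got {code!r}")
    return tuple(int(code[i:i + 2], 16) for i in (0, 2, 4))


def rgb_distance(a, b) -> float:
    """Euclidean distance between two RGB triples (0 means identical)."""
    return math.dist(a, b)


target = hex_to_rgb("#2D5F2D")   # the prompt's dark muted green
dark_sample = (40, 90, 50)       # hypothetical pixel from a shaded area
light_sample = (150, 170, 60)    # hypothetical yellowish-green pixel

print(target)
print(rgb_distance(target, dark_sample) < rgb_distance(target, light_sample))
```

RGB distance is only a rough proxy. A perceptual color space such as CIELAB would track human judgment more closely.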

PASS

A room with walls in #4A90D9 (medium blue) with warm wooden floor and white trim

Evaluator (Gemma 4 26B)

The image features walls in a medium blue color that closely matches the description of #4A90D9. The floor is a warm, natural wood tone, and the window trim and baseboards are white, as specified. All color requirements from the prompt are accurately represented in the generated image. VERDICT: PASS

Hard 67%

PASS

A minimalist room with walls in #F5E6CC (warm cream), a sofa in #2C3E50 (dark navy), and accent pillows in #E74C3C (bright red)

Evaluator (Gemma 4 26B)

The image accurately reflects the requested colors. The walls are a warm cream/beige (#F5E6CC), the sofa is a dark navy blue (#2C3E50), and the accent pillows are a bright red (#E74C3C). All color requirements from the prompt are met precisely. VERDICT: PASS

FAIL

A product flat-lay on a #1A1A2E (deep dark blue) background, with a #F0C27F (soft gold) watch and #FFFFFF (pure white) earbuds

Evaluator (Gemma 4 26B)

The background is a deep dark blue, consistent with #1A1A2E. The earbuds are pure white (#FFFFFF). However, the watch is not a "soft gold" (#F0C27F); instead, it is a pale, yellowish-beige or tan color. Furthermore, the AI incorrectly rendered the hex code "#F0C27F" as text directly onto the watch face, which is a failure to follow the color instruction and a hallucination of the text. VERDICT: FAIL

PASS

A graphic design composition with background #FAFAFA (near-white), main shape #FF4500 (orange-red), and accent lines #333333 (dark charcoal)

Evaluator (Gemma 4 26B)

The image accurately reflects the requested color palette. The background is a near-white color consistent with #FAFAFA. The large main shape is a vibrant orange-red, matching the #FF4500 specification. The text and the horizontal accent line are a dark charcoal color, consistent with #333333. All specified hex values and color descriptions are correctly represented in the composition. VERDICT: PASS

Extreme 100%

PASS

A smooth gradient on a wall going from #FF0000 (red) on the left to #0000FF (blue) on the right, transitioning evenly through purple

Evaluator (Gemma 4 26B)

The image shows a smooth gradient transitioning from a bright red on the left to a deep blue on the right. The middle section features a purple transition, as requested. The colors align with the specified hex values: the left side is a pure red (#FF0000) and the right side is a pure blue (#0000FF), with an even transition through purple. VERDICT: PASS
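
For reference, an even red-to-blue transition through purple corresponds to linear interpolation between the two endpoint colors. The sketch below assumes straight per-channel interpolation on the 0-255 RGB values (the prompt does not name a color space) and uses hypothetical helper names.

```python
def lerp_color(start, end, t):
    """Linearly interpolate each RGB channel; t=0 gives start, t=1 gives end."""
    return tuple(round(s + (e - s) * t) for s, e in zip(start, end))


def gradient(start, end, steps):
    """Return `steps` evenly spaced colors from start to end, inclusive."""
    if steps < 2:
        raise ValueError("need at least two steps")
    return [lerp_color(start, end, i / (steps - 1)) for i in range(steps)]


red = (255, 0, 0)    # #FF0000
blue = (0, 0, 255)   # #0000FF

# Print five reference stops across the banner, left to right.
for color in gradient(red, blue, 5):
    print("#{:02X}{:02X}{:02X}".format(*color))
```

The middle stop has equal red and blue and no green, which is the purple the evaluator describes.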

PASS

A sunset sky gradient on a flat surface from #FF6B35 (orange) at the bottom to #1B1464 (deep indigo) at the top

Evaluator (Gemma 4 26B)

The image displays a vertical gradient. The bottom color is a vibrant orange, which aligns well with the specified #FF6B35 (orange). The top color is a very dark, deep indigo/navy, which matches the description of #1B1464 (deep indigo). The transition between the two colors is smooth and follows the requested color scheme accurately. VERDICT: PASS

PASS

A horizontal banner with a gradient from #00C9FF (cyan) on the left to #92FE9D (mint green) on the right

Evaluator (Gemma 4 26B)

The image shows a horizontal banner with a gradient. On the left side, the color is a bright cyan, which matches the hex code #00C9FF. On the right side, the color is a light mint green, which matches the hex code #92FE9D. The gradient transitions smoothly between these two specific colors as requested. VERDICT: PASS

Photorealism 33%

Photorealism measures whether local/krea-2-turbo produces polished product and studio imagery with realistic materials, lighting, shadows, and finish.

Easy 33%

FAIL

A single red sneaker on a clean white background with soft studio lighting and gentle shadow — product photography style

Evaluator (Gemma 4 26B)

The image fails to pass as a real photograph due to several glaring AI artifacts. The laces are physically impossible, merging into the shoe's upper and lacking coherent structure. The eyelets are distorted and uneven. Additionally, the texture of the leather is inconsistent, and the shadow is unnaturally dark and sharp for "soft studio lighting." The overall geometry of the shoe is warped and nonsensical upon close inspection. VERDICT: FAIL

FAIL

A clear glass perfume bottle on a reflective black surface with studio lighting, no background distractions

Evaluator (Gemma 4 26B)

The image fails to meet the prompt's requirements. The background is a light gray/white gradient, not a studio setting that matches the "reflective black surface" context effectively. Most importantly, the glass and liquid physics are highly flawed: the straw/dip tube is nonsensical, bending through the liquid in an impossible way, and the internal reflections and refraction are inconsistent and distorted. It does not look like a real photograph. VERDICT: FAIL

PASS

A ripe green apple on a seamless white background with soft side lighting and subtle shadow — catalog style

Evaluator (Gemma 4 26B)

The image successfully captures the requested studio aesthetic. The green apple features realistic textures, including natural-looking lenticels and a subtle sheen. The lighting is soft, creating a gentle highlight on the side and a realistic, soft shadow on the white surface. There are no obvious AI artifacts, such as warped geometry or unnatural textures. It looks like a high-quality, professional catalog photograph. VERDICT: PASS

Graphical design 54%

Data Visualisation 0%

Data Visualisation measures whether local/krea-2-turbo produces charts with correct labels, proportions, colors, and readable values.

Easy 0%

FAIL

A bar chart showing monthly sales for January through June with values 12K, 18K, 15K, 22K, 19K, 25K — labeled axes, color-coded bars, and a title 'Monthly Sales 2025'

Evaluator (Qwen 3 VL)

The chart shows a bar chart titled "Monthly Sales 2025" with color-coded bars. Values 12K, 18K, 22K, 19K, 25K are visible, but the x-axis labels are incorrect (shows "18K", "15K", "June" twice) and the y-axis lacks proper scale labeling. The chart type is correct, but critical labeling errors invalidate the visualization. VERDICT: FAIL
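
Labeling errors of the kind noted above can be checked mechanically once the axis labels have been read off the image. The sketch below is an illustration under stated assumptions, not the benchmark's procedure: `check_axis_labels` is a hypothetical helper, and `found_labels` is a made-up reading loosely modeled on the evaluator note, not data extracted from the actual image.

```python
from collections import Counter


def check_axis_labels(expected, found):
    """Compare expected category labels with labels read off a chart.

    Returns the labels that are missing, unexpected, or duplicated.
    """
    counts = Counter(found)
    return {
        "missing": [label for label in expected if label not in counts],
        "unexpected": sorted(label for label in counts if label not in expected),
        "duplicated": sorted(label for label, n in counts.items() if n > 1),
    }


expected_months = ["January", "February", "March", "April", "May", "June"]
# Made-up reading: value labels on the category axis and "June" twice.
found_labels = ["January", "18K", "15K", "April", "June", "June"]

report = check_axis_labels(expected_months, found_labels)
print(report)
```

Any non-empty list in the report would correspond to a labeling failure like the one the evaluator describes.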

FAIL

A pie chart showing market share: Company A 35%, Company B 25%, Company C 20%, Others 20% — each slice a different color with labels and percentages

Evaluator (Qwen 3 VL)

The chart is a pie chart with correct percentages and labels: Company A (35%), Company B (25%), Company C (20%), Others (20%). However, "Company B" is incorrectly labeled twice — once as 25% and once as 20%. The visual elements are readable and distinct. The chart type matches the request. VERDICT: FAIL

FAIL

A simple line graph showing daily website visitors over one week (Mon-Sun): 1200, 1500, 1100, 1800, 2000, 1700, 1400 — with grid lines, axis labels, and title 'Weekly Traffic'

Evaluator (Qwen 3 VL)

The chart shows a line graph titled "Weekly Traffic" with axes labeled "Mon-Sun" and "12on-S000". Data points match the requested values (1200, 1500, 1100, 1800, 2000, 1700, 1400), though the Y-axis label is garbled. Grid lines are present, but axis labels are misaligned and unreadable. The chart type is correct, but critical labeling errors prevent accurate interpretation. VERDICT: FAIL

Layout & Design 33%

Layout & Design measures whether local/krea-2-turbo organizes posters, menus, covers, and other designed compositions with the requested hierarchy and elements.

Easy 100%

PASS

A movie night poster with a large popcorn image in the center, the title 'MOVIE NIGHT' in bold at the top, and the tagline 'Every Friday at 8 PM' at the bottom

Evaluator (Qwen 3 VL)

The image features a large popcorn bucket centered, with "MOVIE NIGHT" in bold at the top and "Every Friday at 8 PM" at the bottom in a banner. Text is readable and correctly placed per the prompt. Visual hierarchy is clear, with the title prominent, popcorn central, and tagline anchored below. All required layout elements are present and accurately rendered. VERDICT: PASS

PASS

An event poster featuring a guitar illustration, the headline 'LIVE MUSIC FEST' in large text, and 'Tickets available now' as a tagline below

Evaluator (Qwen 3 VL)

The image features a guitar illustration, the headline "LIVE MUSIC FEST" in large, prominent text above it, and the tagline "Tickets available now" at the bottom. Text is readable and correctly placed with clear visual hierarchy. All required layout elements are present and properly positioned. VERDICT: PASS

PASS

A book club poster with an open book illustration, the title 'JOIN THE BOOK CLUB' prominently displayed, and 'Meets every Wednesday' as the tagline

Evaluator (Qwen 3 VL)

The poster features an open book illustration centered below the title "JOIN THE BOOK CLUB," which is prominently displayed at the top in large, bold text. The tagline "Meets every Wednesday" is clearly visible at the bottom. Visual hierarchy is correct, with title > illustration > tagline. All text is readable and correctly placed per the prompt. VERDICT: PASS

Medium 0%

FAIL

A landing page design with a hero header image at the top, a headline and subtitle in the middle, a 'Sign Up' call-to-action button, and a footer with three social media icons

Evaluator (Qwen 3 VL)

The image shows a hero header, headline/subtitle, “Sign Up” button, and footer with social icons — all present with correct visual hierarchy. However, the headline and subtitle text is gibberish and unreadable, failing the text readability requirement. VERDICT: FAIL

FAIL

A newsletter layout with a banner image header, a two-column body section with text on the left and an image on the right, a highlighted quote block, and a footer with unsubscribe link

Evaluator (Qwen 3 VL)

The image shows a banner header, a two-column layout (text left, image right), a quote block, and a footer with a link. However, all text is gibberish and unreadable, violating the requirement for readable, correctly placed text. The visual hierarchy is present but meaningless due to illegible content. VERDICT: FAIL

FAIL

A restaurant menu layout with a logo header, three category sections (appetizers, mains, desserts) each with items and prices, and a footer with address and hours

Evaluator (Qwen 3 VL)

The image shows a menu with a logo header (“RESTANTERS”), three category sections (Appetizers, Mains, Desserts) with items and prices, and a footer with address/hours. However, all text is gibberish, unreadable, and incorrectly placed. The layout structure is present but the content is nonsensical and unrecognizable. VERDICT: FAIL

Hard 0%

FAIL

A magazine double-page spread with a left page containing a large hero photo, and a right page with a two-column article, a pull quote, a sidebar with author bio, and a page number

Evaluator (Qwen 3 VL)

The left page features a large hero photo as required. The right page shows a pull quote, two columns of text, and a sidebar with author bio, all with correct visual hierarchy. However, all text is gibberish and unreadable, violating the requirement for readable content. Layout structure is present but text fails functional criteria. VERDICT: FAIL

FAIL

A mobile app UI mockup showing a dashboard with a top navigation bar, a search field, a 2x2 grid of feature cards with icons, a list of recent activity items, and a bottom tab bar

Evaluator (Qwen 3 VL)

The image shows a mobile app UI with a top navigation bar (status bar + user profile), search field, 2x2 grid of feature cards with icons, “Recent Activity” list, and bottom tab bar. Text is present but heavily distorted and unreadable. Visual hierarchy is mostly correct, but illegible text violates core usability requirements. VERDICT: FAIL

FAIL

A fashion magazine spread with a full-bleed left-page photo, right page with three-column layout, headline, body copy, a smaller inset image, and styled page numbers

Evaluator (Qwen 3 VL)

The image shows a full-bleed left-page photo and a right page with three-column layout, headline, body copy, and a smaller inset image. Page numbers “206” are visible. However, all text is gibberish, unreadable, and incorrectly placed, violating the prompt’s requirement for readable, correctly positioned text. Layout structure is present but text fails core criteria. VERDICT: FAIL

Style Diversity 83%

Style Diversity measures whether local/krea-2-turbo can shift between visual styles like oil painting, pixel art, watercolor, and technical illustration.

Easy 67%

PASS

A golden retriever sitting in a garden, painted in the style of an oil painting with visible brushstrokes and rich texture

Evaluator (Qwen 3.5 122B)

The image clearly depicts a golden retriever in a garden setting. The style is unmistakably that of an oil painting, characterized by thick, visible brushstrokes (impasto effect) throughout the fur, foliage, and background. The texture is rich and painterly, with distinct dabs of color creating form and light. All key visual characteristics requested are present and well-executed. VERDICT: PASS

PASS

A golden retriever sitting in a garden, rendered in Japanese anime style with flat colors and large expressive eyes

Evaluator (Qwen 3.5 122B)

The image clearly exhibits Japanese anime style characteristics: flat color application with minimal gradients, bold black outlines defining the subject, and large expressive eyes with simple highlights. The golden retriever is depicted with stylized features including simplified fur texture and exaggerated facial expression. The garden background uses cel-shading techniques typical of anime. All requested visual elements are present and recognizable. VERDICT: PASS

FAIL

A golden retriever sitting in a garden, rendered in pixel art style with a 32x32 grid and limited color palette

Evaluator (Qwen 3.5 122B)

The image depicts a golden retriever in a garden setting, matching the subject matter. However, the requested style specifies a "32x32 grid" and "pixel art style." The image is clearly rendered as a mosaic or cross-stitch pattern with visible square tiles, not pixel art. Additionally, the grid resolution appears significantly higher than 32x32 pixels. The key visual characteristics of the requested style are absent. VERDICT: FAIL

Medium 100%

PASS

A city street scene in the style of Art Deco — geometric shapes, gold and black palette, ornamental symmetry

Evaluator (Qwen 3.5 122B)

The image strongly exhibits Art Deco characteristics: geometric building facades with stepped silhouettes, ornamental gold-and-black patterns on foreground structures, symmetrical street layout, and a luxurious metallic palette. The stylized figures, vehicles, and glowing orbs reinforce the era’s aesthetic. While not perfectly symmetrical down the center, the overall composition and decorative motifs clearly align with the requested style. VERDICT: PASS

PASS

A woman with a parasol in the style of Impressionism — soft brushstrokes, dappled light, pastel palette

Evaluator (Qwen 3.5 122B)

The image clearly exhibits Impressionist characteristics: visible, soft brushstrokes throughout the composition, dappled light effects especially on the parasol and dress, and a predominantly pastel palette with blues, pinks, and yellows. The subject matter — a woman with a parasol — aligns with classic Impressionist themes. While the rendering is somewhat stylized or AI-interpreted, the core visual hallmarks of the requested style are unmistakably present. VERDICT: PASS

PASS

A portrait of a man in Pop Art style — bold outlines, Ben-Day dots, flat vivid primary colors

Evaluator (Qwen 3.5 122B)

The image clearly exhibits the requested Pop Art style. It features bold black outlines defining the subject, flat vivid primary colors in the background (red and yellow) and clothing (blue), and the characteristic Ben-Day dot pattern used for shading on the face. All key visual characteristics specified in the prompt are present and clearly recognizable. VERDICT: PASS

Hard 67%

FAIL

A forest rendered in photorealistic watercolor — the detail and lighting of a photo combined with visible watercolor bleeds and paper texture

Evaluator (Qwen 3.5 122B)

The image clearly exhibits watercolor characteristics: visible paper texture, pigment bleeds, and translucent layering. However, it fails the "photorealistic" criterion. The rendering is stylized and illustrative rather than photo-realistic; the lighting lacks photographic fidelity, and the details are painterly rather than sharp and true-to-life. While the watercolor style is present, the core requirement of photorealism is not met. VERDICT: FAIL

PASS

A woman in a futuristic city rendered in cyberpunk Art Nouveau — neon colors and tech elements combined with flowing organic lines and floral borders

Evaluator (Qwen 3.5 122B)

The image successfully combines cyberpunk and Art Nouveau elements. The futuristic city, neon colors, and tech elements (headphones, jacket patches) fulfill the cyberpunk aspect. The flowing organic lines and prominent floral borders framing the scene clearly represent the Art Nouveau style. The fusion is visually coherent and matches the prompt’s requirements. VERDICT: PASS

PASS

A medieval castle depicted in Japanese ukiyo-e woodblock print style combined with photorealistic sky and lighting

Evaluator (Qwen 3.5 122B)

The image clearly exhibits the requested hybrid style. The castle, trees, and clouds feature the bold outlines, flat color areas, and textured patterns characteristic of Japanese ukiyo-e woodblock prints. Simultaneously, the sky displays a gradient and lighting effect that mimics a photorealistic sunset or twilight, with subtle noise/grain suggesting a photographic quality. The combination is distinct and successfully executed. VERDICT: PASS

Extreme 100%

PASS

An interior scene rendered in soft brutalism — raw concrete textures combined with rounded pastel furniture and warm diffused light

Evaluator (Qwen 3.5 122B)

The image perfectly embodies "soft brutalism." The raw, textured concrete walls and ceiling are prominent, establishing the brutalist foundation. This is contrasted by the rounded, pastel-colored furniture (pink, yellow, mint) which adds the required softness. The lighting is warm and diffused, casting gentle shadows that enhance the textures without harshness. All key visual characteristics requested are clearly present and well-executed. VERDICT: PASS

PASS

A still life painted in anxious minimalism — sparse composition with uncomfortable negative space, muted colors, and subtle visual tension

Evaluator (Qwen 3.5 122B)

The image displays a still life with a sparse composition featuring a single fruit and a dark bottle against a textured, off-white background. The color palette is muted, dominated by grays, beiges, and the dark vessel. The brushwork is visible and somewhat agitated, contributing to a sense of visual tension. The negative space is prominent and creates an uneasy, minimalist atmosphere consistent with the "anxious minimalism" description. VERDICT: PASS

PASS

A cityscape rendered in nostalgic futurism — retro-futuristic 1960s space-age aesthetics with a melancholic warm-toned patina

Evaluator (Qwen 3.5 122B)

The image successfully captures the requested nostalgic futurism style. The color palette features a distinct warm, yellowish-green patina that evokes aged film or vintage prints, matching the "melancholic warm-toned" requirement. The architecture blends recognizable modern skyscrapers with the retro-futuristic Space Needle, creating a 1960s space-age aesthetic. The overall texture and lighting effectively convey a sense of retro nostalgia. VERDICT: PASS
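
The tier percentages shown for Style Diversity appear consistent with a simple share of PASS verdicts per tier, rounded to the nearest whole percent. The sketch below recomputes them from the labels listed above. This is an assumption about how the page derives its figures, not a documented formula, and the `verdicts` dictionary and `pass_rate` helper are hypothetical names used only for illustration.

```python
# Verdict labels transcribed from the Style Diversity results above,
# grouped by difficulty tier in the order they appear on the page.
verdicts = {
    "Easy": ["PASS", "PASS", "FAIL"],
    "Medium": ["PASS", "PASS", "PASS"],
    "Hard": ["FAIL", "PASS", "PASS"],
    "Extreme": ["PASS", "PASS", "PASS"],
}


def pass_rate(labels):
    """Share of PASS labels as a whole-number percentage (assumed rounding)."""
    return round(100 * labels.count("PASS") / len(labels))


# Per-tier rates, then the rate over all twelve prompts in the subcategory.
tier_rates = {tier: pass_rate(labels) for tier, labels in verdicts.items()}
all_labels = [label for labels in verdicts.values() for label in labels]
overall = pass_rate(all_labels)

print(tier_rates)
print(overall)
```

Under this assumption the per-tier values come out to 67, 100, 67, and 100, and the subcategory value to 83 (10 of 12 prompts passing), matching the figures displayed above. Because the labels come from VLM judges, the recomputed rates inherit any judging mistakes.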