OpenAI unveils ChatGPT Images 2.0

Images are a language, not decoration. A good image does what a good sentence does—it selects, arranges, and reveals. It can explain a mechanism, stage a mood, test an idea, or make an argument.

A year ago, we released ChatGPT Images, showing that images created by AI can be both beautiful and useful. ChatGPT Images 2.0 is the next step: a state-of-the-art model that can take on complex visual tasks and produce precise, immediately usable visuals.

This model is a step change in detailed instruction following, placing and relating objects accurately, and rendering dense text, with the ability to generate across aspect ratios. Its sense of composition and visual taste means results feel less AI-generated and more intentionally designed. It’s accurate across languages and uses its expanded visual and world knowledge to fill in the gaps for you, so you get smarter images with less prompting.

To extend the model’s capabilities for the most complex tasks, Images 2.0 is our first image model with thinking capabilities. When a thinking or pro model is selected in ChatGPT, Images 2.0 can search the web for real-time information, create multiple distinct images from one prompt, and double-check its own outputs. With thinking, the model can take on even more of the heavy lifting between idea and image, especially when accuracy, up-to-date information, consistency, and visual cohesion matter most.

With both the intelligence of OpenAI’s reasoning models and a vast understanding of the visual world, this model moves image generation from rendering to strategic design, from a tool to a visual system, helping people turn ideas into outputs they can understand, share, teach with, and build from. It’s available starting today to all users in ChatGPT, Codex, and the API.

Greater precision and control

Images 2.0 brings an unprecedented level of specificity and fidelity to image creation. It not only conceptualizes more sophisticated images, it brings that vision to life: following instructions, preserving requested details, and rendering the fine-grained elements that often break image models, such as small text, iconography, UI elements, dense compositions, and subtle stylistic constraints, at up to 2K resolution in the API. Instead of getting something vaguely in the neighborhood of what you meant, you get something you can actually use.
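For API users, a request might look like the sketch below, using the existing OpenAI Python SDK `images.generate` endpoint. The model identifier `gpt-image-2` and the `2048x2048` size string are assumptions for illustration, not confirmed names; check the official API reference for the actual values.

```python
import os

# Hypothetical request parameters for Images 2.0 via the Images API.
# "gpt-image-2" is an assumed model id; "2048x2048" reflects the
# "up to 2K resolution in the API" claim, not a documented size string.
params = {
    "model": "gpt-image-2",
    "prompt": "A hand-lettered cafe menu board in warm morning light",
    "size": "2048x2048",
    "n": 1,
}

def build_request(params):
    """Sanity-check the sketch's parameters before sending."""
    width, height = map(int, params["size"].split("x"))
    assert max(width, height) <= 2048, "sketch assumes a 2K cap"
    assert params["n"] >= 1
    return params

# Only attempt a real call when credentials are present.
if os.environ.get("OPENAI_API_KEY"):
    from openai import OpenAI
    client = OpenAI()
    result = client.images.generate(**build_request(params))
```

The guard around the actual call keeps the sketch runnable without credentials; in practice you would also handle the response (base64 or URL) according to the API reference.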

Stronger across languages

To date, our image generation models have been more consistent in English and other Latin-script languages, but less precise beyond that, especially when text was complex or dense.

Images 2.0 moves beyond that barrier with stronger multilingual understanding and significant gains in non-Latin text rendering, particularly in Japanese, Korean, Chinese, Hindi, and Bengali. It can produce images with non-English text that’s not only rendered correctly but with language that flows coherently.

Stylistic sophistication and realism

Images 2.0 also shows significantly improved fidelity across a wide range of visual styles. It is better able to capture the defining characteristics of photos—including the tiny flaws that add realism—as well as cinematic stills, pixel art, manga, and other distinctive visual languages, with greater consistency in texture, lighting, composition, and fine detail.
