OpenAI takes reasoning AI to new heights with o3 and o4-mini

A strategic turning point

OpenAI has unveiled two models, o3 and o4-mini, presented as the most advanced reasoning systems the company has ever released. Breaking with the purely conversational GPT line, these models unify logic, vision and tool control in a single architecture: web search, Python execution, image generation and analysis, and file reading, all orchestrated directly by the AI without intermediate human intervention.

What changes in practice

  • Multimodal reasoning

    o3 and o4-mini now “think” with images. The AI can bring a diagram into its chain of reasoning, zoom in on a detail, or rotate a photo before responding, a decisive step towards an understanding close to human cognition.

  • Agentic tool use

    The models choose the relevant tool (code execution, web browsing, image generation), then chain calls to deliver a complete solution in less than a minute. This autonomy turns ChatGPT into a genuine execution agent.

  • Scaled-up reinforcement learning

    OpenAI says it has increased the computing power allocated to reinforcement learning (RL), allowing the model to “think longer”. Result: clear gains on almost all benchmarks, without additional latency for the end user.
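The "choose a tool, then chain calls" behaviour described above can be pictured with a minimal sketch. This is not OpenAI's implementation: the two tools, the `run_agent` function and the `{prev}` placeholder are hypothetical names introduced for illustration, and the plan is supplied by hand where a real model would decide it itself.

```python
from typing import Callable, Dict, List, Tuple

def run_python(expression: str) -> str:
    """Hypothetical code-execution tool: evaluates simple arithmetic only."""
    allowed = set("0123456789+-*/(). ")
    if not set(expression) <= allowed:
        raise ValueError("unsupported expression")
    return str(eval(expression, {"__builtins__": {}}, {}))

def web_search(query: str) -> str:
    """Hypothetical search tool: returns a canned string, no network access."""
    return f"[stub result for: {query}]"

# Registry mapping tool names to callables (assumed names, for illustration).
TOOLS: Dict[str, Callable[[str], str]] = {
    "python": run_python,
    "search": web_search,
}

def run_agent(steps: List[Tuple[str, str]]) -> List[Tuple[str, str, str]]:
    """Run each (tool, argument) step in order.

    The literal text "{prev}" in an argument is replaced by the previous
    step's result, which is what chains one call into the next.
    """
    transcript = []
    prev = ""
    for tool_name, argument in steps:
        argument = argument.replace("{prev}", prev)
        prev = TOOLS[tool_name](argument)  # KeyError if the tool is unknown
        transcript.append((tool_name, argument, prev))
    return transcript

transcript = run_agent([
    ("python", "6 * 7"),
    ("search", "meaning of {prev}"),
])
for tool_name, argument, result in transcript:
    print(f"{tool_name}({argument!r}) -> {result}")
```

The point of the sketch is the loop itself: each tool result becomes input to the next call, so a single request can span several tools without a human in between.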

Record performance

Benchmark            o1       o3       o4-mini
AIME 2025 (math)     79%      91.6%    92.7%
Codeforces Elo       1,891    2,706    2,719
MMMU (vision)        77.6%    82.9%    81.6%

These scores, taken from the official release notes, place o3 above the previous state of the art in programming, mathematics and image analysis, while o4-mini reproduces most of this performance at half the cost per token.
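To make the size of the jump easier to read, the gains over o1 can be computed directly from the figures quoted in the table. The short sketch below uses only those published numbers; the dictionary keys and the function name are illustrative choices.

```python
# Scores as quoted in the article, in the order (o1, o3, o4-mini).
SCORES = {
    "math (%)": (79.0, 91.6, 92.7),
    "programming (Elo)": (1891, 2706, 2719),
    "vision (%)": (77.6, 82.9, 81.6),
}

def gains_over_o1(scores):
    """Return {benchmark: (o3 gain, o4-mini gain)} relative to o1."""
    return {
        name: (round(o3 - o1, 1), round(o4_mini - o1, 1))
        for name, (o1, o3, o4_mini) in scores.items()
    }

for name, (o3_gain, o4_mini_gain) in gains_over_o1(SCORES).items():
    print(f"{name}: o3 +{o3_gain}, o4-mini +{o4_mini_gain}")
```

On these figures, o4-mini matches or slightly exceeds o3 on the math and programming rows and trails it only on the vision row.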

Availability and pricing models

ChatGPT Plus, Pro and Team subscribers already see o3, o4-mini and o4-mini-high in their model selector. Companies with Enterprise licenses and universities will switch over next week. On the API side, both models are accessible today, with the Responses API providing a mechanism for keeping reasoning traces around function calls.
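What "keeping reasoning traces around function calls" means in practice can be sketched offline, without calling any API. The item shapes below (`reasoning`, `function_call`, `function_call_output`, `call_id`) reflect one reading of the Responses API and should be checked against the official reference; the IDs, the `get_weather` function and the helper name `build_followup_input` are hypothetical.

```python
import json
from typing import Any, Dict, List

def build_followup_input(
    previous_output: List[Dict[str, Any]],
    tool_results: Dict[str, Any],
) -> List[Dict[str, Any]]:
    """Assemble the input items for the request that follows a function call.

    previous_output: items the model returned (reasoning + function_call).
    tool_results: maps each call_id to the value computed by our own code.
    """
    # Pass the earlier items back unchanged so reasoning stays next to its call.
    items = list(previous_output)
    for item in previous_output:
        if item.get("type") == "function_call":
            items.append({
                "type": "function_call_output",
                "call_id": item["call_id"],
                "output": json.dumps(tool_results[item["call_id"]]),
            })
    return items

# Hypothetical model output: one reasoning item, then one function call.
previous_output = [
    {"type": "reasoning", "id": "rs_demo", "summary": []},
    {
        "type": "function_call",
        "call_id": "call_demo",
        "name": "get_weather",  # assumed function, for illustration only
        "arguments": json.dumps({"city": "Paris"}),
    },
]

followup = build_followup_input(previous_output, {"call_demo": {"temp_c": 18}})
print(json.dumps(followup, indent=2))
```

The design choice worth noting is that the reasoning item is sent back alongside the call and its result, so the model can resume its earlier line of thought after the tool returns.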

Reinforced security

OpenAI is simultaneously publishing a System Card detailing a new set of refusal data and the use of an LLM monitor responsible for detecting sensitive uses (biothreats, malware generation, jailbreak attempts). The publisher claims to reach 99% detection during internal red-teaming and guarantees that the models remain below the “High” thresholds of the Preparedness Framework for biology, cybersecurity and self-improvement.

o3 vs o4-mini: which one to choose?

Criteria               o3                                o4-mini
Raw power              ★★★★☆                             ★★★☆☆
Cost per token         high                              low
Latency                medium                            low
Complex tool chaining  optimal                           good
Typical use case       “Deep Research”-style research,   embedded assistants,
                       heavy visual analysis,            bulk request batches,
                       sophisticated code production     mobile integration

A growing ecosystem

In parallel, the publisher is releasing Codex CLI, an open-source agent for the terminal capable of driving the new models locally, while a fund of $1 million in API credits will finance projects that use this tool.

Outlook

By decoupling the arrival of o3 from that of GPT-5, OpenAI shows its intent to iterate quickly on reasoning without delaying its next major version. The immediate future of AI lies in models capable of acting, not only of conversing. It remains to be seen how the risks of tool-assisted hallucination, and of the sensitive-data capture that this increased autonomy makes possible, will be managed at scale. For the time being, OpenAI is once again one step ahead in the race for agentic reasoning.