A strategic turning point
OpenAI has unveiled two models, o3 and o4-mini, presented as the most advanced reasoning systems the company has ever released. Breaking with the purely conversational orientation of the GPT line, these models unify logic, vision, and tool control in a single architecture: web search, Python execution, image generation and analysis, and file reading, all orchestrated directly by the AI without human intervention in between.
What changes concretely
- Multimodal reasoning
o3 and o4-mini now “think” with images. The AI can bring a diagram into its chain of reasoning, zoom in on a detail, or rotate a photo before responding, a decisive step towards an understanding close to human cognition.
- Agentic tool use
The models choose the relevant tool (code execution, web browsing, image generation), then chain calls to deliver a complete solution in less than a minute. This autonomy turns ChatGPT into a genuine execution agent.
- Scaled-up reinforcement learning
OpenAI says it has increased the computing power allocated to reinforcement learning (RL), allowing the model to “think longer”. The result: clear gains on almost all benchmarks, without additional latency for the end user.
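The tool-chaining behaviour described above can be pictured as a simple loop: pick a tool, call it, feed the result back, and stop when nothing more is needed. The sketch below is only an illustration under stated assumptions. The tool functions, the canned search results, and `plan_next_step` (a hard-coded stand-in for the model's own decision) are all hypothetical and are not OpenAI's implementation.

```python
from typing import Callable, Dict, List, Optional, Tuple

ToolCall = Tuple[str, str]          # (tool name, argument)
Step = Tuple[ToolCall, str]         # (call, result returned by the tool)


def web_search(query: str) -> str:
    """Stand-in for web browsing: looks the query up in a canned table."""
    canned = {"o3 elo": "2706", "o1 elo": "1891"}
    return canned.get(query.lower(), "no result")


def run_code(expression: str) -> str:
    """Stand-in for code execution: evaluates 'a - b' for two integers."""
    left, right = expression.split("-")
    return str(int(left) - int(right))


# Registry of hypothetical tools the agent may call.
TOOLS: Dict[str, Callable[[str], str]] = {
    "web_search": web_search,
    "run_code": run_code,
}


def plan_next_step(history: List[Step]) -> Optional[ToolCall]:
    """Hypothetical planner; a real model would decide this itself."""
    if len(history) == 0:
        return ("web_search", "o3 Elo")
    if len(history) == 1:
        return ("web_search", "o1 Elo")
    if len(history) == 2:
        # Chain the two earlier results into a computation.
        return ("run_code", f"{history[0][1]} - {history[1][1]}")
    return None  # nothing left to do


def run_agent(max_steps: int = 5) -> Tuple[str, List[Step]]:
    """Call tools one after another until the planner stops or the budget ends."""
    history: List[Step] = []
    for _ in range(max_steps):
        call = plan_next_step(history)
        if call is None:
            break
        name, argument = call
        history.append((call, TOOLS[name](argument)))
    answer = history[-1][1] if history else ""
    return answer, history


if __name__ == "__main__":
    answer, steps = run_agent()
    print(answer, [call[0] for call, _ in steps])
```

The `max_steps` budget is a design choice of this sketch: it keeps a misbehaving planner from looping forever, which matters once the choice of tool is left to a model.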
Record performance
Benchmark | o1 | o3 | o4-mini |
---|---|---|---|
AIME 2025 (math) | 79 % | 91.6 % | 92.7 % |
Codeforces Elo | 1,891 | 2,706 | 2,719 |
MMMU (vision) | 77.6 % | 82.9 % | 81.6 % |
These scores, taken from the official release notes, place o3 above the previous state of the art in programming, mathematics, and image analysis, while o4-mini reproduces most of this performance at half the cost per token.
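To make the size of the jump easier to read, the gains over o1 can be computed directly from the table. The snippet below is a small arithmetic sketch that uses only the figures quoted above; the short row labels (`math`, `coding_elo`, `vision`) are shorthand chosen here for illustration.

```python
from typing import Dict, Tuple

# (o1, o3, o4-mini) values copied from the benchmark table above.
SCORES: Dict[str, Tuple[float, float, float]] = {
    "math": (79.0, 91.6, 92.7),        # percent
    "coding_elo": (1891, 2706, 2719),  # Elo points
    "vision": (77.6, 82.9, 81.6),      # percent
}


def gains_over_o1(table: Dict[str, Tuple[float, float, float]]) -> Dict[str, Tuple[float, float]]:
    """Return (o3 - o1, o4-mini - o1) per benchmark, rounded to one decimal."""
    gains = {}
    for name, (o1, o3, o4_mini) in table.items():
        gains[name] = (round(o3 - o1, 1), round(o4_mini - o1, 1))
    return gains


if __name__ == "__main__":
    for name, (o3_gain, o4_mini_gain) in gains_over_o1(SCORES).items():
        print(f"{name}: o3 {o3_gain:+}, o4-mini {o4_mini_gain:+}")
```

Read this way, o4-mini edges out o3 on the math and coding rows, while o3 keeps the lead on the vision row.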
Availability and pricing models
ChatGPT Plus, Pro, and Team subscribers already see o3, o4-mini, and o4-mini-high in their model selector. Enterprise and Edu customers will get access next week. On the API side, both models are available today, with support in the Responses API for preserving reasoning traces around function calls.
Reinforced security
OpenAI is simultaneously publishing a system card detailing a new set of refusal data and the use of an LLM monitor responsible for detecting sensitive uses (biothreats, malware generation, jailbreak attempts). The company claims to reach 99 % detection during internal red-teaming phases and states that the models remain below the “High” thresholds of the Preparedness Framework for biology, cybersecurity, and self-improvement.
o3 vs. o4-mini: which one to choose?
Criteria | o3 | o4-mini |
---|---|---|
Raw power | ★★★★☆ | ★★★☆☆ |
Cost per token | high | low |
Latency | average | low |
Complex tool chaining | optimal | good |
Typical use case | “Deep Research”-style research, heavy visual analysis, sophisticated code production | Embedded assistants, large request batches, mobile integration |
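The comparison above can be turned into a rough routing rule for applications that have access to both models. This is a hypothetical heuristic that simply encodes the table, not a recommendation from OpenAI; the `Workload` fields and the `pick_model` function are names invented for this sketch.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    """Hypothetical description of a request, mirroring the table's criteria."""
    needs_complex_tool_chaining: bool = False
    latency_sensitive: bool = False
    high_volume: bool = False


def pick_model(workload: Workload) -> str:
    """Choose a model name from the trade-offs listed in the table."""
    # Low latency and low cost per token both point to o4-mini,
    # which the table still rates "good" at tool chaining.
    if workload.latency_sensitive or workload.high_volume:
        return "o4-mini"
    # Long, complex tool chains are where the table rates o3 "optimal".
    if workload.needs_complex_tool_chaining:
        return "o3"
    # Default to the cheaper model when nothing calls for more power.
    return "o4-mini"


if __name__ == "__main__":
    print(pick_model(Workload(needs_complex_tool_chaining=True)))
    print(pick_model(Workload(high_volume=True)))
```

Putting the cost and latency checks first reflects one possible priority; a team that cares more about answer quality than about its bill could just as reasonably reverse the order.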
An ecosystem that widens
In parallel, OpenAI is releasing Codex CLI, an open-source agent for the terminal that can drive the new models from a local machine, while a $1 million fund in API credits will finance projects that use this tool.
Outlook
By decoupling the arrival of o3 from that of GPT-5, OpenAI shows its desire to iterate quickly on reasoning without delaying the next major version. The immediate future of AI lies in models capable of acting, not just conversing. It remains to be seen how the risks of tool-assisted hallucination and of sensitive-data capture, both heightened by this increased autonomy, will be managed at scale. For the time being, OpenAI is back one step ahead in the race for agentic reasoning.