OpenAI Unveils Revolutionary o3 Model with Near-Human Reasoning Capabilities

OpenAI recently wrapped up its “12 Days of OpenAI” announcement series with the introduction of o3 and o3-mini. These next-generation AI reasoning models mark a significant advancement in the field of artificial intelligence.

The o3 model has demonstrated exceptional performance across various benchmarks. It scored 96.7% on the 2024 American Invitational Mathematics Examination (AIME), missing just a single question. It also achieved a Codeforces competitive programming rating of 2727 Elo, placing it among the top 200 human coders worldwide.

Most notably, o3 became the first AI system to surpass the 85% threshold on ARC-AGI, a benchmark designed to test general intelligence and novel reasoning abilities. The achievement is seen as a milestone towards artificial general intelligence. However, at a cost of between $17 and $20 per task, the model is currently impractical for widespread use.

The model also set new records on software engineering and mathematics benchmarks. It achieved 71.7% accuracy on SWE-Bench Verified and solved 25.2% of the problems on Epoch AI’s FrontierMath benchmark. Remarkably, no other model has exceeded 2% on FrontierMath.

OpenAI opted to skip the “o2” naming to avoid trademark conflicts with the telecom company O2. Safety researchers can apply for early access until January 10, 2025. Broader availability is expected in early 2025.

Source: https://techcrunch.com/2024/12/20/openai-announces-new-o3-model/
