Poetiq, a six-person AI startup, recently took first place on the ARC-AGI-2 reasoning benchmark, outperforming Google's Gemini 3 Deep Think at half the price by orchestrating existing models rather than training its own.

The specifics:

Shortly after Gemini 3 launched, Poetiq's meta-system posted the top score without any retraining; the company says it can adapt to a new model in a matter of hours.

Using Gemini 3 Pro as a base, Poetiq's refinement approach scored 54% at about $30 per task, beating Google's top variant, Deep Think, which scored 45% at $77 per task.

Six months earlier, leading models reached only about 5% on ARC-AGI-2, which makes this the first system to break the 50% barrier.

The startup's open-sourced method uses LLMs to iteratively refine their own outputs, with a built-in self-auditing step that checks solution quality before an answer is accepted.
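The refine-and-audit idea described above can be illustrated with a minimal Python sketch. This is not Poetiq's code: the names `refine`, `audit`, and `make_toy_proposer` are hypothetical, the "model" is a stand-in that simply tries integer offsets, and the task format (pairs of numbers) is a toy assumption chosen so the example runs without any API. The point is only the loop shape: propose a candidate, audit it against known examples, feed the failure back, and stop once the audit passes.

```python
from typing import Callable, Optional

# Hypothetical types for this sketch: a task is a list of (input, output)
# training pairs, and a candidate is a function proposed as the solution.
Task = list[tuple[int, int]]
Candidate = Callable[[int], int]
Proposer = Callable[[Task, str], Candidate]


def audit(candidate: Candidate, task: Task) -> tuple[bool, str]:
    """Self-audit step: check the candidate against every training pair."""
    for x, expected in task:
        got = candidate(x)
        if got != expected:
            return False, f"input {x}: expected {expected}, got {got}"
    return True, "all training pairs pass"


def refine(propose: Proposer, task: Task, max_rounds: int = 5) -> Optional[Candidate]:
    """Propose, audit, and retry with feedback until the audit passes."""
    feedback = ""
    for _ in range(max_rounds):
        candidate = propose(task, feedback)
        ok, feedback = audit(candidate, task)
        if ok:
            return candidate
    return None  # budget exhausted without a candidate that passes the audit


def make_toy_proposer() -> Proposer:
    """Stand-in for an LLM call (assumption): tries offsets 0, 1, 2, ... in turn."""
    state = {"k": 0}

    def propose(task: Task, feedback: str) -> Candidate:
        k = state["k"]
        state["k"] += 1
        return lambda x: x + k

    return propose


if __name__ == "__main__":
    solution = refine(make_toy_proposer(), [(1, 4), (2, 5)])
    if solution is not None:
        print(solution(10))
```

In a real system the proposer would be a call to a model such as Gemini 3 Pro and the feedback string would be included in the next prompt; the `max_rounds` cap is what keeps the cost per task bounded.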

ARC-AGI-2 scores went from under 5% to over 50% in just a few months, a sign of how quickly progress is moving. Poetiq's result points to a future in which AI advances come from two sources at once: the creation of frontier models, and clever orchestration built on top of them by teams with modest computing resources.
