The specifics:
Cursor split its agents into planners, workers, and judges, a structure similar to the popular Ralph Wiggum technique, so that hundreds of them could run and coordinate at the same time.
The agents built the browser from scratch in under a week, and it was able to render simple webpages correctly.
Other experiments included a Windows 7 emulator, an Excel clone, and an internal migration of Cursor's own codebase, all involving more than a million lines of code.
The team also found that GPT-5.2 held up far better on long autonomous runs than Claude Opus 4.5, which tended to take shortcuts.
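For readers who want a feel for the planner, worker, and judge split, here is a minimal sketch of how such a loop could be wired together. It is not Cursor's implementation: `plan`, `work`, `judge`, and `run` are hypothetical names, no model is called, and the first attempt is deliberately returned as a draft so the judge has something to reject.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass
class Result:
    task: str
    output: str
    attempt: int


def plan(goal: str) -> list[str]:
    # Hypothetical planner: a real one would ask a model to break the goal down.
    return [f"{goal}: part {i}" for i in range(1, 4)]


def work(task: str, attempt: int) -> Result:
    # Hypothetical worker: stands in for an agent editing code.
    # Assumption for illustration only: first attempts come back as drafts.
    status = "draft" if attempt == 1 else "done"
    return Result(task, f"{status} [{task}]", attempt)


def judge(result: Result) -> bool:
    # Hypothetical judge: accepts only finished work.
    return result.output.startswith("done")


def run(goal: str, max_rounds: int = 3) -> list[Result]:
    pending = {task: 1 for task in plan(goal)}  # task -> attempt number
    accepted: list[Result] = []
    with ThreadPoolExecutor(max_workers=4) as pool:
        for _ in range(max_rounds):
            if not pending:
                break
            # Workers run in parallel; map() returns results in task order.
            results = list(pool.map(lambda item: work(*item), pending.items()))
            pending = {}
            for result in results:
                if judge(result):
                    accepted.append(result)
                else:
                    pending[result.task] = result.attempt + 1  # send back for a retry
    return accepted


if __name__ == "__main__":
    for result in run("build browser"):
        print(result.attempt, result.output)
```

The point of the split is that the judge, not the worker, decides when a task is finished, so rejected work goes back into the queue instead of being accepted as-is.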
From viral Claude Code use cases to AI agent swarms tackling 1M-line projects over weeks, the current frontier generation of coding agents has crossed an invisible capability threshold. As agent coordination improves and task durations grow, the fundamental economics of software development begin to change.