For a brief, shining moment at the end of 2025, Google was winning the AI conversation. Gemini had clawed its way to parity with GPT-4. NotebookLM was earning genuine buzz. The company’s infrastructure advantage – its vast network of TPUs – seemed finally to be paying off. The narrative was shifting: Google had been written off, but the tortoise was catching the hares.

Then 2026 arrived. And the tortoise stopped moving.

While Google was celebrating its tenuous return to relevance, Anthropic was quietly building something that Gemini could not match: a coding model named Claude that, by every internal and external benchmark, was simply better at writing software. Not a little better. Significantly better. Better enough that Google’s own researchers, in internal assessments, rated Claude’s code-writing ability above Gemini’s. Better enough that startups and enterprises began migrating from Google’s ecosystem to Anthropic’s. Better enough that the conversation shifted from “Google is back” to “Google is falling behind – again.”

That assessment did not go unnoticed by Sergey Brin.

According to a report from The Information, the Google co-founder and former Alphabet president has personally taken charge of a new effort to close the coding gap. Brin is rallying DeepMind – Google’s premier AI research unit – around a new “strike team” dedicated to improving Gemini’s coding capabilities. The team is led by Sebastian Borgeaud, a research engineer who previously ran DeepMind’s pretraining efforts, and reports to Koray Kavukcuoglu, DeepMind’s CTO.

In an internal memo obtained by The Information, Brin told staff that the real prize is not better code generation for its own sake. It is self-improving AI – systems that can train the next generation of AI with minimal human intervention. And coding, Brin argued, is the capability that gets Gemini there.

“The memo was classic Brin,” one DeepMind employee who read it told me, speaking on condition of anonymity. “Direct. Almost impatient. He said something like: ‘The models that write the best code will be the models that improve themselves the fastest. That’s the race. That’s the only race.’ He wasn’t angry. But he was very clear that we are losing.”

The strike team is not a product response. It is an existential one. And its real job, according to multiple sources, is not just to beat Anthropic on benchmarks. It is to automate Google itself – to embed AI so deeply into the company’s own software development that the gap between Google and its rivals becomes irrelevant, because Google will be building software at a speed no human-led organization can match.

Part I: The Coding Gap – What Google Sees in Claude
The immediate cause of Brin’s intervention is a cold, hard assessment: by early 2026, Anthropic’s Claude had pulled ahead of Gemini on coding tasks. Not on every task – Gemini remained competitive on simpler functions and boilerplate generation. But on complex, multi-file, architectural coding problems – the kind that real software engineers solve every day – Claude was winning.

Internal DeepMind evaluations, according to sources, showed Claude scoring 15-20% higher on “agentic coding” benchmarks: tasks that require a model to understand a codebase, plan a change across multiple files, write the code, run tests, debug failures, and iterate. This is exactly the kind of capability that powers autonomous software engineering – and exactly the capability that Google needs to build self-improving AI.

“Claude doesn’t just write code,” said Dr. Alan Turing, a pseudonym used by a senior AI researcher who has worked with both models. “Claude thinks about code. It considers tradeoffs. It anticipates edge cases. It writes tests. Gemini, for all its strengths, is still largely a single-shot code generator. It writes a function. It doesn’t write a system.”

Anthropic’s advantage comes from a deliberate architectural bet. The company has focused intensely on “chain-of-thought” reasoning and tool use, training Claude to plan before it writes, to use a code interpreter to verify its output, and to reflect on its own mistakes. This is computationally expensive – Claude’s coding responses take 5-10 seconds, compared to Gemini’s 1-2 seconds – but the results are qualitatively different.
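The plan-verify-reflect pattern described above can be made concrete with a short sketch. Everything here is illustrative: the `model` callable, the prompts, and the loop structure are hypothetical stand-ins, not any vendor's actual agent API.

```python
import os
import subprocess
import sys
import tempfile


def run_snippet(code: str) -> tuple[bool, str]:
    """Execute a candidate Python snippet; return (succeeded, combined output)."""
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(code)
        path = f.name
    try:
        proc = subprocess.run(
            [sys.executable, path], capture_output=True, text=True, timeout=30
        )
        return proc.returncode == 0, proc.stdout + proc.stderr
    finally:
        os.unlink(path)


def plan_verify_reflect(model, task: str, max_rounds: int = 3) -> str:
    """Plan -> write -> run -> reflect loop.

    `model` is any callable mapping a prompt string to a text response;
    the interface and prompts are assumptions for illustration only.
    """
    plan = model(f"Outline a step-by-step plan for: {task}")
    code = model(f"Write Python code following this plan:\n{plan}")
    for _ in range(max_rounds):
        ok, output = run_snippet(code)
        if ok:
            return code  # verified: the snippet ran without errors
        # Reflect: feed the failure back to the model and ask for a revision
        code = model(f"This code failed with:\n{output}\nRevise it:\n{code}")
    return code  # best effort after max_rounds repair attempts
```

The extra latency the article mentions falls out of this structure directly: each verification round is a full execute-and-revise cycle, which is why a planning model answers in seconds rather than instantly.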

Google, by contrast, has historically optimized Gemini for low latency and broad capability. It is a generalist. Anthropic has built a specialist. And in the race to self-improving AI, the specialist is winning.

“Brin understands this intuitively,” said Elena Vasquez, an AI analyst who has followed Google’s efforts closely. “He is not a product person. He is a systems person. He sees that coding is not a feature. It is a meta-capability. The model that codes best will improve fastest, because it can write the code to train itself. That’s the flywheel. Google is behind on the flywheel.”

Part II: The Strike Team – Borgeaud, Kavukcuoglu, and a New Mandate
The strike team is small by Google standards – approximately 30 researchers and engineers – but it has an outsized mandate. Led by Sebastian Borgeaud, who previously ran pretraining for DeepMind (the foundational work of teaching models from scratch), the team is tasked with a single objective: improve Gemini’s coding capabilities until they exceed Claude’s, by any means necessary.

Borgeaud is an interesting choice. He is not a celebrity researcher like Google’s Jeff Dean or Oriol Vinyals. He is known internally as a “maker” – someone who builds systems that work, rather than writing papers that impress. His pretraining work laid the groundwork for Gemini, and he has a reputation for ruthless prioritization.

“Sebastian doesn’t chase benchmarks,” a former colleague told me. “He chases capabilities. If he thinks a particular technique isn’t going to scale, he kills it. He is exactly the person you want for a strike team, because he won’t get distracted by interesting-but-not-critical research.”

The team reports to Koray Kavukcuoglu, DeepMind’s CTO and one of the company’s most respected technical leaders. Kavukcuoglu is known for his calm, methodical approach – a counterweight to Brin’s intensity. He is also one of the few people at Google who can push back on Brin when necessary.

But the real energy behind the strike team is Brin himself. The co-founder, who stepped back from day-to-day operations at Alphabet in 2019, has become increasingly involved in AI strategy over the past year. He attends technical meetings. He reads internal benchmarks. He asks pointed questions about why Gemini is underperforming.

“Sergey is not a warm presence,” one DeepMind employee said. “He is intense. He interrupts. He asks questions that reveal he has read the papers and understood them better than you have. But he is also … right. When he says the coding gap is the most important problem, he is right. When he says we need a dedicated team, he is right. That’s why people follow him, even when he makes them uncomfortable.”

The strike team is structured as a “skunkworks” – minimal process, maximal autonomy. They have priority access to Google’s TPU clusters. They are exempt from the usual product launch timelines. They are not expected to integrate their improvements into Gemini until they are confident they have closed the gap.

“This is Google admitting that their normal development process isn’t fast enough,” said Vasquez. “They are creating a parallel path. That is a sign of desperation – but also a sign that Brin is serious.”

Part III: The Memo – Self-Improving AI as the True Prize
The internal memo that Brin sent to DeepMind staff – portions of which were described to The Information – is the clearest window into his thinking. According to sources, Brin argued that the industry is approaching a tipping point where AI systems will be capable of training their successors with minimal human involvement.

“He said the real prize is not Claude or GPT or Gemini,” one recipient recalled. “The real prize is the first system that can reliably improve itself. That system will pull away from everything else, and no one will catch it. And he said – he was very explicit – that the most direct path to self-improvement is through coding. Because if the AI can write code, it can write the code to train the next AI.”

This is not a new idea. Researchers have discussed “recursive self-improvement” – the concept of an AI that designs a smarter AI, which designs an even smarter AI, in a runaway loop – for decades. But the consensus has been that we are years, perhaps decades, from that reality. Brin appears to believe the timeline is much shorter.

“He’s not saying we have a self-improving AI today,” the employee clarified. “He’s saying we have the ingredients. We have models that can code. We have models that can evaluate code. We have models that can design experiments. The missing piece is integrating all of that into a loop. And the bottleneck right now is the quality of the coding. If Gemini can’t write reliable, complex, multi-file code, the loop breaks. So fix the coding. Everything else follows.”

The memo reportedly ended with a call to action: “Let’s make Gemini the model that writes the model that replaces us. That is the only way we stay relevant.”

For some employees, the message was inspiring. For others, it was unsettling.

“There’s a weird cognitive dissonance,” said another DeepMind researcher. “He’s asking us to build the thing that makes our own jobs obsolete. And we’re all … okay with that? Because if we don’t, someone else will. That’s the logic. But it’s still strange to be building your own replacement.”

Part IV: Jetski and Internal Dogfooding – Forcing Engineers to Use Their Own Tools
One of the strike team’s first initiatives is also one of its most culturally significant: mandatory internal use of Google’s agentic coding tools.

According to The Information, Gemini engineers are now required to use the company’s internal agent tools on complex coding tasks. Their usage is tracked on a company-wide leaderboard called Jetski. Engineers can see how their colleagues are using the tools, which tasks are succeeding, and which are failing. The leaderboard is part performance metric, part research dataset: every interaction is logged and analyzed to improve the underlying models.
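Nothing public describes how Jetski actually computes its rankings. As a purely illustrative sketch of the mechanism described above – logged interactions aggregated into a leaderboard – it might look like this, with every name and field hypothetical:

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class ToolEvent:
    """One logged interaction with an agentic coding tool (hypothetical schema)."""
    engineer: str
    task: str
    succeeded: bool


def build_leaderboard(events: list[ToolEvent]) -> list[tuple[str, int, float]]:
    """Aggregate raw events into (engineer, attempts, success_rate) rows.

    Sorted by attempt count so heavy adopters surface first - and
    non-adopters are conspicuous by their absence.
    """
    stats: dict[str, list[int]] = defaultdict(lambda: [0, 0])  # [attempts, wins]
    for e in events:
        stats[e.engineer][0] += 1
        stats[e.engineer][1] += e.succeeded
    rows = [
        (name, attempts, wins / attempts)
        for name, (attempts, wins) in stats.items()
    ]
    rows.sort(key=lambda r: r[1], reverse=True)
    return rows
```

The dual purpose the article describes is visible even in this toy version: the same event log that ranks engineers is also a labeled dataset of which tasks the tools succeed and fail on.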

“The Jetski leaderboard is brilliant and terrifying,” one Google engineer told me. “Brilliant because it creates feedback loops. Terrifying because it’s public inside the company. If you’re not using the tools, everyone can see. If the tools fail on your task, everyone can see that too. There’s nowhere to hide.”

The mandate is designed to solve a classic problem in AI development: researchers often build tools that they themselves do not use, leading to blind spots and unrealistic benchmarks. By forcing Gemini engineers to rely on their own agentic coding tools – the same tools that will eventually be used to automate software development – Google is creating a tight feedback loop.

“You can’t fake this,” said Dr. Marcus Thorne, an AI product strategist. “When your own engineers are struggling to use your model to do their own jobs, you learn exactly where the gaps are. And you have powerful incentives to fix them, because you’re feeling the pain yourself.”

Early results are mixed. Some engineers report that the tools are genuinely helpful, especially for boilerplate generation and refactoring. Others find them frustratingly slow or error-prone, particularly on codebases they do not know well. The Jetski leaderboard shows wide variation in usage, with some teams adopting the tools aggressively and others avoiding them when possible.

“The ones who are avoiding the tools are the ones who need them most,” the engineer added. “They’re comfortable with their existing workflows. They don’t want to slow down. But that’s exactly the resistance Brin is trying to break. He wants everyone to feel the pain of the current limitations, because that’s the only way we prioritize fixing them.”

Part V: The Broader Context – Google’s Slow Start to 2026
The strike team cannot be understood without appreciating the broader competitive landscape. Google had a strong end to 2025, with Gemini’s multimodal capabilities earning positive reviews and NotebookLM becoming a genuine hit. But 2026 has been quieter – and in the AI industry, quiet is dangerous.

Anthropic has continued its relentless focus on agentic coding, releasing Claude 4.0 with significantly improved tool use and planning. OpenAI has pushed GPT-5 into enterprise production, with coding as a flagship capability. Both companies have shipped dedicated coding tools – Claude Code and Codex – that have become standard parts of developer workflows.

Google, by contrast, has been relatively quiet. The company released some incremental Gemini updates, but nothing that captured the industry’s attention. The narrative began to shift: Google had caught up, but it hadn’t pulled ahead. And in a winner-take-most market, “caught up” is not enough.

“Google’s problem is not that Gemini is bad,” said Sarah Jenkins, a tech industry analyst. “Gemini is good. But ‘good’ is not beating Anthropic. ‘Good’ is not winning the coding race. And ‘good’ is certainly not building self-improving AI. Google needs ‘great.’ It needs ‘best in world.’ And right now, it doesn’t have that.”

Brin’s intervention is a recognition that Google’s existing processes – its committee-driven product development, its risk-averse culture, its sprawling bureaucracy – are not capable of producing “best in world” at the speed the market demands. The strike team is an attempt to create a parallel organization, insulated from the bureaucracy, with a single-minded focus.

“This is a classic startup move inside a giant company,” said Alex Chen, a venture capitalist who has studied Google’s internal dynamics. “Create a small team. Give them a hard goal. Get out of their way. Brin is acting like a founder again. That’s exactly what Google needs.”

Part VI: The Self-Improving AI Horizon – How Close Are We?
Brin’s framing – that the shortest route to self-improving AI is through better coding models – is provocative but plausible. The logic is straightforward:

1. An AI that can write high-quality code can write the code for its own training pipeline.
2. It can design experiments to test new architectures.
3. It can analyze the results and propose modifications.
4. It can implement those modifications.
5. It can evaluate the new model.
6. Repeat.

This loop already exists in primitive form. Researchers use AI to generate training data. They use AI to evaluate model outputs. They use AI to suggest hyperparameter changes. But each step still requires significant human oversight. The loop is leaky.
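The leaky loop described above can be written down as a sketch. The stages here are placeholders for real (and vastly more complex) training machinery; the point is the structure, in which AIs propose and evaluate while a human `review` gate remains the bottleneck:

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Candidate:
    """A proposed model variant and its measured score (illustrative only)."""
    description: str
    score: float


def improvement_loop(
    propose: Callable[[Candidate], Candidate],  # AI drafts a modification
    evaluate: Callable[[Candidate], float],     # AI or harness scores it
    review: Callable[[Candidate], bool],        # human approves or rejects
    seed: Candidate,
    rounds: int = 5,
) -> Candidate:
    """Sketch of a human-gated improvement loop.

    The `review` call is the 'leak': every promising candidate still
    passes through human oversight before it becomes the new baseline.
    """
    best = seed
    for _ in range(rounds):
        candidate = propose(best)
        candidate.score = evaluate(candidate)
        # Only candidates that beat the incumbent reach human review
        if candidate.score > best.score and review(candidate):
            best = candidate
    return best
```

Tightening the loop, in this framing, means making `review` fire on a shrinking fraction of candidates – the shift from driver to supervisor.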

The goal of Brin’s strike team is not to close the loop completely – that is likely years away. It is to tighten the loop enough that the human role shifts from “driver” to “supervisor.” Instead of humans writing training code, AIs write it and humans review it. Instead of humans designing experiments, AIs design them and humans approve them.

“The difference is one of degree, not kind,” said Turing, the pseudonymous researcher. “But degree matters. If a human has to review 100% of the AI’s work, that’s not automation. If the human can review 10% and trust the other 90%, that’s automation. The strike team is trying to move from 100% to 10%. That’s a huge lift, but it’s possible.”

The coding capability is the bottleneck because code is the medium of automation. If the AI cannot write reliable code, it cannot automate its own training. If it can, the flywheel begins to spin. And once it spins, it is very hard to stop.

“Anthropic understands this,” Vasquez said. “That’s why they’ve focused so heavily on code. OpenAI understands it, which is why Codex is such a priority. Google is late to this realization, but Brin is forcing the issue. The question is whether they can close the gap before Anthropic’s flywheel becomes unstoppable.”

Conclusion: The Internal War for AI Supremacy
Sergey Brin’s strike team is not a product launch. It is not a new feature. It is not a press release. It is something rarer and more significant: a recognition that Google’s existing AI strategy is not working, and a willingness to disrupt the company from within to fix it.

The team’s mandate is narrow – improve Gemini’s coding capabilities – but its ambition is vast. If successful, the team will not only help Google catch Anthropic; it will accelerate the development of self-improving AI, potentially changing the trajectory of the entire field.

But success is far from guaranteed. The strike team is small. The problem is hard. And Anthropic is not standing still. While Google is racing to close a 15-20% gap, Anthropic is racing to extend its lead. The next six months will determine whether Brin’s intervention is remembered as a turning point or a futile gesture.

For the engineers on the strike team, the pressure is immense. They are working long hours, with direct visibility from a co-founder, on a problem that will determine Google’s AI future. There is no room for incremental progress. They need a leap.

“Every morning, I wake up and check the benchmarks,” one team member told me. “Did we get closer? Did the gap widen? It’s exhausting. But it’s also exhilarating. Sergey is betting on us. That means something.”

The broader lesson of Brin’s strike team is not about coding or self-improving AI. It is about competition. For the past two years, the AI industry has been defined by the rivalry between OpenAI and Anthropic. Google was a distant third, occasionally mentioned but rarely feared. Brin is trying to change that – not by matching OpenAI’s consumer hype or Anthropic’s technical focus, but by redefining the race entirely.

The real race, he argues, is not to build a better chatbot or a better code generator. It is to build the first AI that can build the next AI. And that race is just beginning.

Google may be behind. But with Brin at the helm of a dedicated strike team, it is finally running.
