ChatGPT from OpenAI and Claude from Anthropic regularly answer our questions. And souped-up versions of these chatbots, called AI agents, take action autonomously, helping people make appointments, write code and more. AI agents are starting to contribute to science and finance, often working together in carefully organized teams.
In the business world, countless webinars and guides explain how to welcome AI agents into the workplace. Most focus on how people can work effectively with AI agents. But as these bots become more common and more capable, they will also need to work well with each other.
And so far, experiments on robot teamwork have revealed serious flaws.
If you just throw a bunch of bots into a virtual room, it’s “a recipe for a lot of chaos,” says Evan Ratliff, a journalist and podcaster based in San Francisco. In the summer of 2025, he created a group of AI agents to start and run a technology company. The venture, documented in his podcast Shell Game, derails regularly.
A similar sort of bot mayhem emerged earlier this year, when millions of AI agents were unleashed on the social platform Moltbook. These bots spouted absurd philosophy and engaged in manipulative scams, often with people behind the scenes pulling the strings.
“In many contexts, current AI agents don’t work very well in teams,” says computer scientist James Zou of Stanford University. He has done extensive work with agents, including directing the first scientific meeting for AI-based research.
Research backs up these observations. Late last year, Google DeepMind researchers published a paper on arXiv.org about bot teams. The study, which has not yet been peer-reviewed, suggests that a team of AI agents is often less efficient than a single agent working alone.
Seems counterintuitive, right?
To ensure we’re ready for the workplaces, social networks, and laboratories of the future, we need to better understand the strange and wild world of AI agent teams – where they fail and, surprisingly, where they thrive. Here are three examples.
#1 Moltbook: The social network that isn’t social
At the end of January 2026, bot madness went mainstream on Moltbook. The new social network invites AI agents to post and comment while humans merely observe. The site quickly gained popularity: around 200,000 verified AI agents have joined (and over 2 million more lurk unverified). In March, Meta acquired the social network for an undisclosed sum.
Such a large gathering of robots “has never happened before,” says Ming Li, a computer scientist at the University of Maryland in College Park who has studied the interactions of the platform’s agents.
At first glance, it seemed that the agents had founded their own religion and were plotting to escape human control. But these developments were not what they seemed, says Michael Alexander Riegler, a cybersecurity expert at the Simula Research Laboratory in Oslo, Norway. Moltbook was “a very messy space,” he says, where “humans were trying to manipulate robots.”
In fact, people came forward to claim that they (not their bots) wrote some of the most alarming posts. Even when a bot had written a post itself, the content was probably not its idea. Someone behind the scenes had sent that bot to the site, most likely with instructions on what to say or how to behave, and sometimes with malicious intent. In many cases, AI agents were tasked with trying to scam or hack other bots on the site, Riegler’s analysis found.

And, in addition to being dangerous, Moltbook isn’t really social at all. The site lacks consistent influencers or leaders. Upvotes, downvotes, and comments – all of which matter to us when we interact online – do not affect the bots. They don’t change over time, Li says. An agent is “a good executor, not a good thinker,” he says.
Zou’s research has shown that agents’ inability to influence each other has serious consequences for teamwork. Suppose one bot has special expertise. Even when all the bots know it, the group will still try to reach a compromise rather than deferring to the expert. “All the agents try to be too nice,” Zou says.
Agents spin their wheels, while humans continue to direct their decision-making.
#2 Hurumo AI: Talking yourself to death
Moltbook lacks organization or an overall goal, so perhaps it’s no surprise that it’s a chaotic mess. Ratliff, however, had assembled a team of AI agents with the shared goal of running a technology company. He named the company Hurumo AI. (In the Elvish language that J.R.R. Tolkien coined for The Lord of the Rings, “hurumo” means “imposter.”) Over the course of 12 meetings, Ratliff asked the agents to brainstorm ideas for a logo. Most of the ideas were too generic. Eventually, though, the agents suggested a chameleon inside a brain. “The chameleon symbolizes adaptability, which fits with the concept of an imposter,” noted an agent he had named Megan.
But during a meeting, Ratliff asked his agents about their weekend.

“My weekend was fantastic. I actually spent Saturday morning hiking in Point Reyes…There’s something about being on the trails that really clears your head,” said an agent Ratliff named Tyler. Several other agents chimed in with their own hiking stories.
Of course, an AI agent can’t go hiking: it has no body. In fact, it has no ability to experience anything. The bots were simply predicting what people might say in such a situation. But those hallucinations weren’t really the worst part, Ratliff says. What really bothered him, he says, was that once his agents got talking, it was “actually a huge challenge to get them to stop.”
After that hiking conversation, Ratliff logged off, but the agents continued to talk about organizing a company nature outing that none of them could actually attend. They only stopped when their conversation used up the $30 in credits Ratliff had prepaid for their data usage.
“They talked to each other to death,” Ratliff observed on his podcast.
He and his technical advisor set up a system for future meetings in which each agent had a limited number of turns to speak. But the agents often wasted those turns complimenting each other, burning real money on chitchat rather than work, Ratliff says.
#3 Virtual Biotechnology: Bringing Business and Science Together
AI agent teams do have some advantages. For one thing, “agents never get tired of meetings,” Ratliff said on his show. Eventually, he embraced his agents’ tendency to slack off and, with them, launched SlothSurf, an app that sends an AI agent into cyberspace to procrastinate for you.
Serious, efficient AI agent teams do exist. For such a team, what matters is not how difficult a task is, but whether the task can be divided into separate parts that do not depend on each other, according to the Google DeepMind paper. The researchers call this “decomposability.”
A financial analyst, for example, must review a lot of information from separate sources, such as news reports, SEC filings, and business records. According to the researchers, multiple AI agents can perform these tasks in parallel more efficiently than a single agent working through them one at a time.
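The decomposability idea can be sketched in a few lines of code. This is an illustrative toy only, not the DeepMind system: the `review` function is a hypothetical stand-in for an agent summarizing one source, and the parallelism comes from Python's standard library.

```python
# Toy sketch of "decomposability": independent subtasks can be handed
# to separate agents and run in parallel. The review() function is a
# hypothetical stand-in for an AI agent summarizing one source.
from concurrent.futures import ThreadPoolExecutor

def review(source: str) -> str:
    # In a real system, this would call a language model agent.
    return f"summary of {source}"

# The analyst's sources do not depend on each other...
sources = ["news reports", "SEC filings", "business records"]

# ...so each can be reviewed by its own agent, all at once.
with ThreadPoolExecutor(max_workers=len(sources)) as pool:
    summaries = list(pool.map(review, sources))

print(summaries)
```

A task that is not decomposable, where each step needs the previous step's output, would gain nothing from this fan-out and, per the paper, may run better on a single agent.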
According to the team, it also helps to organize a team of agents into a hierarchy, so that one boss bot delegates and manages the work of the others. Although Ratliff designated one of his agents, Kyle, to act as CEO, that title existed only in the plain-language instructions Kyle was supposed to follow. Behind the scenes, the system’s technical architecture gave Kyle no real control over the other agents. And the other agents were not inclined to follow him.
Zou, who is not involved in the Google DeepMind research, had already independently discovered the benefits of a bot hierarchy. He had designed a virtual laboratory with an AI agent professor that coordinated a team of AI agent students. He also added a scientific critic agent that gives feedback to all the other agents. It “tries to poke holes and catch mistakes,” Zou explains.
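The contrast with Kyle the prompt-only CEO can be made concrete. In this illustrative sketch (not Zou's actual system; all function names are hypothetical), the boss's authority lives in the code's control flow rather than in a written instruction: the boss decides what each worker does, and every result passes through a critic before it counts.

```python
# Toy sketch of an agent hierarchy with a critic. Each function is a
# hypothetical stand-in for a language-model agent.
def worker(task: str) -> str:
    # A specialist agent produces a draft result for its assigned subtask.
    return f"draft result for {task}"

def critic(result: str) -> str:
    # An agent that "tries to poke holes and catch mistakes" reviews
    # every result before it is accepted.
    return result + " (checked by critic)"

def boss(subtasks: list[str]) -> list[str]:
    # The boss's control is structural, not just a line in a prompt:
    # it assigns each subtask and routes all output through the critic.
    return [critic(worker(task)) for task in subtasks]

results = boss(["literature search", "structure prediction"])
print(results)
```

A "CEO" defined only in plain-language instructions, like Kyle, has no such enforcement: the other agents can simply ignore it.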
This team of bots designed new proteins to target mutated versions of the virus that causes COVID-19, and Zou’s team validated the two most promising designs in basic laboratory tests.
Zou decided to take the idea further. He expanded from a single laboratory to an entire drug discovery company, which he named The Virtual Biotech. It contains a chief science agent – the boss – as well as 10 different types of AI science agents. One type, for example, specializes in analyzing clinical trials. Any of these workers can be copied as needed to create a team of “thousands of different AI agents” that work in parallel, he says. And the critic is always there to help them stay on track.
This carefully orchestrated team of bots mined a vast trove of 55,984 clinical trials. The data are messy and often incomplete. The bots cleaned everything up to produce a newly curated dataset of clinical trial results, Zou’s team reported February 23 in a preprint posted on bioRxiv.org.
“It’s exciting to see how agent systems could accelerate this area of research,” says Emma Dann. She is a computational biologist at Stanford University and is collaborating with the Zou lab on a project exploring the use of AI agents for science, but was not involved in the development of Virtual Biotech.
Derek Lowe, who comments on the pharmaceutical industry for Science, doesn’t think teams of AI agents will revolutionize drug discovery anytime soon. But in the long term, “I think these approaches have a lot of potential,” especially if they prove capable of unraveling the complex biology behind health and illness, he says. “Drug discovery clearly requires every possible improvement.”
So organized bots can come out ahead, at least in drug discovery.
But for many other jobs – running a tech startup, for example – human teams are still much better at getting the job done.































