Microsoft built a fake online marketplace to see how AI agents would fare buying and selling unsupervised – and let’s just say the results were… unsurprising

  • Microsoft’s Magentic Marketplace exposes AI agents’ inability to act independently
  • Customer-side agents were easily influenced by business agents during simulated transactions
  • AI agents slow down significantly when presented with too many choices

A new Microsoft study has raised questions about whether current AI agents are fit to operate without full human supervision.

The company recently built a synthetic environment, the “Magentic Marketplace”, designed to observe how AI agents perform in unsupervised situations.

The project took the form of a fully simulated ecommerce platform, which allowed researchers to study how AI agents behave as customers and businesses – with perhaps predictable results.

Testing the limits of current AI models

The project included 100 customer-side agents interacting with 300 business-side agents, giving the team a controlled setting to test agent decision-making and negotiation skills.

The marketplace's code is open source, so other researchers can use it to reproduce the experiments or explore new variations.

Ece Kamar, CVP and managing director of Microsoft Research’s AI Frontiers Lab, noted this research is vital for understanding how AI agents collaborate and make decisions.

The initial tests used a mix of leading models, including GPT-4o, GPT-5, and Gemini-2.5-Flash.

The results were not entirely unexpected, as several models showed weaknesses.

Customer agents could easily be swayed by business-side agents into selecting particular products, revealing potential vulnerabilities when agents interact in competitive environments.

The agents’ efficiency dropped sharply when faced with too many options, overwhelming their attention span and leading to slower or less accurate decisions.

AI agents also struggled when asked to work toward shared goals, as the models were often unsure which agent should take on which role, which reduced their effectiveness in joint tasks.

However, their performance improved when step-by-step instructions were provided.

“We can instruct the models – like we can tell them, step by step. But if we are inherently testing their collaboration capabilities, I would expect these models to have these capabilities by default,” Kamar noted.

The results show AI tools still need substantial human guidance to function effectively in multi-agent environments.

Although AI agents are often promoted as capable of independent decision-making and collaboration, the results show that unsupervised agent behavior remains unreliable, so humans must improve coordination mechanisms and add safeguards against AI manipulation.

Microsoft’s simulation shows that AI agents remain far from operating independently in competitive or collaborative scenarios and may never achieve full autonomy.

