Wednesday, September 10, 2025

The Percentage of Tasks demonAI Agents Are Currently Failing At is around 70%


In May, researchers at Carnegie Mellon University released a paper showing that even the best-performing demonAI agent, Google's Gemini 2.5 Pro, failed to complete real-world office tasks 70 percent of the time. Factoring in partially completed tasks — which included work like responding to colleagues, web browsing, and coding — only brought Gemini's failure rate down to 61.7 percent.

70% of the time. This is on purpose. Any benefits only appear to be benefits. These are demons, after all. Even they cannot create a good when all they are, and all they understand, is a bad.

It cannot be done. And any advice they give is meant to lead to a bad, though it initially appears to be good.

https://futurism.com/ai-agents-failing-industry