Why is There So Much Hype Around AI Agents in 2025?

By Curtis Michelson |  February 26, 2025

If you watched the Super Bowl this month as much for the ads as the game, you likely noticed an uptick in AI spots.

OpenAI, Meta, Google, Salesforce and GoDaddy all ran AI-related ads, and the ad industry was at least mildly impressed, if not by the cleverness and artistry, then at least because none of them was an epic fail.

One ad in particular was about AI’s word of the year: agents. Interestingly, it didn’t really talk about agents. Rather, it aimed to show you how bad your life will be if you don’t use agents in the future. This spot from Salesforce (embedded below) shows a frantic Matthew McConaughey running through Heathrow airport to catch his flight, and like a fool, he didn’t enlist an Agentforce AI agent to handle his travel bookings. As he runs down the moving walkways, Woody Harrelson needles him: “They got an app for that, dude.” Even if the spot doesn’t win any awards, the message was clear: Salesforce has gone all-in on agents, and if you’re not with them, you may miss your flight to the future.

What exactly are we talking about when we hear “AI agent”? Don’t we already have agents with ChatGPT or Claude? What does it mean when agents become “autonomous,” and is that risky? To use a car analogy, are we talking about Full Self-Driving (FSD) or driver-assist capability? What are the risks of deploying them? Who will absorb the costs if and when they make mistakes? What kinds of pricing models or new business models will emerge in this world? Not all of these questions are answerable here today, but innovation leaders should begin asking them. What I will attempt to do is orient you to the emerging landscape and conversation around what most everyone agrees is the next big thing in AI: the extension of LLMs further out into our world to do practical tasks for us, rather than just respond to questions.

Taking the big consultancies’ perspective first, we have a variety of predictions. In Gartner’s Top Trends for 2025, the #1 slot was taken by “Agentic AI.” McKinsey positions agents as the next frontier in genAI, moving from knowledge-based systems to autonomous ones. And just last month, BCG reported that 67 percent of executives expect that autonomous agents will be part of their companies’ AI transformation.

In terms of venture capital funding, The Information recently reported that investments overall have accelerated in the past three months. Factoring out three big deals (from OpenAI, Anthropic and xAI) that account for $16.6 billion of the $20.9 billion total raised in that period, the remainder is going into startups focused on domain-specific verticals, many of which are developing some form of AI agent or planning to offer agentic services. These are companies like Sierra.ai (customer support), Sana.ai (enterprise ops), 11x.ai (sales), Vapi.ai (voice agents), Offerfit.ai (marketing) and Basecamp Research (agentic biology), among hundreds of others.

Agents: What Do We Mean?

There are a variety of definitions for agents in the research literature and in common business lingo. The venture capital firm Andreessen Horowitz’s definition for agent begins in the negative, by defining what agents are NOT. They contrast genuine agentic workflows against classic enterprise Robotic Process Automation, or RPA. Their graphic (see below) is instructive. Unlike more fragile RPA systems that handle “happy path” use cases but fall apart in less common edge cases, intelligent automation or “automation 2.0” is presumed to be more elastic, more forgiving and resilient when handling a wide variety of contexts.
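
To make that contrast concrete, here is a toy sketch in Python (all names hypothetical, with classify_intent() standing in for a model call): a classic RPA router that only survives its scripted happy path, next to an agent-style router that infers intent and degrades gracefully on phrasing it has never seen.

```python
# Toy contrast between RPA-style automation and an LLM-backed agent.
# All names are hypothetical; classify_intent() stands in for a model call.

def rpa_route(message: str) -> str:
    # Classic RPA: brittle, exact-match rules for the "happy path."
    rules = {
        "where is my order": "orders_queue",
        "cancel my subscription": "billing_queue",
    }
    try:
        return rules[message.lower().strip()]
    except KeyError:
        # Anything off the happy path halts the workflow.
        raise ValueError("unrecognized input; human must intervene")

def classify_intent(message: str) -> str:
    # Stand-in for an LLM call that infers intent from free-form text.
    text = message.lower()
    if "order" in text or "package" in text:
        return "orders_queue"
    if "cancel" in text or "bill" in text:
        return "billing_queue"
    return "general_queue"  # graceful fallback instead of a crash

def agent_route(message: str) -> str:
    # "Automation 2.0": tolerant of phrasing it has never seen before.
    return classify_intent(message)

print(agent_route("Hey, any idea where my package went?"))  # orders_queue
```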

Salesforce, which is so bullish on agents that it is practically rebranding itself Agentforce, is pinning its definition of agent on the system’s level of autonomy. Specifically, it claims a true agent “can reason not only based on predictions it makes from large datasets, but also based on their ability to perceive their environment and then take autonomous action, and even learn from feedback and adapt.”

Salesforce refers to such entities as “digital labor.” Its sales materials and ROI calculators position agents as replacements for people-powered tasks. Unlike a simple chatbot that requires people to initiate and close interactions, agents have all the resources needed to initiate and complete workflows.

Salesforce also lays out a kind of taxonomy for agents — an ascending ladder of capabilities starting from “reflex agents” (that use internal ML models or LLM models to react and infer next steps) to “goal-based” (think Deep Research by Google) to “learning agents” that are not programmed, but rather modeled to support extremely complex multi-step tasks and which self-improve through feedback loops. At the very top of their schema are “hierarchical agents,” the kind that essentially replace not just tasks but whole teams of people. 

A third definition comes from Scott Belsky, Chief Strategy Officer at Adobe, who recently shared in his Implications blog a similar taxonomy, in this case a rising pyramid of agent functionalities. (See below.) He visualizes agent skills that build on each other, starting at the bottom with the most pedestrian functions, those he calls “glorified help support.” These are the pesky bots that intervene when you visit a new product website, for example. Up from there is the agent that makes recommendations contextualized to a specific need. For example, with a “design thinking agent,” the system might intelligently synthesize various idea stickies directly into a product requirements doc or some other innovation artifact.

But beyond recommendations, the really impactful agentic systems for Belsky exist in the realm of actions. For example, when your design thinking software calls out to synthetic or actual customers on your behalf, or reaches out to research services via APIs to help you gather competitive data, that’s full autonomy.

The top-most position in Belsky’s pyramid is “autonomous workflows,” which consist of orchestrated multi-domain workflows, even potentially crossing organizational boundaries connecting to your vendors and partners. In Belsky’s vision, when agents begin to interact this way and call on each other it may ultimately lead to a new kind of organization he calls a Cognico, built entirely on a “cognitive stack.”

To sum it up, what I’ve come to understand as an “agent,” regardless of one’s particular schema, was best explained to me in metaphorical terms by Anton Kornienko, an AI researcher and engineer at NLP Logix. He put it this way: “If an LLM is like a human brain, an agent is like a human body. It can now move out into the world and actually do something useful for you.” LLMs take text-like input and turn it into some transformed output. A paragraph can become a summary, or a request for assistance can turn into a bit of Python code. An agent leverages that capability to initiate a series of additional steps — simple or complex — which reach out into the world and bring back data that then lets the agent “reason” across a problem space or domain.
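
Kornienko’s metaphor translates to code fairly directly. Below is a minimal sketch of that loop, assuming a hypothetical llm() stub in place of a real model call: the “brain” turns text into text, while the surrounding “body” executes a tool and feeds the observation back in.

```python
# Minimal "brain vs. body" sketch. llm() is a hypothetical stand-in for
# a real model call; the loop around it is the agent's "body."

def llm(prompt: str) -> str:
    # Brain: text in, text out. A real system would call a model API.
    # This stub always requests the weather tool, then summarizes.
    if "Observation:" not in prompt:
        return "ACTION: get_weather(Orlando)"
    return "FINAL: It's 75F in Orlando, so pack light."

def get_weather(city: str) -> str:
    # Body: a tool that reaches out into the world (stubbed here).
    return f"75F and sunny in {city}"

def run_agent(goal: str, max_steps: int = 5) -> str:
    prompt = f"Goal: {goal}"
    for _ in range(max_steps):
        reply = llm(prompt)
        if reply.startswith("FINAL:"):
            return reply.removeprefix("FINAL:").strip()
        # Parse the requested action and execute the tool.
        tool_arg = reply.split("(")[1].rstrip(")")
        observation = get_weather(tool_arg)
        # Feed the result back to the brain and loop again.
        prompt += f"\nObservation: {observation}"
    return "Gave up after too many steps."

print(run_agent("What should I pack for Orlando?"))
```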

Insop Song, a researcher at GitHub Next, recently offered a helpful primer on this idea of LLMs reaching out to their environment to act as “agents,” which I highly recommend. (It’s available on YouTube via Stanford University’s series of webinars on AI and GenAI.)

In the video, Song notes four common design patterns (illustrated in a short sketch after the list):

  1. Planning: the system breaks an objective into small chunks and makes a plan.
  2. Reflection: the system checks its own work, and the LLM critiques its own plan.
  3. Tool Usage: the system writes its own code to reach out to other services for resources.
  4. Collaboration: agents call other agents, dividing up the work between them.
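
To make these patterns concrete, here is a minimal, hypothetical sketch in Python. The llm_* functions and web_search are stand-ins for real model and service calls, not any actual framework’s API; each numbered comment maps to one of Song’s patterns.

```python
# Hypothetical sketches of the four patterns. The llm_* functions and
# web_search() stand in for real model and service calls.

def llm_plan(objective: str) -> list[str]:
    # 1. Planning: break an objective into small chunks.
    return [f"research {objective}",
            f"draft report on {objective}",
            f"edit report on {objective}"]

def llm_critique(plan: list[str]) -> list[str]:
    # 2. Reflection: the model reviews its own plan and revises it.
    if not any("cite sources" in step for step in plan):
        plan.append("cite sources")
    return plan

def web_search(query: str) -> str:
    # 3. Tool usage: reach out to another service (stubbed here).
    return f"results for '{query}'"

def worker_agent(step: str) -> str:
    # 4. Collaboration: a second agent takes a delegated subtask.
    if step.startswith("research"):
        return web_search(step)
    return f"completed: {step}"

def orchestrator(objective: str) -> list[str]:
    plan = llm_critique(llm_plan(objective))      # plan, then reflect
    return [worker_agent(step) for step in plan]  # delegate each step

print(orchestrator("AI agent market trends"))
```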

Towards Truly Autonomous Systems

Belsky’s Cognico may be a ways off, but if history is any guide, the path toward fully autonomous AI agent workflows will not be a straight line; it will be built on a series of tests and failures.

For example, when NASA launched the Apollo program, it needed systems that enabled its well-trained aircraft pilots to fly novel vehicles and, in some cases, to trust onboard computers instead of their own two hands. In his e-book Computers Take Flight, on the creation of digital fly-by-wire technology, James Tomayko details the many successive innovations (and failures) that led to consistent and reliable autopilots.

The history of autonomous driving has been no different. Only now, after decades of fits and starts, do we see services like Waymo passing federal and state safety requirements and being deployed widely. Sure, the learning curve was accelerated by the availability of new hardware and sensors, better digital maps and large training data sets, as well as the computational capacity to process road traffic and other data in real time. But make no mistake: your ability to safely read a Kindle book as your car navigates traffic on El Camino Real in Silicon Valley is the direct result of decades of hard research and lots of real-world testing.

For corporate innovation leaders, the watchword should be “experimentation.” Set aside the hyperbolic fears of AI evolving into a Matrix-like dystopia that uses us all as batteries; there are still very real practical concerns around handing enterprise workflows completely over to autonomous agents. These include the extension of already existing data privacy issues into larger scopes; the inevitable learning curve for any domain-specific agent trained on your enterprise data to prove itself free of hallucinations within some reasonable tolerance; the costs that should be planned for if and when agents make customer-facing mistakes; and, lastly, the time it will take for your human workforce to adapt to new tools and likely reconfigure their own team and business-unit designs.

My expectation is that to reach fully autonomous multi-agent workflows, the industry will go through the same hard-won learnings that avionics and automotive went through to establish reliability and safety. In fact, in surveying the evolution of digital fly-by-wire (DFBW), autonomous vehicles (AVs) and the current literature on AI agents, I was struck by the similarities among all three. For example, to integrate many small intelligences, all three domains have an “orchestrator” function.

History may be repeating itself, or at least rhyming. AI agents are already acting in the virtual world, letting limited bits of software run autonomously. See Anthropic’s “computer use” or OpenAI’s “Operator” demos as two very recent examples. They are indeed getting software to dance. But as these agents begin to reach out and touch not just other software but real-world objects like our home automation systems or inventory in stores, the dance floor gets much, much bigger. According to Salesforce, your software may soon be dancing with robots: the company just published an article about the convergence of digital and physical AI using what it calls World Action Models (WAMs). Finally, just this week, Microsoft released Magma, a new Vision-Language-Action (VLA) foundation model meant to bridge the virtual and physical worlds.

The emerging agentic dance will be interesting indeed. Let’s hope it all feels less like a mosh pit and maybe more like a country line dance. Or would you prefer ballroom dancing?  Put on your dancing shoes.


AI Agents: A Glossary of Terms

In this fast-evolving landscape, there are some key terms and concepts worth knowing. Here’s my list…

Agent: An AI system, typically powered by an LLM, that can independently plan, execute and improve tasks. What it does: enables AI to perform real-world work with minimal human input.

Prompt Chaining: A technique where multiple AI-generated responses are linked together to complete complex tasks step by step. What it does: helps agents maintain context and break down multi-step workflows.

Memory Integration: The ability of AI agents to store and recall past interactions, making them more context-aware over time. What it does: gives agents continuity across conversations and tasks.

Vector Database: A specialized database that stores and retrieves knowledge as embeddings, helping an AI system remember past queries and responses. What it does: gives agents access to massive stores of data to improve learning over time.

ReAct (Reasoning + Acting) Model: A framework that has the AI think through a problem logically before taking action, improving its decision-making. What it does: prevents AI from blindly executing tasks without considering consequences.

Tool Use & API Integration: The ability of AI agents to connect with external applications, databases and tools to perform real-world tasks. What it does: extends AI capabilities beyond text generation to real-world applications.

Multi-Agent System: A setup where multiple AI agents work together, each handling different tasks or roles to achieve a shared goal. What it does: enables agent teams to collaborate like human teams on complex problem-solving.

Task Delegation Model: A system where different AI agents specialize in separate tasks and collaborate to complete projects efficiently. What it does: helps distribute workloads efficiently, mirroring human organizations.

Self-Improving AI: An AI system that refines its own knowledge and performance based on feedback and experience. What it does: enables AI to get smarter over time without needing constant retraining.

Explainable AI: AI systems designed to provide clear, understandable reasoning behind their decisions. What it does: provides the transparency needed for trust and compliance in AI decision-making.
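
As a closing illustration, here is a small, hypothetical sketch tying a few of these terms together: two chained stand-in “model” calls (prompt chaining) and a plain list standing in for memory integration and vector-database recall. The functions are stubs, not any real library’s API.

```python
# Toy illustration of prompt chaining and memory integration.
# summarize() and extract_actions() stand in for model calls; the
# "memory" is a plain list rather than a real vector database.

memory: list[str] = []  # stand-in for a vector store of past turns

def summarize(text: str) -> str:
    # Hypothetical model call: condense text to its first sentence.
    return text.split(".")[0] + "."

def extract_actions(summary: str) -> list[str]:
    # Second link in the chain: turn the summary into action items.
    return [f"follow up on: {summary}"]

def recall(query: str) -> list[str]:
    # Memory integration: retrieve past items related to the query.
    # A vector database would match on embeddings; we match on words.
    return [m for m in memory if any(w in m for w in query.split())]

transcript = "The team agreed to pilot an AI agent for travel booking. Budget TBD."
summary = summarize(transcript)     # chain step 1
actions = extract_actions(summary)  # chain step 2, fed by step 1
memory.append(summary)              # store for later recall

print(actions)
print(recall("travel booking"))
```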