AI Agents are turning out to be the next big thing in the AI space. These technological marvels promise to automate a variety of repetitive tasks, engage in complex multi-step behavior, and actively pursue goals and objectives - all with little or no human intervention. But while the workings of AI chatbots are confusing enough, AI Agents are perhaps even more confounding, especially since they use many of the same technologies (and then some). Yet it is important to get a handle on what they are, not least to understand how they can benefit businesses, corporations, and ultimately, the end user.
What are AI Agents?
The term 'AI Agent' refers to any digital system that is actively pursuing goals and objectives, initiating a complex set of actions, making context-aware decisions, and learning from the environment. Leveraging a variety of advanced technologies (such as LLMs, Neural Networks, Natural Language Processing, Machine Learning, etc.), AI Agents are designed to function autonomously and have the capacity to adapt to changing requests and user patterns.
Exactly what level of agency these AI Agents have is ultimately up to the developers and the tools that are available to them, the organization deploying them, and the end user that tells them what they can or can't do. AI Agents can take on specific goal-oriented tasks, leverage data sources, use complementary tools and systems, and even communicate with other AI agents. Given the cutting-edge technologies that underpin them, these AI agents can learn 'on-the-job' like a new employee and be given additional autonomy as they prove their proficiency.
Once an overarching task is assigned, the agent can create a plan of action and divide it into individual tasks that need to be accomplished toward the goal. However, creating a plan isn't always necessary. For simple actions, such as ordering a sandwich online, the AI agent can use predefined actions and simply improve upon them if required. Continuing with our sandwich ordering example, this can include things like looking up the nearest open restaurants that sell sandwiches, confirming the user's dietary preferences, relaying any issues back to the user (no pickle available, for example), acquiring payment details, and finally placing the order. But this is only a very generic example of what AI agents can do.
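To make the idea of predefined actions concrete, here is a minimal Python sketch of the sandwich order as an ordered plan of steps. Everything in it is hypothetical and for illustration only: the OrderContext fields, the step functions, and the "Corner Deli" lookup are stand-ins, not part of any real agent framework or ordering service.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class OrderContext:
    """What the agent knows about the order so far (hypothetical fields)."""
    dietary_preference: str
    restaurant: Optional[str] = None
    issues: List[str] = field(default_factory=list)
    paid: bool = False
    placed: bool = False


def find_restaurant(ctx: OrderContext) -> None:
    # Stand-in for a real search of nearby open restaurants.
    ctx.restaurant = "Corner Deli"


def confirm_preferences(ctx: OrderContext) -> None:
    # Stand-in availability check: this made-up shop is out of pickles.
    if ctx.dietary_preference == "extra pickle":
        ctx.issues.append("no pickle available")


def take_payment(ctx: OrderContext) -> None:
    # Stand-in for collecting payment details.
    ctx.paid = True


def place_order(ctx: OrderContext) -> None:
    # The order only goes through once a restaurant is chosen and payment is taken.
    ctx.placed = ctx.paid and ctx.restaurant is not None


# The predefined plan: an ordered list of actions the agent walks through.
PLAN: List[Callable[[OrderContext], None]] = [
    find_restaurant,
    confirm_preferences,
    take_payment,
    place_order,
]


def run_plan(ctx: OrderContext, plan: List[Callable[[OrderContext], None]]) -> List[str]:
    """Run each step in order and return the names of the steps that ran."""
    log = []
    for step in plan:
        step(ctx)
        log.append(step.__name__)
    return log


ctx = OrderContext(dietary_preference="extra pickle")
steps_run = run_plan(ctx, PLAN)
print(steps_run)
print("Issues to relay to the user:", ctx.issues)
```

A real agent would decide at run time whether to reorder, repeat, or skip steps (for example, pausing to ask the user about the missing pickle), which is where planning takes over from a fixed list.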
Different types of AI Agents
There isn't just a single AI Agent type. As mentioned earlier, the term applies to any digital system that can initiate complex multi-step processes, learn from previous experience, and make decisions toward fulfilling a set goal. As such, there are several different types of AI Agents that fit the bill. Here's a look at a few broad types:
- Simple reflex AI agents: These are the simplest forms of AI agents that are designed to fulfil a singular predefined task. Unlike other AI agents (and chatbots), they do not have any 'memory' and base their actions solely on predefined rules. The agents can successfully accomplish their set tasks only if other conditions are met and they have access to the required information. Many basic chatbots and sensor-based electronics such as automatic thermostats fall under the category of simple reflex AI agents.
- Model-based reflex AI agents: These AI agents are slightly more advanced in that they maintain an internal model of their environment. Along with access to sensory data, these agents also have a memory that allows the model to be updated over time. The actions of these agents are still governed by predefined rules, but the added dimension of having a memory enables them to operate in a dynamic environment. Cleaning robots are a good example of model-based reflex AI agents, as they must overcome hurdles while cleaning as well as update their memory of areas that have already been cleaned.
- Goal-oriented AI agents: These AI agents are governed by their goal or set of goals while also having an internal model of their environment. These are the agents that plan their actions before executing them. More often than not, this allows them to be more effective in their tasks than the previously mentioned AI agent types. Robotic arms in factories, chess-playing AI, and self-driving cars that have to search and plan their moves before executing them are all examples of goal-based AI agents.
- Utility-oriented AI agents: These AI agents are a more advanced version of goal-oriented AI agents that not only plan their actions before executing them but also prioritize efficiency of action. The agents have an internal efficiency value (a utility) that they try to maximize while also satisfying their goals. The parameters that feed into this utility value include time taken, progress towards the goal, levels of complexity, etc. Financial trading agents, navigation systems that prioritize fuel efficiency, and portfolio managers are some examples of utility-based agents.
- Learning AI agents: These are AI agents that, along with their ability to be goal- and utility-oriented, can also learn from new experiences. As they are used and their knowledge base grows, they become ever more efficient at solving problems and improve with feedback. This allows them to function in unfamiliar and novel situations, making them ideal for generating personalized e-commerce and entertainment recommendations, filtering spam, and software testing environments.
- Hierarchical AI agents: These AI agentic systems involve multiple agents structured in a hierarchy where agents up top control and manage the actions of the agents down below. While multi-agent systems can involve several autonomous agents working independently, hierarchical agent systems bring better information flow and coordination that allows the system to break down complex tasks into simpler ones and assign the work accordingly.
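The differences between the first few agent types are easier to see side by side. The sketch below is a toy illustration in Python, not a production design: the temperature thresholds, the grid-cell representation of a floor, and the route-scoring weights are all assumptions made up for this example.

```python
from typing import Dict, List, Set, Tuple


def thermostat_reflex(temperature_c: float) -> str:
    """Simple reflex agent: current reading in, action out, no memory."""
    if temperature_c < 19.0:   # hypothetical lower threshold
        return "heat_on"
    if temperature_c > 23.0:   # hypothetical upper threshold
        return "heat_off"
    return "hold"


class CleaningRobot:
    """Model-based reflex agent: still rule-driven, but it remembers the floor."""

    def __init__(self) -> None:
        # Internal model of the environment: cells already cleaned.
        self.cleaned: Set[Tuple[int, int]] = set()

    def act(self, cell: Tuple[int, int], blocked: bool) -> str:
        if blocked:
            return "go_around"
        if cell in self.cleaned:
            return "skip"
        self.cleaned.add(cell)  # update the model after acting
        return "clean"


def pick_route(routes: List[Dict]) -> str:
    """Utility-based agent: score every option and take the best one."""

    def utility(route: Dict) -> float:
        # Hypothetical trade-off: one litre of fuel "costs" as much as 10 minutes.
        return -(route["minutes"] + 10.0 * route["fuel_litres"])

    return max(routes, key=utility)["name"]


robot = CleaningRobot()
robot_actions = [
    robot.act((0, 0), blocked=False),  # first visit: clean
    robot.act((0, 0), blocked=False),  # remembered as cleaned: skip
    robot.act((0, 1), blocked=True),   # obstacle in the way: go around
]
routes = [
    {"name": "motorway", "minutes": 30, "fuel_litres": 4.0},
    {"name": "back_roads", "minutes": 40, "fuel_litres": 2.5},
]
print(thermostat_reflex(17.5), robot_actions, pick_route(routes))
```

The thermostat gives the same answer for the same reading every time, the robot's answer depends on what it has already done, and the route picker trades off competing costs rather than just reaching the destination.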
How are AI Agents different from AI chatbots?
Both AI agents and AI chatbots leverage AI technologies (more on these technologies in a later section) to understand human languages and take actions based on what's being prompted. But they're not the same.
Having an understanding of AI chatbots is a good starting point for the workings of AI agents. But AI agents are a step ahead of AI chatbots in more ways than one. Here are a few ways in which AI Agents are different from AI chatbots:
- Nuanced approach to tasks: While AI chatbots are great at providing text-based responses to simple queries, AI agents can handle complex instructions and execute multi-step processes to go with them. They're also much more nuanced in understanding the context of the request and the environment they're operating in, breaking down complex tasks into smaller manageable chunks, accommodating feedback, and changing their methods in real-time.
- Goal-oriented: AI chatbots are designed to fulfil specific requests. But when asked to deliver on anything outside of their programming, they can leave you wanting. On the other hand, AI Agents are better equipped to handle complex, multipronged tasks that require completing several intermediary steps. They're also not limited to a single platform and can adapt to different services in their hunt for the steps that lead to the desired outcome. In short, they're not tied to predefined patterns but actually base their actions on what works.
- Learning and Growth: Although AI chatbots can remember your preferences as they go along, their learning is restricted to their narrow domains. AI Agents, however, are more dynamically designed. They can adapt and evolve as they go along and use their learnings to inform their choices going forward. This allows them to be more dynamically suited for novel situations and grow beyond their initial programming.
- Ever-expanding knowledge base: AI chatbots already have a wide knowledge base. But their patterned learnings are limited to the data they were fed during training. To expand on it would necessitate the development of newer models. AI Agents, on the other hand, are more flexible. Not only can they access the vast data sets of LLMs, the latest web content, and resources from third parties, but their programming allows them to infer from available data, combine data streams, and even produce new knowledge.
Technologies that underpin AI Agents
AI agents rely on a mix of several different technologies working in tandem. Be it the ability to understand human languages, detecting subtle contextual clues, or recognizing the most salient patterns in a deluge of data, these cutting-edge technologies give AI agents the power that no other digital system can boast of. Here's a closer look at the technologies that make AI agents tick.
Natural Language Processing
As the name suggests, Natural Language Processing (NLP) is the technology that allows a digital system to process natural human language and interpret its complexities without necessarily having to rely on keywords and commands. NLP relies on other technologies like machine learning, neural networks, and LLMs for textual analysis. It's a key component in the tapestry that supports AI agents and ultimately allows them to understand the intent of a user request and have meaningful conversations.
Large Language Models
Thanks to AI chatbots, Large Language Models (or LLMs) are all the rage these days. To be precise, LLMs are a type of machine learning model that lies within the broader NLP umbrella. LLMs are trained on huge data sets, which allows them to understand and generate complex linguistic patterns. This is what enables AI agents to understand natural languages and have a natural dialogue that is contextually relevant, nuanced, and allows for the retrieval of prior information (remembering).
Machine Learning
Machine Learning (ML) is another AI technology, one that specifically enables machines to learn and evolve using the available data and without any explicit human intervention. ML enables AI agents to adapt to new situations, make informed predictions, and deliver results that get more and more refined and accurate over time.
Neural Networks
Finally, we have Neural Networks (NN). These are, loosely speaking, the brains behind the whole AI operation, as they're modelled after the neural patterns of our own brains. Similar to the neurons in our brains, these networks pass a wide array of data points through layers of interconnected nodes that help classify data. This mechanism allows NNs to see how all the different pieces fit together and create hierarchies of information. All of it is done automatically and is continually updated as additional data comes in. Leveraging this technology is what enables AI agents to detect patterns in complex data sets and make decisions that are sophisticated, nuanced, and relevant to the task at hand.
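As a rough illustration of how layers of simple nodes can build up a hierarchy of information, here is a tiny two-layer network in plain Python. The weights and biases are hand-picked assumptions chosen for this example; a real neural network learns its weights from data rather than having them written in by hand.

```python
from typing import List


def step(value: float) -> int:
    """Activation function: the neuron fires (1) only if its input is positive."""
    return 1 if value > 0 else 0


def neuron(inputs: List[float], weights: List[float], bias: float) -> int:
    """One node: a weighted sum of its inputs, plus a bias, then the activation."""
    return step(sum(i * w for i, w in zip(inputs, weights)) + bias)


def xor_network(x1: int, x2: int) -> int:
    """Two hidden neurons feed one output neuron (hand-picked weights)."""
    # Hidden layer: each neuron detects a simple pattern in the raw inputs.
    h_or = neuron([x1, x2], [1.0, 1.0], -0.5)    # fires if at least one input is on
    h_and = neuron([x1, x2], [1.0, 1.0], -1.5)   # fires only if both inputs are on
    # Output layer: combines the simpler patterns into a more complex one,
    # "at least one input is on, but not both".
    return neuron([h_or, h_and], [1.0, -1.0], -0.5)


results = {(a, b): xor_network(a, b) for a in (0, 1) for b in (0, 1)}
print(results)
```

No single neuron here can compute "one or the other but not both" on its own; the pattern only emerges once the second layer combines what the first layer detected.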
AI Agents Benefits and Challenges
The sophistication of AI technologies equips AI agents with remarkable power which, when used well, can change our relationship with technology for the better. But being a relatively recent AI technology, they're also beset with several challenges. Here's a look at some of the inherent benefits of AI Agents and the risks and limitations that they currently face.
Benefits of AI Agents
- Accuracy - AI Agents do away with many of the errors that come with a human operator. Since they're able to handle vast swathes of data, they also tend to be more accurate when executing tasks and making predictions. As long as the data being fed is up to date and accurate, the actions performed by AI agents will tend toward better accuracy and precision.
- Available anytime, anywhere - Unlike human agents, AI agents can work round the clock. Depending on the agent type, they can work autonomously and with little hand-holding. When deployed to the cloud, they can also be accessed by intended parties anywhere.
- Efficiency - Thanks to task automation, AI Agents make work more efficient. This is true for AI agents across the board (and not just utility-based agents) as it allows their human masters to dedicate their time to more demanding tasks. By freeing up their human counterparts from menial and repetitive tasks, AI agents allow for better allocation of limited resources, increased efficiency, and ultimately, improved productivity.
- Data Analysis - AI Agents are masters of data analysis. Not only can they detect patterns in huge sets of data and find new ways to interpret them, they also excel at extrapolating and making predictions. This isn't just useful for anticipating user behavior, but also creating long-term plans, foreseeing challenges, forestalling equipment failure, and staying ahead of the curve.
- High performance and consistency at lower cost - AI agents produce high quality responses that are sophisticated, nuanced, and personalized. Regardless of their type and the industry in which they're used, their ability to continually learn and synthesize new information with the old makes them a great asset. They can be trained to follow set procedures which ensures that they remain consistent in their tasks. They also help cut down on costs by automating tasks and finding new ways of improving operational efficiency.
- Scalability - AI tools may be hard to scale. But they provide a much better alternative to expanding upon existing resources by employing more humans or implementing other technologies. When done right, AI agents can be scaled over time in a manner that allows other workers to get acquainted with the technology while using it to solve problems.
Risks and challenges
- Security - As with AI chatbots, there's always a concern about the security of the company's and the user's data. Since AI agents have access to transactional histories and are in the know of everything that's required for them to operate, they can often be a security liability. It's important to not only have full control of the AI agents but also afford them the security that's required to ward off external attacks.
- Data and agent dependency - AI agents often require access to high-quality data for their operations. At the same time, much of this data has to be cleaned and refined for the AI agents to do their best work. Depending on the task, certain AI agents also have to rely on other AI agents that complement their work. When any one of these complementary agents stops working, other AI agents are also at risk of failing.
- Overfitting and Underfitting - The model that underpins the AI agent can fail to make good predictions if it overfits to the training data and is unable to generalize to new data. On the other hand, models can also underfit the data when they're too simple or not trained for long enough, and thus fail to find any meaningful connections in new sets of data. While overfitting models are at risk of failing to generalize, underfitting models are at risk of becoming too biased. The sought-after sweet spot between the two is hard to achieve but is what is required for AI agents to make accurate predictions.
- Resource hungry - AI agents are resource hungry beasts that take up copious amounts of computational power. Like their chatbot counterparts, AI agents depend on the continuous provision of computational and storage resources to function properly. When there's a dearth of these resources, AI agents will simply fail or, if they're tuned to operate temporarily with fewer resources, will deliver sub-optimal results.
- Complexities - Building AI agents isn't just resource intensive but also an intricately complex endeavor. While AI Agents that work on specific tasks are relatively easy to build, the level of complexity increases dramatically for agents that handle increasingly complex tasks. Add to that their resource-intensive nature and AI agents can quickly become hard to handle, be it to build, implement, or maintain them.
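The overfitting and underfitting trade-off mentioned above can be seen in a small experiment. The following is a sketch under stated assumptions: the data is synthetic (noisy samples of a made-up rule, y = 2x + 1), and the three "models" are deliberately simple stand-ins. Note that it varies how flexible the model is rather than how long it is trained, which is another common way the same problem shows up.

```python
import random
from typing import Callable, List, Tuple

Point = Tuple[float, float]
Model = Callable[[float], float]


def make_data(n: int, rng: random.Random) -> List[Point]:
    """Noisy samples of a hypothetical ground truth, y = 2x + 1."""
    data = []
    for _ in range(n):
        x = rng.uniform(0.0, 10.0)
        data.append((x, 2.0 * x + 1.0 + rng.gauss(0.0, 1.0)))
    return data


def fit_mean(train: List[Point]) -> Model:
    """Underfits: ignores x entirely and always predicts the average y."""
    mean_y = sum(y for _, y in train) / len(train)
    return lambda x: mean_y


def fit_line(train: List[Point]) -> Model:
    """Matches the shape of the data: ordinary least-squares straight line."""
    n = len(train)
    mean_x = sum(x for x, _ in train) / n
    mean_y = sum(y for _, y in train) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in train)
             / sum((x - mean_x) ** 2 for x, _ in train))
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept


def fit_memorize(train: List[Point]) -> Model:
    """Overfits: repeats the y of the closest training point, noise and all."""
    return lambda x: min(train, key=lambda p: abs(p[0] - x))[1]


def mse(model: Model, data: List[Point]) -> float:
    """Mean squared error of a model's predictions on a data set."""
    return sum((model(x) - y) ** 2 for x, y in data) / len(data)


rng = random.Random(0)  # fixed seed so the run is repeatable
train, test = make_data(200, rng), make_data(200, rng)

scores = {}
for name, fit in [("mean", fit_mean), ("line", fit_line), ("memorize", fit_memorize)]:
    model = fit(train)
    scores[name] = (mse(model, train), mse(model, test))
    print(f"{name:>8}: train MSE = {scores[name][0]:.2f}, test MSE = {scores[name][1]:.2f}")
```

The memorizing model is perfect on the data it has seen but does noticeably worse on new data, the mean-only model is poor on both, and the straight line, which matches the true shape of the data, holds up on both.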
What is big tech doing with AI Agents?
The insane capabilities of AI Agents aren't lost on the tech giants of the world. All the major players, from Microsoft to Google to Apple and OpenAI, are in it for the long haul. New players have also emerged looking to develop a single platform where AI agents can be deployed and accessed across devices. Here's a look at what each of these players plans to do with AI Agents.
/dev/agents
/dev/agents is a new startup aimed at developing a cloud-based operating system "for trusted agents to work with users across all of their devices," according to co-founder and CEO, David Singleton. The goal is to build the next-gen operating systems for AI agents, along with "new UI patterns, a reimagined privacy model, and a developer platform" - in short, the complete infrastructural caboodle that comes with a new OS - "that makes it radically simpler to build useful agents."
The company is the brainchild of several Android leaders, including former VP of Android product management Hugo Barra and Android founder Ficus Kirkpatrick, among others.
Microsoft
Microsoft is giving Copilot several agentic capabilities that will see it evolve "from copilots that work with you, to copilots that work for you." That means instead of just being an idle companion that waits for your instructions, Copilot will be more proactive in automating tasks, monitoring emails, and doing things that require manual action. You know, like a virtual employee.
At the time of writing, this new capability is being tested and is available as a public preview for Copilot Studio users. But there's a lot more in store in the upcoming weeks.
Businesses and developers will soon have the ability to "create autonomous agents with Copilot Studio". On top of that, Microsoft is also looking to introduce 10 new autonomous agents in Dynamics 365 that will cover sales, service, finance, and supply chain uses.
Given the risk of AI hallucinations and rogue behavior, the autonomous capabilities of copilots still need to be fine-tuned and fitted with controls that give human reviewers a final say. Much will depend on Copilot's performance in these limited previews before mass deployment can be part of the conversation.
Google
Google isn't standing still either. They're reportedly working on their own version of an AI agent that will automate everyday web-based tasks like researching, online shopping, booking travel, etc.
Codenamed 'Project Jarvis', and now officially known as Project Mariner, this Gemini-powered agentic capability will be confined to the browser (that too, only Chrome) and will purportedly automate web-based tasks that require capturing and interpreting screenshots and taking the relevant actions such as browsing the web, retrieving data, entering text, and clicking the right options. Such on-screen awareness is already available with Apple Intelligence. But Google expects to take this a step further with automated actions that require little to no human intervention (except, of course, for a final review).
Because of a recent leak, Project Jarvis is getting fast-tracked and will most likely get a demo in December, though that, like most things AI, is subject to change.
OpenAI
OpenAI is developing its own version of an autonomous AI agent - but one that will take control of the whole computer, and not be limited to just the browser (like Google's Jarvis).
Codenamed 'Operator', OpenAI's agentic system will be able to use a computer on the owner's behalf and take such actions as writing code, booking flights, etc. Details about the project are still scant, though the company is reportedly looking at an early 2025 release (January) of Operator as a research preview and developer tool.
Apple
Apple's research team recently introduced CAMPHOR (Collaborative Agents for Multi-input Planning and High-Order Reasoning on device). As part of the broader Apple Intelligence framework, CAMPHOR uses several specialized agents with a higher-level agent functioning as the coordinator. The hierarchy of stacked agents working in coordination with each other will allow CAMPHOR to cover a variety of complex tasks. By utilizing the user's personal context, on-device data, as well as external model data, the top agent will be able to break down complex tasks into multi-step processes that are then handled by specialized agents down the pecking order.
Initially, CAMPHOR will be limited to single interactions. Over time, however, CAMPHOR's increasing sophistication will eventually lead to a multi-agent setup that can handle multiple complex queries. Because of limited server-based interactions, Apple's take on AI agents could yield a faster and more private system that takes care of most user requests on device.
While other companies are developing AI agents that automate both general-user as well as enterprise-level tasks, Apple is sticking with using agentic capabilities for the end user only (for now at least).
Anthropic
Anthropic is not lagging behind either. Its newly introduced 'computer use' feature will allow developers to "direct Claude to use computers the way people do - by looking at a screen, moving a cursor, clicking buttons, and typing text". Powered by Claude 3.5 Sonnet (Anthropic's latest AI model), the company notes that the feature is still experimental, though its API is already in public beta.
A comprehensive directory of AI Agents
While big tech's AI Agents are still in the works and may take some time to be deployed with any level of functional autonomy, denizens of the web have already created several task-specific AI agents. Many of these are available for free while some of them may require a small subscription fee for support. If you're interested in AI Agents, we highly recommend you check out the different AI Agents directories that list a collection of the best AI agents from across the internet. Here are a few of them for you to dive into:
Although these directories will often list several overlapping AI Agents, it's worth checking them out in case you can't find what you're looking for in one directory.
AI Agents mark a pivotal change!
After the rise of generative AI and AI chatbots, AI Agents are proving to be the next big thing in the AI space. With exceptional autonomous capabilities, when integrated well, AI Agents can change the nature of work and not only automate simple, mundane activities but also execute complex, multi-step tasks with little supervision.
Yes, the road to error- and hallucination-free AI agents is long and fraught with unforeseen challenges. And yes, the companies are bleeding money chasing the level of AI technology that will eventually generate some returns. But this is to be expected for those at the helm of technological advancements. However, if AI agents do manage to become functional companions with their own agency (with some oversight), they could usher in a revolution that could reach beyond the tech space.
2025 could well be the year that sees AI Agents go mainstream and change the way we interact with our gadgets, be it at the workspace or your personal abode.