Overview of AI Agents

Agents are effectively LLMs that can also take actions. As a very simple example, we could give an LLM access to:

  • Internet search
  • The ability to write and execute code
  • A calculator

The LLM's ability to use these 'tools' makes it an agent.

More broadly, AI Agents can interact with tools, websites, or systems to get things done, such as:

  • Filling out forms
  • Gathering data
  • Automating workflows

This “action” capability transforms an LLM’s one-off replies into ongoing processes that advance toward a user’s objective without requiring constant guidance.

Why Do Agents Matter?

Well, they can do stuff for you that ordinarily takes a lot of time! E.g. buying flights or renewing a license. Agents will also eventually be able to do things in the "real world".

The capability of Agents to take action presents interesting use cases, such as embodied LLMs. This means connecting real-world devices, like robots, to LLMs, producing Agents which are able to take real-world actions.

This significantly increases the impact of attacks against LLMs. An attacker could potentially induce an Agent to cause real-world harm. For example, if an embodied, humanoid LLM is walking around on the street and someone gives it the middle finger, would it start a fight with them? We don't know!

By conducting AI Red Teaming on these AI Agents, we can better understand the risks associated with them. This gives us the knowledge necessary to design these systems with safety in mind to reduce any risk related with their capabilities.