AI Agents

An AI agent is a software system that can interpret a goal, decide what to do next, use available tools, and iterate toward an outcome.

That is more than simple text generation. A useful agent has a loop, access to external systems, and rules for when to continue, stop, or ask for help.

A system starts to look agentic when it can do most of the following:

  • accept a goal instead of only a single direct command
  • decide between multiple next actions
  • call tools such as APIs, databases, browsers, or code runtimes
  • inspect results and revise its plan
  • keep enough state to continue across multiple steps

Most agent systems follow a variation of this cycle:

  1. Understand the goal.
  2. Decide the next step.
  3. Use a tool or generate an action.
  4. Observe the result.
  5. Continue, revise, or stop.
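
The cycle above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: `run_agent`, `toy_decide`, and the `lookup` tool are hypothetical names, and `toy_decide` stands in for a real model call.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class Step:
    action: str       # what the agent did
    observation: str  # what came back


def run_agent(
    goal: str,
    decide: Callable[[str, List[Step]], Tuple[str, str]],
    tools: Dict[str, Callable[[str], str]],
    max_steps: int = 5,
) -> List[Step]:
    """Loop: decide, act, observe, until 'stop' or the step budget runs out."""
    history: List[Step] = []
    for _ in range(max_steps):
        action, arg = decide(goal, history)    # decide the next step
        if action == "stop":                   # the agent chose to stop
            break
        observation = tools[action](arg)       # use a tool
        history.append(Step(f"{action}({arg})", observation))  # observe the result
    return history


# Hypothetical stand-in for a model: look something up once, then stop.
def toy_decide(goal: str, history: List[Step]) -> Tuple[str, str]:
    if not history:
        return "lookup", goal
    return "stop", ""


tools = {"lookup": lambda query: f"result for {query!r}"}
history = run_agent("capital of France", toy_decide, tools)
```

The `max_steps` cap is the simplest stop rule: even if the decision function never returns `"stop"`, the loop ends.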

This is why pages about Agentic AI and the Spectrum of Agency belong next to this one. They describe the same idea at different levels of abstraction.

The model handles reasoning, generation, classification, and planning. It is the decision engine, but it is not the whole system.

Tools let the agent do something outside the model itself:

  • search the web
  • query a database
  • send an email
  • execute Python or shell code
  • read or write files
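
One common way to expose tools is a registry that maps a name to a callable, so the model only has to pick a name and an argument. The sketch below assumes string-in, string-out tools; `TOOLS`, `tool`, `call_tool`, and the two example tools are hypothetical and deliberately harmless.

```python
from typing import Callable, Dict

# Registry of available tools, keyed by the name the model would emit.
TOOLS: Dict[str, Callable[[str], str]] = {}


def tool(name: str):
    """Decorator that registers a function as a named tool."""
    def register(fn: Callable[[str], str]) -> Callable[[str], str]:
        TOOLS[name] = fn
        return fn
    return register


@tool("word_count")
def word_count(text: str) -> str:
    return str(len(text.split()))


@tool("shout")
def shout(text: str) -> str:
    return text.upper()


def call_tool(name: str, arg: str) -> str:
    """Dispatch a tool call; return errors as text so the agent can observe them."""
    if name not in TOOLS:
        return f"error: unknown tool {name!r}"
    try:
        return TOOLS[name](arg)
    except Exception as exc:  # a failing tool should not crash the loop
        return f"error: {exc}"
```

Returning errors as observations, instead of raising, lets the agent see that a call failed and revise its plan.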

The agent needs enough context to avoid repeating itself and to keep track of what it has already done.

Examples of that context:

  • current conversation state
  • retrieved documents
  • intermediate results
  • a task list or execution history
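
As a rough sketch, the kinds of context listed above could live in one state object that is passed through every step. The `AgentState` class and its field names are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class AgentState:
    goal: str
    messages: List[str] = field(default_factory=list)       # conversation state
    documents: List[str] = field(default_factory=list)      # retrieved documents
    results: Dict[str, str] = field(default_factory=dict)   # intermediate results
    history: List[str] = field(default_factory=list)        # execution history

    def record(self, action: str, result: str) -> None:
        """Store what was done and what it produced."""
        self.history.append(action)
        self.results[action] = result

    def already_done(self, action: str) -> bool:
        """Lets the agent avoid repeating a step it has already taken."""
        return action in self.history


state = AgentState(goal="summarize the report")
state.record("fetch_report", "report text")
```

Before choosing a step, the loop can call `state.already_done("fetch_report")` and skip work it has already completed.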

Production agents need rules:

  • what actions are allowed
  • which tools require confirmation
  • when the agent must stop
  • what counts as failure or escalation
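
These rules can be written down as data and checked before every tool call. The following is a sketch under assumed names (`Policy`, `Verdict`, `check`) and assumed thresholds, not a standard API.

```python
from dataclasses import dataclass
from enum import Enum
from typing import FrozenSet


class Verdict(Enum):
    ALLOW = "allow"      # run the tool
    CONFIRM = "confirm"  # ask a human first
    DENY = "deny"        # tool is not permitted
    STOP = "stop"        # halt and escalate


@dataclass(frozen=True)
class Policy:
    allowed: FrozenSet[str]
    needs_confirmation: FrozenSet[str]
    max_steps: int = 10     # when the agent must stop
    max_failures: int = 3   # what counts as failure or escalation


def check(policy: Policy, tool: str, steps_taken: int, failures: int) -> Verdict:
    """Decide whether the next tool call may proceed."""
    if steps_taken >= policy.max_steps or failures >= policy.max_failures:
        return Verdict.STOP
    if tool not in policy.allowed:
        return Verdict.DENY
    if tool in policy.needs_confirmation:
        return Verdict.CONFIRM
    return Verdict.ALLOW


policy = Policy(
    allowed=frozenset({"search", "send_email"}),
    needs_confirmation=frozenset({"send_email"}),
)
```

Keeping the policy outside the prompt means the rules hold even when the model ignores its instructions.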

| Pattern | What it is good for |
| --- | --- |
| Tool-using assistant | Focused workflows with API calls or retrieval |
| Planner-worker | Breaking a task into substeps and delegating work |
| Research agent | Search, retrieval, synthesis, and citation |
| Coding agent | Code changes, testing, debugging, and repo workflows |
| Multi-agent team | Separating planner, executor, reviewer, or critic roles |

Use an agent when the problem requires:

  • multiple decisions in sequence
  • tool use
  • branching logic
  • recovery from partial failure
  • coordination across several steps

Do not default to an agent for tasks that are really single-shot generation problems. If a direct prompt plus retrieval works, that is usually simpler and more reliable.

Agents fail in ways standard apps do not.

Watch for:

  • hallucinated tool assumptions, such as calling tools or parameters that do not exist
  • repeated loops
  • brittle prompts
  • hidden state drift
  • poor cost control
  • weak observability
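
Two of these failure modes, repeated loops and poor cost control, can be caught with a small guard that runs before each tool call. The `LoopGuard` class, its thresholds, and the per-call cost figures are illustrative assumptions.

```python
from collections import Counter
from typing import Optional


class LoopGuard:
    """Halts the agent on repeated identical calls or an exceeded budget."""

    def __init__(self, max_repeats: int = 2, max_cost: float = 1.0):
        self.max_repeats = max_repeats
        self.max_cost = max_cost
        self.seen: Counter = Counter()  # how often each (tool, arg) pair was called
        self.spent = 0.0                # running cost in arbitrary units

    def check(self, tool: str, arg: str, cost: float) -> Optional[str]:
        """Return a reason to halt, or None if the call may proceed."""
        self.seen[(tool, arg)] += 1
        self.spent += cost
        if self.seen[(tool, arg)] > self.max_repeats:
            return "repeated call"
        if self.spent > self.max_cost:
            return "budget exceeded"
        return None


guard = LoopGuard(max_repeats=2, max_cost=1.0)
first = guard.check("search", "agents", 0.1)
second = guard.check("search", "agents", 0.1)
third = guard.check("search", "agents", 0.1)  # third identical call trips the guard
```

A guard like this does not fix the underlying prompt or planning problem, but it turns a silent runaway loop into an explicit, loggable stop.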

Useful production disciplines include:

  • execution tracing
  • step-level logging
  • replayable test cases
  • human approval gates for high-risk actions
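
Step-level logging and replayable test cases fit together naturally: if every tool call is recorded as a structured event, a saved trace can later be re-run against the tools. The sketch below assumes deterministic string-in, string-out tools; `Trace` and `replay` are hypothetical names.

```python
import json
import time
from typing import Callable, Dict, List


class Trace:
    """Collects one structured event per agent step."""

    def __init__(self) -> None:
        self.events: List[dict] = []

    def log(self, step: int, tool: str, arg: str, result: str) -> None:
        self.events.append(
            {"step": step, "tool": tool, "arg": arg, "result": result, "ts": time.time()}
        )

    def dump(self) -> str:
        """Serialize as JSON Lines, one event per line."""
        return "\n".join(json.dumps(event) for event in self.events)


def replay(trace_text: str, tools: Dict[str, Callable[[str], str]]) -> List[int]:
    """Re-run each recorded call and return the steps whose result changed."""
    mismatches = []
    for line in trace_text.splitlines():
        event = json.loads(line)
        if tools[event["tool"]](event["arg"]) != event["result"]:
            mismatches.append(event["step"])
    return mismatches


tools = {"upper": str.upper}
trace = Trace()
trace.log(1, "upper", "hello", tools["upper"]("hello"))
saved = trace.dump()
```

Calling `replay(saved, tools)` returns an empty list while the tools behave as recorded, and the affected step numbers once a tool's behavior changes.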