AI Dictionary
Advanced · ~2 min read · #agent #tools #autonomy

AI Agent

An LLM that takes actions

An autonomous system that uses an LLM as its brain to call tools, plan, and reach a goal step by step.

[Diagram: LLM + tools = agent. The agent plans and acts, calling tools (search, code, API, files) in a loop until the goal is met.]
Definition

A plain LLM returns one response. An agent takes a goal, plans, calls tools (web search, code execution, APIs, files), reads the result, updates the plan, and continues. The loop runs until the goal is met.

The core agent loop:
  1. Plan: the LLM looks at the current state and decides what to do.
  2. Act: call a tool (function calling).
  3. Observe: read the tool's output.
  4. Reflect: are we closer to the goal, or do we need a different path?
  5. Repeat.
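The loop above can be sketched in a few lines of Python. Everything here (`plan`, `call_tool`, `goal_met`, the toy `add_one` tool) is a hypothetical stub standing in for a real model API and real tools:

```python
# Minimal agent-loop sketch. The "LLM" and the tool registry are stubs;
# a real agent would call a model API and dispatch real tools.

def plan(state):
    # Stub LLM: look at the current state, decide the next action.
    return {"tool": "add_one", "args": {"n": state["n"]}}

def call_tool(action):
    # Stub tool registry: dispatch the chosen action to a tool function.
    tools = {"add_one": lambda n: n + 1}
    return tools[action["tool"]](**action["args"])

def goal_met(state):
    return state["n"] >= 3

def run_agent(state, max_turns=20):
    for _ in range(max_turns):      # cap turns so a stuck agent can't loop forever
        if goal_met(state):
            return state            # stop when the goal is met
        action = plan(state)        # 1. plan
        result = call_tool(action)  # 2. act
        state = {"n": result}       # 3. observe: fold the result into state
        # 4. reflect happens implicitly inside the next plan() call
    raise RuntimeError("turn budget exhausted")

print(run_agent({"n": 0}))  # → {'n': 3}
```

The `max_turns` cap is not optional decoration: it is the same turn budget the pitfalls section below recommends.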

Frameworks: LangGraph, CrewAI, AutoGen, Anthropic's Computer Use. Production examples: Claude Code, Cursor Agent, GitHub Copilot Workspace, Devin, Manus, most enterprise "AI assistants."

Analogy

LLM = a consultant giving you one sentence of advice. Agent = an intern who actually goes and does it — opens the file, picks up the phone, writes the email, double-checks, and reports back "done." It sometimes turns down the wrong street, but it doesn't forget the goal.

Real-world example

"Run all tests in this repo, fix the failing ones, open a PR" → Claude Code (an agent):
  1. Runs npm test via the bash tool.
  2. Sees 3 tests failed.
  3. Reads each affected file with the read tool.
  4. Spots the bug and fixes it with the edit tool.
  5. Runs npm test again — all green.
  6. Runs git commit + gh pr create to open the PR.
  7. Tells the user "done, PR #123 is up."

The user types no intermediate commands — the agent runs 10+ steps of the plan → tool call → observation loop on its own. Every step is an LLM call; every tool call hits an external system.

A deeper look
[Diagram: the agent loop. GOAL (not done?) → PLAN (what to do?) → ACT (call a tool) → OBSERVE (read the result) → REFLECT (closer to the goal?); repeats until the goal is met or the budget runs out.]
When to use
  • Multi-step tasks (planning required)
  • Open-ended problems — the user can't enumerate every step
  • Tool use is essential (code execution, APIs, web search)
  • Long-horizon tasks — workflows that may run for hours
When not to use
  • Single-shot Q&A — agent overhead is unnecessary
  • Security-critical actions (payment, deletion) — don't let the agent decide; require human approval
  • High volume + low margin — every step is an LLM call = expensive
  • Narrow, deterministic workflows — a traditional script is safer
Common pitfalls

Infinite loops

When it can't reach the goal, an agent tends to call the same tool with the same arguments over and over. Cap turns (typically 20–50), cap spend in dollars, and add loop detection.
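Both mitigations fit in a small guard object. This is a sketch; the thresholds and the `LoopGuard` name are illustrative, not from any framework:

```python
from collections import Counter

class LoopGuard:
    """Cap total turns and flag repeated identical tool calls."""

    def __init__(self, max_turns=30, max_repeats=3):
        self.max_turns = max_turns      # hard turn budget
        self.max_repeats = max_repeats  # identical-call tolerance
        self.turns = 0
        self.seen = Counter()

    def check(self, tool_name, args):
        self.turns += 1
        if self.turns > self.max_turns:
            raise RuntimeError("turn cap exceeded")
        # Same tool + same arguments counts as the same call.
        key = (tool_name, tuple(sorted(args.items())))
        self.seen[key] += 1
        if self.seen[key] > self.max_repeats:
            raise RuntimeError(f"loop detected: {tool_name} repeated")

guard = LoopGuard(max_repeats=2)
guard.check("search", {"q": "bug"})   # ok
guard.check("search", {"q": "bug"})   # ok, second identical call
# a third identical call would raise "loop detected: search repeated"
```

Calling `guard.check(...)` once per turn, before dispatching the tool, turns both failure modes into a clean error instead of a runaway bill.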

Side effects from wrong tool calls

An agent might mistakenly call 'send 1000 orders to the shipping API'. Add guardrails, a dry-run mode, and human-in-the-loop confirmation for risky tools.
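All three guardrails can be composed in one dispatch wrapper. A sketch; the risky-tool list, the tool names, and the `approve` callback are assumptions for illustration:

```python
# Risky tools never execute without an explicit yes from a human,
# and dry-run mode short-circuits them entirely.
RISKY_TOOLS = {"ship_orders", "delete_records", "charge_card"}

def execute(tool_name, args, approve, dry_run=False):
    if tool_name in RISKY_TOOLS:
        if dry_run:
            return f"[dry-run] would call {tool_name} with {args}"
        if not approve(tool_name, args):   # human-in-the-loop gate
            return f"blocked: human rejected {tool_name}"
    return f"executed {tool_name}"         # real dispatch would go here

# Deny-by-default approval: risky calls are blocked, safe calls pass.
print(execute("ship_orders", {"count": 1000}, approve=lambda t, a: False))
# → blocked: human rejected ship_orders
print(execute("run_tests", {}, approve=lambda t, a: False))
# → executed run_tests
```

The design point: the agent decides *what* to call, but a risky call only runs after a check the agent cannot bypass.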

Lack of visibility

If you can't see what the agent did, debugging is impossible. Log plan, tool call, observation — every step. Stream them to the user.
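The simplest version of this is an append-only trace of JSON lines, one per step. A sketch (the `log_step` helper and the event kinds are made up for illustration):

```python
import json
import time

def log_step(trace, kind, payload):
    """Record one agent step (plan / tool_call / observation) as a JSON line."""
    entry = {"t": time.time(), "kind": kind, **payload}
    trace.append(entry)
    print(json.dumps(entry))   # or stream to the user / a log file

trace = []
log_step(trace, "plan", {"thought": "tests failing, read the affected file"})
log_step(trace, "tool_call", {"tool": "read_file", "args": {"path": "a.js"}})
log_step(trace, "observation", {"result": "<file contents>"})
```

With a trace like this, "why did the agent do that?" becomes replaying the log step by step instead of guessing.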