PART 1 - What Is an AI Agent?
Every section follows the same 4-part structure: The Intuition → The Technical Reality → The Production Trap → The Recall Hook
Reference note: Several examples in this guide use Google ADK - an open-source agent development framework - to illustrate production patterns. The principles apply across frameworks.
1.1 The Shift: Predictor → Actor
The Intuition
Old AI = a very smart autocomplete. You ask, it predicts. It has no awareness of the world, no plan, no actions. It's a parrot.
An Agent = a junior employee you give a goal to. They don't need you to tell them every step. They reason, use tools, check results, and adjust - until the goal is done.
The shift is from "finish my sentence" to "go do the job."
The Technical Reality
An LLM alone is stateless and passive. An Agent adds:
- A loop around the LLM (Think → Act → Observe → repeat)
- Tools the LLM can call to interact with the real world
- Memory so it remembers what it already did
- Orchestration to manage when to think vs. when to act
Shortest definition you will ever need:
Agent = LLM in a loop + Tools to accomplish an objective
The Production Trap
People think "agent" = "just add a prompt." It's not. An agent is a complete application with state management, tool execution, failure handling, and observability. Treat it like building a microservice, not writing a prompt.
Recall Hook
Agent = Employee, not Autocomplete. It has a mission, uses tools, and reports back.
1.2 The Anatomy of an Agent
+-----------------------------------------------------------+
| AI AGENT |
| |
| MODEL (Brain) TOOLS (Hands) ORCHESTRATION |
| Reasons/plans APIs, DBs, Manages the loop, |
| Decides Code exec, memory, state, |
| Generates Search, HITL planning strategy |
| |
| DEPLOYMENT (The Body) |
| Monitoring Logging Scaling A2A APIs |
+-----------------------------------------------------------+Body Analogy
| Component | Body Part | What it does |
|---|---|---|
| Model | Brain | Reasons, decides, plans |
| Tools | Hands | Acts on the world |
| Orchestration | Nervous System | Manages the Think→Act→Observe loop |
| Deployment | Body + Legs | Gets the agent out into the world |
Each Component in Detail
MODEL (The Brain)
- The LLM. Its quality determines the agent ceiling.
- Choosing by benchmarks alone = path to failure. Test on your task metrics.
- Model Routing (production pattern): Use Gemini 2.5 Pro for complex planning → route to Gemini 2.5 Flash for simple summarization. Same agent, 70% lower cost.
- Agent Ops rule: models are superseded every 6 months. You need a CI/CD pipeline to swap brains without architectural overhaul.
TOOLS (The Hands)
- Retrieving: RAG (docs), NL2SQL (databases) - grounds responses in facts, kills hallucinations
- Executing: Wrapped APIs (send email, update CRM), code execution in sandboxes
- Human-in-the-Loop:
ask_for_confirmation()- agent pauses, human approves, then resumes - Tools are exposed via Function Calling. Standards: OpenAPI spec, MCP protocol.
- Native tools: Gemini has Google Search built-in as a native tool (baked into the LLM call itself).
ORCHESTRATION (The Nervous System)
- Runs the Think→Act→Observe loop
- Manages how the agent reasons: Chain-of-Thought, ReAct (Reason + Act interleaved)
- Manages memory: what the agent knows right now (short-term) vs. what it should remember across sessions (long-term)
- Design choice: No-code builders (fast, limited) vs. code-first frameworks like ADK (full control, production-grade)
CoT vs ReAct - the two reasoning modes:
| Chain-of-Thought (CoT) | ReAct | |
|---|---|---|
| What it does | Internal reasoning only | Reasoning + tool calls interleaved |
| Best for | Logic, planning, static tasks | Live data, real-world actions |
| Example | Drafting a plan, solving a puzzle | Booking a flight, checking a stock price |
Modern agents use both - CoT for internal reasoning, ReAct when external actions are needed.
DEPLOYMENT (The Body)
- Not just "put it on a server" - monitoring, logging, rate limiting, auth
- Agents talk to users via GUI or to other agents via A2A protocol
Recall Hook
Brain · Hands · Nervous System · Body - model, tools, orchestration, deployment.
1.3 The Think → Act → Observe Loop
The Intuition
Imagine a detective:
- Gets the mission (solve the case)
- Scans available evidence
- Thinks through the strategy
- Acts (interviews a suspect, checks records)
- Observes the result - updates their mental model
- Loops until the case is solved
That is exactly the agent loop.
The Technical Reality
The orchestration layer manages this loop. Each cycle:
- The model sees: system prompt + conversation history + tool results so far
- It decides: keep thinking? call a tool? give final answer?
- The loop terminates when the agent decides it is done or a max turn limit is hit
ReAct Pattern (most common reasoning strategy):
The Production Trap
Loops can go infinite. Always set max_llm_calls. Also: the agent think step costs tokens every iteration.
Recall Hook
Mission → Scan → Think → Act → Observe → Loop - the detective cycle.
Sources
- Google ADK Whitepaper: Introduction to Agents
- Google ADK Documentation: adk.dev
See something wrong or missing? Edit this page on GitHub - reviewed before publishing.