PART 1: What Is an AI Agent?
Every section follows the same 4-part structure: The Intuition → The Technical Reality → The Production Trap → The Recall Hook
1.1 The Shift: Predictor → Actor
The Intuition
Old AI = a very smart autocomplete. You ask, it predicts. It has no awareness of the world, no plan, no actions. It's a parrot.
An Agent = a junior employee you give a goal to. They don't need you to tell them every step. They reason, use tools, check results, and adjust until the goal is done.
The shift is from "finish my sentence" to "go do the job."
The Technical Reality
An LLM alone is stateless and passive. An Agent adds:
- A loop around the LLM (Think → Act → Observe → repeat)
- Tools the LLM can call to interact with the real world
- Memory so it remembers what it already did
- Orchestration to manage when to think vs. when to act
Shortest definition you will ever need:
Agent = LLM in a loop + Tools to accomplish an objective
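The definition above can be sketched in a few lines. This is a minimal illustration, not any framework's API: `fake_llm`, `get_weather`, and `run_agent` are hypothetical names, and the "model" is a scripted stand-in so the example runs without network access.

```python
from typing import Callable

# Hypothetical tool: a real one would call an external API.
def get_weather(city: str) -> str:
    return f"Sunny in {city}"

TOOLS: dict[str, Callable[[str], str]] = {"get_weather": get_weather}

def fake_llm(history: list[str]) -> dict:
    """Scripted stand-in for an LLM call (assumption, for illustration only)."""
    # If no tool result has been observed yet, request the tool; otherwise finish.
    if not any(line.startswith("Observation:") for line in history):
        return {"type": "tool_call", "tool": "get_weather", "arg": "Paris"}
    return {"type": "final", "text": f"Done. {history[-1]}"}

def run_agent(objective: str, max_turns: int = 5) -> str:
    history = [f"Objective: {objective}"]                  # memory of what happened so far
    for _ in range(max_turns):                             # the loop around the LLM
        decision = fake_llm(history)                       # Think
        if decision["type"] == "final":
            return decision["text"]
        result = TOOLS[decision["tool"]](decision["arg"])  # Act
        history.append(f"Observation: {result}")           # Observe
    return "Stopped: turn limit reached"
```

Calling `run_agent("What is the weather in Paris?")` takes two turns: one tool call, then a final answer built from the observation. Swapping `fake_llm` for a real model call is where the engineering described in the next section begins.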
The Production Trap
People think "agent" = "just add a prompt." It's not. An agent is a complete application with state management, tool execution, failure handling, and observability. Treat it like building a microservice, not writing a prompt.
Recall Hook
Agent = Employee, not Autocomplete. It has a mission, uses tools, and reports back.
1.2 The Anatomy of an Agent
+-----------------------------------------------------------+
|                         AI AGENT                          |
|                                                           |
|  MODEL (Brain)    TOOLS (Hands)    ORCHESTRATION          |
|  Reasons/plans    APIs, DBs,       Manages the loop,      |
|  Decides          Code exec,       memory, state,         |
|  Generates        Search, HITL     planning strategy      |
|                                                           |
|                   DEPLOYMENT (The Body)                   |
|  Monitoring    Logging    Scaling    A2A APIs             |
+-----------------------------------------------------------+
Body Analogy
| Component | Body Part | What it does |
|---|---|---|
| Model | Brain | Reasons, decides, plans |
| Tools | Hands | Acts on the world |
| Orchestration | Nervous System | Manages the Think→Act→Observe loop |
| Deployment | Body + Legs | Gets the agent out into the world |
Each Component in Detail
MODEL (The Brain)
- The LLM. Its quality sets the agent's ceiling.
- Choosing by benchmarks alone = path to failure. Test on metrics from your own task.
- Model Routing (production pattern): Use Gemini 2.5 Pro for complex planning and route simple summarization to Gemini 2.5 Flash. Same agent, 70% lower cost.
- Agent Ops rule: models are superseded roughly every 6 months. You need a CI/CD pipeline to swap brains without an architectural overhaul.
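Model routing can be as small as one function. The sketch below is an assumption-laden illustration: the two model names come from the text, but the `Task` type, the task labels, and the routing rule are hypothetical, and no model is actually called.

```python
from dataclasses import dataclass

# Model names come from the text above; keeping them in one place makes a swap a config change.
PLANNER_MODEL = "gemini-2.5-pro"
CHEAP_MODEL = "gemini-2.5-flash"

@dataclass
class Task:
    kind: str     # hypothetical labels, e.g. "planning" or "summarization"
    prompt: str

# Assumed policy: only these task kinds justify the larger model.
COMPLEX_KINDS = {"planning", "multi_step_reasoning"}

def pick_model(task: Task) -> str:
    """Route complex work to the larger model and everything else to the cheaper one."""
    return PLANNER_MODEL if task.kind in COMPLEX_KINDS else CHEAP_MODEL
```

Because the rest of the agent only sees the string returned by `pick_model`, replacing a superseded model touches two constants rather than the architecture.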
TOOLS (The Hands)
- Retrieving: RAG (docs), NL2SQL (databases). Grounds responses in facts and reduces hallucinations
- Executing: Wrapped APIs (send email, update CRM), code execution in sandboxes
- Human-in-the-Loop: ask_for_confirmation() pauses the agent until a human approves, then the agent resumes
- Tools are exposed via Function Calling. Standards: OpenAPI spec, MCP.
- Native tools: Gemini has Google Search built-in as a native tool (baked into the LLM call itself).
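A human-in-the-loop gate in front of an executing tool might look like the sketch below. The text names `ask_for_confirmation()` without a signature, so the signature here is an assumption, as are `send_email`, `call_tool`, and the approval policy. The approver is passed in as a function so the example runs without real user input.

```python
from typing import Callable

def ask_for_confirmation(action: str, approver: Callable[[str], bool]) -> bool:
    """Pause point: the agent proceeds only if the human approver returns True."""
    return approver(action)

def send_email(to: str, body: str) -> str:
    # Hypothetical wrapped API; a real tool would call a mail service here.
    return f"email sent to {to}"

TOOLS = {"send_email": send_email}

# Assumed policy: tools that change the outside world need approval first.
REQUIRES_APPROVAL = {"send_email"}

def call_tool(name: str, args: dict, approver: Callable[[str], bool]) -> str:
    if name not in TOOLS:
        return f"error: unknown tool {name}"
    if name in REQUIRES_APPROVAL and not ask_for_confirmation(f"{name}({args})", approver):
        return "cancelled by human"
    return TOOLS[name](**args)
```

Returning an error string instead of raising keeps the failure visible to the model as an observation, which lets it recover on the next turn.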
ORCHESTRATION (The Nervous System)
- Runs the Think→Act→Observe loop
- Manages how the agent reasons: Chain-of-Thought, ReAct (Reason + Act interleaved)
- Manages memory: what the agent knows right now (short-term) vs. what it should remember across sessions (long-term)
- Design choice: No-code builders (fast, limited) vs. code-first frameworks like ADK (full control, production-grade)
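The short-term versus long-term memory split can be illustrated with a toy class. `AgentMemory` and its methods are hypothetical names; a production system would back the long-term side with a database or vector store rather than a dict.

```python
from collections import deque

class AgentMemory:
    """Hypothetical two-tier memory: a bounded window plus a persistent store."""

    def __init__(self, window: int = 4):
        self.short_term = deque(maxlen=window)  # recent turns; the oldest are dropped
        self.long_term: dict[str, str] = {}     # stand-in for a database or vector store

    def observe(self, message: str) -> None:
        self.short_term.append(message)

    def remember(self, key: str, fact: str) -> None:
        self.long_term[key] = fact

    def new_session(self) -> None:
        self.short_term.clear()                 # long-term facts survive the reset

    def context(self) -> list[str]:
        # What the model would see: stored facts first, then the recent turns.
        facts = [f"{k}: {v}" for k, v in self.long_term.items()]
        return facts + list(self.short_term)
```

The bounded `deque` is the design point: short-term memory has to fit in the context window, so something must decide what falls out.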
DEPLOYMENT (The Body)
- Not just "put it on a server": it also needs monitoring, logging, rate limiting, and auth
- Agents talk to users via a GUI or to other agents via the A2A protocol
Recall Hook
Brain · Hands · Nervous System · Body: model, tools, orchestration, deployment.
1.3 The Think → Act → Observe Loop
The Intuition
Imagine a detective:
- Gets the mission (solve the case)
- Scans available evidence
- Thinks through the strategy
- Acts (interviews a suspect, checks records)
- Observes the result, then updates their mental model
- Loops until the case is solved
That is exactly the agent loop.
The Technical Reality
The orchestration layer manages this loop. Each cycle:
- The model sees: system prompt + conversation history + tool results so far
- It decides: keep thinking? call a tool? give final answer?
- The loop terminates when the agent decides it is done or a max turn limit is hit
ReAct Pattern (most common reasoning strategy):
Thought: "I need to find the halfway point first"
Action: maps_tool(origin="Mountain View", destination="SF")
Observation: "Halfway point is Millbrae"
Thought: "Now I search for coffee in Millbrae"
Action: search_tool(query="good coffee in Millbrae")
Observation: [results...]
Final Answer: "Try Blue Bottle in Millbrae"
The Production Trap
Loops can run forever. Always set max_llm_calls (ADK default: 500). Also, the agent's think step costs tokens on every iteration, so a poorly scoped task can burn through budget fast.
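Both limits can be enforced with a small guard that the loop charges on every model call. This is a hedged sketch, not ADK's implementation of `max_llm_calls`: the `LoopBudget` class, the `BudgetExceeded` exception, and the 100,000-token default are assumptions for illustration.

```python
class BudgetExceeded(Exception):
    """Raised when the agent loop crosses a call or token limit."""

class LoopBudget:
    """Hypothetical guard that caps LLM calls and total tokens for one run."""

    def __init__(self, max_llm_calls: int = 500, max_tokens: int = 100_000):
        self.max_llm_calls = max_llm_calls
        self.max_tokens = max_tokens   # assumed default, tune per task
        self.calls = 0
        self.tokens = 0

    def charge(self, tokens_used: int) -> None:
        """Record one LLM call and raise once either limit is crossed."""
        self.calls += 1
        self.tokens += tokens_used
        if self.calls > self.max_llm_calls:
            raise BudgetExceeded(f"LLM call limit of {self.max_llm_calls} exceeded")
        if self.tokens > self.max_tokens:
            raise BudgetExceeded(f"token limit of {self.max_tokens} exceeded")
```

The loop would call `budget.charge(tokens)` after each model response and treat `BudgetExceeded` as a terminal state to log and report, rather than retrying.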
Recall Hook
Mission → Scan → Think → Act → Observe → Loop: the detective cycle.
Sources
- Google ADK Whitepaper: Introduction to Agents
- Google ADK Documentation: ai.google.dev/adk