🤖 Module 5 of 12

AI Agents: Concepts & Architecture

⏱ 6–7 hours
📘 Intermediate
🔧 Python (conceptual)
What you'll learn
  • Define what makes an AI agent different from a simple chatbot
  • Explain the ReAct (Reason + Act) agent loop
  • Describe Andrew Ng's 4 agentic design patterns
  • Compare the 4 types of agent memory
  • Know when to add human-in-the-loop checkpoints

What Makes Something an Agent?

The word "agent" gets thrown around loosely in the AI space. Let's give it a precise definition based on three properties that all genuinely agentic systems share.

Property 1: Goal-directed. An agent receives a goal ("research and summarize the top 5 competitors"), not just a prompt ("what are some AI companies?"). It works toward that goal across multiple steps.

Property 2: Tool-using. An agent can take actions beyond generating text. It can search the web, read and write files, call APIs, execute code, query databases, send emails. Tools are how the agent affects the world and acquires new information.

Property 3: Iterative. An agent loops. It takes an action, observes the result, decides what to do next based on that result, and takes another action. This continues until the goal is achieved or the agent determines it can't proceed.

A chatbot has none of these properties: it takes a single input and produces a single output. Here's how they compare across five dimensions:

| Dimension | Chatbot | Agent |
|-----------|---------|-------|
| Turn structure | Single turn | Multi-step loop |
| External tools | None | Web, files, APIs, code execution |
| Goal horizon | Immediate response | Potentially hours-long tasks |
| Error recovery | Cannot retry | Can observe failures and try alternatives |
| Output | Text only | Text + real-world side effects |

This distinction matters enormously for system design. A chatbot is a stateless function. An agent is closer to a small autonomous program, with all the power and risk that entails.


The Agent Loop: The ReAct Pattern

The dominant pattern for building LLM agents is ReAct, coined by researchers at Princeton and Google in 2022. ReAct stands for Reason + Act.

The full loop has three stages:

The ReAct Agent Loop
                    ┌──────────────────────┐
                    │   User Goal / Task   │
                    └──────────┬───────────┘
                               │
                               ▼
                    ┌──────────────────────┐
              ┌────►│  REASON              │
              │     │  "What should I do   │
              │     │   next to reach the  │
              │     │   goal?"             │
              │     └──────────┬───────────┘
              │                │
              │                ▼
              │     ┌──────────────────────┐
              │     │  ACT                 │
              │     │  Execute a tool:     │
              │     │  - search_web()      │
              │     │  - read_file()       │
              │     │  - run_code()        │
              │     │  - call_api()        │
              │     └──────────┬───────────┘
              │                │
              │                ▼
              │     ┌──────────────────────┐
              │     │  OBSERVE             │
              │     │  Process tool result │
              │     │  Add to context      │
              └─────┤  Goal achieved? ─ No │
                    │         │            │
                    │         ▼ Yes        │
                    │  ┌────────────────┐  │
                    │  │  STOP / Return │  │
                    │  └────────────────┘  │
                    └──────────────────────┘

Here is the ReAct loop in Python-style pseudocode. The call_llm and extract_tool_call helpers are placeholders for your model client:

def react_agent(goal: str, tools: dict, tool_definitions: list, max_iterations: int = 10) -> str:
    """
    A simple ReAct agent loop.

    tools: dict mapping tool_name -> callable function
    tool_definitions: JSON schemas describing those tools to the model
    """
    messages = [
        {"role": "user", "content": f"Goal: {goal}\n\nWork toward this goal step by step."}
    ]

    for iteration in range(max_iterations):
        # REASON: Ask the model what to do next
        # (call_llm is a placeholder for your model client wrapper)
        response = call_llm(
            messages=messages,
            tools=tool_definitions,  # JSON schemas of available tools
        )
        
        # Check if the model is done (no more tool calls)
        if response.stop_reason == "end_turn":
            return response.content[0].text  # Final answer
        
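        if response.stop_reason == "tool_use":
            # ACT: Execute the requested tool
            # (extract_tool_call is a placeholder that returns the tool-use block)
            tool_call = extract_tool_call(response)
            tool_name = tool_call.name
            tool_args = tool_call.input

            tool_fn = tools.get(tool_name)
            if tool_fn is None:
                tool_result = f"Error: Unknown tool '{tool_name}'"
            else:
                try:
                    tool_result = tool_fn(**tool_args)
                except Exception as exc:
                    # Report the failure to the model so it can try an alternative
                    tool_result = f"Error: {tool_name} failed: {exc}"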
            
            # OBSERVE: Add the tool result to the conversation
            messages.append({"role": "assistant", "content": response.content})
            messages.append({
                "role": "user",
                "content": [
                    {
                        "type": "tool_result",
                        "tool_use_id": tool_call.id,
                        "content": str(tool_result)
                    }
                ]
            })
            # Loop back to REASON with the new information
    
    return "Max iterations reached without completing the goal."

The key insight is the feedback loop: the agent doesn't just execute steps sequentially. After each action, it observes the result and reasons about what to do next in light of that result. This lets it course-correct when things don't go as expected.
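To watch the feedback loop run end to end without an API key, here is a minimal sketch that swaps the real model for a scripted stand-in. The names scripted_llm, lookup_population, and run_agent, along with the plain-dict response shape, are assumptions made for illustration and do not mirror any provider's API.

```python
from typing import Callable

def lookup_population(city: str) -> int:
    """Hypothetical tool: returns a canned figure instead of calling a real API."""
    return {"Paris": 2_100_000, "Rome": 2_800_000}.get(city, 0)

def scripted_llm(messages: list[dict]) -> dict:
    """Stand-in for a model: requests one tool call, then answers from the result."""
    observations = [m for m in messages if m["role"] == "tool"]
    if not observations:
        return {"type": "tool_call", "name": "lookup_population", "args": {"city": "Paris"}}
    return {"type": "final", "text": f"Paris has about {observations[-1]['content']} residents."}

def run_agent(goal: str, llm: Callable, tools: dict, max_iterations: int = 5) -> str:
    messages = [{"role": "user", "content": goal}]
    for _ in range(max_iterations):
        decision = llm(messages)                     # REASON
        if decision["type"] == "final":
            return decision["text"]
        tool_fn = tools.get(decision["name"])
        if tool_fn is None:
            result = f"Error: unknown tool '{decision['name']}'"
        else:
            try:
                result = tool_fn(**decision["args"])  # ACT
            except Exception as exc:
                result = f"Error: {exc}"
        # OBSERVE: the result becomes part of what the model sees next
        messages.append({"role": "tool", "content": str(result)})
    return "Max iterations reached without completing the goal."

answer = run_agent(
    "How many people live in Paris?",
    scripted_llm,
    {"lookup_population": lookup_population},
)
print(answer)
```

Because errors are appended as observations rather than raised, a real model would get a chance to react to a failed tool call on the next iteration.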


Andrew Ng's 4 Agentic Design Patterns

Andrew Ng, AI pioneer and founder of DeepLearning.AI, identified four foundational patterns that cover the vast majority of real-world agentic systems. Understanding these patterns helps you design agents architecturally before writing a line of code.

4 Agentic Design Patterns
1. REFLECTION              2. TOOL USE
   Generate                   Model ──► Tool Call
      │                              │
   Critique ◄──────────────── Tool Result
      │                              │
   Improve                    Next Action
      │
   Check quality ─► Done?

3. PLANNING                4. MULTI-AGENT
   Goal                       ┌─ Researcher Agent
     │                        │
   Decompose               Orchestrator ─── Writer Agent
     │                        │
   Subtask 1                  └─ Editor Agent
   Subtask 2
   Subtask 3                  Each specializes in
   Subtask N                  what it does best

Pattern 1: Reflection

The agent generates an output, critiques that output, and then improves it. This loop continues until the output meets quality criteria or a maximum iteration count is reached.

Reflection is powerful because it applies the model's critical ability to its own output. The model is often better at spotting problems in a draft than it is at producing a perfect draft on the first try.

Use reflection when: output quality is paramount and latency is acceptable (each reflection round costs 1-2 extra API calls).
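The generate, critique, improve cycle can be sketched as a small control loop. This is an illustration under stated assumptions: reflection_loop and the three fake_* functions are hypothetical names, and in a real system each of them would be a model call (which is where the extra API calls per round come from).

```python
from typing import Callable, Optional

def reflection_loop(
    task: str,
    generate: Callable[[str], str],
    critique: Callable[[str], Optional[str]],
    improve: Callable[[str, str], str],
    max_rounds: int = 3,
) -> tuple[str, int]:
    """Generate a draft, then critique and improve it until the critic is satisfied."""
    draft = generate(task)
    for rounds_used in range(max_rounds):
        feedback = critique(draft)      # None means "meets the quality bar"
        if feedback is None:
            return draft, rounds_used
        draft = improve(draft, feedback)
    return draft, max_rounds            # stop at the cap even if not perfect

# Stand-ins for model calls so the control flow can be run as-is.
def fake_generate(task: str) -> str:
    return "Agents work in a loop."

def fake_critique(draft: str) -> Optional[str]:
    return None if "For example" in draft else "Add a concrete example."

def fake_improve(draft: str, feedback: str) -> str:
    return draft + " For example, a support agent retries a failed order lookup."

final_draft, rounds = reflection_loop(
    "Explain agents", fake_generate, fake_critique, fake_improve
)
print(rounds, final_draft)
```

The max_rounds cap matters as much as the critique: without it, a critic that is never satisfied would loop indefinitely.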

Pattern 2: Tool Use

The model is given a set of tool definitions and can call them to gather information or take actions. This is the fundamental building block for all agents: without tools, a model can only generate text.

Common tools: web search, file read/write, code execution, database queries, API calls, calendar access, email sending.

Use tool use when: the model needs external information or must affect the world beyond generating text.
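As a rough sketch of what "tool definitions" look like in practice, the snippet below pairs a JSON-schema description (what the model sees) with a registry of callables (what your code runs). The get_weather tool, the dispatch helper, and the exact field names are assumptions for illustration; schema field names vary between providers.

```python
import json

def get_weather(city: str) -> str:
    """Hypothetical tool with a canned response."""
    return f"Sunny in {city}"

# What your code executes.
TOOL_REGISTRY = {"get_weather": get_weather}

# What the model is shown (field names here are illustrative).
TOOL_DEFINITIONS = [
    {
        "name": "get_weather",
        "description": "Get the current weather for a city.",
        "input_schema": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    }
]

def dispatch(tool_name: str, raw_args: str) -> str:
    """Check and run a tool call that the model emitted as JSON text."""
    tool_fn = TOOL_REGISTRY.get(tool_name)
    if tool_fn is None:
        return f"Error: unknown tool '{tool_name}'"
    try:
        args = json.loads(raw_args)
    except json.JSONDecodeError as exc:
        return f"Error: arguments are not valid JSON ({exc.msg})"
    if not isinstance(args, dict):
        return "Error: arguments must be a JSON object"
    schema = next(d["input_schema"] for d in TOOL_DEFINITIONS if d["name"] == tool_name)
    missing = [key for key in schema["required"] if key not in args]
    if missing:
        return f"Error: missing required arguments {missing}"
    return tool_fn(**args)

print(dispatch("get_weather", '{"city": "Oslo"}'))
```

Returning error strings instead of raising lets the agent loop hand the problem back to the model as an observation.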

Pattern 3: Planning

For complex, multi-step goals, the agent first generates a plan (a sequence of subtasks) and then executes that plan. More sophisticated versions re-plan when steps fail or produce unexpected results.

Planning improves reliability on tasks that require more than 5-10 steps, because it forces upfront reasoning about dependencies and sequencing.

Use planning when: the task is complex enough that "figure it out as you go" produces poor results.
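One way to make "dependencies and sequencing" concrete is to represent the plan as a dependency graph and execute it in topological order. The sketch below uses Python's standard graphlib module; the subtask names and the execute_plan helper are hypothetical, and a real agent would have the model produce the plan.

```python
from graphlib import TopologicalSorter

# Hypothetical plan: each subtask maps to the subtasks it depends on.
plan = {
    "collect_sources": set(),
    "extract_pricing": {"collect_sources"},
    "extract_features": {"collect_sources"},
    "write_summary": {"extract_pricing", "extract_features"},
}

def execute_plan(plan: dict, run_step) -> tuple:
    """Run subtasks in dependency order.

    Stops at the first failing step and returns it, so the caller can re-plan.
    """
    completed = []
    for step in TopologicalSorter(plan).static_order():
        if not run_step(step):
            return completed, step
        completed.append(step)
    return completed, None

# Pretend every step succeeds.
completed, failed_step = execute_plan(plan, lambda step: True)
print(completed, failed_step)
```

Returning the failed step, rather than raising, leaves room for the re-planning behavior described above: the caller can ask the model for a revised plan that covers the remaining work.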

Pattern 4: Multi-Agent Collaboration

Multiple specialized agents work together, each handling the parts of a task they're best suited for. An orchestrator agent coordinates the work.

Example: a research pipeline with a Researcher agent (web search, source evaluation), an Analyst agent (data extraction, synthesis), and a Writer agent (prose generation, formatting). Each agent maintains a narrower context and can be tuned with a more specific system prompt.

Use multi-agent when: a single agent's cognitive load becomes a reliability problem, or when different subtasks genuinely require different expertise or tools.
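A minimal sketch of the orchestration idea: each specialist is the same model wrapped in a different system prompt, and the orchestrator passes work from one to the next. The Agent class, orchestrate function, and fake_model stand-in are assumptions for illustration; a sequential pipeline is only one of several possible coordination strategies.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Agent:
    name: str
    system_prompt: str
    model: Callable[[str, str], str]  # (system_prompt, task_input) -> output

    def run(self, task_input: str) -> str:
        return self.model(self.system_prompt, task_input)

def fake_model(system_prompt: str, task_input: str) -> str:
    """Stand-in for an LLM call: tags the input with the role so the flow is visible."""
    role = system_prompt.split()[-1].rstrip(".")
    return f"[{role}] {task_input}"

def orchestrate(goal: str, pipeline: list) -> str:
    """Pass the goal through each specialist in order.

    Each agent sees only the previous agent's output, which keeps its context narrow.
    """
    payload = goal
    for agent in pipeline:
        payload = agent.run(payload)
    return payload

pipeline = [
    Agent("researcher", "You are the researcher.", fake_model),
    Agent("analyst", "You are the analyst.", fake_model),
    Agent("writer", "You are the writer.", fake_model),
]
report = orchestrate("top 5 competitors", pipeline)
print(report)
```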


4 Types of Agent Memory

The "memory" of an agent system can be stored in four different places, each with different characteristics:

┌────────────────────────────────────────────────────────────────┐
│                     AGENT MEMORY TYPES                         │
├──────────────────┬──────────────┬────────────────┬─────────────┤
│ Type             │ Storage      │ Scope          │ Size Limit  │
├──────────────────┼──────────────┼────────────────┼─────────────┤
│ 1. In-Context    │ Messages []  │ Current run    │ Context     │
│    (Working)     │              │ only           │ window      │
├──────────────────┼──────────────┼────────────────┼─────────────┤
│ 2. External      │ Vector DB    │ Across runs,   │ Unlimited   │
│    (Long-term)   │              │ semantic search│ (practical) │
├──────────────────┼──────────────┼────────────────┼─────────────┤
│ 3. Episodic      │ DB / JSON    │ Structured     │ Unlimited   │
│    (Records)     │ files        │ records of     │             │
│                  │              │ past runs      │             │
├──────────────────┼──────────────┼────────────────┼─────────────┤
│ 4. Semantic      │ DB or files  │ Learned facts  │ Unlimited   │
│    (Knowledge)   │              │ about user /   │             │
│                  │              │ world          │             │
└──────────────────┴──────────────┴────────────────┴─────────────┘

1. In-context memory (working memory): Everything in the current messages array. The model can see it all, but it's ephemeral (gone when the conversation ends) and limited by the context window. This is what all agents have by default.

2. External memory (long-term): A vector database that stores past interactions, documents, or knowledge as embeddings. The agent searches it semantically to retrieve relevant past information. This is how you give an agent persistent memory that survives across sessions. Covered in depth in Module 4.

3. Episodic memory (structured records): A database of past task runs: what the agent did, what tools it called, what the results were, whether it succeeded. Useful for agents that need to avoid repeating mistakes or build on past work. Stored as structured records (SQL, JSON), not embeddings.

4. Semantic memory (accumulated knowledge): Facts the agent has learned and stored explicitly, such as user preferences, domain knowledge, and entity information. Example: "This user prefers code examples over prose explanations." Stored and retrieved as structured key-value pairs or documents.

Most production agents use in-context memory plus one of the others. Using all four is rare and adds considerable complexity.
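To make the four storage locations concrete, here is a toy sketch that holds all of them in one object. The AgentMemory class and its method names are hypothetical. Note one deliberate simplification: search_documents ranks by shared words as a stand-in for embedding similarity, so it is not a real vector search.

```python
import sqlite3

class AgentMemory:
    def __init__(self) -> None:
        self.working: list = []                      # 1. in-context: the messages list
        self.documents: list = []                    # 2. external: stand-in for a vector DB
        self.episodes = sqlite3.connect(":memory:")  # 3. episodic: structured run records
        self.episodes.execute(
            "CREATE TABLE runs (goal TEXT, tool TEXT, succeeded INTEGER)"
        )
        self.facts: dict = {}                        # 4. semantic: explicit key-value facts

    def search_documents(self, query: str, k: int = 1) -> list:
        """Toy retrieval by word overlap. A real system would compare embeddings."""
        query_words = set(query.lower().split())
        ranked = sorted(
            self.documents,
            key=lambda doc: len(query_words & set(doc.lower().split())),
            reverse=True,
        )
        return ranked[:k]

    def record_run(self, goal: str, tool: str, succeeded: bool) -> None:
        self.episodes.execute(
            "INSERT INTO runs VALUES (?, ?, ?)", (goal, tool, int(succeeded))
        )

    def failed_tools(self, goal: str) -> list:
        """Tools that failed on earlier runs of the same goal."""
        rows = self.episodes.execute(
            "SELECT tool FROM runs WHERE goal = ? AND succeeded = 0", (goal,)
        )
        return [row[0] for row in rows]

memory = AgentMemory()
memory.working.append({"role": "user", "content": "Can I still return my order?"})
memory.documents += ["refund policy allows returns within 30 days", "shipping takes 5 days"]
memory.record_run("issue refund", "lookup_order", succeeded=False)
memory.facts["reply_style"] = "code examples over prose"

print(memory.search_documents("what is the refund policy"))
print(memory.failed_tools("issue refund"))
```

Only the working list disappears when the run ends; the other three would live in persistent storage in a real deployment (here they are in-process for brevity).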


Autonomy vs. Reliability: The Core Tradeoff

More autonomy means more power. It also means more ways for things to go wrong. This tradeoff is the central design tension in agentic systems.

Consider a fully autonomous customer support agent: it can read emails, look up account data, issue refunds, close tickets, and update records, all without human involvement. In the best case, it handles thousands of tickets faster and cheaper than any team. In a bad case, it issues thousands of incorrect refunds before anyone notices.

The minimal footprint principle: give agents exactly the tools and permissions they need to accomplish the task, and no more. An agent that needs to read customer order history does not need write access to that database. An agent that drafts emails does not need the ability to send them.

Three autonomy patterns, ordered from lowest to highest risk:

LOW AUTONOMY                                      HIGH AUTONOMY
─────────────────────────────────────────────────────────────────
[Human-in-the-loop]    [Supervised]    [Fully Autonomous]

Agent proposes,         Agent acts,    Agent acts completely
human approves          human reviews  on its own
every action.           at checkpoints.

Best for:               Best for:      Best for:
- Irreversible actions  - Most         - Well-defined tasks
- High-stakes decisions   production    with known failure
- Learning phase          agents        modes and low stakes

Approval gate pattern: insert a human review step at high-risk moments, such as before sending an email, before making a purchase, or before deleting data. The agent proceeds autonomously everywhere else.

def approval_gate(action: str, details: dict) -> bool:
    """Ask a human to approve a high-risk action before executing it."""
    print("\n[APPROVAL REQUIRED]")
    print(f"Action: {action}")
    print(f"Details: {details}")
    response = input("Approve? (yes/no): ").strip().lower()
    return response == "yes"
 
# In the agent loop:
if tool_name in HIGH_RISK_TOOLS:  # e.g., "send_email", "delete_record"
    if not approval_gate(tool_name, tool_args):
        return "Action cancelled by user."
🚫 Principle of Least Privilege for Agents

Never give an agent tools it doesn't need for the task at hand. An agent with email access and file system access that's asked to "research competitors" has the capability to exfiltrate your company's data if compromised through prompt injection. Scope tools to the minimum required. Review and audit agent tool calls in production.
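One way to enforce this in code is a per-task allowlist combined with an audit log. The sketch below is illustrative only: the tool names, TASK_TOOL_ALLOWLIST, tools_for_task, and audited_call are all hypothetical, and the lambdas stand in for real tool implementations.

```python
# Every tool the platform offers (stand-in implementations).
ALL_TOOLS = {
    "search_web": lambda query: f"results for {query}",
    "read_file": lambda path: f"contents of {path}",
    "send_email": lambda to, body: f"sent to {to}",
}

# Hypothetical per-task allowlists: each task sees only the tools it needs.
TASK_TOOL_ALLOWLIST = {
    "research_competitors": {"search_web"},
    "draft_reply": {"read_file"},
}

def tools_for_task(task: str) -> dict:
    """Return the scoped tool set for a task. Unknown tasks get no tools."""
    allowed = TASK_TOOL_ALLOWLIST.get(task, set())
    return {name: fn for name, fn in ALL_TOOLS.items() if name in allowed}

def audited_call(tools: dict, tool_name: str, audit_log: list, **kwargs):
    """Run a tool only if it is in scope, and record every attempt for review."""
    allowed = tool_name in tools
    audit_log.append({"tool": tool_name, "args": kwargs, "allowed": allowed})
    if not allowed:
        raise PermissionError(f"Tool '{tool_name}' is not available for this task")
    return tools[tool_name](**kwargs)

audit_log = []
research_tools = tools_for_task("research_competitors")
print(audited_call(research_tools, "search_web", audit_log, query="AI startups"))

try:
    # A prompt-injected request for an out-of-scope tool is blocked and logged.
    audited_call(research_tools, "send_email", audit_log, to="x@example.com", body="data")
except PermissionError as exc:
    print(exc)
```

Because the research task never receives send_email at all, a successful prompt injection has nothing to call; the audit log then shows that an out-of-scope attempt was made.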


What Agents Are Bad At

Here is an honest assessment of current limitations. Knowing these will save you from building systems that fail in production.

1. Long-horizon error accumulation. Each step in a long agent loop has some probability of being subtly wrong. Over 20+ steps, small errors compound. If each step succeeds 95% of the time and failures are independent, a 20-step plan succeeds end-to-end only about 36% of the time (0.95^20 ≈ 0.36). Plan for failure recovery from the start.

2. Real-time data without explicit tools. An agent that doesn't have a web search tool knows nothing about current events. Don't assume the LLM "knows" recent information; give it the tools to fetch it.

3. Subjective success criteria. "Make this email more engaging" is hard for an agent to evaluate because "engaging" is subjective. Agents perform best when success criteria are measurable and objective. When they're not, human review checkpoints are essential.

4. Irreversible high-stakes actions. Agents can act faster than humans can catch mistakes. Sending 10,000 emails in 30 seconds, then discovering the prompt was wrong, is a disaster. Treat irreversible actions with extreme caution: require human approval, implement dry-run modes, add hard rate limits.
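The compounding arithmetic from limitation 1 is easy to check for other step counts. This is a small sketch that assumes every step succeeds with the same probability and that failures are independent, which real agent runs only approximate; the function names are made up for illustration.

```python
import math

def end_to_end_success(per_step_success: float, steps: int) -> float:
    """Probability that every step succeeds, assuming independent failures."""
    return per_step_success ** steps

def max_steps_for_target(per_step_success: float, target: float) -> int:
    """Longest plan whose end-to-end success rate stays at or above the target."""
    return math.floor(math.log(target) / math.log(per_step_success))

for steps in (5, 10, 20, 50):
    print(f"{steps:>2} steps at 95% per step: {end_to_end_success(0.95, steps):.1%}")

print("Longest plan with at least 50% success:", max_steps_for_target(0.95, 0.5))
```

Read the other way around, the second function gives a rough budget: at 95% per-step reliability, plans longer than about 13 steps are more likely to fail than succeed unless the agent can detect and recover from errors along the way.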


💻 Design an Agent Spec (No Code Required)

Goal: Think through the architecture of an agent for a real task before building anything.

Your task: Pick a repetitive, multi-step work task you actually do. Design it as an agent specification.

Template:

AGENT SPEC: [Name]

Goal: [What does the agent accomplish? One sentence.]

Trigger: [What kicks off the agent? A schedule? A webhook? A user message?]

Tools needed:
  1. [Tool name] - [What it does] - [Read or write? Reversible?]
  2. ...

Memory type: [ ] In-context only  [ ] + Vector DB  [ ] + Records DB  [ ] + Knowledge store

Autonomy level:
  Runs fully autonomously: [which steps]
  Requires human approval before: [which steps, and why]

Expected steps (numbered):
  1. ...
  2. ...

Failure modes (3 specific things that could go wrong):
  1. [Failure] → [How the agent should handle it]
  2. [Failure] → [How the agent should handle it]
  3. [Failure] → [How the agent should handle it]

Success criteria:
  How do you know the agent succeeded? [Measurable criteria]
  How do you know it failed? [Measurable criteria]

Example to get started: An email triage agent that reads your inbox, categorizes messages, drafts replies for common requests, and escalates unusual ones for human review.

After completing the spec, review it against these questions:

  • Does every tool have a clear justification for why it's needed?
  • Are any tools write-access that could be read-access instead?
  • What happens if step 3 fails β€” does the agent retry, skip, or stop?
  • Would you be comfortable with this agent running for 8 hours while you sleep?
🧪 Knowledge Check

Q1. Which of these is an AI agent (not just a chatbot)?

Q2. In the ReAct pattern, what does the "Observe" step do?

Q3. Which agentic pattern involves specialized agents working together?
