Skip to content

Modern AI Agents: From Concepts to Working Systems

Modern artificial intelligence (AI) is moving from simply answering questions to taking actions in the real world.

  • Generative AI focuses on creating content (text, images, code).
  • Agentic AI focuses on acting toward goals (planning, using tools, looping until done).

This article walks from core concepts all the way to practical implementation:

  • Concepts: AI, generative AI, agentic AI.
  • Materialization: how the agentic loop actually runs.
  • Abstraction: different layers (models, frameworks, orchestrators).
  • Where the agent "lives" as software.
  • Dependencies: how pieces like LangChain, LangGraph, CrewAI, Claude Code, and n8n fit together.
  • Visuals: ASCII block diagrams, a mind map, and flashcards.

1. AI in general

Artificial intelligence (AI) is any computer system that performs tasks we normally associate with human intelligence, such as understanding language, recognizing patterns, learning from data, and making decisions.

At a high level, most modern AI systems follow this pattern:

[ Data + Examples ] --train--> [ Model ] --use--> [ Predictions / Actions ]
  • During training, the model learns patterns from many examples.
  • During use/inference, it applies those patterns to new inputs to generate outputs (answers, labels, content, decisions).

Traditional AI systems were often rule-based ("if X then Y"). Large language models (LLMs) and modern AI systems are pattern-based: they learn complex relationships from data instead of being told every rule by hand.

1.1 Scale of modern models

Modern LLMs are trained on hundreds of billions to trillions of tokens of text, with parameter counts in the tens to hundreds of billions. Training uses supervised learning on next-token prediction and is then refined using Reinforcement Learning from Human Feedback (RLHF) — a technique where human raters rank model outputs, and those rankings are used to train a reward model that further shapes the LLM's behavior.

[ Pre-training ]         Learn language patterns from web-scale data
      |
      v
[ Fine-tuning / SFT ]    Specialize on curated examples
      |
      v
[ RLHF / RLAIF ]         Align to human preferences via reward signals
      |
      v
[ Deployed Model ]       Ready for inference

This pipeline is why modern models are not just "fancy search engines" — they have internalized reasoning patterns across an enormous range of human knowledge.


2. Two key types: generative vs agentic

Modern AI applications often combine two layers:

  • A generative layer that creates content.
  • An agentic layer that decides what to do next and uses tools.

2.1 High-level comparison

Generative AI
  - Input: Prompt or question
  - Behavior: Generates text, images, code, etc.
  - Output: One-shot or short conversation

Agentic AI
  - Input: Goal or task
  - Behavior: Plans, calls tools, checks results, loops
  - Output: Completed task or multi-step outcome

You can imagine generative AI as a very fast, smart writer. You can imagine agentic AI as a flexible project manager + operator.

2.2 Simple flows

Generative AI flow (one-shot):

[ Prompt ] --> [ Model thinks once ] --> [ Response ]

Agentic AI flow (loop):

[ Goal ]
  |
  v
[ Think & Plan ] --> [ Choose Tool ] --> [ Act ]
        ^                                   |
        |                                   v
        +--------- [ Check & Adjust ] <-----+
  • Generative AI is mostly "think once, answer once".
  • Agentic AI is "think, act, check, adjust" until the goal is reached or a limit is hit.

In production, most non-trivial AI products combine both: a generative model provides the reasoning and language capability, while an agentic wrapper provides the loop, memory, and tool invocation layer.


3. Generative AI (AI that creates)

Generative AI produces new content based on patterns it learned during training. It does not just look up answers; it generates them.

Examples of what it can do:

  • Write emails, reports, blog posts, and social media content.
  • Brainstorm ideas, outlines, and alternative phrasings.
  • Summarize long documents or meetings.
  • Generate images, audio, or video.
  • Help with coding: explain, write, or refactor code.

3.1 How generative AI works (intuitively)

1. Receive a prompt (your input).
2. Convert it into internal tokens (numbers).
3. Predict the next token (word, pixel, etc.).
4. Repeat step 3 to build up an output.
5. Convert tokens back into human-readable content.

It feels creative because the model is constantly choosing from many plausible next steps, guided by patterns from its training data.

What is a token? Tokens are the base units an LLM processes — not characters, not words, but chunks in between. A rough rule of thumb: a token is approximately four characters, or about ¾ of a word on average. Common short words (the, is) are each a single token; longer or rarer words are split across two or more. A 1,000-word document is roughly 1,300–1,500 tokens.

"tokenization" --> [ "token", "ization" ]   (2 tokens)
"cat"          --> [ "cat" ]                (1 token)

Why does this matter? Every model has a context window — a maximum number of tokens it can process at once (input + output combined). Claude 3.5 Sonnet has a 200,000-token context window. Exceeding it means older content must be dropped. Agents that run long loops must actively manage this.

Temperature and sampling: When predicting the next token, the model assigns a probability to every possible token in its vocabulary. Temperature controls how peaked or spread-out that probability distribution is:

  • Temperature = 0: always picks the most probable token (deterministic, consistent).
  • Temperature = 1: samples from the full distribution (creative, sometimes unpredictable).
  • Temperature > 1: flattens the distribution further (riskier, more random).

Agents typically run at low temperature for tool calls (consistency matters) and higher temperature for creative drafting tasks.

Latency metrics you will see in production:

Metric What it measures
TTFT (Time to First Token) Latency between sending the prompt and receiving the first output token — affects how "snappy" the model feels
Inter-token latency Time between each successive output token — determines streaming throughput
Total latency Full round-trip time — relevant for batch workflows

3.2 Why generative AI matters

Generative AI dramatically reduces the cost of first drafts and variations:

  • Need a draft quickly? It can produce one in seconds.
  • Need 5 different versions? It can explore many options.
  • Need a starting point? It can give you something to edit instead of a blank page.

Most of the time, though, generative AI is reactive: you ask, it answers. It also hallucinates — confidently producing incorrect facts — because it generates plausible-sounding text, not verified truth. Agentic architectures address this partly through tool use (querying real data sources instead of relying on memorized knowledge).


4. Agentic AI (AI that acts)

Agentic AI is about AI that does not just answer once, but works toward a goal by planning, choosing steps, using tools, and checking progress.

If generative AI is like a writer, agentic AI is like an assistant or coordinator that:

  • Understands your goal.
  • Breaks it into steps.
  • Uses the right tools.
  • Adjusts as it learns more.

4.1 What is an AI agent?

An AI agent is a software system that can:

  • Receive a goal or request.
  • Understand what needs to happen.
  • Choose actions and tools.
  • Interact with its environment (APIs, apps, data, files).
  • Move step by step until the task is done or interrupted.

In simple terms, an agent is a goal‑oriented AI worker.

4.2 The ReAct pattern: how agents think

The dominant pattern underlying most modern agent loops is ReAct (Reasoning + Acting), introduced in a 2022 research paper. In ReAct, the model explicitly interleaves three things:

[ Thought ]    "I need to find today's emails. I should call the email API."
    |
    v
[ Action ]     call: get_emails(date="today", filter="unread")
    |
    v
[ Observation] "Received 12 emails. 3 flagged as urgent by subject line."
    |
    v
[ Thought ]    "Now I need to summarize the 3 urgent ones..."
    |
    v
[ Action ]     call: summarize(emails=[...])
    ...

Each cycle of Thought → Action → Observation becomes one step in the agent loop. The model's reasoning and its tool calls reinforce each other: reasoning guides what to call, observations inform the next reasoning step. When a tool returns an unexpected result, the agent can immediately correct course within the same context sequence.

Modern evolution: Models like Claude 4 Sonnet, OpenAI o3, and Gemini 2.5 Pro internalize the reasoning step ("extended thinking") — they solve what used to be a 6-step ReAct trace in 1–2 tool calls plus an internal chain of thought that the user never sees. The architecture is the same; the reasoning just happens inside the model's hidden compute rather than in the visible prompt.

4.3 Agent as a project manager

Think of an agent as a project manager:

  • You give the project goal.
  • The agent breaks the goal into tasks.
  • It decides what to do first.
  • It uses tools or other systems to do the work.
  • It checks whether things are going as expected.
  • It adjusts the plan if needed.

A normal chatbot answers. An agent manages a process.


5. Inside the agent loop (ground-level view)

Here is the full agent loop as an ASCII diagram:

[ Trigger ]
    |
    v
[ 1. Goal / Request ]
    |
    v
[ 2. Reason & Plan ]
    |
    v
[ 3. Choose Tools ]
    |
    v
[ 4. Execute Actions ]
    |
    v
[ 5. Check Results ]
    |
    v
[ 6. Decide Next Step ]
   /   /   continue?  no
 /        v          v
(go back   [ Done ]
to step 2)

Let's unpack each step.

5.1 Step 0: Trigger (like cron or a user)

User / Event / Schedule / API
               |
               v
            [ Trigger ]

Something starts the agent:

  • A user types a request.
  • A system event happens (new support ticket, new email, new file).
  • A schedule fires (for example, every day at 09:00).
  • Another application calls the agent via an API.

5.2 Step 1: Goal / request

[ Trigger ] --> [ 1. Goal / Request ]

The agent receives a clear goal, for example:

  • "Summarize today's emails and draft replies for urgent ones."
  • "Generate a weekly error report and email it to the team."

This is the "what", not the "how".

In practice: the goal is passed as a system prompt + user message to the model. Well-written goals include: a clear success criterion, relevant context (time range, user preferences), and any hard constraints (tone, length, which systems to use). Vague goals produce vague plans.

5.3 Step 2: Reason & plan

[ 1. Goal / Request ]
            |
            v
     [ 2. Reason & Plan ]

Here, the model:

  • Reasons about the request (What does "urgent" mean? Which systems are involved?).
  • Plans steps, for example:
  • Read today's emails.
  • Detect which are urgent.
  • Summarize each urgent email.
  • Draft a reply for each.

This is something a plain cron job does not do. Cron runs the exact command you specify, without inventing steps.

In ReAct terms, this step produces the first Thought token sequence. The model generates a natural-language reasoning trace before deciding which tool to call — making its reasoning transparent and debuggable.

5.4 Step 3: Choose tools

[ 2. Reason & Plan ]
            |
            v
      [ 3. Choose Tools ]

For each planned step, the agent chooses which tools to use, for example:

  • Email API to fetch messages.
  • Classifier or model to detect urgency.
  • Language model to summarize and draft replies.
  • Calendar API to check availability.

Tools are like the programs and APIs a Unix daemon or cron job would call, but the agent can decide dynamically which tools to use.

How tools are defined: each tool is described to the model as a JSON schema — name, description, and parameter definitions. The model selects a tool by generating a structured JSON object (a "function call") rather than free-form text. For example:

{
  "name": "get_emails",
  "description": "Fetch emails for a given date and filter.",
  "parameters": {
    "date": { "type": "string", "description": "ISO date, e.g. 2026-05-05" },
    "filter": { "type": "string", "enum": ["unread", "urgent", "all"] }
  }
}

The Model Context Protocol (MCP) — introduced by Anthropic in November 2024 — standardizes how agents discover and call tools across different systems, solving the "N×M connector problem" where every agent had to implement bespoke integrations for every tool. OpenAI adopted MCP in March 2025, making it the de facto industry standard for tool connectivity.

5.5 Step 4: Execute actions

[ 3. Choose Tools ]
            |
            v
    [ 4. Execute Actions ]

The agent now acts:

  • Calls the email API.
  • Runs the classifier.
  • Asks the language model to generate summaries and drafts.

In traditional systems, this is where a cron job runs your script.

In agentic systems, the agent framework intercepts the model's tool-call output, routes it to the correct tool implementation, executes it, and returns the result as the next Observation in the context. Streaming responses (receiving tokens as they are generated, rather than waiting for the full output) are commonly used here to reduce TTFT for long-running steps.

5.6 Step 5: Check results (feedback)

[ 4. Execute Actions ]
            |
            v
    [ 5. Check Results ]

The agent looks at what happened:

  • Did the tools succeed?
  • Do we have all required outputs?
  • Did any errors occur?

This is the feedback loop: the agent compares what it has with the goal and asks, "Are we done or do we need to fix something?"

Context window management at this step: as the loop accumulates Thought/Action/Observation triples, the context grows. Long-running agents use summarization (compressing earlier turns into a summary token), sliding window (dropping the oldest turns), or retrieval (offloading older context to a vector store and retrieving only what is relevant) to stay within model limits.

5.7 Step 6: Decide next step (loop or finish)

[ 5. Check Results ]
            |
            v
   [ 6. Decide Next Step ]
        /                /              continue        finish
     |              |
     v              v
[ back to 2 ]    [ Done ]
  • If something is missing or wrong, it loops: adjust the plan, call different tools, or ask the user for clarification.
  • If the goal is satisfied, it finishes and returns the result.

This loop of think → act → check → adjust is what makes an AI agent feel like a flexible coordinator.

Safety guard: max iterations. Production agents always set a hard limit on the number of loop iterations (commonly 10–50) to prevent infinite loops when a goal is impossible to satisfy or a tool keeps failing. Exceeding the limit triggers a graceful failure response rather than an unbounded spin.


6. Relation to traditional systems (daemons, cron)

For people with a systems background, agentic AI is easier to understand if you compare it to Unix daemons and cron jobs.

  • A Unix daemon is a background process that starts at boot and quietly handles tasks (logging, networking, scheduling).
  • A cron job is a scheduled task that runs at specific times (for example, every night at 02:00).

An AI agent is similar in spirit:

  • It can run in the background.
  • It can wake up on triggers (user input, events, schedules, APIs).
  • It can call other programs and services.

The key difference is flexibility:

Traditional daemon:
  "I know which command to run; I just wait for the signal."

AI agent:
  "I know the goal; I will figure out the steps and adapt as I go."

A second difference is fault tolerance. A cron job that hits an unexpected API response typically fails and exits, requiring a human to diagnose and re-run it. An agent can reason about the error: "The calendar API returned a 429 rate-limit error — I should wait 30 seconds and retry, or use cached data instead."


7. Where the agent actually lives (service vs on-demand)

An agent is not a mystical thing inside the model. It is just a piece of software that coordinates:

  • A model (the LLM "brain").
  • One or more memories (state, history, preferences).
  • A set of tools (APIs, databases, file systems, crawlers, etc.).

In practice, that coordinating software can be deployed in a few familiar ways.

7.1 Service-style agent (like a Unix daemon)

[ Operating System / Container ]
           |
           v
   [ Agent Service Process ]  <---->  [ LLM API ]
           |                          [ Memory Store ]
           |                          [ Tools / APIs ]
           v
 [ HTTP endpoint, queue listener,
   or message bus subscriber ]
  • The agent runs as a long-lived service (daemon) on a server or container.
  • It listens for triggers (HTTP requests, messages, events).
  • When triggered, it runs the agent loop using the LLM, memory, and tools.

This is a good fit when:

  • You need the agent always available.
  • You expect many small tasks over time.
  • You want easy integration with other backend services.

Production note: containerized stateful agents (ECS, Kubernetes) are the default for high-volume scenarios where cold starts are unacceptable. For sensitive workloads (healthcare, finance, legal), VPC-isolated or on-premises deployments are increasingly standard.

7.2 On-demand agent (CLI or function call)

[ User / Script ]
      |
      v
[ CLI command or function call ]
      |
      v
[ Agent Runner ]  -->  [ LLM + Memory + Tools ]
      |
      v
   [ Result ]
  • The agent is started on demand by a user command, script, or scheduled job.
  • It runs the loop for a single goal and then exits.

Examples:

  • A CLI tool: ai-agent triage-tickets --today.
  • A scheduled function in a serverless platform.

This is a good fit when:

  • You have batch-like jobs (daily reports, periodic clean-up).
  • You want simpler deployment (no always-on process).

Serverless caveat: platforms like AWS Lambda cap execution time at 15 minutes and lack native WebSocket support, which is often insufficient for agents running complex multi-step loops. For agent workloads expected to run longer than a few minutes, containerized or dedicated compute is more reliable.

7.3 Agent as an internal library

[ Your App / Service Code ]
        |
        v
  call agent_library.run(goal, context)
        |
        v
[ Agent Logic ] --> [ LLM + Memory + Tools ]
        |
        v
     [ Result ]
  • The agent is packaged as a library or SDK function.
  • Your existing app calls it whenever it needs goal-directed behavior.

This is common when you embed agents into:

  • Existing SaaS products.
  • Internal tools or back-office systems.

7.4 Coordination role (the agent as glue)

No matter how it is deployed, the agent process plays the same coordination role:

          +------------------+
          |   Agent Logic    |
          | (your software)  |
          +------------------+
            /      |      \
           v       v       v
     [ LLM ]   [ Memory ]  [ Tools ]
      Brain      History    Hands
  • The LLM provides reasoning, language, and planning.
  • Memory provides continuity and context.
  • Tools provide real-world capabilities.
  • The agent code decides when and how to use each.

You can swap out the LLM, change memory backends, or add/remove tools without changing the idea of what the agent is.

7.5 Memory: the three types agents use

Memory in agentic systems divides into two categories based on lifetime:

Short-term memory (in-context): The model's context window is the agent's working memory. Everything in the current loop — the goal, all Thought/Action/Observation triples, tool outputs — lives here. It is fast, requires no external system, but is bounded by the model's token limit and is wiped when the session ends.

Long-term memory (external store): For state that must persist across sessions or sessions longer than the context window, agents use external storage:

Memory type What it stores Typical backend
Episodic Past events and interaction history Database, append-only log
Semantic General knowledge and facts (e.g. KB articles, policies) Vector store + RAG
Procedural Learned workflows and optimal sequences Fine-tuned model, rule store

Vector stores and RAG: semantic memory is typically implemented via Retrieval-Augmented Generation (RAG). Documents are chunked, embedded into high-dimensional vectors (using an embedding model), and stored in a vector database (e.g. Pinecone, Weaviate, Redis, pgvector). At query time, the agent embeds the current query and retrieves the most semantically similar chunks — injecting them into the context before calling the LLM. This sidesteps hallucination for factual lookups without retraining the model.

[ Query ]
   |
   v
[ Embed query ] --> [ Vector similarity search ]
                             |
                             v
                   [ Top-k relevant chunks ]
                             |
                             v
              [ Inject into LLM context ]
                             |
                             v
                   [ Grounded response ]

8. From concept to materialization: layers of an agentic system

Conceptually, an agent is a loop. To make it real, we need multiple layers working together.

8.1 Layered block diagram

Top: Business Use Case
----------------------
  e.g., "Triage support tickets"

Application / Orchestrator
--------------------------
  - Webhook endpoints
  - Schedulers
  - Workflow tools (e.g., n8n)

Agent Framework
---------------
  - LangChain / LangGraph
  - CrewAI
  - Custom agent code

Model + Tools
-------------
  - LLM (Claude, etc.)
  - Search APIs, KBs
  - Databases, CRMs
  - Crawlers (Crawlee), file systems

Infrastructure
-------------
  - Servers / containers
  - Secrets, logging, monitoring
  • Top: You define what problem you want to solve.
  • Middle: Agent frameworks and orchestrators define how the agent thinks and how workflows move.
  • Bottom: Models and tools do the actual work (reasoning, reading/writing data, sending emails, etc.).

8.2 Integration and dependencies

You can think of dependencies like this:

[ Business Need ]
      |
      v
[ Orchestrator / App ]  <-- depends on -->  [ Agent Framework ]
      |                                               |
      v                                               v
[ Agent Logic (plans, tools) ]           [ Model + Tools (LLM, APIs) ]
      |                                               |
      +-------------------- real-world effects <------+  (emails, DB writes, etc.)
  • The orchestrator/app depends on the agent framework to implement smart behavior.
  • The agent framework depends on the model + tools to reason and act.
  • Everything depends on infrastructure to run reliably and securely.

9. Abstraction levels: from raw APIs to full platforms

There are several abstraction levels you can use to build agents, from low-level code to high-level platforms.

9.1 Level 1 – From scratch with model APIs

You talk directly to an LLM via its API and implement the agent loop yourself.

Your Code
  |
  +--> Call LLM (with prompt + tool specs)
  |
  +--> Implement tool functions (search, DB, etc.)
  |
  +--> Handle tool calls, state, retries, logs
  • Maximum control.
  • Maximum responsibility (you own all orchestration, memory, error handling).

Practical considerations: you must handle streaming responses, token counting, context overflow, tool call parsing, retry logic on rate limits, and loop termination. For anything beyond a prototype, this adds up quickly.

9.2 Level 2 – Frameworks: LangChain and LangGraph

These frameworks give you building blocks for agents.

LangChain

LangChain provides:

  • Model wrappers (Claude, OpenAI, Gemini, Mistral, and more — one interface for all).
  • Memory (chat history, vector stores).
  • Tool abstractions and integrations (400+ pre-built connectors: Google Search, SQL, Slack, GitHub, etc.).
  • Agent executors that run the agent loop.

Mental picture:

[ Your App ] --> [ LangChain Agent ] --> [ LLM + Tools ]

Use it when:

  • You want to stay in Python/TypeScript.
  • Workflows are mainly sequential or lightly branching.

LangGraph

LangGraph adds a graph-based, stateful layer on top of LangChain.

Nodes:   [Plan] -> [Search] -> [Summarize] -> [Ask Human?] -> [Finalize]
State:   Shared memory carried between nodes
Edges:   Conditions, loops, error paths

Each node is a function that receives the current shared state, does some work (call an LLM, query a DB, run a classifier), and returns an updated state. Edges are routing rules — conditional edges branch based on state values; unconditional edges always advance to the next node. Parallel execution allows multiple nodes to run simultaneously and merge at a downstream node.

LangGraph adds checkpointing: the full graph state is persisted to a store (in-memory, Redis, PostgreSQL) after every node. If the agent crashes mid-run, it resumes exactly where it left off — rather than starting over.

Key 2025 milestones: - v0.4 (April 2025): Automatic interrupt surfacing — safer long-running graphs that can pause and request human review before taking sensitive actions. - LangGraph Platform GA (May 2025): One-click deploy, autoscale, and monitoring for stateful agents in the cloud.

Use it when:

  • You need complex workflows with branches, loops, or human-in-the-loop.
  • You want observability: see exactly which node ran and in what order.
  • You need fault-tolerant agents that can resume after failure.

9.3 Level 3 – Multi-agent crews: CrewAI

CrewAI focuses on teams of agents with different roles collaborating.

[ Crew: SupportCrew ]
   |
   +-- Agent 1: Classifier
   +-- Agent 2: Knowledge Researcher
   +-- Agent 3: Reply Writer

CrewAI's four core concepts:

Concept Description
Agent An "employee" with a role, a goal, a backstory, and a set of tools
Task A specific piece of work assigned to an agent, with a description and expected output
Crew The container that coordinates agents and tasks together
Process The workflow engine that governs execution order

CrewAI supports three process types:

  • Sequential: agents run one after another, each receiving the previous agent's output.
  • Hierarchical: a manager agent delegates tasks to worker agents, reviewing and routing their outputs.
  • Consensual: agents vote or deliberate on a decision before the crew proceeds.

CrewAI also offers Flows — event-driven, stateful pipelines for precise control — alongside Crews for collaborative autonomy. It ships with 100+ built-in tools (web search, vector DB queries, website scraping, code execution, etc.) and runs independently of LangChain, offering lighter resource footprint and faster startup.

Use it when:

  • Your task naturally splits into roles (researcher, writer, analyst, coder).
  • You want an easy way to define how agents collaborate without hand-writing all coordination logic.
  • You need hierarchical delegation (manager → workers).

9.4 Level 4 – Developer-centric agent environment: Claude Code

Claude Code is a terminal-based agentic coding environment. It acts as a powerful AI teammate in your repo:

[ You ] <--> [ Claude Code ] <--> [ Files, Commands, Tools (MCP) ]

It can:

  • Read/write code.
  • Run tests and scripts.
  • Use external tools (GitHub, DB, Notion, etc.) via MCP.

MCP (Model Context Protocol) is the open standard, introduced by Anthropic in November 2024, that Claude Code uses to connect to tools. MCP standardizes three integration types:

  • Tools: actions the agent can invoke (run a query, send a request, execute code).
  • Resources: data sources the agent can read (files, database tables, API endpoints).
  • Prompts: reusable prompt templates registered with the server.

MCP uses a client-server model: the agent is the MCP client, tool implementations run as MCP servers (locally or remotely), and communication follows a defined protocol with OAuth 2.1 authorization for remote servers (standardized in the March 2025 spec update). This means an agent can discover and call hundreds of tools across dozens of MCP servers with consistent authentication and permission scoping.

Use it when:

  • You are a developer building or maintaining agents.
  • You want an AI that can operate inside your codebase and shell.

9.5 Level 5 – Visual orchestration: n8n

n8n is a visual workflow orchestrator with AI and agent support.

[ Trigger Node: New Ticket ]
        |
        v
[ Node: Pre-process ]
        |
        v
[ Node: Call Agent (LangChain/LangGraph) ]
        |
        v
[ Branch: Simple ] --> [ Send Auto-Reply ]
[ Branch: Complex ] -> [ Create Escalation + Slack Notify ]

n8n implements its AI agent functionality using LangChain's JavaScript library under the hood, with a hierarchical cluster-node architecture: root nodes define agent logic, sub-nodes attach to them providing language models, memory backends, and tools.

As of n8n v1.82.0, all AI Agent nodes use the Tools Agent implementation — a LangChain tool-calling interface that passes available tools and their JSON schemas to the model, with structured output parsing. This makes the tool-calling behavior consistent and predictable across all agent configurations.

Multi-agent orchestration in n8n: the AI Agent Tool node allows a root-level agent to call another agent as if it were a tool, enabling nested multi-agent workflows without the overhead of managing sub-workflow context and variable passing manually.

Human approval gates: for sensitive actions (sending email, modifying a CRM record, executing a payment), n8n workflows can pause the agent and send an approval request to a human via Slack, email, or a webhook — resuming only after explicit approval.

n8n integrates with 400+ services out of the box (Slack, Gmail, Jira, HubSpot, Postgres, MySQL, Airtable, and more), making it practical to connect agents to existing business tool stacks with minimal custom code.

Use it when:

  • You want non-engineers to see and edit workflows.
  • You need to connect agents to many SaaS tools (CRMs, email, Slack, DBs) with little code.
  • You want built-in human-in-the-loop approval for sensitive agent actions.

10. Practical scenario: support ticket triage across the stack

Let's put everything together with one scenario.

10.1 Business goal

"Automatically triage incoming support tickets, answer simple ones, and escalate complex ones with a good summary."

10.2 High-level workflow

[ New Ticket ]
    |
    v
[ Agent: Understand & Classify ]
    |
    v
[ Agent: Search Docs & History ]
    |
    v
[ Agent: Draft Reply or Escalation Note ]
    |
    v
[ Update Ticket System + Notify Humans ]

10.3 Possible implementation path

  1. Orchestration layer (n8n or backend service)
  2. Listens for "new ticket" webhooks from Zendesk, Jira, or Freshdesk.
  3. Passes ticket data (subject, body, customer tier, history) into the agent.
  4. Routes outputs back to the ticketing system and posts escalation notices to Slack.
  5. Human approval gate: before the agent sends a reply to a premium customer, a Slack message asks a senior agent to approve.

  6. Agent framework (LangGraph)

  7. Node 1: Classify ticket (simple vs complex) — binary classifier backed by the LLM.
  8. Node 2: For simple, search KB (RAG over a vector store of documentation) and draft answer.
  9. Node 3: For complex, summarize the full ticket thread and propose an escalation note.
  10. Node 4: Optionally checkpoint state and ask a human to approve before sending.
  11. LangGraph checkpointing means if the process crashes mid-run (e.g. an LLM API timeout), the agent resumes from the last completed node rather than re-triaging from scratch.

  12. Multi-agent refinement (CrewAI)

  13. Within the "answer simple ticket" node, a Crew of two agents collaborates:
    • Researcher agent: searches the KB, product changelog, and known-issues list.
    • Writer agent: drafts a clear, on-brand reply using the researcher's findings.
  14. The Crew runs sequentially (Researcher → Writer); the Writer receives the Researcher's output as context.

  15. Developer loop (Claude Code)

  16. You use Claude Code to:

    • Scaffold the LangGraph project (generate boilerplate, configure state schema).
    • Write and run tests against a local ticket fixture.
    • Connect the KB search tool via an MCP server pointed at your vector database.
    • Monitor logs and iterate on the classifier prompt without leaving the terminal.
  17. Tools

  18. LLM (Claude) for classification, reasoning, and writing.
  19. Vector store + embedding model for KB retrieval (RAG).
  20. Ticketing API (Zendesk, Jira) for reading and updating tickets.
  21. Slack API for escalation notifications and approval requests.
  22. Email API for direct customer replies.

The same conceptual agent loop is there, but you decide which abstractions to use at each layer based on who will maintain the system and how complex it is.


11. Mind map (text version)

Modern AI Agents: From Concepts to Systems
|
+-- AI in General
|   +-- Definition
|   +-- Training vs Inference
|   +-- RLHF: aligning model behavior via human feedback
|
+-- Types of AI
|   +-- Generative AI
|   |   +-- Creates content
|   |   +-- One-shot responses
|   |   +-- Tokens (~4 chars each), temperature, TTFT
|   |   +-- Uses: writing, summarization, coding help
|   |
|   +-- Agentic AI
|       +-- Works toward goals
|       +-- ReAct pattern: Thought → Action → Observation
|       +-- Uses tools & memory
|
+-- Agent Anatomy
|   +-- Trigger (user, event, schedule, API)
|   +-- Goal / Request
|   +-- Reasoning & Planning (ReAct thought trace)
|   +-- Tool selection (JSON schema, function calling, MCP)
|   +-- Executor + feedback loop
|   +-- Max-iteration safety guard
|
+-- Memory Types
|   +-- Short-term: context window (in-session, token-bounded)
|   +-- Long-term:
|       +-- Episodic: past events and interaction history
|       +-- Semantic: facts and KB, backed by vector stores + RAG
|       +-- Procedural: learned workflows and sequences
|
+-- Relation to Traditional Systems
|   +-- Unix daemons
|   +-- Cron jobs
|   +-- Fixed rules vs adaptive planning
|   +-- Agents handle partial failures; cron exits on unexpected output
|
+-- Where Agent Lives
|   +-- Service/daemon (containerized, always-on)
|   +-- On-demand (CLI, function — watch serverless timeouts)
|   +-- Library inside apps
|
+-- Layers & Dependencies
|   +-- Business Use Case
|   +-- Orchestrator/App (e.g., n8n)
|   +-- Agent Framework (LangChain, LangGraph, CrewAI)
|   +-- Model + Tools (LLMs, APIs, Crawlers)
|   +-- Infrastructure
|
+-- Abstraction Levels
|   +-- Raw model APIs (from scratch)
|   +-- Frameworks (LangChain, LangGraph)
|   |   +-- LangGraph: stateful graph, checkpointing, parallel nodes, human gates
|   +-- Multi-agent crews (CrewAI: sequential / hierarchical / consensual)
|   +-- Dev environment (Claude Code + MCP)
|   +-- Visual orchestration (n8n: Tools Agent, human approval, 400+ integrations)
|
+-- Example: Support Ticket Triage
    +-- New ticket trigger (webhook)
    +-- LangGraph: classify → search KB → draft
    +-- CrewAI: Researcher + Writer crew for high-quality replies
    +-- Human approval gate via n8n before sending
    +-- Claude Code for scaffolding, testing, and iterating

12. Key terms at a glance

Term What it means
LLM Large Language Model — the AI "brain" (Claude, GPT-4o, Gemini, Mistral, etc.)
Token The base unit an LLM processes — roughly 4 characters / ¾ of a word on average
Context window Maximum tokens an LLM can process at once (input + output); e.g. Claude 3.5 Sonnet = 200k
Inference Running a trained model to produce an output (as opposed to training it)
TTFT Time to First Token — latency between sending a prompt and receiving the first output token
Temperature Controls randomness in token sampling: 0 = deterministic, 1 = creative, >1 = risky
Hallucination Model confidently generating incorrect or fabricated information
ReAct Reasoning + Acting — the dominant pattern: Thought → Action → Observation loop
Agent Software that gives an LLM a goal, tools, memory, and a loop
Tool A function the agent can call — API, DB query, shell command, web search, etc.
Function calling Mechanism by which the model outputs a structured JSON object to invoke a tool
MCP Model Context Protocol — Anthropic's open standard (Nov 2024) for tool connectivity
Memory (short-term) The context window — fast, in-session, wiped on exit
Memory (episodic) Stored past events and interaction history, persisted externally
Memory (semantic) Facts and knowledge, typically backed by a vector store + RAG
Memory (procedural) Learned workflows and optimal sequences
RAG Retrieval-Augmented Generation — injecting retrieved docs into context before generation
Vector store DB optimized for semantic similarity search (Pinecone, Weaviate, pgvector, Redis)
Embedding Dense numeric vector representing semantic meaning of text
Orchestrator The outer layer that triggers and routes agent workflows (e.g. n8n, custom service)
LangChain Python/TS framework with model wrappers, memory, 400+ tool connectors, agent executors
LangGraph Graph-based stateful orchestration on top of LangChain; adds checkpointing and parallel nodes
CrewAI Multi-agent framework with role-based Agents, Tasks, Crews, and Flows
Claude Code Agentic coding tool that operates inside your repo and terminal, connects via MCP
n8n Visual low-code workflow orchestrator with AI Agent nodes, human approval gates, 400+ integrations
RLHF Reinforcement Learning from Human Feedback — technique used to align LLM behavior post-training
Checkpointing Persisting agent state after each step so it can resume after a failure

13. Flashcards (quick review)

Use these 12 flashcards to quickly review the key ideas.

  1. Q: In one sentence, what is generative AI? A: Generative AI is a type of AI that creates new content (text, images, or code) based on patterns learned from existing data.

  2. Q: In one sentence, what is agentic AI? A: Agentic AI is AI that works toward a goal by planning steps, calling tools, and adjusting based on feedback instead of just answering once.

  3. Q: How is the basic flow of generative AI different from agentic AI? A: Generative AI is usually "prompt → single response", while agentic AI is a loop of "goal → think → act → check → adjust".

  4. Q: Name three core components of an AI agent. A: Examples include: goal/input, reasoning layer, planner, memory/state, tools, executor, and feedback loop.

  5. Q: How is an AI agent similar to a Unix daemon or cron job? A: Like a daemon or cron job, an agent can run in the background, wake up on triggers, and call external programs or services.

  6. Q: What is the key difference between a cron job and an AI agent? A: A cron job runs a fixed command at fixed times, while an AI agent interprets goals, plans steps, and adapts actions based on context and feedback.

  7. Q: What role do LangChain and LangGraph play in agentic systems? A: They provide building blocks and graph-based orchestration for models, memory, tools, and workflows, so you do not have to hand-wire the entire agent loop.

  8. Q: When is CrewAI a good fit? A: When your problem naturally maps to a team of roles (researcher, writer, analyst, etc.) and you want multi-agent collaboration with sequential, hierarchical, or consensual process types.

  9. Q: What is Claude Code mainly used for in this context? A: As an agentic coding environment that helps developers build, test, and maintain agents by reading/writing code and running commands.

  10. Q: Why would you add n8n on top of your agent framework? A: To get a visual, low-code orchestration layer where business users can connect triggers, agents, and many SaaS tools without deep programming — including human approval gates for sensitive actions.

  11. Q: What is the ReAct pattern, and what are its three components? A: ReAct (Reasoning + Acting) is the dominant agent loop pattern where the model alternates between a Thought (reasoning trace), an Action (tool call), and an Observation (tool result) until the goal is satisfied.

  12. Q: What problem does the Model Context Protocol (MCP) solve? A: MCP standardizes how agents discover and call tools across different systems, replacing bespoke per-tool connectors with a universal protocol — solving the N×M integration problem.


14. Further learning and references

Here are some good next steps (videos, blogs, docs) aligned with the ideas in this article.

14.1 General AI and generative AI

  • IBM: What Is Artificial Intelligence (AI)?
  • https://www.ibm.com/think/topics/artificial-intelligence
  • Coursera article: What Is Artificial Intelligence? Definition, Uses, and Types
  • https://www.coursera.org/articles/what-is-artificial-intelligence
  • IBM: What is Generative AI?
  • https://www.ibm.com/think/topics/generative-ai
  • Wikipedia: Generative artificial intelligence
  • https://en.wikipedia.org/wiki/Generative_artificial_intelligence
  • Google Cloud: Introduction to Generative AI (YouTube)
  • https://www.youtube.com/watch?v=G2fqAlgmoPo
  • NVIDIA: What Are AI Tokens?
  • https://blogs.nvidia.com/blog/ai-tokens-explained/
  • Microsoft Learn: Understanding tokens
  • https://learn.microsoft.com/en-us/dotnet/ai/conceptual/understanding-tokens

14.2 Agentic AI and AI agents

  • Google Cloud: What are AI agents? Definition, examples, and types
  • https://cloud.google.com/discover/what-are-ai-agents
  • IBM: How AI agents work
  • https://www.ibm.com/think/topics/ai-agents
  • AWS: What are AI Agents?
  • https://aws.amazon.com/what-is/ai-agents/
  • Wikipedia: Autonomous agent
  • https://en.wikipedia.org/wiki/Autonomous_agent
  • Oracle Developers: What Is the AI Agent Loop?
  • https://blogs.oracle.com/developers/what-is-the-ai-agent-loop-the-core-architecture-behind-autonomous-ai-systems

Videos on AI agents (beginner-friendly)

  • AI Agents Clearly Explained (YouTube)
  • https://www.youtube.com/watch?v=FwOTs4UxQS4
  • AI Agents EXPLAINED in 14 minutes and tools for building one
  • https://www.youtube.com/watch?v=1gm__VUG2m8
  • AI Agent Tutorial, Clearly Explained!
  • https://www.youtube.com/watch?v=-Ccy6wySpD4
  • AI Agents Explained: The Technology That's Changing Everything
  • https://www.youtube.com/watch?v=BiN-NTTQ7tc

14.3 The ReAct pattern

  • ReAct Pattern: Combining Reasoning and Acting in AI Agents
  • https://hopx.ai/blog/ai-agents/react-pattern-reasoning-acting/
  • Agentic AI Design Patterns: ReAct, ReWOO, CodeAct, and Beyond
  • https://capabl.in/blog/agentic-ai-design-patterns-react-rewoo-codeact-and-beyond

14.4 Agent memory and RAG

  • Redis: AI agent memory — types, architecture and implementation
  • https://redis.io/blog/ai-agent-memory-stateful-systems/
  • IBM: What Is AI Agent Memory?
  • https://www.ibm.com/think/topics/ai-agent-memory
  • Machine Learning Mastery: Beyond Short-term Memory — the 3 types of long-term memory AI agents need
  • https://machinelearningmastery.com/beyond-short-term-memory-the-3-types-of-long-term-memory-ai-agents-need/

14.5 Frameworks and tooling for agents

  • LangChain official documentation (agents)
  • Python: https://docs.langchain.com/oss/python/langchain/agents
  • JavaScript/TypeScript: https://docs.langchain.com/oss/javascript/langchain/agents
  • LangChain GitHub repository
  • https://github.com/langchain-ai/langchain
  • LangGraph official page and GitHub
  • https://www.langchain.com/langgraph
  • https://github.com/langchain-ai/langgraph
  • LangGraph: Build Stateful AI Agents in Python (Real Python tutorial)
  • https://realpython.com/langgraph-python/
  • LangGraph review: stateful agent state machine in 2025
  • https://sider.ai/blog/ai-tools/langgraph-review-is-the-agentic-state-machine-worth-your-stack-in-2025

14.6 Multi-agent and platforms

  • CrewAI official documentation
  • https://docs.crewai.com/en/introduction
  • CrewAI GitHub repository
  • https://github.com/crewaiinc/crewai
  • IBM: What is crewAI?
  • https://www.ibm.com/think/topics/crew-ai
  • CrewAI framework 2025 review (Latenode)
  • https://latenode.com/blog/ai-frameworks-technical-infrastructure/crewai-framework/crewai-framework-2025-complete-review-of-the-open-source-multi-agent-ai-platform

14.7 Model Context Protocol (MCP)

  • Official MCP documentation
  • https://modelcontextprotocol.io/
  • Wikipedia: Model Context Protocol
  • https://en.wikipedia.org/wiki/Model_Context_Protocol
  • Anthropic engineering: Code execution with MCP
  • https://www.anthropic.com/engineering/code-execution-with-mcp
  • MCP: A Unified Standard for AI Agents and Tools (SerpAPI blog)
  • https://serpapi.com/blog/model-context-protocol-mcp-a-unified-standard-for-ai-agents-and-tools/

14.8 Developer environments and orchestration

  • Claude Code advanced usage (agents, MCP tools, skills)
  • (Search "Claude Code agents MCP" on dev blogs and YouTube for walkthroughs.)
  • n8n: Build custom AI agents with logic and control
  • https://n8n.io/ai-agents/
  • n8n AI Agent node documentation
  • https://docs.n8n.io/integrations/builtin/cluster-nodes/root-nodes/n8n-nodes-langchain.agent/
  • LangChain concepts in n8n
  • https://docs.n8n.io/advanced-ai/langchain/langchain-n8n/
  • Tutorials on building AI agents with n8n
  • https://strapi.io/blog/build-ai-agents-n8n

14.9 Deployment and production architecture

  • Deploying AI Agents to Production: Architecture, Infrastructure, and Implementation Roadmap
  • https://machinelearningmastery.com/deploying-ai-agents-to-production-architecture-infrastructure-and-implementation-roadmap/
  • Seven Hosting Patterns for AI Agents
  • https://james-carr.org/posts/2026-03-01-agent-hosting-patterns/
  • AWS: Effectively building AI agents on AWS Serverless
  • https://aws.amazon.com/blogs/compute/effectively-building-ai-agents-on-aws-serverless/

These links are good starting points if you want to go beyond the concepts into hands-on tutorials, code examples, and platform-specific documentation.