How OpenClaw Processes Events: From Channel Input to Memory, Context, and Response
OpenClaw is easiest to understand as an event-driven runtime rather than a chat app. A message, webhook, timer, or MCP trigger enters the system, gets normalized, passes through policy checks, binds to a session, assembles context, calls a model, executes tools when needed, and writes back the outcome in a form that can be replayed or remembered.
That matters because each stage solves a different problem. Channels handle platform quirks, the Gateway coordinates flow, sessions isolate state, context assembly protects the model’s window, tools extend capability, and memory keeps the system useful over time.
[Channel / Webhook / Cron / MCP]
|
v
[Gateway]
|
v
[Channel Adapter / Parser]
|
v
[Access Control]
|
v
[Session Resolution]
|
v
[Context Assembly]
|
v
[LLM Call]
|
v
[Tool Execution Loop]
|
v
[Response]
Ingress: every input becomes an event
OpenClaw does not care whether the trigger came from WhatsApp, Telegram, Slack, a webhook, or a scheduled job. The system treats all of them as events. That unified event model is what lets the rest of the runtime stay consistent.
This is the first useful mental model: different sources, same downstream flow.
Gateway and channel parsing
The Gateway is the control plane. It receives the raw request, identifies the source, and routes it to the correct channel adapter. The adapter is the piece that understands WhatsApp JSON, Telegram payloads, Slack events, or a webhook body.
The adapter’s job is normalization. It converts platform-specific data into a common internal event shape and enriches it with metadata such as chat ID, sender ID, reply-to ID, attachments, and group flags.
Raw payload
|
v
[Gateway]
|
v
[Channel Adapter]
|
v
Normalized event
{ channel, sender, chat_id, text, attachments, reply_to }
This is why OpenClaw can support multiple inputs without rewriting the orchestration logic every time a new platform is added.
Access control and session resolution
Once normalized, the event is checked against policy. This is where the system can reject untrusted senders, block unwanted channels, enforce mention rules, or validate signatures. The model should never see traffic that should have been stopped earlier.
If the event passes, OpenClaw resolves a session. That session boundary is the line that keeps one conversation isolated from another. It is usually derived from the channel and sender, sometimes with thread or chat context included.
That session resolution step is one of the reasons the system feels stateful without becoming messy.
Session logs versus memory
This is a distinction worth making very clearly.
A session log is the full event transcript: user messages, assistant messages, tool calls, and tool outputs. It is append-only and replayable. A final summary produced by the model can also live there because it is still part of the assistant’s conversation turn.
Memory is different. It is curated long-term knowledge, usually stored in Markdown files such as MEMORY.md and memory/YYYY-MM-DD.md. The memory store is not the same thing as the conversation log, even though useful information can be derived from the log and written into memory later.
That separation keeps the runtime both auditable and maintainable.
Context assembly
Before the model is called, OpenClaw assembles the prompt. This step combines the system prompt, recent session history, selected memory snippets, and sometimes relevant workspace or skill content. The purpose is to give the model enough information without exceeding the context window.
This is where the system becomes selective. It cannot load everything forever, so it keeps recent turns, uses memory search for relevant recall, and excludes material that is too old or too large.
If the window starts to fill, the runtime may compact older history into a smaller summary so the next prompt stays usable.
Memory search and vectorization
The memory layer is the part that gets vectorized by default, not the raw session log. Markdown memory files are chunked, embedded, and stored in a vector index. When the agent needs recall, it performs semantic search over those chunks and retrieves only the most relevant ones.
That means the model does not receive the full memory archive. It receives the top-ranked snippets that match the current task.
This design gives OpenClaw a practical form of long-term recall without stuffing the prompt with everything it has ever seen.
LLM call and tool execution
After the prompt is assembled, the model is called. Sometimes it answers directly. Sometimes it returns a tool plan. In that second case, the system validates the request, runs the tool, captures the result, and may feed that result back into the model for another turn.
[Prompt] -> [LLM]
|
+-------+--------+
| |
answer tool call(s)
| |
v v
[Response] [Tool execution]
|
v
[Tool result]
|
v
[LLM]
That loop is what makes OpenClaw more than a text generator. It becomes an orchestrator.
Why summaries are written back
A common point of confusion is the final summary. Why store it in the session log if the log is just “conversation”? Because the summary is still an assistant message, and the session log is the full transcript of all observable session events.
That makes the run replayable. It also gives you a clean end-of-run record that can be used for auditing, debugging, or later memory updates. The log preserves the exact words the assistant used, while memory can store the distilled facts.
Context limits and compaction
OpenClaw manages context limits by budgeting tokens and shrinking what it sends to the model. Older turns can be summarized, large files can be truncated, and memory search can inject only a few high-signal snippets. The goal is not to preserve every token; the goal is to preserve the right meaning.
This is the practical answer to long-running agent sessions.
The end-to-end mental model
If you only keep one map in your head, make it this:
- Ingress: an event arrives from a channel or trigger.
- Normalization: the adapter turns it into a common event shape.
- Governance: access control decides whether it can continue.
- State: session resolution binds it to a conversation.
- Reasoning: context assembly prepares the bounded prompt.
- Action: the model calls tools if needed.
- Persistence: logs and memory capture what matters.
Ingress -> Normalize -> Govern -> Session -> Context -> Model -> Tools -> Response
|
v
Logs + Memory + Index
Why this design works
This architecture is modular without becoming abstract. You can add a new channel without rewriting the reasoning loop. You can change the embedding provider without changing sessions. You can tune memory retrieval without touching the transcript format. And because the Gateway stays central, the system keeps its control plane clear.
That balance is the real strength of OpenClaw: it combines event-driven orchestration, state isolation, bounded context, and searchable memory in a way that stays understandable once the handoffs are clear.