EdgeClaw 2.0: Memory + Cost-Saving Router

Apr 1, 2026

Key Takeaways

  • EdgeClaw 2.0 is an edge-cloud collaborative AI agent built on OpenClaw, open-sourced on April 1, 2026 by THUNLP, Renmin University of China, OpenBMB, and partners
  • ClawXMemory adds multi-layered structured long-term memory (L0/L1/L2 + Global), with model-driven proactive retrieval instead of vector search
  • ClawXRouter routes 60–80% of requests to cheaper models via LLM-as-Judge complexity classification, achieving 58% cost savings with scores 6.3% higher on PinchBench
  • Three-tier privacy: S1 (direct cloud), S2 (desensitized forwarding), S3 (fully local) — sensitive data never leaves the device
  • Zero configuration: run pnpm build && node openclaw.mjs gateway run; a complete config skeleton is auto-generated on first launch

What Is EdgeClaw?

EdgeClaw is an edge-cloud collaborative AI agent built on top of OpenClaw. It is jointly developed by THUNLP (Tsinghua University), Renmin University of China, AI9Stars, ModelBest, and OpenBMB.

Core attributes:

  • Category: Edge-cloud collaborative AI agent framework
  • Base framework: OpenClaw (TypeScript)
  • Repository: OpenBMB/EdgeClaw
  • License: MIT
  • First release: February 12, 2026
  • v2.0 release: April 1, 2026

What problem does it solve?

  • OpenClaw has no cross-session memory — EdgeClaw adds Claude Code-like persistent memory
  • Cloud API calls are expensive — EdgeClaw routes simple tasks to cheap models, saving 58%
  • Sensitive data leaks to cloud — EdgeClaw's three-tier privacy keeps private data local

EdgeClaw vs OpenClaw vs Claude Code

This is the comparison that matters. EdgeClaw positions itself as "Claude Code experience for OpenClaw":

| Capability | OpenClaw | Claude Code | EdgeClaw 2.0 |
|---|---|---|---|
| Cross-session project knowledge | ✗ | ✓ | ✓ |
| Persistent user preference | ✗ | ✓ | ✓ |
| Multi-layered structured memory | ✗ | ✗ | ✓ |
| Memory integration strategy | Recall | On-demand read | Proactive reasoning |
| Continuous memory consolidation | ✗ | Auto-Dream (backend) | Auto-consolidation on idle & topic switch |
| Cost-aware routing | ✗ | ✗ | 58% cost savings |
| Three-tier privacy collaboration | ✗ | ✗ | S1 / S2 / S3 |
| Visual Dashboard | ✗ | ✗ | ✓ |

The key differentiator: EdgeClaw doesn't just clone Claude Code's memory — it adds cost-saving routing and privacy controls that neither OpenClaw nor Claude Code offer.


ClawXMemory: How the Memory Engine Works

ClawXMemory is the first plugin to bring Claude Code-like memory capabilities to the OpenClaw ecosystem.

Three-Layer Memory Architecture

The system automatically distills information during conversations, building structured memory layer by layer:

| Memory Layer | Type | Description |
|---|---|---|
| L2 | Project memory / Timeline memory | High-level long-term memory aggregated around specific topics or timelines |
| L1 | Memory fragments | Structured core summaries distilled from concluded topics |
| L0 | Raw conversations | The lowest-level raw message records |
| Global | User profile | A continuously updated global user preference singleton |
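The layered structure can be sketched as a small type model. This is an illustrative shape only, assuming a tree where L2 aggregates L1 fragments, which in turn reference L0 conversations; the names and the toy `distill` helper are not ClawXMemory's actual API.

```typescript
// Hypothetical sketch of the layer model described above (names assumed).
type MemoryLayer = "L0" | "L1" | "L2" | "GLOBAL";

interface MemoryNode {
  layer: MemoryLayer;
  topic: string;
  content: string;          // raw messages (L0), summaries (L1), aggregates (L2)
  children: MemoryNode[];   // L2 -> L1 fragments -> L0 conversations
}

// The Global layer is a singleton user profile, updated continuously.
interface UserProfile {
  preferences: Record<string, string>;
  updatedAt: number;
}

// Toy consolidation: produce an L1 fragment summarizing concluded L0 topics.
function distill(raw: MemoryNode[]): MemoryNode {
  return {
    layer: "L1",
    topic: raw[0]?.topic ?? "unknown",
    content: raw.map((n) => n.content).join(" | ").slice(0, 200),
    children: raw,
  };
}
```

The key property the real engine relies on is that each layer stays navigable from the one above it, so retrieval can descend rather than scan.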

How Retrieval Works

When the model needs to recall information, it navigates along the "memory tree" through reasoning — not vector search:

  1. First evaluates relevance from high-level memory (project / timeline / profile)
  2. Drills down into finer-grained fragments only when needed
  3. Traces back to specific conversations when necessary

This is closer to how a human expert reasons layer by layer than traditional RAG retrieval.

Key Differentiators vs Claude Code Memory

| Aspect | Claude Code | ClawXMemory |
|---|---|---|
| Memory retrieval | On-demand file read | Model-driven proactive reasoning |
| Memory consolidation | Auto-Dream (backend) | Auto-consolidation on idle & topic switch |
| Storage | Cloud-side | Local SQLite (data never leaves device) |
| Visualization | None | Canvas view + list view Dashboard |
| Import/Export | No | One-click import/export |

ClawXRouter: How the Cost-Saving Router Works

ClawXRouter is EdgeClaw's routing brain. It intercepts requests and routes them to the most economical model based on complexity.

LLM-as-Judge Complexity Classification

Most requests — browsing files, reading code, simple Q&A — don't need the most expensive model. Token-Saver classifies each request:

| Complexity | Task Examples | Default Target Model |
|---|---|---|
| SIMPLE | Queries, translation, formatting, greetings | gpt-4o-mini |
| MEDIUM | Code generation, single-file editing, email drafting | gpt-4o |
| COMPLEX | System design, multi-file refactoring, cross-doc analysis | claude-sonnet-4.6 |
| REASONING | Mathematical proofs, formal logic, experiment design | o4-mini |
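Once the judge has produced a label, routing reduces to a lookup over the tier map above, with user overrides for custom provider mappings. The following sketch mirrors the table; the function and override mechanism are assumptions, not ClawXRouter's real API.

```typescript
// Illustrative tier-to-model map mirroring the default table above.
type Complexity = "SIMPLE" | "MEDIUM" | "COMPLEX" | "REASONING";

const defaultTargets: Record<Complexity, string> = {
  SIMPLE: "gpt-4o-mini",
  MEDIUM: "gpt-4o",
  COMPLEX: "claude-sonnet-4.6",
  REASONING: "o4-mini",
};

// The judge (a local small model in EdgeClaw) returns a label; routing is
// then a plain lookup, with overrides for custom provider/model mappings.
function route(
  label: Complexity,
  overrides: Partial<Record<Complexity, string>> = {}
): string {
  return overrides[label] ?? defaultTargets[label];
}
```

Keeping the expensive decision (the judge call) separate from the cheap one (the lookup) is what makes the tiers fully remappable, including to local models.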

Performance data:

  • 60–80% of requests are forwarded to cheaper models in typical workflows
  • PinchBench results: 58% cost savings with scores 6.3% higher
  • Judge runs on local small model (MiniCPM-4.1 / Qwen3.5), adding ~1–2s latency
  • Prompt hash caching (SHA-256, TTL 5 min) avoids re-judging identical requests

Three-Tier Privacy System

Every message, tool call, and tool result is inspected and classified in real time:

| Level | Meaning | Routing Strategy | Example |
|---|---|---|---|
| S1 | Safe | Send directly to cloud | "Write a poem about spring" |
| S2 | Sensitive | Desensitize then forward | Addresses, phone numbers, emails |
| S3 | Private | Process locally only | Pay slips, passwords, SSH keys |

How S2 desensitization works:

User message (with PII) → Local LLM detection → S2 → Extract PII → Replace with [REDACTED:*]
    → Privacy Proxy → Strip markers → Forward to cloud → SSE response
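The extract-and-replace step in this flow can be sketched with the rule-based engine mentioned later in the FAQ (keywords + regex). The patterns and marker format below are illustrative, not the shipped rules.

```typescript
// Rule-based sketch of the S2 path: PII patterns are replaced with typed
// [REDACTED:*] markers before the message leaves the device.
const piiRules: Array<{ type: string; pattern: RegExp }> = [
  { type: "EMAIL", pattern: /[\w.+-]+@[\w-]+\.[\w.]+/g },
  { type: "PHONE", pattern: /\b\d{3}[- ]?\d{3,4}[- ]?\d{4}\b/g },
];

function desensitize(message: string): { redacted: string; found: string[] } {
  const found: string[] = [];
  let redacted = message;
  for (const { type, pattern } of piiRules) {
    redacted = redacted.replace(pattern, (match) => {
      found.push(match);                 // originals stay local for the proxy
      return `[REDACTED:${type}]`;
    });
  }
  return { redacted, found };
}
```

Only the redacted string is forwarded; the extracted originals remain on-device so the proxy can restore context in responses if needed.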

Dual-track memory: The cloud model never sees MEMORY-FULL.md or complete session history — the Hook system intercepts at the file access layer.

Composable Router Pipeline

Security router and cost-aware router run in the same pipeline:

User Message
    ├── Phase 1: Fast routers (weight ≥ 50) — privacy router
    │       └── Three-tier sensitivity detection
    ├── Short-circuit: If sensitive → skip Phase 2
    └── Phase 2: Slow routers (weight < 50) — token-saver
            └── LLM Judge complexity classification

Security first — the privacy router runs with high weight. Cost-aware routing kicks in only after security check passes (S1).
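The two-phase, weight-ordered pipeline can be sketched as follows. The `Router` interface, weights, and short-circuit flag are assumptions made for illustration; only the ordering principle (privacy first, cost second) comes from the text above.

```typescript
// Sketch of a composable, weight-ordered router pipeline with short-circuit.
interface Router {
  name: string;
  weight: number; // >= 50 runs in Phase 1 (fast), < 50 in Phase 2 (slow)
  run(msg: string): { decision: string; stop?: boolean };
}

function runPipeline(routers: Router[], msg: string): string[] {
  const decisions: string[] = [];
  const ordered = [...routers].sort((a, b) => b.weight - a.weight);
  for (const r of ordered) {
    const out = r.run(msg);
    decisions.push(`${r.name}:${out.decision}`);
    if (out.stop) break;                // sensitive -> skip cost-aware phase
  }
  return decisions;
}

// Toy routers: a high-weight privacy check and a low-weight token-saver.
const privacyRouter: Router = {
  name: "privacy",
  weight: 90,
  run: (m) =>
    /password/i.test(m) ? { decision: "S3", stop: true } : { decision: "S1" },
};
const tokenSaver: Router = {
  name: "token-saver",
  weight: 10,
  run: () => ({ decision: "SIMPLE" }),
};
```

Registration order does not matter: sorting by weight guarantees the privacy router always sees the message before any cost-aware router does.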


Quick Start

Build and Launch

git clone https://github.com/openbmb/edgeclaw.git
cd edgeclaw
pnpm install
pnpm build
node openclaw.mjs gateway run

On first launch, a complete config skeleton is auto-generated at ~/.edgeclaw/. ClawXRouter and ClawXMemory are bundled — no manual plugin installation.

Fill in API Key

Three ways:

  • Environment variable: Set EDGECLAW_API_KEY before launch (auto-fills)
  • Config file: Edit apiKey in ~/.edgeclaw/openclaw.json
  • Dashboard: Visit http://127.0.0.1:18790/plugins/clawxrouter/stats — changes take effect immediately
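For the config-file route, the edit might look like the fragment below. Only the apiKey field name and the 18790 gateway port are stated above; the surrounding structure is a guess and your generated ~/.edgeclaw/openclaw.json may differ.

```json
{
  "apiKey": "sk-your-key-here",
  "gateway": { "port": 18790 }
}
```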

Verify

node openclaw.mjs agent --local --agent main -m "Hello"

When you see a log line like [ClawXRouter] token-saver: S1 redirect → followed by an agent reply, the deployment succeeded.

Dashboard URLs

| Panel | URL |
|---|---|
| ClawXRouter (routing config & stats) | http://127.0.0.1:18790/plugins/clawxrouter/stats |
| ClawXMemory (memory visualization) | http://127.0.0.1:39394/clawxmemory/ |

Who Should Use EdgeClaw?

Choose EdgeClaw if you:

  • Already use OpenClaw and want cross-session memory without switching to Claude Code
  • Handle sensitive data (patient records, proprietary sequences) that can't leave your machine
  • Want to cut LLM API costs without sacrificing quality
  • Need a visual dashboard for monitoring routing decisions and memory state

Stay with vanilla OpenClaw if you:

  • Don't need persistent memory across sessions
  • Don't handle sensitive data
  • Are comfortable with current API costs
  • Prefer the simplest possible setup

Consider Claude Code if you:

  • Want a fully managed solution with no self-hosting
  • Don't mind vendor lock-in to Anthropic
  • Don't need cost-saving routing or privacy tiers

FAQ

Q1: Does EdgeClaw require a local GPU?

In Token-Saver-only mode (default), no GPU is needed — the Judge model runs on CPU. If you enable the privacy router with local LLM detection, you'll need a local inference backend (Ollama / vLLM) which benefits from a GPU but can run on CPU.

Q2: How much does EdgeClaw actually save on API costs?

On PinchBench, EdgeClaw achieved 58% cost savings while scoring 6.3% higher than sending everything to the most expensive model. In typical workflows, 60–80% of requests are classified as SIMPLE or MEDIUM and routed to cheaper models.

Q3: Is ClawXMemory data stored locally or in the cloud?

Fully local. ClawXMemory uses SQLite by default. All memory data stays on your device. One-click import/export is supported for backup and migration.

Q4: Can I use EdgeClaw with models other than OpenAI and Anthropic?

Yes. EdgeClaw inherits OpenClaw's multi-provider support. ClawXRouter's Token-Saver tiers are fully customizable — you can map any complexity level to any provider/model combination, including local models via Ollama or vLLM.

Q5: How does EdgeClaw compare to other OpenClaw memory plugins?

ClawXMemory is currently the only plugin that implements multi-layered structured memory with proactive reasoning retrieval. Standard OpenClaw has basic recall; other community plugins typically use vector-search RAG. ClawXMemory's approach — navigating a memory tree through reasoning — is architecturally distinct.

Q6: What happens if the privacy router incorrectly classifies a message?

The system uses dual detection engines: rule-based (keywords + regex, ~0ms) and local LLM-based (semantic understanding, ~1–2s). You can stack both for maximum safety. Custom rules can be added in clawxrouter.json to match your specific sensitivity requirements.


Summary

EdgeClaw 2.0 brings two capabilities that were previously exclusive to Claude Code — persistent memory and intelligent cost management — into the open-source OpenClaw ecosystem, while adding privacy controls that neither platform offers.

Core value proposition:

  • ClawXMemory: Multi-layered structured memory with proactive reasoning — remembers your projects, preferences, and context across sessions
  • ClawXRouter: LLM-as-Judge routing that cuts API costs by 58% without quality loss
  • Three-tier privacy: S1/S2/S3 classification ensures sensitive data never leaves your device

For researchers handling sensitive data who want Claude Code-level memory without the cost or privacy concerns, EdgeClaw 2.0 is currently the strongest open-source option.
