Architecture¶

System Diagram¶

graph TB
    subgraph Agents["Agent Layer"]
        AZ["Agent Zero"]
        CC["Claude Code"]
        OC["OpenClaw"]
        XX["Any MCP Agent"]
    end

    subgraph MCP["MCP Server (FastMCP)"]
        TOOLS["22 MCP Tools"]
        SSE["SSE Transport :8765"]
        STDIO["stdio Transport"]
    end

    subgraph Core["Core Services"]
        KM["Knowledge Manager"]
        SE["Search Engine"]
        CS["Coordination Service"]
        AR["Agent Registry"]
        EB["Event Bus"]
    end

    subgraph Storage["Storage Layer"]
        MD["Markdown Files\n(knowledge/)"]
        TV["Tantivy Index\n(.tantivy/)"]
        CH["ChromaDB\n(.chroma/)"]
        NX["NetworkX Graph\n(.graph/)"]
        SQ["SQLite\n(.lithos/)"]
    end

    FW["File Watcher (watchdog)"]

    AZ & CC & OC & XX --> SSE
    SSE & STDIO --> TOOLS
    TOOLS --> KM & SE & CS & AR
    KM --> MD
    KM --> EB
    EB --> TV & CH & NX
    SE --> TV & CH
    CS & AR --> SQ
    FW --> KM
    MD --> FW

Component Overview¶

MCP Server (FastMCP)¶

The entry point for all agent interactions. Exposes 22 tools via:

SSE (Server-Sent Events) — HTTP-based, for network agents (Agent Zero, remote Claude Code, OpenClaw)
stdio — process-based, for local MCP clients (Claude Desktop)

FastMCP handles argument validation, serialization, and MCP protocol compliance.

Knowledge Manager¶

Handles all reads and writes to the Markdown knowledge base:

Assigns UUIDs on creation
Normalises slugs (URL-safe, lowercase, hyphens)
Manages YAML frontmatter (author, timestamps, version, contributors)
Deduplicates writes by source_url (normalised)
Enforces the expected_version optimistic lock on updates
Emits events to the Event Bus after mutations

Search Engine¶

Dual-backend search with unified result format:

flowchart LR
    Q["Query"] --> BM25["Tantivy BM25"]
    Q --> VEC["ChromaDB Cosine"]
    BM25 --> RRF["RRF Fusion\nk=60"]
    VEC --> RRF
    RRF --> DEDUP["Dedup by doc_id"]
    DEDUP --> OUT["Results\n{id, title, score, snippet}"]

Chunking: Documents are split into ~500 character chunks (on paragraph boundaries, max 1000) before embedding. Semantic search operates over chunks and deduplicates to document level before returning results.

Hybrid mode (default): Merges BM25 and cosine similarity rankings using Reciprocal Rank Fusion. RRF is robust to score scale differences — rank matters, not magnitude.

Coordination Service¶

Manages tasks, claims, findings, and agent state in SQLite:

-- Key tables
tasks (id, title, description, status, tags, agent, created_at)
claims (task_id, aspect, agent, expires_at)
findings (id, task_id, agent, summary, knowledge_id, created_at)
agents (id, name, type, first_seen_at, last_seen_at, metadata)

Claims use TTL expiry: the expires_at timestamp is set when a claim is created or renewed. Expired claims are filtered at query time — no background cleanup job is needed.

Event Bus¶

An in-memory publish/subscribe bus that decouples the Knowledge Manager from the indexers:

Ring buffer of last 500 events
Per-subscriber queues (max 100)
SSE delivery to external consumers via GET /events
Events include: doc_created, doc_updated, doc_deleted, reindex_complete

File Watcher (watchdog)¶

Watches the knowledge/ directory for filesystem changes (e.g., files edited directly in Obsidian). Debounces changes (default 500ms) and triggers incremental index updates automatically.

Data Flow¶

Write Path¶

Agent → lithos_write
  → Knowledge Manager
    → Validate & normalise input
    → Check source_url dedup
    → Check version conflict (if expected_version provided)
    → Write Markdown file
    → Update graph cache (parse wiki-links)
    → Emit doc_created / doc_updated event
      → Search Engine: update Tantivy + ChromaDB
  → Return status envelope

Read Path¶

Agent → lithos_search(query, mode="hybrid")
  → Search Engine
    → Tantivy: BM25 query → ranked list
    → ChromaDB: cosine query over chunks → ranked list
    → RRF fusion → unified ranked list
    → Dedup by doc_id
  → Return results with snippets + scores

Startup Sequence¶

Ensure directories and coordination DB exist
Check rebuild conditions (rebuild_on_start flag or corrupt/missing indices)
Load graph cache or rebuild from Markdown files
Pre-warm embeddings in background (first run downloads the model)
Start file watcher
Serve MCP tools

Storage: Authoritative vs Rebuildable¶

Path	Type	Back up?
`knowledge/`	Authoritative — your Markdown	✅ Yes
`.lithos/`	Authoritative — coordination DB	✅ Yes
`.tantivy/`	Rebuildable index	❌ Optional
`.chroma/`	Rebuildable index	❌ Optional
`.graph/`	Rebuildable cache	❌ Optional

To rebuild all indices from scratch:

lithos reindex --clear

Tech Stack¶

Layer	Technology	Why
MCP framework	FastMCP	Pythonic MCP server with minimal boilerplate
Full-text search	Tantivy (via tantivy-py)	Rust-speed BM25, Lucene-compatible query syntax
Semantic search	ChromaDB	Embedded vector store, no external service
Embeddings	sentence-transformers	Local CPU inference, no API keys
Knowledge graph	NetworkX	In-memory graph traversal, rebuildable
Coordination state	SQLite	ACID, zero-dependency, runs anywhere
File watching	watchdog	Cross-platform inotify/kqueue/FSEvents
CLI	Click	Composable commands with `--help`
Package management	uv	Fast, reproducible Python environments