Skip to content

lithos_retrieve

Added in v0.2.0 (LCMA MVP1)

Cognitive retrieval tool that runs a two-phase scout pipeline, merges and normalises results, applies Terrace 1 reranking (weighted scores, note-type priors, salience, MMR diversity), and writes an audit receipt on every call.

Prefer lithos_retrieve over lithos_search when you need explainability, multi-signal recall, or LCMA context — it tells you why each result was returned and which scouts found it.


Signature

lithos_retrieve(query, [limit], [namespace_filter], [agent_id], [task_id], [surface_conflicts], [max_context_nodes], [tags], [path_prefix])

Parameters

Name Type Required Default Description
query string Natural-language retrieval query
limit int 10 Maximum results to return
namespace_filter string[] null Restrict results to documents in these namespaces (OR semantics)
agent_id string null Caller identity for access-scope gating and audit trail
task_id string null Active task ID — enables the scout_task_context scout and records results in working memory
surface_conflicts bool false Reserved for MVP 2 — when true, the response envelope includes a conflicts list of contradiction edges
max_context_nodes int limit Number of Phase A candidates to use as seeds for Phase B scouts (provenance, graph, coactivation)
tags string[] null Filter to documents with all of these tags (AND semantics)
path_prefix string null Filter by document path prefix

LCMA must be enabled

lithos_retrieve requires lcma.enabled: true in your config. If LCMA is disabled it returns {"status": "error", "code": "lcma_disabled", ...}.


Returns

{
  "results": [
    {
      "id": "doc-uuid",
      "title": "Rate limiting pattern for OpenAI API",
      "snippet": "Use exponential backoff with jitter. Base delay 1s…",
      "score": 0.94,
      "path": "rate-limiting-openai.md",
      "source_url": "https://example.com/article",
      "updated_at": "2026-04-01T12:00:00",
      "is_stale": false,
      "derived_from_ids": [],
      "reasons": ["lexical: exact query term match", "vector: high cosine similarity"],
      "scouts": ["scout_vector", "scout_lexical"],
      "salience": 0.87
    }
  ],
  "temperature": 0.5,
  "terrace_reached": 1,
  "receipt_id": "rcpt_2026..."
}

Top-level envelope fields

Field Description
results Ranked list of matching documents (up to limit).
temperature Pipeline temperature (MVP 1: always lcma.temperature_default, typically 0.5). High temperature (≥ 0.5) indicates a cold-start graph — insufficient typed edges to compute coherence. Low temperature will indicate a well-connected graph in a future release.
terrace_reached LCMA pipeline stage reached — 1 = Terrace 1 (fast rerank).
receipt_id Audit receipt ID. Full pipeline trace written to <data_dir>/.lithos/receipts/ in stats.db.
conflicts (Only present when surface_conflicts=true) List of contradiction edges from the knowledge graph. Reserved for MVP 2.

Per-result fields

Field Description
id Document UUID.
title Document title.
snippet Short excerpt generated from document content matching the query.
score Final reranked composite score (higher = better).
path Relative path of the Markdown file within the data directory.
source_url Original source URL, if set in frontmatter; empty string otherwise.
updated_at ISO 8601 timestamp of last update; empty string if unset.
is_stale true if the document is past its expires_at freshness deadline.
derived_from_ids List of document IDs this document was derived from (provenance chain).
reasons Human-readable explanations of why this document scored highly (from each contributing scout).
scouts Names of the scouts that retrieved this document.
salience Stored salience value from retrieval history (0–1). High = frequently retrieved.

Scout Pipeline

lithos_retrieve runs a two-phase scout pipeline:

Phase A — Parallel scouts

All five (or six) Phase A scouts fire concurrently via asyncio.gather:

Scout Description
scout_vector ChromaDB cosine similarity search against the semantic index
scout_lexical Tantivy BM25 full-text search
scout_exact_alias Graph-based exact alias and title lookup
scout_tags_recency Tag-weighted recency scoring — surfaces recently-updated documents whose tags match the query context
scout_freshness Freshness boost for documents with time-sensitive keywords in the query
scout_task_context Documents associated with the active task (only fires when task_id is provided)

Phase A results are merged and normalised into a ranked pool. The top max_context_nodes documents become seeds for Phase B.

Phase B — Sequential scouts (seeded from Phase A)

Phase B scouts run sequentially, seeded from the top Phase A candidates:

Scout Description
scout_provenance Documents in the provenance chain of the Phase A seeds (derived-from relationships)
scout_graph Neighbours via typed LCMA edges (edges.db) and wiki-link graph traversal
scout_coactivation Documents frequently retrieved alongside the seeds in past sessions
scout_source_url Documents sharing a source URL with Phase A seeds (URL-cluster expansion)

A scout appears in the receipt as "fired" if it ran without error — even if it returned zero candidates.

Merge, Rerank, and Diversity

All Phase A and Phase B candidates are merged with merge_and_normalize into a unified pool. Terrace 1 reranking applies a weighted combination of:

  • Scout weights — configurable per-scout contribution via lcma.rerank_weights
  • Note-type priors — document type bias (e.g., reference notes weighted differently than observation)
  • Salience — historical retrieval frequency from stats.db

After reranking, a greedy MMR (Maximal Marginal Relevance) pass over the top 30 candidates promotes diversity by penalising near-duplicates (Jaccard similarity on title tokens, λ=0.7).


Audit Receipts

Every lithos_retrieve call writes a receipt row to stats.db under <data_dir>/.lithos/. The receipt captures: query, limit, namespace filter, scouts fired, candidates considered, final nodes with reasons, temperature, and terrace reached. Receipts are written even when the call errors.


Working Memory

When task_id is provided, each result document is upserted into working memory (stats_store.upsert_working_memory), linking the document to the active task for use by subsequent scout_task_context calls.


Example

# Basic retrieval
results = lithos_retrieve(query="exponential backoff rate limiting")
for r in results["results"]:
    print(f"[{r['score']:.2f}] {r['title']}")
    print(f"  Scouts: {', '.join(r['scouts'])}")
    print(f"  Why: {r['reasons'][0]}")

# Task-context retrieval (activates scout_task_context + working memory)
results = lithos_retrieve(
    query="async queue implementation",
    task_id="task-abc123",
    tags=["python", "patterns"],
    agent_id="my-agent"
)

# Namespace-scoped retrieval
results = lithos_retrieve(
    query="deployment patterns",
    namespace_filter=["production-runbooks", "ops"]
)

# Widen provenance seeding beyond the default limit
results = lithos_retrieve(
    query="rate limiting",
    limit=5,
    max_context_nodes=20  # seed Phase B from top 20 Phase A candidates
)

lithos_retrieve lithos_search
When to use Multi-signal recall, explainability, LCMA context Simple keyword, semantic, or hybrid lookup
Scout model 10 scouts in two phases + rerank + MMR Single mode (fulltext / semantic / hybrid / graph)
Returns Score + reasons + scouts + salience + provenance Score + snippet
Audit trail ✅ Receipt on every call
Task context ✅ via task_id
Speed Slightly slower (fan-out + rerank) Faster
LCMA required ✅ (lcma.enabled: true)

Use lithos_retrieve as the default retrieval tool in LCMA-aware agent pipelines. Use lithos_search for simple, fast lookups where explainability and multi-signal recall are not required.