Agent-decided Retrieval vs RAG

Why MuBit lets the agent decide retrieval mode and compartment instead of forcing you to hard-code a RAG pipeline.

Traditional RAG forces every retrieval decision into application code: you pick the index, choose the similarity metric, set the top-k, decide which documents to stuff into the prompt, and wire it all together before the model ever sees the query. When requirements change, you re-engineer the pipeline.

MuBit takes the opposite approach. The agent decides the retrieval mode and the memory compartment at query time. You call control.query with mode: "agent_routed", and the runtime selects the best retrieval lane — semantic search, temporal scan, sparse recall, SDM, graph traversal, or a combination — based on the query and the data it already holds. You don't hard-code the retrieval path; the agent routes itself.

What this looks like in practice

# One call. The runtime picks the lane.
response = client.advanced.query({
    "run_id": "support:agent:ticket-42",
    "query": "What did the customer say about billing last week?",
    "mode": "agent_routed",
})

Compare with a typical RAG setup where you would write: embed the query → hit a vector store → rank results → truncate to context window → inject into prompt template → call LLM. Each step is a place where retrieval quality can silently degrade, and every step is your code to maintain.
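To make the comparison concrete, here is a minimal sketch of the hand-written pipeline described above. It is illustrative only: the bag-of-words `embed` function stands in for a real embedding model, the in-memory list stands in for a vector store, and every name in it (`embed`, `cosine`, `retrieve`, `build_prompt`, `max_chars`) is hypothetical rather than part of MuBit or any specific library. The final LLM call is left out.

```python
import math
from collections import Counter


def embed(text):
    """Toy bag-of-words vector; a stand-in for a real embedding model."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[term] * b[term] for term in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, documents, top_k=2):
    """Score every document against the query and keep the top_k."""
    query_vec = embed(query)
    scored = [(cosine(query_vec, embed(doc)), doc) for doc in documents]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


def build_prompt(query, ranked, max_chars=200):
    """Truncate ranked context to a character budget, then fill the template."""
    context, used = [], 0
    for _, doc in ranked:
        if used + len(doc) > max_chars:
            break
        context.append(doc)
        used += len(doc)
    return "Context:\n" + "\n".join(context) + "\n\nQuestion: " + query


docs = [
    "Customer reported a billing error on the invoice",
    "Customer asked about shipping times",
    "Agent reset the customer password",
]
ranked = retrieve("billing error", docs, top_k=2)
prompt = build_prompt("billing error", ranked)
```

Even in this toy form, the metric, the top-k, the truncation budget, and the template are all fixed in application code, which is the maintenance burden that agent-routed retrieval is meant to remove.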

How this works

Concern	RAG pipeline (you own it)	MuBit agent-routed (runtime owns it)
Which index to query	Hard-coded per endpoint	Runtime selects lane from query semantics
Top-k and ranking	Fixed or tuned per use-case	Adaptive per memory compartment
Context assembly	Your prompt-stuffing logic	Runtime returns ranked results; agent consumes
Adding a new retrieval mode	New code path + deploy	Already available — runtime routes when useful
Observability	You instrument each hop	Built-in: `routing_summary` and `mode` on the response, plus `retrieval_mode`, `metadata_json`, and `score` on each evidence item
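The observability fields in the last row can be collapsed into one loggable record. The sketch below assumes the response behaves like a dictionary, that evidence items sit under an `evidence` key, and that `metadata_json` is a JSON-encoded string; none of these shapes is confirmed by this page, so adjust them to the actual response. The helper name `summarize_routing` and the sample values (including the `temporal_scan` lane string) are made up for illustration.

```python
import json
from collections import Counter


def summarize_routing(response):
    """Collapse a query response into a small dict suitable for logging."""
    # Assumption: evidence items are listed under an "evidence" key.
    evidence = response.get("evidence", [])
    lanes = Counter(item.get("retrieval_mode", "unknown") for item in evidence)
    scores = [item["score"] for item in evidence if "score" in item]
    # Assumption: metadata_json is a JSON string; empty or missing means {}.
    metadata = [json.loads(item.get("metadata_json") or "{}") for item in evidence]
    return {
        "mode": response.get("mode"),
        "routing_summary": response.get("routing_summary"),
        "lanes": dict(lanes),
        "top_score": max(scores) if scores else None,
        "metadata": metadata,
    }


# Hand-written sample data, not real MuBit output.
sample_response = {
    "mode": "agent_routed",
    "routing_summary": "semantic + temporal",
    "evidence": [
        {
            "retrieval_mode": "semantic_search",
            "score": 0.91,
            "metadata_json": '{"source": "ticket-42"}',
        },
        {"retrieval_mode": "temporal_scan", "score": 0.64, "metadata_json": ""},
    ],
}
summary = summarize_routing(sample_response)
```

Logging a record like this on every request keeps the retrieval path visible without instrumenting each hop yourself.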

When to override

Agent-routed retrieval is the right default. Override it when you have an explicit reason:

direct_lane: "semantic_search" — force pure vector similarity when you know the answer is a nearest-neighbor match.
Explicit RAG — when you need to assemble context from multiple external sources before calling MuBit, or when your compliance rules require auditable retrieval steps. In this case, use control.ingest / control.batch_insert to load data and control.query with a specific direct_lane to retrieve it.
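One way to keep overrides explicit is to build the query payload in a single place, so the default stays agent-routed and a forced lane is always a deliberate argument. The sketch below is hypothetical: `build_query` is not a MuBit function, and it assumes that `direct_lane` replaces `mode` in the payload rather than accompanying it, which this page does not specify. The field names `run_id`, `query`, `mode`, `direct_lane`, and `limit` are the ones used on this page.

```python
def build_query(run_id, query, direct_lane=None, limit=None):
    """Build a query payload: agent-routed by default, forced lane on request."""
    payload = {"run_id": run_id, "query": query}
    if direct_lane is None:
        payload["mode"] = "agent_routed"
    else:
        # Assumption: a direct lane is sent instead of "mode", not alongside it.
        payload["direct_lane"] = direct_lane
    if limit is not None:
        payload["limit"] = limit
    return payload


default_payload = build_query(
    "support:agent:ticket-42",
    "What did the customer say about billing last week?",
)
forced_payload = build_query(
    "support:agent:ticket-42",
    "Find messages similar to this complaint",
    direct_lane="semantic_search",
    limit=5,
)
```

The resulting dictionaries could then be passed to the same query call shown earlier, with one code path for both the default and the override.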

Failure modes and troubleshooting

Symptom	Root cause	Fix
Inconsistent answers across requests	Switching between agent-routed and hard-coded lanes	Pick one default mode and stick with it
Incident triage is difficult	Retrieval path not logged	Inspect `routing_summary` on the response and `retrieval_mode` / `metadata_json` on each evidence item
Missing deterministic relationships	Relying on similarity for structured facts	Ingest explicit facts and use control filters or graph-aware recall
Context window overflow	Returning too many results	Set `limit` on the query; let the agent prune the rest

Next steps

Build retrieval contracts at Retrieve data.
Review permissions and governance at Data guardrails.