Agent-decided Retrieval vs RAG
Why MuBit lets the agent decide retrieval mode and compartment instead of forcing you to hard-code a RAG pipeline.
Traditional RAG forces every retrieval decision into application code: you pick the index, choose the similarity metric, set the top-k, decide which documents to stuff into the prompt, and wire it all together before the model ever sees the query. When requirements change you re-engineer the pipeline.
MuBit takes the opposite approach. The agent decides the retrieval mode and the memory compartment at query time. You call control.query with mode: "agent_routed", and the runtime selects the best retrieval lane — semantic search, temporal scan, sparse recall, SDM, graph traversal, or a combination — based on the query and the data it already holds. You don't hard-code the retrieval path; the agent routes itself.
What this looks like in practice
# One call. The runtime picks the lane.
response = client.advanced.query({
"run_id": "support:agent:ticket-42",
"query": "What did the customer say about billing last week?",
"mode": "agent_routed",
})Compare with a typical RAG setup where you would write: embed the query → hit a vector store → rank results → truncate to context window → inject into prompt template → call LLM. Each step is a place where retrieval quality can silently degrade, and every step is your code to maintain.
How this works
| Concern | RAG pipeline (you own it) | MuBit agent-routed (runtime owns it) |
|---|---|---|
| Which index to query | Hard-coded per endpoint | Runtime selects lane from query semantics |
| Top-k and ranking | Fixed or tuned per use-case | Adaptive per memory compartment |
| Context assembly | Your prompt-stuffing logic | Runtime returns ranked results; agent consumes |
| Adding a new retrieval mode | New code path + deploy | Already available — runtime routes when useful |
| Observability | You instrument each hop | Built-in: routing_summary and mode on the response, plus retrieval_mode, metadata_json, and score on each evidence item |
When to override
Agent-routed retrieval is the right default. Override it when you have an explicit reason:
direct_lane: "semantic_search"— force pure vector similarity when you know the answer is a nearest-neighbor match.- Explicit RAG — when you need to assemble context from multiple external sources before calling MuBit, or when your compliance rules require auditable retrieval steps. In this case, use
control.ingest/control.batch_insertto load data andcontrol.querywith a specificdirect_laneto retrieve it.
Failure modes and troubleshooting
| Symptom | Root cause | Fix |
|---|---|---|
| Inconsistent answers across requests | Switching between agent-routed and hard-coded lanes | Pick one default mode and stick with it |
| Hard incident triage | Retrieval path not logged | Inspect routing_summary on the response and retrieval_mode / metadata_json on each evidence item |
| Missing deterministic relationships | Relying on similarity for structured facts | Ingest explicit facts and use control filters or graph-aware recall |
| Context window overflow | Returning too many results | Set limit on query; let the agent prune |
Next steps
- Build retrieval contracts at Retrieve data.
- Review permissions governance at Data guardrails.