
Step-Level Outcomes and Process Rewards

Record per-step reward signals during agentic runs so reflection produces richer, step-attributed lessons.

Run-level outcomes tell you whether the whole run succeeded. Step-level outcomes tell you which specific step caused success or failure. MuBit stores these signals and feeds them into reflection so lessons are attributed to the exact step that mattered.

Prerequisites

MuBit client initialized with a valid API key
A multi-step agent run where each step produces an observable result

Flow

1. Execute an agent step (tool call, LLM inference, decision).
2. Record a step outcome with signal, rationale, and optional directive hint.
3. Repeat steps 1 and 2 for each step in the run.
4. Reflect with include_step_outcomes set to true to produce step-attributed lessons. This is a wire-level field on the reflect request. The typed reflect() helpers do not forward it, so use the low-level passthrough (client.control.reflect({...}) in JS) or send the field directly over gRPC/HTTP from Python.
5. Call record_outcome() at the end to record the overall run-level signal, passing the reference_id of the lesson, evidence item, or archive block the outcome is about.

Minimal implementation example

step_outcomes.py

import os

from mubit import Client

# call_llm() and execute_tool() below are placeholders for your own agent code.
run_id = "agent:planner:task-123"
client = Client(
    endpoint=os.getenv("MUBIT_ENDPOINT", "https://api.mubit.ai"),
    api_key=os.environ["MUBIT_API_KEY"],
    run_id=run_id,
    transport="http",
)
 
# Step 1: Planning
plan = call_llm("Break down the task into sub-steps")
client.record_step_outcome(
    step_id="step-1-planning",
    step_name="initial_planning",
    outcome="success",
    signal=0.8,
    rationale="Generated a clear 3-step plan with dependencies identified",
    directive_hint="Include sub-task dependencies explicitly in plans",
)
 
# Step 2: Tool call
tool_result = execute_tool("search_api", query="relevant docs")
client.record_step_outcome(
    step_id="step-2-search",
    step_name="search_api",
    outcome="failure",
    signal=-0.6,
    rationale="Search returned no results — query was too narrow",
    directive_hint="Use broader search terms before narrowing",
)
 
# Step 3: Recovery
recovery = call_llm("Retry with broader query")
client.record_step_outcome(
    step_id="step-3-recovery",
    step_name="search_retry",
    outcome="success",
    signal=0.9,
    rationale="Broader query found the target document",
)
 
# Reflect to produce lessons.
#
# include_step_outcomes is a wire-level field (gRPC/HTTP) on the reflect
# request. The typed reflect() helper does not forward it, so the call below
# produces generic lessons rather than step-attributed ones. To get
# step-attributed lessons from Python, send the field directly over the
# transport (POST /v2/control/reflect with {"run_id": run_id,
# "include_step_outcomes": true}), or use the JS/TS control passthrough:
# client.control.reflect({ run_id, include_step_outcomes: true }).
lessons = client.reflect()
 
# Run-level outcome — reference_id is required and points at the lesson,
# evidence item, or archive block this outcome is about.
client.record_outcome(
    reference_id="<lesson_or_evidence_id>",
    outcome="success",
    signal=0.7,
    rationale="Task completed after one retry",
    # Optionally attribute the same outcome to every recalled entry that
    # contributed; the primary reference_id is never double-counted.
    # entry_ids=["<entry-1>", "<entry-2>"],
)
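
Because the typed reflect() helper does not forward include_step_outcomes, one way to send it from Python is a plain HTTP request. The following is a minimal sketch using only the standard library. The path and body fields come from the comment in the example above. The helper names build_reflect_request and reflect_with_step_outcomes are hypothetical, and the bearer-token Authorization header is an assumption, so check the Control HTTP reference for the actual authentication scheme and response shape.

```python
import json
import urllib.request


def build_reflect_request(endpoint, api_key, run_id):
    """Build a POST /v2/control/reflect request with include_step_outcomes set."""
    body = json.dumps({"run_id": run_id, "include_step_outcomes": True})
    return urllib.request.Request(
        url=endpoint.rstrip("/") + "/v2/control/reflect",
        data=body.encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "application/json",
            # ASSUMPTION: bearer-token auth. Verify against the HTTP reference.
            "Authorization": f"Bearer {api_key}",
        },
    )


def reflect_with_step_outcomes(endpoint, api_key, run_id):
    """Send the request and return the parsed JSON response as-is."""
    request = build_reflect_request(endpoint, api_key, run_id)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))
```

Splitting request construction from sending keeps the payload easy to inspect or log before anything goes over the network. Call reflect_with_step_outcomes(endpoint, api_key, run_id) after all step outcomes have been recorded.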

Field reference

Field	Type	Required	Description
`step_id`	string	yes	Unique step identifier within the run
`step_name`	string	no	Human-readable label for the step
`outcome`	string	yes	`success`, `failure`, `partial`, or `neutral`
`signal`	float	no	Reward signal from -1.0 (worst) to 1.0 (best)
`rationale`	string	no	Explanation of why the outcome was assigned
`directive_hint`	string	no	Hindsight guidance for future runs
`agent_id`	string	no	Agent that performed the step
`metadata_json`	string	no	Arbitrary structured metadata
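
The constraints in the table above can be checked locally before a step outcome is sent. This is a sketch of a hypothetical helper, validate_step_outcome, which is not part of the MuBit client. It assumes that metadata_json should contain valid JSON, which the table implies but does not state.

```python
import json

ALLOWED_OUTCOMES = {"success", "failure", "partial", "neutral"}


def validate_step_outcome(fields):
    """Return a list of problems in a step-outcome payload (empty if none)."""
    problems = []
    # step_id and outcome are the only required fields.
    for name in ("step_id", "outcome"):
        if not fields.get(name):
            problems.append(f"missing required field: {name}")
    outcome = fields.get("outcome")
    if outcome and outcome not in ALLOWED_OUTCOMES:
        problems.append(f"unknown outcome: {outcome!r}")
    signal = fields.get("signal")
    if signal is not None and not -1.0 <= signal <= 1.0:
        problems.append(f"signal out of range [-1.0, 1.0]: {signal}")
    metadata = fields.get("metadata_json")
    if metadata is not None:
        # ASSUMPTION: metadata_json is a string holding serialized JSON.
        try:
            json.loads(metadata)
        except (TypeError, ValueError):
            problems.append("metadata_json is not a valid JSON string")
    return problems
```

An empty list means the payload satisfies the documented constraints. The server may apply further checks that this sketch does not cover.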

Combining with lane-scoped memory

Step outcomes can be combined with lanes. If your multi-agent system uses lane-scoped memory, have each agent record its own step outcomes (setting agent_id), then reflect so that lessons carry both lane and step context:

# Agent "planner" records its step outcomes in the run
client.record_step_outcome(
    step_id="plan-v1",
    step_name="planning",
    outcome="success",
    signal=0.9,
    agent_id="planner",
)
 
# Reflect across all step outcomes. include_step_outcomes is a wire-level
# field the typed helper does not forward — send it directly over gRPC/HTTP
# (POST /v2/control/reflect with "include_step_outcomes": true), or use the
# JS control passthrough: client.control.reflect({ run_id, include_step_outcomes: true }).
client.reflect()

Failure modes and troubleshooting

Symptom	Root cause	Fix
Reflection produces generic lessons despite step outcomes	`include_step_outcomes` not set, or dropped by the typed helper	Send `include_step_outcomes: true` as a wire-level field on the reflect request — `client.control.reflect({ run_id, include_step_outcomes: true })` in JS, or POST it directly over gRPC/HTTP from Python (the typed `reflect()` helper does not forward it)
Step outcome not accepted	Missing `step_id` or `outcome`	Both fields are required
Lessons lack step attribution	Step outcomes recorded after reflection	Record step outcomes before calling `reflect()`
Too many step outcomes dilute signal	Recording outcomes for trivial steps	Only record outcomes for decision-significant steps
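
For the last row, one possible way to limit recording to decision-significant steps is to filter candidates before calling record_step_outcome(). The sketch below uses an illustrative heuristic, not MuBit behavior: keep every failure, plus any step whose signal magnitude reaches a threshold. The function names and the default threshold of 0.5 are assumptions to tune for your agent.

```python
from typing import Optional


def is_decision_significant(outcome: str, signal: Optional[float],
                            threshold: float = 0.5) -> bool:
    """Heuristic: keep failures, plus steps whose |signal| reaches the threshold."""
    if outcome == "failure":
        return True
    return signal is not None and abs(signal) >= threshold


def select_significant(steps, threshold: float = 0.5):
    """Filter candidate step outcomes (dicts) before they are recorded."""
    return [
        step for step in steps
        if is_decision_significant(step["outcome"], step.get("signal"), threshold)
    ]
```

If each dict uses the field names from the field reference, the surviving entries can be passed on with client.record_step_outcome(**step).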

Next steps

Review the full HTTP contract at Control HTTP reference.
Review the gRPC surface at Control gRPC reference.
See Lane-Scoped Multi-Agent Memory for memory isolation patterns.