fleet-rlm is a transport and orchestration shell around DSPy. Both agent layers — the chat surface and the recursive engine — are real
dspy.* modules.
## Where DSPy lives in the codebase
| Concern | DSPy module | fleet-rlm location |
|---|---|---|
| Chat orchestration | dspy.ReAct | src/fleet_rlm/runtime/agent/agent.py (FleetAgent) |
| Recursive long-context | dspy.RLM | src/fleet_rlm/runtime/models/builders.py (build_recursive_subquery_rlm) |
| Tool delegation | ReAct tool | src/fleet_rlm/runtime/tools/rlm_delegate.py |
| Offline optimization | dspy.GEPA | src/fleet_rlm/runtime/quality/* |
## Configure the planner LM
The planner LM drives both ReAct and RLM. Configure it once via environment variables in your `.env` file — fleet-rlm wires it into DSPy automatically.
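A minimal `.env` sketch, assuming the variable names below. They are placeholders rather than fleet-rlm's documented keys, so check the settings module for the exact names:

```bash
# Hypothetical keys: substitute whatever names fleet-rlm's settings actually read.
PLANNER_LM_MODEL=openai/gpt-4o-mini   # any model string DSPy's LM client accepts
OPENAI_API_KEY=sk-...                 # credentials for the provider behind that model string
```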
## Use the chat agent from Python
AgentRuntime owns:
- The DSPy LM configuration.
- The Daytona interpreter instance.
- Conversation history (`dspy.History`).
- Loaded document paths and core memory.
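A minimal usage sketch. `AgentRuntime` is named on this page, but the import path and the `chat` method shown here are assumptions; check the runtime package (see `src/fleet_rlm/runtime/agent/agent.py`) for the real surface:

```python
# Sketch only: the import path and call shape below are assumptions, not the published API.
from fleet_rlm.runtime import AgentRuntime  # hypothetical import location

runtime = AgentRuntime()  # picks up the planner LM configuration from the environment
reply = runtime.chat("Summarize the loaded documents.")  # hypothetical method name
print(reply)
```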
## Build a recursive sub-query RLM
When you need a bounded recursive runner manually, construct it with the builder shown in the sketch below. In most cases, though, prefer the `delegate_to_rlm` tool from inside the ReAct agent — it enforces the shared call budget for you. See Recursive RLM for the full delegation flow.
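A rough sketch of the manual path. The builder name and module path come from the table above; the keyword arguments are assumptions, so check `build_recursive_subquery_rlm` for the real signature:

```python
# Sketch only: the keyword arguments below are assumptions, not the real signature.
from fleet_rlm.runtime.models.builders import build_recursive_subquery_rlm

rlm = build_recursive_subquery_rlm(max_calls=8)  # hypothetical call-budget parameter
result = rlm(question="What changed between the two report revisions?")  # DSPy-module-style call
print(result)
```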
## Offline GEPA optimization
fleet-rlm’s optimization layer registers DSPy modules and runs GEPA against them. List the registered modules from the CLI (a sketch follows):
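A hypothetical invocation, assuming the `optimize` subcommand exposes a listing mode; the exact command may differ, so check the CLI reference:

```bash
# Hypothetical subcommand form: consult the CLI reference for the real listing option.
fleet-rlm optimize list
```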
### LongCoT reasoner
The optimization registry includes `longcot-reasoner` — a LongCoT reasoning module evaluated with a continuous, answer-dominant 0.6/0.4 GEPA metric and tiered feedback. A one-time offline GEPA optimization was run against an 80-row, answered-only LongCoT dataset (64 train / 16 validation), producing a reviewable optimized artifact bundle.
The pipeline persists:
- Baseline-vs-optimized holdout evidence.
- Prompt snapshots.
- Reflection-model provenance.
- MLflow metadata.
The optimized artifact is saved for manual review and is not auto-loaded into the live runtime. Promote artifacts deliberately.
## MLflow tracing
When `MLFLOW_ENABLED=true` (the default), every DSPy call emits a trace to `MLFLOW_TRACKING_URI` under the `MLFLOW_EXPERIMENT` experiment (default: `fleet-rlm`).
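For example, a local configuration consistent with the defaults described here (values are illustrative):

```bash
MLFLOW_ENABLED=true                        # tracing on (the default)
MLFLOW_TRACKING_URI=http://127.0.0.1:5001  # points at the local server described below
MLFLOW_EXPERIMENT=fleet-rlm                # default experiment name
```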
In `APP_ENV=local`, the API server auto-starts a localhost MLflow target on port 5001 unless you set `MLFLOW_AUTO_START=false`. To run it yourself:
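One way to run the tracking server yourself, using the standard MLflow CLI:

```bash
# Standard MLflow tracking server on the port fleet-rlm expects locally.
mlflow server --host 127.0.0.1 --port 5001
```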
To attach feedback to a recorded trace, call `POST /api/v1/traces/feedback`.
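A hypothetical request; the host, port, and payload fields are assumptions rather than the documented schema:

```bash
# Hypothetical host, port, and fields: check the API schema for the real contract.
curl -X POST http://127.0.0.1:8000/api/v1/traces/feedback \
  -H "Content-Type: application/json" \
  -d '{"trace_id": "tr-123", "score": 1, "comment": "useful answer"}'
```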
## See also
- CLI reference — the `optimize`, `chat`, and `serve-api` subcommands.
- Recursive RLM — delegation policy and shared call budget.