fleet-rlm is a Daytona-backed recursive runtime wrapped by a thin transport shell. Three intentional design choices drive the shape of the codebase:
- The backend is intentionally thin. The Python layer is a transport and orchestration shell over dspy.ReAct, dspy.RLM, and Daytona sandboxes. Intelligence lives in DSPy (upstream) and in the recursive scheduling policy (this repo). Plumbing belongs in src/fleet_rlm/api/; runtime policy belongs in src/fleet_rlm/runtime/.
- The UI is treated as core, not peripheral. The runtime emits streaming events, code-execution results, and artifacts that only make sense in an interactive surface. src/frontend/ is comparable in line count to src/fleet_rlm/ because it surfaces work the runtime is already doing rather than duplicating it.
- Two agent layers, both dspy.*, both real.
  - Chat surface: dspy.ReAct at src/fleet_rlm/runtime/agent/agent.py (FleetAgent).
  - Recursive engine: dspy.RLM at src/fleet_rlm/runtime/models/builders.py, with delegation at src/fleet_rlm/runtime/tools/rlm_delegate.py.
High-level layering
The runtime core is the live center. The transport shell, persistence layer, and offline quality lane all attach to it; they do not replace it.
Layer 1 — Transport shell
A thin FastAPI app with two WebSocket endpoints and a small set of REST routers.
Primary files:
| File | Role |
|---|---|
| api/main.py | App factory, lifespan, route mounting, SPA asset serving |
| api/bootstrap.py | Critical startup wiring + optional warmup |
| api/routers/ws/endpoint.py | Two WebSocket surfaces: chat stream + execution events |
| api/runtime_services/chat_runtime.py | Per-turn runtime preparation |
| api/runtime_services/chat_persistence.py | Turn and session lifecycle writes |
| api/runtime_services/diagnostics.py | Connectivity probes, status composition |
| api/runtime_services/settings.py | Runtime settings read/mutate (local-only writes) |
| api/runtime_services/volumes.py | Daytona volume browsing |
Responsibilities:
- App factory, lifespan, route mounting, and SPA asset serving (the SPA is served from src/fleet_rlm/ui/dist in published installs, or src/frontend/dist in source checkouts).
- Auth-derived HTTP and WebSocket identity (dev or Entra).
- Session lookup, runtime preparation, and service orchestration.
- WebSocket lifecycle and execution-event envelope delivery.
- Runtime settings, diagnostics, and Daytona volume browsing.
The transport layer should not contain business logic. If you find yourself reaching for DSPy primitives or Daytona SDK calls inside a router, it belongs one layer down.
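As a rough illustration of that boundary, the sketch below keeps a transport-level handler limited to validation, delegation, and envelope wrapping. ChatRuntime, EchoRuntime, handle_chat_message, and the payload keys are hypothetical names chosen for this example. They are not taken from the fleet-rlm codebase, and the real routers are FastAPI-based rather than plain functions.

```python
from dataclasses import dataclass
from typing import Protocol


class ChatRuntime(Protocol):
    """Hypothetical runtime-layer interface; not the repo's actual API."""

    def run_turn(self, session_id: str, message: str) -> str: ...


@dataclass
class EchoRuntime:
    """Stand-in runtime so the sketch runs without DSPy or Daytona."""

    prefix: str = "echo"

    def run_turn(self, session_id: str, message: str) -> str:
        return f"{self.prefix}[{session_id}]: {message}"


def handle_chat_message(runtime: ChatRuntime, payload: dict) -> dict:
    """Transport-level handler: validate, delegate, wrap. No agent logic here."""
    session_id = payload.get("session_id")
    message = payload.get("message")
    if not session_id or not message:
        return {"type": "error", "detail": "session_id and message are required"}
    # Everything involving the agent or sandbox happens behind this one call.
    reply = runtime.run_turn(session_id, message)
    return {"type": "reply", "session_id": session_id, "text": reply}
```

Because the handler only sees the ChatRuntime interface, the runtime can be swapped for a stub in tests without touching the transport code.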
Layer 2 — Runtime core
The live cognition loop. This is where the ReAct agent runs, tools are dispatched, and per-turn execution events are assembled.
Primary files:
| File | Role |
|---|---|
| api/routers/ws/stream.py | Streaming turn execution coordination |
| runtime/factory.py | Builds the canonical Daytona-backed chat agent |
| runtime/agent/agent.py | FleetAgent, the dspy.ReAct chat module |
| runtime/agent/runtime.py | AgentRuntime, which wraps FleetAgent with sandbox + memory state |
| runtime/agent/signatures.py | DSPy input/output signatures |
| runtime/execution/* | Execution-event assembly, streaming helpers, citation tracking |
| runtime/models/builders.py | RLM construction (build_recursive_subquery_rlm) |
| runtime/models/registry.py | Runtime model registry |
| runtime/tools/* | Tool registry, including rlm_delegate.py |
Responsibilities:
- Shared chat and runtime execution (CLI, HTTP, WebSocket all converge here).
- Recursive delegation policy and tool execution.
- Execution-event assembly and workbench hydration inputs.
- Runtime model assembly and registry management.
AgentRuntime owns the per-session state envelope: the DSPy LM configuration, the active DaytonaInterpreter instance, the conversation dspy.History, loaded document paths, and core memory blocks.
Layer 3 — Daytona substrate
The execution backend. Daytona is the maintained sandbox provider; the runtime contract is intentionally Daytona-only.
Primary files:
| File | Role |
|---|---|
| integrations/daytona/interpreter.py | Public DaytonaInterpreter facade |
| integrations/daytona/workspace_manager.py | Workspace config, session lifecycle, persistent state, import/export |
| integrations/daytona/sandbox_executor.py | Code execution, sanitization, tool dispatch, result finalization |
| integrations/daytona/isolation.py | Recursive child policy, host-mediated evidence persistence, and context staging |
| integrations/daytona/models.py | Sandbox specs, workspace config, staged-context records, smoke results, chat/session contracts |
| integrations/daytona/runtime.py | Runtime facade around workspace bootstrap and session creation |
| integrations/daytona/workspace_runtime.py | Workspace path, repo checkout, and session reconciliation helpers |
| integrations/daytona/sdk_ops.py | Volume, snapshot, lifecycle, and lower-level Daytona SDK helpers |
| integrations/daytona/bridge.py | Host-callback broker (sandbox → host) |
| integrations/daytona/diagnostics.py | Structured diagnostics + smoke validation |
| integrations/daytona/errors.py | Provider-local error types |
Responsibilities:
- Sandbox and interpreter lifecycle (creation, resume, shutdown).
- Repo checkout (sandbox.git.clone(...)), workspace path staging, and durable mounted volumes.
- Persistent Python execution context via sandbox.code_interpreter.create_context(...) and run_code(...).
- Provider-specific diagnostics and volume normalization.
The Fleet-facing provider contract is async-first. DaytonaInterpreter.astart(), ashutdown(), aexecute(), aconfigure_workspace(), aimport_session_state(), and the DaytonaSandboxSession file/lifecycle a* methods are real coroutines. Sync helpers remain as public compatibility shims for notebooks, tests, and direct Python API users.
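The async-first pattern with sync shims can be sketched roughly as follows. SketchInterpreter is a hypothetical stand-in: only the astart/aexecute/ashutdown names come from the text above, while the sync method names (start, execute, shutdown) and the asyncio.run-based shim strategy are assumptions made for this example, not a description of how DaytonaInterpreter implements its compatibility layer.

```python
import asyncio


class SketchInterpreter:
    """Hypothetical stand-in; not the real DaytonaInterpreter."""

    def __init__(self) -> None:
        self.started = False

    async def astart(self) -> None:
        await asyncio.sleep(0)  # placeholder for real sandbox startup I/O
        self.started = True

    async def aexecute(self, code: str) -> str:
        if not self.started:
            raise RuntimeError("call astart() first")
        await asyncio.sleep(0)  # placeholder for remote code execution
        return f"ran: {code}"

    async def ashutdown(self) -> None:
        await asyncio.sleep(0)  # placeholder for sandbox teardown
        self.started = False

    # Sync shims (assumed names): each one drives the matching coroutine
    # to completion on a fresh event loop.
    def start(self) -> None:
        asyncio.run(self.astart())

    def execute(self, code: str) -> str:
        return asyncio.run(self.aexecute(code))

    def shutdown(self) -> None:
        asyncio.run(self.ashutdown())
```

One caveat with this particular shim strategy: asyncio.run raises RuntimeError when called from a thread that already has a running event loop, so a shim built this way would need a different approach in environments that already run one.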
See Daytona runtime for the full substrate model: snapshots, volumes, session continuity, and the broker bridge.
Layer 4 — Offline quality
DSPy evaluation, GEPA optimization, and offline scoring run in their own lane.
Primary files: src/fleet_rlm/runtime/quality/* — dspy_evaluation.py, gepa_optimization.py, mlflow_evaluation.py, mlflow_optimization.py, workspace_metrics.py, scorers.py, module_registry.py.
Responsibilities:
- DSPy evaluation against registered modules.
- GEPA optimization runs.
- Offline scoring, datasets, and module registry management.
Optimized artifacts are persisted for manual review; they are not auto-loaded into the live runtime. See DSPy integration for the optimization workflow.
Module map
Stateful restore
Session manifests on durable Daytona storage are the authoritative restart-restore source. The manifest state payload restores:
- dspy.History conversation turns.
- AgentRuntime core memory: default core memory plus persisted keys.
- Session-local loaded document paths.
- Daytona interpreter state: sandbox ID, workspace path, repo URL/ref, context paths, volume name, volume subpath.
Importing a session replaces session-local memory and document state instead of merging into the active runtime. Empty or missing state resets history, core memory, loaded documents, and sandbox buffers so switching sessions cannot leak stale agent context.
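The replace-not-merge rule can be illustrated with a minimal sketch. The state shape, the key names (history, core_memory, documents), and the DEFAULT_CORE_MEMORY contents are all hypothetical; the point is only that the restored state is built from defaults plus the manifest payload, with nothing carried over from the previously active session.

```python
from copy import deepcopy
from typing import Optional

# Hypothetical defaults; the real default core memory blocks may differ.
DEFAULT_CORE_MEMORY = {"persona": "", "scratchpad": ""}


def fresh_state() -> dict:
    """Return an empty session state seeded with default core memory."""
    return {
        "history": [],
        "core_memory": deepcopy(DEFAULT_CORE_MEMORY),
        "documents": [],
    }


def import_session_state(payload: Optional[dict]) -> dict:
    """Build state from the payload alone; the previous state is never consulted."""
    state = fresh_state()
    if not payload:
        # Empty or missing payload: everything resets to defaults.
        return state
    state["history"] = list(payload.get("history", []))
    # Default core memory plus persisted keys, as described above.
    state["core_memory"].update(payload.get("core_memory", {}))
    state["documents"] = list(payload.get("documents", []))
    return state
```

Since the function takes no reference to the active state, stale history or documents from another session have no path into the result.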
Manifests live under meta/workspaces/<workspace_id>/users/<user_id>/react-session-<session_id>.json on the mounted Daytona volume.
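For orientation, the path layout above can be assembled like this. The helper name session_manifest_path and the volume_root parameter (the mount point of the Daytona volume) are assumptions for illustration; only the meta/workspaces/.../react-session-<session_id>.json layout comes from the text.

```python
from pathlib import PurePosixPath


def session_manifest_path(
    volume_root: str, workspace_id: str, user_id: str, session_id: str
) -> PurePosixPath:
    """Compose the manifest path under a (hypothetical) volume mount point."""
    for part in (workspace_id, user_id, session_id):
        # Reject values that would escape or reshape the directory layout.
        if not part or "/" in part or part in (".", ".."):
            raise ValueError(f"invalid path component: {part!r}")
    return (
        PurePosixPath(volume_root)
        / "meta"
        / "workspaces"
        / workspace_id
        / "users"
        / user_id
        / f"react-session-{session_id}.json"
    )
```

With a mount point of /mnt/vol, workspace ws1, user u1, and session abc, the helper yields /mnt/vol/meta/workspaces/ws1/users/u1/react-session-abc.json.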
Reading order
When you need to understand the live backend, read in this order:
1. src/fleet_rlm/api/main.py
2. src/fleet_rlm/api/routers/ws/endpoint.py
3. src/fleet_rlm/api/routers/ws/stream.py
4. src/fleet_rlm/runtime/factory.py
5. src/fleet_rlm/runtime/agent/agent.py
6. src/fleet_rlm/runtime/tools/rlm_delegate.py
7. src/fleet_rlm/integrations/daytona/interpreter.py
8. src/fleet_rlm/integrations/daytona/runtime.py
Source of truth
When the docs disagree with the code, trust the code and the generated contracts:
- Backend routes and WebSocket behavior: src/fleet_rlm/api/.
- Runtime cognition and Daytona execution: src/fleet_rlm/runtime/ and src/fleet_rlm/integrations/daytona/.
- Canonical HTTP schema: openapi.yaml.
Historical transition notes may mention orchestration_app/ or api/orchestration/. Those labels are not part of the current tree.