fleet-rlm is a web workspace for running recursive language-model tasks on top of DSPy and Daytona sandboxes. You chat with a ReAct agent in the browser; when a task is larger than a single context window, the agent delegates pieces to isolated sub-sandboxes, each running a bounded dspy.RLM per arXiv 2512.24601v2.

Who it’s for

DSPy users who want a UI-driven workspace for long-context tasks, recursive decomposition, and sandboxed code execution — without hand-rolling the transport, persistence, and sandbox plumbing.

What it removes

Writing your own WebSocket transport, session persistence, Daytona sandbox lifecycle, execution-trace UI, and recursive-delegation policy around a DSPy program. fleet-rlm ships all of that behind a single command: uv run fleet web.

Two layers, both dspy.*

Chat surface

dspy.ReAct for interactive turn-taking. Implemented as FleetAgent in src/fleet_rlm/runtime/agent/agent.py.

Recursive engine

dspy.RLM running inside a child Daytona sandbox. Built in src/fleet_rlm/runtime/models/builders.py via build_recursive_subquery_rlm().

The chat agent is the entry point. The recursive engine runs when a task exceeds what a single ReAct context can handle. Both share a single LLM-call budget across the recursive tree.
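The idea of one call budget shared across a recursive tree can be illustrated with a minimal, standalone sketch. Everything here is hypothetical and for illustration only: CallBudget, BudgetExhausted, fake_llm, solve, and the window parameter are not fleet-rlm or DSPy names, and the real delegation policy lives in the runtime code. The sketch assumes a task is split in half whenever it does not fit the window, and that every model call, at any depth, charges the same counter.

```python
from dataclasses import dataclass


class BudgetExhausted(RuntimeError):
    """Raised when the shared LLM-call budget is spent (hypothetical)."""


@dataclass
class CallBudget:
    """Hypothetical counter shared by every node in the recursive tree."""

    max_calls: int
    used: int = 0

    def charge(self) -> None:
        # Refuse the call once the whole tree has spent its allowance.
        if self.used >= self.max_calls:
            raise BudgetExhausted(f"budget of {self.max_calls} LLM calls spent")
        self.used += 1


def fake_llm(text: str, budget: CallBudget) -> str:
    """Stand-in for a model call; charges the shared budget exactly once."""
    budget.charge()
    return f"summary({len(text)} chars)"


def solve(text: str, budget: CallBudget, window: int = 100) -> str:
    """Answer directly if the text fits the window, else split and recurse."""
    if len(text) <= window:
        return fake_llm(text, budget)
    mid = len(text) // 2
    # Both children receive the same budget object, not a copy.
    left = solve(text[:mid], budget, window)
    right = solve(text[mid:], budget, window)
    # One more call combines the two child answers.
    return fake_llm(left + right, budget)


budget = CallBudget(max_calls=16)
answer = solve("x" * 400, budget)
print(answer, budget.used)
```

Because the same CallBudget instance is passed down every branch, a deep or wide decomposition cannot spend more calls than the root allowed; the first call past the limit raises instead of silently continuing.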

Empirical capability

Fleet-RLM’s RLM capabilities were benchmarked against the published RLM paper and Prime Intellect’s official oolong-rlm environment.

| Benchmark | Paper RLM (GPT-5) | Fleet-RLM + Gemini 3.1 Pro |
| --- | --- | --- |
| S-NIAH (50 tasks, 50K–200K chars) | (solved) | 100.0% |
| OOLONG-Official (trec_coarse @ 128K) | 56.5% | 91.67% (+35.2 pp) |
| OOLONG synthetic (30 tasks) | 56.5% (reference) | 74.0% |
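The percentage-point (pp) delta in the table is a plain difference of accuracies, not a relative improvement. A short check using only the figures above is sketched below; pp_delta is a helper name introduced for illustration. The synthetic row lists the paper figure only as a reference, so the second delta is indicative rather than a like-for-like comparison.

```python
# Accuracy figures copied from the benchmark table, in percent.
PAPER_RLM = 56.5
FLEET_OFFICIAL = 91.67
FLEET_SYNTHETIC = 74.0


def pp_delta(ours: float, baseline: float) -> float:
    """Percentage-point difference, rounded to one decimal place."""
    return round(ours - baseline, 1)


official_delta = pp_delta(FLEET_OFFICIAL, PAPER_RLM)    # 35.2
synthetic_delta = pp_delta(FLEET_SYNTHETIC, PAPER_RLM)  # 17.5
print(official_delta, synthetic_delta)
```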

Where to go next

Quickstart

Install fleet-rlm and launch the Web UI in 30 seconds.

Architecture

Thin transport, runtime core, and Daytona substrate.

Agent model

FleetAgent (dspy.ReAct), AgentRuntime, signatures, and core memory.

Recursive RLM

Algorithm 1, delegation, REPL-variable mode, and the shared call budget.

Daytona runtime

Sandbox lifecycle, volumes, session continuity, and the host-callback bridge.

Sessions & persistence

Manifests, stateful restore, and the Neon-backed tenant store.

Observability

MLflow tracing, WebSocket events, diagnostics, and trace feedback.

Python API

DaytonaInterpreter, FleetAgent, runners, and DSPy signatures.

Source of truth

When the docs disagree with the code, trust the code:
  • Backend routes and WebSocket behavior live in src/fleet_rlm/api/.
  • Runtime and Daytona execution live in src/fleet_rlm/runtime/ and src/fleet_rlm/integrations/daytona/.
  • The canonical HTTP schema is openapi.yaml.