Documentation Index

Fetch the complete documentation index at: https://docs.qredence.ai/llms.txt

Use this file to discover all available pages before exploring further.
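If you are scripting against the docs (for example, feeding them to an agent), the index is a plain text file. A minimal stdlib-only fetch might look like the sketch below; any HTTP client works:

```python
# Minimal sketch: download the llms.txt index with the Python standard library.
import urllib.request

with urllib.request.urlopen("https://docs.qredence.ai/llms.txt") as resp:
    index = resp.read().decode("utf-8")

print(index)  # one entry per documented page
```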

fleet-rlm is a web workspace for running recursive language-model tasks on top of DSPy and Daytona sandboxes. You chat with a ReAct agent in the browser; when a task outgrows a single context window, the agent delegates pieces to isolated sub-sandboxes, each running a bounded dspy.RLM, following the recursive-language-model approach of arXiv 2512.24601v2.

Who it’s for

DSPy users who want a UI-driven workspace for long-context tasks, recursive decomposition, and sandboxed code execution — without hand-rolling the transport, persistence, and sandbox plumbing.

What it removes

Writing your own WebSocket transport, session persistence, Daytona sandbox lifecycle, execution-trace UI, and recursive-delegation policy around a DSPy program. fleet-rlm ships all of that behind a single command: uv run fleet web.

Two layers, both dspy.*

Chat surface

dspy.ReAct for interactive turn-taking. Implemented as FleetAgent in src/fleet_rlm/runtime/agent/agent.py.
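For orientation, this is what a minimal dspy.ReAct loop looks like in plain DSPy. The tool and model names below are placeholders, not FleetAgent's actual configuration; FleetAgent layers transport, persistence, and delegation on top of this pattern:

```python
import dspy

# Placeholder model; FleetAgent's real LM configuration lives in the repo.
dspy.configure(lm=dspy.LM("openai/gpt-4o-mini"))

def read_file(path: str) -> str:
    """Hypothetical tool: return the contents of a workspace file."""
    with open(path, encoding="utf-8") as f:
        return f.read()

# dspy.ReAct interleaves reasoning with tool calls until it can answer.
agent = dspy.ReAct("question -> answer", tools=[read_file])
result = agent(question="Summarize README.md")
print(result.answer)
```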

Recursive engine

dspy.RLM running inside a child Daytona sandbox. Built in src/fleet_rlm/runtime/models/builders.py via build_recursive_subquery_rlm().
The chat agent is the entry point. The recursive engine runs when a task exceeds what a single ReAct context can handle. Both share a single LLM-call budget across the recursive tree.
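The budget-sharing idea can be sketched in a few lines of plain Python. Everything below (CallBudget, solve, the thresholds) is illustrative and not the repo's code; the real construction is build_recursive_subquery_rlm() in builders.py:

```python
from dataclasses import dataclass

MAX_DEPTH = 3          # hypothetical recursion cap
CONTEXT_LIMIT = 2_000  # hypothetical "fits in one context" threshold, in chars

@dataclass
class CallBudget:
    remaining: int  # LLM calls left for the WHOLE recursive tree

    def spend(self, n: int = 1) -> None:
        if self.remaining < n:
            raise RuntimeError("shared LLM-call budget exhausted")
        self.remaining -= n

def answer_directly(task: str, budget: CallBudget) -> str:
    budget.spend()  # a bounded leaf run costs one call here
    return f"<answer to {len(task)}-char task>"

def solve(task: str, budget: CallBudget, depth: int = 0) -> str:
    if len(task) <= CONTEXT_LIMIT or depth >= MAX_DEPTH:
        return answer_directly(task, budget)
    budget.spend()                # the decomposition step is also metered
    mid = len(task) // 2          # toy decomposition: split in half
    # Parent and children draw on the SAME budget object, so sibling
    # sub-tasks cannot overspend independently.
    left = solve(task[:mid], budget, depth + 1)
    right = solve(task[mid:], budget, depth + 1)
    return left + right

result = solve("x" * 10_000, CallBudget(remaining=32))
```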

Empirical capability

fleet-rlm's RLM capabilities were benchmarked against the results reported in the RLM paper and against Prime Intellect's official oolong-rlm environment.
| Benchmark | Paper RLM (GPT-5) | fleet-rlm + Gemini 3.1 Pro |
| --- | --- | --- |
| S-NIAH (50 tasks, 50K–200K chars) | solved | 100.0% |
| OOLONG-Official (trec_coarse @ 128K) | 56.5% | 91.67% (+35.2 pp) |
| OOLONG synthetic (30 tasks) | 56.5% (reference) | 74.0% |

Where to go next

Quickstart

Install fleet-rlm and launch the Web UI in 30 seconds.

Concepts

Understand the ReAct + RLM split and how delegation works.

Architecture

The thin transport, runtime core, and Daytona substrate.

CLI reference

fleet, fleet-rlm chat, serve-api, and optimize.

Source of truth

When the docs disagree with the code, trust the code:
  • Backend routes and WebSocket behavior live in src/fleet_rlm/api/.
  • Runtime and Daytona execution live in src/fleet_rlm/runtime/ and src/fleet_rlm/integrations/daytona/.
  • The canonical HTTP schema is openapi.yaml.