fleet-rlm combines a ReAct chat orchestrator with recursive long-context execution over a shared Daytona-backed interpreter runtime.

ReAct chat orchestrator

FleetAgent (wrapped by AgentRuntime) is the interactive orchestrator. It:
  • Receives user requests from the CLI, HTTP API, or WebSocket transport.
  • Decides tool actions using dspy.ReAct.
  • Streams intermediate and final events back to the client.
  • Maintains conversation history and document context.
The chat agent is the entry point for every user interaction. It does not run long-context jobs itself; instead, it dispatches them through a dedicated tool.
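The responsibilities listed above can be sketched as a small loop. This is a simplified illustration in plain Python, not fleet-rlm's implementation and not the dspy.ReAct API: `run_turn`, `Event`, `demo_policy`, and the `word_count` tool are hypothetical names, and the policy function stands in for the language model.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterator, List, Tuple


@dataclass
class Event:
    """Hypothetical event type; the real event schema is not shown here."""
    kind: str      # "tool_call", "tool_result", or "final"
    payload: str


# A policy returns ("call", tool_name, argument) or ("finish", "", answer).
Policy = Callable[[str, List[Event]], Tuple[str, str, str]]


def run_turn(
    request: str,
    policy: Policy,
    tools: Dict[str, Callable[[str], str]],
    max_iters: int = 5,
) -> Iterator[Event]:
    """Yield intermediate and final events for one user request."""
    history: List[Event] = []
    for _ in range(max_iters):
        action, name, arg = policy(request, history)
        if action == "finish":
            yield Event("final", arg)
            return
        call = Event("tool_call", f"{name}({arg!r})")
        history.append(call)
        yield call  # streamed to the client before the tool runs
        result = Event("tool_result", tools[name](arg))
        history.append(result)
        yield result
    yield Event("final", "stopped: iteration limit reached")


def demo_policy(request: str, history: List[Event]) -> Tuple[str, str, str]:
    # Stand-in for the model: call one tool, then answer with its result.
    if not history:
        return ("call", "word_count", request)
    return ("finish", "", f"The request has {history[-1].payload} words.")


tools = {"word_count": lambda text: str(len(text.split()))}
events = list(run_turn("summarize the attached report", demo_policy, tools))
for event in events:
    print(event.kind, event.payload)
```

Because `run_turn` is a generator, a transport layer can forward each event as soon as it is produced rather than waiting for the final answer.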

Recursive long-context execution

For tasks that exceed a single ReAct context window, the agent delegates to a bounded dspy.RLM running inside a child Daytona sandbox. The recursive engine implements Algorithm 1 from arXiv 2512.24601v2:
  • Inputs are stored as REPL variables inside the child sandbox.
  • Sub-queries are dispatched recursively, bounded by max_iterations and max_llm_calls.
  • Sandboxes are isolated per delegation.
  • A single shared semantic-call budget covers the entire recursive tree.
See Recursive RLM for the delegation flow and isolation policy.
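The shared call budget can be illustrated with a toy recursion. The sketch below models only the max_llm_calls bound. It does not model sandboxes, REPL variables, or max_iterations, and it is not the dspy.RLM API. `CallBudget`, `BudgetExhausted`, `fake_llm_count`, and `recursive_count` are hypothetical names.

```python
from dataclasses import dataclass
from typing import List


class BudgetExhausted(RuntimeError):
    """Raised when the shared budget has no calls left."""


@dataclass
class CallBudget:
    """One budget object shared by every node of the recursive tree."""
    max_llm_calls: int
    used: int = 0

    def spend(self) -> None:
        if self.used >= self.max_llm_calls:
            raise BudgetExhausted(f"limit of {self.max_llm_calls} calls reached")
        self.used += 1


def fake_llm_count(chunk: List[int], budget: CallBudget) -> int:
    # Stand-in for a model call: every call draws on the shared budget.
    budget.spend()
    return len(chunk)


def recursive_count(items: List[int], budget: CallBudget, chunk_size: int = 2) -> int:
    """Split the input until chunks are small, then answer each chunk."""
    if len(items) <= chunk_size:
        return fake_llm_count(items, budget)
    mid = len(items) // 2
    # Both branches receive the same budget object, so the limit
    # applies to the whole tree rather than to each branch.
    return (recursive_count(items[:mid], budget, chunk_size)
            + recursive_count(items[mid:], budget, chunk_size))


budget = CallBudget(max_llm_calls=10)
total = recursive_count(list(range(8)), budget)
print(total, budget.used)  # 8 items split into 4 leaf chunks: prints "8 4"
```

Passing one mutable budget object down the tree, instead of giving each sub-query its own limit, is what keeps the total number of calls bounded regardless of how the recursion branches.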

Interpreter runtime backends

Interpreter backends provide isolated remote execution. The current contract is Daytona-only and provides:
  • Sandbox isolation from the host environment.
  • Persistent storage via durable mounted volumes (memory/, artifacts/, buffers/, meta/).
  • Controlled execution profiles for root and delegate behavior.
The same ReAct + recursive dspy.RLM runtime serves the CLI, HTTP API, and Web UI.

Runtime surfaces

Surface          Command
Web UI + API     uv run fleet web
Terminal chat    uv run fleet or uv run fleet-rlm chat
API server only  uv run fleet-rlm serve-api
All surfaces converge on shared orchestration and runtime modules.

Observability and state

The system emits two WebSocket streams:
  • /api/v1/ws/execution — chat stream events
  • /api/v1/ws/execution/events — execution graph events
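As a small illustration, the two stream URLs can be derived from the API server's base URL. The paths come from the list above. The base URL `http://localhost:8000` and the helper name `websocket_url` are assumptions for this sketch, not documented values.

```python
from urllib.parse import urlsplit, urlunsplit

CHAT_STREAM_PATH = "/api/v1/ws/execution"
EXECUTION_EVENTS_PATH = "/api/v1/ws/execution/events"


def websocket_url(base_url: str, path: str) -> str:
    """Map an http(s) base URL to the matching ws(s) URL for a stream path."""
    parts = urlsplit(base_url)
    scheme = {"http": "ws", "https": "wss"}.get(parts.scheme, parts.scheme)
    return urlunsplit((scheme, parts.netloc, path, "", ""))


base = "http://localhost:8000"  # assumed host and port
chat_url = websocket_url(base, CHAT_STREAM_PATH)
events_url = websocket_url(base, EXECUTION_EVENTS_PATH)
print(chat_url)    # ws://localhost:8000/api/v1/ws/execution
print(events_url)  # ws://localhost:8000/api/v1/ws/execution/events
```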
Persistence lives in Neon/Postgres as canonical multi-tenant state. Session manifests on durable storage are the authoritative restart-restore source: the manifest’s state payload restores dspy.History turns, agent core memory, loaded documents, and Daytona interpreter state.

Auth and environment guardrails

Runtime behavior depends on the following environment variables:
Variable           Purpose
APP_ENV            local, staging, or production
AUTH_MODE          dev or entra
AUTH_REQUIRED      Enforce auth on API routes
DATABASE_REQUIRED  Enforce Neon/Postgres connectivity
When AUTH_MODE=entra, HTTP and WebSocket access use real Entra bearer-token validation plus Neon-backed tenant admission. Runtime settings writes are intentionally limited to APP_ENV=local.
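The guardrails above could be read and validated roughly as follows. This is a minimal sketch, not fleet-rlm's configuration code: the `RuntimeGuardrails` class, the `load_guardrails` function, the default values, and the accepted spellings for boolean flags are all assumptions. Only the variable names and allowed values come from the table.

```python
import os
from dataclasses import dataclass
from typing import Mapping

APP_ENVS = {"local", "staging", "production"}
AUTH_MODES = {"dev", "entra"}


def _flag(value: str) -> bool:
    # Assumed truthy spellings; the real parser may accept others.
    return value.strip().lower() in {"1", "true", "yes", "on"}


@dataclass(frozen=True)
class RuntimeGuardrails:
    app_env: str
    auth_mode: str
    auth_required: bool
    database_required: bool

    @property
    def settings_writable(self) -> bool:
        # Runtime settings writes are limited to APP_ENV=local.
        return self.app_env == "local"


def load_guardrails(env: Mapping[str, str] = os.environ) -> RuntimeGuardrails:
    app_env = env.get("APP_ENV", "local")    # default is an assumption
    auth_mode = env.get("AUTH_MODE", "dev")  # default is an assumption
    if app_env not in APP_ENVS:
        raise ValueError(f"APP_ENV must be one of {sorted(APP_ENVS)}, got {app_env!r}")
    if auth_mode not in AUTH_MODES:
        raise ValueError(f"AUTH_MODE must be one of {sorted(AUTH_MODES)}, got {auth_mode!r}")
    return RuntimeGuardrails(
        app_env=app_env,
        auth_mode=auth_mode,
        auth_required=_flag(env.get("AUTH_REQUIRED", "false")),
        database_required=_flag(env.get("DATABASE_REQUIRED", "false")),
    )


prod = load_guardrails({
    "APP_ENV": "production",
    "AUTH_MODE": "entra",
    "AUTH_REQUIRED": "true",
    "DATABASE_REQUIRED": "1",
})
print(prod.settings_writable)  # False: settings writes are local-only
```

Validating the values at startup surfaces a mistyped APP_ENV or AUTH_MODE immediately instead of at the first request that depends on it.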

Goal-first, not repo-first

Repositories are one possible source of context, alongside local files, staged documents, pasted content, and URLs. Requests may include repo_url, repo_ref, context_paths, and batch_concurrency as per-turn execution hints.
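For illustration, the per-turn hints might be assembled like this. The four hint names come from the paragraph above. Their types, the `TurnHints` class, the `message` key, and the example repository URL are assumptions, since the request schema is not shown here.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List, Optional


@dataclass
class TurnHints:
    """Optional execution hints for a single turn (field types are assumed)."""
    repo_url: Optional[str] = None
    repo_ref: Optional[str] = None
    context_paths: List[str] = field(default_factory=list)
    batch_concurrency: Optional[int] = None

    def to_payload(self) -> dict:
        # Omit unset hints so a request without a repository stays minimal.
        return {k: v for k, v in asdict(self).items() if v not in (None, [])}


hints = TurnHints(
    repo_url="https://example.com/org/repo.git",  # placeholder URL
    repo_ref="main",
    context_paths=["docs/"],
    batch_concurrency=4,
)
# "message" is a hypothetical key for the user's goal.
payload = {"message": "Summarize the architecture docs", **hints.to_payload()}
print(json.dumps(payload, indent=2))
```

Every hint is optional in this sketch, which mirrors the goal-first framing: a request that relies only on pasted content or staged documents carries no repository fields at all.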

Next

Architecture

Thin transport, runtime core, and Daytona substrate.

Recursive RLM

Delegation, isolation, and the shared call budget.