fleet-rlm is a single FastAPI process that serves both the API and the prebuilt frontend SPA. This guide covers running it outside of a developer laptop.
Architecture in production
A production deploy needs:

- fleet-rlm process: `uv run fleet-rlm serve-api` or `uv run fleet web`.
- Daytona: sandbox provider (`DAYTONA_API_KEY`, `DAYTONA_API_URL`).
- Neon/Postgres: canonical multi-tenant state (`DATABASE_URL`).
- Entra ID: bearer-token auth (`AUTH_MODE=entra`).
- MLflow (optional): tracing backend.
Environment configuration
.env.production
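A pre-flight check can catch a missing setting before the process starts. The sketch below parses a simple `KEY=value` file and reports which of the variables named in this guide are absent or empty. The helper names (`parse_env_file`, `missing_vars`) are hypothetical, and the parser is a deliberate simplification; how fleet-rlm itself loads `.env.production` is not covered here.

```python
# Variables named in this guide. A real deploy may need more;
# treat this tuple as an assumption, not the authoritative list.
REQUIRED_VARS = ("DAYTONA_API_KEY", "DAYTONA_API_URL", "DATABASE_URL", "AUTH_MODE")


def parse_env_file(text):
    """Parse simple KEY=value lines, skipping blank lines and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_vars(env, required=REQUIRED_VARS):
    """Return the required names that are unset or empty, in declaration order."""
    return [name for name in required if not env.get(name)]
```

The same check works against the live process environment by passing `os.environ` to `missing_vars`.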
Authentication
When `AUTH_MODE=entra`, both HTTP and WebSocket access use real Entra bearer-token validation plus Neon-backed tenant admission.
| Mode | Behavior |
|---|---|
| `dev` | Debug headers, local HS256 tokens, optional identity |
| `entra` | JWKS-backed Entra tokens, Neon tenant admission required |
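When a request is rejected, it can help to see which mode a token was minted for. The sketch below decodes the unverified header segment of a compact JWT and checks its `alg` field. It is a debugging aid only, not how fleet-rlm validates tokens: it performs no signature check, the helper names are hypothetical, and it assumes that Entra-issued tokens use an asymmetric algorithm rather than HS256.

```python
import base64
import json


def jwt_header(token):
    """Decode the header segment of a compact JWT WITHOUT verifying it."""
    header_b64 = token.split(".")[0]
    # base64url segments in JWTs omit padding; restore it before decoding.
    padded = header_b64 + "=" * (-len(header_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))


def looks_like_dev_token(token):
    """True if the header declares HS256, the algorithm the dev mode uses."""
    return jwt_header(token).get("alg") == "HS256"
```

A token that passes `looks_like_dev_token` should not be expected to work against a deploy running with `AUTH_MODE=entra`.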
Run the server
Direct invocation uses `uv run fleet-rlm serve-api` (or `uv run fleet web`). The process serves the SPA at `/` and the API under `/api/v1/*` from the same port.
Health probes
Two unauthenticated endpoints are designed for orchestrators and load balancers:

| Endpoint | Purpose |
|---|---|
| `GET /health` | Liveness: returns `{"ok": true, "version": "..."}` |
| `GET /ready` | Readiness: reports planner, database, and sandbox provider status |
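A probe client for these endpoints might look like the following sketch. It relies only on the `/health` response shape shown above. The `/ready` payload is returned as-is because its exact fields are not documented here. The function names and the injectable `fetch` parameter are illustrative choices, and the base URL is whatever host and port your deploy listens on.

```python
import json
from urllib.request import urlopen


def fetch_json(url, timeout=5.0):
    """GET a URL and decode the JSON body."""
    with urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def is_live(base_url, fetch=fetch_json):
    """True if GET /health answers with {"ok": true, ...}."""
    try:
        payload = fetch(base_url.rstrip("/") + "/health")
    except (OSError, ValueError):
        # OSError covers connection and HTTP errors; ValueError covers bad JSON.
        return False
    return isinstance(payload, dict) and payload.get("ok") is True


def readiness(base_url, fetch=fetch_json):
    """Return the raw /ready payload, or None if it cannot be fetched."""
    try:
        return fetch(base_url.rstrip("/") + "/ready")
    except (OSError, ValueError):
        return None
```

Passing `fetch` as a parameter keeps the checks testable without a running server; in normal use the default `fetch_json` performs the real request.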
WebSocket transport
Two streams power the live UI:

- `/api/v1/ws/execution`: chat stream events.
- `/api/v1/ws/execution/events`: execution graph events.

Both streams require authentication when `AUTH_REQUIRED=true`. Make sure your reverse proxy passes the HTTP/1.1 `Upgrade` handshake through and does not buffer these connections.
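Clients usually derive the WebSocket URLs from the public HTTP origin of the deploy. The sketch below maps `http` to `ws` and `https` to `wss` and appends the two stream paths listed above. The `ws_url` helper and the stream labels are hypothetical, and the sketch assumes the app is served at the root of the host with no path prefix added by the proxy.

```python
from urllib.parse import urlsplit, urlunsplit

# Stream paths from this guide; the short labels are illustrative only.
WS_PATHS = {
    "chat": "/api/v1/ws/execution",
    "graph": "/api/v1/ws/execution/events",
}


def ws_url(base_url, stream):
    """Build the WebSocket URL for a stream from the deploy's HTTP origin."""
    parts = urlsplit(base_url)
    # TLS-terminated origins must use wss so the upgrade stays encrypted.
    scheme = "wss" if parts.scheme == "https" else "ws"
    return urlunsplit((scheme, parts.netloc, WS_PATHS[stream], "", ""))
```

For example, `ws_url("https://fleet.example.com", "chat")` yields `wss://fleet.example.com/api/v1/ws/execution`, where the hostname is a placeholder.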
Frontend assets
Published installs of `fleet-rlm` include built frontend assets, so you do not need `pnpm` or a separate build step in production. If you build from source, run `pnpm run build` in `src/frontend/` before launching the server.
Smoke-test the deploy
Validate Daytona connectivity from the host:

See also
- Configuration reference — full env var list.
- HTTP API reference — endpoint surface.