
fleet-rlm is a single FastAPI process that serves both the API and the prebuilt frontend SPA. This guide covers running it outside of a developer laptop.

Architecture in production

A production deploy needs:
  • fleet-rlm process — uv run fleet-rlm serve-api or uv run fleet web.
  • Daytona — sandbox provider (DAYTONA_API_KEY, DAYTONA_API_URL).
  • Neon/Postgres — canonical multi-tenant state (DATABASE_URL).
  • Entra ID — bearer-token auth (AUTH_MODE=entra).
  • MLflow (optional) — tracing backend.
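
One way to keep the process running is a supervisor such as systemd. A hypothetical unit file sketch — the install path, service user, and uv location are assumptions, not fleet-rlm defaults:

```ini
# /etc/systemd/system/fleet-rlm.service — hypothetical example
[Unit]
Description=fleet-rlm API + SPA
After=network-online.target
Wants=network-online.target

[Service]
# Assumed service user and install location — adjust for your host.
User=fleet
WorkingDirectory=/opt/fleet-rlm
EnvironmentFile=/opt/fleet-rlm/.env.production
ExecStart=/usr/local/bin/uv run fleet-rlm serve-api --host 127.0.0.1 --port 8000
Restart=on-failure

[Install]
WantedBy=multi-user.target
```

Binding to 127.0.0.1 pairs with the reverse-proxy setup described below; bind to 0.0.0.0 only if the host firewall handles exposure.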

Environment configuration

.env.production

```bash
# App
APP_ENV=production
AUTH_MODE=entra
AUTH_REQUIRED=true
DATABASE_REQUIRED=true

# LLM
DSPY_LM_MODEL=openai/gpt-4o
DSPY_LLM_API_KEY=sk-...

# Daytona
DAYTONA_API_KEY=...
DAYTONA_API_URL=https://app.daytona.io/api

# Database
DATABASE_URL=postgresql://USER:PASS@HOST/DB?sslmode=require

# MLflow
MLFLOW_ENABLED=true
MLFLOW_TRACKING_URI=https://mlflow.example.com
```
Runtime settings writes (PATCH /api/v1/runtime/settings) are intentionally limited to APP_ENV=local. In staging and production, settings are read-only via the API.
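
Since production settings are read-only at runtime, it pays to fail fast at startup if anything required is unset. A minimal sketch — the variable list mirrors the example file above; the helper names are illustrative, not part of fleet-rlm:

```python
# Hypothetical startup check: verify required production settings exist.
REQUIRED_VARS = [
    "APP_ENV", "AUTH_MODE", "DATABASE_URL",
    "DAYTONA_API_KEY", "DAYTONA_API_URL",
]

def missing_vars(env: dict) -> list:
    """Return the required variables that are unset or empty."""
    return [name for name in REQUIRED_VARS if not env.get(name)]

def assert_configured(env: dict) -> None:
    """Abort before serving traffic if configuration is incomplete."""
    missing = missing_vars(env)
    if missing:
        raise SystemExit(f"missing required settings: {', '.join(missing)}")
```

Call `assert_configured(dict(os.environ))` in whatever wrapper launches the server.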

Authentication

When AUTH_MODE=entra, both HTTP and WebSocket access use real Entra bearer-token validation plus Neon-backed tenant admission.
| Mode | Behavior |
| --- | --- |
| dev | Debug headers, local HS256 tokens, optional identity |
| entra | JWKS-backed Entra tokens, Neon tenant admission required |
See the Auth Modes reference for the full configuration matrix.
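
Full Entra validation verifies the JWT signature against Microsoft's JWKS endpoint and so needs network access, but the first step any gateway or debug script performs is extracting the bearer credential. A hypothetical helper (not part of fleet-rlm's API) to illustrate the header shape:

```python
def extract_bearer(authorization):
    """Pull the token out of an 'Authorization: Bearer <jwt>' header.

    Returns None when the header is absent or not a Bearer credential.
    Signature and claim checks (JWKS key lookup, audience, expiry)
    happen after this step and are not sketched here.
    """
    if not authorization:
        return None
    scheme, _, token = authorization.partition(" ")
    if scheme.lower() != "bearer" or not token:
        return None
    return token.strip()
```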

Run the server

Direct invocation:

```bash
uv run fleet-rlm serve-api --host 0.0.0.0 --port 8000
```
Behind a reverse proxy (recommended), terminate TLS at the proxy and forward to fleet-rlm on a private interface. fleet-rlm serves the SPA at / and the API under /api/v1/* from the same port.

Health probes

Two unauthenticated endpoints are designed for orchestrators and load balancers:
| Endpoint | Purpose |
| --- | --- |
| GET /health | Liveness — returns {"ok": true, "version": "..."} |
| GET /ready | Readiness — reports planner, database, and sandbox provider status |
Sample readiness response:

```json
{
  "ready": true,
  "planner_configured": true,
  "planner": "ready",
  "database": "ready",
  "database_required": true,
  "sandbox_provider": "daytona"
}
```
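
When scripting a deploy gate, you may want to fail on individual component statuses rather than only the top-level flag. A hypothetical checker over the response shape above (field semantics are taken from the sample; the function itself is an illustration):

```python
def readiness_problems(payload: dict) -> list:
    """List reasons a /ready response should be treated as not ready."""
    problems = []
    if not payload.get("ready"):
        problems.append("ready flag is false")
    if payload.get("planner") != "ready":
        problems.append(f"planner status: {payload.get('planner')}")
    # Only treat the database as blocking when the deploy requires it.
    if payload.get("database_required") and payload.get("database") != "ready":
        problems.append(f"database status: {payload.get('database')}")
    return problems
```

A CI step could fetch `/ready`, parse the JSON, and abort the rollout if the returned list is non-empty.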

WebSocket transport

Two streams power the live UI:
  • /api/v1/ws/execution — chat stream events.
  • /api/v1/ws/execution/events — execution graph events.
Both require the same auth as HTTP endpoints when AUTH_REQUIRED=true. Make sure your reverse proxy is configured to upgrade HTTP/1.1 connections (no buffering).
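
As one illustration, an nginx location block satisfying these requirements might look like the following sketch — the upstream address and overall server block (TLS, server_name) are assumptions for your environment:

```nginx
# Hypothetical nginx fragment; TLS terminates in the enclosing server block.
location / {
    proxy_pass http://127.0.0.1:8000;        # fleet-rlm on a private interface
    proxy_http_version 1.1;                  # required for WebSocket upgrade
    proxy_set_header Upgrade $http_upgrade;
    proxy_set_header Connection "upgrade";
    proxy_set_header Host $host;
    proxy_buffering off;                     # stream events without buffering
}
```

Because the SPA, the API, and both WebSocket streams share one port, a single catch-all location is enough.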

Frontend assets

Published installs of fleet-rlm include built frontend assets — you do not need pnpm or a separate build step in production. If you build from source, run pnpm run build in src/frontend/ before launching the server.

Smoke-test the deploy

Validate Daytona connectivity from the host:

```bash
uv run fleet-rlm daytona-smoke \
  --repo https://github.com/Qredence/fleet-rlm.git \
  --ref main
```

Hit the readiness endpoint:

```bash
curl https://your-deploy.example.com/ready
```

See also