Track what’s new across the Qredence product suite. For documentation changes, see the docs repository.
June 29, 2026
fleet-rlm

Week of June 29 — Fleet-RLM 0.6.2

New features

Bring-your-own-key (BYOK) LLM provider profiles
Hosted deployments running AUTH_MODE=neon can now bind their own planner and delegate LLM credentials per tenant/user. API keys are encrypted at rest with Fernet under FLEET_SECRET_ENCRYPTION_KEY, and responses only ever return has_api_key plus a masked preview. Plaintext keys never cross the API surface, and the runtime does not mutate the process environment to route requests. See Configuration (Auth modes) and HTTP API (LLM provider profiles).

Per-workspace encrypted Daytona credentials
PATCH /api/v1/runtime/settings under AUTH_MODE=neon now persists each workspace’s DAYTONA_* keys as encrypted workspace_runtime_settings ciphertext instead of returning 403 forbidden. Chat and runtime paths resolve the per-user Daytona config first and fall back to the server-level env only if none is set. Non-Daytona keys remain local-only. See HTTP API reference.

LiteLLM custom-provider opt-in hint
Two new environment variables, DSPY_LM_CUSTOM_PROVIDER and DSPY_DELEGATE_LM_CUSTOM_PROVIDER, let OpenAI-compatible bare-model endpoints pass an explicit custom_llm_provider hint to LiteLLM. The runtime no longer force-sets custom_llm_provider="openai" for every bare model with an api_base, so Anthropic and other non-OpenAI providers stop receiving OpenAI-format requests. See Configuration (Required: LLM).

PATCH /api/v1/runtime/settings reports skipped keys
Responses now include a skipped field listing masked-round-trip keys that were intentionally not persisted, so clients can distinguish updated keys from ignored no-op saves.
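
As a rough illustration of the custom-provider opt-in described above, the sketch below forwards custom_llm_provider only when the environment variable is set. The helper name build_lm_kwargs and the returned dict shape are assumptions made for illustration, not Fleet-RLM's actual code; only the environment variable names and the custom_llm_provider parameter come from this entry.

```python
from typing import Dict, Mapping, Optional


def build_lm_kwargs(
    model: str,
    api_base: Optional[str],
    env: Mapping[str, str],
    env_var: str = "DSPY_LM_CUSTOM_PROVIDER",
) -> Dict[str, str]:
    """Hypothetical helper: assemble keyword arguments for an LM client."""
    kwargs: Dict[str, str] = {"model": model}
    if api_base:
        kwargs["api_base"] = api_base
    # Opt-in only: no provider is assumed just because api_base is present.
    hint = env.get(env_var, "").strip()
    if hint:
        kwargs["custom_llm_provider"] = hint
    return kwargs
```

In practice one would pass os.environ as env, and env_var="DSPY_DELEGATE_LM_CUSTOM_PROVIDER" for the delegate LM.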

Updates

  • FastAPI pinned to ==0.139.0. Installs are reproducible on the current validated FastAPI release instead of floating forward against a >=0.138.2 floor.
  • litellm policy hardened. LiteLLM is installed only as DSPy’s transitive dependency; [tool.uv].override-dependencies still pins litellm>=1.87.0 to close 7 documented CVEs. A parse-time invariant test fails if litellm is ever re-added to direct deps or removed from the override pin.
  • Neon multi-tenant migrations. llm_role_bindings is now UUID-PK’d and scoped by tenant_id / user_id / workspace_id; the workspace_runtime_settings unique constraint is tightened to (tenant_id, workspace_id) so the settings upsert is tenant-aware.
  • README rewritten around the actual routed surfaces (/app/workspace, /app/optimization, /app/volumes, /app/settings) and the current make / pnpm validation lanes.

Bug fixes

  • Legacy XOR-encrypted profile ciphertext keeps decrypting. After rotation, the runtime tries FLEET_SECRET_ENCRYPTION_KEY, DEV_JWT_SECRET, and change-me in turn until a stored row decrypts — old rows are no longer bricked by rotating in a real Fernet key.
  • No cross-tenant BYOK leak from the connectivity probe. POST /runtime/tests/lm no longer mutates the shared LmDeps.planner_lm singleton; the per-user planner is invoked directly, so a smoke test can never swap another user’s in-flight chat onto a foreign BYOK LM.
  • Decrypt failures are observable. GET /api/v1/runtime/settings logs when a stored DAYTONA_API_KEY fails to decrypt (without leaking the value), and the PATCH path treats an empty incoming value for a key with an existing stored credential as a no-op — a failed GET can no longer enable an empty save that wipes the stored key.
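
The no-op rule for PATCH /api/v1/runtime/settings can be pictured roughly as follows. This is a sketch under stated assumptions: the function name, the masked_previews argument, and the separate updated list are hypothetical, and encryption of stored values is omitted for brevity.

```python
from typing import Dict, List, Tuple


def apply_settings_patch(
    stored: Dict[str, str],
    incoming: Dict[str, str],
    masked_previews: Dict[str, str],
) -> Tuple[Dict[str, str], List[str], List[str]]:
    """Hypothetical merge: returns (new_settings, updated_keys, skipped_keys)."""
    result = dict(stored)
    updated: List[str] = []
    skipped: List[str] = []
    for key, value in incoming.items():
        has_stored = bool(stored.get(key))
        # A client echoing the masked preview back is a round trip, not an edit.
        is_masked_echo = key in masked_previews and value == masked_previews[key]
        if has_stored and (value == "" or is_masked_echo):
            skipped.append(key)  # keep the existing credential untouched
            continue
        result[key] = value
        updated.append(key)
    return result, updated, skipped
```

With this shape, an empty value sent after a failed GET leaves the stored credential in place and shows up under skipped rather than silently wiping the key.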
June 17, 2026
fleet-rlm

Week of June 17 — Fleet-RLM 0.6.0

New features

Workbench sidepanel with Trajectories, Graph, and Volume tabs
A workspace-local collapsible sidepanel now sits alongside the chat. Trajectories renders the session trace timeline, Graph renders a React Flow parent/child span view backed by persisted MLflow/debug spans, and Volume embeds a searchable Daytona volume tree with resizable desktop split and inline file preview. Chat stays the primary surface; the sidepanel starts closed and can resize up to 75% of the workspace width. See Concepts (Observability).

Per-trace performance summaries
The session trace debug contract now carries span durations, token counts, output sizes, selected-skill metadata, and adapter fallback signals per trace. The sidepanel can diagnose slow or noisy RLM runs directly from the same durable trace lookup used by the timeline and graph.

Active skill injection into the sandbox
Selected scaffold-skill markdown is injected as a sandbox variable for RLM turns, document turns, and workspace turns: the REPL sees the skill without stuffing full instructions into every model prompt.

Bounded RLM action-generation token budget
Operators can cap the action-prompt token budget separately from REPL output truncation. The effective budget is exposed in runtime settings metadata and attributed on every trace so slow turns can be traced back to their action-generation configuration.

Updates

  • GEPA is now the only supported public optimizer. MIPROv2 was removed from the unified optimization pipeline. CLI, API, manifests, and the Optimization UI all target one optimizer contract. See the CLI reference.
  • Unified RuntimeEvent streaming. Runtime, persistence, and Web UI consumers now share one typed streaming contract for execution start, step, and completion frames — the public Workbench frame shapes are unchanged.
  • Hardened session trace lookup. Trajectories and Graph now populate from live session traces after a message completes, even before the frontend has a durable session id — trace lookup resolves both durable chat-session ids and runtime websocket external_session_id values.
  • Frontend feature-module reorganization. Feature entrypoints are now the public boundary; routes and layout consume stable feature contracts and import-boundary linting blocks deep coupling. shadcn-style primitives were migrated from Radix wrappers to Base UI primitives while preserving the existing button, tooltip, popover, dialog, menu, scroll-area, and toggle contracts.
  • Compact local chat-history persistence. Local storage now stores session previews and durable session ids instead of full rendered transcripts, so quota failures never break chat saves.
  • RLM action generation compacted. Long REPL histories are compacted before action generation and driven through JSONAdapter, so long-running sessions spend fewer tokens on prior tool output and no longer trigger unnecessary chat-adapter fallback retries.

Removed

  • MIPROv2 public optimizer surface. Review bundles, CLI flags, and API requests no longer advertise a second optimizer.
  • Retired Tool UI helpers. Option-list and shared action helpers were removed after Agent Elements became the canonical tool-rendering path.

Notes

  • GET /api/v1/optimization/runs/compare remains API-ready; the Compare tab UI is deferred to v1.1.
June 11, 2026
fleet-pi

Week of June 11 — Fleet Pi 0.5.0

New features

hax-design consolidation
packages/ui is renamed to packages/hax-design and is now the single source of truth for agent-elements, OpenUI, Fleet Pi chat surfaces, shadcn primitives, and shared Pi protocol types. apps/web routes are thinner, and the config panel is split into focused modules. Forks must update imports from @workspace/ui to @workspace/hax-design. See Project structure.

Google Gemini as the default LLM provider
The default model is now gemini-3.5-flash through Pi’s google provider. Extensions receive mode-aware context (ctx.mode, getSystemPromptOptions()). Amazon Bedrock remains available via AWS credentials; set provider and model in .pi/settings.json or environment variables if you need it. See Configuration.

Neon Postgres session mirror
Setting FLEET_PI_CHAT_DATABASE_URL mirrors Pi session entries, run events, tool executions, and file mutations into Neon tables prefixed with pi_. JSONL remains the source of truth and mirror failures never break streaming. Apply migrations with pnpm chat:migrate. See Configuration and Runtime SDK integration.

Web access tools in Agent mode
The new pi-web-access package wires web_search, fetch_content, and code_search into Agent mode end-to-end. See Chat modes.

Updates

  • Memory recall improvements. Workspace memory content is now enriched and retrieval is prompt-aware for better long-session context.
  • Question bar UX. New usePendingQuestionBar hook and suppressQuestionTool prop on AgentChat for cleaner Plan-mode question handling.
  • Security and reliability. Critical and high-severity issues fixed and vulnerable transitive dependencies patched.
  • Documentation. Comprehensive docs added for the UI package, configuration, data models, dependencies, and security posture.

Breaking changes

  • Import path rename: @workspace/ui → @workspace/hax-design (package directory: packages/ui → packages/hax-design).
  • Default LLM provider changed from Amazon Bedrock to Google Gemini.
  • New optional environment variables for the chat mirror: FLEET_PI_CHAT_DATABASE_URL, FLEET_PI_CHAT_MIGRATION_DATABASE_URL.
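
For forks that need to apply the import path rename in bulk, a small rewrite script is one option. The sketch below is an assumption-laden illustration rather than an official migration tool: the function name is hypothetical, and it only rewrites quoted module specifiers in source text passed to it.

```python
import re

# Match the old specifier only when it is the whole package name, that is,
# preceded by a quote and followed by a closing quote or a subpath slash,
# so a differently named package such as "@workspace/ui-kit" is left alone.
OLD_IMPORT = re.compile(r"""(?<=["'])@workspace/ui(?=["'/])""")


def rewrite_imports(source: str) -> str:
    """Return source with @workspace/ui specifiers renamed to hax-design."""
    return OLD_IMPORT.sub("@workspace/hax-design", source)
```

Applied per file, this leaves unrelated packages with a similar prefix untouched; reviewing the resulting diff before committing is still advisable.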
June 11, 2026
fleet-rlm

Week of June 11 — Fleet-RLM 0.5.50

New features

MIPROv2 as an optional offline optimizer
The unified offline optimization pipeline now accepts MIPROv2 alongside the default GEPA backend. Pass --optimizer miprov2 to fleet-rlm optimize, or send "optimizer": "miprov2" in the body of POST /api/v1/optimization/runs. CLI, API, MLflow run metadata, and review bundles share the same runner, so existing GEPA tooling keeps working unchanged. See the DSPy integration guide and the CLI reference.

Native dspy.RLM large-input support
Large documents and workspace context now ship to the sandbox through DSPy’s upstream SandboxSerializable contract. Use LargeDocument or WorkspaceContext from fleet_rlm.runtime.sandbox_types on a signature input field, and dspy.RLM injects the payload into the REPL as a native Python dict while the LM only sees a short preview. Custom signatures that previously relied on Fleet-maintained variable-mode wrappers should switch to these types. See DSPy integration.
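
A request selecting MIPROv2 in this release might be built roughly as shown below. This is a sketch only: the helper name and base URL are assumptions, and a real request body almost certainly needs more fields than the single optimizer key named in this entry. It applies to 0.5.50 only, since the 0.6.0 entry above removes MIPROv2.

```python
import json
import urllib.request


def build_optimization_request(base_url: str, optimizer: str) -> urllib.request.Request:
    """Hypothetical helper: build (but do not send) the POST request."""
    # Only the "optimizer" key is taken from the changelog; other body
    # fields are omitted because they are not described here.
    body = {"optimizer": optimizer}
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/v1/optimization/runs",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Sending it would be a matter of passing the returned object to urllib.request.urlopen against a running server.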

Updates

  • DSPy pinned to 3.3.0b1. Fleet now depends on the upstream DSPy RLM and SandboxSerializable contracts directly; the local DSPy monkeypatch modules have been removed. Programs that build modules through fleet_rlm.runtime.modules keep working without changes.
  • Unified dspy.streamify chat streaming. Direct, tool-using, and recursive RLM turns now share one WebSocket replay path with response-first DSPy signatures. The public Workbench WebSocket frame shapes are unchanged — existing clients require no updates. See Observability — WebSocket execution events.
  • Centralized DSPy observability callback registration. MLflow and PostHog callbacks are now registered once through a shared registry that stays lazy, deduplicated, and visible to worker-thread DSPy contexts. Optional observability stays optional — no configuration changes are required.

Removed

  • Variable-mode wrappers and local DSPy patch modules. Retired together with archived optimization/history frontend clients and legacy bare WebSocket frame parsing. The supported surface is the generated OpenAPI client and the canonical WebSocket event envelope.
May 23, 2026
fleet-rlm

Week of May 23 — Fleet-RLM 0.5.40

New features

Canonical API error envelope across all HTTP and WebSocket routes
Every error response on /api/v1/* now returns the same { code, message, detail } JSON shape, including FastAPI validation errors and unknown-route 404s served by Starlette. Branch on the stable code field instead of parsing message. See HTTP and WebSocket API.

Volume access security boundaries
GET /api/v1/runtime/volume/tree and /api/v1/runtime/volume/file now enforce explicit canonical roots and return 403 forbidden for paths outside them. The tree endpoint accepts a new max_entries parameter (default 200, max 1000) and reports max_depth, max_entries, and entries_returned so clients can tell when a listing was clipped. File previews include sha256, encoding (utf-8, utf-8-lossy, or binary), and a binary flag so you can deduplicate or short-circuit on non-text files. See Volume access boundaries.

Offline-only DSPy module flag
GET /api/v1/optimization/modules entries now carry an offline_only field (default true) so optimization UIs know which modules can only be tuned through the offline endpoints, not from live traffic.
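
On the client side, branching on the stable code field of the error envelope could look something like this. The class and function names are hypothetical, and the "forbidden" code value is an assumption used for illustration; only the { code, message, detail } shape comes from this entry.

```python
import json
from typing import Any


class ApiError(Exception):
    """Client-side view of the { code, message, detail } envelope."""

    def __init__(self, status: int, code: str, message: str, detail: Any = None):
        super().__init__(f"{status} {code}: {message}")
        self.status = status
        self.code = code
        self.message = message
        self.detail = detail


def parse_error(status: int, raw_body: bytes) -> ApiError:
    """Turn an error response body into an ApiError."""
    payload = json.loads(raw_body)
    return ApiError(status, payload["code"], payload["message"], payload.get("detail"))


def user_hint(err: ApiError) -> str:
    """Branch on the stable code, never on the human-readable message."""
    # "forbidden" is an assumed code value used for illustration only.
    if err.code == "forbidden":
        return "Requested path is outside the allowed roots."
    return f"Request failed ({err.code})."
```

Keeping the branch on code means message wording can change between releases without breaking client logic.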

Updates

  • Health probe shape clarified. GET /health now returns status: "live" instead of ok: true. The legacy ok field has been removed. (HTTP API reference)
  • Readiness 503 carries component state. GET /ready now returns the same ReadyResponse body on 503, so monitoring probes can read which component is missing or degraded from a failing response. The redundant planner_configured field was removed — read planner instead.
  • Sandbox environment variables are redacted. The env_vars field on sandbox responses no longer surfaces raw secret values.
  • Recursive RLM delegation, DSPy signatures, and streaming contracts redesigned around clearer service boundaries while preserving the public Workbench WebSocket frame shapes.
  • Daytona VFS and evidence substrate redesigned with explicit security boundaries between child workspaces, mounted volumes, and evidence staging. See Daytona runtime.
  • CLI now emits structured errors matching the canonical API envelope. See the CLI reference.
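
Monitoring code that previously read the ok field might be adapted along these lines. This is a minimal sketch: the function names are hypothetical, the sample ReadyResponse contents are not specified by this changelog, and treating 200 as the ready status is an assumption.

```python
import json
from typing import Any, Dict, Tuple


def is_live(health_body: Dict[str, Any]) -> bool:
    """GET /health: check status == "live"; the legacy ok field is gone."""
    return health_body.get("status") == "live"


def read_readiness(status_code: int, raw_body: bytes) -> Tuple[bool, Dict[str, Any]]:
    """GET /ready: parse the body on success and on 503 alike.

    Assumes 200 means ready; the body is returned either way so a probe
    can report which component is missing or degraded.
    """
    body = json.loads(raw_body)
    return status_code == 200, body
```

A probe can then log the returned body on failure instead of discarding the 503 response.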

Removed

  • Memory API retired. /api/v1/memory* is no longer part of the supported HTTP surface. Memory item browsing has been removed from the API navigation and OpenAPI schema. Clients that depended on listing memory items should migrate to the session endpoints under /api/v1/sessions/*.
May 20, 2026
fleet-rlm

Week of May 20

Updates

Fleet-RLM: decoupled WebSocket streaming runtime
Turn execution no longer runs inline with the WebSocket handler. Each user message is processed in a background task that builds its own agent context and publishes execution events through a shared event emitter. The same emitter fans out frames to every subscriber on /api/v1/ws/execution and /api/v1/ws/execution/events, so a dropped or reconnected client no longer cancels the turn. No client changes are required; frame shapes are unchanged. See Observability and HTTP and WebSocket API.

Fleet-RLM: Entra JWKS cache and joserfc token validation
AUTH_MODE=entra now uses joserfc instead of PyJWT for token verification and ships with a built-in JWKS cache (5-minute TTL) that falls back to the last-known keyset if Entra’s JWKS endpoint is unreachable. Bearer-token validation, tid/aud/iss enforcement, and tenant admission behavior are unchanged. See Deployment.
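
The JWKS caching behavior described above (a 5-minute TTL with fallback to the last-known keyset) can be sketched as follows. This is an illustration of the general pattern, not Fleet-RLM's implementation: the class name, the injectable fetch and clock callables, and the choice to re-raise when no keyset was ever fetched are all assumptions.

```python
import time
from typing import Any, Callable, Dict, Optional

JWKS_TTL_SECONDS = 300.0  # 5-minute TTL, per the entry above


class JwksCache:
    """Hypothetical TTL cache that serves stale keys if a refresh fails."""

    def __init__(
        self,
        fetch: Callable[[], Dict[str, Any]],
        ttl: float = JWKS_TTL_SECONDS,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch
        self._ttl = ttl
        self._clock = clock
        self._keys: Optional[Dict[str, Any]] = None
        self._fetched_at = 0.0

    def get(self) -> Dict[str, Any]:
        now = self._clock()
        if self._keys is not None and now - self._fetched_at < self._ttl:
            return self._keys  # still fresh, no network call
        try:
            self._keys = self._fetch()
            self._fetched_at = now
        except Exception:
            # Endpoint unreachable: fall back to the last-known keyset.
            if self._keys is None:
                raise  # nothing cached yet, so the failure must surface
        return self._keys
```

Injecting the clock and fetch function keeps the cache easy to exercise in tests without real network calls or sleeps.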

Bug fixes

  • Final assistant text no longer duplicated in replay. The terminal trajectory step now omits the planner’s intermediate thought, so reopening a session replays the assistant’s final response once instead of twice. (Sessions and persistence)
  • Frontend WebSocket parser prefers step.output for final frames. Execution-step envelopes with kind: "final" now surface the actual response text instead of the internal label.
May 19, 2026
fleet-rlm, fleet-pi

Week of May 12 – May 19

New features

Fleet-RLM 0.5.3: backend-driven runtime settings
The Settings page now renders typed runtime options and diagnostics directly from backend descriptors, so available configuration always matches what the server actually supports. See Configuration reference.

Fleet-RLM: “About this instance” panel
A new Settings panel surfaces the running service version, environment, and feature flags so you can confirm exactly what’s deployed before filing an issue. Powered by the new /api/v1/info endpoint in the HTTP API reference.

Fleet-RLM: MLflow observability and auto-assessment
MLflow span processors now emit richer trace metadata, and you can wire scorer schedules to run automated assessment loops over completed sessions. See Observability.

Fleet Pi: Daytona sandbox integration
Pi chat modes can now invoke Daytona sandbox tools end-to-end, with webhook and client support added to the web surface and improved startup memory recall. See Chat modes and Runtime SDK integration.

Updates

  • Session titles auto-derive from the first user message when no title is set, so conversations get human-readable labels without manual renaming. (fleet-rlm)
  • Workbench UI polish — refined sidepanel controls, event display, and composer prompt overhead for a cleaner workspace.
  • Runtime stack alignment — Fleet-RLM is now tested and published against Daytona 0.176, DSPy 3.2.1, Pydantic 2.13.4, SQLModel 0.0.38, Psycopg 3.3.4, Typer 0.25.1, and Uvicorn 0.47.0. Update your environment to match — see Installation.
  • Fleet-RLM 0.5.31 patch release with a synced OpenAPI schema for frontend and SDK consumers.

Bug fixes

  • History page restored. Conversation titles and transcript replay now show correctly when the durable session store only contains placeholder rows — the History view falls back to local conversation history instead of rendering opaque IDs.
  • Resilient analytics initialization. PostHog callback registration no longer fails in threaded environments; it retries under a settings lock when needed.
  • Hardened recursive delegation. Remote document context, degraded child execution metadata, and chunk-document aliases are handled more defensively, so partial failures surface clearly instead of returning stale evidence. See Recursive RLM.
  • Frontend dependency security patches applied to address Dependabot alerts.