# Product changelog and release notes

> Weekly release notes for Fleet-RLM, Fleet Pi, and Qredence Plugins, covering new features, runtime updates, bug fixes, and security patches.

Track what's new across the Qredence product suite. For documentation changes, see the [docs repository](https://github.com/qredence/documentation).

<Update label="June 29, 2026" tags={["fleet-rlm"]}>
  ## Week of June 29 — Fleet-RLM 0.6.2

  ### New features

  **Bring-your-own-key (BYOK) LLM provider profiles**
  Hosted deployments running `AUTH_MODE=neon` can now bind their own planner and delegate LLM credentials per tenant/user. API keys are encrypted at rest with Fernet under `FLEET_SECRET_ENCRYPTION_KEY`, and responses only ever return `has_api_key` plus a masked preview — plaintext keys never cross the API surface, and the runtime does not mutate the process environment to route requests. See [Configuration — Auth modes](/fleet-rlm/reference/configuration#auth-modes) and [HTTP API — LLM provider profiles](/fleet-rlm/reference/http-api#llm-provider-profiles).
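
  A minimal sketch of the response shape described above, assuming a server-side helper that builds the profile payload. Only `has_api_key` comes from the release note; the `api_key_preview` field name, both helper functions, and the four-character tail are illustrative assumptions, not the Fleet-RLM schema.

```python
from typing import Optional


def mask_api_key(api_key: str, visible: int = 4) -> str:
    """Return a preview that reveals at most the last `visible` characters."""
    if len(api_key) <= visible:
        # Too short to reveal anything safely: mask every character.
        return "*" * len(api_key)
    return "*" * (len(api_key) - visible) + api_key[-visible:]


def profile_response(provider: str, api_key: Optional[str]) -> dict:
    """Build a response body that never contains the plaintext key."""
    return {
        "provider": provider,
        "has_api_key": bool(api_key),
        # Hypothetical field name for the masked preview.
        "api_key_preview": mask_api_key(api_key) if api_key else None,
    }


body = profile_response("openai", "example-key-12345678abcd")
```

  Because the payload builder only ever receives the key to mask it, any endpoint that reuses it returns the preview and never the plaintext.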

  **Per-workspace encrypted Daytona credentials**
  `PATCH /api/v1/runtime/settings` under `AUTH_MODE=neon` now persists each workspace's `DAYTONA_*` keys as encrypted `workspace_runtime_settings` ciphertext instead of returning `403 forbidden`. Chat and runtime paths resolve the per-user Daytona config first and fall back to the server-level env only if none is set. Non-Daytona keys remain local-only. See [HTTP API reference](/fleet-rlm/reference/http-api#runtime).
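
  The lookup order can be sketched as follows. The all-or-nothing fallback, the `resolve_daytona_config` helper, and the plain-dict inputs are assumptions for illustration; the release note does not say whether the fallback applies per key or per workspace, and decryption is omitted.

```python
from typing import Mapping, Optional

DAYTONA_PREFIX = "DAYTONA_"


def resolve_daytona_config(
    workspace_settings: Optional[Mapping[str, str]],
    server_env: Mapping[str, str],
) -> dict:
    """Prefer stored per-workspace settings; otherwise use the server env."""
    stored = {
        key: value
        for key, value in (workspace_settings or {}).items()
        if key.startswith(DAYTONA_PREFIX) and value
    }
    if stored:
        return stored
    # Nothing stored for this workspace: fall back to server-level variables.
    return {
        key: value
        for key, value in server_env.items()
        if key.startswith(DAYTONA_PREFIX) and value
    }


server_env = {"DAYTONA_API_KEY": "server-key", "PATH": "/usr/bin"}
per_workspace = {"DAYTONA_API_KEY": "workspace-key"}
```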

  **LiteLLM custom-provider opt-in hint**
  Two new environment variables — `DSPY_LM_CUSTOM_PROVIDER` and `DSPY_DELEGATE_LM_CUSTOM_PROVIDER` — let OpenAI-compatible bare-model endpoints pass an explicit `custom_llm_provider` hint to LiteLLM. The runtime no longer force-sets `custom_llm_provider="openai"` for every bare model with an `api_base`, so Anthropic and other non-OpenAI providers stop receiving OpenAI-format requests. See [Configuration — Required: LLM](/fleet-rlm/reference/configuration#required-llm).
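
  A sketch of the opt-in behavior, assuming the hint is read from the environment and forwarded as LiteLLM's `custom_llm_provider` keyword. The `build_lm_kwargs` helper and the example endpoint are hypothetical; the actual wiring inside Fleet-RLM may differ.

```python
from typing import Mapping


def build_lm_kwargs(
    model: str,
    api_base: str,
    env: Mapping[str, str],
    hint_var: str = "DSPY_LM_CUSTOM_PROVIDER",
) -> dict:
    """Assemble keyword arguments for a LiteLLM-style completion call."""
    kwargs = {"model": model, "api_base": api_base}
    hint = env.get(hint_var, "").strip()
    if hint:
        # Opt-in only: without the variable, no provider is forced.
        kwargs["custom_llm_provider"] = hint
    return kwargs


default_kwargs = build_lm_kwargs("my-model", "https://llm.example.com/v1", {})
hinted_kwargs = build_lm_kwargs(
    "my-model",
    "https://llm.example.com/v1",
    {"DSPY_LM_CUSTOM_PROVIDER": "openai"},
)
```

  Passing `hint_var="DSPY_DELEGATE_LM_CUSTOM_PROVIDER"` would apply the same logic to the delegate model.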

  **`PATCH /api/v1/runtime/settings` reports skipped keys**
  Responses now include a `skipped` field listing masked-round-trip keys that were intentionally not persisted, so clients can distinguish `updated` keys from ignored no-op saves.
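
  One way a server could produce the `updated` and `skipped` fields is sketched below. The list-of-key-names shape, the `is_masked` heuristic, and the `DAYTONA_TARGET` example key are assumptions, not the documented contract.

```python
def is_masked(value: str) -> bool:
    """Treat any value containing a run of mask characters as a preview."""
    return "****" in value


def apply_settings_patch(stored: dict, incoming: dict) -> dict:
    """Persist real values and report masked round-trips as skipped."""
    updated, skipped = [], []
    for key, value in incoming.items():
        if is_masked(value) and key in stored:
            # The client echoed back a masked preview: keep the stored value.
            skipped.append(key)
            continue
        stored[key] = value
        updated.append(key)
    return {"updated": updated, "skipped": skipped}


stored = {"DAYTONA_API_KEY": "real-secret"}
result = apply_settings_patch(
    stored,
    {"DAYTONA_API_KEY": "****cret", "DAYTONA_TARGET": "us"},
)
```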

  ### Updates

  * **FastAPI pinned to `==0.139.0`.** Installs are reproducible on the current validated FastAPI release instead of floating forward against a `>=0.138.2` floor.
  * **`litellm` policy hardened.** LiteLLM is installed only as DSPy's transitive dependency; `[tool.uv].override-dependencies` still pins `litellm>=1.87.0` to close seven documented CVEs. A parse-time invariant test fails if `litellm` is ever re-added to the direct dependencies or removed from the override pin.
  * **Neon multi-tenant migrations.** `llm_role_bindings` is now UUID-PK'd and scoped by `tenant_id` / `user_id` / `workspace_id`; the `workspace_runtime_settings` unique constraint is tightened to `(tenant_id, workspace_id)` so the settings upsert is tenant-aware.
  * **README rewritten** around the actual routed surfaces (`/app/workspace`, `/app/optimization`, `/app/volumes`, `/app/settings`) and the current `make` / `pnpm` validation lanes.

  ### Bug fixes

  * **Legacy XOR-encrypted profile ciphertext keeps decrypting.** After rotation, the runtime tries `FLEET_SECRET_ENCRYPTION_KEY`, `DEV_JWT_SECRET`, and `change-me` in turn until a stored row decrypts — old rows are no longer bricked by rotating in a real Fernet key.
  * **No cross-tenant BYOK leak from the connectivity probe.** `POST /runtime/tests/lm` no longer mutates the shared `LmDeps.planner_lm` singleton; the per-user planner is invoked directly, so a smoke test can never swap another user's in-flight chat onto a foreign BYOK LM.
  * **Decrypt failures are observable.** `GET /api/v1/runtime/settings` logs when a stored `DAYTONA_API_KEY` fails to decrypt (without leaking the value), and the PATCH path treats an empty incoming value for a key with an existing stored credential as a no-op — a failed GET can no longer enable an empty save that wipes the stored key.
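
  The try-each-secret pattern behind the first fix can be illustrated with a toy cipher. Everything cryptographic here is an assumption: the XOR stream, the `v0:` marker used to detect a wrong key, and the helper names are stand-ins, and the real runtime's ciphertext format is not described in these notes.

```python
from itertools import cycle
from typing import Iterable, Optional

MAGIC = b"v0:"  # hypothetical marker so a wrong key can be detected


def xor_bytes(data: bytes, secret: str) -> bytes:
    """Toy XOR stream: symmetric, so it both encrypts and decrypts."""
    return bytes(b ^ k for b, k in zip(data, cycle(secret.encode())))


def decrypt_with_fallback(ciphertext: bytes, secrets: Iterable[Optional[str]]) -> bytes:
    """Try each configured secret in order; return the first valid plaintext."""
    for secret in secrets:
        if not secret:
            continue  # skip unset environment variables
        plaintext = xor_bytes(ciphertext, secret)
        if plaintext.startswith(MAGIC):
            return plaintext[len(MAGIC):]
    raise ValueError("no candidate secret decrypted the stored row")


# A row written under the old default secret, read back after key rotation.
legacy_row = xor_bytes(MAGIC + b"daytona-key", "change-me")
candidates = ["new-fernet-key", None, "change-me"]
recovered = decrypt_with_fallback(legacy_row, candidates)
```
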
</Update>

<Update label="June 17, 2026" tags={["fleet-rlm"]}>
  ## Week of June 17 — Fleet-RLM 0.6.0

  ### New features

  **Workbench sidepanel with Trajectories, Graph, and Volume tabs**
  A workspace-local collapsible sidepanel now sits alongside the chat. `Trajectories` renders the session trace timeline, `Graph` renders a React Flow parent/child span view backed by persisted MLflow/debug spans, and `Volume` embeds a searchable Daytona volume tree with a resizable desktop split and inline file preview. Chat stays the primary surface; the sidepanel starts closed and can resize up to 75% of the workspace width. See [Concepts — Observability](/fleet-rlm/concepts/observability).

  **Per-trace performance summaries**
  The session trace debug contract now carries span durations, token counts, output sizes, selected-skill metadata, and adapter fallback signals per trace. The sidepanel can diagnose slow or noisy RLM runs directly from the same durable trace lookup used by the timeline and graph.

  **Active skill injection into the sandbox**
  Selected scaffold-skill markdown is injected as a sandbox variable for RLM turns, document turns, and workspace turns — the REPL sees the skill without stuffing full instructions into every model prompt.

  **Bounded RLM action-generation token budget**
  Operators can cap the action-prompt token budget separately from REPL output truncation. The effective budget is exposed in runtime settings metadata and attributed on every trace so slow turns can be traced back to their action-generation configuration.

  ### Updates

  * **GEPA is now the only supported public optimizer.** MIPROv2 was removed from the unified optimization pipeline. CLI, API, manifests, and the Optimization UI all target one optimizer contract. See the [CLI reference](/fleet-rlm/reference/cli#fleet-rlm-optimize).
  * **Unified `RuntimeEvent` streaming.** Runtime, persistence, and Web UI consumers now share one typed streaming contract for execution start, step, and completion frames — the public Workbench frame shapes are unchanged.
  * **Hardened session trace lookup.** `Trajectories` and `Graph` now populate from live session traces after a message completes, even before the frontend has a durable session id — trace lookup resolves both durable chat-session ids and runtime websocket `external_session_id` values.
  * **Frontend feature-module reorganization.** Feature entrypoints are now the public boundary; routes and layout consume stable feature contracts, and import-boundary linting blocks deep coupling. shadcn-style primitives were migrated from Radix wrappers to Base UI primitives while preserving the existing button, tooltip, popover, dialog, menu, scroll-area, and toggle contracts.
  * **Compact local chat-history persistence.** Local storage now stores session previews and durable session ids instead of full rendered transcripts, so quota failures never break chat saves.
  * **RLM action generation compacted.** Long REPL histories are compacted before action generation and driven through `JSONAdapter`, so long-running sessions spend fewer tokens on prior tool output and avoid unnecessary chat-adapter fallback retries.

  ### Removed

  * **MIPROv2 public optimizer surface.** Review bundles, CLI flags, and API requests no longer advertise a second optimizer.
  * **Retired Tool UI helpers.** Option-list and shared action helpers were removed after Agent Elements became the canonical tool-rendering path.

  ### Notes

  * `GET /api/v1/optimization/runs/compare` remains API-ready; the Compare tab UI is deferred to v1.1.
</Update>

<Update label="June 11, 2026" tags={["fleet-pi"]}>
  ## Week of June 11 — Fleet Pi 0.5.0

  ### New features

  **hax-design consolidation**
  `packages/ui` is renamed to `packages/hax-design` and is now the single source of truth for agent-elements, OpenUI, Fleet Pi chat surfaces, shadcn primitives, and shared Pi protocol types. `apps/web` routes are thinner, and the config panel is split into focused modules. Forks must update imports from `@workspace/ui` to `@workspace/hax-design`. See [Project structure](/fleet-pi/project-structure).

  **Google Gemini as the default LLM provider**
  The default model is now `gemini-3.5-flash` through Pi's `google` provider. Extensions receive mode-aware context (`ctx.mode`, `getSystemPromptOptions()`). Amazon Bedrock remains available via AWS credentials — set provider and model in `.pi/settings.json` or environment variables if you need it. See [Configuration](/fleet-pi/configuration).

  **Neon Postgres session mirror**
  Setting `FLEET_PI_CHAT_DATABASE_URL` mirrors Pi session entries, run events, tool executions, and file mutations into Neon tables prefixed with `pi_`. JSONL remains the source of truth and mirror failures never break streaming. Apply migrations with `pnpm chat:migrate`. See [Configuration](/fleet-pi/configuration) and [Runtime SDK integration](/fleet-pi/runtime-sdk-integration).
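
  The JSONL-first, best-effort-mirror ordering is a language-agnostic pattern; the sketch below shows it in Python with hypothetical names (`record_entry`, `failing_mirror`) and does not reflect Fleet Pi's TypeScript implementation.

```python
import json
import logging
import os
import tempfile
from typing import Callable

logger = logging.getLogger("chat-mirror")


def record_entry(jsonl_path: str, entry: dict, mirror: Callable[[dict], None]) -> None:
    """Append to the JSONL log first, then mirror on a best-effort basis."""
    with open(jsonl_path, "a", encoding="utf-8") as handle:
        handle.write(json.dumps(entry) + "\n")
    try:
        mirror(entry)
    except Exception:
        # The mirror is secondary: log and continue so streaming is unaffected.
        logger.warning("mirror write failed", exc_info=True)


def failing_mirror(entry: dict) -> None:
    """Simulate an unreachable database."""
    raise ConnectionError("database unreachable")


path = os.path.join(tempfile.mkdtemp(), "session.jsonl")
record_entry(path, {"type": "message", "text": "hi"}, failing_mirror)
with open(path, encoding="utf-8") as handle:
    lines = handle.read().splitlines()
```

  Writing the JSONL entry before touching the mirror means a database outage costs only the mirrored copy, never the primary record.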

  **Web access tools in Agent mode**
  The new `pi-web-access` package wires `web_search`, `fetch_content`, and `code_search` into Agent mode end-to-end. See [Chat modes](/fleet-pi/chat-modes).

  ### Updates

  * **Memory recall improvements.** Workspace memory content is now enriched and retrieval is prompt-aware for better long-session context.
  * **Question bar UX.** New `usePendingQuestionBar` hook and `suppressQuestionTool` prop on `AgentChat` for cleaner Plan-mode question handling.
  * **Security and reliability.** Critical and high-severity issues fixed and vulnerable transitive dependencies patched.
  * **Documentation.** Comprehensive docs added for the UI package, configuration, data models, dependencies, and security posture.

  ### Breaking changes

  * Import path rename: `@workspace/ui` → `@workspace/hax-design` (package directory: `packages/ui` → `packages/hax-design`).
  * Default LLM provider changed from Amazon Bedrock to Google Gemini.
  * New optional environment variables for the chat mirror: `FLEET_PI_CHAT_DATABASE_URL`, `FLEET_PI_CHAT_MIGRATION_DATABASE_URL`. Deployments that leave them unset are unaffected.
</Update>

<Update label="June 11, 2026" tags={["fleet-rlm"]}>
  ## Week of June 11 — Fleet-RLM 0.5.50

  ### New features

  **MIPROv2 as an optional offline optimizer**
  The unified offline optimization pipeline now accepts MIPROv2 alongside the default GEPA backend. Pass `--optimizer miprov2` to `fleet-rlm optimize`, or send `"optimizer": "miprov2"` in the body of `POST /api/v1/optimization/runs`. CLI, API, MLflow run metadata, and review bundles share the same runner, so existing GEPA tooling keeps working unchanged. See the [DSPy integration guide](/fleet-rlm/guides/dspy-integration#offline-optimization-gepa-and-miprov2) and the [CLI reference](/fleet-rlm/reference/cli#fleet-rlm-optimize).
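
  A sketch of the request described above, built with the standard library and not sent. The base URL and the `gepa` default string are assumptions, and any other body fields the endpoint requires are omitted; see the API reference for the full schema.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # hypothetical local deployment


def build_optimization_request(optimizer: str = "gepa") -> urllib.request.Request:
    """Prepare (but do not send) a POST to the optimization runs endpoint."""
    payload = {"optimizer": optimizer}  # other body fields omitted in this sketch
    return urllib.request.Request(
        url=f"{BASE_URL}/api/v1/optimization/runs",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_optimization_request("miprov2")
# urllib.request.urlopen(request) would submit it to a running server.
```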

  **Native `dspy.RLM` large-input support**
  Large documents and workspace context now ship to the sandbox through DSPy's upstream `SandboxSerializable` contract. Use `LargeDocument` or `WorkspaceContext` from `fleet_rlm.runtime.sandbox_types` on a signature input field, and `dspy.RLM` injects the payload into the REPL as a native Python dict while the LM only sees a short preview. Custom signatures that previously relied on Fleet-maintained variable-mode wrappers should switch to these types. See [DSPy integration](/fleet-rlm/guides/dspy-integration#pass-large-inputs-to-dspyrlm).

  ### Updates

  * **DSPy pinned to `3.3.0b1`.** Fleet now depends on the upstream DSPy `RLM` and `SandboxSerializable` contracts directly; the local DSPy monkeypatch modules have been removed. Programs that build modules through `fleet_rlm.runtime.modules` keep working without changes.
  * **Unified `dspy.streamify` chat streaming.** Direct, tool-using, and recursive RLM turns now share one WebSocket replay path with `response`-first DSPy signatures. The public Workbench WebSocket frame shapes are unchanged — existing clients require no updates. See [Observability — WebSocket execution events](/fleet-rlm/concepts/observability#websocket-execution-events).
  * **Centralized DSPy observability callback registration.** MLflow and PostHog callbacks are now registered once through a shared registry that stays lazy, deduplicated, and visible to worker-thread DSPy contexts. Optional observability stays optional — no configuration changes are required.

  ### Removed

  * **Variable-mode wrappers and local DSPy patch modules.** Retired together with archived optimization/history frontend clients and legacy bare WebSocket frame parsing. The supported surface is the generated OpenAPI client and the canonical WebSocket event envelope.
</Update>

<Update label="May 23, 2026" tags={["fleet-rlm"]}>
  ## Week of May 23 — Fleet-RLM 0.5.40

  ### New features

  **Canonical API error envelope across all HTTP and WebSocket routes**
  Every error response on `/api/v1/*` now returns the same `{ code, message, detail }` JSON shape, including FastAPI validation errors and unknown-route 404s served by Starlette. Branch on the stable `code` field instead of parsing `message`. See [HTTP and WebSocket API](/fleet-rlm/reference/http-api#canonical-error-envelope).
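
  A client can wrap the envelope in an exception type and branch on `code`. In this sketch the `ApiError` class, the helper names, and the specific code strings are hypothetical; only the three envelope fields come from the release note.

```python
import json


class ApiError(Exception):
    """Carries the canonical envelope fields from an error response."""

    def __init__(self, code: str, message: str, detail=None):
        super().__init__(f"{code}: {message}")
        self.code = code
        self.message = message
        self.detail = detail


def parse_error(body: bytes) -> ApiError:
    """Decode a `{ code, message, detail }` envelope into an exception."""
    data = json.loads(body)
    return ApiError(data["code"], data["message"], data.get("detail"))


def should_retry(error: ApiError) -> bool:
    """Branch on the stable code, never on the human-readable message."""
    retryable = {"service_unavailable", "timeout"}  # hypothetical code values
    return error.code in retryable


raw = b'{"code": "forbidden", "message": "Path is outside the allowed roots", "detail": null}'
err = parse_error(raw)
```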

  **Volume access security boundaries**
  `GET /api/v1/runtime/volume/tree` and `/api/v1/runtime/volume/file` now enforce explicit canonical roots and return `403 forbidden` for paths outside them. The tree endpoint accepts a new `max_entries` parameter (default `200`, max `1000`) and reports `max_depth`, `max_entries`, and `entries_returned` so clients can tell when a listing was clipped. File previews include `sha256`, `encoding` (`utf-8`, `utf-8-lossy`, or `binary`), and a `binary` flag so you can deduplicate or short-circuit on non-text files. See [Volume access boundaries](/fleet-rlm/reference/http-api#volume-access-boundaries).
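
  A client-side sketch of the new listing parameters. The `path` query parameter name, the base URL, and the rule that a full entry budget means a clipped listing are assumptions; the bounds and response field names come from the note above.

```python
from urllib.parse import urlencode

DEFAULT_MAX_ENTRIES = 200
MAX_ENTRIES_LIMIT = 1000


def tree_url(base_url: str, path: str, max_entries: int = DEFAULT_MAX_ENTRIES) -> str:
    """Build a tree request URL, clamping max_entries to the documented range."""
    clamped = max(1, min(max_entries, MAX_ENTRIES_LIMIT))
    query = urlencode({"path": path, "max_entries": clamped})
    return f"{base_url}/api/v1/runtime/volume/tree?{query}"


def listing_was_clipped(response: dict) -> bool:
    """Assume a listing is clipped when it fills the whole entry budget."""
    return response["entries_returned"] >= response["max_entries"]


url = tree_url("http://localhost:8000", "/workspace", max_entries=5000)
```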

  **Offline-only DSPy module flag**
  `GET /api/v1/optimization/modules` entries now carry an `offline_only` field (default `true`) so optimization UIs know which modules can only be tuned through the offline endpoints, not from live traffic.

  ### Updates

  * **Health probe shape clarified.** `GET /health` now returns `status: "live"` instead of `ok: true`. The legacy `ok` field has been removed. ([HTTP API reference](/fleet-rlm/reference/http-api#get-health))
  * **Readiness 503 carries component state.** `GET /ready` now returns the same `ReadyResponse` body on `503`, so monitoring probes can read which component is missing or degraded from a failing response. The redundant `planner_configured` field was removed — read `planner` instead.
  * **Sandbox environment variables are redacted.** The `env_vars` field on sandbox responses no longer surfaces raw secret values.
  * **Recursive RLM delegation, DSPy signatures, and streaming contracts** were redesigned around clearer service boundaries while preserving the public Workbench WebSocket frame shapes.
  * **Daytona VFS and evidence substrate** were redesigned with explicit security boundaries between child workspaces, mounted volumes, and evidence staging. See [Daytona runtime](/fleet-rlm/concepts/daytona-runtime).
  * **CLI** now emits structured errors matching the canonical API envelope. See the [CLI reference](/fleet-rlm/reference/cli).
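
  A monitoring probe adapted to the health and readiness changes above might look like this sketch. The `planner` value shown and the helper names are hypothetical, and the HTTP call itself is left out so the parsing logic stays testable.

```python
import json


def is_live(body: bytes) -> bool:
    """Liveness: check the `status` field rather than the removed `ok` flag."""
    return json.loads(body).get("status") == "live"


def readiness_report(status_code: int, body: bytes) -> dict:
    """Parse /ready on both 200 and 503, since both carry the same body shape."""
    if status_code not in (200, 503):
        raise ValueError(f"unexpected readiness status: {status_code}")
    components = json.loads(body)
    return {"ready": status_code == 200, "components": components}


report = readiness_report(503, b'{"planner": "missing"}')
```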

  ### Removed

  * **Memory API retired.** `/api/v1/memory*` is no longer part of the supported HTTP surface. Memory item browsing has been removed from the API navigation and OpenAPI schema. Clients that depended on listing memory items should migrate to the session endpoints under `/api/v1/sessions/*`.
</Update>

<Update label="May 20, 2026" tags={["fleet-rlm"]}>
  ## Week of May 20

  ### Updates

  **Fleet-RLM — decoupled WebSocket streaming runtime**
  Turn execution no longer runs inline with the WebSocket handler. Each user message is processed in a background task that builds its own agent context and publishes execution events through a shared event emitter. The same emitter fans out frames to every subscriber on `/api/v1/ws/execution` and `/api/v1/ws/execution/events`, so a dropped or reconnected client no longer cancels the turn. No client changes are required — frame shapes are unchanged. See [Observability](/fleet-rlm/concepts/observability) and [HTTP and WebSocket API](/fleet-rlm/reference/http-api).
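
  The decoupling can be illustrated with a small `asyncio` sketch: the turn runs as its own task and publishes to an emitter, so removing a subscriber does not cancel it. Class and function names are hypothetical, and the `start` event kind is invented for the demo; this is not the Fleet-RLM implementation.

```python
import asyncio


class EventEmitter:
    """Fans each published event out to every current subscriber queue."""

    def __init__(self) -> None:
        self._subscribers: set[asyncio.Queue] = set()

    def subscribe(self) -> asyncio.Queue:
        queue: asyncio.Queue = asyncio.Queue()
        self._subscribers.add(queue)
        return queue

    def unsubscribe(self, queue: asyncio.Queue) -> None:
        self._subscribers.discard(queue)

    def publish(self, event: dict) -> None:
        for queue in list(self._subscribers):
            queue.put_nowait(event)


async def run_turn(emitter: EventEmitter, message: str) -> str:
    """Stand-in for turn execution; it never touches a client socket."""
    emitter.publish({"kind": "start", "message": message})
    await asyncio.sleep(0)  # yield control, as real work would
    emitter.publish({"kind": "final", "output": message.upper()})
    return message.upper()


async def demo() -> tuple[list, str]:
    emitter = EventEmitter()
    first, second = emitter.subscribe(), emitter.subscribe()
    task = asyncio.create_task(run_turn(emitter, "hello"))
    emitter.unsubscribe(first)  # simulate a client dropping before the turn runs
    result = await task  # the turn still completes
    events = [second.get_nowait() for _ in range(second.qsize())]
    return events, result


events, result = asyncio.run(demo())
```

  The remaining subscriber receives both frames, and the task result is unaffected by the dropped one.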

  **Fleet-RLM — Entra JWKS cache and `joserfc` token validation**
  `AUTH_MODE=entra` now uses [`joserfc`](https://jose.authlib.org/) instead of `PyJWT` for token verification and ships with a built-in JWKS cache (5-minute TTL) that falls back to the last-known keyset if Entra's JWKS endpoint is unreachable. Bearer-token validation, `tid`/`aud`/`iss` enforcement, and tenant admission behavior are unchanged. See [Deployment](/fleet-rlm/guides/deployment#authentication).
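
  The cache behavior described above (5-minute TTL, fall back to the last-known keyset on fetch failure) can be sketched as follows. The `JwksCache` class, the injectable `fetch` and `clock` parameters, and the fake keyset are assumptions; `joserfc` is not used here.

```python
import time
from typing import Callable, Optional


class JwksCache:
    """Caches a keyset for `ttl` seconds and serves stale keys if refresh fails."""

    def __init__(
        self,
        fetch: Callable[[], dict],
        ttl: float = 300.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch
        self._ttl = ttl
        self._clock = clock
        self._keys: Optional[dict] = None
        self._fetched_at = 0.0

    def get(self) -> dict:
        fresh = self._keys is not None and self._clock() - self._fetched_at < self._ttl
        if fresh:
            return self._keys
        try:
            self._keys = self._fetch()
            self._fetched_at = self._clock()
        except Exception:
            if self._keys is None:
                raise  # nothing cached yet, so there is no fallback
            # Endpoint unreachable: keep serving the last-known keyset.
        return self._keys


now = [0.0]
responses = [{"keys": ["k1"]}]


def fake_fetch() -> dict:
    """Succeed once, then behave like an unreachable JWKS endpoint."""
    if not responses:
        raise ConnectionError("JWKS endpoint unreachable")
    return responses.pop()


cache = JwksCache(fake_fetch, clock=lambda: now[0])
first = cache.get()   # initial fetch
now[0] = 301.0        # move past the 5-minute TTL
second = cache.get()  # refresh fails, so the stale keyset is returned
```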

  ### Bug fixes

  * **Final assistant text no longer duplicated in replay.** The terminal trajectory step now omits the planner's intermediate thought, so reopening a session replays the assistant's final response once instead of twice. ([Sessions and persistence](/fleet-rlm/concepts/sessions-persistence))
  * **Frontend WebSocket parser prefers `step.output` for final frames.** Execution-step envelopes with `kind: "final"` now surface the actual response text instead of the internal label.
</Update>

<Update label="May 19, 2026" tags={["fleet-rlm", "fleet-pi"]}>
  ## Week of May 12 – May 19

  ### New features

  **Fleet-RLM 0.5.3 — backend-driven runtime settings**
  The Settings page now renders typed runtime options and diagnostics directly from backend descriptors, so available configuration always matches what the server actually supports. See [Configuration reference](/fleet-rlm/reference/configuration).

  **Fleet-RLM — "About this instance" panel**
  A new Settings panel surfaces the running service version, environment, and feature flags so you can confirm exactly what's deployed before filing an issue. Powered by the new `/api/v1/info` endpoint in the [HTTP API reference](/fleet-rlm/reference/http-api).

  **Fleet-RLM — MLflow observability and auto-assessment**
  MLflow span processors now emit richer trace metadata, and you can wire scorer schedules to run automated assessment loops over completed sessions. See [Observability](/fleet-rlm/concepts/observability).

  **Fleet Pi — Daytona sandbox integration**
  Pi chat modes can now invoke Daytona sandbox tools end-to-end, with webhook and client support added to the web surface and improved startup memory recall. See [Chat modes](/fleet-pi/chat-modes) and [Runtime SDK integration](/fleet-pi/runtime-sdk-integration).

  ### Updates

  * **Session titles auto-derive from the first user message** when no title is set, so conversations get human-readable labels without manual renaming. ([fleet-rlm](/fleet-rlm/introduction))
  * **Workbench UI polish** — refined sidepanel controls, event display, and composer prompt overhead for a cleaner workspace.
  * **Runtime stack alignment** — Fleet-RLM is now tested and published against Daytona 0.176, DSPy 3.2.1, Pydantic 2.13.4, SQLModel 0.0.38, Psycopg 3.3.4, Typer 0.25.1, and Uvicorn 0.47.0. Update your environment to match — see [Installation](/fleet-rlm/installation).
  * **Fleet-RLM 0.5.31** patch release with a synced OpenAPI schema for frontend and SDK consumers.

  ### Bug fixes

  * **History page restored.** Conversation titles and transcript replay now show correctly when the durable session store only contains placeholder rows — the History view falls back to local conversation history instead of rendering opaque IDs.
  * **Resilient analytics initialization.** PostHog callback registration no longer fails in threaded environments; it retries under a settings lock when needed.
  * **Hardened recursive delegation.** Remote document context, degraded child execution metadata, and chunk-document aliases are handled more defensively, so partial failures surface clearly instead of returning stale evidence. See [Recursive RLM](/fleet-rlm/concepts/recursive-rlm).
  * **Frontend dependency security patches** applied to address Dependabot alerts.
</Update>
