# fleet-rlm agent model

> How FleetAgent wraps dspy.ReAct with AgentRuntime, the tool registry, core-memory state, per-turn execution, and the signatures driving each ReAct step.

`FleetAgent` is the entry point for every user turn. It is a `dspy.ReAct` module wrapped by `AgentRuntime`, which owns the sandbox, conversation history, and core-memory state.

## FleetAgent (`dspy.ReAct`)

Source: `src/fleet_rlm/runtime/agent/agent.py`.

`FleetAgent` is a thin `dspy.Module` that extends `dspy.ReAct` with fleet-rlm's tool registry and per-turn behavior. The ReAct loop drives turn-taking:

```text
user_request
   ↓
thought  — what should I do?
   ↓
action   — pick a tool, fill arguments
   ↓
observation — run the tool, capture result
   ↓
(repeat until done)
   ↓
assistant_response
```
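The loop above can be sketched in plain Python. This is a minimal illustration, not fleet-rlm's implementation: `Step`, `run_react_loop`, and `scripted_policy` are hypothetical names, and the scripted policy stands in for the LM that `dspy.ReAct` would normally call.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Step:
    thought: str
    tool: str  # tool name, or "finish" to end the loop (hypothetical convention)
    args: dict = field(default_factory=dict)


def run_react_loop(
    policy: Callable[[str, list], Step],
    tools: dict[str, Callable],
    user_request: str,
    max_iters: int = 5,
) -> tuple[str, list]:
    """Run thought -> action -> observation until the policy finishes."""
    trajectory = []
    for _ in range(max_iters):
        step = policy(user_request, trajectory)
        if step.tool == "finish":
            return step.args["answer"], trajectory
        observation = tools[step.tool](**step.args)
        trajectory.append((step.thought, step.tool, observation))
    return "max iterations reached", trajectory


def scripted_policy(user_request: str, trajectory: list) -> Step:
    """Stand-in for the LM: call `add` once, then finish with the result."""
    if not trajectory:
        return Step("I need to add the numbers.", "add", {"a": 2, "b": 3})
    last_observation = trajectory[-1][2]
    return Step("I have the result.", "finish",
                {"answer": f"The sum is {last_observation}."})


tools = {"add": lambda a, b: a + b}
response, trajectory = run_react_loop(scripted_policy, tools, "What is 2 + 3?")
```

The `max_iters` bound is an assumption added here so the sketch always terminates, even if the policy never emits `finish`.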

### Signature

Behavior is defined through a DSPy signature (`src/fleet_rlm/runtime/agent/signatures.py`):

```python
import dspy


class RLMReActChatSignature(dspy.Signature):
    # Inputs supplied by the runtime on every turn.
    user_request: str = dspy.InputField()
    core_memory: str = dspy.InputField()
    history: dspy.History = dspy.InputField()
    # Output produced once the ReAct loop finishes.
    assistant_response: str = dspy.OutputField()
```

Typed inputs and outputs make the agent:

* **Discoverable** — DSPy can introspect the signature.
* **Serializable** — agent state can round-trip through manifests.
* **Optimizable** — GEPA, BootstrapFewShot, and MIPROv2 all see the signature.

## AgentRuntime

Source: `src/fleet_rlm/runtime/agent/runtime.py`.

`AgentRuntime` is the per-session state envelope around `FleetAgent`. It carries:

| Field            | Purpose                                    |
| ---------------- | ------------------------------------------ |
| `agent`          | The `FleetAgent` instance                  |
| `interpreter`    | The active `DaytonaInterpreter`            |
| `history`        | `dspy.History` of conversation turns       |
| `core_memory`    | Persona, Human, Scratchpad blocks          |
| `document_paths` | Session-local loaded document paths        |
| `lm`             | Configured DSPy `LM` for the planner       |
| `delegate_lm`    | Optional separate LM for `delegate_to_rlm` |

`AgentRuntime` is what gets restored from a session manifest. Importing a session **replaces** these fields rather than merging them.
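The replace-not-merge rule can be illustrated with a small sketch. `RuntimeState` and `import_session` are hypothetical stand-ins, not the real `AgentRuntime` API, and resetting fields that are absent from the manifest to their defaults is an assumption made for this example.

```python
from dataclasses import dataclass, field, fields


@dataclass
class RuntimeState:
    """Hypothetical subset of the session state carried by the runtime."""
    history: list = field(default_factory=list)
    core_memory: dict = field(default_factory=dict)
    document_paths: list = field(default_factory=list)


def import_session(state: RuntimeState, manifest: dict) -> RuntimeState:
    """Overwrite every field from the manifest instead of merging into it."""
    defaults = RuntimeState()
    for f in fields(RuntimeState):
        # Assumption: a field missing from the manifest falls back to its default.
        setattr(state, f.name, manifest.get(f.name, getattr(defaults, f.name)))
    return state


state = RuntimeState(
    history=[{"user_request": "hi", "assistant_response": "hello"}],
    document_paths=["a.md"],
)
import_session(state, {"history": [], "document_paths": ["b.md"]})
```

After the import, `state.document_paths` holds only `"b.md"`; the previously loaded `"a.md"` is gone rather than combined with the restored list.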

## Core memory

Core memory is implemented as a mixin that provides three named blocks:

| Block        | Purpose                                      |
| ------------ | -------------------------------------------- |
| `Persona`    | Agent persona and behavior contract          |
| `Human`      | What the agent knows about the user          |
| `Scratchpad` | Free-form working memory for the active task |

Core memory is injected into the agent's prompt envelope on every turn. Updates happen through tools (e.g., via a `CoreMemoryUpdateProposal` signature), never by silent mutation.
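A minimal sketch of the "explicit updates only" idea, assuming an immutable memory object and a proposal record. `CoreMemory`, `UpdateProposal`, `apply_update`, and the tag-style rendering are all hypothetical; the real mixin and the `CoreMemoryUpdateProposal` signature may look quite different.

```python
from dataclasses import dataclass, replace

BLOCKS = ("persona", "human", "scratchpad")


@dataclass(frozen=True)  # frozen: blocks cannot be mutated in place
class CoreMemory:
    persona: str = "You are a helpful agent."
    human: str = ""
    scratchpad: str = ""

    def render(self) -> str:
        """Format the blocks as one string for the `core_memory` input field."""
        return "\n".join(
            f"<{name}>\n{getattr(self, name)}\n</{name}>" for name in BLOCKS
        )


@dataclass(frozen=True)
class UpdateProposal:
    block: str
    new_value: str


def apply_update(memory: CoreMemory, proposal: UpdateProposal) -> CoreMemory:
    """Return a new CoreMemory with one block replaced; reject unknown blocks."""
    if proposal.block not in BLOCKS:
        raise ValueError(f"unknown core-memory block: {proposal.block}")
    return replace(memory, **{proposal.block: proposal.new_value})


memory = CoreMemory()
updated = apply_update(memory, UpdateProposal("human", "Prefers concise answers."))
```

Because the dataclass is frozen, the only way to change a block is to go through `apply_update`, which leaves the original object untouched and makes every change an explicit, inspectable step.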

## Tool registry

Tools are registered into the ReAct agent at construction time. `FleetAgent.discover_tools()` builds the default registry; callers can pass `extra_tools=[...]` to extend it.

Key tools:

| Tool                    | Purpose                                                              |
| ----------------------- | -------------------------------------------------------------------- |
| `delegate_to_rlm`       | Spawn an isolated child Daytona sandbox running a bounded `dspy.RLM` |
| Document / file tools   | Load and inspect session documents                                   |
| Sandbox execution tools | Run code in the active interpreter                                   |
| Memory tools            | Read and update core memory blocks                                   |

See [Recursive RLM](/fleet-rlm/concepts/recursive-rlm) for the delegation tool in detail.
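As a rough sketch of how a default registry might be extended at construction time: the functions below are hypothetical stand-ins (the real `FleetAgent.discover_tools()` and `extra_tools` handling are not shown in this page), and rejecting duplicate tool names is an assumption of this example.

```python
from typing import Callable, Optional


def load_document(path: str) -> str:
    """Stub document tool; a real one would read from the session sandbox."""
    return f"loaded {path}"


def run_code(code: str) -> str:
    """Stub sandbox tool; a real one would execute in the active interpreter."""
    return f"ran {len(code)} characters of code"


def discover_tools() -> dict[str, Callable]:
    """Build the default registry, keyed by function name."""
    return {fn.__name__: fn for fn in (load_document, run_code)}


def build_registry(extra_tools: Optional[list] = None) -> dict[str, Callable]:
    """Start from the defaults and append caller-supplied tools."""
    registry = discover_tools()
    for tool in extra_tools or []:
        if tool.__name__ in registry:
            # Assumption: name clashes are an error rather than an override.
            raise ValueError(f"duplicate tool name: {tool.__name__}")
        registry[tool.__name__] = tool
    return registry


def word_count(text: str) -> int:
    return len(text.split())


registry = build_registry(extra_tools=[word_count])
```

Keying the registry by function name mirrors how ReAct-style agents usually select tools by name in the action step.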

## Per-turn execution

A single turn flows through the runtime as:

```mermaid
sequenceDiagram
    participant Client
    participant WS as api/routers/ws
    participant Runtime as AgentRuntime
    participant Agent as FleetAgent
    participant Tools
    participant Sandbox as DaytonaInterpreter

    Client->>WS: user message
    WS->>Runtime: prepare turn (history, core_memory)
    Runtime->>Agent: respond(user_request, history, core_memory)
    loop ReAct iterations
        Agent->>Agent: thought
        Agent->>Tools: action(tool, args)
        Tools->>Sandbox: run code / fetch context
        Sandbox-->>Tools: observation
        Tools-->>Agent: observation
    end
    Agent-->>Runtime: assistant_response
    Runtime->>WS: stream events
    WS-->>Client: assistant frames + execution events
```

Streaming events flow back through `runtime/execution/streaming_events.py`, are shaped by `api/events/events.py`, and leave the system through the two WebSocket endpoints.

## History and trajectory

`dspy.History` is the conversation memory primitive. It's a typed list of turn pairs (`user_request`, `assistant_response`) that the agent reads on every turn.

The **trajectory** is the per-turn ReAct execution trace: every thought, action, and observation. It can be returned through `run_react_chat_once(include_trajectory=True)` and is written to MLflow whenever tracing is enabled.

History is **session-scoped** and persisted in session manifests. Trajectory is **turn-scoped** and lives in MLflow + the streaming event log.
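The difference in scope can be sketched as follows. `Session` and `run_turn` are hypothetical; `run_turn` only imitates the `include_trajectory` flag and does not reproduce the real `run_react_chat_once` signature or return type.

```python
from dataclasses import dataclass, field


@dataclass
class Session:
    # Session-scoped: grows across turns and would be persisted in a manifest.
    history: list = field(default_factory=list)


def run_turn(session: Session, user_request: str,
             include_trajectory: bool = False) -> dict:
    """Run one fake turn: record history, optionally return the trajectory."""
    # Turn-scoped: rebuilt from scratch on every call.
    trajectory = [
        {"thought": "Echo the request.", "action": "echo",
         "observation": user_request},
    ]
    assistant_response = f"You said: {user_request}"
    session.history.append(
        {"user_request": user_request, "assistant_response": assistant_response}
    )
    result = {"assistant_response": assistant_response}
    if include_trajectory:
        result["trajectory"] = trajectory
    return result


session = Session()
first = run_turn(session, "hello")
second = run_turn(session, "again", include_trajectory=True)
```

After two turns, `session.history` holds both exchanges, while the trajectory appears only in the result of the call that asked for it.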

## Delegate LM

`delegate_to_rlm` accepts an optional `delegate_lm` that is distinct from the planner LM. Both models are set through environment variables:

```bash .env
DSPY_LM_MODEL=openai/gpt-4o              # planner LM (ReAct chat)
DSPY_DELEGATE_LM_MODEL=openai/gpt-4o-mini # delegate LM (recursive RLM)
```

Using a smaller, cheaper model for delegation is the common pattern: the ReAct agent does the high-level reasoning, while the child RLM does bulk reduction.
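One way to resolve the two model names from the environment is sketched below. `resolve_lm_models` is a hypothetical helper, and falling back to the planner model when `DSPY_DELEGATE_LM_MODEL` is unset is an assumption; the actual behavior of fleet-rlm may differ.

```python
import os
from typing import Mapping, Optional


def resolve_lm_models(env: Optional[Mapping[str, str]] = None) -> dict:
    """Read planner and delegate model names from environment variables."""
    env = os.environ if env is None else env
    planner = env.get("DSPY_LM_MODEL")
    if not planner:
        raise RuntimeError("DSPY_LM_MODEL is not set")
    # Assumption: without a delegate model, delegation reuses the planner model.
    delegate = env.get("DSPY_DELEGATE_LM_MODEL") or planner
    return {"planner": planner, "delegate": delegate}


both = resolve_lm_models({
    "DSPY_LM_MODEL": "openai/gpt-4o",
    "DSPY_DELEGATE_LM_MODEL": "openai/gpt-4o-mini",
})
planner_only = resolve_lm_models({"DSPY_LM_MODEL": "openai/gpt-4o"})
```

Passing an explicit mapping keeps the helper easy to test; calling it with no argument reads from the process environment.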

## Reset semantics

`agent.areset(...)` is the async reset path. It tears down:

* The current sandbox interpreter context.
* Conversation history.
* Core memory (back to defaults).
* Loaded documents.

The WebSocket session-switch path calls this when clearing Daytona sandbox buffers for a fresh or restored session. It guarantees no stale state leaks across sessions.
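The four teardown steps can be sketched with a toy async reset. `SessionState`, `FakeInterpreter`, and `DEFAULT_CORE_MEMORY` are hypothetical, and the method takes no arguments here, whereas the real `agent.areset(...)` accepts parameters that this page does not list.

```python
import asyncio
from dataclasses import dataclass, field

DEFAULT_CORE_MEMORY = {"persona": "default persona", "human": "", "scratchpad": ""}


class FakeInterpreter:
    """Stand-in for the sandbox interpreter."""

    def __init__(self) -> None:
        self.closed = False

    async def shutdown(self) -> None:
        self.closed = True


@dataclass
class SessionState:
    interpreter: FakeInterpreter = field(default_factory=FakeInterpreter)
    history: list = field(default_factory=list)
    core_memory: dict = field(default_factory=lambda: dict(DEFAULT_CORE_MEMORY))
    document_paths: list = field(default_factory=list)

    async def areset(self) -> None:
        """Tear down the sandbox context, then restore every field to defaults."""
        await self.interpreter.shutdown()
        self.interpreter = FakeInterpreter()
        self.history = []
        self.core_memory = dict(DEFAULT_CORE_MEMORY)
        self.document_paths = []


state = SessionState(
    history=[{"user_request": "hi", "assistant_response": "hello"}],
    document_paths=["a.md"],
)
state.core_memory["scratchpad"] = "notes from the old session"
old_interpreter = state.interpreter
asyncio.run(state.areset())
```

Resetting all four pieces in one coroutine is what keeps the operation atomic from the caller's point of view: there is no intermediate state in which, say, history is cleared but old documents are still loaded.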

## Implementation pointers

| File                                    | Role                        |
| --------------------------------------- | --------------------------- |
| `runtime/agent/agent.py`                | `FleetAgent` (`dspy.ReAct`) |
| `runtime/agent/runtime.py`              | `AgentRuntime`              |
| `runtime/agent/signatures.py`           | DSPy signatures             |
| `runtime/agent/chat_turns.py`           | Per-turn state and metrics  |
| `runtime/factory.py`                    | `build_runtime()` factory   |
| `runtime/tools/`                        | Tool registry               |
| `runtime/execution/streaming_events.py` | Event construction          |

## See also

* [Recursive RLM](/fleet-rlm/concepts/recursive-rlm) — delegation policy.
* [Sessions & persistence](/fleet-rlm/concepts/sessions-persistence) — what gets restored.
* [Python API](/fleet-rlm/reference/python-api) — public interfaces.
