Building an Intelligent Message Router: The DelegateAgent

Eric Brasher · February 18, 2026, 11:03 AM · 14 min read

When you have multiple specialized agents — a research assistant, a code helper, a project planner — the user shouldn't have to know which one to talk to. The DelegateAgent is an intelligent router that sits between the user and a pool of managed agents, using an LLM to select the right agent for each request, formulate an optimized message, and deliver a clean response.

In this post, I'll walk through the DelegateAgent we built for OpenCaddis — why it exists, how the four-step delegation pipeline works, the agent discovery system, timeout handling, and the subtle message filtering that keeps everything from collapsing into infinite loops.

The Problem: Users Shouldn't Be Dispatchers

OpenCaddis supports multiple agent types — Assistant, Workflow, Event Log, and now Delegate. In a typical deployment, you might have a general-purpose assistant for everyday questions, a research assistant configured with web search plugins, a code agent with file system tools, and a workflow agent for multi-step planning.

Without a router, the user has to manually select which agent to talk to. That works for power users who understand the agent topology, but it breaks down quickly:

  • Context switching is expensive. The user has to mentally map their request to an agent's capabilities, switch to that agent's conversation, and then switch back. Every switch loses conversational context.
  • The wrong choice wastes tokens. If you ask a general-purpose assistant to plan a multi-step project, it'll try — but a workflow agent would do it better. The user gets a mediocre result and has to start over.
  • New agents are invisible. When you add a new specialized agent, existing users don't know it's there. The routing layer discovers and advertises agents automatically.

The DelegateAgent solves this by presenting a single conversational endpoint. Users talk to one agent, and it routes transparently to the best specialist for each request.

The Design: Four Steps, One Turn

The DelegateAgent is a single-turn router. Unlike the WorkflowAgent, which plans multi-step execution and tracks state across messages, the DelegateAgent handles each request independently: receive a message, pick an agent, delegate, and return the result. There's no persistent workflow state, no plan approval loop, no task dependencies. This makes it fast and predictable.

Delegate vs. Workflow

Use Delegate when the user's request can be handled by a single agent in a single turn. Use Workflow when the request requires multiple agents working on a coordinated plan with dependencies and approval. The DelegateAgent is the "smart receptionist" — the WorkflowAgent is the "project manager."

The pipeline for every incoming message:

Select Agent → Formulate Message → Delegate with Timeout → Analyze & Respond

Each step involves an LLM call — the routing model for selection, the agent's own chat client for formulation and analysis, and the delegated agent for the actual work. This is deliberate: the extra LLM calls let the DelegateAgent translate between the user's natural language and each agent's specific strengths.
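
Sketched end to end, the pipeline looks something like this (the method names are illustrative, mirroring the steps covered in the rest of this post, not the exact source):

```csharp
// Hypothetical sketch of the four-step pipeline. Helper names
// (SelectAgent, FormulateMessage, etc.) mirror the steps described
// in this post; the real source may factor things differently.
public async Task<AgentMessage> HandleUserMessage(AgentMessage message)
{
    var userMessage = message.Message ?? "";

    // Step 1: pick the best managed agent (routing LLM, structured output)
    var selection = await SelectAgent(userMessage);

    // Step 2: rewrite the request for the selected agent (conversational client)
    var delegationMessage = await FormulateMessage(userMessage, selection);

    // Step 3: send the task message, racing against the configured timeout
    var responseText = await DelegateWithTimeoutAsync(
        BuildTaskMessage(selection, delegationMessage),
        selection.AgentName,
        _timeoutSeconds);

    // Step 4: turn the raw agent response into a user-facing answer
    var reply = await AnalyzeResponse(userMessage, selection, responseText);

    var response = message.Response();
    response.Message = reply;
    return response;
}
```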

Agent Discovery: Building the Catalog

Before the DelegateAgent can route anything, it needs to know what agents are available. Discovery happens at initialization and again if the cached list is empty when a message arrives.

The agent names come from the ManagedAgents configuration argument — a comma-separated list of agent names defined in your fabr.json:

fabr.json — DelegateAgent Configuration
{
  "Handle": "opencaddis-user:delegate",
  "AgentType": "delegate",
  "Args": {
    "ManagedAgents": "assistant, research, code-helper",
    "DelegationTimeoutSeconds": "180",
    "ModelConfig": "default"
  }
}

For each name, the DelegateAgent constructs a handle (opencaddis-user:{name}) and queries the agent's health through FabrCore's IFabrAgentHost:

DelegateAgent.cs — Agent Discovery
private async Task DiscoverAvailableAgents()
{
    _availableAgents = [];

    var managedAgentsCsv = config.Args?.GetValueOrDefault("ManagedAgents") ?? "";
    var agentNames = managedAgentsCsv
        .Split(',', StringSplitOptions.RemoveEmptyEntries | StringSplitOptions.TrimEntries)
        .ToList();

    foreach (var name in agentNames)
    {
        var handle = $"{UserHandle}:{name}";
        var health = await fabrAgentHost.GetAgentHealth(
            handle, HealthDetailLevel.Detailed);

        if (health.State == HealthState.Healthy && health.IsConfigured)
        {
            var prompt = health.Configuration?.SystemPrompt;
            var desc = health.Configuration?.Description
                ?? (prompt is null ? null
                    : prompt[..Math.Min(200, prompt.Length)])
                ?? $"Agent '{name}' (type: {health.AgentType})";

            _availableAgents.Add(new AvailableAgentInfo
            {
                AgentName = name,
                Handle = handle,
                AgentType = health.AgentType ?? "unknown",
                Description = desc
            });
        }
    }
}

The health check is the key piece here. The DelegateAgent doesn't just blindly forward messages to agent names — it verifies each agent is Healthy and IsConfigured through FabrCore's health reporting system. Unhealthy agents are silently skipped. This means you can take an agent offline for maintenance and the router adapts automatically — no restart, no config change.

The description is pulled from the agent's configuration and becomes part of the catalog that the routing LLM uses to make decisions. If the agent has a Description field, that's used. Otherwise it falls back to the first 200 characters of the system prompt, or a generic label. The quality of the description directly affects routing accuracy — a well-described agent gets selected more reliably.
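
The catalog itself is just a formatted list. A minimal sketch (BuildAgentCatalog is an illustrative name, not necessarily the method in the source):

```csharp
// Renders the discovered agents as the list embedded in the routing
// prompt. Matches the AvailableAgentInfo shape from the discovery
// code above.
private static string BuildAgentCatalog(IEnumerable<AvailableAgentInfo> agents) =>
    "## Available Agents\n" + string.Join("\n",
        agents.Select(a => $"- {a.AgentName} (type: {a.AgentType}): {a.Description}"));
```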

Step 1: Agent Selection with Structured Output

When a user message arrives, the DelegateAgent's first job is to pick which managed agent should handle it. This is an LLM call with structured output — the routing model returns JSON, not freeform text:

DelegateAgentContracts.cs
public class AgentSelectionOutput
{
    public string SelectedAgentName { get; set; } = "";
    public string Reasoning { get; set; } = "";
}

The routing call uses ExtractJsonAsync<T>, a helper that combines AIJsonUtilities.CreateJsonSchema with ChatResponseFormat.ForJsonSchema to get reliably structured output from the model:

DelegateAgent.cs — JSON Schema Extraction
private async Task<T> ExtractJsonAsync<T>(
    string systemPrompt, string userPrompt)
    where T : class, new()
{
    var schema = AIJsonUtilities.CreateJsonSchema(typeof(T));
    var chatOptions = new ChatOptions
    {
        Instructions = systemPrompt,
        ResponseFormat = ChatResponseFormat.ForJsonSchema(
            schema: schema,
            schemaName: typeof(T).Name,
            schemaDescription: $"Structured {typeof(T).Name} response")
    };

    var response = await _routingClient!.GetResponseAsync(
        [new ChatMessage(ChatRole.User, userPrompt)],
        chatOptions);

    var text = response.Text?.Trim() ?? "{}";
    var json = TryExtractJsonObject(text) ?? "{}";
    return JsonSerializer.Deserialize<T>(json, JsonOpts) ?? new T();
}
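
TryExtractJsonObject isn't shown above. The idea is to tolerate models that wrap the JSON in prose or markdown fences; a self-contained sketch of one way to do it:

```csharp
// Hypothetical sketch: pull the first balanced {...} object out of a
// model response that may surround it with prose or code fences.
private static string? TryExtractJsonObject(string text)
{
    var start = text.IndexOf('{');
    if (start < 0) return null;

    var depth = 0;
    var inString = false;
    for (var i = start; i < text.Length; i++)
    {
        var c = text[i];
        if (inString)
        {
            if (c == '\\') i++;            // skip the escaped character
            else if (c == '"') inString = false;
        }
        else if (c == '"') inString = true;
        else if (c == '{') depth++;
        else if (c == '}' && --depth == 0)
            return text[start..(i + 1)];   // first balanced object
    }
    return null;                           // unbalanced; give up
}
```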

The system prompt for routing is where the catalog comes together. The agent builds a formatted list of available agents with their names, types, and descriptions, then instructs the model to select exactly one:

DelegateAgent.cs — Routing Prompt
You are an intelligent message router. Given a user's request
and a catalog of available agents, select the single best agent
to handle the request.

## Available Agents
- assistant (type: assistant): General-purpose AI assistant for everyday tasks
- research (type: assistant): Research assistant with web search capabilities
- code-helper (type: assistant): Code analysis agent with file system tools

## Rules
- Select exactly ONE agent by name (the AgentName field).
- Choose the agent whose capabilities best match the user's request.
- Provide brief reasoning for your selection.
- If no agent is a clear match, choose the most general-purpose agent.

If the model returns an agent name that doesn't match any available agent (a rare case with JSON schema enforcement, but possible), the DelegateAgent falls back to the first available agent rather than failing. The Reasoning field is logged for observability — it's valuable for debugging routing decisions without having to reproduce the full conversation.
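
That fallback amounts to a couple of lines; a sketch, assuming the _availableAgents list built during discovery:

```csharp
// If the model names an unknown agent, fall back to the first healthy
// one rather than failing the whole turn. The reasoning is logged for
// observability either way.
var selected = _availableAgents.FirstOrDefault(a =>
        string.Equals(a.AgentName, selection.SelectedAgentName,
            StringComparison.OrdinalIgnoreCase))
    ?? _availableAgents.First();

logger.LogInformation(
    "Routing to {Agent}: {Reasoning}",
    selected.AgentName, selection.Reasoning);
```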

Two Chat Clients

The DelegateAgent maintains two separate chat clients: a _routingClient for stateless selection calls (agent selection uses clean, single-turn requests with no history), and an _agent/_session pair for the conversational formulation and analysis steps (which benefit from seeing prior exchanges). This separation keeps routing decisions unbiased by prior conversation context.

Step 2: Message Formulation

The user's raw message might be casual, ambiguous, or assume context that the target agent doesn't have. Rather than forwarding it verbatim, the DelegateAgent uses its conversational chat client to formulate an optimized delegation message:

DelegateAgent.cs — Formulation
var formulationPrompt = $"""
    The user sent the following request:
    ---
    {userMessage}
    ---

    You are delegating this to the **{selectedAgent.AgentName}** agent
    ({selectedAgent.Description}).

    Formulate the best possible message to send to this agent so it
    can fulfill the user's request effectively.
    Be clear, specific, and include all relevant context from the
    user's message.
    Output ONLY the message to send — no preamble, no explanation.
    """;

var formulationResult = await _agent!.RunAsync(
    formulationPrompt, _session);
var delegationMessage = formulationResult.Text ?? userMessage;

This step is subtle but important. Because the formulation runs through the agent's persistent chat session (_agent/_session), it has access to the conversation history. If the user said "do the same thing for the third one" — a message that's meaningless without context — the formulation step can expand it into "Analyze the third competitor from our research list using the same framework we established earlier" because it can see the prior conversation.

The fallback to the raw userMessage if formulation produces no text ensures the delegation always has something to send, even if the LLM call returns empty.

Step 3: Delegation with Timeout

With the right agent selected and the message formulated, it's time to actually delegate. The DelegateAgent uses FabrCore's SendAndReceiveMessage — a request-reply pattern through the agent messaging system — wrapped in a timeout:

DelegateAgent.cs — Delegation with Timeout
private async Task<string> DelegateWithTimeoutAsync(
    AgentMessage taskMessage, string agentName, int timeoutSeconds)
{
    var delegationTask = fabrAgentHost.SendAndReceiveMessage(taskMessage);
    var timeoutTask = Task.Delay(TimeSpan.FromSeconds(timeoutSeconds));

    var completed = await Task.WhenAny(delegationTask, timeoutTask);

    if (completed == delegationTask)
    {
        var agentResponse = await delegationTask;
        return agentResponse.Message ?? "";
    }

    // Observe the late result to prevent unobserved task exceptions
    _ = delegationTask.ContinueWith(
        t => logger.LogWarning(t.Exception,
            "Late delegation response from {Agent} faulted", agentName),
        TaskContinuationOptions.OnlyOnFaulted);

    return $"[Timeout] The {agentName} agent did not respond within " +
           $"{timeoutSeconds} seconds. The request may still be processing.";
}

The timeout pattern uses Task.WhenAny — a race between the delegation and a delay. The default timeout is 180 seconds (3 minutes), configurable via DelegationTimeoutSeconds.

There's an important detail in the timeout path: the ContinueWith call on the delegation task. When the timeout wins the race, the delegation task is still running — the target agent is still processing. If that task later faults and the exception is never observed, the .NET runtime raises the TaskScheduler.UnobservedTaskException event when the task is finalized, which can terminate the process if ThrowUnobservedTaskExceptions is configured. The continuation handler observes and logs those late faults cleanly.
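
On .NET 6 and later, the same race can be expressed with Task.WaitAsync, which surfaces a TimeoutException instead of a sentinel from Task.WhenAny. A sketch of that alternative:

```csharp
// Alternative sketch using Task.WaitAsync (.NET 6+). The abandoned
// delegation task still runs after the timeout fires, so it still
// needs the fault-observing continuation from the original code.
var delegationTask = fabrAgentHost.SendAndReceiveMessage(taskMessage);
try
{
    var agentResponse = await delegationTask
        .WaitAsync(TimeSpan.FromSeconds(timeoutSeconds));
    return agentResponse.Message ?? "";
}
catch (TimeoutException)
{
    _ = delegationTask.ContinueWith(
        t => logger.LogWarning(t.Exception,
            "Late delegation response from {Agent} faulted", agentName),
        TaskContinuationOptions.OnlyOnFaulted);

    return $"[Timeout] The {agentName} agent did not respond within " +
           $"{timeoutSeconds} seconds. The request may still be processing.";
}
```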

The delegation message itself is structured deliberately:

DelegateAgent.cs — Task Message
var taskMessage = new AgentMessage
{
    ToHandle = selectedAgent.Handle,
    FromHandle = myHandle,
    Channel = "agent",          // Not the default channel
    Kind = MessageKind.Request,   // Expects a reply
    MessageType = "task",        // Semantic label
    Message = delegationMessage
};

The Channel = "agent" is critical, and it connects directly to the message filtering logic that prevents infinite loops — which we'll get to next.

The Filtering Problem: Preventing Infinite Loops

When the DelegateAgent delegates to a managed agent, that agent may send messages back — not just the final response (which comes through SendAndReceiveMessage), but also thinking notifications, status updates, and other fire-and-forget messages. If the DelegateAgent processes those as new user requests, it would try to route them, potentially creating an infinite delegation loop.

The message filter at the top of OnMessage handles this:

DelegateAgent.cs — Message Filtering
// Only process genuine user requests on the default channel.
//
// Ignore:
//  - "agent" channel: delegation responses already handled inline
//    via SendAndReceiveMessage.
//  - "thinking"/"status" MessageType: notifications from delegated
//    agents whose ThinkingNotifier targets us.
//  - OneWay Kind: any fire-and-forget notification, not a user request.
if (!string.IsNullOrEmpty(message.Channel)
    || message.MessageType is "thinking" or "status"
    || message.Kind == MessageKind.OneWay)
{
    return message.Response();
}

Three filters, each catching a different class of non-user messages:

  • !IsNullOrEmpty(Channel): catches messages on named channels like "agent". Delegation responses arrive on the "agent" channel but are already handled inline by SendAndReceiveMessage; processing them again would double-handle.
  • MessageType is "thinking" or "status": catches thinking/status notifications from child agents. When a delegated agent uses ThinkingNotifier, FabrCore delivers the notification through the message stream. These are display-only, not actionable requests.
  • Kind == OneWay: a catch-all for fire-and-forget messages that don't expect a reply. Prevents routing of system-level notifications.

This is the kind of defensive code that only exists because of real bugs encountered during development. Without the filter, a DelegateAgent managing another DelegateAgent (or a chatty agent with frequent thinking updates) would loop endlessly.

Step 4: Response Analysis

The final step translates the delegated agent's response back into a user-facing answer. The DelegateAgent doesn't just pass through the raw response — it analyzes it in the context of the original request:

DelegateAgent.cs — Response Analysis
var analysisPrompt = $"""
    The user's original request was:
    ---
    {userMessage}
    ---

    You delegated to the **{selectedAgent.AgentName}** agent
    and received this response:
    ---
    {responseText}
    ---

    Analyze the response and formulate a clear, helpful reply
    for the user.
    If the agent's response fully answers the request, present
    it cleanly.
    If it's partial or unclear, note what was accomplished and
    what may still be needed.
    If the agent timed out or returned an error, let the user
    know clearly.
    Output ONLY the final response for the user.
    """;

var analysisResult = await _agent!.RunAsync(analysisPrompt, _session);
response.Message = analysisResult.Text ?? responseText;

The analysis step handles three cases gracefully:

  • Success: The delegated agent returned a complete answer. The analysis cleans up formatting and presents it naturally.
  • Partial result: The agent answered part of the question. The analysis notes what was accomplished and what might still be needed.
  • Timeout or error: The response contains a [Timeout] or [Error] prefix from DelegateWithTimeoutAsync. The analysis translates this into a user-friendly message.

Like the formulation step, analysis runs through the persistent chat session. This means the DelegateAgent's conversational memory builds up naturally — it remembers what it delegated previously and can reference prior results in its analysis.

Real-Time Progress: Thinking Notifications

A delegation cycle involves multiple LLM calls and potentially a long wait for the target agent. Without progress feedback, the user stares at a blank screen wondering if anything is happening. The DelegateAgent sends thinking notifications at each stage:

DelegateAgent.cs — Thinking Notifications
await SendThinkingAsync("Selecting the best agent...");
var selection = await SelectAgent(userMessage);

await SendThinkingAsync("Formulating request...");
// ... formulation ...

await SendThinkingAsync($"Delegating to {selectedAgent.AgentName}...");
// ... delegation ...

await SendThinkingAsync("Analyzing response...");

The SendThinkingAsync helper sends a OneWay message with MessageType = "thinking" directly to the client handle. In the OpenCaddis UI, these appear as ephemeral status indicators — the user sees "Selecting the best agent..." then "Delegating to research..." then "Analyzing response..." in real time.
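
The helper itself is small. A hypothetical sketch (the fire-and-forget SendMessage call and the _clientHandle field are assumptions about the surrounding code, not confirmed API):

```csharp
// Hypothetical sketch: a OneWay "thinking" notification aimed at the
// client UI. This is exactly the message shape the OnMessage filter
// ignores, so routers never re-route each other's progress updates.
private Task SendThinkingAsync(string text) =>
    fabrAgentHost.SendMessage(new AgentMessage  // assumed fire-and-forget API
    {
        ToHandle = _clientHandle,               // assumed: the UI client's handle
        FromHandle = config.Handle ?? "",
        Kind = MessageKind.OneWay,              // no reply expected
        MessageType = "thinking",
        Message = text
    });
```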

The DelegateAgent also integrates with OpenCaddis's ThinkingNotifier — a static registry that maps agent handles to client handles. This means plugins running inside the delegated agent can also send thinking notifications back to the user through the same channel. The user gets a unified progress stream regardless of how deep the delegation goes.

Compaction Integration

The DelegateAgent maintains its own chat history (for formulation and analysis), which grows over time just like any conversational agent. It calls TryCompactAsync() before processing each message, using the same compaction system as the AssistantAgent:

DelegateAgent.cs — Compaction
// Run compaction if needed before invoking the model
var compaction = await TryCompactAsync(
    onCompacting: () => SendThinkingAsync("Compacting history..."));
if (compaction?.WasCompacted == true)
{
    await SendThinkingAsync(
        $"Compacted history: {compaction.OriginalMessageCount} → {compaction.CompactedMessageCount} messages");
}

This is particularly valuable for the DelegateAgent because its conversation history grows faster than a typical assistant's — each delegation cycle adds the formulation prompt, the formulation result, the analysis prompt, and the analysis result. Four messages per user request instead of two. Without compaction, a busy DelegateAgent hits its context window ceiling twice as fast.

Initialization: Two Clients, One Agent

The DelegateAgent's OnInitialize sets up the dual-client architecture and runs agent discovery:

DelegateAgent.cs — Initialization
public override async Task OnInitialize()
{
    var modelConfigName = config.Args?
        .GetValueOrDefault("ModelConfig") ?? "default";

    // Stateless client for routing decisions
    _routingClient = await GetChatClient(modelConfigName);

    // Stateful agent for formulation and analysis
    var result = await CreateChatClientAgent(
        modelConfigName,
        threadId: config.Handle ?? fabrAgentHost.GetHandle(),
        tools: []  // No tools — DelegateAgent delegates, not executes
    );

    _agent = result.Agent;
    _session = result.Session;

    await DiscoverAvailableAgents();
}

Notice tools: [] — the DelegateAgent has no tools of its own. It doesn't execute anything directly; it delegates. Tools belong to the managed agents. This is a design choice: the router should be a pure routing layer, not a Swiss army knife that also tries to do things itself.

Configuration Reference

  • ManagedAgents (required): Comma-separated list of agent names to route to. Each name maps to a handle opencaddis-user:{name}.
  • DelegationTimeoutSeconds (default: 180): Maximum seconds to wait for a delegated agent to respond before returning a timeout message.
  • ModelConfig (default: "default"): Name of the model configuration to use for routing, formulation, and analysis.
  • CompactionEnabled (default: true): Whether to run chat history compaction on the DelegateAgent's own conversation.
  • CompactionKeepLastN (default: 20): Number of recent messages to preserve during compaction.

A complete fabr.json example with a DelegateAgent managing three specialized agents:

fabr.json — Full Example
{
  "Agents": [
    {
      "Handle": "opencaddis-user:delegate",
      "AgentType": "delegate",
      "Description": "Intelligent router for all user requests",
      "Args": {
        "ManagedAgents": "assistant, research, code-helper",
        "DelegationTimeoutSeconds": "180"
      }
    },
    {
      "Handle": "opencaddis-user:assistant",
      "AgentType": "assistant",
      "Description": "General-purpose AI assistant"
    },
    {
      "Handle": "opencaddis-user:research",
      "AgentType": "assistant",
      "Description": "Research assistant with web search",
      "Args": { "Plugins": "WebSearch" }
    },
    {
      "Handle": "opencaddis-user:code-helper",
      "AgentType": "assistant",
      "Description": "Code analysis with file system access",
      "Args": { "Plugins": "FileSystem" }
    }
  ]
}

What It Looks Like in Practice

Here's a real interaction showing the full delegation cycle. The user talks to the DelegateAgent, which routes to a research agent transparently:

You: What are the key differences between Blazor Server and Blazor WebAssembly for an internal business app?

Delegate (thinking): Selecting the best agent...
Delegate (thinking): Formulating request...
Delegate (thinking): Delegating to research...
Delegate (thinking): Analyzing response...

Delegate: For an internal business application, here are the key differences between Blazor Server and Blazor WebAssembly...

From the user's perspective, they asked one question and got one answer. Behind the scenes, the DelegateAgent made three LLM calls (selection, formulation, analysis) and one agent-to-agent delegation. The thinking notifications keep the user informed without requiring them to understand the routing topology.

Shared Patterns: DelegateAgent and WorkflowAgent

The DelegateAgent and WorkflowAgent share the same agent discovery code — the DiscoverAvailableAgents pattern, the ManagedAgents configuration, the AvailableAgentInfo data structure, and the health-check-based filtering. This isn't accidental. Both agents are orchestrators that manage other agents, and they both need to know what's available.

  • Interaction model: DelegateAgent does single-turn routing; WorkflowAgent runs multi-turn workflow execution.
  • State: DelegateAgent keeps conversational history only; WorkflowAgent tracks a full workflow plan with task statuses.
  • Agent selection: LLM-based in both, per-message for DelegateAgent and during the planning phase for WorkflowAgent.
  • User interaction: DelegateAgent is transparent (the user doesn't see routing); WorkflowAgent is explicit (plan approval, pause/resume, status).
  • Error handling: DelegateAgent uses a timeout plus fallback response; WorkflowAgent uses retry, replan, and stall detection.
  • Best for: DelegateAgent suits Q&A and single-task requests; WorkflowAgent suits multi-step projects with dependencies.

A common deployment pattern is to have a DelegateAgent as the user's primary entry point, with a WorkflowAgent as one of its managed agents. Simple requests get routed to specialized assistants; complex multi-step requests get routed to the workflow planner. The user doesn't need to decide — the router figures it out.
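
In configuration, that pattern is just another name in the router's ManagedAgents list. A hypothetical fragment, assuming the WorkflowAgent registers under AgentType "workflow":

```json
{
  "Agents": [
    {
      "Handle": "opencaddis-user:delegate",
      "AgentType": "delegate",
      "Args": { "ManagedAgents": "assistant, research, workflow" }
    },
    {
      "Handle": "opencaddis-user:workflow",
      "AgentType": "workflow",
      "Args": { "ManagedAgents": "assistant, research" }
    }
  ]
}
```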

Looking Ahead

The DelegateAgent is deliberately simple — 420 lines of code for a fully functional LLM-powered router. There are areas to evolve: routing confidence scores that fall back to asking the user when no agent is a clear match, learned preferences that weight routing based on past delegation success, and multi-agent fan-out where the DelegateAgent consults multiple specialists and synthesizes their responses.

The foundation is solid. The four-step pipeline (select, formulate, delegate, analyze) provides clean separation of concerns. The message filtering prevents the recursive pitfalls of agent-to-agent communication. And the shared discovery pattern with the WorkflowAgent means new orchestration patterns can be built on the same infrastructure.

Check out the OpenCaddis source on GitHub and the DelegateAgent documentation to see the full implementation and configuration options.


Eric Brasher

Builder of OpenCaddis and the FabrCore framework.