Agents
Agents are the core of OpenCaddis. Each agent is a conversation thread with its own personality (system prompt), capabilities (plugins), and reasoning strategy (agent type). You define agents in opencaddis.json.
- Assistant — Simple request-response. Fast and direct.
- Delegate — Intelligent router. Picks the best agent for each request.
- Workflow — Multi-agent orchestrator. Plans, delegates, and verifies.
- Event Log — Live application log analyst. Ask questions about your logs.
- Chain of Thought — Iterative reasoning. Breaks problems into steps and self-evaluates.
Assistant Agent
The Assistant agent uses a straightforward request-response pattern:
- User sends a message
- Message is sent to the LLM with the system prompt and configured tools
- LLM responds (optionally calling tools)
- Response is returned to the chat UI
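The loop above can be sketched in a few lines. This is a hypothetical illustration, not the OpenCaddis implementation: `call_llm`, the message shapes, and the tool registry are all stand-ins.

```python
# Hypothetical sketch of the Assistant agent's request-response turn.
# `call_llm` and the tool registry are stand-ins, not the OpenCaddis API.

def run_assistant(message, system_prompt, tools, call_llm):
    """One turn: send prompt + tools to the LLM, resolve any tool calls,
    and return the final text response."""
    history = [{"role": "system", "content": system_prompt},
               {"role": "user", "content": message}]
    reply = call_llm(history, tools)
    # The LLM may request tool calls before producing a final answer.
    while reply.get("tool_call"):
        name, args = reply["tool_call"]
        result = tools[name](**args)
        history.append({"role": "tool", "name": name, "content": result})
        reply = call_llm(history, tools)
    return reply["content"]
```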
Best for: General chat, simple tool use, quick questions, straightforward tasks where you don't need multi-step reasoning.
{
"Handle": "Helper",
"AgentType": "assistant",
"Models": ["default"],
"SystemPrompt": "You are a helpful assistant with access to the web and file system.",
"Plugins": ["WebBrowser", "FileSystem", "Memory"]
}
Workflow Agent
The Workflow agent is a multi-agent orchestrator. Instead of doing everything itself, it plans work, delegates tasks to other agents, and verifies the results — coordinating multiple specialist agents to accomplish complex goals.
How It Works
The Workflow agent follows a structured pipeline:
- Planning — The Workflow agent sends your request to the LLM, which generates a structured plan with tasks, assigned agents, dependencies, and acceptance criteria
- Approval — The plan is presented to you for review. Say "go" to approve, "cancel" to abort, or provide feedback to adjust
- Execution — Tasks are executed sequentially, respecting dependencies. Each task is delegated to the assigned agent, which uses its own plugins and tools to complete it
- Acceptance — After each task completes, the Workflow agent evaluates the result against the task's acceptance criteria. Failed tasks may be retried with improved instructions or trigger a replan
- Completion — Once all tasks are done, the Workflow agent summarizes what was accomplished
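The pipeline above can be condensed into a small control loop. This is an illustrative sketch only — `plan_fn`, `approve_fn`, `delegate_fn`, and `accept_fn` are hypothetical stand-ins for the LLM planning call, the user approval step, agent delegation, and the acceptance check.

```python
# Illustrative sketch of the plan -> approve -> execute -> accept pipeline.
# All function parameters are hypothetical stand-ins, not the OpenCaddis API.

def run_workflow(request, plan_fn, approve_fn, delegate_fn, accept_fn):
    plan = plan_fn(request)          # Planning: LLM produces a task list
    if not approve_fn(plan):         # Approval: user says "go" or "cancel"
        return "cancelled"
    results = {}
    for task in plan:                # Execution: sequential, dependency order
        results[task["id"]] = delegate_fn(task)
        if not accept_fn(task, results[task["id"]]):  # Acceptance check
            return f"task {task['id']} failed acceptance"
    return results                   # Completion: all tasks done
```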
Managed Agents
The Workflow agent doesn't do work directly — it delegates to other agents you've configured. Use the ManagedAgents arg to specify which agents it can assign tasks to:
{
"Handle": "Orchestrator",
"AgentType": "workflow",
"Models": ["default"],
"SystemPrompt": "You are a project coordinator. Break complex requests into tasks and delegate them to the appropriate specialist agents.",
"Args": {
"ManagedAgents": "Researcher,Developer,Writer"
}
}
At startup, the Workflow agent discovers which of its managed agents are actually available by running health checks. Only healthy agents appear in the plan. If ManagedAgents is not set, it can see all agents in the workspace.
User Commands
During different phases of the workflow, you can interact with the agent using these commands:
| Phase | Command | What It Does |
|---|---|---|
| Awaiting Approval | go, run, yes, approve | Approve the plan and begin execution |
| Awaiting Approval | cancel, no | Cancel the plan |
| Executing | status, progress | Show current task progress and overall plan status |
| Executing | pause, stop | Pause execution after the current task completes |
| Paused | resume, continue, go | Resume execution from where it stopped |
Task Lifecycle
| Status | Description |
|---|---|
| Pending | Waiting to be executed (dependencies not yet met) |
| InProgress | Currently being worked on by the assigned agent |
| Completed | Finished and acceptance criteria met |
| Failed | Could not be completed after max attempts |
| Blocked | Blocked by a dependency that failed |
| Skipped | Skipped due to upstream failure or replan |
Failure Handling
When a task fails acceptance criteria, the Workflow agent decides how to handle it:
- Retry — Re-run the task with improved instructions (up to MaxTaskAttempts)
- Blocked — Mark the task as blocked and continue with other tasks
- Replan — Generate a new plan that accounts for the failure
- Stall — If no progress can be made, stop and ask the user for guidance
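One plausible way to order these outcomes is as a simple decision cascade. The policy below is a hedged sketch matching the behaviors listed above; the function name, parameters, and ordering are assumptions, not the actual OpenCaddis logic.

```python
# Hedged sketch of a failure-handling policy matching the behaviors above.
# Names and ordering are illustrative assumptions.

def handle_failed_task(task, attempts, max_attempts, others_runnable):
    if attempts < max_attempts:
        return "retry"        # re-run with improved instructions
    if others_runnable:
        return "blocked"      # mark blocked, continue with other tasks
    if task.get("replannable", True):
        return "replan"       # generate a new plan around the failure
    return "stall"            # no progress possible: ask the user
```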
Configuration
| Arg | Default | Description |
|---|---|---|
| ManagedAgents | all agents | Comma-separated list of agent names this workflow can delegate to |
| MaxTaskAttempts | 3 | Maximum retry attempts per task before marking it failed |
| ModelConfig | "default" | Which model configuration to use for planning and evaluation |
Best for: Multi-step projects, coordinating multiple specialist agents, research workflows, any task that benefits from planning and delegation.
{
"Handle": "Project Lead",
"AgentType": "workflow",
"Models": ["default"],
"SystemPrompt": "You are a project coordinator. Break complex requests into clear tasks, assign them to the right agents, and verify results meet quality standards.",
"Args": {
"ManagedAgents": "Helper,Researcher,Writer",
"MaxTaskAttempts": 3
}
}
Delegate Agent
The Delegate agent is an intelligent message router. Instead of doing work itself, it analyzes each user request, selects the most appropriate agent from a managed set, and delegates the work — then synthesizes the response back to you. It's a single-turn orchestrator: one request in, one answer out, routed to the right specialist automatically.
How It Works
The Delegate agent follows a four-step cycle for each request:
- Select Agent — The Delegate agent sends your message to the LLM along with a catalog of available agents. The LLM picks the best-suited agent and explains its reasoning
- Formulate — The LLM rewrites your request into an optimized delegation message tailored for the selected agent's capabilities
- Delegate — The message is sent to the selected agent, which uses its own plugins and tools to complete the work. A configurable timeout prevents hanging
- Analyze & Respond — The delegated agent's response is analyzed and synthesized into a clear, user-friendly answer
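The four-step cycle can be sketched as a straight-line function. This is an assumption-laden illustration: `select_fn`, `formulate_fn`, and `synthesize_fn` stand in for the LLM calls, and the agent callable with a `timeout` keyword is hypothetical.

```python
# Hypothetical sketch of the Delegate agent's four-step cycle.
# select_fn / formulate_fn / synthesize_fn stand in for LLM calls.

def route_request(message, agents, select_fn, formulate_fn, synthesize_fn,
                  timeout_s=180):
    choice = select_fn(message, agents)                  # 1. Select Agent
    delegated = formulate_fn(message, choice)            # 2. Formulate
    raw = agents[choice](delegated, timeout=timeout_s)   # 3. Delegate
    return synthesize_fn(message, raw)                   # 4. Analyze & Respond
```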
Best for: Providing a single natural-language entry point to a set of specialist agents. Users don't need to know which agent to talk to — the Delegate figures it out.
Managed Agents
Like the Workflow agent, the Delegate agent delegates to other agents you've configured. Use the ManagedAgents arg to specify which agents it can route to:
{
"Handle": "Router",
"AgentType": "delegate",
"Models": ["default"],
"Description": "Intelligent message router that selects the best agent for each request.",
"Args": {
"ManagedAgents": "Web,Files,Secretary"
}
}
At startup, the Delegate agent discovers which of its managed agents are actually available by running health checks. Only healthy, configured agents appear in the routing catalog. If an agent is unhealthy or missing, it's silently excluded.
Real-time Progress
While processing, the Delegate agent sends thinking notifications so you can see what's happening:
- "Selecting the best agent..."
- "Formulating request..."
- "Delegating to Web..."
- "Analyzing response..."
Delegate vs. Workflow
Both are orchestration agents, but scoped differently:
| | Delegate | Workflow |
|---|---|---|
| Scope | Single request → single agent | Single request → multi-step plan |
| Agent Selection | LLM picks 1 agent per request | Decomposes into tasks, assigns agents to each |
| User Interaction | Transparent — just chat normally | Approval flow, status commands, pause/resume |
| Retry Logic | Timeout-based fallback | Full retry/replan cycle per task |
| Best For | Natural single-turn routing | Complex multi-step projects |
Configuration
| Arg | Default | Description |
|---|---|---|
| ManagedAgents | all agents | Comma-separated list of agent names this delegate can route to |
| DelegationTimeoutSeconds | 180 | Maximum seconds to wait for a delegated agent's response |
| ModelConfig | "default" | Which model configuration to use for routing decisions |
{
"Handle": "Front Desk",
"AgentType": "delegate",
"Models": ["default"],
"Description": "Intelligent message router that analyzes requests, selects the best agent, and synthesizes responses.",
"Args": {
"ManagedAgents": "Helper,Researcher,Secretary",
"DelegationTimeoutSeconds": 180
},
"Tools": ["GetDateTime"]
}
Example Conversation
What's the latest news on WebAssembly server-side support?
Selecting the best agent... → Delegating to Web...
Based on the latest sources, WebAssembly server-side support has been progressing steadily. WASI Preview 2 reached stable status in early 2026, and frameworks like Spin and wasmCloud are now production-ready for microservices workloads. The .NET team has also improved their WASI integration in .NET 10...
The user didn't need to know that the "Web" agent handled this request. The Delegate agent selected it automatically because the question required web browsing, formulated an optimized query, and synthesized the response. The same message to a different Delegate might route to a "Researcher" agent that also checks memory — it depends on which agents are available.
Event Log Agent
The Event Log agent is a live application diagnostics tool. It receives a real-time stream of OpenCaddis application logs and lets you query them through natural conversation — search by text, filter by severity or category, drill into time ranges, and get summary statistics.
How It Works
Unlike the other agent types, the Event Log agent has a passive data pipeline feeding it:
- A custom ILoggerProvider captures all application log entries at Information level and above
- Log entries are streamed through a bounded channel to the Event Log agent as events
- The agent stores them in an in-memory ring buffer (last 1,000 entries, O(1) writes, no allocations)
- When you ask a question, the LLM calls the agent's built-in tools to search and analyze the buffer
The Event Log agent has its own built-in tools for log analysis — you don't assign plugins to it. Its tools are scoped to the log buffer only and cannot access the file system, network, or any other resources.
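A ring buffer with these properties is easy to picture. The sketch below is a minimal Python analogue of the described store (fixed capacity, O(1) writes, oldest entries overwritten, case-insensitive search) — not the real in-memory buffer, which is .NET code.

```python
# Minimal ring buffer matching the described behavior: fixed capacity,
# O(1) appends, oldest entries silently overwritten. A sketch only.
from collections import deque

class LogBuffer:
    def __init__(self, capacity=1000):
        self._entries = deque(maxlen=capacity)  # drops oldest automatically

    def append(self, entry):
        self._entries.append(entry)             # O(1) write

    def recent(self, count=20):
        return list(self._entries)[-count:][::-1]  # newest first

    def search(self, query):
        q = query.lower()                       # case-insensitive search
        return [e for e in self._entries if q in e.lower()]
```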
Built-in Tools
| Tool | Parameters | Description |
|---|---|---|
| GetRecentLogs | count (default 20) | Get the most recent log entries, newest first |
| SearchLogs | query, maxResults (default 50) | Search log messages and categories for text (case-insensitive) |
| SearchLogsByLevel | level, maxResults (default 50) | Filter logs by severity: Trace, Debug, Information, Warning, Error, Critical |
| SearchLogsByCategory | categoryPrefix, maxResults (default 50) | Filter logs by category prefix (e.g., "OpenCaddis.Services") |
| SearchLogsByTimeRange | from, to (ISO 8601), maxResults (default 100) | Filter logs within a time range |
| GetLogStats | — | Summary statistics: total count, counts per severity, top 10 categories by volume |
| GetLogCount | — | Get the total number of entries in the buffer |
Configuration
| Arg | Default | Description |
|---|---|---|
| ModelConfig | "default" | Which model configuration to use for the LLM |
What Gets Captured
- Included: All log entries at Information, Warning, Error, and Critical levels from OpenCaddis application code
- Excluded: Trace and Debug levels, and internal framework logs from FabrCore and Orleans (suppressed to avoid noise and recursion)
- Buffer: Rolling window of the last 1,000 entries — older entries are overwritten
Best for: Debugging, monitoring, understanding application behavior, triaging errors — without leaving the chat UI or opening a log viewer.
{
"Handle": "Log Analyst",
"AgentType": "eventlog",
"Models": ["default"]
}
Example Conversation
Are there any errors in the logs?
I found 3 Error-level entries in the log buffer:
- 10:42:15 — OpenCaddis.Agentic.Plugins.WebBrowser: Navigation timeout after 30000ms for https://example.com/api
- 10:38:02 — OpenCaddis.Services.EmbeddingService: Embedding API returned 429 Too Many Requests
- 10:35:47 — OpenCaddis.Agentic.Plugins.WebBrowser: Playwright browser context crashed, restarting
Show me the stats breakdown
Log buffer summary (847 entries):
- Information: 812 • Warning: 32 • Error: 3
- Top categories: OpenCaddis.Agentic.Agents (284), OpenCaddis.Services (198), OpenCaddis.Agentic.Plugins (156)
- Time span: 10:15:02 to 10:47:33
Only one Event Log agent can be active at a time. If you define multiple "eventlog" agents, the log provider will signal the first one it finds. The system prompt can be customized, but the tools are built-in and cannot be changed.
Chain of Thought Agent
The ChainOfThoughtAgent is a reasoning agent that uses an adaptive confidence-threshold loop to answer complex questions. Unlike the standard AssistantAgent which makes a single LLM call, the CoT agent breaks problems into steps, executes them, synthesizes results, and iterates until it reaches a confidence threshold — producing deeper, more thorough answers.
Agent Type: chainofthought
How It Works
If the Assess phase determines the question is trivial with high confidence, the agent skips planning entirely and returns a direct answer in a single LLM call — no overhead for simple questions.
Phases
| Phase | Purpose |
|---|---|
| Assess | Evaluates question complexity (trivial/simple/medium/complex) and initial confidence. Simple questions with high confidence take the fast path — single LLM call, no planning overhead. |
| Plan | Generates up to 5 execution steps with dependency ordering. Steps can be marked as parallelizable. Each step includes a self-contained prompt for execution. |
| Execute | Runs steps in topologically-sorted layers. Steps within the same layer that are marked canParallelize run concurrently via Task.WhenAll. Each step can optionally use configured tools. |
| Synthesize | Combines all step results into a single answer with a confidence score (0.0–1.0). Identifies information gaps. |
| Replan | If confidence is below the decaying threshold, generates targeted new steps (max half of MaxSteps) to address identified gaps. Completed results are preserved. |
| Finalize | Passes the synthesized answer through the main agent to produce a clean, user-facing response written into conversation history for follow-up context. |
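The "topologically-sorted layers" in the Execute phase can be illustrated with a small layering function: steps whose dependencies are all satisfied run in the same layer. This is a hypothetical sketch — the actual parallel execution (`Task.WhenAll` in .NET) is elided.

```python
# Sketch of topologically-sorted execution layers: a step joins a layer
# once all of its dependencies are in an earlier layer. Parallelism
# within a layer (Task.WhenAll in the real code) is elided here.

def layer_steps(steps):
    """steps: {step_id: [dependency_ids]} -> list of layers (sets of ids)."""
    done, layers = set(), []
    while len(done) < len(steps):
        layer = {s for s, deps in steps.items()
                 if s not in done and all(d in done for d in deps)}
        if not layer:
            raise ValueError("cyclic dependencies")
        layers.append(layer)
        done |= layer
    return layers
```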
Confidence Threshold Decay
The agent uses a linearly decaying confidence threshold. This ensures it tries hard early but gracefully accepts "good enough" answers rather than looping forever.
| Loop | Threshold (default config) |
|---|---|
| 1 | 0.75 |
| 2 | 0.70 |
| 3 | 0.65 |
| 4 | 0.60 |
| 5 | 0.55 |
| 6 | 0.50 |
| 7 | 0.45 |
| 8 | 0.40 |
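The table is consistent with a simple linear interpolation between the initial and minimum thresholds. The formula below is inferred from the default values, not taken from the source code.

```python
# Linear decay from `initial` at loop 1 to `minimum` at `max_loops`.
# Inferred from the documented defaults; an assumption, not the real code.

def confidence_threshold(loop, max_loops=8, initial=0.75, minimum=0.40):
    step = (initial - minimum) / (max_loops - 1)  # 0.05 per loop by default
    return initial - step * (loop - 1)
```

With the defaults, each loop lowers the bar by 0.05, reproducing the table above.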
Configuration
{
"Name": "Cot",
"AgentType": "chainofthought",
"SystemPrompt": "You are a thoughtful reasoning assistant.",
"Args": {
"ModelConfig": "default",
"MaxLoops": "8",
"MaxSteps": "5",
"InitialConfidenceThreshold": "0.75",
"MinConfidenceThreshold": "0.40",
"NetworkTimeoutSeconds": "180"
},
"Tools": ["DateTime"],
"Plugins": ["WebBrowser", "FileSystem"]
}
| Parameter | Default | Description |
|---|---|---|
| ModelConfig | "default" | Name of the model configuration in fabrcore.json to use for all LLM calls |
| MaxLoops | 8 | Maximum number of assess→plan→execute→synthesize iterations |
| MaxSteps | 5 | Maximum steps per plan. Replan is capped at half this value. |
| InitialConfidenceThreshold | 0.75 | Starting confidence threshold. Higher = more demanding first loop. |
| MinConfidenceThreshold | 0.40 | Floor for the decaying threshold. Agent exits at this confidence on the last loop. |
| NetworkTimeoutSeconds | 180 | HTTP timeout for LLM API calls (seconds) |
When to Use
Use the Chain of Thought agent for:
- Multi-faceted analysis and comparisons
- Research questions requiring synthesis
- Questions needing self-evaluation of answer quality
Prefer a simpler agent (such as Assistant) for:
- Simple Q&A and conversational interactions
- Tool-driven tasks without iterative reasoning
- Speed-sensitive workflows
Performance Characteristics
The CoT agent makes multiple LLM calls per user message. Typical performance on Azure OpenAI gpt-5-nano:
| Scenario | LLM Calls | Time | Notes |
|---|---|---|---|
| Fast path (trivial question) | 2 | ~25s | Assess + direct answer |
| Medium complexity (1 loop) | 7–8 | ~3 min | Assess + Plan + 4 steps + Synthesize + Finalize |
| Complex with replan (2 loops) | 10–12 | ~5 min | Adds Replan + additional steps + second Synthesize |
Structured Output & LLM Optimization
All internal LLM calls use structured JSON output with the following optimizations:
- ReasoningEffort = None — Disables internal chain-of-thought reasoning in the LLM, reducing latency by ~30–40% on reasoning-capable models (GPT-5, o-series).
- ReasoningOutput = None — Suppresses reasoning traces from the response body, saving output tokens.
- AllowMultipleToolCalls = false — Required for compatibility with structured JSON output on Azure OpenAI, which does not support parallel tool calls with JSON schemas.
Data Models
The agent uses structured types for LLM communication:
| Category | Type | Purpose |
|---|---|---|
| Core State | CoTState | Tracks execution: original request, working memory, plan, step results, loop count, confidence, timing |
| | ThoughtEntry | A single entry in working memory (type, content, loop iteration, confidence snapshot) |
| | PlanStep | A step in the execution plan with dependencies, parallelization flag, prompt, and result |
| LLM Output | AssessmentOutput | Complexity rating, confidence, whether planning is needed, optional direct answer |
| | PlanOutput | List of steps with rationale |
| | SynthesisOutput | Combined answer, confidence, identified gaps |
| | CoTReplanOutput | Revised steps and rationale |
Best for: Multi-faceted analysis, comparisons, research questions, anything requiring synthesis of multiple perspectives, and questions where you want the agent to self-evaluate its answer quality.
{
"Handle": "Deep Thinker",
"AgentType": "chainofthought",
"Models": ["default"],
"SystemPrompt": "You are a thoughtful reasoning assistant. Break complex problems into steps and synthesize thorough answers.",
"Args": {
"MaxLoops": "8",
"MaxSteps": "5",
"InitialConfidenceThreshold": "0.75",
"MinConfidenceThreshold": "0.40"
},
"Plugins": ["WebBrowser", "Memory"]
}
Chat History Compaction
Long conversations accumulate message history that can fill the model's context window, increasing token costs and eventually hitting size limits. Compaction solves this by automatically summarizing older messages when the conversation grows too large — keeping agents effective in long-running sessions without losing important context.
How It Works
- Estimate — Before each message, the agent estimates the total token count of all stored messages
- Threshold — If the estimate exceeds the configured threshold (default 75% of the context window), compaction triggers
- Summarize — Older messages are sent to the LLM with instructions to produce a concise summary that preserves key decisions, facts, names, numbers, outstanding tasks, and overall context
- Replace — The older messages are atomically replaced with a single [Compacted History] system message containing the summary. The most recent messages (default 20) are always kept intact
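The estimate/threshold/summarize/replace sequence can be sketched as a single function. This is a hedged illustration: `estimate_tokens` and `summarize` are stand-ins for the real token counter and LLM summarization call, and the message shapes are assumed.

```python
# Hedged sketch of the compaction steps above; `estimate_tokens` and
# `summarize` are stand-ins for the real token counter and LLM call.

def compact(messages, estimate_tokens, summarize,
            max_context=128_000, threshold=0.75, keep_last_n=20):
    total = sum(estimate_tokens(m) for m in messages)   # 1. Estimate
    if total <= max_context * threshold:                # 2. Threshold
        return messages                                 # under budget: no-op
    older, recent = messages[:-keep_last_n], messages[-keep_last_n:]
    summary = summarize(older)                          # 3. Summarize
    # 4. Replace: one summary system message + the untouched recent tail
    return [{"role": "system",
             "content": f"[Compacted History] {summary}"}] + recent
```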
Compaction is available on Assistant and Delegate agents. Workflow and Event Log agents manage their own context differently and don't use compaction.
Configuration
Compaction requires the ContextWindowTokens property set on your model configuration in fabr.json. Per-agent behavior can be tuned via Args:
| Arg | Default | Description |
|---|---|---|
| CompactionEnabled | true | Enable or disable compaction for this agent |
| CompactionKeepLastN | 20 | Number of most recent messages to always preserve (never summarized) |
| CompactionMaxContextTokens | from model config | Override the context window size for this agent (falls back to ContextWindowTokens in fabr.json) |
| CompactionThreshold | 0.75 | Trigger compaction when token usage exceeds this ratio of the context window |
{
"Handle": "Long Chat",
"AgentType": "assistant",
"Models": ["default"],
"SystemPrompt": "You are a helpful assistant.",
"Plugins": ["WebBrowser", "Memory"],
"Args": {
"CompactionEnabled": "true",
"CompactionKeepLastN": "30",
"CompactionThreshold": "0.80"
}
}
What Gets Preserved
The LLM summarization prompt is designed to preserve:
- Key decisions and conclusions
- Important facts, names, and numbers
- Outstanding tasks or open questions
- The overall topic and context of the conversation
Safety
- Tool message integrity — Compaction never splits a tool result from its parent assistant message, preventing API errors from orphaned tool messages
- Atomic replacement — The entire message history is replaced in a single operation, preventing race conditions
- Graceful failure — If compaction fails for any reason, the error is logged and the agent continues normally with the original history
Compaction is enabled by default but requires ContextWindowTokens to be set on your model in fabr.json. Without it, the agent can't determine when the context window is filling up. Set it in the Settings UI under Context Window, or add it directly to your model configuration.
Creating Agents
Here's a complete example with all five agent types:
{
"Agents": [
{
"Handle": "Quick Chat",
"AgentType": "assistant",
"Models": ["default"],
"SystemPrompt": "You are a friendly, concise assistant.",
"Plugins": ["WebBrowser", "Memory"]
},
{
"Handle": "Dev Assistant",
"AgentType": "assistant",
"Models": ["default"],
"SystemPrompt": "You are a senior software developer. Write clean, well-tested code.",
"Plugins": ["FileSystem", "PowerShell", "Docker", "Memory", "TaskManager"],
"Args": {
"FileSystem:RootPath": "C:\\Projects",
"PowerShell:TimeoutSeconds": 60
}
},
{
"Handle": "Front Desk",
"AgentType": "delegate",
"Models": ["default"],
"Description": "Routes requests to the best available agent.",
"Args": {
"ManagedAgents": "Quick Chat,Dev Assistant"
}
},
{
"Handle": "Project Lead",
"AgentType": "workflow",
"Models": ["default"],
"SystemPrompt": "You are a project coordinator. Plan and delegate work to the right agents.",
"Args": {
"ManagedAgents": "Quick Chat,Dev Assistant"
}
},
{
"Handle": "Log Analyst",
"AgentType": "eventlog",
"Models": ["default"]
},
{
"Handle": "Deep Thinker",
"AgentType": "chainofthought",
"Models": ["default"],
"SystemPrompt": "You are a thoughtful reasoning assistant.",
"Plugins": ["WebBrowser", "Memory"],
"Args": {
"MaxLoops": "8",
"InitialConfidenceThreshold": "0.75"
}
}
]
}