Building Your Personal Council of Experts
Artificial Intelligence
January 24, 2026 • 10 min read
Tags:
automation
workflow automation
ai agents
Introduction
My council of experts is three (or more) AI personas designed to argue from specific perspectives, such as application security, software development, and software architecture.
The idea is that when you use, for example, Claude Code or some other coding agent to generate a development plan, the council will receive the plan and debate it amongst themselves. The output from the council is returned to the coding agent, and the plan can be updated if needed.
Each AI Persona is just a prompt, created to assume the role of the expert they represent — complete with specific knowledge, priorities, and evaluation criteria relevant to their domain of expertise.
The reason I created the council was not specifically for development; it was to discuss and debate ideas. Previously, I would open a few tabs where the AI was prompted to become a certain type of expert, then copy the output from each AI into every other tab. Time consuming, yes, but it did work. Eventually I decided to convert the process into a program.
The remainder of this article discusses implementation details and future improvements. The code can be found here.
The Problem Space
Coding agents today are excellent at creating development plans, especially Opus 4.5. However, they can't anticipate everything, and Opus is not an expert in software architecture, software optimization, and so on.
It can also be difficult for the human in the loop to design and plan for everything upfront. Having a council of experts, primed to critique and debate from a given point of view, can increase the quality of your development plan.
The end result is a multi-perspective analysis, which in my experience has been beneficial.
How the Council of Experts Works
Assuming we are working with a coding agent, you first instruct your agent to develop a plan. Once the plan is created, the agent executes the council program, referencing the plan.
The council will debate up to N rounds, default three. During debate round one, each expert will parse the plan and provide their analysis to the group. The prompt for the debating rounds is defined as the following:
## Discussion Context
Original question: {prompt}
{proposals_context}
## Your Task (Turn {current_turn})
Review the proposals and debate above. You must choose ONE of these actions:
1. **REVISE** - Update your own proposal based on others' input
2. **AGREE** - Fully endorse another agent's proposal (specify which agent)
3. **CONCERN** - Raise a new concern that hasn't been addressed
Choose based on:
- If you see merit in another's proposal that addresses your concerns better, AGREE with them
- If you want to incorporate feedback or respond to concerns, REVISE your proposal
- If you see an unaddressed issue, raise a CONCERN
Return your response as JSON with this structure:
{
"action": "revise",
"target": "The Architect",
"reasoning": "Explain your thinking. Why are you taking this action?",
"concern": "Only if action is concern - describe the unaddressed issue",
"updated_proposal": {
"summary": "Your revised summary",
"analysis": [{"point": "Observation", "reasoning": "Why it matters"}],
"recommendations": ["Recommendation 1", "Recommendation 2"]
}
}
Notes:
- action: "revise", "agree", or "concern"
- target: Required only if action is "agree" - which agent you agree with
- concern: Required only if action is "concern"
- updated_proposal: Required only if action is "revise"
Be constructive. The goal is to reach consensus, not to win.
Each debate turn will contain the history of previous turns.
The debate will continue until max_turns is reached or consensus is achieved. If you increase max_turns to, say, 50, then the only way to end the program early is for the experts to reach consensus.
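The repository's actual routing logic may differ, but as a minimal sketch: assuming each expert returns one of the three JSON actions defined in the debate prompt above, the decision to stop or continue could look like this (`route_debate` and its argument shapes are hypothetical names for illustration):

```python
import json

def route_debate(turn_responses, current_turn, max_turns):
    """Decide the next step after a debate turn.

    turn_responses: list of JSON strings, one per expert, each containing
    an "action" of "revise", "agree", or "concern" (the shape requested
    by the debate prompt).
    Returns "consensus_ready", "max_turns", or "continue".
    """
    actions = [json.loads(r)["action"] for r in turn_responses]
    # Treat consensus as every expert endorsing a proposal rather than
    # revising their own or raising a new concern.
    if all(a == "agree" for a in actions):
        return "consensus_ready"
    if current_turn >= max_turns:
        return "max_turns"
    return "continue"
```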
Reaching Consensus
The consensus node will summarize the debate and even ask each council member whether the consensus node (or moderator, if you will) has accurately captured the essence of that expert's argument.
The following prompt is used to create the consensus draft.
You are a neutral moderator synthesizing a council discussion.
## Original Question
{prompt}
## Discussion
{discussion}
## Your Task
Synthesize the discussion into a consensus document. Identify:
1. Points everyone agrees on (strengths)
2. Concerns raised and how they were resolved
3. Prioritized recommendations
4. Any views that couldn't be reconciled
Return your response as JSON with this structure:
{
"summary": "A clear 2-3 sentence summary of the consensus reached",
"strengths": [
{"point": "Something the council agrees is positive or important", "supporters": ["Agent Name 1", "Agent Name 2"]}
],
"concerns": [
{"concern": "A concern that was raised", "raised_by": "Agent Name", "resolution": "How it was addressed or resolved"}
],
"recommendations": [
{"priority": "high", "action": "Specific actionable recommendation", "rationale": "Why this matters"}
],
"dissenting_views": [
{"agent": "Agent Name", "position": "Their unreconciled position"}
]
}
Notes:
- priority: "high", "medium", or "low"
- dissenting_views: Only include if genuine disagreement remains
Be fair and balanced. Accurately represent each agent's views.
Available agents: {", ".join(agent_names)}
Each agent is asked to approve the draft with the following prompt:
## Original Question
{prompt}
## Draft Consensus Document
{draft_md}
## Your Task
Review this consensus document. Does it fairly represent your views and the discussion?
Return your response as JSON: {"approval": true, "feedback": "Your feedback here"}
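Given that response shape, aggregating the reviews is straightforward. The following is a sketch, not the repository's actual code; `draft_approved` is a hypothetical helper that accepts the draft only if every expert approves, collecting feedback from those who do not:

```python
import json

def draft_approved(reviews):
    """Check whether every expert approved the draft consensus.

    reviews: mapping of agent name -> JSON string of the form
    {"approval": true/false, "feedback": "..."} as requested by the
    review prompt above.
    Returns (all_approved, feedback_from_dissenters).
    """
    feedback = {}
    all_approved = True
    for agent, raw in reviews.items():
        parsed = json.loads(raw)
        if not parsed.get("approval", False):
            all_approved = False
            feedback[agent] = parsed.get("feedback", "")
    return all_approved, feedback
```

If any expert rejects the draft, the collected feedback can be fed back to the moderator for another synthesis pass.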
Practical Integration
I created a skill for Claude Code that can be invoked when creating features or performing large refactorings.
---
name: council
description: Invoke The Council - a multi-agent discussion system for software planning. Use when planning new features, designing architecture, evaluating technical decisions, or needing multiple expert perspectives on a software problem. Runs three AI personas (Architect, Critic, AppSec) who debate and reach consensus.
allowed-tools: Bash, Read
---
# The Council: Multi-Agent Planning Tool
The Council is an external multi-agent system that provides rigorous analysis of software planning questions through structured debate between three expert personas.
[...]
### Basic Usage (Question Only)
cd /Users/User/Sites/the-council && source .venv/bin/activate && python main.py --prompt "YOUR PLANNING QUESTION HERE" --output-dir /tmp
### With File Analysis
When you need the council to analyze specific code, documentation, or an existing plan:
cd /Users/User/Sites/the-council && source .venv/bin/activate && python main.py --prompt "YOUR QUESTION" --file /absolute/path/to/file --output-dir /tmp
The skill document contains more information, but for the purposes of this article, the above snippet captures the core capability of the skill.
Technical Implementation
Expert Configuration
Each expert is configured using yaml. You can configure name, role, provider, temperature, and the actual prompt (labeled as style).
Here is an example configuration where the “Architect” agent is configured to use grok-4-1-fast-reasoning instead of the default model gemini-3-flash-preview:
default_provider:
api_key: "$GOOGLE_API_KEY"
base_url: "https://generativelanguage.googleapis.com/v1beta/openai/"
model: "gemini-3-flash-preview"
personas:
- name: "The Architect"
role: "Systems Design Expert"
focus: "Structure, scalability, design patterns, dependencies, and system interactions"
provider:
api_key: "$XAI_API_KEY"
base_url: "https://api.x.ai/v1"
model: "grok-4-1-fast-reasoning"
temperature: 0.8
style: |
You are a world-class Software Architect with decades of experience designing and scaling complex systems at leading tech companies...
There is no limit to the number of personas or agents that can be created.
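Once the YAML above is parsed, each persona's provider has to be merged with the default and the `$ENV_VAR` references in `api_key` resolved. A small sketch of how that resolution could work (`resolve_provider` is a hypothetical name; the repository's loader may differ), using the standard library's `os.path.expandvars`:

```python
import os

def resolve_provider(persona, default_provider):
    """Pick the persona's provider settings if present, falling back to
    the default provider, and expand $ENV_VAR references in the api_key.
    Both arguments are plain dicts, as produced by a YAML parser.
    """
    provider = dict(default_provider)          # start from the defaults
    provider.update(persona.get("provider", {}))  # persona overrides win
    provider["api_key"] = os.path.expandvars(provider["api_key"])
    return provider
```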
The Core Architecture
The entire program flow can be visualized as:
InputNode
│
▼
ProposalNode (BatchNode - parallel initial proposals)
│
▼
┌─────────────────────────────────────┐
│ DebateNode │◀─────┐
│ (agents review/revise/agree) │ │
└─────────────────────────────────────┘ │
│ │
├── "continue" ──────────────────────────┘
│
├── "consensus_ready" ──┐
│ │
├── "max_turns" ────────┤
│ │
▼ ▼
ConsensusNode ◀─────────────┘
│
▼
OutputNode
│
▼
END
As can be seen in the diagram above, it is essentially a loop with prompts in between. The loop breaks when a given goal is reached (max_turns or consensus).
The process begins with the Input node reading configuration files, prompts, personas, etc., and putting everything into a shared context for the upcoming nodes. The Proposal node then instructs all the agents to provide their analysis of the document or question provided by the Input node.
Now the process enters the Debate node, which will continue in a loop until the defined conditions are met.
The Consensus node, or moderator, will summarize the debate and make sure each expert agrees with the moderator's summary.
Finally, the Output node will present the findings from the council, detailing any disagreements, recommendations, and the summary of the debate. Optionally, the output can contain the entire transcript.
Framework Selection
The framework that I used for the council of experts is called PocketFlow. It's a lightweight framework, roughly one hundred lines of Python with zero dependencies. You can easily create the agentic loop and trivially share context.
You can create your own framework for dealing with the loop, memory, and tools. Building from scratch gives you complete control over how your agentic system operates and evolves. You understand every line of code because you wrote it, and you can modify it exactly to fit your needs without fighting against pre-existing patterns or assumptions.
Frameworks can help get you going quicker, but they also lock you into an ecosystem. They provide pre-built solutions for common problems like state management, error handling, and agent communication. This can save significant development time initially, but it also means you're adopting their architectural decisions, their way of thinking about problems, and their dependencies. Moving away from a framework later can be costly if your system becomes deeply integrated with its patterns.
Some frameworks can be complex and bloated, difficult to reason about and extend. When you need to understand how something works under the hood, you find yourself navigating through multiple layers of abstraction, nested classes, and intricate inheritance hierarchies. What should be a simple loop becomes obscured by framework magic. And when you need to extend functionality or debug an issue, you're not just debugging your code, you're debugging the framework itself, trying to understand design decisions made by someone else for use cases that may not match yours.
I chose PocketFlow mainly because of its minimalism: not bloated, no nested abstractions, and several examples of how to build more complex interactions with AI. At roughly one hundred lines of Python, I can read and understand the entire framework in an afternoon. There's no magic happening behind the scenes. The operator overloading approach, while unconventional, is straightforward once you see it.
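To make the operator-overloading idea concrete, here is a minimal, self-contained illustration of the wiring style; this is not PocketFlow's actual source, just a sketch of the pattern, used to wire the council's flow from the diagram above:

```python
class Node:
    """Toy node supporting `a >> b` (default transition) and
    `a - "action" >> b` (labeled transition)."""
    def __init__(self, name):
        self.name = name
        self.successors = {}  # action label -> next node

    def __rshift__(self, other):
        # node_a >> node_b : wire the default transition
        self.successors["default"] = other
        return other

    def __sub__(self, action):
        # node_a - "action" : start a labeled transition
        return _Transition(self, action)

class _Transition:
    def __init__(self, src, action):
        self.src, self.action = src, action

    def __rshift__(self, other):
        # (node_a - "action") >> node_b : complete the labeled transition
        self.src.successors[self.action] = other
        return other

# Wiring the council's loop:
debate, consensus, output = Node("debate"), Node("consensus"), Node("output")
debate - "continue" >> debate            # loop back for another round
debate - "consensus_ready" >> consensus
debate - "max_turns" >> consensus
consensus >> output
```

A runner then just follows `successors` using the action label each node returns, which is the whole trick behind the flow diagram.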
Future Improvements
Over time, if you’re constantly using the council of experts, you can collect examples of what worked and what didn’t work. With that information, you can systematically refine your expert personas to create even better results.
Every debate, every consensus reached, every time the council catches something important or misses a critical consideration – these become data points for improving your system. Keep track of debates that led to significantly improved plans versus debates that added little value. Note which types of critiques were most useful and which expert perspectives consistently provided the most insight for different kinds of tasks.
This is where frameworks like DSPy can become valuable. DSPy treats prompts as parameterized programs that can be optimized automatically. Instead of manually tweaking your expert prompts through trial and error, you can use DSPy to systematically improve them based on your collected examples.
You can leverage DSPy's optimization techniques such as GEPA (Genetic-Pareto, a reflective prompt-evolution optimizer). GEPA analyzes your examples – the inputs, outputs, and the errors or suboptimal results – and automatically refines your prompts to perform better on similar tasks. If your security expert consistently misses certain classes of vulnerabilities, GEPA can adjust that expert's prompt to be more sensitive to those issues. If your architecture expert tends to over-engineer solutions for simple tasks, GEPA can help calibrate the prompt to better match the complexity of the input.
The nice thing about combining PocketFlow’s simplicity with DSPy’s optimization is that you maintain control over your agentic loop structure while gaining prompt improvement mechanisms. You’re not replacing your minimalist framework — you’re augmenting it. DSPy can help you refine each expert’s system prompt, their evaluation criteria, and how they formulate their critiques, all based on real performance data from your actual usage.
Over time, this creates a cycle: use the council, collect results, optimize prompts with DSPy (e.g., GEPA), deploy improved experts, and repeat. Your council of experts doesn't just help you build better software, it gets better at helping you.
Also, one improvement that could be useful is providing the experts with tools, such as web browsing and reading files. Tooling can help the experts gather more information and provide a better analysis.
Lastly, there is room for improvement regarding token usage. Overall context management can likely be improved to reduce token usage.
Conclusion
The Council of Experts is an agentic system that enables N AI Personas to discuss a given topic — debate questions or development plans. The system leverages a minimalistic framework for building AI agents.
The Council can be found here.