99Ravens AI Interviewer System - Multi-agent architecture for expert knowledge capture

In every organization, the most valuable asset is trapped. It's the hard-won, implicit knowledge inside the heads of your best people. They can't write it down in a wiki, and they don't know how to "train" an AI. It only comes out in conversation.

The default solution—a simple chatbot interviewer—fails because it produces generic AI that sounds like Wikipedia, not a person. It averages out expertise into blandness.

The only way to capture true expertise is to build an AI interviewer that earns the trust to be seen as a peer. An equal. One that asks questions so insightful the expert reveals the distinctive methodologies they'd normally only share with another seasoned professional. That is the real technical hurdle. Here is how we cleared it.

Achieving Peer Status

The entire system hinges on one question: can the AI achieve peer status?

When it does, something unlocks. Experts we've interviewed consistently say the same thing: it feels like talking to a colleague who actually gets it. There's no judgment, no time pressure, just focused curiosity. This is the space where they articulate the implicit knowledge they've never verbalized before.

A single-agent LLM can't maintain this status. It's patient, but it's not a colleague. It drifts into fascinating tangents, repeats questions, and lacks the strategic focus to separate gold from gravel. A 40-minute interview generates too much context: the agent either loses the thread or burns its context window re-reading the transcript. Either way, the illusion of talking to a peer breaks.

That failure led us to a different architecture.

The Multi-Agent Solution: Interviewer + Assistant

A single agent can be a great conversationalist, but an intelligent system needs a second layer: an executive function. This is why we moved to a multi-agent architecture. The Interviewer talks; the Note-Taker thinks.

The Note-Taker is an internal tool, invisible to the expert, that continuously analyzes the conversation. The Interviewer queries it for structured progress reports.

What the Note-Taker Tracks:

The key insight is that the Note-Taker returns structured data, not prose. This prevents the Interviewer from getting confused by a second voice. If the Note-Taker flags a gap in "decision-making frameworks," the Interviewer integrates the suggestion naturally: "You mentioned evaluating channels—walk me through a recent decision where you chose not to invest somewhere."

The Note-Taker is the working memory. The Interviewer stays in the conversation, focused on being a peer. This separation of concerns also improves security and prompt integrity.
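To make the "structured data, not prose" idea concrete, here is a minimal sketch of what such a progress report could look like. The field names and values are illustrative assumptions, not the production schema:

```python
from dataclasses import dataclass, field

@dataclass
class NoteTakerReport:
    """Structured progress report the Interviewer queries mid-conversation.

    Every field is machine-readable: the Interviewer consumes values,
    never free-form prose, so no second voice leaks into the dialogue.
    """
    elapsed_minutes: float
    topics_covered: list[str] = field(default_factory=list)
    gaps: list[str] = field(default_factory=list)      # areas still unexplored
    suggested_next_probe: str = ""                     # one concrete follow-up

# A hypothetical mid-interview snapshot:
report = NoteTakerReport(
    elapsed_minutes=22.5,
    topics_covered=["channel evaluation", "budget allocation"],
    gaps=["decision-making frameworks"],
    suggested_next_probe="Ask about a recent decision not to invest somewhere.",
)

# The Interviewer reads fields, then phrases the probe in its own voice.
focus = report.gaps[0] if report.gaps else None
```

Because the report is plain data, the Interviewer can fold a flagged gap into the conversation naturally rather than echoing another agent's wording.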

The Architecture of a Peer

We built the Interviewer agent to operate in four stages.

  1. Planning: A role-specific system prompt is loaded based on expert type. The interview topic and script are injected as variables.
  2. Interviewing: Real-time adaptive conversation. History is preserved.
  3. Analysis: The final transcript is converted into a reusable persona prompt.
  4. Validation: The expert reviews sample outputs from their persona and scores fidelity on a 1–5 scale. Low scores trigger targeted follow-up interviews.
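The four stages chain together as a simple pipeline. The sketch below stubs every step (the real agents call an LLM), so the function names and return values are assumptions for illustration only:

```python
def load_role_prompt(expert_type: str, topic: str, script: str) -> str:
    # 1. Planning: role-specific system prompt with topic/script injected.
    return f"You are interviewing a {expert_type} about {topic}. Script: {script}"

def interview(system_prompt: str) -> list[str]:
    # 2. Interviewing: adaptive conversation with history preserved (stubbed).
    return [f"system: {system_prompt}", "expert: ...answers..."]

def synthesize_persona(transcript: list[str]) -> str:
    # 3. Analysis: transcript -> reusable first-person persona prompt.
    return "I am an expert who evaluates channels by payback window first."

def passes_validation(fidelity_scores: list[int]) -> bool:
    # 4. Validation: expert scores sample outputs 1-5; any low score
    #    triggers a targeted follow-up interview.
    return min(fidelity_scores) >= 4

transcript = interview(load_role_prompt("growth marketer", "channel strategy", "(script)"))
persona = synthesize_persona(transcript)
needs_follow_up = not passes_validation([5, 4, 3])  # a 3 sends us back to stage 2
```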

The interviewer persona itself is designed with a detailed methodology and constraints. It asks a single question per turn. It gives brief acknowledgments. It references the interviewee's specific language. And it adheres to a target duration, enforced by the Note-Taker.
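Those constraints can be stated as explicit rules in the system prompt itself. This fragment is a paraphrase of the idea, not the production prompt:

```python
# Illustrative rules block; wording and parameter name are assumptions.
INTERVIEWER_RULES = """\
- Ask exactly one question per turn.
- Keep acknowledgments brief: one sentence at most.
- Reuse the interviewee's own terms when probing deeper.
- Target duration: {target_minutes} minutes; begin wrapping up
  when the note-taker reports the time budget is nearly spent.
"""

rules = INTERVIEWER_RULES.format(target_minutes=40)
```

Encoding the duration as a variable lets the same persona serve short screening chats and long deep-dives without a prompt rewrite.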

The Prize: A Reusable Digital Persona

The output is not a transcript. It is the expert's thinking and methodology codified into an AI's native language: a first-person system prompt written as if the expert is introducing themselves.

This becomes a reusable digital persona—a prompt library entry, a custom Gemini Gem or GPT, a fine-tuning dataset, or an API endpoint that runs the expert's reasoning loop. This is what we mean when we say "expertise becomes software."
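As a concrete (and entirely hypothetical) example, a first-person persona prompt drops into the system slot of any chat-style API. The persona text and helper below are illustrative, not an actual captured expert:

```python
# Hypothetical persona text; a real one is synthesized from the interview.
persona_prompt = (
    "I am a growth marketer with twelve years in B2B SaaS. "
    "When I evaluate a channel, my first question is always the payback window."
)

def ask_persona(question: str) -> list[dict]:
    # The persona is just a system prompt: pairing it with a user message
    # yields a message list any chat completion endpoint accepts.
    return [
        {"role": "system", "content": persona_prompt},
        {"role": "user", "content": question},
    ]

messages = ask_persona("Should we invest in paid social this quarter?")
```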


Lessons from the Edge

Building a system for long-form interviews surfaces hard-won rules.


Why Interviews Matter More Than Logs

Anthropic's recent findings confirm the need for this approach: they found a gap between how professionals describe their AI use (65% augmentative) and how they actually use it (49% automation).

Usage logs tell you what happened in the chat window. Interviews tell you why, how people feel, and what they truly want from AI. It's how you surface the expert's desire to preserve their professional identity—the very thing our system is designed to capture.

The Goal: Scale Expertise, Not Tasks

LLMs are not chat interfaces; they are mirrors. The fundamental choice for every builder is what we ask them to reflect: the generic "best practices" of the internet, or the irreplaceable soul of an expert?

We believe the latter is the only work worth doing. The architecture we've detailed is one path forward, grounded in the principle that to scale expertise, you must first build a system that can truly listen.

We're continuing to refine these methods for the most complex domains—where knowledge is hardest to document and most valuable when scaled. We share these findings openly, in the hope they help others build more authentic, valuable AI. The work continues.



Ready to scale your organization's expertise? Request a demo to see how 99Ravens can transform your best people's knowledge into reusable AI capabilities.
