99Ravens AI Interviewer System - Multi-agent architecture for expert knowledge capture

In every organization, the most valuable asset is trapped. It's the hard-won, implicit knowledge inside the heads of your best people. They can't write it down in a wiki, and they don't know how to "train" an AI. It only comes out in conversation.

The default solution—a simple chatbot interviewer—fails because it produces generic AI that sounds like Wikipedia, not a person. It averages out expertise into blandness.

The only way to capture true expertise is to build an AI interviewer that earns the trust to be seen as a peer. An equal. One that asks questions so insightful the expert reveals the distinctive methodologies they'd normally only share with another seasoned professional. That is the real technical hurdle. Here is how we cleared it.

Achieving Peer Status

The entire system hinges on one question: can the AI achieve peer status?

When it does, something unlocks. Experts we've interviewed consistently say the same thing: it feels like talking to a colleague who actually gets it. There's no judgment, no time pressure, just focused curiosity. This is the space where they articulate the implicit knowledge they've never verbalized before.

A single-agent LLM can't maintain this status. It's patient, but it's not a colleague. It drifts into fascinating tangents, repeats questions, and lacks the strategic focus to separate gold from gravel. A 40-minute interview generates too much context: the agent either loses the thread or burns its context window re-reading the transcript. Either way, the illusion of talking to a peer breaks.

That failure led us to a different architecture.

The Multi-Agent Solution: Interviewer + Assistant

A single agent can be a great conversationalist, but an intelligent system needs a second layer: an executive function. This is why we moved to a multi-agent architecture. The Interviewer talks; the Note-Taker thinks.

The Note-Taker is an internal tool, invisible to the expert, that continuously analyzes the conversation. The Interviewer queries it for structured progress reports.

What the Note-Taker Tracks:

The key insight is that the Note-Taker returns structured data, not prose. This prevents the Interviewer from getting confused by a second voice. If the Note-Taker flags a gap in "decision-making frameworks," the Interviewer integrates the suggestion naturally: "You mentioned evaluating channels—walk me through a recent decision where you chose not to invest somewhere."

The Note-Taker is the working memory. The Interviewer stays in the conversation, focused on being a peer. This separation of concerns also improves security and prompt integrity.
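To make the "structured data, not prose" idea concrete, here is a minimal sketch of what such a progress report could look like. The field names and values are illustrative assumptions, not the production schema:

```python
from dataclasses import dataclass, field

@dataclass
class NoteTakerReport:
    """Structured progress report the Interviewer queries mid-conversation.

    Every field is machine-readable: the Interviewer consumes values,
    never free-form prose, so no second voice leaks into the dialogue.
    """
    elapsed_minutes: float
    topics_covered: list[str] = field(default_factory=list)
    gaps: list[str] = field(default_factory=list)      # areas still unexplored
    suggested_next_probe: str = ""                     # one concrete follow-up

# A hypothetical mid-interview snapshot:
report = NoteTakerReport(
    elapsed_minutes=22.5,
    topics_covered=["channel evaluation", "budget allocation"],
    gaps=["decision-making frameworks"],
    suggested_next_probe="Ask about a recent decision not to invest somewhere.",
)

# The Interviewer reads fields, then phrases the probe in its own voice.
focus = report.gaps[0] if report.gaps else None
```

Because the report is plain data, the Interviewer can fold a flagged gap into the conversation naturally rather than echoing another agent's wording.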

The Architecture of a Peer

We built the Interviewer agent to operate in four stages.

  1. Planning: A role-specific system prompt is loaded based on expert type. The interview topic and script are injected as variables.
  2. Interviewing: Real-time adaptive conversation. History is preserved.
  3. Analysis: The final transcript is converted into a reusable persona prompt.
  4. Validation: The expert reviews sample outputs from their persona and scores fidelity on a 1–5 scale. Low scores trigger targeted follow-up interviews.
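The four stages chain together as a simple pipeline. The sketch below stubs every step (the real agents call an LLM), so the function names and return values are assumptions for illustration only:

```python
def load_role_prompt(expert_type: str, topic: str, script: str) -> str:
    # 1. Planning: role-specific system prompt with topic/script injected.
    return f"You are interviewing a {expert_type} about {topic}. Script: {script}"

def interview(system_prompt: str) -> list[str]:
    # 2. Interviewing: adaptive conversation with history preserved (stubbed).
    return [f"system: {system_prompt}", "expert: ...answers..."]

def synthesize_persona(transcript: list[str]) -> str:
    # 3. Analysis: transcript -> reusable first-person persona prompt.
    return "I am an expert who evaluates channels by payback window first."

def passes_validation(fidelity_scores: list[int]) -> bool:
    # 4. Validation: expert scores sample outputs 1-5; any low score
    #    triggers a targeted follow-up interview.
    return min(fidelity_scores) >= 4

transcript = interview(load_role_prompt("growth marketer", "channel strategy", "(script)"))
persona = synthesize_persona(transcript)
needs_follow_up = not passes_validation([5, 4, 3])  # a 3 sends us back to stage 2
```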

The interviewer persona itself is designed with a detailed methodology and constraints. It asks a single question per turn. It gives brief acknowledgments. It references the interviewee's specific language. And it adheres to a target duration, enforced by the Note-Taker.
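Those constraints can be stated as explicit rules in the system prompt itself. This fragment is a paraphrase of the idea, not the production prompt:

```python
# Illustrative rules block; wording and parameter name are assumptions.
INTERVIEWER_RULES = """\
- Ask exactly one question per turn.
- Keep acknowledgments brief: one sentence at most.
- Reuse the interviewee's own terms when probing deeper.
- Target duration: {target_minutes} minutes; begin wrapping up
  when the note-taker reports the time budget is nearly spent.
"""

rules = INTERVIEWER_RULES.format(target_minutes=40)
```

Encoding the duration as a variable lets the same persona serve short screening chats and long deep-dives without a prompt rewrite.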

The Prize: A Reusable Digital Persona

The output is not a transcript. It is the expert's thinking and methodology codified into an AI's native language: a first-person system prompt written as if the expert is introducing themselves.

This becomes a reusable digital persona—a prompt library entry, a custom Gemini Gem or GPT, a fine-tuning dataset, or an API endpoint that runs the expert's reasoning loop. This is what we mean when we say "expertise becomes software."
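As a concrete (and entirely hypothetical) example, a first-person persona prompt drops into the system slot of any chat-style API. The persona text and helper below are illustrative, not an actual captured expert:

```python
# Hypothetical persona text; a real one is synthesized from the interview.
persona_prompt = (
    "I am a growth marketer with twelve years in B2B SaaS. "
    "When I evaluate a channel, my first question is always the payback window."
)

def ask_persona(question: str) -> list[dict]:
    # The persona is just a system prompt: pairing it with a user message
    # yields a message list any chat completion endpoint accepts.
    return [
        {"role": "system", "content": persona_prompt},
        {"role": "user", "content": question},
    ]

messages = ask_persona("Should we invest in paid social this quarter?")
```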


Lessons from the Edge

Building a system for long-form interviews surfaces hard-won rules.


Why Interviews Matter More Than Logs

Anthropic's recent findings confirm the need for this approach: they found a gap between how professionals describe their AI use (65% augmentative) and how they actually use it (49% automation).

Usage logs tell you what happened in the chat window. Interviews tell you why, how people feel, and what they truly want from AI. It's how you surface the expert's desire to preserve their professional identity—the very thing our system is designed to capture.

The Goal: Scale Expertise, Not Tasks

LLMs are not chat interfaces; they are mirrors. The fundamental choice for every builder is what we ask them to reflect: the generic "best practices" of the internet, or the irreplaceable soul of an expert?

We believe the latter is the only work worth doing. The architecture we've detailed is one path forward, grounded in the principle that to scale expertise, you must first build a system that can truly listen.

We're continuing to refine these methods for the most complex domains—where knowledge is hardest to document and most valuable when scaled. We share these findings openly, in the hope they help others build more authentic, valuable AI. The work continues.



Ready to scale your organization's expertise? Request a demo to see how 99Ravens can transform your best people's knowledge into reusable AI capabilities.
