Featured image for: Using AI to Synthesize Customer Interviews at Scale
AI & Product

Using AI to Synthesize Customer Interviews at Scale

Running customer interviews is only half the work. The half that usually doesn't get done — or gets done badly — is synthesis: turning a stack of transcripts into clear, actionable insights that actually influence what gets built.

The bottleneck is time. A 45-minute interview produces 6,000–8,000 words of transcript. Ten interviews is 60,000–80,000 words — roughly a short novel. Reading all of it carefully, identifying patterns, coding themes, and writing up findings is 15–20 hours of work that most PMs don't have.

AI changes the maths significantly. Here's how to use it well.

What Good Synthesis Produces

Before getting into the workflow, it helps to be precise about what synthesis is trying to achieve. Good interview synthesis produces:

Theme inventory — A list of topics that appeared across interviews, ranked by frequency (how many sessions each topic appeared in, not just how many times someone mentioned it).

Insight statements — Declarative sentences that express what you learned: "Users treat the export function as their primary way of sharing work with stakeholders outside the platform, even when native sharing exists, because they don't trust recipient access will work." Not "users mentioned export a lot."

Supporting evidence — Specific quotes that illustrate each insight. These are the citations you'll use in PRDs and roadmap discussions. Without them, insights are just assertions.

Segment differentiation — Whether the same insight holds across all user types or is specific to a segment. Synthesis that doesn't separate "all users" from "enterprise users" or "power users" from "occasional users" produces misleadingly uniform findings.

AI can produce all four of these reliably when given well-structured inputs. The quality depends heavily on the quality of the transcripts and the clarity of your synthesis goals.

Preparing Your Inputs

The single most impactful thing you can do before running AI synthesis is clean and label your transcripts.

Label each transcript with:

  • Participant ID or pseudonym
  • User segment (if you have multiple)
  • Interview date
  • Key product areas discussed

Remove:

  • Scheduling small talk from the beginning and end
  • Moderator instructions that weren't part of the substantive discussion
  • Irrelevant tangents (if they're clearly off-topic)

Don't:

  • Edit the substance of what participants said
  • Remove hesitations, corrections, or qualifications — these carry signal
  • Combine multiple interviews into a single document (process them separately, then synthesise across)

This takes 5–10 minutes per transcript. It dramatically improves AI output quality.

The Synthesis Workflow

Step 1: Per-interview summary (AI)

For each transcript, generate a structured summary using a consistent prompt:

Summarise this user interview transcript. Include:
1. Participant profile (what we know about them from the session)
2. Key topics discussed (bullet list)
3. Top 3–5 insights (declarative statements, not summaries)
4. Best 3–5 quotes (verbatim, with context)
5. Anything surprising or contradictory

Interview transcript:
[paste transcript]

This generates a structured single-page summary for each interview. Save these.

Step 2: Cross-interview theme extraction (AI)

Once you have per-interview summaries, run a second pass across all of them:

I have summaries from [N] customer interviews. Identify:
1. Themes that appear in 3 or more interviews (with frequency count)
2. Themes specific to a single user segment
3. Contradictions or tensions between what different participants said
4. Any insights that appear only once but are particularly significant

Interview summaries:
[paste all summaries]

This produces your theme inventory. The frequency counts will need verification against your raw summaries, but they're a reliable starting point.

Step 3: Insight articulation (Human + AI)

Take the theme inventory and, for each theme, articulate an insight statement. This is where human judgment enters most importantly.

AI can draft the insight statement: "Draft an insight statement for the theme 'users struggle with the search interface' based on these quotes: [quotes]."

You validate: Does this accurately represent what users said? Does it reflect the nuance of different user segments? Is it more specific than a generic observation?

A good insight statement is falsifiable — you could imagine evidence that would contradict it — and actionable — it points toward a specific product direction.

Step 4: Evidence assembly (AI-assisted)

For each insight, assemble the supporting quotes. Ask AI to find the 3–5 quotes across all transcripts that best illustrate each insight.

Review these manually. AI finds statistically representative quotes; you may want quotes that are more vivid, more specific, or better matched to your audience's context.

Step 5: Segment analysis (Human judgment)

Review your theme inventory and insight list through the lens of user segments. Ask:

  • Does this insight hold for all segments, or primarily for one?
  • Are there insights that are contradictory across segments?
  • Which segments are over- or under-represented in the data?

This is manual work. AI doesn't have reliable access to the business context (which segments are most valuable to your product) needed to weight this analysis correctly.

Handling Large Interview Corpora

For teams running ongoing research programs — quarterly user panels, monthly interviews, continuous NPS follow-ups — the volume quickly exceeds what you can synthesise per-study.

The approach that scales: maintain a living insight library.

Structure: A database where each row is an insight statement, with columns for: supporting quotes, participant segments, date range, confidence level (how many sessions support this), and links to the full transcripts.

Workflow: After each study, add new insights to the library. For existing insights that a new study confirms, update the frequency count and add new supporting quotes. For insights that a new study contradicts, flag them as "challenged" with the new evidence.

Retrieval: When writing a PRD, query the insight library for the relevant product area. AI can help match a query ("what do we know about the export workflow?") to the relevant insights, even if the phrasing differs.

This is the foundation of the "product second brain" — a connected, queryable body of evidence that grows more valuable as it accumulates.

The Common Synthesis Errors

Averaging across segments — Insights that are true for 80% of users but critically wrong for your highest-value 20% will mislead. Always segment before concluding.

Confirming rather than discovering — Running AI synthesis with implicit assumptions baked into the prompts. "Find evidence that users want faster search" is confirmation; "What do users say about their experience with search?" is discovery.

Insight inflation — Calling everything an insight. Observations ("users mentioned export in 8 of 10 interviews") are not insights. Insights explain something: "Users treat export as a trust mechanism because they don't believe recipients will navigate the sharing permissions correctly."

Stale library — Insights age. A finding from 18 months ago about user mental models may have changed as your product evolved. Date your insights and flag ones that are more than 12 months old for re-validation.


AI synthesises faster. You decide what it means. The combination — AI for coverage and efficiency, human for interpretation and judgment — produces better synthesis than either alone. The investment is in the habit of doing it consistently rather than letting the transcripts pile up.

Want PRDs grounded in your actual customer data?

PMRead ingests your interviews, Slack threads, and feedback files — and generates PRDs backed by real evidence, not guesses.

Try PMRead free →