A generalizable approach to building conversation-derived personalized psychoeducation
We want to test a simple idea: our words in therapy contain measurable, structured information. If we can reliably extract evidence from real session language, we can turn it into personalized psychoeducation that stays grounded in what was actually said, and test it like any other measurement tool or intervention.
Why conversations are powerful data
Self-report questionnaires capture what someone believes about their inner life. Therapy conversations capture something different: how someone actually talks about their experience in interaction, both when invited to reflect and when they bring something up spontaneously.
If we treat sessions as data (with speaker turns and timestamps), we can extract verbatim evidence, summarize patterns, and quantify change over time. That same evidence can also power personalized psychoeducation tools that feel specific to the person because they are literally grounded in their own words.
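As a concrete illustration, here is a minimal sketch of what "sessions as data" might look like in practice. The field names (speaker, start, text) and the keyword-based excerpt lookup are illustrative assumptions, not a fixed schema:

```python
from dataclasses import dataclass

@dataclass
class Turn:
    speaker: str   # "therapist" or "client"
    start: float   # seconds from session start
    text: str      # verbatim transcript text for this turn

# A session is just an ordered list of turns.
session = [
    Turn("therapist", 12.0, "What was going through your mind when that happened?"),
    Turn("client", 18.5, "Honestly, I just felt this knot in my stomach and went blank."),
]

def client_excerpts(turns, keyword):
    """Return verbatim client excerpts containing a keyword, with their timestamps."""
    return [(t.start, t.text) for t in turns
            if t.speaker == "client" and keyword.lower() in t.text.lower()]

print(client_excerpts(session, "knot"))
# [(18.5, 'Honestly, I just felt this knot in my stomach and went blank.')]
```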
What we’re building and testing
Conversation-derived personalized psychoeducation means: tools that turn real session language into structured outputs backed by quotes from the transcript (so a clinician can audit them).
We use two simple “units” of conversation:
- Prompted windows: a therapist question plus the client’s next 1–3 turns.
- Spontaneous windows: client turns that introduce internal states without a direct prompt.
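A minimal sketch of how these two window types could be pulled from an ordered list of turns. The question heuristic (a therapist turn ending in "?") and the internal-state word list are stand-ins for the actual extraction rules, not the rules themselves:

```python
from typing import List, Tuple

# Each turn is (speaker, text); speaker is "therapist" or "client".
Turn = Tuple[str, str]

# Placeholder lexicon; real extraction rules would be more careful than a word list.
INTERNAL_STATE_WORDS = {"feel", "felt", "anxious", "numb", "angry", "ashamed"}

def is_question(text: str) -> bool:
    """Crude heuristic: treat a turn ending in '?' as a prompt."""
    return text.strip().endswith("?")

def prompted_windows(turns: List[Turn], max_client_turns: int = 3):
    """A therapist question plus the client's next 1-3 turns."""
    windows = []
    for i, (speaker, text) in enumerate(turns):
        if speaker == "therapist" and is_question(text):
            replies = []
            for spk, txt in turns[i + 1:]:
                if spk != "client" or len(replies) == max_client_turns:
                    break
                replies.append(txt)
            if replies:
                windows.append((text, replies))
    return windows

def spontaneous_windows(turns: List[Turn]):
    """Client turns that mention internal states without a directly preceding therapist question."""
    hits = []
    for i, (speaker, text) in enumerate(turns):
        prompted = i > 0 and turns[i - 1][0] == "therapist" and is_question(turns[i - 1][1])
        if speaker == "client" and not prompted and any(w in text.lower() for w in INTERNAL_STATE_WORDS):
            hits.append(text)
    return hits
```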
From those windows, we generate:
- Evidence-linked session review: short, readable summaries of key themes, linked to supporting excerpts.
- Psychoeducation cards: brief explanations and prompts tailored to what showed up in session (with evidence links).
- Guided annotation / practice: brief, structured exercises built from excerpts (e.g., labeling or rephrasing).
- Simple signals over time: interpretable language summaries (themes, vocabulary, consistency).
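One way to keep every output auditable is to attach evidence spans directly to the generated object. The schema below is a hypothetical sketch of a psychoeducation card with evidence links, not the actual format:

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Evidence:
    turn_index: int   # which turn in the session the quote comes from
    quote: str        # verbatim excerpt, never a paraphrase

@dataclass
class PsychoeducationCard:
    theme: str                 # e.g. "naming physical signals of stress"
    explanation: str           # short, plain-language psychoeducation text
    reflection_prompt: str     # a question the client can revisit between sessions
    evidence: List[Evidence] = field(default_factory=list)  # quotes that justify the card

card = PsychoeducationCard(
    theme="naming physical signals of stress",
    explanation="Noticing where stress shows up in the body can make it easier to name early.",
    reflection_prompt="Where did you notice it in your body this week?",
    evidence=[Evidence(turn_index=7,
                       quote="I just felt this knot in my stomach and went blank.")],
)
```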
Phase 1: Does this work in the real world?
Before outcome trials, we check whether the pipeline works and whether people trust it.
We also test whether the same extraction rules produce usable, auditable outputs across different sites and clinician styles, without retuning for each setting.
- Feasibility: can sites generate transcripts with appropriate governance, and can outputs be produced fast enough to use.
- Auditability: do the quotes actually support the output (clinician spot-check; a minimal automated check is sketched after this list).
- Use and trust: do clinicians use it; do clients return to it; does it feel helpful (not creepy).
- Early signals: do language signals move in plausible directions, without claiming symptom change yet.
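The auditability check above can start very simply: verify that every cited quote appears verbatim in the transcript it claims to come from, and route anything that fails to a clinician. The whitespace/case normalization below is an assumption; this is a sketch, not the full check:

```python
import re

def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so formatting differences don't fail the check."""
    return re.sub(r"\s+", " ", text.lower()).strip()

def quote_is_supported(quote: str, transcript_turns: list) -> bool:
    """True if the quote appears verbatim (up to case/whitespace) in some transcript turn."""
    q = normalize(quote)
    return any(q in normalize(turn) for turn in transcript_turns)

def unsupported_quotes(cited_quotes: list, transcript_turns: list) -> list:
    """Quotes that could NOT be traced back to the transcript; flag these for clinician review."""
    return [q for q in cited_quotes if not quote_is_supported(q, transcript_turns)]
```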
Phase 2: Does it improve outcomes?
If Phase 1 works, we run condition-specific studies with validated outcomes.
- Alexithymia: PAQ-24 plus ALI evidence markers (as a transcript-grounded complement to self-report).
- Other conditions: use the standard validated outcomes for that condition.
- Optional benchmark (performance-based): LEAS (Levels of Emotional Awareness Scale) as an additional reference point for emotional awareness alongside self-report. See Lane & Smith (2021).
The pipeline stays the same. Only the outcomes change.
In Phase 2, language-derived signals (themes, emotion vocabulary, evidence markers) should be treated as process/measurement markers, not as “symptom improvement” unless they are explicitly validated against clinical outcomes in the studied population.
How we talk about results (responsibly)
- Before outcome trials: talk about session evidence and transparency, reliability of extraction, and usability, without claiming symptom reduction.
- After Phase 1: talk about feasibility, trust, and safety checks (pre-registered), plus early measurable signals.
- After Phase 2: talk about condition-specific outcomes, scoped strictly to the population and measures studied.
Who we’re looking for
- US-based academic researchers (or clinical research groups) who can run IRB-approved studies with therapy-session transcripts.
- Labs working on emotion, language, or psychotherapy process who are excited about turning conversations into measurable, auditable signals.
- Graduate students / postdocs who want a staged project with publishable Phase 1 endpoints (feasibility + auditability + trust).
- Clinics / training programs willing to evaluate adoption, workflow fit, and therapist-controlled rollout in real practice.
We prefer US-based partners because HIPAA compliance and data export logistics are usually simpler, but we’re open to conversations and collaboration with researchers outside the US as well.
Ethics & privacy
This work involves therapy-session transcripts and sensitive text. Any publishable study should use clear participant consent and an appropriate IRB determination. The design should prioritize data minimization, therapist control, and the ability to trace outputs back to specific excerpts (so errors can be audited and corrected).
For governance details, see our Privacy & Compliance guide.