Dec 2025 · Research Draft

From Words to Measurement: A Transcript-Based Alexithymia Profile (ALI) Aligned with the Perth Alexithymia Questionnaire (PAQ)

We introduce an interpretable, transcript-based profile: the Alexithymia Language Index (ALI). It is derived from clients' spontaneous language during psychotherapy, aligned with the Perth Alexithymia Questionnaire (PAQ) structure and focused on auditable, evidence-based markers rather than numeric scores.

Overview

The Alexithymia Language Index (ALI) is a transcript-based measure computed from psychotherapy conversations. It is designed to align with the facet structure of the Perth Alexithymia Questionnaire (PAQ), including Difficulty Identifying Feelings (DIF), Difficulty Describing Feelings (DDF), and Externally Oriented Thinking (EOT), while remaining transparent, auditable, and suitable for psychometric validation.

Instead of self-report items, ALI works directly with session transcripts, capturing how people actually talk about feelings when invited to do so. The current implementation focuses on constructing structured, per-session profiles grounded in verbatim excerpts; numeric scoring and validation against PAQ-24 are planned but intentionally kept out of this initial draft.

What ALI produces today

Today, ALI outputs structured profiles made of evidence markers instead of scores. Each marker represents a concrete moment in the conversation, with enough context for clinicians and researchers to inspect and debate.

  • Evidence markers: short verbatim excerpts drawn from participant transcripts, each with a brief justification and labels for DIF, DDF, or EOT, plus affect valence and direction (supports vs contradicts the facet).
  • Per-session, per-participant profiles: ALI aggregates markers into profiles at the session–participant level, rather than collapsing everything into a single global score.
  • Deterministic context windows: every stored excerpt is verified as a contiguous substring of the transcript and paired with fixed before/after windows computed deterministically.
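As a rough illustration of the marker structure and the deterministic windows described above, here is a minimal Python sketch. The `EvidenceMarker` class, its field names, the `make_marker` helper, the valence labels, and the 80-character window size are all assumptions made for this example; the draft does not specify the actual schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class EvidenceMarker:
    """One evidence marker. Field names are hypothetical."""
    excerpt: str     # verbatim span from the participant transcript
    facet: str       # "DIF", "DDF", or "EOT"
    valence: str     # affect valence label (assumed: "negative" / "positive")
    direction: str   # "supports" or "contradicts" the facet
    rationale: str   # brief justification
    start: int       # character offset of the excerpt in the transcript
    before: str      # fixed-size context preceding the excerpt
    after: str       # fixed-size context following the excerpt


def make_marker(transcript: str, excerpt: str, facet: str, valence: str,
                direction: str, rationale: str,
                window: int = 80) -> Optional[EvidenceMarker]:
    """Return a marker only if the excerpt is a contiguous substring.

    The context windows depend solely on the transcript, the match
    offset, and the window size, so repeated runs give identical output.
    A repeated excerpt resolves to its first occurrence.
    """
    if not excerpt:
        return None
    start = transcript.find(excerpt)
    if start == -1:
        return None  # not found verbatim: no marker is stored
    end = start + len(excerpt)
    return EvidenceMarker(
        excerpt=excerpt,
        facet=facet,
        valence=valence,
        direction=direction,
        rationale=rationale,
        start=start,
        before=transcript[max(0, start - window):start],
        after=transcript[end:end + window],
    )
```

Because `before`, `excerpt`, and `after` are sliced from adjacent positions, concatenating them always reproduces a contiguous stretch of the transcript, which is what makes a stored marker checkable after the fact.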

Method in brief

The paper walks through a minimal, auditable extraction pipeline that turns raw transcripts into ALI profiles. At a high level, ALI identifies opportunities to talk about feelings, reads responses to those opportunities, and tags language that either supports or contradicts PAQ-style alexithymia patterns.

  1. Transcript bundling: build participant-level transcript bundles that combine the best available participant transcript with the full-session transcript for context.
  2. Facet-wise extraction: for each PAQ facet and valence, a language model proposes candidate evidence spans with facet labels, direction, and short rationales.
  3. Verification and normalization: only excerpts that can be matched exactly in the participant transcript are kept; near-misses and hallucinated spans are dropped rather than repaired in place.
  4. Profile construction: verified markers are flattened into an ALI profile that can later support scoring, visualization, and validation analyses.
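Steps 3 and 4 can be sketched roughly as follows. This is an illustration under stated assumptions, not the paper's implementation: the candidate dictionaries stand in for whatever the language model returns in step 2 (no model call is shown), and `build_profile`, the key names, and the `n_dropped` counter are hypothetical.

```python
from typing import Any, Dict, List


def build_profile(session_id: str, participant_id: str, transcript: str,
                  candidates: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Keep only exactly matching excerpts and flatten them into a profile.

    Candidates whose excerpt is not found verbatim in the participant
    transcript are dropped; they are never edited to force a match.
    """
    markers: List[Dict[str, Any]] = []
    dropped = 0
    for cand in candidates:
        excerpt = cand.get("excerpt", "")
        start = transcript.find(excerpt) if excerpt else -1
        if start == -1:
            dropped += 1  # near-miss or hallucinated span
            continue
        markers.append({
            "excerpt": excerpt,
            "start": start,
            "facet": cand.get("facet"),          # "DIF", "DDF", or "EOT"
            "valence": cand.get("valence"),
            "direction": cand.get("direction"),  # "supports" / "contradicts"
            "rationale": cand.get("rationale"),
        })
    # One flat record per session and participant (hypothetical layout).
    return {
        "session_id": session_id,
        "participant_id": participant_id,
        "markers": markers,
        "n_dropped": dropped,
    }
```

Counting dropped candidates, instead of discarding them silently, is one way to keep the extraction step auditable: a high drop rate for a session would flag it for review.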

Planned validation

The current draft stops at the profile layer; it does not report numeric scores or psychometric statistics. The validation program is sketched for readers who want to understand where this work is heading.

  • Convergent validity with PAQ-24: derive simple numeric descriptors from ALI markers and test whether they align with the matching PAQ-24 subscales (DIF, DDF, EOT) more strongly than with non-matching facets.
  • Reliability over time: examine session-to-session stability of ALI profiles for the same clients using straightforward test–retest metrics.
  • Human–model agreement: compare ALI-derived labels with human-coded excerpts for DIF, DDF, and EOT using agreement statistics.
  • Fairness and generalization: monitor how well ALI profiles transfer across sites, populations, and language varieties, with explicit checks for systematic bias.
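The draft does not name the agreement statistics it will use. As one common option, the sketch below computes Cohen's kappa between human-coded and model-assigned facet labels for the same excerpts. The choice of kappa, the `cohens_kappa` function, and the example labels are assumptions for illustration only; the labels are synthetic and are not results.

```python
from collections import Counter
from typing import Sequence


def cohens_kappa(labels_a: Sequence[str], labels_b: Sequence[str]) -> float:
    """Cohen's kappa for two raters labeling the same items.

    kappa = (p_o - p_e) / (1 - p_e), where p_o is observed agreement and
    p_e is the agreement expected by chance from each rater's marginals.
    """
    if len(labels_a) != len(labels_b) or len(labels_a) == 0:
        raise ValueError("label sequences must be non-empty and equal in length")
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    expected = sum(counts_a[k] * counts_b[k] for k in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1.0 - expected)


# Synthetic labels, for illustration only.
human = ["DIF", "DDF", "EOT", "DIF"]
model = ["DIF", "DDF", "EOT", "DDF"]
kappa = cohens_kappa(human, model)
```

Kappa corrects raw percent agreement for the agreement two raters would reach by chance, which matters when one facet label dominates the coded excerpts.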

Safety and privacy

ALI is designed to fit within HIPAA-focused, research-ready workflows. The paper outlines a zero-retention posture for language models and keeps transcripts auditable without overexposing sensitive data.

  • Processing environment: on-premises infrastructure or external providers bound by strict no-training, zero-retention contracts, with local models as an option.
  • De-identification before analysis: PHI is masked prior to any model calls, and logs store only hashed window identifiers plus short evidence spans.
  • Controlled access and retention: role-based permissions, explicit retention windows, and documented incident response for research deployments.
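One way to log "hashed window identifiers plus short evidence spans" is sketched below, using a keyed hash so that identifiers cannot be recomputed without the secret. The use of HMAC-SHA256, the `window_id` and `log_record` helpers, the key handling, and the 200-character cap are assumptions for this example; the draft does not describe its logging scheme, and this sketch does not perform PHI masking, which is assumed to have happened upstream.

```python
import hashlib
import hmac
from typing import Dict


def window_id(secret_key: bytes, session_id: str, start: int, end: int) -> str:
    """Derive a stable, non-reversible identifier for a context window.

    The same key and inputs always give the same identifier, so log
    entries can be joined without storing the session ID in clear text.
    """
    message = f"{session_id}:{start}:{end}".encode("utf-8")
    return hmac.new(secret_key, message, hashlib.sha256).hexdigest()


def log_record(secret_key: bytes, session_id: str, start: int, end: int,
               masked_excerpt: str, max_len: int = 200) -> Dict[str, str]:
    """Build a log entry holding only the hashed ID and a short span.

    `masked_excerpt` is assumed to be de-identified already.
    """
    return {
        "window_id": window_id(secret_key, session_id, start, end),
        "excerpt": masked_excerpt[:max_len],
    }
```

In a real deployment the key would come from a secrets manager and not from source code, and rotating it would intentionally break linkage between old and new log entries.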

Code, examples, and next steps

The paper's appendix points to open resources that will be shared alongside the manuscript, so that collaborators can inspect prompts, evidence-span extractors, and worked examples.

  • Versioned prompts and evidence-span extractors in a public repository.
  • Synthetic example transcripts demonstrating ALI markers and profiles.
  • A working scoring manual describing how to interpret markers and profiles in practice.

If you are interested in collaborating on validation studies or extensions of ALI, the homepage for research partners describes current priorities and ways to get in touch.