TriageSim: A Conversational Emergency Triage Simulation Framework from Structured Electronic Health Records

arXiv:2603.10035v1

Abstract: Research in emergency triage is restricted to structured electronic health records (EHR) due to regulatory constraints on nurse-patient interactions. We introduce TriageSim, a simulation framework for generating persona-conditioned triage conversations from structured records. TriageSim enables multi-turn nurse-patient interactions with explicit control over disfluency and decision behaviour, producing a corpus of ~800 synthetic transcripts and corresponding audio. We use a combination of automated analysis for linguistic, behavioural and acoustic fidelity alongside manual evaluation for medical fidelity using a random subset of 50 conversations. The utility of the generated corpus is examined via conversational triage classification. We observe modest agreement for acuity levels across three modalities: generated synthetic text, ASR transcripts, and direct audio inputs. The code, persona schemata and triage policy prompts for TriageSim will be available upon acceptance.

Executive Summary

TriageSim is a simulation framework that generates persona-conditioned triage conversations from structured electronic health records (EHR). It targets a gap in emergency triage research, which regulatory constraints on nurse-patient interactions have largely confined to structured records. The framework produces a corpus of approximately 800 synthetic transcripts with corresponding audio, offering a resource for research in emergency triage, and it exposes explicit controls over disfluency and decision behavior. Evaluation combines automated analysis of linguistic, behavioral, and acoustic fidelity with manual evaluation of medical fidelity on a random subset of 50 conversations. In a downstream conversational triage classification task, the authors report modest agreement for acuity levels across three modalities (generated synthetic text, ASR transcripts, and direct audio inputs), which indicates room for refinement in cross-modality consistency.

Key Points

  • TriageSim generates synthetic triage conversations from structured EHR data
  • Framework enables controlled disfluency and decision behavior in nurse-patient interactions
  • Corpus of ~800 synthetic transcripts and audio is produced for triage research
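
To make the idea of persona conditioning concrete, the sketch below shows one way a structured record and a persona could be turned into a prompt for a simulated patient. It is not TriageSim's implementation: the field names (`chief_complaint`, `heart_rate`, `acuity`), the persona controls (`tone`, `disfluency_rate`), the 1 to 5 acuity scale, and the prompt wording are all assumptions made for illustration, since the paper's schemata and prompts have not yet been released.

```python
from dataclasses import dataclass


@dataclass
class TriageRecord:
    """Hypothetical structured EHR fields; not the paper's schema."""
    age: int
    chief_complaint: str
    heart_rate: int
    acuity: int  # assumed scale: 1 (most urgent) to 5 (least urgent)


@dataclass
class PatientPersona:
    """Hypothetical persona controls; names are illustrative only."""
    tone: str
    disfluency_rate: float  # assumed fraction of turns with fillers, 0.0 to 1.0


def build_patient_prompt(record: TriageRecord, persona: PatientPersona) -> str:
    """Render a prompt for a simulated patient from a record and a persona."""
    if not 0.0 <= persona.disfluency_rate <= 1.0:
        raise ValueError("disfluency_rate must be between 0.0 and 1.0")
    return (
        f"You are a {record.age}-year-old patient arriving at an emergency department.\n"
        f"Chief complaint: {record.chief_complaint}.\n"
        f"Speak in a {persona.tone} tone. "
        f"Use fillers such as 'um' or 'uh' in roughly "
        f"{persona.disfluency_rate:.0%} of your turns.\n"
        # Vital signs and the acuity label stay out of the patient's prompt:
        # a real patient would not know them.
        "Answer only what the nurse asks; do not state vital signs or an acuity level."
    )


record = TriageRecord(age=58, chief_complaint="chest pain", heart_rate=112, acuity=2)
persona = PatientPersona(tone="anxious", disfluency_rate=0.3)
print(build_patient_prompt(record, persona))
```

Keeping the record and the persona as separate objects means the same clinical case can be replayed under different speaking styles, which is the kind of explicit control the key points describe.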

Merits

Innovation

TriageSim introduces a scalable, controlled simulation framework that fills a gap in EHR-constrained triage research by enabling reproducible, persona-conditioned interactions.

Demerits

Consistency Limitation

The modest agreement for acuity levels across synthetic text, ASR transcripts, and direct audio inputs suggests that triage classification outcomes vary with the input modality, which could limit reliability in real-world application.
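
One common way to put a number on this kind of agreement is Cohen's kappa, which corrects raw agreement for the agreement expected by chance. The abstract does not say which statistic the authors used, so the sketch below is an assumption, and the acuity labels in the usage lines are made up for illustration.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two equal-length sequences of categorical labels."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label sequences must be non-empty and of equal length")
    n = len(labels_a)
    # Observed agreement: fraction of items given the same label by both.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of the two marginal label frequencies.
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    expected = sum(counts_a[k] * counts_b[k] for k in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both sequences consist of one identical label
    return (observed - expected) / (1.0 - expected)


# Invented acuity predictions for six conversations in two modalities.
from_text = [2, 3, 3, 4, 2, 3]
from_audio = [2, 3, 4, 4, 3, 3]
print(cohen_kappa(from_text, from_audio))
```

Because acuity levels are ordered, a weighted kappa that penalizes a two-level disagreement more than a one-level disagreement may be the more informative choice; the unweighted form is shown here only because it is the simplest.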

Expert Commentary

TriageSim is a meaningful step toward reconciling regulatory compliance with research needs in emergency medicine. By transforming structured records into conversational simulations, the framework sidesteps the restrictions on using real nurse-patient interactions while aiming to retain some of their conversational complexity. The authors’ stated plan to release the code, persona schemata, and triage policy prompts upon acceptance should support reproducibility and community engagement. That said, the modest agreement for acuity levels across modalities marks a clear area for future work: improving cross-modal consistency between the generated text, its ASR transcript, and the audio. This may involve refining disfluency modeling, integrating more nuanced behavioral heuristics, or incorporating richer acoustic metadata. Additionally, expanding manual evaluation beyond 50 conversations could provide more robust validation of medical fidelity. Overall, TriageSim’s conceptual design merits recognition, and its direction aligns with broader trends in synthetic data generation for clinical AI.
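
The effect of the manual evaluation sample size can be illustrated with a confidence interval for a proportion. The sketch below uses the Wilson score interval; the 80% pass rate in the usage lines is a hypothetical figure chosen for the arithmetic, not a result from the paper.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 gives ~95%)."""
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need n > 0 and 0 <= successes <= n")
    p = successes / n
    denom = 1.0 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half, centre + half


# Hypothetical: 80% of conversations judged medically faithful.
for passed, total in [(40, 50), (160, 200)]:
    low, high = wilson_interval(passed, total)
    print(f"n={total}: [{low:.3f}, {high:.3f}] width={high - low:.3f}")
```

At the same observed rate, quadrupling the sample from 50 to 200 conversations narrows the interval to roughly half its width.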

Recommendations

  • Expand manual evaluation to a larger sample size (e.g., 200+ conversations) to improve statistical confidence in fidelity assessments.
  • Integrate multimodal alignment metrics (e.g., BLEU-4, coherence indices) to quantify consistency between synthetic text, ASR, and audio.
  • Consider publishing anonymized subsets of the corpus under open-access licenses to accelerate adoption in academic and industry research.
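
As one concrete instance of measuring consistency between a synthetic transcript and its ASR counterpart, the sketch below computes word error rate (WER) with a word-level edit distance. WER is not one of the metrics named in the recommendations; it is used here as an assumed, simple stand-in, and the example sentences are invented.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


synthetic = "I have um chest pain"
asr_output = "I have chest pain"
print(word_error_rate(synthetic, asr_output))
```

Note that deliberately generated disfluencies such as "um" count as errors when the ASR system drops them, so a WER computed this way mixes recognition mistakes with filler removal; normalizing fillers before scoring is one way to separate the two.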

Sources