SIMSAMU - A French medical dispatch dialog open dataset

IF 4.9 · Zone 2 (Medicine) · Q1, Computer Science, Interdisciplinary Applications

Computer Methods and Programs in Biomedicine · Published: 2025-05-15 · DOI: 10.1016/j.cmpb.2025.108857

Aimé Nun, Olivier Birot, Gaël Guibon, Frédéric Lapostolle, Ivan Lerner

Volume 268, Article 108857. Full text: https://www.sciencedirect.com/science/article/pii/S0169260725002743


Abstract

Background

Dispatch Services (DS) are essential to Emergency Medical Services (EMS). Dispatchers give patients access to medical assistance in emergencies, anytime and anywhere, despite limited time and resources. AI-based decision-support tools hold great promise for dispatchers, but developing them requires data specific to the medical field. Medical dispatch dialogue is unique: it is a brief phone exchange in an emergency, under time pressure and without a physical examination.

Objective

Our main objective was to (i) create an open French dataset of medical dispatch dialogues. Our secondary objectives were to (ii) develop a detailed medical dispatch scheme from this dataset using an unsupervised method, and (iii) provide a baseline evaluation of diarization and speech recognition models for this domain in French.

Methods

From 2022 to 2023, emergency medicine junior doctors simulated real-life medical dispatch calls. These calls were recorded and transcribed to form the SIMSAMU corpus. We developed a dispatch scheme based on (i) recording analysis, (ii) data-driven utterance typology, and (iii) domain expertise. The utterance typology was derived by hierarchical clustering of representations learned by fine-tuning BERT embeddings on SIMSAMU. Clusters were mapped to the Roter Interaction Analysis System (RIAS) and included in our dispatch scheme. SIMSAMU was used to train and evaluate state-of-the-art neural network models for diarization and speech recognition. Diarization used the PyaNet model, fine-tuned on the ESLO2 dataset. Speech recognition used a CTC model with pre-trained wav2vec 2.0 embeddings, which was compared with the multilingual Whisper model. The CTC-wav2vec model was further fine-tuned on SIMSAMU and evaluated by leave-one-speaker-out cross-validation.
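To make the leave-one-speaker-out protocol concrete, the following is a minimal sketch of how such folds can be built: in each fold, every recording from one speaker is held out for testing and the model is fine-tuned on the rest. The speaker labels and the helper name `leave_one_speaker_out` are hypothetical. The abstract does not state how speakers were assigned to recordings, so the sketch assumes one speaker identifier per recording.

```python
from collections import defaultdict
from typing import Dict, Iterator, List, Sequence, Tuple


def leave_one_speaker_out(
    speakers: Sequence[str],
) -> Iterator[Tuple[str, List[int], List[int]]]:
    """Yield (held_out_speaker, train_indices, test_indices), one fold per speaker."""
    # Group recording indices by their speaker label.
    by_speaker: Dict[str, List[int]] = defaultdict(list)
    for idx, spk in enumerate(speakers):
        by_speaker[spk].append(idx)

    # Each speaker is held out exactly once; all other recordings are used for training.
    for held_out in sorted(by_speaker):
        test_idx = by_speaker[held_out]
        train_idx = [i for i, s in enumerate(speakers) if s != held_out]
        yield held_out, train_idx, test_idx


# Hypothetical speaker labels, one per recording (not taken from SIMSAMU).
recording_speakers = ["spk_a", "spk_b", "spk_a", "spk_c", "spk_b"]
folds = list(leave_one_speaker_out(recording_speakers))
for held_out, train_idx, test_idx in folds:
    print(held_out, "train:", train_idx, "test:", test_idx)
```

Because no speaker appears in both the training and the test side of a fold, the resulting error rate reflects performance on unseen voices rather than on speakers the model has already adapted to.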

Results

The dataset consists of 61 audio recordings totaling 3 h 14 min. Four utterance clusters were identified for callers and three for dispatchers. Two main dialogue phases were identified: interrogation and contractualization. The diarization model achieved an error rate of 10.4%. Speech recognition word error rates were 35.8% for Whisper, 24.8% for the CTC-wav2vec model fine-tuned on ESLO2, and 16.1% after in-domain fine-tuning on SIMSAMU.
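For readers unfamiliar with the metric, word error rate is conventionally the word-level edit distance (substitutions, deletions, insertions) divided by the number of reference words. The sketch below illustrates that standard definition and the relative improvement implied by the reported figures. It is not the authors' evaluation code, the example utterance is invented, and any text normalization applied in the paper is not reproduced here.

```python
from typing import List


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref: List[str] = reference.split()
    hyp: List[str] = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the processed reference prefix and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


def relative_reduction(before: float, after: float) -> float:
    """Fraction by which an error rate dropped relative to its starting value."""
    return (before - after) / before


# Invented example utterance (not from SIMSAMU): one word is dropped by the recognizer.
wer = word_error_rate("le patient a mal à la poitrine", "le patient a mal la poitrine")
print(f"WER: {wer:.3f}")

# Reported WERs: 24.8% (fine-tuned on ESLO2) vs 16.1% (after in-domain fine-tuning).
rel = relative_reduction(24.8, 16.1)
print(f"Relative WER reduction: {rel:.1%}")
```

On the reported numbers, in-domain fine-tuning removes roughly a third of the remaining word errors relative to the ESLO2-only model.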

Conclusion

We propose a French open medical dispatch dialogue dataset and an expert-validated schema of the medical dispatch dialogue based on unsupervised analysis. Notable gaps in how well speech recognition models generalize underscore the need for targeted, in-domain fine-tuning in this specialized application. SIMSAMU is designed to support this effort by serving as a benchmark for evaluating domain-adapted speech recognition and dialogue modeling strategies.


Source journal

Computer Methods and Programs in Biomedicine (Engineering and Technology: Biomedical Engineering)

CiteScore: 12.30

Self-citation rate: 6.60%

Articles published: 601

Review time: 135 days

About the journal: To encourage the development of formal computing methods, and their application in biomedical research and medical practice, by illustration of fundamental principles in biomedical informatics research; to stimulate basic research into application software design; to report the state of research of biomedical information processing projects; to report new computer methodologies applied in biomedical areas; to support the eventual distribution of demonstrable software to avoid duplication of effort; to provide a forum for discussion and improvement of existing software; to optimize contact between national organizations and regional user groups by promoting an international exchange of information on formal methods, standards and software in biomedicine. Computer Methods and Programs in Biomedicine covers computing methodology and software systems derived from computing science for implementation in all aspects of biomedical research and medical practice. It is designed to serve: biochemists; biologists; geneticists; immunologists; neuroscientists; pharmacologists; toxicologists; clinicians; epidemiologists; psychiatrists; psychologists; cardiologists; chemists; (radio)physicists; computer scientists; programmers and systems analysts; biomedical, clinical, electrical and other engineers; teachers of medical informatics and users of educational software.