Symptom Recognition in Medical Conversations via Multi-Instance Learning and Prompt
Authors: Hua Wang, Xue-Feng Bai, Xiu-Tao Cui, Gang Chen, Guo-Ming Fan, Guo-Lian Wei, Ye-Ping Zheng, Jing-Jing Wu, Sheng-Sheng Gao
Journal: Journal of Medical Systems, volume 49, issue 1, article 107
Publication date: 2025-08-20
Publication type: Journal Article
DOI: 10.1007/s10916-025-02240-w (https://doi.org/10.1007/s10916-025-02240-w)
Impact factor: 5.7 (JCR Q1, Health Care Sciences & Services)
Citations: 0
Abstract
With the widespread adoption of electronic health record (EHR) systems, there is a crucial need to automatically extract key symptom information from medical dialogue to support intelligent medical record generation. However, symptom recognition in such dialogues remains challenging because (a) symptom clues are scattered across multi-turn, unstructured conversations, (b) patient descriptions are often informal and deviate from standardized terminology, and (c) many symptom statements are ambiguous or negated, making them difficult for conventional models to interpret. To address these challenges, we propose a fine-grained symptom identification approach that combines multi-instance learning (MIL) with prompt-guided attention. In our framework, each conversation is treated as a bag of utterances. A MIL-based model aggregates information across utterances to improve recall and pinpoints which specific utterances mention each symptom, thus enabling sentence-level symptom recognition. Concurrently, a prompt-guided attention strategy leverages standardized symptom terminology as prior knowledge to guide the model in recognizing synonyms, implicit symptom mentions, and negations, thereby improving precision. We further employ R-Drop regularization to enhance robustness against noisy inputs. Experiments on public medical-dialogue datasets demonstrate that our method significantly outperforms existing techniques, achieving an 85.93% F1-score (with 85.09% precision and 86.83% recall), about 8 percentage points higher than a strong multi-label classification baseline. Notably, our model accurately identifies the specific utterances corresponding to each symptom mention (symptom-utterance pairs), highlighting its fine-grained extraction capability. Ablation studies confirm that the MIL component boosts recall, while the prompt-guided attention component reduces false positives. By precisely locating symptom information within conversations, our approach effectively tackles the issues of dispersed data and inconsistent expressions. This fine-grained symptom documentation capability represents a promising advancement for automated medical information extraction, more intelligent EHR systems, and diagnostic decision support.
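The bag-of-utterances idea in the abstract can be illustrated with a minimal sketch. It is not the authors' model: the function name `mil_symptom_scores`, the hand-made two-dimensional embeddings, the dot-product attention, and the 0.5 threshold are all assumptions chosen for illustration, and a real system would use a learned encoder and would need separate handling of negation. Each symptom "prompt" vector acts as a query over the utterances of one conversation; the attention weights give both a bag-level score and the utterance most responsible for it.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def mil_symptom_scores(utterances, prompts, threshold=0.5):
    """Score each symptom for one conversation (a bag of utterance vectors).

    utterances: list of embedding vectors, one per utterance (hypothetical).
    prompts: dict mapping symptom name -> prompt embedding (hypothetical).
    Returns, per symptom, the bag-level probability, a presence flag, the
    index of the highest-weighted utterance, and the attention weights.
    """
    results = {}
    for symptom, query in prompts.items():
        # Prompt-guided attention: the symptom prompt is the query.
        weights = softmax([dot(u, query) for u in utterances])
        # MIL pooling: attention-weighted average of the utterance vectors.
        bag = [
            sum(w * u[d] for w, u in zip(weights, utterances))
            for d in range(len(query))
        ]
        prob = sigmoid(dot(bag, query))
        best = max(range(len(utterances)), key=lambda i: weights[i])
        results[symptom] = {
            "prob": prob,
            "present": prob >= threshold,
            "utterance": best,
            "weights": weights,
        }
    return results


# Toy conversation: three utterances with made-up 2-d embeddings.
toy_utterances = [[0.1, 0.0], [3.0, 0.0], [0.0, 2.0]]
toy_prompts = {"cough": [1.0, 0.0], "fever": [0.0, 1.0], "rash": [-1.0, -1.0]}

for name, info in mil_symptom_scores(toy_utterances, toy_prompts).items():
    print(name, info["present"], info["utterance"], round(info["prob"], 3))
```

In this toy setup the highest-weighted utterance index plays the role of the symptom-utterance pair mentioned in the abstract, while the bag-level probability corresponds to the conversation-level decision.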
About the journal:
Journal of Medical Systems provides a forum for the presentation and discussion of the increasingly extensive applications of new systems techniques and methods in hospital clinic and physician's office administration; pathology, radiology, and pharmaceutical delivery systems; medical records storage and retrieval; and ancillary patient-support systems. The journal publishes informative articles, essays, and studies across the entire scale of medical systems, from large hospital programs to novel small-scale medical services. Education is an integral part of this amalgamation of sciences, and selected articles are published in this area. Since existing medical systems are constantly being modified to fit particular circumstances and to solve specific problems, the journal includes a special section devoted to status reports on current installations.