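Construction of a Multi-Label Classifier for Extracting Multiple Incident Factors From Medication Incident Reports in Residential Care Facilities: Natural Language Processing Approach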

IF 3.1 | Zone 3 (Medicine) | Q2 MEDICAL INFORMATICS

JMIR Medical Informatics. Pub Date: 2024-07-23. DOI: 10.2196/58141

Hayato Kizaki, Hiroki Satoh, Sayaka Ebara, Satoshi Watabe, Yasufumi Sawada, Shungo Imai, Satoko Hori

Volume 12, e58141. Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11303886/pdf/

Citations: 0

Abstract


Construction of a Multi-Label Classifier for Extracting Multiple Incident Factors From Medication Incident Reports in Residential Care Facilities: Natural Language Processing Approach.

Background: Medication safety in residential care facilities is a critical concern, particularly when nonmedical staff provide medication assistance. The complex nature of medication-related incidents in these settings, coupled with the psychological impact on health care providers, underscores the need for effective incident analysis and preventive strategies. A thorough understanding of the root causes, typically through incident-report analysis, is essential for mitigating medication-related incidents.

Objective: We aimed to develop and evaluate a multilabel classifier using natural language processing to identify factors contributing to medication-related incidents using incident report descriptions from residential care facilities, with a focus on incidents involving nonmedical staff.

Methods: We analyzed 2143 incident reports, comprising 7121 sentences, from residential care facilities in Japan between April 1, 2015, and March 31, 2016. The incident factors were annotated at the sentence level based on an established organizational factor model and previous research findings. The following 9 factors were defined: procedure adherence, medicine, resident, resident family, nonmedical staff, medical staff, team, environment, and organizational management. To assess the label criteria, 2 researchers with relevant medical knowledge annotated a subset of 50 reports; the interannotator agreement was measured using Cohen κ. The entire data set was subsequently annotated by 1 researcher. A sentence could be assigned multiple labels. A multilabel classifier was developed using deep learning models, including 2 Bidirectional Encoder Representations From Transformers (BERT)-type models (Tohoku-BERT and a University of Tokyo Hospital BERT pretrained with Japanese clinical text: UTH-BERT) and an Efficiently Learning Encoder That Classifies Token Replacements Accurately (ELECTRA), pretrained on Japanese text. Both sentence- and report-level training were performed; the performance was evaluated by the F₁-score and exact match accuracy through 5-fold cross-validation.
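The evaluation measures named in the Methods can be illustrated with a minimal, dependency-free Python sketch. This is not the authors' code: the function names, the 0/1 label-vector encoding, and the toy data are assumptions made purely for illustration. Macro F₁ averages the per-label F₁-scores with equal weight, exact match accuracy counts an item as correct only when every label agrees, and Cohen κ corrects the observed agreement between two annotators for agreement expected by chance.

```python
from typing import List, Sequence

# Assumption: each item (sentence or report) is a fixed-order vector of 0/1 flags,
# one flag per incident factor.
Labels = Sequence[int]


def f1_for_label(y_true: List[Labels], y_pred: List[Labels], k: int) -> float:
    """F1 for the k-th label; 0.0 when the label never occurs or is never predicted."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t[k] == 1 and p[k] == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t[k] == 0 and p[k] == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t[k] == 1 and p[k] == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def macro_f1(y_true: List[Labels], y_pred: List[Labels]) -> float:
    """Unweighted mean of the per-label F1-scores."""
    n_labels = len(y_true[0])
    return sum(f1_for_label(y_true, y_pred, k) for k in range(n_labels)) / n_labels


def exact_match_accuracy(y_true: List[Labels], y_pred: List[Labels]) -> float:
    """Share of items whose full predicted label set equals the true label set."""
    hits = sum(1 for t, p in zip(y_true, y_pred) if tuple(t) == tuple(p))
    return hits / len(y_true)


def cohen_kappa(a: Sequence[int], b: Sequence[int]) -> float:
    """Cohen's kappa for two annotators' binary decisions on one label."""
    n = len(a)
    p_observed = sum(1 for x, y in zip(a, b) if x == y) / n
    pa1, pb1 = sum(a) / n, sum(b) / n
    p_chance = pa1 * pb1 + (1 - pa1) * (1 - pb1)
    return (p_observed - p_chance) / (1 - p_chance) if p_chance != 1 else 1.0


if __name__ == "__main__":
    # Toy data (hypothetical): 4 items, 3 labels.
    y_true = [[1, 0, 1], [0, 1, 0], [1, 1, 0], [0, 0, 1]]
    y_pred = [[1, 0, 1], [0, 1, 1], [1, 0, 0], [0, 0, 1]]
    print(round(macro_f1(y_true, y_pred), 3))    # per-label F1: 1.0, 2/3, 0.8
    print(exact_match_accuracy(y_true, y_pred))  # 2 of 4 items match exactly
    print(round(cohen_kappa([1, 0, 1, 1, 0, 0], [1, 0, 0, 1, 0, 1]), 3))
```

Exact match is a much stricter criterion than macro F₁, because a single wrong label on an item makes the whole item count as incorrect.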

Results: Among all 7121 sentences, 1167, 694, 2455, 23, 1905, 46, 195, 1104, and 195 included "procedure adherence," "medicine," "resident," "resident family," "nonmedical staff," "medical staff," "team," "environment," and "organizational management," respectively. Owing to limited labels, "resident family" and "medical staff" were omitted from the model development process. The interannotator agreement values were higher than 0.6 for each label. A total of 10, 278, and 1855 reports contained no, 1, and multiple labels, respectively. The models trained using the report data outperformed those trained using sentences, with macro F₁-scores of 0.744, 0.675, and 0.735 for Tohoku-BERT, UTH-BERT, and ELECTRA, respectively. The report-trained models also demonstrated better exact match accuracy, with 0.411, 0.389, and 0.399 for Tohoku-BERT, UTH-BERT, and ELECTRA, respectively. Notably, the accuracy was consistent even when the analysis was confined to reports containing multiple labels.
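The counts reported in the Results can be checked with a few lines of arithmetic. The Python sketch below uses only the figures stated in the abstract; the variable and function names are illustrative assumptions, and the abstract does not state a numeric cutoff for omitting a label, so none is assumed here.

```python
# Figures taken from the abstract; names are illustrative, not from the paper.
SENTENCE_TOTAL = 7121
REPORT_TOTAL = 2143

label_counts = {
    "procedure adherence": 1167,
    "medicine": 694,
    "resident": 2455,
    "resident family": 23,
    "nonmedical staff": 1905,
    "medical staff": 46,
    "team": 195,
    "environment": 1104,
    "organizational management": 195,
}

reports_by_label_count = {"none": 10, "one": 278, "multiple": 1855}


def prevalence(counts: dict, total: int) -> dict:
    """Share of sentences carrying each label."""
    return {label: n / total for label, n in counts.items()}


def rarest(counts: dict, k: int) -> list:
    """The k labels with the fewest annotated sentences (ties keep input order)."""
    return sorted(counts, key=counts.get)[:k]


if __name__ == "__main__":
    # The three report groups should add up to the full data set.
    assert sum(reports_by_label_count.values()) == REPORT_TOTAL

    # Label assignments exceed the number of sentences, so at least some
    # sentences must carry more than one label.
    print(sum(label_counts.values()), "label assignments over", SENTENCE_TOTAL, "sentences")

    for label, share in sorted(prevalence(label_counts, SENTENCE_TOTAL).items(),
                               key=lambda kv: kv[1]):
        print(f"{label:26s} {share:6.2%}")

    # The two rarest labels are the two that were omitted from model development.
    print(rarest(label_counts, 2))
    print(f"{reports_by_label_count['multiple'] / REPORT_TOTAL:.1%} of reports carry multiple labels")
```

Under these figures, "resident family" and "medical staff" each appear in fewer than 1% of sentences, which is consistent with the stated reason for omitting them.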

Conclusions: The multilabel classifier developed in our study demonstrated potential for identifying various factors associated with medication-related incidents using incident reports from residential care facilities. Thus, this classifier can facilitate prompt analysis of incident factors, thereby contributing to risk management and the development of preventive strategies.


Source journal

JMIR Medical Informatics Medicine-Health Informatics

CiteScore: 7.90

Self-citation rate: 3.10%

Articles published: 173

Review time: 12 weeks

About the journal: JMIR Medical Informatics (JMI, ISSN 2291-9694) is a top-rated, tier A journal that focuses on clinical informatics, big data in health and health care, decision support for health professionals, electronic health records, and eHealth infrastructures and implementation. It focuses on applied, translational research, with a broad readership including clinicians, CIOs, engineers, and industry and health informatics professionals. Published by JMIR Publications, publisher of the Journal of Medical Internet Research (JMIR), the leading eHealth/mHealth journal (Impact Factor 2016: 5.175), JMIR Med Inform has a slightly different scope (placing more emphasis on applications for clinicians and health professionals than on consumers/citizens, which is the focus of JMIR), publishes even faster, and also allows papers that are more technical or more formative than what would be published in the Journal of Medical Internet Research.