Identifying long-term conditions in New Zealand general practice using structured and unstructured data: a cross-sectional study.

Impact factor: 4.1 | JCR Q1 | Category: Health Care Sciences & Services
Yeunhyang Catherine Choi, Katrina Poppe, Vanessa Selak, Allan Ronald Moffitt, Claris Yee Seung Chung, Jane Ullmer, Sue Wells
Journal: BMJ Health & Care Informatics, 32(1). Published 2025-05-22. Publication type: Journal Article.
DOI: 10.1136/bmjhci-2024-101393 (https://doi.org/10.1136/bmjhci-2024-101393)

Abstract

Objectives: This study examined whether incorporating free-text entries into structured general practice records improves the detection of long-term conditions (LTCs) and multimorbidity (MM) in New Zealand (NZ) general practices.

Methods: Data from 374 071 deidentified individuals in general practices were analysed to identify 61 LTCs. Structured data were extracted using Read codes from a national master list, and clinical raters independently identified condition-related free-text, including synonyms, negation terms and common misspellings in randomised samples. Keywords were categorised and refined through ten iterative tests. Programmatic text classification was developed and assessed against gold-standard clinician ratings, using sensitivity, specificity, positive predictive value (PPV) and F1-score.
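The keyword-based classification described in the Methods can be illustrated with a minimal Python sketch. The abstract does not give the study's actual keyword lists or rules, so everything here is an assumption for illustration: the condition terms, the negation cues, the 30-character look-back window and the function name `mentions_condition` are all hypothetical.

```python
import re

# Hypothetical keyword table for one condition (not taken from the study):
# synonyms, an abbreviation and two common misspellings.
DIABETES_TERMS = ["diabetes", "diabetic", "t2dm", "diabetis", "diabeties"]

# Hypothetical negation cues; a real rule set would need to be far richer.
NEGATION_CUES = ["no", "not", "denies", "without", "ruled out", "negative for"]

TERM_RE = re.compile(
    r"\b(" + "|".join(map(re.escape, DIABETES_TERMS)) + r")\b", re.IGNORECASE
)
NEG_RE = re.compile(
    r"\b(" + "|".join(map(re.escape, NEGATION_CUES)) + r")\b", re.IGNORECASE
)


def mentions_condition(text: str, window: int = 30) -> bool:
    """Return True if a condition term occurs with no negation cue
    in the `window` characters immediately before it."""
    for match in TERM_RE.finditer(text):
        preceding = text[max(0, match.start() - window):match.start()]
        if not NEG_RE.search(preceding):
            return True
    return False


if __name__ == "__main__":
    for note in [
        "Type 2 diabetis, on metformin",
        "No evidence of diabetes",
        "hypertension, gout",
    ]:
        print(note, "->", mentions_condition(note))
```

A fixed look-back window is a deliberately crude choice: it misses negations that follow the term and can wrongly suppress a mention when an unrelated negation sits nearby, which is the kind of error that iterative refinement against clinician ratings would be expected to surface.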

Results: A quarter of general practitioner classifications either contained unrecognised Read codes or consisted of free text only. Clinician inter-rater reliability was high (kappa ≥0.9). Compared with the clinical gold standard, text classification yielded an average sensitivity of 88%, specificity of 99% and PPV of 95%, with an F1-score range of 82%-95%. Incorporating free text increased LTC prevalence from 42.1% to 46.3%, reducing misclassification of MM diagnoses by identifying 12 626 additional patients with MM and 15 972 additional patients with at least one LTC.
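For readers less familiar with the reported metrics, the sketch below shows how sensitivity, specificity, PPV and F1-score follow from a confusion matrix, and checks that the headline figures are mutually consistent. The confusion-matrix counts are invented for illustration and are not study data; only the percentages and patient counts quoted from the Results are taken from the abstract.

```python
def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Standard binary classification metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # share of true cases that were detected
    specificity = tn / (tn + fp)   # share of non-cases correctly rejected
    ppv = tp / (tp + fp)           # share of positive calls that were correct
    f1 = 2 * ppv * sensitivity / (ppv + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "f1": f1,
    }


def f1_from(ppv: float, sensitivity: float) -> float:
    """F1-score as the harmonic mean of PPV and sensitivity."""
    return 2 * ppv * sensitivity / (ppv + sensitivity)


# Hypothetical counts for a single condition (NOT study data).
metrics = classification_metrics(tp=88, fp=5, tn=895, fn=12)

# Figures quoted in the abstract.
f1_of_averages = f1_from(ppv=0.95, sensitivity=0.88)
extra_share_pp = 100 * 15_972 / 374_071   # additional LTC patients, % of cohort
reported_gain_pp = 46.3 - 42.1            # reported prevalence gain, in points

if __name__ == "__main__":
    print(metrics)
    print(round(f1_of_averages, 3), round(extra_share_pp, 2), round(reported_gain_pp, 1))
```

The F1-score implied by the average sensitivity and PPV is about 0.91, which falls inside the reported 82%-95% range, although an F1 computed from averages is not the same as a per-condition F1. The 15 972 additional patients are about 4.3% of the 374 071 individuals, in line with the reported 4.2-point prevalence gain once rounding of the two percentages is allowed for.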

Discussion: In the course of their workflow, general practitioners face barriers to accurate LTC coding or may simply annotate records with text-based descriptions. Programmatic text classification performed well against clinician ratings and identified many more patients receiving LTC care.

Conclusions: Combining structured and unstructured data optimises MM detection in NZ general practices and has the potential to improve case management, follow-up care and allocation of healthcare resources.
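A rough sketch of what combining the two data sources could look like at the patient level is given below. It assumes that each source yields a set of condition labels per patient and that multimorbidity means two or more LTCs; that threshold is a common definition but is not stated in the abstract. The patient identifiers, condition labels and function names are hypothetical.

```python
from collections import defaultdict


def combine_flags(coded: dict, text: dict) -> dict:
    """Union per-patient condition sets from coded and free-text sources."""
    combined = defaultdict(set)
    for source in (coded, text):
        for patient_id, conditions in source.items():
            combined[patient_id] |= set(conditions)
    return dict(combined)


def count_ltc_and_mm(flags: dict, mm_threshold: int = 2) -> tuple:
    """Count patients with at least one LTC and with multimorbidity.

    mm_threshold=2 is an assumed definition of multimorbidity.
    """
    with_ltc = sum(1 for c in flags.values() if len(c) >= 1)
    with_mm = sum(1 for c in flags.values() if len(c) >= mm_threshold)
    return with_ltc, with_mm


# Toy data (hypothetical): conditions found via Read codes vs free text.
coded = {"p1": {"diabetes"}, "p2": set(), "p3": {"asthma", "copd"}}
text = {"p1": {"hypertension"}, "p2": {"gout"}}

coded_only = count_ltc_and_mm(coded)
combined = count_ltc_and_mm(combine_flags(coded, text))

if __name__ == "__main__":
    print("coded only (LTC, MM):", coded_only)
    print("combined   (LTC, MM):", combined)
```

In this toy example, free text adds one patient with an LTC (p2) and moves one patient from a single condition to multimorbidity (p1), mirroring in miniature the two kinds of gain the Results describe.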

Source journal metrics: CiteScore 6.10; self-citation rate 4.90%; articles published: 40; review time: 18 weeks.