Bridging information gaps in menopause status classification through natural language processing

Impact factor: 2.5 · JCR Q2 · Health Care Sciences & Services
Hannah Eyre, Patrick R. Alba, Carolyn J Gibson, E. Gatsby, Kristine E Lynch, Olga V. Patterson, S. Duvall
JAMIA Open · Journal Article · Published 2024-02-09
DOI: 10.1093/jamiaopen/ooae013 (https://doi.org/10.1093/jamiaopen/ooae013)
Citations: 0

Abstract

This study used natural language processing (NLP) of clinical notes to augment existing structured electronic health record (EHR) data for classifying a patient’s menopausal status.

A rule-based NLP system was designed to capture evidence of a patient’s menopause status, including the dates of the last menstrual period, reproductive surgeries, and postmenopause diagnosis, as well as use of birth control and menstrual interruptions. NLP-derived output was used in combination with structured EHR data to classify a patient’s menopausal status. NLP processing and patient classification were performed on a cohort of 307,512 female Veterans receiving healthcare at the US Department of Veterans Affairs (VA).

NLP was validated at 99.6% precision. Incorporating the NLP-derived data into a menopause phenotype increased the number of patients with data relevant to their menopausal status by 118%. Using structured codes alone, 81,173 (27.0%) patients could be classified as postmenopausal or premenopausal. With the inclusion of NLP, this number increased to 167,804 (54.6%) patients. The premenopausal category grew by 532.7% with the inclusion of NLP data.

NLP made it possible to identify documented data elements that predate VA care, originate outside VA networks, or have no corresponding structured field in the VA EHR, and that would otherwise be inaccessible for further analysis.

NLP can be used to identify concepts relevant to a patient’s menopausal status in clinical notes. Adding NLP-derived data to an algorithm classifying a patient’s menopausal status significantly increases the number of patients classified using EHR data, ultimately enabling more detailed assessments of the impact of menopause on health outcomes.
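The abstract does not give the system's actual rules, so the following is only a minimal sketch of the general idea: regular-expression rules pull menopause-related evidence out of note text, and that evidence is combined with structured codes to assign a status. The patterns, the code name `postmenopause_dx`, the 365-day recency window, and the decision order are all hypothetical and are not the authors' phenotype. Birth control and menstrual interruptions, which the described system also captures, are left out for brevity.

```python
import re
from datetime import date, datetime

# Hypothetical rules for illustration only. Negation ("not postmenopausal")
# and many date formats are deliberately not handled in this sketch.
PATTERNS = {
    "lmp": re.compile(
        r"\b(?:lmp|last menstrual period)\b[:\s]*(\d{1,2}/\d{1,2}/\d{4})", re.I
    ),
    "surgery": re.compile(r"\bbilateral oophorectomy\b", re.I),
    "postmenopause": re.compile(r"\bpost-?\s?menopaus(?:e|al)\b", re.I),
}


def extract_evidence(note: str) -> dict:
    """Return the first match of each rule found in a clinical note."""
    evidence = {}
    for name, pattern in PATTERNS.items():
        match = pattern.search(note)
        if match:
            # Keep the captured date when the rule has a group, else the matched text.
            evidence[name] = match.group(1) if pattern.groups else match.group(0)
    return evidence


def classify(structured_codes: set, evidence: dict, note_date: date) -> str:
    """Combine structured codes with note-derived evidence (toy decision order)."""
    if (
        "postmenopause_dx" in structured_codes  # hypothetical structured code name
        or "postmenopause" in evidence
        or "surgery" in evidence
    ):
        return "postmenopausal"
    lmp = evidence.get("lmp")
    if lmp is not None:
        lmp_date = datetime.strptime(lmp, "%m/%d/%Y").date()
        # Assumed window: a period documented within the past year.
        if 0 <= (note_date - lmp_date).days <= 365:
            return "premenopausal"
    return "unclassified"


if __name__ == "__main__":
    note = "Pt reports LMP: 01/15/2023. No hot flashes."
    found = extract_evidence(note)
    print(found, classify(set(), found, date(2023, 6, 1)))
```

In this sketch a patient with no structured codes still receives a status when the note documents a recent last menstrual period, which mirrors the kind of gap the abstract says NLP helps fill.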
Source journal: JAMIA Open (Medicine, Health Informatics)
CiteScore: 4.10
Self-citation rate: 4.80%
Articles published: 102
Review time: 16 weeks