Identifying and Reducing Stigmatizing Language in Home Health Care With a Natural Language Processing-Based System (ENGAGE): Protocol for a Mixed Methods Study.

IF 1.5 Q3 HEALTH CARE SCIENCES & SERVICES

JMIR Research Protocols. Pub Date: 2025-09-25. DOI: 10.2196/69753

Zhihong Zhang, Pallavi Gupta, Stephanie Potts-Thompson, Laura Prescott, Morgan Morrison, Scott Sittig, Margaret V McDonald, Chase Raymond, Jacquelyn Y Taylor, Maxim Topaz

JMIR Research Protocols, volume 14, e69753. Full text (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12511817/pdf/

Abstract

Background: Stigmatizing language is common in clinical notes and can adversely affect the quality of patient care. Natural language processing (NLP) is a promising technology for identifying such language across large volumes of clinical notes in electronic health records.

Objective: This study proposes ENGAGE, an NLP-driven system for reducing stigmatizing language that automatically identifies such language and suggests replacements.

Methods: In this mixed methods study, we will extract electronic health record data for patients admitted to 2 large, diverse home health care (HHC) organizations between January 2019 and December 2021. The study has 4 aims. Aim 1 is to refine the ontology of stigmatizing language in HHC by (1) interviewing a diverse sample of HHC nurses and patients to identify terms to avoid and (2) analyzing clinical notes from various regions in the United States to categorize stigmatizing language. Aim 2 is to determine the best NLP approach for accurately identifying stigmatizing language by training algorithms and comparing their performance against human annotations. Aim 3 is to analyze the prevalence of stigmatizing language by patients' race and ethnicity using adjusted statistical analyses of a sample of approximately half a million HHC patients (34% from racial and ethnic minority groups). Aim 4 is to develop the NLP-driven ENGAGE system by (1) testing NLP methods (rule based; "delete, retrieve, and generate"; and transformers) for suggesting alternative wording and (2) designing and refining the user interface in preparation for a clinical trial.
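
To make the rule-based option in aim 4 and the comparison against human annotations in aim 2 more concrete, the following is a minimal sketch, not the study's implementation. The lexicon entries, the suggested replacements, and the names `LEXICON`, `flag_note`, `suggest_rewrite`, and `precision_recall_f1` are all hypothetical placeholders; the actual ontology is what aim 1 is meant to produce.

```python
import re
from dataclasses import dataclass

# Hypothetical lexicon (term -> suggested neutral wording).
# Placeholder entries only; NOT the ontology developed in the study.
LEXICON = {
    "noncompliant": "not following the care plan",
    "drug abuser": "person with a substance use disorder",
    "refused": "declined",
}


@dataclass(frozen=True)
class Flag:
    start: int        # character offset where the flagged term begins
    end: int          # character offset just past the flagged term
    term: str         # the text as it appears in the note
    suggestion: str   # proposed alternative wording


def _build_pattern(lexicon):
    # Longest terms first so multiword entries win over shorter overlaps.
    terms = sorted(lexicon, key=len, reverse=True)
    body = "|".join(re.escape(t) for t in terms)
    return re.compile(r"\b(" + body + r")\b", re.IGNORECASE)


PATTERN = _build_pattern(LEXICON)


def flag_note(text):
    """Return one Flag per lexicon match, in order of appearance."""
    return [
        Flag(m.start(), m.end(), m.group(0), LEXICON[m.group(0).lower()])
        for m in PATTERN.finditer(text)
    ]


def suggest_rewrite(text):
    """Replace each flagged term with its suggested wording (case is not preserved)."""
    return PATTERN.sub(lambda m: LEXICON[m.group(0).lower()], text)


def precision_recall_f1(predicted, gold):
    """Exact-span agreement between system spans and human-annotated spans."""
    predicted, gold = set(predicted), set(gold)
    true_pos = len(predicted & gold)
    precision = true_pos / len(predicted) if predicted else 0.0
    recall = true_pos / len(gold) if gold else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    note = "Patient refused medication and is noncompliant."
    for flag in flag_note(note):
        print(flag)
    print(suggest_rewrite(note))
```

A pure lexicon lookup like this cannot tell a stigmatizing use from a neutral one, and a literal substitution can produce awkward grammar. That is one plausible reason to compare it with the "delete, retrieve, and generate" and transformer approaches named in aim 4.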

Results: We received funding from the National Institute on Minority Health and Health Disparities in September 2023. Recruitment began in May 2024, and as of March 2025, interviews have been completed for 9 enrolled participants. We anticipate completing all study aims by April 2027.

Conclusions: This study will leverage extensive data sources to examine stigmatizing language in HHC settings and contribute to the development of systems aimed at effectively reducing the use of such language among HHC nurses.

International Registered Report Identifier (IRRID): DERR1-10.2196/69753.

Source journal