Zhihong Zhang, Pallavi Gupta, Stephanie Potts-Thompson, Laura Prescott, Morgan Morrison, Scott Sittig, Margaret V McDonald, Chase Raymond, Jacquelyn Y Taylor, Maxim Topaz
JMIR Research Protocols, volume 14, e69753. Published 2025-09-25. DOI: 10.2196/69753 (https://doi.org/10.2196/69753). PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12511817/pdf/
Identifying and Reducing Stigmatizing Language in Home Health Care With a Natural Language Processing-Based System (ENGAGE): Protocol for a Mixed Methods Study.
Background: Stigmatizing language is common in clinical notes and can adversely affect the quality of patient care. Natural language processing (NLP) is a promising technology for identifying such language across large volumes of clinical notes in electronic health records.
Objective: This study proposes ENGAGE, an NLP-driven system to automatically identify and replace stigmatizing language.
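To make "identify and replace" concrete, the following is a minimal rule-based sketch in Python, not the ENGAGE implementation. The lexicon entries, the suggested replacements, and the function names `find_terms` and `suggest_rewrite` are all hypothetical and chosen only for illustration; the study's actual ontology of stigmatizing language is still being developed.

```python
import re

# Hypothetical lexicon: illustrative terms only, NOT the study's ontology.
LEXICON = {
    "noncompliant": "not following the care plan",
    "refuses": "declines",
    "claims": "reports",
}

# One case-insensitive pattern that matches any lexicon term as a whole word.
PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(term) for term in LEXICON) + r")\b",
    re.IGNORECASE,
)


def find_terms(note):
    """Return (start, end, matched_text) for each lexicon hit in the note."""
    return [(m.start(), m.end(), m.group(0)) for m in PATTERN.finditer(note)]


def suggest_rewrite(note):
    """Return the note with each lexicon hit swapped for its suggested wording."""

    def _swap(match):
        original = match.group(0)
        replacement = LEXICON[original.lower()]
        # Keep sentence-initial capitalization when the original had it.
        return replacement.capitalize() if original[0].isupper() else replacement

    return PATTERN.sub(_swap, note)


if __name__ == "__main__":
    note = "Patient refuses medication and claims pain is severe."
    print(find_terms(note))
    print(suggest_rewrite(note))
```

A lexicon lookup like this cannot tell stigmatizing uses from neutral ones (for example, "insurance claims"), which is one reason a protocol might compare rule-based methods against context-aware approaches.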
Methods: Using a mixed methods design, we will extract electronic health record data for patients admitted to 2 large, diverse home health care (HHC) organizations between January 2019 and December 2021. We propose the following 4 aims. Aim 1 is to refine the ontology of stigmatizing language in HHC by (1) interviewing a diverse sample of HHC nurses and patients to identify terms to avoid and (2) analyzing clinical notes from various regions in the United States to categorize stigmatizing language. Aim 2 is to determine the best NLP approach for accurately identifying stigmatizing language by training algorithms and comparing their performance to human annotations. Aim 3 is to analyze the prevalence of stigmatizing language by patients' race and ethnicity using adjusted statistical analyses of a sample of approximately half a million HHC patients (34% from racial and ethnic minority groups). Aim 4 is to develop the NLP-driven ENGAGE system by (1) testing NLP methods (rule based; "delete, retrieve, and generate"; and transformers) for suggesting alternative wording and (2) designing and refining the user interface for clinical trial preparation.
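The comparison to human annotations described in aim 2 can be sketched as follows. The abstract does not name the evaluation metrics, so precision, recall, and F1 are assumed here as common choices for this kind of task; the binary labels (1 = stigmatizing, 0 = not) and the function name `precision_recall_f1` are likewise illustrative assumptions, and the example data are made up.

```python
def precision_recall_f1(human, predicted):
    """Score binary model predictions against human annotation labels.

    Both arguments are equal-length sequences of 0/1 labels, where 1 marks
    a text span that was judged to contain stigmatizing language.
    """
    if len(human) != len(predicted):
        raise ValueError("label sequences must have the same length")

    pairs = list(zip(human, predicted))
    tp = sum(1 for h, p in pairs if h == 1 and p == 1)  # true positives
    fp = sum(1 for h, p in pairs if h == 0 and p == 1)  # false positives
    fn = sum(1 for h, p in pairs if h == 1 and p == 0)  # false negatives

    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return precision, recall, f1


if __name__ == "__main__":
    human_labels = [1, 0, 1, 1, 0, 0]   # made-up annotator labels
    model_labels = [1, 0, 0, 1, 1, 0]   # made-up model output
    print(precision_recall_f1(human_labels, model_labels))
```

Running the same function over each candidate model's predictions would give a like-for-like basis for choosing among approaches.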
Results: We received funding from the National Institute on Minority Health and Health Disparities in September 2023. Recruitment began in May 2024, and as of March 2025, interviews have been completed for 9 enrolled participants. We anticipate completing all study aims by April 2027.
Conclusions: This study will leverage extensive data sources to examine stigmatizing language in HHC settings and contribute to the development of systems aimed at effectively reducing the use of such language among HHC nurses.
International Registered Report Identifier (IRRID): DERR1-10.2196/69753.