Enhancing Healthcare Data Integration: A Machine Learning Approach to Harmonizing Laboratory Labels.

AMIA Joint Summits on Translational Science proceedings. AMIA Joint Summits on Translational Science. Pub Date: 2025-06-10. eCollection Date: 2025-01-01.

Mehmet F Bagci, Samantha R Spierling, Anna L Ritko, Truong Nguyen, Brian D Modena, Yusuf Ozturk

Volume 2025, pages 65-73. Publication type: Journal Article. Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12150698/pdf/


Abstract


Enhancing Healthcare Data Integration: A Machine Learning Approach to Harmonizing Laboratory Labels.

Variations in laboratory test names across healthcare systems, stemming from inconsistent terminologies, abbreviations, misspellings, and assay vendors, pose significant challenges to the integration and analysis of clinical data. These discrepancies hinder interoperability and complicate efforts to extract meaningful insights for both clinical research and patient care. In this study, we propose a machine learning-driven solution, enhanced by natural language processing techniques, to standardize lab test names. By employing feature extraction methods that analyze both string similarity and the distributional properties of test results, we improve the harmonization of test names, resulting in a more robust dataset. Our model achieves a 99% accuracy rate in matching lab names, showcasing the potential of AI-driven approaches in resolving long-standing standardization challenges. Importantly, this method enhances the reliability and consistency of clinical data, which is crucial for ensuring accurate results in large-scale clinical studies and improving the overall efficiency of informatics-based research and diagnostics.
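
The abstract names two families of features, string similarity between test names and distributional properties of test results, but does not say which measures or which model were used. The following is only a minimal sketch of what such pairwise features might look like, under stated assumptions: the function names (`normalize`, `string_similarity`, `distribution_gap`, `pair_features`) are hypothetical, `difflib.SequenceMatcher` and a median gap scaled by the pooled interquartile range are stand-ins chosen for illustration, and nothing here should be read as the authors' actual pipeline.

```python
from difflib import SequenceMatcher
from statistics import median, quantiles


def normalize(name: str) -> str:
    """Lowercase a lab test name and collapse punctuation and whitespace."""
    cleaned = "".join(ch if ch.isalnum() else " " for ch in name.lower())
    return " ".join(cleaned.split())


def string_similarity(a: str, b: str) -> float:
    """Similarity ratio in [0, 1]; 1.0 means identical after normalization."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def distribution_gap(xs: list[float], ys: list[float]) -> float:
    """Absolute difference of medians, scaled by the pooled interquartile range."""
    q1, _, q3 = quantiles(xs + ys, n=4)
    iqr = (q3 - q1) or 1.0  # guard against zero spread
    return abs(median(xs) - median(ys)) / iqr


def pair_features(
    name_a: str, values_a: list[float], name_b: str, values_b: list[float]
) -> dict[str, float]:
    """Features for one candidate pair of lab labels (illustrative choice only)."""
    return {
        "name_similarity": string_similarity(name_a, name_b),
        "median_gap": distribution_gap(values_a, values_b),
    }


# Made-up values for illustration; not data from the study.
same = pair_features(
    "Hemoglobin A1c", [5.4, 5.9, 6.1, 7.2],
    "HEMOGLOBIN-A1C", [5.5, 6.0, 6.3, 7.0],
)
different = pair_features(
    "Hemoglobin A1c", [5.4, 5.9, 6.1, 7.2],
    "Glucose", [90.0, 95.0, 100.0, 110.0],
)
print(same)
print(different)
```

In a setup like this, feature dictionaries for labeled pairs would presumably be passed to a classifier that decides whether two labels refer to the same test. Combining both feature families is what lets result distributions separate tests whose names look alike, and lets name similarity link tests whose result distributions overlap.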
