Machine learning-based prediction of hearing loss: Findings of the US NHANES from 2003 to 2018
Yi Mi, Pin Sun
Hearing Research, volume 461, Article 109252. Published 2025-03-30. DOI: 10.1016/j.heares.2025.109252
https://www.sciencedirect.com/science/article/pii/S0378595525000711
Hearing loss (HL) is an escalating public health concern worldwide. This study used data from the National Health and Nutrition Examination Survey (NHANES) to develop an interpretable machine learning (ML) model for predicting HL.
After the inclusion and exclusion criteria were applied, 2814 participants were randomly assigned to one of two groups, one for training and one for validating the predictive models. The most significant variables were identified using Recursive Feature Elimination, and eight ML models were compared to find the optimal HL prediction model. The generalization ability of the models was evaluated via 10-fold cross-validation. Three interpretation methods, feature importance analysis, generalized linear models (GLM) and restricted cubic splines (RCS), were then integrated into a pipeline for model interpretation.
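The 10-fold cross-validation step can be illustrated with a minimal Python sketch that uses only the standard library. It is not the authors' pipeline: the data are synthetic, a one-feature nearest-centroid scorer stands in for the eight ML models, feature selection and oversampling are left out, and the names `stratified_kfold`, `roc_auc`, `make_toy_data` and `cross_validated_auc` are hypothetical.

```python
import random
from collections import defaultdict


def stratified_kfold(labels, k=10, seed=0):
    """Yield (train_idx, test_idx) pairs, keeping class proportions similar in every fold."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        for pos, i in enumerate(indices):
            folds[pos % k].append(i)  # deal each class round-robin across the folds
    for f in range(k):
        test = sorted(folds[f])
        train = sorted(i for g in range(k) if g != f for i in folds[g])
        yield train, test


def roc_auc(y_true, scores):
    """AUC = P(random positive scores higher than random negative); ties count 0.5."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def make_toy_data(n_pos=75, n_neg=225, seed=42):
    """Synthetic one-feature data (NOT NHANES): positives tend to have larger values."""
    rng = random.Random(seed)
    x = [rng.gauss(60, 10) for _ in range(n_pos)] + [rng.gauss(45, 10) for _ in range(n_neg)]
    y = [1] * n_pos + [0] * n_neg
    return x, y


def cross_validated_auc(x, y, k=10, seed=0):
    """Fit a nearest-centroid scorer on each training fold; return the per-fold test AUCs."""
    aucs = []
    for train, test in stratified_kfold(y, k, seed):
        mu_pos = sum(x[i] for i in train if y[i] == 1) / sum(y[i] == 1 for i in train)
        mu_neg = sum(x[i] for i in train if y[i] == 0) / sum(y[i] == 0 for i in train)
        # Higher score = closer to the positive centroid than to the negative one.
        scores = [(x[i] - mu_neg) ** 2 - (x[i] - mu_pos) ** 2 for i in test]
        aucs.append(roc_auc([y[i] for i in test], scores))
    return aucs


if __name__ == "__main__":
    x, y = make_toy_data()
    aucs = cross_validated_auc(x, y)
    print(f"mean AUC over {len(aucs)} folds: {sum(aucs) / len(aucs):.3f}")
```

In a sketch like this, any step that learns from the data (feature selection, oversampling, the model itself) would be fitted on the training folds only, so that each held-out fold remains unseen. The abstract does not state how the authors ordered these steps.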
After the data were balanced using the Synthetic Minority Oversampling Technique (SMOTE), the Random Forest (RF) model exhibited superior performance across all evaluation metrics, particularly AUC, PR-AUC and F1 score. Feature importance analysis identified the top 10 features associated with HL: age, blood lead (Pb) level, urinary thallium (Tl) level, BMI, total energy, urinary antimony (Sb) level, vitamin E intake, urinary cobalt (Co) level, calcium intake and urinary cesium (Cs) level. Moreover, both univariate and multivariate GLMs identified blood Pb [OR (95% CI): 1.169 (1.037, 1.311)] and vitamin E intake [OR (95% CI): 0.776 (0.641, 0.928)] as the main features associated with HL. The RCS analysis further revealed that, after adjustment for confounders, increased blood Pb level and decreased vitamin E intake corresponded to a proportional rise in the anticipated risk of HL.
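To show how odds ratios of this kind relate to GLM coefficients, here is a small sketch that assumes a logistic model with Wald-type confidence intervals. The abstract does not say how the intervals were computed or how the predictors were scaled, so the back-calculated coefficients and standard errors are illustrative only, and the helper names `or_with_ci` and `coef_from_or` are hypothetical.

```python
import math

Z_95 = 1.959963984540054  # two-sided 95% quantile of the standard normal


def or_with_ci(beta, se, z=Z_95):
    """Odds ratio and Wald confidence interval from a logistic coefficient and its SE."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


def coef_from_or(odds_ratio, lower, upper, z=Z_95):
    """Back out (beta, se) from a reported OR and CI, ASSUMING a Wald interval."""
    beta = math.log(odds_ratio)
    se = (math.log(upper) - math.log(lower)) / (2 * z)
    return beta, se


# Odds ratios and 95% CIs as reported in the abstract.
REPORTED = {
    "blood Pb": (1.169, 1.037, 1.311),
    "vitamin E intake": (0.776, 0.641, 0.928),
}

if __name__ == "__main__":
    for name, (odds_ratio, lower, upper) in REPORTED.items():
        beta, se = coef_from_or(odds_ratio, lower, upper)
        direction = "higher" if beta > 0 else "lower"
        print(f"{name}: beta={beta:.3f}, se={se:.3f} ({direction} odds of HL per unit increase)")
```

An OR above 1 (blood Pb) corresponds to a positive coefficient and higher odds of HL, while an OR below 1 (vitamin E intake) corresponds to a negative coefficient and lower odds. Neither reported interval includes 1.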
Our ML models identify key factors that, if validated by future studies, will have important implications for hearing conservation. Furthermore, these ML-based point-of-care prediction models will help overcome barriers to hearing healthcare and enable the efficient allocation of resources by accurately identifying individuals who are in dire need of hearing assessment.
About the journal:
The aim of the journal is to provide a forum for papers concerned with basic peripheral and central auditory mechanisms. Emphasis is on experimental and clinical studies, but theoretical and methodological papers will also be considered. The journal publishes original research papers, review and mini-review articles, rapid communications, method/protocol articles and perspective articles.
Papers submitted should deal with auditory anatomy, physiology, psychophysics, imaging, modeling and behavioural studies in animals and humans, as well as hearing aids and cochlear implants. Papers dealing with the vestibular system are also considered for publication. Papers on comparative aspects of hearing and on effects of drugs and environmental contaminants on hearing function will also be considered. Clinical papers will be accepted when they contribute to the understanding of normal and pathological hearing functions.