Duc Phuc Nguyen, Peter Catcheside, Bastien Lechat, Gary Wittert, Andrew Vakulin, Robert Adams, Sarah L Appleton
Explainable Machine Learning Assists in Revealing Associations Between Polysomnographic Biomarkers and Incident Type 2 Diabetes in Men
Nature and Science of Sleep, volume 17, pages 2013-2025. Published 2025-08-30 (eCollection). Journal Article.
DOI: 10.2147/NSS.S512262 (https://doi.org/10.2147/NSS.S512262)
Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12409479/pdf/
Impact factor: 3.4; JCR: Q2 (Clinical Neurology)
Citations: 0
Abstract
Introduction: Type 2 diabetes (T2D) shows bidirectional relationships with polysomnographic measures. However, no studies have searched systematically for novel polysomnographic biomarkers of T2D. We therefore investigated whether state-of-the-art explainable machine learning (ML) models could identify new polysomnographic biomarkers predictive of incident T2D.
Methods: We applied explainable ML models to longitudinal cohort study data from 536 men who were free of T2D at baseline, among whom 52 cases of T2D were identified at follow-up (mean 8.3 years, range 3.5-10.5 years). Beyond ranking biomarker importance, we explored how the explainable ML approach can identify novel relationships, assist in hypothesis testing, and provide insights into risk factors.
Results: The five most predictive biomarkers were waist circumference, glucose, and three novel sleep biomarkers: the number of 3% desaturations in non-supine sleep, mean heart rate in supine sleep, and mean hypopnea duration. Explainable ML identified a significant association between the number of non-supine desaturation events (threshold of 19 events) and incident T2D (odds ratio = 2.4 [95% CI 1.2-4.8], P = 0.013). No significant associations were found when non-supine desaturation was analyzed as a continuous variable or in quartiles. Additionally, the model provided an individualized risk factor breakdown, supporting a more personalized approach to precision sleep medicine.
Conclusion: Explainable ML supports the role of established biomarkers and reveals novel biomarkers of T2D likely to help guide further hypothesis testing and validation of more robust and clinically useful biomarkers. Although further validation is needed, these proof-of-concept data support the benefits of explainable ML in prospective data analysis.
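The threshold analysis reported in the Results (an odds ratio with a 95% confidence interval for a dichotomized event count) can be illustrated with a minimal sketch. This is not the authors' method: the abstract does not say whether the reported odds ratio was adjusted for covariates or how the interval was computed, so the code below assumes an unadjusted 2x2 table and a Wald interval on the log scale. The function names, the `>=` convention at the threshold, and all counts are hypothetical and chosen only for illustration.

```python
import math


def dichotomize(event_counts, threshold):
    """Flag each participant as exposed when the count reaches the threshold.

    Assumption: ">=" is used here; the abstract does not state whether the
    19-event threshold is inclusive.
    """
    return [count >= threshold for count in event_counts]


def odds_ratio_wald_ci(exposed_cases, exposed_noncases,
                       unexposed_cases, unexposed_noncases, z=1.96):
    """Return (odds_ratio, lower, upper) for an unadjusted 2x2 table.

    The interval is a Wald interval on the log-odds scale; z=1.96 gives
    an approximate 95% interval.
    """
    cells = (exposed_cases, exposed_noncases,
             unexposed_cases, unexposed_noncases)
    if min(cells) <= 0:
        raise ValueError("all four cells must be positive for a Wald interval")
    a, b, c, d = cells
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) is the root of the summed reciprocal counts.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical toy counts (NOT the study data): 100 exposed and
# 100 unexposed participants.
or_, lo, hi = odds_ratio_wald_ci(20, 80, 10, 90)
print(f"OR = {or_:.2f}, 95% CI {lo:.2f}-{hi:.2f}")
```

An adjusted analysis would typically fit a logistic regression with the dichotomized variable and covariates instead; the 2x2 version is shown only because it makes the arithmetic behind an odds ratio and its interval explicit.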
About the journal:
Nature and Science of Sleep is an international, peer-reviewed, open access journal covering all aspects of sleep science and sleep medicine, including the neurophysiology and functions of sleep, the genetics of sleep, sleep and society, biological rhythms, dreaming, sleep disorders and therapy, and strategies to optimize healthy sleep.
Specific topics covered in the journal include:
The functions of sleep in humans and other animals
Physiological and neurophysiological changes with sleep
The genetics of sleep and sleep differences
The neurotransmitters, receptors and pathways involved in controlling both sleep and wakefulness
Behavioral and pharmacological interventions aimed at improving sleep, and improving wakefulness
Sleep changes with development and with age
Sleep and reproduction (e.g., changes across the menstrual cycle, with pregnancy and menopause)
The science and nature of dreams
Sleep disorders
Impact of sleep and sleep disorders on health, daytime function and quality of life
Sleep problems secondary to clinical disorders
Interaction of society with sleep (e.g., consequences of shift work, occupational health, public health)
The microbiome and sleep
Chronotherapy
Impact of circadian rhythms on sleep, physiology, cognition and health
Mechanisms controlling circadian rhythms, centrally and peripherally
Impact of circadian rhythm disruptions (including night shift work, jet lag and social jet lag) on sleep, physiology, cognition and health
Behavioral and pharmacological interventions aimed at reducing adverse effects of circadian-related sleep disruption
Assessment of technologies and biomarkers for measuring sleep and/or circadian rhythms
Epigenetic markers of sleep or circadian disruption.