Deciphering Insomnia: Benchmarking Automated Sleep Staging Algorithms for Complex Sleep Disorders.

IF 3.4 | Zone 3 (Medicine) | Q2 Clinical Neurology

Journal of Sleep Research | Pub Date: 2025-03-27 | DOI: 10.1111/jsr.70048

Umaer Hanif, Anis Aloulou, Flynn Crosbie, Paul Bouchequet, Mounir Chennaoui, Thomas Andrillon, Damien Leger

Citations: 0

Abstract

Polysomnography (PSG) is essential for diagnosing sleep disorders, but its manual interpretation is labor-intensive. Automated sleep staging algorithms are promising, yet their utility in complex sleep disorders such as insomnia remains uncertain. This study evaluates five of the most recognised sleep staging classifiers (U-Sleep, STAGES, GSSC, Luna and YASA) on PSG data from 904 patients with chronic insomnia. Performance was assessed using F1 scores, confusion matrices and predicted sleep metrics. The effect of demographics, sleepiness and PSG metrics on each classifier's performance was assessed using linear regression. Across all sleep stages, GSSC performed best (macro F1 score = 0.66), followed by U-Sleep (0.62), Luna (0.56), STAGES (0.54) and YASA (0.52). GSSC achieved the highest F1 scores in Wake (0.83), N1 (0.22), N2 (0.80), N3 (0.71) and REM (0.76), while U-Sleep matched its performance in N1 and REM, and Luna matched it in N3. STAGES performed worst in N3 (0.39) and YASA in REM (0.35). Common misclassifications included N1 vs. Wake/N2 and N3 vs. N2, with REM misclassified as Wake/N1/N2 by STAGES, Luna and YASA. GSSC and U-Sleep exhibited minimal demographic bias, while STAGES and Luna showed more. No performance difference was observed between chronic insomnia patients with and without abnormal PSG. Sleep metric accuracy was highest for U-Sleep on total sleep time (TST, R² = 0.88), STAGES on sleep onset latency (SOL, R² = 0.82) and GSSC on wake after sleep onset (WASO, R² = 0.82). These findings underscore the solid yet variable performance of the classifiers and highlight GSSC and U-Sleep as leading tools for sleep staging in patients with chronic insomnia.

Source journal

Journal of Sleep Research (Medicine: Clinical Neurology)

CiteScore: 9.00
Self-citation rate: 6.80%
Articles published: 234
Review time: 6-12 weeks

About the journal: The Journal of Sleep Research is dedicated to basic and clinical sleep research. The Journal publishes original research papers and invited reviews in all areas of sleep research (including biological rhythms). The Journal aims to promote the exchange of ideas between basic and clinical sleep researchers coming from a wide range of backgrounds and disciplines. The Journal will achieve this by publishing papers which use multidisciplinary and novel approaches to answer important questions about sleep, as well as its disorders and the treatment thereof.