Acted sleepy speech vs. real sleepy speech: Human perception and machine prediction for sleepiness estimation
Jihye Moon, Youngsun Kong, Jeffrey Bolkhovsky, Yashvi Gupta, Ki H. Chon
International Journal of Human-Computer Studies, Volume 211, Article 103795 (April 2026)
DOI: 10.1016/j.ijhcs.2026.103795
Citations: 0
Abstract
As agentic AI systems increasingly operate in high-stakes human-centered environments, their ability to detect physiological states such as sleepiness is critical for mitigating health and safety risks. Accurately estimating sleepiness from speech is essential for managing these risks, including adverse health outcomes, reduced productivity, and occupational safety hazards. For example, chronic sleep deprivation increases the risk of cardiovascular disease and impairs cognitive performance. Therefore, accurately determining sleepiness holds significant value for both health monitoring and workplace risk management. Since speech serves as a primary interface for human-AI interaction, estimating sleepiness from speech has emerged as an efficient way to mitigate these risks through AI. However, current machine learning (ML) studies have struggled to estimate sleepiness from speech, showing weak Spearman correlations (below 0.40) with perceived sleepiness levels (ground-truth sleepiness). As ML performance depends on high-quality training data, obtaining speech from noticeably sleepy individuals is essential for its improvement. However, collecting such data is risky and costly, as it requires physiologically sleep-deprived speakers with at least 24 h of prolonged wakefulness. Given that humans can mimic sleepy speech, this paper proposes that acted sleepy speech can be an effective surrogate for modeling sleepiness patterns. Our study demonstrates two key findings: First, human listeners rated acted sleepy speech as significantly sleepier than real sleepy speech obtained from a 25-hour sleep deprivation protocol, as confirmed by Welch’s t-test. 
Second, ML models trained on acted speech significantly outperformed those trained on real sleepy speech, achieving a 0.57 correlation with perceived sleepiness levels and 0.83 accuracy in detecting cognitive impairments in sleep-deprived individuals awake for 25 h, even with fewer samples and different lexical transcripts, indicating robustness to lexical variation. This human-centered, ethical, and efficient approach demonstrates that acted sleepy speech can advance real-world speech-based sleepiness estimation and cognitive impairment detection systems. Ultimately, it offers a promising pathway to real-world solutions for reducing sleepiness-related risks in the workplace, improving health outcomes, and enhancing the ability of human-centered agentic AI to support daily human tasks.
Journal description
The International Journal of Human-Computer Studies publishes original research over the whole spectrum of work relevant to the theory and practice of innovative interactive systems. The journal is inherently interdisciplinary, covering research in computing, artificial intelligence, psychology, linguistics, communication, design, engineering, and social organization, which is relevant to the design, analysis, evaluation and application of innovative interactive systems. Papers at the boundaries of these disciplines are especially welcome, as it is our view that interdisciplinary approaches are needed for producing theoretical insights in this complex area and for effective deployment of innovative technologies in concrete user communities.
Research areas relevant to the journal include, but are not limited to:
• Innovative interaction techniques
• Multimodal interaction
• Speech interaction
• Graphic interaction
• Natural language interaction
• Interaction in mobile and embedded systems
• Interface design and evaluation methodologies
• Design and evaluation of innovative interactive systems
• User interface prototyping and management systems
• Ubiquitous computing
• Wearable computers
• Pervasive computing
• Affective computing
• Empirical studies of user behaviour
• Empirical studies of programming and software engineering
• Computer supported cooperative work
• Computer mediated communication
• Virtual reality
• Mixed and augmented reality
• Intelligent user interfaces
• Presence
...