Acted sleepy speech vs. real sleepy speech: Human perception and machine prediction for sleepiness estimation
Jihye Moon, Youngsun Kong, Jeffrey Bolkhovsky, Yashvi Gupta, Ki H. Chon
International Journal of Human-Computer Studies, Volume 211, Article 103795 (April 2026)
DOI: 10.1016/j.ijhcs.2026.103795
Citations: 0
Abstract
As agentic AI systems increasingly operate in high-stakes human-centered environments, their ability to detect physiological states such as sleepiness is critical for mitigating health and safety risks. Accurately estimating sleepiness from speech is essential for managing these risks, including adverse health outcomes, reduced productivity, and occupational safety hazards. For example, chronic sleep deprivation increases the risk of cardiovascular disease and impairs cognitive performance. Therefore, accurately determining sleepiness holds significant value for both health monitoring and workplace risk management. Since speech serves as a primary interface for human-AI interaction, estimating sleepiness from speech has emerged as an efficient way to mitigate these risks through AI. However, current machine learning (ML) studies have struggled to estimate sleepiness from speech, showing weak Spearman correlations (below 0.40) with perceived sleepiness levels (ground-truth sleepiness). As ML performance depends on high-quality training data, obtaining speech from noticeably sleepy individuals is essential for its improvement. However, collecting such data is risky and costly, as it requires physiologically sleep-deprived speakers with at least 24 h of prolonged wakefulness. Given that humans can mimic sleepy speech, this paper proposes that acted sleepy speech can be an effective surrogate for modeling sleepiness patterns. Our study demonstrates two key findings: First, human listeners rated acted sleepy speech as significantly sleepier than real sleepy speech obtained from a 25-hour sleep deprivation protocol, as confirmed by Welch’s t-test. 
Second, ML models trained on acted speech significantly outperformed those trained on real sleepy speech, achieving a 0.57 correlation with perceived sleepiness levels and 0.83 accuracy in detecting cognitive impairments in sleep-deprived individuals awake for 25 h, even with fewer samples and different lexical transcripts, indicating robustness to lexical variation. This human-centered, ethical, and efficient approach demonstrates that acted sleepy speech can advance real-world speech-based sleepiness estimation and cognitive impairment detection systems. Ultimately, it offers a promising pathway to real-world solutions for reducing sleepiness-related risks in the workplace, improving health outcomes, and enhancing the ability of human-centered agentic AI to support daily human tasks.
Journal description
The International Journal of Human-Computer Studies publishes original research over the whole spectrum of work relevant to the theory and practice of innovative interactive systems. The journal is inherently interdisciplinary, covering research in computing, artificial intelligence, psychology, linguistics, communication, design, engineering, and social organization, which is relevant to the design, analysis, evaluation and application of innovative interactive systems. Papers at the boundaries of these disciplines are especially welcome, as it is our view that interdisciplinary approaches are needed for producing theoretical insights in this complex area and for effective deployment of innovative technologies in concrete user communities.
Research areas relevant to the journal include, but are not limited to:
• Innovative interaction techniques
• Multimodal interaction
• Speech interaction
• Graphic interaction
• Natural language interaction
• Interaction in mobile and embedded systems
• Interface design and evaluation methodologies
• Design and evaluation of innovative interactive systems
• User interface prototyping and management systems
• Ubiquitous computing
• Wearable computers
• Pervasive computing
• Affective computing
• Empirical studies of user behaviour
• Empirical studies of programming and software engineering
• Computer supported cooperative work
• Computer mediated communication
• Virtual reality
• Mixed and augmented reality
• Intelligent user interfaces
• Presence
...