Toward Intelligent Head Impulse Test: A Goggle-Free Approach Using a Monocular Infrared Camera
Yang Ouyang ME, Wenwei Luo MD, Yinwei Zhan PhD, Caizhen Wei ME, Xian Liang ME, Hongming Huang MM, Yong Cui MD
Laryngoscope, 135(3):1161–1168, 2025. Published 2024-10-18. DOI: 10.1002/lary.31848. https://onlinelibrary.wiley.com/doi/10.1002/lary.31848
Level of Evidence: 3
Abstract
Objectives
The video head impulse test (vHIT), which evaluates the vestibulo-ocular reflex (VOR), is regarded as the gold standard for assessing vestibular function. However, vHIT requires the patient to wear specialized head-mounted goggles that must be calibrated before each use. To address this, we propose an intelligent head impulse test (iHIT) setup that replaces the head-mounted goggles with a monocular infrared camera, together with a deep-learning video classification approach for determining vestibular function.
Methods
Within the iHIT framework, a monocular infrared camera was placed in front of the patient to capture test videos, from which a dataset of HIT video clips, DiHIT, was built. We then proposed a two-stage multimodal video classification network, trained on DiHIT, that takes as input the eye-motion and head-motion data extracted from facial keypoints in the HIT clips and outputs both the semicircular canal (SCC) being tested (SCC identification) and whether the VOR is abnormal (SCC qualitation).
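The two-stage data flow described above can be illustrated with a toy sketch. This is not the authors' network: both learned stages are replaced here by simple rule-based placeholders, stage 1 distinguishes only a horizontal from a vertical plane rather than individual canals, and every name (`ClipSignals`, `extract_signals`, `identify_plane`, `qualify`, `classify_clip`) as well as the `min_ratio` threshold of 0.7 is a hypothetical choice made for illustration. The actual implementation is in the repository linked in the Conclusions.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple

Point = Tuple[float, float]  # (x, y) image coordinates of one keypoint in one frame


@dataclass
class ClipSignals:
    head: List[Point]  # head displacement per frame, relative to the first frame
    eye: List[Point]   # eye-in-head displacement per frame


def extract_signals(head_track: Sequence[Point],
                    pupil_track: Sequence[Point]) -> ClipSignals:
    """Turn two keypoint tracks into head-motion and eye-motion signals."""
    if not head_track or len(head_track) != len(pupil_track):
        raise ValueError("tracks must be non-empty and of equal length")
    hx0, hy0 = head_track[0]
    px0, py0 = pupil_track[0]
    head = [(x - hx0, y - hy0) for x, y in head_track]
    # Crude eye-in-head motion: pupil displacement minus head displacement.
    eye = [((px - px0) - hx, (py - py0) - hy)
           for (px, py), (hx, hy) in zip(pupil_track, head)]
    return ClipSignals(head=head, eye=eye)


def identify_plane(signals: ClipSignals) -> str:
    """Stage 1 placeholder (assumption): pick the dominant axis of head motion."""
    xs = [h[0] for h in signals.head]
    ys = [h[1] for h in signals.head]
    return "horizontal" if max(xs) - min(xs) >= max(ys) - min(ys) else "vertical"


def qualify(signals: ClipSignals, plane: str, min_ratio: float = 0.7) -> str:
    """Stage 2 placeholder (assumption): compare peak eye and head excursions."""
    axis = 0 if plane == "horizontal" else 1
    head_peak = max(abs(h[axis]) for h in signals.head)
    eye_peak = max(abs(e[axis]) for e in signals.eye)
    if head_peak == 0:
        raise ValueError("no head motion along the identified plane")
    # min_ratio is an arbitrary illustrative threshold, not a clinical value.
    return "normal" if eye_peak / head_peak >= min_ratio else "abnormal"


def classify_clip(head_track: Sequence[Point],
                  pupil_track: Sequence[Point]) -> Tuple[str, str]:
    """Run both stages: stage 2 is conditioned on the plane found by stage 1."""
    signals = extract_signals(head_track, pupil_track)
    plane = identify_plane(signals)
    return plane, qualify(signals, plane)
```

Calling `classify_clip` with a head track that moves sideways and a pupil track that stays fixed in the image returns a plane label and a qualitation label as a pair. In a learned system each placeholder would be swapped for a trained model while the surrounding data flow stays the same.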
Results
In experiments on DiHIT, the network achieved 100% accuracy in SCC identification, and predictive accuracies of 84.1% for horizontal and 79.0% for vertical SCC qualitation.
Conclusions
Compared with existing video-based HIT, iHIT eliminates the goggles, requires no equipment calibration, and is fully automated. Its low cost and ease of operation will bring further benefits to users. Code and a use-case pipeline are available at: https://github.com/dec1st2023/iHIT.
About the journal:
The Laryngoscope has been the leading source of information on advances in the diagnosis and treatment of head and neck disorders since 1890. The Laryngoscope is the first choice among otolaryngologists for publication of their important findings and techniques. Each monthly issue of The Laryngoscope features peer-reviewed medical, clinical, and research contributions in general otolaryngology, allergy/rhinology, otology/neurotology, laryngology/bronchoesophagology, head and neck surgery, sleep medicine, pediatric otolaryngology, facial plastics and reconstructive surgery, oncology, and communicative disorders. Contributions include papers and posters presented at the Annual and Section Meetings of the Triological Society, as well as independent papers, "How I Do It", "Triological Best Practice" articles, and contemporary reviews. Theses authored by the Triological Society’s new Fellows as well as papers presented at meetings of the American Laryngological Association are published in The Laryngoscope.
• Broncho-esophagology
• Communicative disorders
• Head and neck surgery
• Plastic and reconstructive facial surgery
• Oncology
• Speech and hearing defects