M. Naghibolhosseini, Ahmed M. Yousef, Mohsen Zayernouri, Stephanie R. C. Zacharias, D. Deliyski
2023 International Conference on Computational Intelligence and Knowledge Economy (ICCIKE), published 2023-03-09. DOI: 10.1109/ICCIKE58312.2023.10131757 (https://doi.org/10.1109/ICCIKE58312.2023.10131757)
Deep Learning for High-Speed Laryngeal Imaging Analysis
High-speed imaging of the larynx provides a valuable means of studying vocal fold function and vibratory behavior. Using laryngeal high-speed videoendoscopy (HSV) with a flexible nasolaryngoscope, one can record the detailed vibratory movements of the vocal folds during connected speech. This imaging tool makes it possible to study the normal function of the vocal folds and how that function is disrupted by voice disorders. In this work, HSV data were recorded during connected speech from subjects with normophonic voices (no voice disorders) and from subjects with a neurological voice disorder. The data were obtained using a high-speed camera, coupled with a flexible endoscope, at 4,000 frames per second. Deep learning was used to analyze the large HSV dataset and extract the vibratory behaviors of the vocal folds. The deep-learning-based tool achieved high levels of accuracy in the analysis of challenging HSV data in connected speech. It provides a computationally cost-effective and accurate measurement approach that could help in designing more advanced voice assessment protocols in the future.
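To give a sense of what a 4,000 frames-per-second recording implies, the short sketch below works through the arithmetic. Only the frame rate comes from the abstract. The fundamental frequencies (100 Hz and 200 Hz) and the 30-second recording length are illustrative assumptions, and the function names are hypothetical; none of this is taken from the authors' method.

```python
# Back-of-the-envelope arithmetic for a high-speed video recording.
# FRAME_RATE_HZ is stated in the abstract; every other number below is an
# illustrative assumption, not a value reported by the authors.

FRAME_RATE_HZ = 4000  # frames per second, from the abstract


def frame_interval_ms(frame_rate_hz: float) -> float:
    """Time between consecutive frames, in milliseconds."""
    return 1000.0 / frame_rate_hz


def frames_per_cycle(frame_rate_hz: float, f0_hz: float) -> float:
    """Number of frames captured during one vibratory cycle at frequency f0."""
    return frame_rate_hz / f0_hz


def frames_in_recording(frame_rate_hz: float, duration_s: float) -> int:
    """Total number of frames in a recording of the given duration."""
    return round(frame_rate_hz * duration_s)


print(frame_interval_ms(FRAME_RATE_HZ))        # 0.25
print(frames_per_cycle(FRAME_RATE_HZ, 100.0))  # 40.0 (assumed f0 of 100 Hz)
print(frames_per_cycle(FRAME_RATE_HZ, 200.0))  # 20.0 (assumed f0 of 200 Hz)
print(frames_in_recording(FRAME_RATE_HZ, 30))  # 120000 (assumed 30 s recording)
```

Under these assumptions, even a short connected-speech sample produces on the order of a hundred thousand frames, which suggests why an automated analysis is attractive for data of this kind.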