The Initial Screening of Laryngeal Tumors via Voice Acoustic Analysis Based on Siamese Network Under Small Samples
Zhenzhen You, Delong Sun, Zhenghao Shi, Shuangli Du, Xinhong Hei, Demin Kong, Xiaoying Du, Jing Yan, Xiaoyong Ren, Jin Hou
Journal of Voice. Published 2025-05-02. DOI: 10.1016/j.jvoice.2025.03.043
Abstract
Objective: Initial screening of laryngeal tumors via voice acoustic analysis currently relies on the clinician's experience, which is subjective. This article introduces a Siamese network with an auxiliary gender classifier for automated, accurate, and objective initial screening of laryngeal tumors based on voice signals.
Methods: The study involved 71 tumor patients and 293 non-tumor subjects, all Mandarin Chinese speakers. The dataset was split into a training set and a test set at a ratio of 4:1. We applied nine data augmentation techniques to enlarge the voice training set and extracted the corresponding mel-frequency cepstral coefficient (MFCC) maps. The MFCC maps were randomly paired and fed into the proposed Siamese network for multitask classification: tumor versus non-tumor, and female versus male. The proposed model was compared with one machine learning method and six classical deep learning models, with and without the auxiliary gender classifier.
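The split and pairing steps can be sketched in a few lines of Python. This is a minimal illustration, not the authors' pipeline: the function names `stratified_split` and `random_pairs` are hypothetical, integer indices stand in for MFCC maps, the split is assumed to be stratified by class, and the pair target is assumed to be a same-class indicator, which the abstract does not specify. Augmentation and MFCC extraction are omitted.

```python
import random


def stratified_split(labels, test_fraction=0.2, seed=0):
    """Split sample indices into train/test so each class keeps roughly a 4:1 ratio."""
    rng = random.Random(seed)
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)
    train, test = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)
        test.extend(indices[:n_test])
        train.extend(indices[n_test:])
    return sorted(train), sorted(test)


def random_pairs(indices, labels, n_pairs, seed=0):
    """Draw random pairs of distinct samples.

    Assumption: the pair target is 1 when both samples share a class label,
    0 otherwise. The paper's actual pairing rule may differ.
    """
    rng = random.Random(seed)
    pairs = []
    for _ in range(n_pairs):
        a, b = rng.sample(indices, 2)
        pairs.append((a, b, int(labels[a] == labels[b])))
    return pairs


# Cohort sizes from the abstract: 71 tumor (label 1), 293 non-tumor (label 0).
labels = [1] * 71 + [0] * 293
train_idx, test_idx = stratified_split(labels)
pairs = random_pairs(train_idx, labels, n_pairs=1000)
print(len(train_idx), len(test_idx), len(pairs))
```

Pairs are drawn only from the training indices so that no test subject leaks into training. In a real pipeline the augmented MFCC maps would be paired in the same way.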
Results: Experiments demonstrate the superiority of the proposed network compared with the reference models. The proposed model achieved an overall accuracy of 0.9437, an F score of 0.8462, a precision of 0.9167, a sensitivity of 0.7857, and a specificity of 0.9825.
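As a worked check of how these five figures relate, the sketch below computes them from a confusion matrix. The counts used (11 true positives, 3 false negatives, 56 true negatives, 1 false positive) are not reported in the abstract. They are an inference: one set of counts that reproduces all five reported values to four decimal places, under the assumption that the F score is F1. The function name `screening_metrics` is hypothetical.

```python
def screening_metrics(tp, fn, tn, fp):
    """Standard binary-classification metrics, with tumor as the positive class."""
    precision = tp / (tp + fp)
    sensitivity = tp / (tp + fn)  # recall on tumor cases
    specificity = tn / (tn + fp)  # recall on non-tumor cases
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": accuracy,
        "f1": f1,
        "precision": precision,
        "sensitivity": sensitivity,
        "specificity": specificity,
    }


# Hypothetical counts, inferred to match the reported values (not from the paper).
metrics = screening_metrics(tp=11, fn=3, tn=56, fp=1)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Under these assumed counts, the sensitivity of 0.7857 would correspond to 3 missed tumor cases out of 14, which is the figure most relevant to a screening tool.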
Conclusion: The proposed network can assist in the initial screening of laryngeal tumors through voice acoustic analysis. Screening based solely on voice acoustic analysis can help individuals seek medical assistance outside the hospital and reduce the burden on doctors.
About the journal:
The Journal of Voice is widely regarded as the world's premier journal for voice medicine and research. This peer-reviewed publication is listed in Index Medicus and is indexed by the Institute for Scientific Information. The journal contains articles written by experts throughout the world on all topics in voice sciences, voice medicine and surgery, and speech-language pathologists' management of voice-related problems. The journal includes clinical articles, clinical research, and laboratory research. Members of the Foundation receive the journal as a benefit of membership.