An Automatic Speech Segmentation Algorithm of Portuguese based on Spectrogram Windowing
Lap-Man Hoi, Yuqi Sun, S. Im
2022 IEEE World AI IoT Congress (AIIoT), 2022-06-06
DOI: 10.1109/aiiot54504.2022.9817299
Citation count: 0
Abstract
Sentence segmentation is important for improving the human readability of Automatic Speech Recognition (ASR) output. Although the problem has been explored in numerous interdisciplinary studies, segmenting Portuguese speech remains time-consuming because efficient automatic methods are lacking and the work relies on qualified phonetic experts. This paper presents a novel algorithm that efficiently segments speech into sentences: a classification model built with an Artificial Neural Network (ANN) learns sentence spectrograms through windows. In our experiments, the beginning part of a European Portuguese (EP) sentence enables better identification of sentence boundaries. In addition, a spectrogram window composed of the last 100 milliseconds (ms) of the preceding sentence and the first 300 ms of the following sentence gives the best automatic segmentation performance. As a result, the proposed algorithm can automatically segment Portuguese speech into sentences by analyzing its spectrogram, without knowledge of the speech's semantics.
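The windowing idea in the abstract can be illustrated with a minimal sketch. It only shows how a spectrogram window spanning 100 ms before and 300 ms after a candidate boundary might be cut out; it is not the authors' implementation. The function names `spectrogram` and `boundary_window`, the 25 ms frame length, the 10 ms hop, and the log-magnitude features are all assumptions made for illustration, and the ANN classifier that would consume the window is omitted because the abstract does not describe it.

```python
import numpy as np

def spectrogram(signal, sample_rate, frame_ms=25, hop_ms=10):
    """Log-magnitude spectrogram, shape (n_frames, n_bins).

    Frame and hop sizes are illustrative assumptions, not values from the paper.
    """
    frame_len = int(sample_rate * frame_ms / 1000)
    hop_len = int(sample_rate * hop_ms / 1000)
    n_frames = 1 + (len(signal) - frame_len) // hop_len
    window = np.hanning(frame_len)
    # Slice the signal into overlapping frames and taper each one.
    frames = np.stack([
        signal[i * hop_len : i * hop_len + frame_len] * window
        for i in range(n_frames)
    ])
    return np.log1p(np.abs(np.fft.rfft(frames, axis=1)))

def boundary_window(spec, boundary_frame, hop_ms=10, pre_ms=100, post_ms=300):
    """Cut the frames covering pre_ms before and post_ms after a candidate boundary.

    Returns None when the window would run past either end of the spectrogram.
    """
    pre = pre_ms // hop_ms    # frames taken from the end of the previous sentence
    post = post_ms // hop_ms  # frames taken from the start of the next sentence
    start, stop = boundary_frame - pre, boundary_frame + post
    if start < 0 or stop > spec.shape[0]:
        return None
    return spec[start:stop]
```

With a 10 ms hop, the 100 ms plus 300 ms window corresponds to 40 spectrogram frames. Each such window would then be passed to a binary classifier that decides whether the candidate frame is a sentence boundary.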