Multi-Scale Model for Mandarin Tone Recognition
Authors: Linkai Peng, Wang Dai, Dengfeng Ke, Jinsong Zhang
Published in: 2021 12th International Symposium on Chinese Spoken Language Processing (ISCSLP)
Publication date: 2021-01-24
DOI: 10.1109/ISCSLP49672.2021.9362063
Citations: 3
Abstract
Tone plays an important role in tonal languages such as Mandarin, and tone classification is an essential component of speech evaluation for Mandarin Chinese. Previous tone classification methods rarely take into account that different tones have different scales along both the time and frequency axes. Moreover, tone contours are subject to many kinds of variation, so information from multiple scales can help a model resolve the unclear boundaries between tones in continuous speech. In this work, we propose a Multi-Scale model that gathers information at multiple resolutions to better capture the characteristics of tone variations affected by complex phonetic and linguistic rules. Experimental results show that our method achieves competitive results on the Chinese National Hi-Tech Project 863 corpus, with a tone error rate (TER) of 10.5%.
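As a rough illustration of the reported metric, the sketch below computes a TER figure as the percentage of syllables whose predicted tone differs from the reference tone. This is an assumption about how the metric is scored, not a definition taken from the paper; the function name `tone_error_rate` and the label sequences are hypothetical and are not drawn from the 863 corpus.

```python
from typing import Sequence


def tone_error_rate(reference: Sequence[int], predicted: Sequence[int]) -> float:
    """Return the percentage of syllables whose predicted tone is wrong.

    Assumes one tone label per syllable and a one-to-one alignment
    between the reference and predicted sequences.
    """
    if len(reference) != len(predicted):
        raise ValueError("reference and predicted must have the same length")
    if not reference:
        raise ValueError("at least one syllable is required")
    errors = sum(1 for ref, hyp in zip(reference, predicted) if ref != hyp)
    return 100.0 * errors / len(reference)


# Hypothetical labels: 1-4 for the four lexical tones, 0 for the neutral tone.
reference = [1, 2, 3, 4, 0, 2, 1, 3, 4, 2]
predicted = [1, 2, 2, 4, 0, 2, 1, 3, 4, 2]  # one tone-3 syllable misread as tone 2
ter = tone_error_rate(reference, predicted)
print(f"TER: {ter:.1f}%")
```

Under this scoring, a TER of 10.5% would correspond to roughly one misclassified tone in every nine to ten syllables.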