{"title":"ToBI accent type recognition","authors":"Arman Maghbouleh","doi":"10.21437/ICSLP.1998-126","DOIUrl":null,"url":null,"abstract":"This paper describes work in progress for recognizing a subset of ToBI intonation labels (H*, L+H*, L*, !H*, L+!H*, no accent). Initially, duration characteristics are used to classify syllables as accented or not. The accented syllables are then subclassified based on fundamental frequency, F0, values. Potential F0 intonation gestures are schematized by connected line segments within a window around a given syllable. The schematizations are found using spline-basis linear regression. The regression weights on F0 points are varied in order to discount segmental effects and F0 detection errors. Parameters based on the line segments are then used to perform the subclassification. This paper presents new results in recognizing L*, L+H*, and L+!H* accents. In addition, the models presented here perform comparably (80% overall, and 74% accent type recognition) to models which do not distinguish bitonal accents.","PeriodicalId":117113,"journal":{"name":"5th International Conference on Spoken Language Processing (ICSLP 1998)","volume":"65 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"1998-11-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"9","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"5th International Conference on Spoken Language Processing (ICSLP 1998)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.21437/ICSLP.1998-126","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 9
Abstract
This paper describes work in progress for recognizing a subset of ToBI intonation labels (H*, L+H*, L*, !H*, L+!H*, no accent). Initially, duration characteristics are used to classify syllables as accented or not. The accented syllables are then subclassified based on fundamental frequency, F0, values. Potential F0 intonation gestures are schematized by connected line segments within a window around a given syllable. The schematizations are found using spline-basis linear regression. The regression weights on F0 points are varied in order to discount segmental effects and F0 detection errors. Parameters based on the line segments are then used to perform the subclassification. This paper presents new results in recognizing L*, L+H*, and L+!H* accents. In addition, the models presented here perform comparably (80% overall, and 74% accent type recognition) to models which do not distinguish bitonal accents.