Predicting prosodic words from lexical words - a first step towards predicting prosody from text
Hua-Jui Peng, Chi-ching Chen, Chiu-yu Tseng, Keh-Jiann Chen
2004 International Symposium on Chinese Spoken Language Processing, 2004-12-15
DOI: 10.1109/CHINSL.2004.1409614
Much remains unsolved in how to predict prosody from text for unlimited Mandarin Chinese TTS. The interactions between syntactic structure and prosodic structure, and the rules governing them, are still unresolved challenges. Using part-of-speech (POS) tagging, which requires lexical information from the text, we aim to find significant word-grouping patterns by analyzing real speech data together with that lexical information. The paper reports discrepancies found between lexical words (LWs) parsed from text and prosodic words (PWs) annotated from speech data, and proposes a statistical model to predict PWs from LWs. In the statistical model, word length and POS tag are the two essential features for predicting PWs, and the results show a prediction rate of approximately 90% for PWs, although the model leaves room for extension. We believe that evidence from PW predictions is a first step towards building prosody models from text.
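The abstract does not specify the form of the statistical model, so the following is only an illustrative sketch of the general idea: deciding, from word length and POS tag alone, whether a prosodic-word boundary falls between two adjacent lexical words. It assumes a simple count-based majority vote with a length-only backoff, which is a stand-in technique and not the authors' model. The POS tags, the training sentences, the boundary labels, and all function names are hypothetical and invented for illustration.

```python
from collections import Counter, defaultdict

# One lexical word (LW): (text, POS tag, True if a prosodic-word boundary follows it).
# Tags and boundary labels are invented toy data, NOT taken from the paper.
TRAIN = [
    [("我", "N", False), ("的", "DE", True), ("朋友", "N", True)],
    [("他", "N", False), ("的", "DE", True), ("老师", "N", True)],
    [("很", "D", False), ("好", "V", True)],
    [("朋友", "N", True), ("很", "D", False), ("忙", "V", True)],
]


def junction_features(left, right):
    """Features for the junction between two adjacent LWs: lengths and POS tags."""
    return (len(left[0]), left[1], len(right[0]), right[1])


def train(sentences):
    """Count boundary outcomes per feature tuple, plus a length-only backoff table."""
    full = defaultdict(Counter)
    by_length = defaultdict(Counter)
    for sent in sentences:
        for left, right in zip(sent, sent[1:]):
            feats = junction_features(left, right)
            full[feats][left[2]] += 1
            by_length[(feats[0], feats[2])][left[2]] += 1
    return full, by_length


def predict_boundary(model, left, right):
    """Majority vote on the full features, then on lengths only, else assume a boundary."""
    full, by_length = model
    feats = junction_features(left, right)
    for counts in (full.get(feats), by_length.get((feats[0], feats[2]))):
        if counts:
            return counts[True] >= counts[False]
    return True  # unseen context: keep the LW as its own PW (an assumption)


def group_into_pws(model, lws):
    """Merge a list of (text, POS) lexical words into predicted prosodic words."""
    if not lws:
        return []
    pws, current = [], lws[0][0]
    for left, right in zip(lws, lws[1:]):
        if predict_boundary(model, left, right):
            pws.append(current)
            current = right[0]
        else:
            current += right[0]
    pws.append(current)
    return pws


model = train(TRAIN)
print(group_into_pws(model, [("她", "N"), ("的", "DE"), ("学生", "N")]))
```

On this toy data the call groups the three lexical words into two predicted prosodic words, `['她的', '学生']`, because every monosyllabic N followed by DE in the training set was merged. A real system would need speech-annotated PW boundaries of the kind the paper describes, and would be evaluated against them rather than against hand-made examples.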