Machine Learning Applied in Speech Science
R. Trencsényi, L. Czap
2022 23rd International Carpathian Control Conference (ICCC), published 2022-05-29
DOI: 10.1109/iccc54292.2022.9805885 (https://doi.org/10.1109/iccc54292.2022.9805885)
We study the qualitative and quantitative information that can be extracted from two-dimensional dynamic audiovisual recordings of human speech, in which image and sound signals are stored in synchrony. Our main tool is machine learning, which we use to connect data from ultrasound (US) and magnetic resonance imaging (MRI) recordings. As a starting point, we track the tongue contours in the US and MRI frames with our automatic algorithms. The neural network we construct takes data derived from the US tongue contours as input, and parameters obtained from the MRI tongue contours are assigned to its output. By varying the number of input parameters, we create several system settings and test the operation and efficiency of the network in each configuration.
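The experimental setup described in the abstract (a network fed with US-derived contour parameters, trained against MRI-derived contour parameters, and evaluated for several input sizes) could be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the data are synthetic random numbers standing in for contour parameters, and the network architecture, the training procedure, and every name here (`make_synthetic_pairs`, `train_mlp`, `compare_input_sizes`) are hypothetical.

```python
import numpy as np


def make_synthetic_pairs(n_frames, n_in_max, n_out, rng):
    """Synthetic stand-in for paired contour parameters (NOT real US/MRI data)."""
    x = rng.normal(size=(n_frames, n_in_max))      # hypothetical US contour parameters
    w = rng.normal(size=(n_in_max, n_out)) / np.sqrt(n_in_max)
    y = np.tanh(x @ w)                             # hypothetical MRI contour parameters
    return x, y


def train_mlp(x, y, hidden=16, epochs=500, lr=0.05, seed=0):
    """Fit a one-hidden-layer tanh network by full-batch gradient descent."""
    rng = np.random.default_rng(seed)
    n, n_in = x.shape
    n_out = y.shape[1]
    w1 = rng.normal(size=(n_in, hidden)) / np.sqrt(n_in)
    b1 = np.zeros(hidden)
    w2 = rng.normal(size=(hidden, n_out)) / np.sqrt(hidden)
    b2 = np.zeros(n_out)
    for _ in range(epochs):
        h = np.tanh(x @ w1 + b1)                   # hidden activations
        pred = h @ w2 + b2                         # linear output layer
        g_pred = 2.0 * (pred - y) / n              # gradient of squared error, averaged over frames
        g_w2 = h.T @ g_pred
        g_b2 = g_pred.sum(axis=0)
        g_z = (g_pred @ w2.T) * (1.0 - h ** 2)     # backpropagate through tanh
        g_w1 = x.T @ g_z
        g_b1 = g_z.sum(axis=0)
        w1 -= lr * g_w1
        b1 -= lr * g_b1
        w2 -= lr * g_w2
        b2 -= lr * g_b2

    def predict(x_new):
        return np.tanh(x_new @ w1 + b1) @ w2 + b2

    return predict


def compare_input_sizes(input_sizes=(2, 4, 8), n_out=4, n_frames=400, seed=0):
    """Train one network per input size; return held-out mean squared error."""
    rng = np.random.default_rng(seed)
    x, y = make_synthetic_pairs(n_frames, max(input_sizes), n_out, rng)
    split = int(0.8 * n_frames)                    # simple train/test split
    results = {}
    for k in input_sizes:
        predict = train_mlp(x[:split, :k], y[:split], seed=seed)
        test_error = np.mean((predict(x[split:, :k]) - y[split:]) ** 2)
        results[k] = float(test_error)
    return results


if __name__ == "__main__":
    for k, err in compare_input_sizes().items():
        print(f"{k} input parameters: test MSE = {err:.4f}")
```

Each configuration keeps only the first `k` input parameters while the target stays fixed, so the per-configuration errors can be compared directly. With real data, the synthetic generator would be replaced by parameters extracted from the tracked US and MRI tongue contours.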