Ultrasound-Based Silent Speech Interface Using Convolutional and Recurrent Neural Networks
Authors: E. Juanpere, T. Csapó
Journal: Acta Acustica united with Acustica
Publication date: 2019-07-01
DOI: https://doi.org/10.3813/AAA.919339
A Silent Speech Interface (SSI) is a technology whose goal is to synthesize speech from articulatory motion. A Deep Neural Network based SSI is proposed that takes ultrasound images of the tongue as input and predicts the spectral coefficients of a vocoder as target parameters. Several deep learning models are presented and discussed, including a baseline feed-forward network and a combination of Convolutional and Recurrent Neural Networks. A pre-processing step using a Deep Convolutional AutoEncoder was also studied. According to the experimental results, an architecture based on a CNN and bidirectional LSTM layers achieved the best objective and subjective results.
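The kind of pipeline the abstract describes (per-frame convolutional features, a bidirectional LSTM over the frame sequence, and a linear layer producing vocoder spectral coefficients) can be sketched roughly as below. This is only an illustrative forward pass with random, untrained weights, not the authors' model: the image size, kernel count, hidden size, number of spectral coefficients, and all function names are assumptions chosen for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

def conv2d_relu(img, kernels):
    """Valid 2-D cross-correlation of one image with K kernels, followed by ReLU."""
    _, kh, kw = kernels.shape
    # windows has shape (H - kh + 1, W - kw + 1, kh, kw)
    windows = np.lib.stride_tricks.sliding_window_view(img, (kh, kw))
    out = np.einsum("ijab,kab->kij", windows, kernels)
    return np.maximum(out, 0.0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_pass(xs, W, U, b):
    """Run one LSTM direction over xs of shape (T, D); return hidden states (T, H)."""
    hidden = U.shape[1]
    h = np.zeros(hidden)
    c = np.zeros(hidden)
    states = []
    for x in xs:
        z = W @ x + U @ h + b           # stacked pre-activations, shape (4H,)
        i, f, g, o = np.split(z, 4)     # input, forget, candidate, output
        c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
        h = sigmoid(o) * np.tanh(c)
        states.append(h)
    return np.stack(states)

def bilstm(xs, fwd, bwd):
    """Concatenate a forward pass and a time-reversed backward pass."""
    h_f = lstm_pass(xs, *fwd)
    h_b = lstm_pass(xs[::-1], *bwd)[::-1]
    return np.concatenate([h_f, h_b], axis=1)

def init_lstm(d, h):
    """Small random LSTM weights (illustrative initialisation only)."""
    return (rng.normal(0, 0.1, (4 * h, d)),
            rng.normal(0, 0.1, (4 * h, h)),
            np.zeros(4 * h))

# Hypothetical sizes, not taken from the paper.
T, HGT, WID = 5, 16, 16                  # frames, image height, image width
N_KERNELS, HIDDEN, N_SPECTRAL = 4, 8, 25

kernels = rng.normal(0, 0.1, (N_KERNELS, 3, 3))
feat_dim = N_KERNELS * (HGT - 2) * (WID - 2)
fwd = init_lstm(feat_dim, HIDDEN)
bwd = init_lstm(feat_dim, HIDDEN)
W_out = rng.normal(0, 0.1, (N_SPECTRAL, 2 * HIDDEN))

def predict(frames):
    """Map a sequence of ultrasound frames (T, H, W) to (T, N_SPECTRAL)."""
    feats = np.stack([conv2d_relu(f, kernels).ravel() for f in frames])
    states = bilstm(feats, fwd, bwd)
    return states @ W_out.T

frames = rng.random((T, HGT, WID))       # stand-in for tongue ultrasound images
spectral = predict(frames)
print(spectral.shape)                    # (5, 25)
```

Because the backward pass reads the sequence in reverse, the prediction for each frame can draw on both earlier and later tongue images, which is the usual motivation for a bidirectional layer in this setting. In practice the weights would be trained to regress the vocoder parameters, and a deep learning framework would replace the hand-written loops.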
About the journal:
Publication has ceased. Acta Acustica united with Acustica (Acta Acust united Ac) was published together with the European Acoustics Association (EAA). It was an international, peer-reviewed journal on acoustics that published original articles on all subjects in the field, such as:
• General Linear Acoustics
• Nonlinear Acoustics, Macrosonics
• Aeroacoustics
• Atmospheric Sound
• Underwater Sound
• Ultrasonics
• Physical Acoustics
• Structural Acoustics
• Noise Control
• Active Control
• Environmental Noise
• Building Acoustics
• Room Acoustics
• Acoustic Materials and Metamaterials
• Audio Signal Processing and Transducers
• Computational and Numerical Acoustics
• Hearing, Audiology and Psychoacoustics
• Speech
• Musical Acoustics
• Virtual Acoustics
• Auditory Quality of Systems
• Animal Bioacoustics
• History of Acoustics