Automatic Transcription and Captioning System for Bahasa Indonesia in Multi-Speaker Environment
Muhammad Bagus Andra, T. Usagawa
2020 5th International Conference on Intelligent Informatics and Biomedical Sciences (ICIIBMS), 2020-11-18
DOI: 10.1109/ICIIBMS50712.2020.9336388
Bahasa Indonesia is still considered a low-resource language, and compared with more established languages such as English it lags in the development of communication-assistive technology. This paper proposes a method for automatically transcribing simultaneous speech in Bahasa Indonesia. The method is intended as an assistive tool for settings where speakers talk over one another, such as online discussions and remote conferences. It uses pitch-aware, gain-based speech separation to distinguish between speakers, and a recurrent neural network (RNN) to transcribe the separated speech. The method can detect and transcribe a mixed speech signal of up to three speakers, and it shows improved performance over the baseline method in single-speaker conditions.
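The abstract does not say how pitch is estimated or how the gains are computed, so the following is only a rough illustration of what a "pitch-aware" front end has to do first: estimate the fundamental frequency of a short audio frame. The sketch assumes a plain autocorrelation estimator. The function name `estimate_pitch_autocorr`, the 60-400 Hz search range, and the 8 kHz test tone are hypothetical choices made for this example, not details taken from the paper.

```python
import math


def estimate_pitch_autocorr(frame, sample_rate, fmin=60.0, fmax=400.0):
    """Estimate the fundamental frequency (Hz) of one frame by autocorrelation.

    Illustrative only: the name, the 60-400 Hz search range, and the use of
    autocorrelation are assumptions, not the method described in the paper.
    Returns None when no positive correlation peak is found (e.g. silence).
    """
    min_lag = max(1, int(sample_rate / fmax))            # shortest period searched
    max_lag = min(int(sample_rate / fmin), len(frame) - 1)  # longest period searched
    best_lag, best_score = None, 0.0
    for lag in range(min_lag, max_lag + 1):
        # Correlate the frame with a copy of itself shifted by `lag` samples.
        score = sum(frame[n] * frame[n + lag] for n in range(len(frame) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    if best_lag is None:
        return None
    return sample_rate / best_lag


# Usage: a synthetic 200 Hz tone sampled at 8 kHz (50 ms frame).
sample_rate = 8000
tone = [math.sin(2 * math.pi * 200 * n / sample_rate) for n in range(400)]
print(estimate_pitch_autocorr(tone, sample_rate))  # prints 200.0
```

A separation stage could use per-frame pitch estimates like this one to decide which parts of the mixture belong to which speaker before the RNN transcribes each stream. How the paper actually does this is not stated in the abstract.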