2019 22nd Conference of the Oriental COCOSDA International Committee for the Co-ordination and Standardisation of Speech Databases and Assessment Techniques (O-COCOSDA): Latest Publications

The Architecture of Speech-to-Speech Translator for Mobile Conversation
Agung Santosa, Andi Djalal Latief, Hammam Riza, Asril Jarin, Lyla Ruslana Aini, Gunarso, Gita Citra Puspita, M. T. Uliniansyah, Elvira Nurfadhilah, Harnum A. Prafitia, Made Gunawan
DOI: 10.1109/O-COCOSDA46868.2019.9041196
Published: 2019-10-01
Abstract: Building on the natural language processing competencies and engineering results that BPPT has accumulated since 1987, BPPT has developed an English-Bahasa Indonesia speech-to-speech translation (S2ST) system. In this paper, we propose an architecture for a speech-to-speech translation system for Android-based mobile conversation that uses a separate mobile device for each language. The architecture applies three leading technologies, namely WebSocket, REST, and JSON. The system uses a two-way communication protocol between the two users and a simple voice activation detector that can detect the boundaries of a user's utterance.
Citations: 2
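To make the client-server exchange concrete, below is a minimal sketch of how one of the two mobile clients might submit a recognized utterance for translation over WebSocket with a JSON payload. The endpoint URI, the message field names, and the use of the Python websockets package are illustrative assumptions; the paper does not publish these details.

    import asyncio
    import json
    import websockets  # pip install websockets

    # Hypothetical endpoint; the paper does not give its server address.
    SERVER_URI = "ws://example-s2st-server:8080/translate"

    async def send_utterance(text, source_lang="en", target_lang="id"):
        """Send one recognized utterance and wait for its translation."""
        async with websockets.connect(SERVER_URI) as ws:
            # Illustrative JSON message schema (field names are assumptions).
            await ws.send(json.dumps({
                "type": "utterance",
                "source_lang": source_lang,
                "target_lang": target_lang,
                "text": text,
            }))
            reply = json.loads(await ws.recv())
            return reply.get("translation")

    if __name__ == "__main__":
        print(asyncio.run(send_utterance("Where is the train station?")))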
The Study of Prosody-Pragmatics Interface with Focus Functioning as Pragmatic Markers: The Case of Question and Statement
Siyi Cao, Yizhong Xu, Xiaoli Ji
DOI: 10.1109/O-COCOSDA46868.2019.9041157
Published: 2019-10-01
Abstract: Building on [22]'s perspective that pragmatic markers (PMs) are realized mainly through prosody by both native and non-native speakers, this paper investigates whether, when focus functions as a pragmatic marker, pragmatic factors on the part of non-native speakers restrict the prosodic realization of PMs and lead to misunderstanding in intercultural communication, taking declarative questions and statements as the test case. Pitch contours of sentences produced by 17 Chinese EFL (English as a foreign language) learners (non-native speakers) were compared with those of six native speakers, using four sentences from AESOP. The results demonstrate that both native and non-native speakers do realize pragmatic markers (focused words) through prosodic cues (pitch range), but differ in how they realize them, leading to pragmatic misunderstanding. The paper supports [22]'s view and shows that pragmatic factors such as transfer, L2 teaching, and the proficiency of non-native speakers constrain the prosodic means of realizing pragmatic markers, which indicates conventionality in cross-cultural conversation.
Citations: 0
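As a concrete example of the prosodic cue compared in this study, the sketch below extracts an F0 contour with librosa's pYIN tracker and computes the pitch range over a focused word, expressed in semitones. The file name, the word's time span, and the semitone unit are illustrative assumptions rather than the authors' exact procedure.

    import numpy as np
    import librosa  # pip install librosa

    def pitch_range_semitones(wav_path, start_s, end_s, sr=16000):
        """Pitch range (max/min F0, in semitones) over a word's time span."""
        y, sr = librosa.load(wav_path, sr=sr)
        f0, voiced_flag, _ = librosa.pyin(
            y, fmin=librosa.note_to_hz("C2"), fmax=librosa.note_to_hz("C6"), sr=sr)
        times = librosa.times_like(f0, sr=sr)
        # Keep only voiced frames inside the word interval.
        mask = voiced_flag & (times >= start_s) & (times <= end_s)
        voiced_f0 = f0[mask]
        if voiced_f0.size == 0:
            return 0.0
        # 12 semitones per octave; an octave is a log2 ratio of Hz values.
        return 12.0 * np.log2(voiced_f0.max() / voiced_f0.min())

    # Hypothetical usage: the focused word spans 0.80-1.25 s in the recording.
    print(pitch_range_semitones("learner_sentence.wav", 0.80, 1.25))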
Comparison between read and spontaneous speech assessment of L2 Korean
S. Yang, Minhwa Chung
DOI: 10.1109/O-COCOSDA46868.2019.9060846
Published: 2019-10-01
Abstract: This paper describes two experiments that explore the relationship between linguistic factors and perceived proficiency in read and spontaneous speech. In Experiment 1, 5,000 read-speech utterances by 50 non-native speakers of Korean, and in Experiment 2, 6,000 spontaneous-speech utterances, were scored for proficiency by native human raters and analyzed with respect to factors known to be related to perceived proficiency. The results show that the factors investigated in this study can be used to predict proficiency ratings, and that the predictive power of fluency and of pitch and accent accuracy is strong for both read and spontaneous speech. We also observe that while proficiency ratings of read speech are mainly related to segmental accuracy, those of spontaneous speech appear to be more related to pitch and accent accuracy. Moreover, proficiency in read speech does not always equate to proficiency in spontaneous speech, and vice versa, with a per-speaker Pearson correlation of 0.535.
Citations: 0
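The per-speaker correlation reported above is an ordinary Pearson correlation between each speaker's read-speech and spontaneous-speech proficiency scores; a minimal sketch follows, with made-up placeholder scores standing in for the rated data.

    import numpy as np
    from scipy.stats import pearsonr

    # Placeholder data: one mean proficiency score per speaker (scale assumed).
    read_scores = np.array([3.2, 4.1, 2.8, 3.9, 4.5, 2.5, 3.7])
    spont_scores = np.array([3.0, 3.6, 3.1, 3.3, 4.2, 2.9, 3.2])

    r, p_value = pearsonr(read_scores, spont_scores)
    print(f"per-speaker Pearson r = {r:.3f} (p = {p_value:.3f})")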
A Large Collection of Sentences Read Aloud by Vietnamese Learners of Japanese and Native Speaker's Reverse Shadowings
Shintaro Ando, Z. Lin, Tasavat Trisitichoke, Y. Inoue, Fuki Yoshizawa, D. Saito, N. Minematsu
DOI: 10.1109/O-COCOSDA46868.2019.9041215
Published: 2019-10-01
Abstract: The main objective of language learning is to acquire good communication skills in the target language. From this viewpoint, the primary goal of pronunciation training is to be able to speak with an intelligible-enough or comprehensible-enough pronunciation, not a native-sounding one. However, achieving such pronunciation is still not easy for many learners, mainly because they lack opportunities to use the language they are learning and to receive feedback on intelligibility or comprehensibility from native listeners. To address this problem, the authors previously proposed a novel method of native speakers' reverse shadowing and showed that the degree of inarticulation observed in native speakers' shadowings of learners' utterances can be used to estimate the comprehensibility of learners' speech. One major limitation of our previous research, however, was its relatively small scale: the number of learners was only six. In this study, we therefore carried out a larger collection of Japanese utterances read aloud by 60 Vietnamese learners, together with Japanese native speakers' shadowings of those utterances. An analysis of the subjective ratings given by the native speakers implies that some modifications made since our previous experiment help make the framework of native speakers' reverse shadowing more pedagogically effective. Furthermore, a preliminary analysis of the recorded shadowings shows good correlations with listeners' perceived shadowability.
Citations: 1
Challenges Posed by Voice Interface to Child-Agent Collaborative Storytelling
Ethel Ong, Junlyn Bryan Alburo, Christine Rachel De Jesus, Luisa Katherine Gilig, Dionne Tiffany Ong
DOI: 10.1109/O-COCOSDA46868.2019.9041233
Published: 2019-10-01
Abstract: Child-agent collaborative storytelling can be facilitated through text and voice interfaces. Voice interfaces are more intuitive and more closely resemble the way people usually relate to one another. This may be attributed to the colloquial character of everyday conversation, which does away with the rigid linguistic structures typically present in text interfaces, such as the use of correct grammar and spelling. However, the voice-based interfaces currently available in virtual assistants can lead to communication failure through user frustration and confusion when the agent does not provide the needed support, possibly because it has misinterpreted the user's input. In such situations, text-based interfaces from messaging applications may be used as an alternative communication channel. In this paper, we provide a comparative analysis of the performance of our collaborative storytelling agent in processing user input by analyzing conversation logs from a voice-based interface using Google Assistant and a text-based interface using Google Firebase. To do this, we give a brief overview of the different dialogue strategies employed by our agent and how they are manifested through the two interfaces. We also identify the obstacles that incorrect input processing poses to the collaborative tasks and offer suggestions on how these challenges can be addressed.
Citations: 3
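A small sketch of the kind of log comparison described here: estimating how often the agent falls back to a clarification response in voice sessions versus text sessions. The JSONL log format, the field names, and the "fallback" intent label are hypothetical, since the paper does not specify its log schema.

    import json
    from collections import Counter

    def fallback_rate(log_path):
        """Fraction of agent turns labeled as fallback/clarification in a JSONL log."""
        counts = Counter()
        with open(log_path, encoding="utf-8") as f:
            for line in f:
                turn = json.loads(line)  # assumed: one JSON object per turn
                if turn.get("speaker") == "agent":
                    counts["agent_turns"] += 1
                    if turn.get("intent") == "fallback":  # hypothetical label
                        counts["fallbacks"] += 1
        return counts["fallbacks"] / max(counts["agent_turns"], 1)

    # Hypothetical file names for the two interface conditions.
    print("voice:", fallback_rate("voice_sessions.jsonl"))
    print("text :", fallback_rate("text_sessions.jsonl"))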
Fast and Accurate Capitalization and Punctuation for Automatic Speech Recognition Using Transformer and Chunk Merging
B. Nguyen, V. H. Nguyen, Hien Nguyen, Pham Ngoc Phuong, The-Loc Nguyen, Quoc Truong Do, Luong Chi Mai
DOI: 10.1109/O-COCOSDA46868.2019.9041202
Published: 2019-08-07
Abstract: In recent years, studies on automatic speech recognition (ASR) have shown outstanding results that reach human parity on short speech segments. However, there are still difficulties in standardizing the output of ASR, such as capitalization and punctuation restoration for long-speech transcription. These problems prevent readers from understanding the ASR output semantically and also cause difficulties for natural language processing models such as NER, POS tagging, and semantic parsing. In this paper, we propose a method to restore punctuation and capitalization for long-speech ASR transcripts. The method is based on Transformer models and chunk merging, which allows us to (1) build a single model that performs punctuation and capitalization in one pass, and (2) perform decoding in parallel while improving prediction accuracy. Experiments on the British National Corpus show that the proposed approach outperforms existing methods in both accuracy and decoding speed.
Citations: 38
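To illustrate the chunk-merging idea in isolation, the sketch below splits a long token sequence into overlapping chunks, labels each chunk independently (so the calls could be dispatched in parallel), and keeps only the central, non-overlapping portion of each chunk's predictions when stitching the sequence back together. The chunk size, the overlap width, and the label_chunk stand-in for the Transformer tagger are assumptions for illustration, not the paper's settings.

    from typing import Callable, List

    def merge_chunk_predictions(tokens: List[str],
                                label_chunk: Callable[[List[str]], List[str]],
                                chunk_size: int = 64,
                                overlap: int = 16) -> List[str]:
        """Label a long token sequence by running a fixed-length chunk model
        over overlapping chunks and keeping each chunk's central predictions."""
        stride = chunk_size - 2 * overlap
        labels = [None] * len(tokens)
        start = 0
        while start < len(tokens):
            chunk = tokens[start:start + chunk_size]
            preds = label_chunk(chunk)  # stand-in for the Transformer tagger
            # Keep the centre of the chunk, except at the sequence edges,
            # so boundary tokens always get context on both sides.
            keep_from = 0 if start == 0 else overlap
            keep_to = len(chunk) if start + chunk_size >= len(tokens) else chunk_size - overlap
            for i in range(keep_from, keep_to):
                labels[start + i] = preds[i]
            if start + chunk_size >= len(tokens):
                break
            start += stride
        return labels

    # Toy usage with a dummy tagger that assigns the same label to every token.
    toy_tagger = lambda chunk: ["O"] * len(chunk)
    print(merge_chunk_predictions(["hello", "world"] * 100, toy_tagger)[:5])

Because each chunk is labeled independently of the others, the label_chunk calls can be batched or run in parallel, which is consistent with the parallel decoding speed-up described in the abstract.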