2019 22nd Conference of the Oriental COCOSDA International Committee for the Co-ordination and Standardisation of Speech Databases and Assessment Techniques (O-COCOSDA): Latest Publications

The Architecture of Speech-to-Speech Translator for Mobile Conversation
Agung Santosa, Andi Djalal Latief, Hammam Riza, Asril Jarin, Lyla Ruslana Aini, Gunarso, Gita Citra Puspita, M. T. Uliniansyah, Elvira Nurfadhilah, Harnum A. Prafitia, Made Gunawan
DOI: 10.1109/O-COCOSDA46868.2019.9041196
Published: 2019-10-01
Abstract: Building on the natural language processing competencies and engineering results that BPPT has accumulated since 1987, BPPT has developed an English-Bahasa Indonesia speech-to-speech translation (S2ST) system. In this paper, we propose an architecture for a speech-to-speech translation system for Android-based mobile conversation that uses a separate mobile device for each language. The architecture applies three leading technologies, namely WebSocket, REST, and JSON. The system uses a two-way communication protocol between the two users and a simple voice activation detector that can detect the boundaries of a user's utterance.
Citations: 2
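To make the client-server exchange concrete, below is a minimal sketch of how one of the two mobile clients might submit a recognized utterance for translation over WebSocket with a JSON payload. The endpoint URI, the message field names, and the use of the Python websockets package are illustrative assumptions; the paper does not publish these details.

    import asyncio
    import json
    import websockets  # pip install websockets

    # Hypothetical endpoint; the paper does not give its server address.
    SERVER_URI = "ws://example-s2st-server:8080/translate"

    async def send_utterance(text, source_lang="en", target_lang="id"):
        """Send one recognized utterance and wait for its translation."""
        async with websockets.connect(SERVER_URI) as ws:
            # Illustrative JSON message schema (field names are assumptions).
            await ws.send(json.dumps({
                "type": "utterance",
                "source_lang": source_lang,
                "target_lang": target_lang,
                "text": text,
            }))
            reply = json.loads(await ws.recv())
            return reply.get("translation")

    if __name__ == "__main__":
        print(asyncio.run(send_utterance("Where is the train station?")))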
The Study of Prosody-Pragmatics Interface with Focus Functioning as Pragmatic Markers: The Case of Question and Statement
Siyi Cao, Yizhong Xu, Xiaoli Ji
DOI: 10.1109/O-COCOSDA46868.2019.9041157
Published: 2019-10-01
Abstract: Building on [22]'s perspective that pragmatic markers (PMs) are realized mainly through prosody by both native and non-native speakers, this paper investigates whether, when focus functions as a pragmatic marker, pragmatic factors on the part of non-native speakers restrict the prosodic realization of PMs and lead to misunderstanding in intercultural communication, taking declarative questions and statements as the test case. Pitch contours of sentences produced by 17 Chinese EFL (English as a foreign language) learners (non-native speakers) were compared with those of six native speakers, using four sentences from AESOP. The results demonstrate that both native and non-native speakers do realize pragmatic markers (focused words) through prosodic cues (pitch range), but differ in how they realize them, leading to pragmatic misunderstanding. The paper supports [22]'s view and shows that pragmatic factors such as transfer, L2 teaching, and the proficiency of non-native speakers constrain the prosodic means of realizing pragmatic markers, which indicates conventionality in cross-cultural conversation.
Citations: 0
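As a concrete example of the prosodic cue compared in this study, the sketch below extracts an F0 contour with librosa's pYIN tracker and computes the pitch range over a focused word, expressed in semitones. The file name, the word's time span, and the semitone unit are illustrative assumptions rather than the authors' exact procedure.

    import numpy as np
    import librosa  # pip install librosa

    def pitch_range_semitones(wav_path, start_s, end_s, sr=16000):
        """Pitch range (max/min F0, in semitones) over a word's time span."""
        y, sr = librosa.load(wav_path, sr=sr)
        f0, voiced_flag, _ = librosa.pyin(
            y, fmin=librosa.note_to_hz("C2"), fmax=librosa.note_to_hz("C6"), sr=sr)
        times = librosa.times_like(f0, sr=sr)
        # Keep only voiced frames inside the word interval.
        mask = voiced_flag & (times >= start_s) & (times <= end_s)
        voiced_f0 = f0[mask]
        if voiced_f0.size == 0:
            return 0.0
        # 12 semitones per octave; an octave is a log2 ratio of Hz values.
        return 12.0 * np.log2(voiced_f0.max() / voiced_f0.min())

    # Hypothetical usage: the focused word spans 0.80-1.25 s in the recording.
    print(pitch_range_semitones("learner_sentence.wav", 0.80, 1.25))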
Comparison between read and spontaneous speech assessment of L2 Korean
S. Yang, Minhwa Chung
DOI: 10.1109/O-COCOSDA46868.2019.9060846
Published: 2019-10-01
Abstract: This paper describes two experiments that explore the relationship between linguistic factors and perceived proficiency in read and spontaneous speech. In Experiment 1, 5,000 read-speech utterances by 50 non-native speakers of Korean, and in Experiment 2, 6,000 spontaneous-speech utterances, were scored for proficiency by native human raters and analyzed with respect to factors known to be related to perceived proficiency. The results show that the factors investigated in this study can be used to predict proficiency ratings, and that the predictive power of fluency and of pitch and accent accuracy is strong for both read and spontaneous speech. We also observe that while proficiency ratings of read speech are mainly related to segmental accuracy, those of spontaneous speech appear to be more related to pitch and accent accuracy. Moreover, proficiency in read speech does not always equate to proficiency in spontaneous speech, and vice versa, with a per-speaker Pearson correlation of 0.535.
Citations: 0
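The per-speaker correlation reported above is an ordinary Pearson correlation between each speaker's read-speech and spontaneous-speech proficiency scores; a minimal sketch follows, with made-up placeholder scores standing in for the rated data.

    import numpy as np
    from scipy.stats import pearsonr

    # Placeholder data: one mean proficiency score per speaker (scale assumed).
    read_scores = np.array([3.2, 4.1, 2.8, 3.9, 4.5, 2.5, 3.7])
    spont_scores = np.array([3.0, 3.6, 3.1, 3.3, 4.2, 2.9, 3.2])

    r, p_value = pearsonr(read_scores, spont_scores)
    print(f"per-speaker Pearson r = {r:.3f} (p = {p_value:.3f})")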
A Large Collection of Sentences Read Aloud by Vietnamese Learners of Japanese and Native Speaker's Reverse Shadowings
Shintaro Ando, Z. Lin, Tasavat Trisitichoke, Y. Inoue, Fuki Yoshizawa, D. Saito, N. Minematsu
DOI: 10.1109/O-COCOSDA46868.2019.9041215
Published: 2019-10-01
Abstract: The main objective of language learning is to acquire good communication skills in the target language. From this viewpoint, the primary goal of pronunciation training is to be able to speak with an intelligible-enough or comprehensible-enough pronunciation, not a native-sounding one. However, achieving such pronunciation is still not easy for many learners, mainly because they lack opportunities to use the language they are learning and to receive feedback on intelligibility or comprehensibility from native listeners. To address this problem, the authors previously proposed a novel method of native speakers' reverse shadowing and showed that the degree of inarticulation observed in native speakers' shadowings of learners' utterances can be used to estimate the comprehensibility of learners' speech. One major limitation of our previous research, however, was its relatively small scale: the number of learners was only six. In this study, we therefore carried out a larger collection of Japanese utterances read aloud by 60 Vietnamese learners, together with Japanese native speakers' shadowings of those utterances. An analysis of the subjective ratings given by the native speakers implies that some modifications made since our previous experiment help make the framework of native speakers' reverse shadowing more pedagogically effective. Furthermore, a preliminary analysis of the recorded shadowings shows good correlations with listeners' perceived shadowability.
Citations: 1
Challenges Posed by Voice Interface to Child-Agent Collaborative Storytelling
Ethel Ong, Junlyn Bryan Alburo, Christine Rachel De Jesus, Luisa Katherine Gilig, Dionne Tiffany Ong
DOI: 10.1109/O-COCOSDA46868.2019.9041233
Published: 2019-10-01
Abstract: Child-agent collaborative storytelling can be facilitated through text and voice interfaces. Voice interfaces are more intuitive and more closely resemble the way people usually relate to one another. This may be attributed to the colloquial character of everyday conversation, which does away with the rigid linguistic structures typically present in text interfaces, such as the use of correct grammar and spelling. However, the voice-based interfaces currently available in virtual assistants can lead to communication failure through user frustration and confusion when the agent does not provide the needed support, possibly because it has misinterpreted the user's input. In such situations, text-based interfaces from messaging applications may be used as an alternative communication channel. In this paper, we provide a comparative analysis of the performance of our collaborative storytelling agent in processing user input by analyzing conversation logs from a voice-based interface using Google Assistant and a text-based interface using Google Firebase. To do this, we give a brief overview of the different dialogue strategies employed by our agent and how they are manifested through the two interfaces. We also identify the obstacles that incorrect input processing poses to the collaborative tasks and offer suggestions on how these challenges can be addressed.
Citations: 3
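A small sketch of the kind of log comparison described here: estimating how often the agent falls back to a clarification response in voice sessions versus text sessions. The JSONL log format, the field names, and the "fallback" intent label are hypothetical, since the paper does not specify its log schema.

    import json
    from collections import Counter

    def fallback_rate(log_path):
        """Fraction of agent turns labeled as fallback/clarification in a JSONL log."""
        counts = Counter()
        with open(log_path, encoding="utf-8") as f:
            for line in f:
                turn = json.loads(line)  # assumed: one JSON object per turn
                if turn.get("speaker") == "agent":
                    counts["agent_turns"] += 1
                    if turn.get("intent") == "fallback":  # hypothetical label
                        counts["fallbacks"] += 1
        return counts["fallbacks"] / max(counts["agent_turns"], 1)

    # Hypothetical file names for the two interface conditions.
    print("voice:", fallback_rate("voice_sessions.jsonl"))
    print("text :", fallback_rate("text_sessions.jsonl"))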
Fast and Accurate Capitalization and Punctuation for Automatic Speech Recognition Using Transformer and Chunk Merging
B. Nguyen, V. H. Nguyen, Hien Nguyen, Pham Ngoc Phuong, The-Loc Nguyen, Quoc Truong Do, Luong Chi Mai
DOI: 10.1109/O-COCOSDA46868.2019.9041202
Published: 2019-08-07
Abstract: In recent years, studies on automatic speech recognition (ASR) have shown outstanding results that reach human parity on short speech segments. However, there are still difficulties in standardizing the output of ASR, such as capitalization and punctuation restoration for long-speech transcription. These problems prevent readers from understanding the ASR output semantically and also cause difficulties for natural language processing models such as NER, POS tagging, and semantic parsing. In this paper, we propose a method to restore punctuation and capitalization for long-speech ASR transcripts. The method is based on Transformer models and chunk merging, which allows us to (1) build a single model that performs punctuation and capitalization in one pass, and (2) perform decoding in parallel while improving prediction accuracy. Experiments on the British National Corpus show that the proposed approach outperforms existing methods in both accuracy and decoding speed.
Citations: 38
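To illustrate the chunk-merging idea in isolation, the sketch below splits a long token sequence into overlapping chunks, labels each chunk independently (so the calls could be dispatched in parallel), and keeps only the central, non-overlapping portion of each chunk's predictions when stitching the sequence back together. The chunk size, the overlap width, and the label_chunk stand-in for the Transformer tagger are assumptions for illustration, not the paper's settings.

    from typing import Callable, List

    def merge_chunk_predictions(tokens: List[str],
                                label_chunk: Callable[[List[str]], List[str]],
                                chunk_size: int = 64,
                                overlap: int = 16) -> List[str]:
        """Label a long token sequence by running a fixed-length chunk model
        over overlapping chunks and keeping each chunk's central predictions."""
        stride = chunk_size - 2 * overlap
        labels = [None] * len(tokens)
        start = 0
        while start < len(tokens):
            chunk = tokens[start:start + chunk_size]
            preds = label_chunk(chunk)  # stand-in for the Transformer tagger
            # Keep the centre of the chunk, except at the sequence edges,
            # so boundary tokens always get context on both sides.
            keep_from = 0 if start == 0 else overlap
            keep_to = len(chunk) if start + chunk_size >= len(tokens) else chunk_size - overlap
            for i in range(keep_from, keep_to):
                labels[start + i] = preds[i]
            if start + chunk_size >= len(tokens):
                break
            start += stride
        return labels

    # Toy usage with a dummy tagger that assigns the same label to every token.
    toy_tagger = lambda chunk: ["O"] * len(chunk)
    print(merge_chunk_predictions(["hello", "world"] * 100, toy_tagger)[:5])

Because each chunk is labeled independently of the others, the label_chunk calls can be batched or run in parallel, which is consistent with the parallel decoding speed-up described in the abstract.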