Multilingual COVID-QA: Learning towards Global Information Sharing via Web Question Answering in Multiple Languages

Proceedings of the Web Conference 2021. Pub Date: 2021-04-19. DOI: 10.1145/3442381.3449991

Rui Yan, Weiheng Liao, Jianwei Cui, Hailei Zhang, Yichuan Hu, Dongyan Zhao


Citations: 10

Abstract

Multilingual COVID-QA: Learning towards Global Information Sharing via Web Question Answering in Multiple Languages

Since late December 2019, an outbreak of atypical pneumonia has been reported, now known as COVID-19 and caused by a novel coronavirus. Cases have spread to more than 200 countries and regions. The World Health Organization (WHO) has officially declared the coronavirus outbreak a pandemic, and the public health emergency has affected daily life worldwide: people are advised to keep social distance, in-person events have moved online, and some facilities have been locked down. In response, the Web has become an active venue for people to share information. People continuously post questions about the ongoing pandemic online and seek answers. Yet sharing global information conveyed in different languages is challenging, because the language barrier is intrinsically unfriendly to monolingual speakers. In this paper, we propose a multilingual COVID-QA model that answers people's questions in their own languages while absorbing knowledge from other languages. Another challenge is that, in most cases, the information to be shared has no parallel data in multiple languages. To this end, we propose a novel framework that incorporates (unsupervised) translation alignment to obtain pseudo-parallel data for learning. We then train multilingual question-answer mapping and generation. We demonstrate the effectiveness of the proposed approach against a series of competitive baselines. In this way, we make it easier to share global information across language barriers, and we hope to contribute to the battle against COVID-19.
