VLSP 2021 - VieCap4H Challenge: Automatic Image Caption Generation for Healthcare Domain in Vietnamese

VNU Journal of Science: Computer Science and Communication Engineering. Pub Date: 2022-12-16. DOI: 10.25073/2588-1086/vnucsce.364

P. Phan


Citations: 0

Abstract

Machine reading comprehension (MRC) is a challenging Natural Language Processing (NLP) research field with wide real-world applications. The great progress of this field in recent years is mainly due to deep learning and to the emergence of a few large datasets for machine reading comprehension tasks. For the Vietnamese language, there are some datasets, such as UIT-ViQuAD [1] and UIT-ViNewsQA [2], and, most recently, UIT-ViQuAD 2.0 [3], the dataset of the competitive VLSP 2021-MRC Shared Task. MRC systems must not only answer questions when necessary but also tactfully abstain from answering when no answer is available according to the given passage. In this paper, we propose two types of joint models for answerability prediction and pure-MRC prediction, with and without a dependency mechanism that learns the correlation between the start position and the end position in pure-MRC output prediction. In addition, we use ensemble models and a verification strategy that votes for the best answer from the top K answers of different models. Our proposed approach is evaluated on the benchmark VLSP 2021-MRC Shared Task challenge dataset UIT-ViQuAD 2.0 [3], and the results show that it is significantly better than the baseline.
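The verification step, voting for the best answer among the top K answers of several models, can be illustrated with a minimal Python sketch. The abstract does not specify the voting rule, so the score-summing rule, the function name `vote_best_answer`, and the use of an empty string to represent "no answer" are all assumptions made for illustration, not the authors' implementation.

```python
from collections import defaultdict


def vote_best_answer(model_top_k, no_answer=""):
    """Pick the answer with the highest summed score across models.

    model_top_k: one list per model of (answer_text, score) pairs,
    i.e. that model's top K candidates. An empty string stands for
    "no answer" (abstain). Summing scores is an ASSUMED voting rule;
    the paper may use a different one.
    """
    totals = defaultdict(float)
    for candidates in model_top_k:
        for text, score in candidates:
            totals[text.strip()] += score
    if not totals:
        return no_answer
    # Highest total wins; ties are broken alphabetically for determinism.
    return min(totals, key=lambda answer: (-totals[answer], answer))


# Hypothetical top-2 outputs from three models for one question.
predictions = [
    [("Hà Nội", 0.6), ("", 0.3)],
    [("Hà Nội", 0.5), ("Huế", 0.4)],
    [("", 0.7), ("Hà Nội", 0.2)],
]
best = vote_best_answer(predictions)
```

Because the empty string competes as an ordinary candidate, the ensemble abstains whenever "no answer" collects the largest total score, which mirrors the requirement that the system decline to answer unanswerable questions.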

