Bilingual Question Answering System Using Bidirectional Encoder Representations from Transformers and Best Matching Method

D. A. Navastara, Ihdiannaja, A. Arifin
2021 13th International Conference on Information & Communication Technology and System (ICTS), pp. 360-364. Published 2021-10-20. DOI: 10.1109/ICTS52701.2021.9608905
Citations: 2

Abstract

A question answering (QA) system answers queries based on an unstructured collection of natural-language documents. Implementing a QA system makes question answering more efficient because the system can answer similar questions automatically. However, similarity search based on questions or answers alone sometimes fails to retrieve documents relevant to the query, because the wording of the query differs from the wording in the QA database even though the context is the same. A shared context can be seen in the list of references used by a QA entry. It is therefore necessary to measure query similarity in a way that takes into account not only the question and answer but also the reference. In this paper, we propose a bilingual QA system that answers Indonesian questions by combining the query's similarities to the question, the answer, and an external reference in Arabic, using Bidirectional Encoder Representations from Transformers (BERT) and the Best Matching (BM25) method. The similarity between query and reference helps identify QA entries that use references with a similar context. In the experiments, the combination of query-Question followed by query-Answer achieves the highest evaluation scores, with a Mean Average Precision (MAP) of 0.988 and a Mean Reciprocal Rank (MRR) of 1.000.
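The abstract names BM25 scoring, a combination of per-field similarities, and the MAP and MRR metrics without giving formulas. The following is a minimal Python sketch of those pieces under stated assumptions, not the authors' implementation: the BM25 variant uses the non-negative IDF form with common default parameters `k1=1.5` and `b=0.75`, the `combine` function is a hypothetical weighted sum (the abstract does not specify the fusion rule), the BERT similarity component is omitted, and the toy tokens are illustrative only.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each tokenized document against a tokenized query (Okapi BM25).

    Assumes `docs` is a non-empty list of non-empty token lists.
    """
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query:
            if term not in tf:
                continue
            # Non-negative IDF variant (assumption; other variants exist).
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def combine(score_lists, weights):
    """Hypothetical fusion rule: weighted sum of per-field scores per entry."""
    return [sum(w * s for w, s in zip(weights, per_doc))
            for per_doc in zip(*score_lists)]


def reciprocal_rank(ranked_ids, relevant):
    """1 / rank of the first relevant item, or 0.0 if none is retrieved."""
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant:
            return 1.0 / rank
    return 0.0


def average_precision(ranked_ids, relevant):
    """Mean of precision@k taken at each rank k that holds a relevant item."""
    hits, total = 0, 0.0
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0


if __name__ == "__main__":
    # Toy QA entries: one question field and one answer field per entry.
    questions = [["how", "to", "pray"], ["what", "is", "fasting"], ["rules", "of", "zakat"]]
    answers = [["pray", "five", "times"], ["fasting", "in", "ramadan"], ["zakat", "is", "alms"]]
    query = ["fasting", "ramadan"]

    fused = combine([bm25_scores(query, questions), bm25_scores(query, answers)], [0.5, 0.5])
    ranking = sorted(range(len(fused)), key=lambda i: fused[i], reverse=True)
    print("ranking:", ranking)
    print("RR:", reciprocal_rank(ranking, {1}), "AP:", average_precision(ranking, {1}))
```

MRR and MAP are the means of `reciprocal_rank` and `average_precision` over all test queries. An MRR of 1.000, as reported, means the first returned entry was relevant for every query.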