How You Ask Matters: The Effect of Paraphrastic Questions to BERT Performance on a Clinical SQuAD Dataset

Sungrim Moon, Jungwei Fan
{"title":"How You Ask Matters: The Effect of Paraphrastic Questions to BERT Performance on a Clinical SQuAD Dataset","authors":"Sungrim Moon, Jungwei Fan","doi":"10.18653/v1/2020.clinicalnlp-1.13","DOIUrl":null,"url":null,"abstract":"Reading comprehension style question-answering (QA) based on patient-specific documents represents a growing area in clinical NLP with plentiful applications. Bidirectional Encoder Representations from Transformers (BERT) and its derivatives lead the state-of-the-art accuracy on the task, but most evaluation has treated the data as a pre-mixture without systematically looking into the potential effect of imperfect train/test questions. The current study seeks to address this gap by experimenting with full versus partial train/test data consisting of paraphrastic questions. Our key findings include 1) training with all pooled question variants yielded best accuracy, 2) the accuracy varied widely, from 0.74 to 0.80, when trained with each single question variant, and 3) questions of similar lexical/syntactic structure tended to induce identical answers. The results suggest that how you ask questions matters in BERT-based QA, especially at the training stage.","PeriodicalId":216954,"journal":{"name":"Clinical Natural Language Processing Workshop","volume":"17 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2020-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"7","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Clinical Natural Language Processing Workshop","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.18653/v1/2020.clinicalnlp-1.13","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 7

Abstract

Reading comprehension style question-answering (QA) based on patient-specific documents represents a growing area in clinical NLP with plentiful applications. Bidirectional Encoder Representations from Transformers (BERT) and its derivatives achieve state-of-the-art accuracy on the task, but most evaluations have treated the data as a pre-mixture without systematically examining the potential effect of imperfect train/test questions. The current study seeks to address this gap by experimenting with full versus partial train/test data consisting of paraphrastic questions. Our key findings include: 1) training with all pooled question variants yielded the best accuracy, 2) accuracy varied widely, from 0.74 to 0.80, when training with each single question variant, and 3) questions of similar lexical/syntactic structure tended to induce identical answers. The results suggest that how you ask questions matters in BERT-based QA, especially at the training stage.
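To make the experimental design concrete, the sketch below builds the two training conditions the abstract compares: one SQuAD-format file that pools all paraphrastic variants of a question, and one file per single variant. This is a minimal illustration, not the authors' code; the clinical sentence, the paraphrases, and the `make_squad_entry` helper are invented for the example, and only the SQuAD JSON schema itself is standard.

```python
# Minimal sketch of the pooled-versus-single-variant training split described
# above. The clinical sentence, the paraphrase wording, and the helper name
# are hypothetical; only the SQuAD JSON schema (data/paragraphs/qas/answers)
# is standard.
import json

# One context paragraph with an answer span given as a character offset,
# as in SQuAD.
context = ("The patient was started on metformin 500 mg twice daily "
           "for type 2 diabetes.")
answer = {"text": "metformin", "answer_start": context.index("metformin")}

# Hypothetical paraphrastic variants of the same underlying question.
paraphrases = [
    "What medication was the patient started on?",
    "Which drug did the patient begin taking?",
    "What was prescribed for the patient's diabetes?",
]

def make_squad_entry(questions):
    """Build a SQuAD-format dataset from a list of question variants
    that all share the same context and answer span."""
    return {
        "data": [{
            "title": "example_note",
            "paragraphs": [{
                "context": context,
                "qas": [
                    {"id": f"q{i}", "question": q, "answers": [answer]}
                    for i, q in enumerate(questions)
                ],
            }],
        }],
        "version": "1.1",
    }

# Finding 1: a training set pooling all question variants.
pooled_train = make_squad_entry(paraphrases)

# Finding 2: one training set per single question variant, to compare
# how much accuracy swings with the phrasing chosen.
single_variant_trains = [make_squad_entry([q]) for q in paraphrases]

print(json.dumps(pooled_train, indent=2))
```

Under this reading, a BERT QA model fine-tuned on `pooled_train` would correspond to the best-performing condition (finding 1), while fine-tuning separately on each entry of `single_variant_trains` would reproduce the setup whose accuracy ranged from 0.74 to 0.80 (finding 2).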