Rank correlation analysis of NTCIR-10 RITE-2 Chinese datasets and evaluation metrics

Chuan-Jie Lin, Cheng-Wei Lee, Cheng-Wei Shih, W. Hsu
Published in: 2013 IEEE 14th International Conference on Information Reuse & Integration (IRI)
Publication date: 2013-10-24
DOI: 10.1109/IRI.2013.6642454 (https://doi.org/10.1109/IRI.2013.6642454)

Abstract

Textual entailment (TE) is the task of recognizing entailment, paraphrase, and contradiction relations within a given text pair. The goal of TE research is to develop a core inference component that can be applied to various domains, such as question answering (QA). We computed several rank correlations over the data and system results of the NTCIR-10 RITE-2 task to look for correlations between datasets and evaluation metrics. We also constructed RITE4QA datasets within the RITE-2 task under a QA scenario to assess how applicable RITE systems are to QA. We find that datasets created from different sources and in different ways can hardly predict each other. However, the system ranking on the dataset of expert-made artificial pairs correlates moderately with the ranking on QA metrics. Both RITE metrics and QA metrics are stable within their own subtasks.
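To make the idea of comparing system rankings across datasets concrete, here is a minimal sketch of one rank correlation coefficient, Kendall's tau. The abstract does not say which coefficients the authors used, so the choice of Kendall's tau (the simple tau-a variant, without tie correction) is an assumption, and the function name `kendall_tau` and the score lists are hypothetical values invented for illustration, not results from the paper.

```python
from itertools import combinations


def kendall_tau(x, y):
    """Kendall's tau-a between two equal-length score lists (no tie correction)."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least two items")
    concordant = discordant = 0
    # Compare every pair of systems: do both datasets order them the same way?
    for i, j in combinations(range(len(x)), 2):
        sign = (x[i] - x[j]) * (y[i] - y[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / n_pairs


# Hypothetical scores of five systems on two datasets (illustrative only,
# NOT taken from the paper). Index k refers to the same system in both lists.
scores_dataset_a = [0.71, 0.65, 0.60, 0.58, 0.52]
scores_dataset_b = [0.40, 0.45, 0.38, 0.30, 0.33]

tau = kendall_tau(scores_dataset_a, scores_dataset_b)
print(f"Kendall's tau between the two system rankings: {tau:.2f}")
```

A value near 1 means the two datasets rank the systems almost identically, a value near 0 means one ranking says little about the other, and a value near -1 means the rankings are reversed. In practice a library routine that handles ties would usually be preferable to this hand-written version.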