{"title":"TREC新颖性轨迹判断中的句子长度偏差","authors":"L. L. Bando, Falk Scholer, A. Turpin","doi":"10.1145/2407085.2407093","DOIUrl":null,"url":null,"abstract":"The Cranfield methodology for comparing document ranking systems has also been applied recently to comparing sentence ranking methods, which are used as pre-processors for summary generation methods. In particular, the TREC Novelty track data has been used to assess whether one sentence ranking system is better than another. This paper demonstrates that there is a strong bias in the Novelty track data for relevant sentences to also be longer sentences. Thus, systems that simply choose the longest sentences will often appear to perform better in terms of identifying \"relevant\" sentences than systems that use other methods. We demonstrate, by example, how this can lead to misleading conclusions about the comparative effectiveness of sentence ranking systems. We then demonstrate that if the Novelty track data is split into subcollections based on sentence length, comparing systems on each of the subcollections leads to conclusions that avoid the bias.","PeriodicalId":402985,"journal":{"name":"Australasian Document Computing Symposium","volume":"3 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2012-12-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"4","resultStr":"{\"title\":\"Sentence length bias in TREC novelty track judgements\",\"authors\":\"L. L. Bando, Falk Scholer, A. Turpin\",\"doi\":\"10.1145/2407085.2407093\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"The Cranfield methodology for comparing document ranking systems has also been applied recently to comparing sentence ranking methods, which are used as pre-processors for summary generation methods. In particular, the TREC Novelty track data has been used to assess whether one sentence ranking system is better than another. This paper demonstrates that there is a strong bias in the Novelty track data for relevant sentences to also be longer sentences. Thus, systems that simply choose the longest sentences will often appear to perform better in terms of identifying \\\"relevant\\\" sentences than systems that use other methods. We demonstrate, by example, how this can lead to misleading conclusions about the comparative effectiveness of sentence ranking systems. 
We then demonstrate that if the Novelty track data is split into subcollections based on sentence length, comparing systems on each of the subcollections leads to conclusions that avoid the bias.\",\"PeriodicalId\":402985,\"journal\":{\"name\":\"Australasian Document Computing Symposium\",\"volume\":\"3 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2012-12-05\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"4\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Australasian Document Computing Symposium\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/2407085.2407093\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Australasian Document Computing Symposium","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/2407085.2407093","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Sentence length bias in TREC novelty track judgements
The Cranfield methodology for comparing document ranking systems has also been applied recently to comparing sentence ranking methods, which are used as pre-processors for summary generation methods. In particular, the TREC Novelty track data has been used to assess whether one sentence ranking system is better than another. This paper demonstrates that there is a strong bias in the Novelty track data for relevant sentences to also be longer sentences. Thus, systems that simply choose the longest sentences will often appear to perform better in terms of identifying "relevant" sentences than systems that use other methods. We demonstrate, by example, how this can lead to misleading conclusions about the comparative effectiveness of sentence ranking systems. We then demonstrate that if the Novelty track data is split into subcollections based on sentence length, comparing systems on each of the subcollections leads to conclusions that avoid the bias.
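The record contains only the abstract, but the two ideas it describes lend themselves to a short illustration: a trivial ranker that simply prefers longer sentences (the kind of system the bias rewards), and a split of judged sentences into length-based subcollections before systems are compared. The sketch below is a hedged illustration only; the data format, function names, and equal-count binning scheme are assumptions made here and are not taken from the paper or the TREC Novelty track files.

```python
from typing import Dict, List, Tuple

# A judged sentence as (sentence_text, is_relevant). This is a hypothetical
# data structure for illustration, not the actual TREC Novelty track format.
Judged = Tuple[str, bool]


def length_baseline_rank(sentences: List[Judged]) -> List[Judged]:
    """Rank sentences purely by word count, longest first.

    This is the trivial baseline the abstract warns about: because relevant
    sentences in the Novelty track data tend to be longer, a ranker like
    this can look deceptively effective.
    """
    return sorted(sentences, key=lambda s: len(s[0].split()), reverse=True)


def split_by_length(sentences: List[Judged], n_bins: int = 3) -> Dict[int, List[Judged]]:
    """Partition judged sentences into subcollections of similar length.

    Comparing systems within each subcollection removes the advantage a
    system gains merely by preferring longer sentences. Equal-count bins
    are used here for simplicity; the paper's exact splitting procedure
    may differ.
    """
    by_length = sorted(sentences, key=lambda s: len(s[0].split()))
    bin_size = max(1, len(by_length) // n_bins)
    bins: Dict[int, List[Judged]] = {i: [] for i in range(n_bins)}
    for idx, sent in enumerate(by_length):
        bins[min(idx // bin_size, n_bins - 1)].append(sent)
    return bins


def precision_at_k(ranked: List[Judged], k: int = 10) -> float:
    """Fraction of the top-k ranked sentences that are judged relevant."""
    top = ranked[:k]
    return sum(1 for _, rel in top if rel) / max(1, len(top))


if __name__ == "__main__":
    toy = [
        ("Short filler sentence.", False),
        ("A considerably longer sentence that also happens to carry the relevant content.", True),
        ("Another brief remark.", False),
    ]
    ranked = length_baseline_rank(toy)
    print(precision_at_k(ranked, k=1))   # 1.0: length alone "finds" the relevant sentence
    print({b: len(s) for b, s in split_by_length(toy).items()})
```

In this toy setup the length-only ranker scores perfectly, which is exactly the misleading effect the paper describes; evaluating within each length bin instead means a system can no longer gain simply by favouring long sentences.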