Pasokh: A standard corpus for the evaluation of Persian text summarizers

Behdad Behmadi Moghaddas, M. Kahani, Seyyed Ahmad Toosi, Asef Pourmasoumi, Ahmad Estiri
ICCKE 2013, published 2013-12-16
DOI: 10.1109/ICCKE.2013.6682873 (https://doi.org/10.1109/ICCKE.2013.6682873)
Citations: 16

Abstract

The increasingly vast amount of information, particularly on the Web, has created a profound need for automatic summarization systems. These systems, in turn, need to be evaluated in terms of how well they retrieve information. The evaluation is done by comparing machine summaries against a standard reference corpus containing a reasonably large number of source texts and the summaries that humans have written for them. Because no such standard corpus existed for Persian, the summarizers developed so far were evaluated against small corpora constructed by the developers of the proposed systems, which made the systems non-comparable. Pasokh was therefore constructed as a standard reference corpus of sufficient size. Its construction took over 2,000 man-hours of work.