Pasokh: A standard corpus for the evaluation of Persian text summarizers

ICCKE 2013. Pub Date: 2013-12-16. DOI: 10.1109/ICCKE.2013.6682873

Behdad Behmadi Moghaddas, M. Kahani, Seyyed Ahmad Toosi, Asef Pourmasoumi, Ahmad Estiri

Citations: 16

Abstract

Pasokh: A standard corpus for the evaluation of Persian text summarizers

The increasingly vast amount of information, particularly on the Web, has created a pressing need for automatic summarization systems. These systems, in turn, need to be evaluated in terms of how well they can retrieve information. Evaluation is done by comparing machine summaries against a standard reference corpus containing a reasonably large number of source texts and the summaries that humans have written for them. Because no such standard corpus existed for Persian, earlier summarizers were evaluated against small corpora constructed by the developers of each proposed system, which made the systems non-comparable. Pasokh was therefore constructed as a standard, sufficiently large reference corpus. Building it took over 2000 man-hours of work.
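
To make the idea of comparing a machine summary against human reference summaries concrete, here is a minimal sketch of one common style of comparison: unigram-overlap recall averaged over several references. The abstract does not say which metric is used with Pasokh, so the metric, the function names `unigram_recall` and `mean_recall`, and the whitespace tokenization are all assumptions made for illustration. Real Persian text would need proper normalization and tokenization.

```python
from collections import Counter


def unigram_recall(candidate: str, reference: str) -> float:
    """Share of reference tokens (counted with multiplicity) found in the candidate.

    Assumption: tokens are split on whitespace, which is a simplification.
    """
    cand_counts = Counter(candidate.split())
    ref_counts = Counter(reference.split())
    total = sum(ref_counts.values())
    if total == 0:
        return 0.0
    # Clip each token's credit at the number of times the candidate contains it.
    overlap = sum(min(n, cand_counts[tok]) for tok, n in ref_counts.items())
    return overlap / total


def mean_recall(candidate: str, references: list[str]) -> float:
    """Average the per-reference recall over all human reference summaries."""
    if not references:
        return 0.0
    return sum(unigram_recall(candidate, r) for r in references) / len(references)


if __name__ == "__main__":
    # Hypothetical toy data, not taken from the corpus.
    machine = "the corpus has human summaries"
    humans = ["the corpus contains human summaries", "human summaries of news"]
    print(round(mean_recall(machine, humans), 2))
```

Because every system would be scored against the same shared set of human summaries, scores from different summarizers become directly comparable, which is the gap the abstract says a standard corpus is meant to close.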
