A Ranking based Language Model for Automatic Extractive Text Summarization

2022 First International Conference on Artificial Intelligence Trends and Pattern Recognition (ICAITPR). Pub Date: 2022-03-10. DOI: 10.1109/ICAITPR51569.2022.9844187

Pooja Gupta, S. Nigam, R. Singh


Citations: 2

Abstract

The growing availability of the Internet and social media has created another ‘world of data’ comprising text, audio, and video files. It is very difficult for a user to obtain an accurate summary or to identify the relevant and important items in the available media. Additionally, readers or evaluators of these data files are interested only in retrieving the relevant content or a summary from the source files in less time. Automatic text summarization (ATS) is the only way to summarize single or multiple documents to obtain relevant content from the source files. Available ATS systems generate poor summaries and consume considerable time and space on long documents because of inaccurate encoding. Therefore, in this work, we introduce an approach to extractive text summarization based on sentence ranking. Experiments were performed on BBC and CNN news datasets and evaluated in terms of ROUGE using an N-gram language model. The quantitative values of the metrics show the effectiveness of the proposed approach on news datasets.
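The abstract names two ingredients, sentence ranking and ROUGE evaluation over n-grams, without giving the scoring rule. The following is a minimal sketch of how such a pipeline might look, not the authors' method: the word-frequency score, the naive sentence splitter, and every function name (`split_sentences`, `rank_sentences`, `summarize`, `rouge_n_recall`) are assumptions made for illustration.

```python
import re
from collections import Counter


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p.strip() for p in parts if p.strip()]


def tokenize(sentence):
    """Lowercase alphanumeric word tokens."""
    return re.findall(r"[a-z0-9']+", sentence.lower())


def rank_sentences(text):
    """Rank sentences by the mean document frequency of their words.

    Assumption: this scoring rule is illustrative only; the paper's
    ranking function is not given in the abstract.
    Returns (score, index, sentence) tuples, best first.
    """
    sentences = split_sentences(text)
    freq = Counter(tok for s in sentences for tok in tokenize(s))
    ranked = []
    for i, s in enumerate(sentences):
        toks = tokenize(s)
        score = sum(freq[t] for t in toks) / len(toks) if toks else 0.0
        ranked.append((score, i, s))
    # Higher score first; ties broken by original position.
    return sorted(ranked, key=lambda r: (-r[0], r[1]))


def summarize(text, k=2):
    """Extract the k top-ranked sentences, restored to document order."""
    top = sorted(rank_sentences(text)[:k], key=lambda r: r[1])
    return " ".join(s for _, _, s in top)


def ngrams(tokens, n):
    """Multiset of contiguous n-grams."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_recall(candidate, reference, n=1):
    """ROUGE-N recall: clipped n-gram overlap divided by reference n-grams."""
    cand = ngrams(tokenize(candidate), n)
    ref = ngrams(tokenize(reference), n)
    total = sum(ref.values())
    if total == 0:
        return 0.0
    overlap = sum(min(count, cand[gram]) for gram, count in ref.items())
    return overlap / total
```

Under these assumptions, `summarize(article, k=3)` would return a three-sentence extract, and `rouge_n_recall(extract, reference_summary, n=2)` would score it against a human-written reference. A real evaluation would normally use an established ROUGE implementation with stemming and report precision and F-measure alongside recall.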

