CONDENZA: A System for Extracting Abstract from a Given Source Document

Mgbeafulike Ij, C. Ejiofor
Journal of Information Technology & Software Engineering, vol. 8, no. 1, pp. 1-3. Published 2017-01-01. DOI: 10.4172/2165-7866.1000222 (https://doi.org/10.4172/2165-7866.1000222)
Citations: 1

Abstract

Despite the increasing availability of documents in electronic form and of desktop publishing software, abstracts continue to be produced manually. The purpose of CONDENZA is to provide a system that extracts an abstract from a given source document using automatic methods. The rationale of an abstract is to facilitate quick and accurate identification of the topic of a published paper, saving a prospective reader time and effort in finding useful information in a given article or report. The system generates a shorter version of a given sentence while attempting to preserve its meaning. This task is carried out using summarization techniques. CONDENZA implements a method that combines the Apriori algorithm for keyword frequency detection with a clustering-based approach for grouping similar sentences together. The results show that this approach summarizes text documents efficiently by avoiding redundancy among the words in the document, and that it ensures the highest relevance to the input text. The guiding factor of our results is the ratio of input to output sentences after summarization.
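The abstract names two components, Apriori-based keyword frequency detection and clustering of similar sentences, without giving implementation details. The following is a minimal Python sketch of how such a pipeline could be assembled. It is not the authors' implementation. The function names, the stopword list, the `min_support` and `threshold` parameters, the use of Jaccard similarity, and the greedy single-pass clustering are all assumptions made for illustration.

```python
import re
from itertools import combinations

# Hypothetical, deliberately tiny stopword list (an assumption, not from the paper).
STOPWORDS = {"the", "a", "an", "of", "to", "in", "is", "and",
             "for", "on", "with", "by", "are", "it", "this", "that"}

def split_sentences(text):
    """Naive sentence splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def tokenize(sentence):
    """Return the set of lowercase non-stopword tokens in a sentence."""
    return {w for w in re.findall(r"[a-z]+", sentence.lower()) if w not in STOPWORDS}

def frequent_itemsets(transactions, min_support=2, max_size=2):
    """Apriori-style level-wise search; each sentence is one transaction."""
    counts = {}
    for t in transactions:
        for w in t:
            key = frozenset([w])
            counts[key] = counts.get(key, 0) + 1
    current = {k: c for k, c in counts.items() if c >= min_support}
    frequent, size = {}, 1
    while current:
        frequent.update(current)
        if size >= max_size:
            break
        size += 1
        items = sorted(set().union(*current))
        # Apriori pruning: keep a candidate only if all its subsets are frequent.
        candidates = [frozenset(c) for c in combinations(items, size)
                      if all(frozenset(sub) in current
                             for sub in combinations(c, size - 1))]
        current = {}
        for cand in candidates:
            support = sum(1 for t in transactions if cand <= t)
            if support >= min_support:
                current[cand] = support
    return frequent

def jaccard(a, b):
    """Jaccard similarity of two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def cluster_sentences(keyword_sets, threshold=0.5):
    """Greedy single-pass clustering: join the first cluster whose seed is similar enough."""
    clusters = []
    for i, ks in enumerate(keyword_sets):
        for cluster in clusters:
            if jaccard(ks, keyword_sets[cluster[0]]) >= threshold:
                cluster.append(i)
                break
        else:
            clusters.append([i])
    return clusters

def summarize(text, min_support=2, threshold=0.5):
    """Keep one representative sentence per cluster of keyword-similar sentences."""
    sentences = split_sentences(text)
    transactions = [tokenize(s) for s in sentences]
    frequent = frequent_itemsets(transactions, min_support)
    keywords = set().union(*frequent) if frequent else set()
    keyword_sets = [t & keywords for t in transactions]
    clusters = cluster_sentences(keyword_sets, threshold)
    # Skip clusters with no frequent keywords; pick the keyword-richest sentence otherwise.
    chosen = sorted(max(c, key=lambda i: len(keyword_sets[i]))
                    for c in clusters if keyword_sets[c[0]])
    return [sentences[i] for i in chosen]

if __name__ == "__main__":
    demo = ("Apriori finds frequent keywords in text. "
            "Frequent keywords in text are found by Apriori. "
            "Clustering groups similar sentences together. "
            "Similar sentences are grouped by clustering. "
            "The weather was pleasant yesterday.")
    result = summarize(demo)
    print(result)
    # Ratio of input to output sentences, the measure mentioned in the abstract.
    print(len(split_sentences(demo)) / len(result))
```

In this sketch, sentences that share the same frequent keywords collapse into one cluster, which is one way to avoid redundancy, and sentences with no frequent keywords are dropped as less relevant. A production system would likely need a proper sentence tokenizer, a fuller stopword list, and tuned support and similarity thresholds.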