An integer linear programming model for multi document summarization of learning materials using phrase embedding technique

Impact factor: 1.6, JCR Q2 (Engineering, Multidisciplinary)
K. Sakkaravarthy Iyyappan, S. R. Balasundaram
Journal: International Journal of System Assurance Engineering and Management
Publication date: 2024-04-06
Publication type: Journal Article
DOI: 10.1007/s13198-024-02299-7
URL: https://doi.org/10.1007/s13198-024-02299-7
Open access: no
Citation count: 0

Abstract

Automatic text summarization (ATS) plays a vital role in condensing original text documents while preserving the most crucial information. Its benefits extend to various domains, including e-Learning systems, where educational content can be summarized to facilitate easier access and comprehension. Multi-document summarization (MDS) techniques enable the creation of concise summaries from groups of related text documents. Leveraging MDS for summarizing learning materials opens new avenues, offering students and teachers reference summaries for enhanced learning experiences. This paper introduces a concept-based Integer Linear Programming model for summarizing learning materials, leveraging a phrase embedding technique. Phrases are treated as fundamental and significant semantic building blocks of sentences, facilitating the comprehension and summarization of documents. Embedding techniques are employed to semantically identify related phrases, eliminate redundancy, and enhance coherence through vector representations. Summaries are generated using the ILP technique, selecting key sentences and reducing redundancy with phrase vectors. The paper proposes sentence reordering techniques based on phrases and sentences to further enhance coherence. The resulting summaries are automatically evaluated using ROUGE metrics, demonstrating the superior performance of the proposed approach compared to various benchmark and baseline methods on both the DUC 2004 benchmark dataset and the newly created educational dataset, EduSumm.
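
The selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' model: it assumes the common concept-based objective (maximize the total weight of distinct concepts covered by the chosen sentences, subject to a length budget), treats phrases whose vectors are nearly parallel as one concept so that redundant phrases are counted once, and solves the tiny instance by exhaustive enumeration instead of an ILP solver. The phrase vectors, concept weights, similarity threshold, word budget, and all function names are hypothetical.

```python
from itertools import combinations
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def group_phrases(vectors, threshold=0.9):
    """Greedily map each phrase to a concept id; near-duplicate vectors share an id."""
    concept_of = {}
    representatives = []  # list of (concept_id, vector of first phrase in the concept)
    for phrase, vec in vectors.items():
        for cid, rep in representatives:
            if cosine(vec, rep) >= threshold:
                concept_of[phrase] = cid
                break
        else:  # no existing concept is similar enough: open a new one
            cid = len(representatives)
            representatives.append((cid, vec))
            concept_of[phrase] = cid
    return concept_of


def select_sentences(sentences, concept_of, weights, budget):
    """Exhaustively pick the sentence subset with maximal covered-concept weight.

    Each concept is counted once, however many selected sentences mention it,
    which is what discourages redundant sentences. Feasible only for toy inputs;
    a real system would hand the same objective to an ILP solver.
    """
    best_score, best_subset = -1.0, ()
    for r in range(len(sentences) + 1):
        for subset in combinations(range(len(sentences)), r):
            length = sum(len(sentences[i]["text"].split()) for i in subset)
            if length > budget:
                continue  # violates the summary length constraint
            covered = {concept_of[p] for i in subset for p in sentences[i]["phrases"]}
            score = sum(weights[c] for c in covered)
            if score > best_score:
                best_score, best_subset = score, subset
    return best_subset, best_score


# Hypothetical 3-dimensional phrase embeddings (illustrative values only).
vectors = {
    "binary search": (1.0, 0.0, 0.0),
    "bisection search": (0.98, 0.1, 0.0),
    "sorted array": (0.0, 1.0, 0.0),
    "time complexity": (0.0, 0.0, 1.0),
}
sentences = [
    {"text": "Binary search halves the interval each step",
     "phrases": ["binary search"]},
    {"text": "Bisection search works on a sorted array",
     "phrases": ["bisection search", "sorted array"]},
    {"text": "Its time complexity is logarithmic",
     "phrases": ["time complexity"]},
]

concept_of = group_phrases(vectors)
weights = {0: 3.0, 1: 2.0, 2: 1.0}  # hypothetical concept weights
chosen, score = select_sentences(sentences, concept_of, weights, budget=12)
print(chosen, score)
```

In this toy instance "binary search" and "bisection search" fall into the same concept, so selecting both sentences that mention them would add length without adding score. Under the 12-word budget the search therefore prefers the second and third sentences, which together cover all three concepts.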

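Source journal: International Journal of System Assurance Engineering and Management
CiteScore: 4.30
Self-citation rate: 10.00%
Articles published: 252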
Journal description: This Journal is established with a view to cater to increased awareness for high quality research in the seamless integration of heterogeneous technologies to formulate bankable solutions to the emergent complex engineering problems. Assurance engineering could be thought of as relating to the provision of higher confidence in the reliable and secure implementation of a system’s critical characteristic features through the espousal of a holistic approach by using a wide variety of cross-disciplinary tools and techniques. Successful realization of sustainable and dependable products, systems and services involves an extensive adoption of Reliability, Quality, Safety and Risk related procedures for achieving high assurance levels of performance; also pivotal are the management issues related to risk and uncertainty that govern the practical constraints encountered in their deployment. It is our intention to provide a platform for the modeling and analysis of large engineering systems, among the other aforementioned allied goals of systems assurance engineering, leading to the enforcement of performance enhancement measures. Achieving a fine balance between theory and practice is the primary focus. The Journal only publishes high quality papers that have passed the rigorous peer review procedure of an archival scientific Journal. The aim is an increasing number of submissions, wide circulation and a high impact factor.