Enhancing extractive multi-documents summarization with a novel dominating set model for semantic relationship detection
Authors: Said Yunus, Cengiz Hark, Fatih Okumuş
Journal: Engineering Science and Technology, an International Journal (JESTECH), vol. 69, Article 102127
Published: 2025-07-04
DOI: 10.1016/j.jestch.2025.102127
URL: https://www.sciencedirect.com/science/article/pii/S221509862500182X
Impact factor: 5.1; JCR: Q1 (Engineering, Multidisciplinary)
Open access: no
Citations: 0
Abstract
This paper proposes the Dominant Set-Based Extractive Text Summarizing (DSETS) framework, a new approach to automatic text summarization. Using the Minimum Dominant Set technique, the framework builds summaries from a word-level graph representation, minimizing information loss while preserving significant semantics. DSETS aims to offer an alternative perspective on computational text summarization. The segmentation it applies distributes the processing load and reduces time complexity, giving more scalable performance on large datasets. Empirical runtime and memory evaluations showed that the segmentation strategy reduced processing time by up to 24% and kept memory usage comparable to that of lighter baseline methods, demonstrating its practicality in resource-constrained environments. Compared with a series of text summarization techniques, DSETS offered significantly improved summarization performance. Experiments were conducted on four datasets (BBC News, XSum, CNN/Daily Mail and MultiNews), and summaries of varying word lengths were generated. The framework achieved the highest ROUGE (1, 2, L, W) scores on most summary configurations across the datasets and word counts. In particular, ROUGE-W F-scores improved by up to 15.8%, while ROUGE-1 and ROUGE-L showed significant increases of 3% to 8% across various summary lengths.
The evaluation results suggest that DSETS outperformed many state-of-the-art summarization methods, with improvements of 1.3% to 15.8% depending on the metric and dataset. An ablation study was carried out to identify which parts of the system contributed most to this result. It indicated that the segmentation mechanism and the semantic filtering process played a key role, particularly in enhancing recall-based performance. Taken together, these results indicate that DSETS is a strong and reliable framework for extractive summarization, especially for single-topic documents, and a promising option for building lightweight and interpretable summarization systems in future applications.
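The abstract does not spell out how the dominating set is computed, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes that words are graph nodes, that two words are linked when they share a sentence, and that a greedy approximation stands in for the exact minimum dominating set (which is NP-hard to find). The function names `build_word_graph`, `greedy_dominating_set`, and `rank_sentences` are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def build_word_graph(sentences):
    """Build an undirected word graph. Assumed edge rule: two words are
    linked when they occur in the same sentence."""
    graph = defaultdict(set)
    for sentence in sentences:
        words = sorted(set(sentence.lower().split()))
        for word in words:
            graph[word]  # register the word even if it has no neighbours
        for a, b in combinations(words, 2):
            graph[a].add(b)
            graph[b].add(a)
    return dict(graph)


def greedy_dominating_set(graph):
    """Greedy approximation of a minimum dominating set: repeatedly pick
    the node that covers the most still-uncovered nodes (itself plus its
    neighbours) until every node is covered."""
    uncovered = set(graph)
    dominating = set()
    while uncovered:
        # sorted() makes tie-breaking deterministic (alphabetical order).
        best = max(sorted(graph),
                   key=lambda n: len(({n} | graph[n]) & uncovered))
        dominating.add(best)
        uncovered -= {best} | graph[best]
    return dominating


def rank_sentences(sentences, dominating):
    """Order sentences by how many dominating words they contain
    (an assumed scoring rule); ties keep the original order."""
    scored = [(len(set(s.lower().split()) & dominating), i)
              for i, s in enumerate(sentences)]
    return [sentences[i] for _, i in sorted(scored, key=lambda t: (-t[0], t[1]))]


if __name__ == "__main__":
    docs = ["cats chase mice", "dogs chase cats", "birds sing"]
    word_graph = build_word_graph(docs)
    key_words = greedy_dominating_set(word_graph)
    print(sorted(key_words))
    print(rank_sentences(docs, key_words)[:2])
```

Every word in the graph is either in the returned set or adjacent to a word in it, which is the property that lets a small set of words stand in for the whole text. A segmentation step like the one the abstract describes could plausibly run this per segment and merge the results, but that detail is an assumption here.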
About the journal:
Engineering Science and Technology, an International Journal (JESTECH) (formerly Technology) is a peer-reviewed quarterly engineering journal. It publishes high-quality theoretical and experimental papers of permanent interest, not previously published in journals, in engineering and applied science, with the aim of promoting the theory and practice of technology and engineering. In addition to peer-reviewed original research papers, the Editorial Board welcomes original research reports, state-of-the-art reviews and communications in the broadly defined field of engineering science and technology.
JESTECH covers a wide spectrum of subjects, including:
-Electrical/Electronics and Computer Engineering (Biomedical Engineering and Instrumentation; Coding, Cryptography, and Information Protection; Communications, Networks, Mobile Computing and Distributed Systems; Compilers and Operating Systems; Computer Architecture, Parallel Processing, and Dependability; Computer Vision and Robotics; Control Theory; Electromagnetic Waves, Microwave Techniques and Antennas; Embedded Systems; Integrated Circuits, VLSI Design, Testing, and CAD; Microelectromechanical Systems; Microelectronics, and Electronic Devices and Circuits; Power, Energy and Energy Conversion Systems; Signal, Image, and Speech Processing)
-Mechanical and Civil Engineering (Automotive Technologies; Biomechanics; Construction Materials; Design and Manufacturing; Dynamics and Control; Energy Generation, Utilization, Conversion, and Storage; Fluid Mechanics and Hydraulics; Heat and Mass Transfer; Micro-Nano Sciences; Renewable and Sustainable Energy Technologies; Robotics and Mechatronics; Solid Mechanics and Structure; Thermal Sciences)
-Metallurgical and Materials Engineering (Advanced Materials Science; Biomaterials; Ceramic and Inorganic Materials; Electronic-Magnetic Materials; Energy and Environment; Materials Characterization; Metallurgy; Polymers and Nanocomposites)