Short Updates- Machine Learning Based News Summarizer

Raksha Dangol, Prashna Adhikari, Pranjal Dahal, Hrizu Sharma
Journal: Journal of Advanced College of Engineering and Management
DOI: 10.3126/jacem.v8i2.55939 (https://doi.org/10.3126/jacem.v8i2.55939)
Published: 2023-06-23 (Journal Article)
Citation count: 0

Abstract

Automated text summarization is increasingly important given the vast amount of data being generated. Manual processing of documents is tedious, largely because of the absence of standards, so a mechanism is needed to reduce text size, structure it, and make it readable for users. Natural Language Processing (NLP) is critical for analyzing large volumes of unstructured, text-heavy data. This project addresses shortcomings of extractive and abstractive text summarization by introducing a neural network model that handles the repetitive and incoherent phrases such systems produce on longer documents. The model uses a Seq2Seq architecture that augments the standard attentional model with an intra-attention mechanism, and it is trained with a method that combines supervised word prediction and reinforcement learning. A hybrid pointer-generator network distinguishes the model from the standard encoder-decoder: it can copy words directly from the source text as well as generate them from a fixed vocabulary. This approach produces higher-quality summaries than existing models.
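The core idea of the hybrid pointer-generator step mentioned above can be sketched as follows. This is not the authors' code; it is a minimal illustration, assuming the usual formulation in which a learned scalar p_gen blends a vocabulary (generation) distribution with a copy distribution derived from the attention weights over source tokens. All names (`vocab_dist`, `attention`, `src_ids`, `p_gen`) are illustrative.

```python
def pointer_generator_dist(vocab_dist, attention, src_ids, p_gen):
    """Blend generation and copying into one output distribution.

    vocab_dist : per-word generation probabilities over the fixed vocabulary
    attention  : attention weight for each source token (sums to 1)
    src_ids    : vocabulary id of each source token (enables copying)
    p_gen      : probability of generating rather than copying, in [0, 1]
    """
    # Start with the scaled generation distribution.
    final = [p_gen * p for p in vocab_dist]
    # Add copy mass onto each source token's vocabulary slot; a rare
    # word present in the source can thus receive probability even if
    # the generator alone would never produce it.
    for weight, tok in zip(attention, src_ids):
        final[tok] += (1.0 - p_gen) * weight
    return final

# Toy example: token 3 has zero generation probability but appears in
# the source, so the copy mechanism still assigns it probability.
dist = pointer_generator_dist(
    vocab_dist=[0.7, 0.2, 0.1, 0.0],
    attention=[0.5, 0.5],
    src_ids=[3, 0],
    p_gen=0.8,
)
```

Because both input distributions sum to one, the blended output is itself a valid probability distribution; this is what lets the model be trained end to end with a single cross-entropy (or reinforcement-learning) objective.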