{"title":"短更新-基于机器学习的新闻摘要","authors":"Raksha Dangol, Prashna Adhikari, Pranjal Dahal, Hrizu Sharma","doi":"10.3126/jacem.v8i2.55939","DOIUrl":null,"url":null,"abstract":"Automated Text Summarization is becoming important due to the vast amount of data being generated. Manual processing of documents is tedious, mostly due to the absence of standards. Therefore, there is a need for a mechanism to reduce text size, structure it, and make it readable for users. Natural Language Processing (NLP) is critical for analyzing large amounts of unstructured, text-heavy data. This project aims to address concerns with extractive and abstractive text summarization by introducing a new neural network model that deals with repetitive and incoherent phrases in longer documents. The model incorporates a novel Seq2Seq architecture that enhances the standard attentional model with an intra-attention mechanism. Additionally, a new training method that combines supervised word prediction and reinforcement learning is employed. The model utilizes a hybrid pointer-generator network, which distinguishes it from the standard encoder-decoder model. This approach produces higher quality summaries than existing models.","PeriodicalId":306432,"journal":{"name":"Journal of Advanced College of Engineering and Management","volume":"39 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2023-06-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Short Updates- Machine Learning Based News Summarizer\",\"authors\":\"Raksha Dangol, Prashna Adhikari, Pranjal Dahal, Hrizu Sharma\",\"doi\":\"10.3126/jacem.v8i2.55939\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Automated Text Summarization is becoming important due to the vast amount of data being generated. Manual processing of documents is tedious, mostly due to the absence of standards. Therefore, there is a need for a mechanism to reduce text size, structure it, and make it readable for users. Natural Language Processing (NLP) is critical for analyzing large amounts of unstructured, text-heavy data. This project aims to address concerns with extractive and abstractive text summarization by introducing a new neural network model that deals with repetitive and incoherent phrases in longer documents. The model incorporates a novel Seq2Seq architecture that enhances the standard attentional model with an intra-attention mechanism. Additionally, a new training method that combines supervised word prediction and reinforcement learning is employed. The model utilizes a hybrid pointer-generator network, which distinguishes it from the standard encoder-decoder model. 
This approach produces higher quality summaries than existing models.\",\"PeriodicalId\":306432,\"journal\":{\"name\":\"Journal of Advanced College of Engineering and Management\",\"volume\":\"39 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2023-06-23\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Journal of Advanced College of Engineering and Management\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.3126/jacem.v8i2.55939\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of Advanced College of Engineering and Management","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.3126/jacem.v8i2.55939","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Short Updates- Machine Learning Based News Summarizer
Automated text summarization is becoming increasingly important due to the vast amount of textual data being generated. Manual processing of documents is tedious, largely because of the absence of standards. Therefore, a mechanism is needed to reduce text size, structure it, and make it readable for users. Natural Language Processing (NLP) is critical for analyzing large amounts of unstructured, text-heavy data. This project addresses shortcomings of extractive and abstractive text summarization by introducing a new neural network model that handles repetitive and incoherent phrases in longer documents. The model incorporates a novel Seq2Seq architecture that augments the standard attention model with an intra-attention mechanism. Additionally, a new training method that combines supervised word prediction and reinforcement learning is employed. The model uses a hybrid pointer-generator network, which distinguishes it from the standard encoder-decoder model. This approach produces higher-quality summaries than existing models.
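The hybrid pointer-generator network mentioned in the abstract mixes two distributions at each decoding step: one for generating a word from a fixed vocabulary and one for copying a word from the source article, weighted by a learned gate. The sketch below is a minimal, hypothetical illustration of that copy mechanism under these assumptions, not the authors' implementation; the class name PointerGeneratorHead, the dot-product attention, the layer sizes, and the toy inputs are all made up for the example.

```python
# Minimal sketch of a pointer-generator output head (illustrative only).
import torch
import torch.nn as nn
import torch.nn.functional as F


class PointerGeneratorHead(nn.Module):
    def __init__(self, hidden_size: int, vocab_size: int):
        super().__init__()
        self.vocab_proj = nn.Linear(hidden_size * 2, vocab_size)  # generation distribution
        self.p_gen_proj = nn.Linear(hidden_size * 2, 1)           # copy/generate gate

    def forward(self, decoder_state, encoder_states, src_token_ids):
        # decoder_state:  (batch, hidden)          current decoder hidden state
        # encoder_states: (batch, src_len, hidden) encoder outputs
        # src_token_ids:  (batch, src_len)         source ids, used to place copied mass

        # Dot-product attention over the source tokens.
        scores = torch.bmm(encoder_states, decoder_state.unsqueeze(2)).squeeze(2)
        attn = F.softmax(scores, dim=-1)                       # (batch, src_len)

        # Context vector summarizing the attended source.
        context = torch.bmm(attn.unsqueeze(1), encoder_states).squeeze(1)
        combined = torch.cat([decoder_state, context], dim=-1)

        # Generation distribution over the fixed vocabulary.
        p_vocab = F.softmax(self.vocab_proj(combined), dim=-1)  # (batch, vocab)

        # Gate: how much probability mass to generate vs. copy.
        p_gen = torch.sigmoid(self.p_gen_proj(combined))        # (batch, 1)

        # Final distribution: generate with prob p_gen, copy with prob (1 - p_gen).
        final = p_gen * p_vocab
        final = final.scatter_add(1, src_token_ids, (1 - p_gen) * attn)
        return final


if __name__ == "__main__":
    batch, src_len, hidden, vocab = 2, 5, 8, 20
    head = PointerGeneratorHead(hidden, vocab)
    dec = torch.randn(batch, hidden)
    enc = torch.randn(batch, src_len, hidden)
    src = torch.randint(0, vocab, (batch, src_len))
    dist = head(dec, enc, src)
    print(dist.shape, dist.sum(dim=-1))  # each row sums to ~1.0
```

Because the copy mass is scattered onto the positions of the actual source tokens, the decoder can reproduce rare or out-of-vocabulary words from the article, which is the property that distinguishes this head from a plain encoder-decoder softmax.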