{"title":"短更新-基于机器学习的新闻摘要","authors":"Raksha Dangol, Prashna Adhikari, Pranjal Dahal, Hrizu Sharma","doi":"10.3126/jacem.v8i2.55939","DOIUrl":null,"url":null,"abstract":"Automated Text Summarization is becoming important due to the vast amount of data being generated. Manual processing of documents is tedious, mostly due to the absence of standards. Therefore, there is a need for a mechanism to reduce text size, structure it, and make it readable for users. Natural Language Processing (NLP) is critical for analyzing large amounts of unstructured, text-heavy data. This project aims to address concerns with extractive and abstractive text summarization by introducing a new neural network model that deals with repetitive and incoherent phrases in longer documents. The model incorporates a novel Seq2Seq architecture that enhances the standard attentional model with an intra-attention mechanism. Additionally, a new training method that combines supervised word prediction and reinforcement learning is employed. The model utilizes a hybrid pointer-generator network, which distinguishes it from the standard encoder-decoder model. This approach produces higher quality summaries than existing models.","PeriodicalId":306432,"journal":{"name":"Journal of Advanced College of Engineering and Management","volume":"39 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2023-06-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Short Updates- Machine Learning Based News Summarizer\",\"authors\":\"Raksha Dangol, Prashna Adhikari, Pranjal Dahal, Hrizu Sharma\",\"doi\":\"10.3126/jacem.v8i2.55939\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Automated Text Summarization is becoming important due to the vast amount of data being generated. Manual processing of documents is tedious, mostly due to the absence of standards. Therefore, there is a need for a mechanism to reduce text size, structure it, and make it readable for users. Natural Language Processing (NLP) is critical for analyzing large amounts of unstructured, text-heavy data. This project aims to address concerns with extractive and abstractive text summarization by introducing a new neural network model that deals with repetitive and incoherent phrases in longer documents. The model incorporates a novel Seq2Seq architecture that enhances the standard attentional model with an intra-attention mechanism. Additionally, a new training method that combines supervised word prediction and reinforcement learning is employed. The model utilizes a hybrid pointer-generator network, which distinguishes it from the standard encoder-decoder model. 
This approach produces higher quality summaries than existing models.\",\"PeriodicalId\":306432,\"journal\":{\"name\":\"Journal of Advanced College of Engineering and Management\",\"volume\":\"39 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2023-06-23\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Journal of Advanced College of Engineering and Management\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.3126/jacem.v8i2.55939\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of Advanced College of Engineering and Management","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.3126/jacem.v8i2.55939","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Short Updates- Machine Learning Based News Summarizer
Automated text summarization is becoming increasingly important due to the vast amount of textual data being generated. Manual processing of documents is tedious, largely because of the absence of standards. Therefore, a mechanism is needed to reduce text size, structure it, and make it readable for users. Natural Language Processing (NLP) is critical for analyzing large amounts of unstructured, text-heavy data. This project addresses shortcomings of extractive and abstractive text summarization by introducing a new neural network model that handles repetitive and incoherent phrases in longer documents. The model incorporates a novel Seq2Seq architecture that augments the standard attention model with an intra-attention mechanism. Additionally, a new training method that combines supervised word prediction and reinforcement learning is employed. The model uses a hybrid pointer-generator network, which distinguishes it from the standard encoder-decoder model. This approach produces higher-quality summaries than existing models.
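The hybrid pointer-generator network mentioned in the abstract mixes two distributions at each decoding step: one for generating a word from a fixed vocabulary and one for copying a word from the source article, weighted by a learned gate. The sketch below is a minimal, hypothetical illustration of that copy mechanism under these assumptions, not the authors' implementation; the class name PointerGeneratorHead, the dot-product attention, the layer sizes, and the toy inputs are all made up for the example.

```python
# Minimal sketch of a pointer-generator output head (illustrative only).
import torch
import torch.nn as nn
import torch.nn.functional as F


class PointerGeneratorHead(nn.Module):
    def __init__(self, hidden_size: int, vocab_size: int):
        super().__init__()
        self.vocab_proj = nn.Linear(hidden_size * 2, vocab_size)  # generation distribution
        self.p_gen_proj = nn.Linear(hidden_size * 2, 1)           # copy/generate gate

    def forward(self, decoder_state, encoder_states, src_token_ids):
        # decoder_state:  (batch, hidden)          current decoder hidden state
        # encoder_states: (batch, src_len, hidden) encoder outputs
        # src_token_ids:  (batch, src_len)         source ids, used to place copied mass

        # Dot-product attention over the source tokens.
        scores = torch.bmm(encoder_states, decoder_state.unsqueeze(2)).squeeze(2)
        attn = F.softmax(scores, dim=-1)                       # (batch, src_len)

        # Context vector summarizing the attended source.
        context = torch.bmm(attn.unsqueeze(1), encoder_states).squeeze(1)
        combined = torch.cat([decoder_state, context], dim=-1)

        # Generation distribution over the fixed vocabulary.
        p_vocab = F.softmax(self.vocab_proj(combined), dim=-1)  # (batch, vocab)

        # Gate: how much probability mass to generate vs. copy.
        p_gen = torch.sigmoid(self.p_gen_proj(combined))        # (batch, 1)

        # Final distribution: generate with prob p_gen, copy with prob (1 - p_gen).
        final = p_gen * p_vocab
        final = final.scatter_add(1, src_token_ids, (1 - p_gen) * attn)
        return final


if __name__ == "__main__":
    batch, src_len, hidden, vocab = 2, 5, 8, 20
    head = PointerGeneratorHead(hidden, vocab)
    dec = torch.randn(batch, hidden)
    enc = torch.randn(batch, src_len, hidden)
    src = torch.randint(0, vocab, (batch, src_len))
    dist = head(dec, enc, src)
    print(dist.shape, dist.sum(dim=-1))  # each row sums to ~1.0
```

Because the copy mass is scattered onto the positions of the actual source tokens, the decoder can reproduce rare or out-of-vocabulary words from the article, which is the property that distinguishes this head from a plain encoder-decoder softmax.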