Automatic labeling for news article classification based on paragraph vector
Taishi Saito, O. Uchida
2017 9th International Conference on Information Technology and Electrical Engineering (ICITEE), 2017-10-01
DOI: https://doi.org/10.1109/ICITEED.2017.8250448
Citations: 3
Abstract
The Internet is an important source of useful information, and news sites are among the services most often used to obtain it. News sites update quickly and cover a wide variety of topics, and in recent years some sites have partnered with multiple newspaper companies to post content in bulk. However, the sheer number of articles makes it difficult to find the ones we would like to read, so how articles are classified and presented is an important issue. In this study, we consider category classification of documents using distributed representations of sentences. Specifically, we propose a method that classifies articles by extracting words with similar meanings from the sentence vectors of each category and assigning them as labels.
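
The abstract does not say how the vectors are trained or how the labels are matched against new articles, so the following is only a minimal sketch of one possible reading of the idea. It assumes that words and articles are embedded in the same vector space, as in a paragraph vector model, and it uses small hand-written toy vectors in place of learned embeddings. The names `label_words` and `classify`, the centroid step, and the averaged cosine score are all illustrative assumptions, not the authors' procedure.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def centroid(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def label_words(category_vectors, word_vectors, top_n=2):
    """For each category, pick the top_n words closest to the category centroid.

    Assumption: the centroid of a category's article vectors stands in for
    "the sentence vectors of each category" mentioned in the abstract.
    """
    labels = {}
    for category, vectors in category_vectors.items():
        center = centroid(vectors)
        ranked = sorted(
            word_vectors,
            key=lambda word: cosine(center, word_vectors[word]),
            reverse=True,
        )
        labels[category] = ranked[:top_n]
    return labels


def classify(article_vector, labels, word_vectors):
    """Assign the category whose label words are, on average, most similar.

    Assumption: averaged cosine similarity is used as the matching score.
    """
    def score(category):
        sims = [cosine(article_vector, word_vectors[w]) for w in labels[category]]
        return sum(sims) / len(sims)

    return max(labels, key=score)


# Toy 2-dimensional vectors (hypothetical, not learned from real data).
word_vectors = {
    "match": [0.9, 0.1],
    "goal": [0.8, 0.2],
    "election": [0.1, 0.9],
    "policy": [0.2, 0.8],
}
category_vectors = {
    "sports": [[1.0, 0.0], [0.9, 0.2]],
    "politics": [[0.0, 1.0], [0.1, 0.8]],
}

labels = label_words(category_vectors, word_vectors)
predicted = classify([0.7, 0.3], labels, word_vectors)
print(labels)
print(predicted)
```

In practice the toy dictionaries would be replaced by vectors from a trained paragraph vector model. The rest of the sketch depends only on cosine similarity between vectors in a shared space.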