{"title":"List Based Matching Algorithm for Classifying News Articles in NewsPage.com","authors":"T. Jo, Gwyduk Yeom","doi":"10.1109/SYSOSE.2008.4724142","DOIUrl":null,"url":null,"abstract":"This research proposes an alternative approach to machine learning based ones for categorizing news articles given as in plain texts. In order to use one of machine learning based approaches for the task, documents should be encoded into numerical vectors; it causes two problems: huge dimensionality and sparse distribution. The proposed approach is intended to address the two problems. In other words, the two problems are avoided by encoding a document or documents into a table, instead of numerical vectors. Therefore, the goal of the research is to improve the performance of text categorization by solving the two problems.","PeriodicalId":425055,"journal":{"name":"2008 IEEE International Workshop on Semantic Computing and Applications","volume":"43 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2008-06-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2008 IEEE International Workshop on Semantic Computing and Applications","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/SYSOSE.2008.4724142","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 2
Abstract
This research proposes an alternative approach to machine learning based ones for categorizing news articles given as in plain texts. In order to use one of machine learning based approaches for the task, documents should be encoded into numerical vectors; it causes two problems: huge dimensionality and sparse distribution. The proposed approach is intended to address the two problems. In other words, the two problems are avoided by encoding a document or documents into a table, instead of numerical vectors. Therefore, the goal of the research is to improve the performance of text categorization by solving the two problems.