Research and application of news-text similarity algorithm based on Chinese word segmentation
Authors: Wei Guan, Pengzhou Zhang
Published in: 2013 3rd International Conference on Consumer Electronics, Communications and Networks (2013-11-01)
DOI: 10.1109/CECNET.2013.6703375 (https://doi.org/10.1109/CECNET.2013.6703375)
Citations: 3
Abstract
With the rapid development of the Internet, the volume of text on the network is growing exponentially. Given this vast amount of information, quickly and efficiently identifying similar news texts published on different sites plays a major role in strengthening the integrated management of network information. Existing text similarity algorithms have many disadvantages when applied to Chinese news texts, so we propose a more suitable and effective news-text similarity algorithm. The algorithm applies Chinese word segmentation and, on that basis, compares news texts using an improved vector space model. Experimental results show that the proposed method outperforms traditional methods, supporting the effectiveness of the proposed Chinese news-text similarity calculation method.
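
The abstract does not spell out the improved vector space model, so the following is only a minimal sketch of the baseline idea it builds on: segment two texts into terms, build term-frequency vectors, and compare them with cosine similarity. The function names `segment` and `cosine_similarity` are hypothetical, and `segment` is a stand-in that emits overlapping character bigrams rather than the output of a real Chinese word segmenter; none of this should be read as the authors' implementation.

```python
import math
from collections import Counter


def segment(text: str) -> list[str]:
    """Stand-in segmenter (assumption): overlapping character bigrams.

    A real system would call a proper Chinese word segmenter here.
    Bigrams keep the sketch dependency-free. Punctuation and whitespace
    are dropped first, so bigrams may span a removed punctuation mark.
    """
    chars = [c for c in text if c.isalnum()]
    if len(chars) < 2:
        return ["".join(chars)] if chars else []
    return [chars[i] + chars[i + 1] for i in range(len(chars) - 1)]


def cosine_similarity(tokens_a: list[str], tokens_b: list[str]) -> float:
    """Cosine similarity of two raw term-frequency vectors.

    This is the plain vector space model, not the improved variant
    the abstract refers to.
    """
    tf_a, tf_b = Counter(tokens_a), Counter(tokens_b)
    # Only terms present in both texts contribute to the dot product.
    dot = sum(tf_a[t] * tf_b[t] for t in tf_a.keys() & tf_b.keys())
    norm_a = math.sqrt(sum(v * v for v in tf_a.values()))
    norm_b = math.sqrt(sum(v * v for v in tf_b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an empty text is treated as similar to nothing
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    # Made-up example sentences, used only for illustration.
    a = "北京今天发布了新的交通政策"
    b = "北京今天公布了新的交通政策"
    c = "足球比赛昨晚在上海举行"
    print(cosine_similarity(segment(a), segment(b)))
    print(cosine_similarity(segment(a), segment(c)))
```

Swapping `segment` for a dictionary-based or statistical word segmenter would change the terms but not the comparison step, which is why the two functions are kept separate in this sketch.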