Authors: Zheng Fen, Xu Yabin, Li Yanping
Venue: 2012 4th International Conference on Intelligent Human-Machine Systems and Cybernetics
Published: 2012-08-26
DOI: 10.1109/IHMSC.2012.26
Research on Internet Hot Topic Detection Based on MapReduce Architecture
Internet public opinion increasingly influences people's daily lives and social stability. As the Internet develops, the volume of online information has become huge and is updated quickly, so public opinion mining faces enormous challenges in handling massive, complex data. This paper proposes an Internet hot topic detection scheme based on a cloud computing platform. The MapReduce programming model is introduced into network public opinion analysis to process massive, complex data. The scheme uses named-entity words as text features and represents each text as a two-dimensional VSM (vector space model) that combines the title and the body, in order to improve the accuracy of Internet public opinion analysis and the system's response speed.
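The abstract does not give implementation details, so the following is only a minimal single-process sketch of how a title/body two-dimensional VSM over named-entity features might be built with map and reduce steps. The entity lexicon `ENTITIES`, the function names, the raw-count weighting, and the `title_weight` value are all assumptions made for illustration; they are not taken from the paper, and a real deployment would run the two steps on a MapReduce framework with a proper named-entity recognizer.

```python
import math
from collections import Counter, defaultdict

# Hypothetical stand-in for a named-entity recognizer: a fixed lexicon.
ENTITIES = {"beijing", "olympics", "apple", "iphone"}

def map_doc(doc_id, title, body):
    """Map step: emit ((doc_id, field, entity), 1) per entity occurrence."""
    for field, text in (("title", title), ("body", body)):
        for token in text.lower().split():
            if token in ENTITIES:
                yield (doc_id, field, token), 1

def reduce_counts(pairs):
    """Reduce step: sum counts per key, then group them into one
    title vector and one body vector per document."""
    totals = Counter()
    for key, count in pairs:
        totals[key] += count
    vectors = defaultdict(lambda: {"title": Counter(), "body": Counter()})
    for (doc_id, field, entity), count in totals.items():
        vectors[doc_id][field][entity] = count
    return vectors

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def similarity(u, v, title_weight=0.6):
    """Combine title and body similarity; the 0.6 weight is an assumption."""
    return (title_weight * cosine(u["title"], v["title"])
            + (1 - title_weight) * cosine(u["body"], v["body"]))

# Toy documents, invented for this example.
docs = {
    "d1": ("Beijing Olympics open", "the olympics in beijing start today"),
    "d2": ("Olympics Beijing ceremony", "beijing hosts the olympics"),
    "d3": ("Apple iPhone launch", "apple shows new iphone"),
}

pairs = [kv for doc_id, (title, body) in docs.items()
         for kv in map_doc(doc_id, title, body)]
vectors = reduce_counts(pairs)

same_topic = similarity(vectors["d1"], vectors["d2"])
different_topic = similarity(vectors["d1"], vectors["d3"])
```

Keeping the title and body as separate vectors lets the title, which usually names the key entities, carry a different weight than the body when documents are compared or clustered into topics.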