Authors: Yi Zhao, Kun Zhang, Hong Zhang, Xia Yan, Ying Cai
Published in: 2017 International Conference on Progress in Informatics and Computing (PIC), 2017-12-01
DOI: 10.1109/PIC.2017.8359580 (https://doi.org/10.1109/PIC.2017.8359580)
Hot topic detection based on combined content and time similarity
Hot topic detection has long been an active research area with many real-world applications. Most previous work, however, considers only the textual content of a news item and ignores its other attributes, such as publication time, which can also help indicate which topic the item describes. Other approaches rely on a single method for computing text similarity, and each such method has its own drawbacks. To address these problems, we propose a topic detection algorithm that accounts for the difference in information between the title and the body text, combines several methods to compute text similarity, and combines text similarity with time similarity. We evaluate the combined text similarity calculation methods and compare the effect of several time similarity equations. We then use three models to compute the combined similarity: a linear model, a quadratic polynomial model, and a neural network model. Finally, we present and analyze the results of our experiments.
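To make the idea of a linear combination of content and time similarity concrete, the following is a minimal sketch, not the authors' implementation. The abstract does not give the actual equations, so everything specific here is an assumption made for illustration: bag-of-words cosine similarity stands in for the text similarity methods, an exponential decay stands in for the time similarity equation, and the function names, the `scale_hours` parameter, the weights, and the document fields (`title`, `body`, `time_hours`) are all hypothetical.

```python
import math
from collections import Counter


def cosine_similarity(a: str, b: str) -> float:
    """Cosine similarity between bag-of-words vectors of two strings.

    Assumption: whitespace tokenization; the paper's text similarity
    methods are not specified in the abstract.
    """
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[w] * vb[w] for w in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def time_similarity(t1_hours: float, t2_hours: float,
                    scale_hours: float = 24.0) -> float:
    """Assumed exponential decay in the publication-time gap.

    Returns 1.0 for identical timestamps and tends to 0 as the gap grows.
    """
    return math.exp(-abs(t1_hours - t2_hours) / scale_hours)


def combined_similarity(doc1: dict, doc2: dict,
                        w_title: float = 0.3,
                        w_body: float = 0.4,
                        w_time: float = 0.3) -> float:
    """Linear model: weighted sum of title, body, and time similarity.

    The weights are illustrative placeholders, not values from the paper.
    """
    title_sim = cosine_similarity(doc1["title"], doc2["title"])
    body_sim = cosine_similarity(doc1["body"], doc2["body"])
    time_sim = time_similarity(doc1["time_hours"], doc2["time_hours"])
    return w_title * title_sim + w_body * body_sim + w_time * time_sim


if __name__ == "__main__":
    a = {"title": "storm hits coast",
         "body": "a strong storm hit the coast today",
         "time_hours": 0.0}
    b = {"title": "coast storm damage",
         "body": "the storm caused damage along the coast",
         "time_hours": 6.0}
    print(combined_similarity(a, b))
```

Scoring the title and body separately lets the title, which is short but information-dense, carry its own weight rather than being swamped by the longer body text. A quadratic polynomial or neural network model would replace the final weighted sum with a different function of the same three similarity values.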