Temporal Topic Inference for Trend Prediction

S. Aghababaei, M. Makrehchi
Published in: 2015 IEEE International Conference on Data Mining Workshop (ICDMW), 2015-11-14
DOI: 10.1109/ICDMW.2015.214 (https://doi.org/10.1109/ICDMW.2015.214)
Citations: 5

Abstract

Publicly available social data has been widely adopted to explore the language of crowds and to leverage it in predicting real-world problems. In microblogs, users extensively share information about their moods, topics of interest, and social events, which provides an ideal data resource for many applications. We also study the footprints of social problems in Twitter data. Hidden topics identified from Twitter content are used to predict crime trends. Since our problem has a sequential order, extracting meaningful patterns involves temporal analysis. The prediction model must address information evolution, in which data are more related when they are close in time than when they are further apart. The study is presented in two steps. First, a temporal topic detection model is introduced to infer predictive hidden topics. The model builds a dynamic vocabulary to detect emerging topics. Topics are compared over time to ensure diversity and novelty in each time period considered. Second, a predictive model is proposed that uses the identified temporal topics to predict the crime trend in a prospective time frame. The model does not suffer from a lack of available learning examples: learning examples are annotated with knowledge inferred from the trend. The experiments show that temporal topic detection outperforms static topic modeling when dealing with sequential data. Topics are more diverse when they are inferred in different time slices. In general, the results indicate that temporal topics have a strong correlation with crime index changes. Predictability is high for some specific crime types and can vary depending on the incidents. The study provides insight into the correlation between language and real-world problems and into the impact of social data in providing predictive indicators.
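
The abstract names three ingredients without giving their details: a vocabulary rebuilt per time slice, a comparison of topics across slices, and learning examples labeled from the trend itself. The sketch below is only one possible reading of those ideas, not the authors' method. It uses plain term sets as a crude stand-in for a topic model, Jaccard distance as an assumed novelty measure, and an up/down label derived from consecutive index values. All function names, the min_count threshold, and the toy data are hypothetical.

```python
from collections import Counter


def slice_vocabulary(docs, min_count=2):
    """Vocabulary for one time slice: terms seen at least min_count times.

    Assumption: whitespace tokenization and a frequency cutoff stand in
    for whatever vocabulary construction the paper actually uses.
    """
    counts = Counter(tok for doc in docs for tok in doc.lower().split())
    return {term for term, n in counts.items() if n >= min_count}


def emerging_terms(prev_vocab, curr_vocab):
    """Terms present in the current slice but absent from the previous one."""
    return curr_vocab - prev_vocab


def novelty(prev_terms, curr_terms):
    """Jaccard distance between two term sets: 0 = identical, 1 = disjoint."""
    union = prev_terms | curr_terms
    if not union:
        return 0.0
    return 1.0 - len(prev_terms & curr_terms) / len(union)


def trend_labels(index):
    """Label slice t with 1 if the index rises from t to t+1, else 0.

    Assumption: this is one simple way to annotate examples "from the
    trend"; the abstract does not say how the labels are derived.
    """
    return [1 if nxt > cur else 0 for cur, nxt in zip(index, index[1:])]


# Toy data, invented for illustration only.
slices = [
    ["robbery downtown tonight", "robbery reported downtown"],
    ["police arrest downtown", "arrest made by police"],
]
vocabs = [slice_vocabulary(docs) for docs in slices]
new_terms = emerging_terms(vocabs[0], vocabs[1])
score = novelty(vocabs[0], vocabs[1])
labels = trend_labels([10, 12, 11])
```

In a fuller pipeline, the per-slice term sets would be replaced by topics inferred by a topic model on each slice, and the labels would be paired with the features of the slice that precedes each index change.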