Popularity-based temporal relevance estimation for micro-blogging retrieval

Manar Alohaly, J. D. Teresco
Published in: Proceedings of the 2014 ACM Southeast Regional Conference
Publication date: 2014-03-28
DOI: 10.1145/2638404.2675721 (https://doi.org/10.1145/2638404.2675721)
Citations: 0

Abstract

Finding relevant information among the vast amounts of data generated continuously by modern micro-blogging platforms poses new challenges for information retrieval. Recent studies on time-based retrieval have shown that identifying relevant time periods and incorporating them into the retrieval process is promising. By relevant time period we mean the peak time of a query that satisfies the temporal needs of the user's query; in other words, a time period in which the potential to find accurate matches for the query in a set of retrieved documents is relatively high compared with other time periods. We refer to this as temporal relevance. In this paper, using data collected from Twitter, we propose a new temporal relevance estimation technique based on tracking documents published by popular users, i.e., users with high indegree (number of followers). We concentrate on queries that are short (one word) and popular, i.e., constantly consumed by micro-bloggers. As a baseline against which to evaluate our technique, we use the simple frequency-based technique for estimating the relevant time period. Compared with the baseline, our technique either identifies the same time period as the most relevant one or suggests a better one. In fact, for the type of queries in our study, narrowing the focus to documents published by popular users produces a query-to-documents matching pattern that uncovers information about temporal relevance that might otherwise be hidden. Our matching pattern also reflects the nature of the real-world news events related to the user's query more closely than the baseline does, thus revealing the important time frames more clearly.
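
The contrast between the frequency-based baseline and the popularity-based estimate can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not specify the time-bucket size, the follower threshold, or the matching rule, so daily buckets, the `min_followers` cutoff, whole-word matching, the `Post` type, and the sample data are all assumptions made up for illustration.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Post:
    """Hypothetical record for one micro-blog post."""
    author: str
    followers: int  # indegree of the author
    day: date       # publication day (daily buckets are an assumption)
    text: str


def matches(query: str, post: Post) -> bool:
    # One-word query: case-insensitive whole-word match (assumed rule).
    return query.lower() in post.text.lower().split()


def histogram(query: str, posts: list[Post], min_followers: int = 0) -> Counter:
    """Count matching posts per day, keeping only authors whose follower
    count is at least min_followers (0 keeps everyone, i.e. the baseline)."""
    counts: Counter = Counter()
    for p in posts:
        if p.followers >= min_followers and matches(query, p):
            counts[p.day] += 1
    return counts


def peak_period(counts: Counter):
    """Return the day with the most matches (earliest day wins ties),
    or None when nothing matched."""
    if not counts:
        return None
    return min(counts, key=lambda d: (-counts[d], d))


# Made-up sample data: day 1 has more chatter, day 2 has the popular accounts.
POSTS = [
    Post("a", 120, date(2014, 1, 1), "quake rumors again"),
    Post("b", 80, date(2014, 1, 1), "another quake joke"),
    Post("c", 300, date(2014, 1, 1), "quake quake quake"),
    Post("news1", 50000, date(2014, 1, 2), "Quake hits the coast"),
    Post("news2", 90000, date(2014, 1, 2), "quake damage reported"),
    Post("d", 40, date(2014, 1, 2), "good morning"),
]

baseline_peak = peak_period(histogram("quake", POSTS))
popular_peak = peak_period(histogram("quake", POSTS, min_followers=10000))
print("baseline peak:", baseline_peak)
print("popular-user peak:", popular_peak)
```

In this toy data the two estimates disagree: raw frequency favors the day with the most posts, while restricting to high-indegree authors favors the day on which popular accounts posted. Whether the second day is actually the more relevant one is a judgment the sketch cannot make; the paper's claim rests on its own Twitter data.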