Evaluating Streams of Evolving News Events

G. Baruah, Mark D. Smucker, C. Clarke
{"title":"Evaluating Streams of Evolving News Events","authors":"G. Baruah, Mark D. Smucker, C. Clarke","doi":"10.1145/2766462.2767751","DOIUrl":null,"url":null,"abstract":"People track news events according to their interests and available time. For a major event of great personal interest, they might check for updates several times an hour, taking time to keep abreast of all aspects of the evolving event. For minor events of more marginal interest, they might check back once or twice a day for a few minutes to learn about the most significant developments. Systems generating streams of updates about evolving events can improve user performance by appropriately filtering these updates, making it easy for users to track events in a timely manner without undue information overload. Unfortunately, predicting user performance on these systems poses a significant challenge. Standard evaluation methodology, designed for Web search and other adhoc retrieval tasks, adapts poorly to this context. In this paper, we develop a simple model that simulates users checking the system from time to time to read updates. For each simulated user, we generate a trace of their activities alternating between away times and reading times. These traces are then applied to measure system effectiveness. We test our model using data from the TREC 2013 Temporal Summarization Track (TST) comparing it to the effectiveness measures used in that track. The primary TST measure corresponds most closely with a modeled user that checks back once a day on average for an average of one minute. Users checking more frequently for longer times may view the relative performance of participating systems quite differently. In light of this sensitivity to user behavior, we recommend that future experiments be built around clearly stated assumptions regarding user interfaces and access patterns, with effectiveness measures reflecting these assumptions.","PeriodicalId":297035,"journal":{"name":"Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval","volume":"23 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2015-08-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"10","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/2766462.2767751","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 10

Abstract

People track news events according to their interests and available time. For a major event of great personal interest, they might check for updates several times an hour, taking time to keep abreast of all aspects of the evolving event. For minor events of more marginal interest, they might check back once or twice a day for a few minutes to learn about the most significant developments. Systems generating streams of updates about evolving events can improve user performance by appropriately filtering these updates, making it easy for users to track events in a timely manner without undue information overload. Unfortunately, predicting user performance on these systems poses a significant challenge. Standard evaluation methodology, designed for Web search and other adhoc retrieval tasks, adapts poorly to this context. In this paper, we develop a simple model that simulates users checking the system from time to time to read updates. For each simulated user, we generate a trace of their activities alternating between away times and reading times. These traces are then applied to measure system effectiveness. We test our model using data from the TREC 2013 Temporal Summarization Track (TST) comparing it to the effectiveness measures used in that track. The primary TST measure corresponds most closely with a modeled user that checks back once a day on average for an average of one minute. Users checking more frequently for longer times may view the relative performance of participating systems quite differently. In light of this sensitivity to user behavior, we recommend that future experiments be built around clearly stated assumptions regarding user interfaces and access patterns, with effectiveness measures reflecting these assumptions.
评估不断发展的新闻事件流
人们根据自己的兴趣和时间跟踪新闻事件。对于个人非常感兴趣的重大事件,他们可能一小时检查几次更新,花时间了解事件发展的各个方面。对于无关紧要的小事件,他们可能会每天花几分钟查看一两次,了解最重要的进展。系统生成关于不断发展的事件的更新流,可以通过适当地过滤这些更新来提高用户性能,使用户能够轻松地及时跟踪事件,而不会出现不必要的信息过载。不幸的是,预测这些系统上的用户性能是一项重大挑战。为Web搜索和其他特殊检索任务设计的标准评估方法很难适应这种情况。在本文中,我们开发了一个简单的模型来模拟用户不时检查系统以读取更新。对于每个模拟用户,我们生成他们在离开时间和阅读时间之间交替进行的活动的跟踪。然后应用这些跟踪来测量系统的有效性。我们使用来自TREC 2013时间汇总轨道(TST)的数据来测试我们的模型,并将其与该轨道中使用的有效性度量进行比较。主要的TST测量与建模用户最接近,该用户平均每天检查一次,平均每次检查一分钟。更频繁、更长时间检查的用户可能会以完全不同的方式查看参与系统的相对性能。鉴于这种对用户行为的敏感性,我们建议未来的实验应围绕关于用户界面和访问模式的明确假设建立,并采用反映这些假设的有效性度量。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信