Are call detail records biased for sampling human mobility?

Gyan Ranjan, H. Zang, Zhi-Li Zhang, J. Bolot
Journal: Mobile Computing and Communications Review
Pages: 33-44
Published: 2012-12-05
DOI: 10.1145/2412096.2412101 (https://doi.org/10.1145/2412096.2412101)
Citations: 116

Abstract

Call detail records (CDRs) have recently been used to study different aspects of human mobility. While CDRs provide a means of sampling user locations at large population scales, they may not sample all locations in proportion to a user's visitation frequency, owing to the sparsity of voice calls in time and space, thereby introducing a bias. Also, because the sampling rate inherently depends on an individual's calling frequency, users with high voice-call activity are often chosen to make a study meaningful. Such a selection process can inadvertently lead to a biased view, as high-frequency callers may not always be representative of an entire population. With the advent of 3G technology and the wide adoption of smartphones, cellular devices have become versatile end-hosts. As the data accessed on these devices does not always require human initiation, it affords us an unprecedented opportunity to validate the utility of CDRs for studying human mobility. In this work, we investigate various metrics for human mobility studied in the literature for over a million cellular users in the San Francisco Bay Area, for over a month. Our findings reveal that although the voice-call process does well at sampling significant locations, such as home and work, it may in some cases incur biases in capturing the overall spatio-temporal characteristics of individual human mobility. Additionally, we motivate an "artificially" imposed sampling process, vis-a-vis the voice-call process with the same average intensity. We observe that in many cases such an imposed sampling process yields better results on the usual metrics, such as entropies and marginal distributions, often used in the literature.
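The comparison between a call-driven sampling process and an imposed process of equal average intensity can be illustrated with a small simulation. The sketch below is not the authors' method or data: the hourly trace, the location labels, the assumption that calls are placed only between 08:00 and 22:00, and all function names are hypothetical, chosen only to show how the entropy of sampled locations can differ between two processes that draw the same number of samples.

```python
import math
import random
from collections import Counter


def shannon_entropy(locations):
    """Entropy (in bits) of the empirical distribution of visited locations."""
    counts = Counter(locations)
    total = sum(counts.values())
    return sum((c / total) * math.log2(total / c) for c in counts.values())


def toy_trace(days=30):
    """Hypothetical hourly ground truth: home at night, work by day, errands in the evening."""
    trace = []
    for _ in range(days):
        for hour in range(24):
            if hour < 8 or hour >= 21:
                trace.append("home")
            elif hour < 18:
                trace.append("work")
            else:
                trace.append(random.choice(["gym", "shop", "cafe"]))
    return trace


def call_driven_times(n_hours, n_samples):
    """Assumed voice-call process: calls are placed only between 08:00 and 22:00."""
    waking = [t for t in range(n_hours) if 8 <= t % 24 < 22]
    return random.sample(waking, n_samples)


def imposed_times(n_hours, n_samples):
    """Imposed process: the same number of samples, spread uniformly over all hours."""
    return random.sample(range(n_hours), n_samples)


if __name__ == "__main__":
    random.seed(0)
    trace = toy_trace()
    n = 60  # identical sample count, i.e. the same average intensity
    calls = [trace[t] for t in call_driven_times(len(trace), n)]
    imposed = [trace[t] for t in imposed_times(len(trace), n)]
    print("ground-truth entropy:", round(shannon_entropy(trace), 3))
    print("call-driven entropy: ", round(shannon_entropy(calls), 3))
    print("imposed entropy:     ", round(shannon_entropy(imposed), 3))
```

In this toy setting both processes can observe home and work, but the call-driven one over-represents daytime locations because it never samples the night hours. How closely either estimate tracks the ground truth depends entirely on the assumed trace and calling pattern, so the printed numbers say nothing about the real dataset.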