An Automated Framework for Explaining Facts Extracted From Mobility Datasets

Anique Tahir, Yuhan Sun, Mohamed Sarwat
Published in: 2019 20th IEEE International Conference on Mobile Data Management (MDM)
Publication date: 2019-06-10
DOI: 10.1109/MDM.2019.00-48 (https://doi.org/10.1109/MDM.2019.00-48)
Citations: 0

Abstract

When a data scientist analyzes mobility data (e.g., using a data visualization tool), they may discover interesting facts in the dataset. An example of such a fact is: "The number of taxi trips in NYC on January 23, 2016, dropped drastically compared to other days of the same month." However, the data scientist may be left clueless if they cannot find a crisp explanation for such a fact. Furthermore, the tedious task of finding an explanation by manually sifting through the data becomes infeasible with big data. Existing techniques are designed for non-spatial data and cannot be applied to spatial data because they do not consider spatial proximity. In this paper, we propose an automated framework that guides the data scientist in explaining facts discovered from mobility data. Our approach expands on the aggravation and intervention techniques while using spatial partitioning/clustering to improve explanations for spatial data. Experiments show that the proposed approach outperforms state-of-the-art approaches in finding explanations for facts extracted from a real NYC taxi mobility dataset.
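To make the idea of an intervention-style explanation over a spatial partition more concrete, here is a minimal Python sketch. It is not the framework from the paper, whose details the abstract does not give. It assumes "intervention" in the general sense of deleting the records covered by a candidate explanation and measuring how much the observed fact weakens, and it uses a uniform grid as a stand-in for spatial partitioning. The function names, the cell size, and the trip records are all hypothetical and synthetic.

```python
import math
from collections import Counter


def grid_cell(lon, lat, cell_size=0.01):
    """Assign a point to a cell of a uniform grid (an assumed, simple partitioning)."""
    return (math.floor(lon / cell_size), math.floor(lat / cell_size))


def drop(trips, target_day, baseline_days):
    """The observed fact: average baseline-day trip count minus the target day's count."""
    counts = Counter(day for day, _lon, _lat in trips)
    baseline_avg = sum(counts[d] for d in baseline_days) / len(baseline_days)
    return baseline_avg - counts[target_day]


def rank_cells(trips, target_day, baseline_days, cell_size=0.01):
    """Intervention: delete each cell's trips and score how much the drop shrinks."""
    full_drop = drop(trips, target_day, baseline_days)
    cells = {grid_cell(lon, lat, cell_size) for _day, lon, lat in trips}
    scores = {}
    for cell in cells:
        remaining = [t for t in trips if grid_cell(t[1], t[2], cell_size) != cell]
        scores[cell] = full_drop - drop(remaining, target_day, baseline_days)
    # Cells whose removal shrinks the drop the most are the strongest candidates.
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


def make_synthetic_trips():
    """Synthetic (day, lon, lat) pickup records; NOT real NYC taxi data."""
    area_a = (-73.985, 40.755)
    area_b = (-73.955, 40.775)
    trips = []
    for day in ("2016-01-21", "2016-01-22"):
        trips += [(day, *area_a)] * 10 + [(day, *area_b)] * 4
    trips += [("2016-01-23", *area_a)] * 1 + [("2016-01-23", *area_b)] * 4
    return trips


if __name__ == "__main__":
    trips = make_synthetic_trips()
    baseline = ["2016-01-21", "2016-01-22"]
    for cell, score in rank_cells(trips, "2016-01-23", baseline):
        print(cell, score)
```

On this synthetic input, the cell containing area A accounts for the entire drop, while removing area B's cell leaves the drop unchanged. A realistic system would need to search over combinations of cells and non-spatial attributes, which this sketch does not attempt.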