{"title":"Urban Human Mobility: Data-Driven Modeling and Prediction","authors":"Jinzhong Wang, Xiangjie Kong, Feng Xia, Lijun Sun","doi":"10.1145/3331651.3331653","DOIUrl":"https://doi.org/10.1145/3331651.3331653","url":null,"abstract":"Human mobility is a multidisciplinary field spanning physics and computer science and has drawn considerable attention in recent years. Some representative models and prediction approaches have been proposed for modeling and predicting human mobility. However, multi-source heterogeneous data from handheld terminals, GPS, and social media provide a new driving force for exploring urban human mobility patterns from a quantitative and microscopic perspective. The study of human mobility modeling and prediction plays a vital role in a series of applications such as urban planning, epidemic control, location-based services, and intelligent transportation management. In this survey, we review human mobility models from a human-centric angle in a data-driven context. Specifically, we characterize human mobility patterns at the individual, collective, and hybrid levels. Meanwhile, we survey human mobility prediction methods from four aspects and describe recent developments in each. Finally, we discuss some open issues that provide a helpful reference for researchers' future directions. 
This review not only lays a solid foundation for beginners who want to acquire a quick understanding of human mobility but also provides helpful information for researchers on how to develop a unified human mobility model.","PeriodicalId":90050,"journal":{"name":"SIGKDD explorations : newsletter of the Special Interest Group (SIG) on Knowledge Discovery & Data Mining","volume":"59 1","pages":"1-19"},"PeriodicalIF":0.0,"publicationDate":"2019-05-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84613621","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Co-Designing Learning Materials to Empower Laypersons to Better Understand Big Data and Big Data Methods","authors":"L. Schilling, Griselda Pena-Jackson, S. Russell, J. Corral, Bethany M. Kwan, Julie Ressalam","doi":"10.1145/3331651.3331659","DOIUrl":"https://doi.org/10.1145/3331651.3331659","url":null,"abstract":"University of Colorado Anschutz Medical Campus' Data Science to Patient Value Program and 2040 Partners for Health sought to create open learning materials for engaged citizens and community leaders regarding big data and big data methods to support their collaboration in patient-centered and participatory-based community research and evaluation. 2040 is a local nonprofit organization that cultivates partnerships in Aurora, Colorado neighborhoods to tackle critical health needs. Our goal was to co-design and co-create a series of big data learning modules accessible to community laypeople, so they might better understand big data topics and be empowered to more actively engage in health research and evaluation that uses big data methods.","PeriodicalId":90050,"journal":{"name":"SIGKDD explorations : newsletter of the Special Interest Group (SIG) on Knowledge Discovery & Data Mining","volume":"19 1","pages":"41-44"},"PeriodicalIF":0.0,"publicationDate":"2019-05-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"88895254","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Budding Data Scientists Hackathon","authors":"ChuaHui Xiang, ChuaEe-Ling, SooKenneth","doi":"10.1145/3331651.3331658","DOIUrl":"https://doi.org/10.1145/3331651.3331658","url":null,"abstract":"The \"Budding Data Scientists Hackathon\" was a pilot program to bring data science into a high school's curriculum in Singapore. Unlike typical hackathons, this hackathon lasted for a few months. A ...","PeriodicalId":90050,"journal":{"name":"SIGKDD explorations : newsletter of the Special Interest Group (SIG) on Knowledge Discovery & Data Mining","volume":"146 1","pages":"38-40"},"PeriodicalIF":0.0,"publicationDate":"2019-05-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1145/3331651.3331658","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"64018618","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Women Data Science Leaders in Russia","authors":"A. Suvorova, V. Ivaniushina, Alina Bakhitova, Anastasiya Kuznetsova","doi":"10.1145/3331651.3331660","DOIUrl":"https://doi.org/10.1145/3331651.3331660","url":null,"abstract":"The project \"Women Data Science Leaders in Russia\" aims to increase gender diversity and women's participation in the Russian data science community by means of developing online courses and video materials that present female role models for female students, in order to change the stereotypes that affect the perception of the data science field.","PeriodicalId":90050,"journal":{"name":"SIGKDD explorations : newsletter of the Special Interest Group (SIG) on Knowledge Discovery & Data Mining","volume":"42 1","pages":"45-48"},"PeriodicalIF":0.0,"publicationDate":"2019-05-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"76004309","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Data Science Summer Academy for Chicago Public School Students","authors":"B. Mobasher, L. Dettori, D. Raicu, R. Settimi, Nasim Sonboli, M. Stettler","doi":"10.1145/3331651.3331661","DOIUrl":"https://doi.org/10.1145/3331651.3331661","url":null,"abstract":"In this article, we describe DePaul University's summer data science academy for Chicago Public School students, which was funded in part through the SIGKDD Impact Award program in 2018. The goal of the academy was to increase awareness about data science among high school students. The program specifically aimed to broaden participation of underrepresented groups in computing by targeting economically disadvantaged, African American, Hispanic, and female students. A cohort of 15 high school students from the Chicago Public School system participated in this week-long lab-based data science program, learning about a variety of data science methods and their applications, including data visualization, distance-based methods, classification, clustering and others. The group comprised 75% African American and Hispanic students, 58.3% of whom were female.","PeriodicalId":90050,"journal":{"name":"SIGKDD explorations : newsletter of the Special Interest Group (SIG) on Knowledge Discovery & Data Mining","volume":"1 1","pages":"49-52"},"PeriodicalIF":0.0,"publicationDate":"2019-05-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"91263051","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"SIGKDD Impact Program 2018","authors":"R. Sosič, R. Ghani, J. Leskovec","doi":"10.1145/3331651.3331657","DOIUrl":"https://doi.org/10.1145/3331651.3331657","url":null,"abstract":"The SIGKDD Impact Program was established in 2017. As the SIGKDD community has expanded its reach dramatically over the past few years and the KDD conference has grown into a major global event, the aim for the social impact program is to focus the power of the community towards a broader positive societal impact. The goal of this program was to support projects that promote data science, increase its impact on society, and help the data science community. The project duration was limited to one year. The total amount of funding provided by the SIGKDD Impact Program in 2018 was $250k, given as unrestricted gifts. Projects funded in 2018 were required to present a mid-project update of their work and outcomes at the KDD 2018 conference in London in August 2018.","PeriodicalId":90050,"journal":{"name":"SIGKDD explorations : newsletter of the Special Interest Group (SIG) on Knowledge Discovery & Data Mining","volume":"54 1","pages":"36-37"},"PeriodicalIF":0.0,"publicationDate":"2019-05-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90314801","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Research Issues of Outlier Detection in Trajectory Streams Using GPUs","authors":"Eleazar Leal, L. Gruenwald","doi":"10.1145/3299986.3299989","DOIUrl":"https://doi.org/10.1145/3299986.3299989","url":null,"abstract":"The widespread availability of sensors like GPS and traffic cameras has made it possible to collect large amounts of spatio-temporal data. One such type of data are trajectories, each of which consists of a time-ordered sequence of positions that a moving object occupies in space as time goes by. Trajectories can be streamed in real time from sensors, and because of this, they capture the current state of moving objects. For this reason, trajectories can be used in applications such as the real-time detection of senior citizens who have just fallen or who have just gotten lost outdoors, the real-time detection of drunk drivers, and the real-time detection of enemy forces in the battlefield. These applications involve the identification of trajectories with anomalous behaviors, and require fast processing in order to take immediate preventive action. However, outlier detection poses challenges stemming from both the complexity of the data and of the task. One way to address this is through parallel architectures like GPUs. 
In this paper, we present the problem of outlier detection in trajectory streams, and discuss the research issues that should be addressed by new outlier detection techniques for trajectory streams on GPUs.","PeriodicalId":90050,"journal":{"name":"SIGKDD explorations : newsletter of the Special Interest Group (SIG) on Knowledge Discovery & Data Mining","volume":"33 1","pages":"13-20"},"PeriodicalIF":0.0,"publicationDate":"2018-12-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85289669","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Progress Indication for Machine Learning Model Building: A Feasibility Demonstration.","authors":"Gang Luo","doi":"10.1145/3299986.3299988","DOIUrl":"https://doi.org/10.1145/3299986.3299988","url":null,"abstract":"Progress indicators are desirable for machine learning model building that often takes a long time, by continuously estimating the remaining model building time and the portion of model building work that has been finished. Recently, we proposed a high-level framework using system approaches to support non-trivial progress indicators for machine learning model building, but offered no detailed implementation technique. It remains to be seen whether it is feasible to provide such progress indicators. In this paper, we fill this gap and give the first demonstration that offering such progress indicators is viable. We describe detailed progress indicator implementation techniques for three major, supervised machine learning algorithms. We report an implementation of these techniques in Weka.","PeriodicalId":90050,"journal":{"name":"SIGKDD explorations : newsletter of the Special Interest Group (SIG) on Knowledge Discovery & Data Mining","volume":"20 2","pages":"1-12"},"PeriodicalIF":0.0,"publicationDate":"2018-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1145/3299986.3299988","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"37041531","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Broad Learning: An Emerging Area in Social Network Analysis","authors":"Jiawei Zhang, Philip S. Yu","doi":"10.1145/3229329.3229333","DOIUrl":"https://doi.org/10.1145/3229329.3229333","url":null,"abstract":"Looking from a global perspective, the landscape of online social networks is highly fragmented. A large number of online social networks have appeared, which can provide users with various types of services. Generally, information available in these online social networks is of diverse categories and can be formally represented as heterogeneous social networks (HSNs). Meanwhile, in such an age of online social media, users usually participate in multiple online social networks simultaneously and can act as the anchors aligning different social networks together. So multiple HSNs not only represent information in each social network, but also fuse information from multiple networks. Formally, the online social networks sharing common users are named the aligned social networks, and these shared users are called the anchor users. The heterogeneous information generated by users' social activities in the multiple aligned social networks provides social network practitioners and researchers with opportunities to study individual users' social behaviors across multiple social platforms simultaneously. 
This paper presents a comprehensive survey of the latest research on multiple aligned HSNs based on the broad learning setting, covering five major research tasks: network alignment, link prediction, community detection, information diffusion, and network embedding.","PeriodicalId":90050,"journal":{"name":"SIGKDD explorations : newsletter of the Special Interest Group (SIG) on Knowledge Discovery & Data Mining","volume":"10 1","pages":"24-50"},"PeriodicalIF":0.0,"publicationDate":"2018-05-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"87905202","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Survey on Anomaly detection in Evolving Data: [with Application to Forest Fire Risk Prediction]","authors":"Mahsa Salehi, Lida Rashidi","doi":"10.1145/3229329.3229332","DOIUrl":"https://doi.org/10.1145/3229329.3229332","url":null,"abstract":"Traditionally, most anomaly detection algorithms have been designed for 'static' datasets, in which all the observations are available at one time. In non-stationary environments, on the other hand, the same algorithms cannot be applied, as the underlying data distributions change constantly and the same models are not valid. Hence, we need to devise adaptive models that take into account the dynamically changing characteristics of environments and detect anomalies in 'evolving' data. Over the last two decades, many algorithms have been proposed to detect anomalies in evolving data. Some of them consider scenarios where a sequence of objects (called data streams) with one or multiple features evolves over time, whereas others concentrate on more complex scenarios, where streaming objects with one or multiple features have causal/non-causal relationships with each other. The latter can be represented as evolving graphs. In this paper, we categorize existing strategies for detecting anomalies in both scenarios, including the state-of-the-art techniques. Since label information is mostly unavailable in real-world applications when data evolves, we review the unsupervised approaches in this paper. 
We then present an interesting application example, i.e., forest fire risk prediction, and conclude the paper with future research directions in this field for researchers and industry.","PeriodicalId":90050,"journal":{"name":"SIGKDD explorations : newsletter of the Special Interest Group (SIG) on Knowledge Discovery & Data Mining","volume":"3 1","pages":"13-23"},"PeriodicalIF":0.0,"publicationDate":"2018-05-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"73862908","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}