Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining: Latest Papers

Active Deep Learning for Activity Recognition with Context Aware Annotator Selection
H. S. Hossain, Nirmalya Roy
{"title":"Active Deep Learning for Activity Recognition with Context Aware Annotator Selection","authors":"H. S. Hossain, Nirmalya Roy","doi":"10.1145/3292500.3330688","DOIUrl":"https://doi.org/10.1145/3292500.3330688","url":null,"abstract":"Machine learning models are bounded by the credibility of ground truth data used for both training and testing. Regardless of the problem domain, this ground truth annotation is objectively manual and tedious as it needs a considerable amount of human intervention. With the advent of Active Learning with multiple annotators, the burden can be somewhat mitigated by actively acquiring labels of the most informative data instances. However, multiple annotators with varying degrees of expertise pose a new set of challenges in terms of quality of the label received and availability of the annotator. Due to the limited amount of ground truth information addressing the variabilities of Activities of Daily Living (ADLs), activity recognition models using wearable and mobile devices are still not robust enough for real-world deployment. In this paper, we first propose an active learning combined deep model which updates its network parameters based on the optimization of a joint loss function. We then propose a novel annotator selection model by exploiting the relationships among the users while considering their heterogeneity with respect to their expertise, physical and spatial context. Our proposed model leverages model-free deep reinforcement learning in a partially observable environment setting to capture the action-reward interaction among multiple annotators. 
Our experiments in real-world settings exhibit that our active deep model converges to optimal accuracy with fewer labeled instances and achieves ~8% improvement in accuracy in fewer iterations.","PeriodicalId":186134,"journal":{"name":"Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining","volume":"19 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133040318","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 31
The Unreasonable Effectiveness, and Difficulty, of Data in Healthcare
Peter Lee
{"title":"The Unreasonable Effectiveness, and Difficulty, of Data in Healthcare","authors":"Peter Lee","doi":"10.1145/3292500.3330645","DOIUrl":"https://doi.org/10.1145/3292500.3330645","url":null,"abstract":"Data and data analysis are widely assumed to be the key part of the solution to healthcare systems' problems. Indeed, there are countless ways in which data can be converted into better medical diagnostic tools, more effective therapeutics, and improved productivity for clinicians. But while there is clearly great potential, some big challenges remain to make this all a reality, including making access to health data easier, addressing privacy and ethics concerns, and ensuring the clinical safety of \"learning\" systems. This talk illustrates what is possible in healthcare technology, and details key challenges that currently prevent this from becoming a reality.","PeriodicalId":186134,"journal":{"name":"Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining","volume":"27 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122966645","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
Sequence Multi-task Learning to Forecast Mental Wellbeing from Sparse Self-reported Data
Dimitris Spathis, S. S. Rodríguez, K. Farrahi, C. Mascolo, Jason Rentfrow
{"title":"Sequence Multi-task Learning to Forecast Mental Wellbeing from Sparse Self-reported Data","authors":"Dimitris Spathis, S. S. Rodríguez, K. Farrahi, C. Mascolo, Jason Rentfrow","doi":"10.1145/3292500.3330730","DOIUrl":"https://doi.org/10.1145/3292500.3330730","url":null,"abstract":"Smartphones have started to be used as self-reporting tools for mental health state as they accompany individuals during their days and can therefore gather temporally fine-grained data. However, the analysis of self-reported mood data offers challenges related to non-homogeneity of mood assessment among individuals due to the complexity of the feeling and the reporting scales, as well as the noise and sparseness of the reports when collected in the wild. In this paper, we propose a new end-to-end ML model inspired by video frame prediction and machine translation, that forecasts future sequences of mood from previous self-reported moods collected in the real world using mobile devices. Contrary to traditional time series forecasting algorithms, our multi-task encoder-decoder recurrent neural network learns patterns from different users, allowing and improving the prediction for users with a limited number of self-reports. Unlike traditional feature-based machine learning algorithms, the encoder-decoder architecture makes it possible to forecast a sequence of future moods rather than one single step. Meanwhile, multi-task learning exploits some unique characteristics of the data (mood is bi-dimensional), achieving better results than when training single-task networks or other classifiers. 
Our experiments using a real-world dataset of 33,000 user-weeks revealed that (i) 3 weeks of sparsely reported mood is the optimal number to accurately forecast mood, (ii) multi-task learning models both dimensions of mood \"valence and arousal\" with higher accuracy than separate or traditional ML models, and (iii) mood variability, personality traits and day of the week play a key role in the performance of our model. We believe this work provides psychologists and developers of future mobile mental health applications with a ready-to-use and effective tool for early diagnosis of mental health issues at scale.","PeriodicalId":186134,"journal":{"name":"Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining","volume":"32 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123982651","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 24
Uncovering Pattern Formation of Information Flow
Chengxi Zang, Peng Cui, Chaoming Song, Wenwu Zhu, Fei Wang
{"title":"Uncovering Pattern Formation of Information Flow","authors":"Chengxi Zang, Peng Cui, Chaoming Song, Wenwu Zhu, Fei Wang","doi":"10.1145/3292500.3330971","DOIUrl":"https://doi.org/10.1145/3292500.3330971","url":null,"abstract":"Pattern formation is a ubiquitous phenomenon that describes the generation of orderly outcomes by self-organization. In both physical society and online social media, patterns formed by social interactions are mainly driven by information flow. Despite an increasing number of studies aiming to understand the spread of information flow, little is known about the geometry of these spreading patterns and how they were formed during the spreading. In this paper, by exploring 432 million information flow patterns extracted from a large-scale online social media dataset, we uncover a wide range of complex geometric patterns characterized by a three-dimensional metric space. In contrast, the existing understanding of spreading patterns is limited to fanning-out or narrow tree-like geometries. We discover three key ingredients that govern the formation of complex geometric patterns of information flow. As a result, we propose a stochastic process model incorporating these ingredients, demonstrating that it successfully reproduces the diverse geometries discovered from the empirical spreading patterns. 
Our discoveries provide a theoretical foundation for the microscopic mechanisms of information flow, potentially leading to wide implications for prediction, control and policy decisions in social media.","PeriodicalId":186134,"journal":{"name":"Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining","volume":"43 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121241982","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 5
Scaling Multi-Armed Bandit Algorithms
Edouard Fouché, Junpei Komiyama, Klemens Böhm
{"title":"Scaling Multi-Armed Bandit Algorithms","authors":"Edouard Fouché, Junpei Komiyama, Klemens Böhm","doi":"10.1145/3292500.3330862","DOIUrl":"https://doi.org/10.1145/3292500.3330862","url":null,"abstract":"The Multi-Armed Bandit (MAB) is a fundamental model capturing the dilemma between exploration and exploitation in sequential decision making. At every time step, the decision maker selects a set of arms and observes a reward from each of the chosen arms. In this paper, we present a variant of the problem, which we call the Scaling MAB (S-MAB): The goal of the decision maker is not only to maximize the cumulative rewards, i.e., choosing the arms with the highest expected reward, but also to decide how many arms to select so that, in expectation, the cost of selecting arms does not exceed the rewards. This problem is relevant to many real-world applications, e.g., online advertising, financial investments or data stream monitoring. We propose an extension of Thompson Sampling, which has strong theoretical guarantees and is reported to perform well in practice. Our extension dynamically controls the number of arms to draw. Furthermore, we combine the proposed method with ADWIN, a state-of-the-art change detector, to deal with non-static environments. 
We illustrate the benefits of our contribution via a real-world use case on predictive maintenance.","PeriodicalId":186134,"journal":{"name":"Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining","volume":"44 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128545532","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 19
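The abstract above builds on Thompson Sampling with several arms selected per step. The following is a minimal sketch of plain Beta-Bernoulli Thompson Sampling with a fixed number of plays per round; it is not the S-MAB extension itself, which additionally adapts how many arms to draw and uses ADWIN for change detection. The class and function names (`MultiPlayThompsonSampler`, `simulate`) and the Bernoulli reward model are assumptions made for illustration.

```python
import random


class MultiPlayThompsonSampler:
    """Beta-Bernoulli Thompson Sampling that plays `num_plays` arms per round.

    Illustrative only: the number of plays is fixed here, whereas the S-MAB
    setting described in the abstract decides it dynamically.
    """

    def __init__(self, num_arms: int, num_plays: int, seed: int = 0) -> None:
        if not 1 <= num_plays <= num_arms:
            raise ValueError("num_plays must be between 1 and num_arms")
        self.num_plays = num_plays
        self.successes = [0] * num_arms
        self.failures = [0] * num_arms
        self._rng = random.Random(seed)

    def select_arms(self) -> list:
        # Draw one sample per arm from its Beta(1 + successes, 1 + failures) posterior.
        samples = [
            self._rng.betavariate(1 + s, 1 + f)
            for s, f in zip(self.successes, self.failures)
        ]
        # Play the arms with the largest sampled means.
        ranked = sorted(range(len(samples)), key=lambda arm: samples[arm], reverse=True)
        return ranked[: self.num_plays]

    def update(self, arm: int, reward: int) -> None:
        # reward is 1 (success) or 0 (failure).
        if reward:
            self.successes[arm] += 1
        else:
            self.failures[arm] += 1


def simulate(true_means, num_plays, rounds, seed=0):
    """Run the sampler against hypothetical Bernoulli arms with the given means."""
    env_rng = random.Random(seed + 1)
    sampler = MultiPlayThompsonSampler(len(true_means), num_plays, seed)
    for _ in range(rounds):
        for arm in sampler.select_arms():
            reward = 1 if env_rng.random() < true_means[arm] else 0
            sampler.update(arm, reward)
    return sampler
```

Calling `simulate([0.9, 0.1, 0.5], num_plays=2, rounds=200)` returns the sampler with its per-arm success and failure counts, which can be inspected to see how often each arm was played.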
Modeling and Applications for Temporal Point Processes
Junchi Yan, Hongteng Xu, Liangda Li
{"title":"Modeling and Applications for Temporal Point Processes","authors":"Junchi Yan, Hongteng Xu, Liangda Li","doi":"10.1145/3292500.3332298","DOIUrl":"https://doi.org/10.1145/3292500.3332298","url":null,"abstract":"Real-world entities' behaviors, associated with their side information, are often recorded over time as asynchronous event sequences. Such event sequences are the basis of many practical applications, including neural spike train study, earthquake prediction, crime analysis, infectious disease diffusion forecasting, condition-based preventative maintenance, information retrieval, and behavior-based network analysis and services. Temporal point process (TPP) is a principled mathematical tool for the modeling and learning of asynchronous event sequences, which captures the instantaneous happening rate of the events and the temporal dependency between historical and current events. TPP provides us with an interpretable model to describe the generative mechanism of event sequences, which is beneficial for event prediction and causality analysis. Recently, it has been shown that TPP has potential for many machine learning and data science applications and can be combined with other cutting-edge machine learning techniques like deep learning, reinforcement learning, adversarial learning, and so on. We will start with an elementary introduction of the TPP model, including the basic concepts of the model and the simulation method of event sequences; in the second part of the tutorial, we will introduce typical TPP models and their traditional learning methods; in the third part of the tutorial, we will discuss the recent progress on the modeling and learning of TPP, including neural network-based TPP models, generative adversarial networks (GANs) for TPP, and deep reinforcement learning of TPP. 
We will further talk about the practical application of TPP, including useful data augmentation methods for learning from imperfect observations, typical applications and examples like healthcare and industry maintenance, and existing open source toolboxes.","PeriodicalId":186134,"journal":{"name":"Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining","volume":"19 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115359393","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 16
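To make "instantaneous happening rate" and "simulation of event sequences" concrete, here is a small sketch of one common TPP, a Hawkes process with an exponential kernel, simulated by thinning. The abstract does not name this model; the choice of Hawkes process, the parameter names (`mu`, `alpha`, `beta`) and the function names are assumptions for illustration.

```python
import math
import random


def intensity(t, events, mu, alpha, beta):
    """Conditional intensity mu + alpha * sum(exp(-beta * (t - s))) over past events s < t."""
    return mu + alpha * sum(math.exp(-beta * (t - s)) for s in events if s < t)


def simulate_events(mu, alpha, beta, horizon, seed=0):
    """Simulate event times on [0, horizon) by thinning (requires mu > 0)."""
    rng = random.Random(seed)
    events = []
    t = 0.0
    while True:
        # Intensity immediately after t, counting an event at t itself. Between
        # events the intensity only decays, so this bounds it until the next event.
        bound = mu + alpha * sum(math.exp(-beta * (t - s)) for s in events)
        t += rng.expovariate(bound)  # candidate time from a rate-`bound` Poisson process
        if t >= horizon:
            return events
        # Accept the candidate with probability intensity(t) / bound.
        if rng.random() * bound <= intensity(t, events, mu, alpha, beta):
            events.append(t)
```

For example, `simulate_events(mu=0.5, alpha=0.8, beta=1.5, horizon=20.0)` returns an increasing list of event times. Keeping `alpha / beta` below 1 keeps the expected number of events triggered by each event below one, so the process does not explode.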
Effective and Efficient Reuse of Past Travel Behavior for Route Recommendation
Lisi Chen, Shuo Shang, Christian S. Jensen, Bin Yao, Zhiwei Zhang, Ling Shao
{"title":"Effective and Efficient Reuse of Past Travel Behavior for Route Recommendation","authors":"Lisi Chen, Shuo Shang, Christian S. Jensen, Bin Yao, Zhiwei Zhang, Ling Shao","doi":"10.1145/3292500.3330835","DOIUrl":"https://doi.org/10.1145/3292500.3330835","url":null,"abstract":"With the increasing availability of moving-object tracking data, use of this data for route search and recommendation is increasingly important. To this end, we propose a novel parallel split-and-combine approach to enable route search by locations (RSL-Psc). Given a set of routes, a set of places to visit O, and a threshold θ, we retrieve the route composed of sub-routes that (i) has similarity to O no less than θ and (ii) contains the minimum number of sub-route combinations. The resulting functionality targets a broad range of applications, including route planning and recommendation, ridesharing, and location-based services in general. To enable efficient and effective RSL-Psc computation on massive route data, we develop novel search space pruning techniques and enable use of the parallel processing capabilities of modern processors. Specifically, we develop two parallel algorithms, Fully-Split Parallel Search (FSPS) and Group-Split Parallel Search (GSPS). We divide the route split-and-combine task into ∑_{k=0}^{M} S(|O|, k+1) sub-tasks, where M is the maximum number of combinations and S(⋅) is the Stirling number of the second kind. In each sub-task, we use network expansion and exploit spatial similarity bounds for pruning. The algorithms split candidate routes into sub-routes and combine them to construct new routes. The sub-tasks are independent and are performed in parallel. 
Extensive experiments with real data offer insight into the performance of the algorithms, indicating that our RSL-Psc problem can generate high-quality results and that the two algorithms are capable of achieving high efficiency and scalability.","PeriodicalId":186134,"journal":{"name":"Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114418382","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 45
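The sub-task count quoted in the abstract, the sum of S(|O|, k+1) for k from 0 to M, can be evaluated directly from the standard recurrence for Stirling numbers of the second kind. The sketch below only computes that count and does not implement FSPS or GSPS; the function names `stirling2` and `num_subtasks` are hypothetical.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def stirling2(n: int, k: int) -> int:
    """Stirling number of the second kind: partitions of n items into k non-empty groups."""
    if n == k:
        return 1  # covers S(0, 0) = 1
    if k == 0 or k > n:
        return 0
    # The n-th item either forms its own group or joins one of the k existing groups.
    return stirling2(n - 1, k - 1) + k * stirling2(n - 1, k)


def num_subtasks(num_places: int, max_combinations: int) -> int:
    """Sum of S(|O|, k + 1) for k = 0..M, where |O| = num_places and M = max_combinations."""
    return sum(stirling2(num_places, k + 1) for k in range(max_combinations + 1))
```

With four places and M = 2, the sum is S(4,1) + S(4,2) + S(4,3) = 1 + 7 + 6 = 14 independent sub-tasks. Terms with k + 1 greater than |O| contribute zero, so the count stops growing once M reaches |O| - 1.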
MeLU
Ho-Yong Lee, Jinbae Im, Seongwon Jang, H. Cho, Sehee Chung
{"title":"MeLU","authors":"Ho-Yong Lee, Jinbae Im, Seongwon Jang, H. Cho, Sehee Chung","doi":"10.1145/3292500.3330859","DOIUrl":"https://doi.org/10.1145/3292500.3330859","url":null,"abstract":"This paper proposes a recommender system to alleviate the cold-start problem that can estimate user preferences based on only a small number of items. To identify a user's preference in the cold state, existing recommender systems, such as Netflix, initially provide items to a user; we call those items evidence candidates. Recommendations are then made based on the items selected by the user. Previous recommendation studies have two limitations: (1) the users who consumed a few items have poor recommendations and (2) inadequate evidence candidates are used to identify user preferences. We propose a meta-learning-based recommender system called MeLU to overcome these two limitations. Using meta-learning, which can rapidly adapt to a new task with a few examples, MeLU can estimate a new user's preferences with a few consumed items. In addition, we provide an evidence candidate selection strategy that determines distinguishing items for customized preference estimation. We validate MeLU with two benchmark datasets, and the proposed model reduces mean absolute error by at least 5.92% compared with two comparative models on the datasets. 
We also conduct a user study experiment to verify the evidence selection strategy.","PeriodicalId":186134,"journal":{"name":"Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining","volume":"35 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114492515","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 245
Building a Better Self-Driving Car: Hardware, Software, and Knowledge
K. Chellapilla
{"title":"Building a Better Self-Driving Car: Hardware, Software, and Knowledge","authors":"K. Chellapilla","doi":"10.1145/3292500.3340409","DOIUrl":"https://doi.org/10.1145/3292500.3340409","url":null,"abstract":"Lyft's mission is to improve people's lives with the world's best transportation. Self-driving vehicles have the potential to deliver unprecedented improvements to safety and quality, at a price and convenience that challenges traditional models of vehicle ownership. A combination of hardware, software, and knowledge technologies is needed to build self-driving cars. In this talk, I'll present the core problems in self-driving and how recent advances in computer vision, robotics, and machine learning are powering this revolution. The car is carefully designed with a variety of sensors that complement each other to address a wide variety of driving scenarios. Sensor fusion brings all of these signals together into an interpretable AI engine comprising perception, prediction, planning, and controls. For example, deep learning models and large scale machine learning have closed the gap between human and machine perception. In contrast, predicting the behavior of other humans and effectively planning and negotiating maneuvers continue to be hard problems. 
Combining AI technologies with deep knowledge about the real world is key to addressing these.","PeriodicalId":186134,"journal":{"name":"Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining","volume":"15 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115242712","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
Statistical Mechanics Methods for Discovering Knowledge from Modern Production Quality Neural Networks
Charles H. Martin, Michael W. Mahoney
{"title":"Statistical Mechanics Methods for Discovering Knowledge from Modern Production Quality Neural Networks","authors":"Charles H. Martin, Michael W. Mahoney","doi":"10.1145/3292500.3332294","DOIUrl":"https://doi.org/10.1145/3292500.3332294","url":null,"abstract":"There have long been connections between statistical mechanics and neural networks, but in recent decades these connections have withered. However, in light of recent failings of statistical learning theory and stochastic optimization theory to describe, even qualitatively, many properties of production-quality neural network models, researchers have revisited ideas from the statistical mechanics of neural networks. This tutorial will provide an overview of the area; it will go into detail on how connections with random matrix theory and heavy-tailed random matrix theory can lead to a practical phenomenological theory for large-scale deep neural networks; and it will describe future directions.","PeriodicalId":186134,"journal":{"name":"Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114790757","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 3