Proceedings of the Tenth ACM International Conference on Web Search and Data Mining最新文献_第2页

Mining Medical Causality for Diagnosis Assistance 为诊断协助挖掘医学因果关系

Proceedings of the Tenth ACM International Conference on Web Search and Data Mining Pub Date : 2017-02-02 DOI: 10.1145/3018661.3022752

Sendong Zhao

{"title":"Mining Medical Causality for Diagnosis Assistance","authors":"Sendong Zhao","doi":"10.1145/3018661.3022752","DOIUrl":"https://doi.org/10.1145/3018661.3022752","url":null,"abstract":"In the medical context, causal knowledge usually refers to causal relations between diseases and symptoms, living habits and diseases, symptoms which get better and therapy, drugs and side-effects, etc [3]. All these causal relations are usually in medical literature, forum and clinical cases and compose the core part of medical diagnosis. Therefore, mining these causal knowledge to predict disease and recommend therapy is of great value for assisting patients and professionals. The task of mining these causal knowledge for diagnosis assistance can be decomposed into four constitutes: (1) mining medical causality from text; (2) medical treatment effectiveness measurement; (3) disease prediction and (4) explicable medical treatment recommendation. However, these tasks have never been systemically studied before. For my PhD thesis, I plan to formally define the problem of mining medical domain causality for diagnosis assistance and propose methods to solve this problem. 1. Ming these textual causalities can be very useful for discovering new knowledge and making decisions. Many studies have been done for causal extraction from the text [1, 4, 5]. However, all these studies are based on pattern or causal triggers, which greatly limit their power to extract causality and rarely consider the frequency of co-occurrence and contextual semantic features. Besides, none of them take the transitivity rules of causality leading to reject those causalities which can be easily get by simple inference. Therefore, we formally define the task of mining causality via frequency of event co-occurrence, semantic distance between event pairs and transitivity rules of causality, and present a factor graph to combine these three resources for causality mining. 2. Treatment effectiveness analysis is usually taken as a subset of causal analysis on observational data. For such real observational data, PSM and RCM are two dominant methods. On one hand, it is usually difficult for PSM to find the matched cases due to the sparsity of symptom. On the other hand, we should check every possible (symptom, treatment) pair by exploiting RCM, leading to make the characteristic of exploding up, especially when we want to check the causal relation between a combination of symptoms and a combination of drugs. Besides, the larger number of symptom or treatment in the combination the less number of patient case retrieved, which lead to the lack of statistical significance. Specifically, patients tend to take tens of herbs as the treatment each time in Traditional Chinese Medicine (TCM). Therefore, how to evaluate the effectiveness of herbs separately and jointly is really a big challenge. This is also a very fundamental research topic supporting many downstream applications. 3. Both hospitals and on-line forums have accumulated sheer amount of records, such as clinical text data and online diagnosis Q&A pairs. The availability of such data in large volume enables automatic disease predic","PeriodicalId":344017,"journal":{"name":"Proceedings of the Tenth ACM International Conference on Web Search and Data Mining","volume":"68 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-02-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127235585","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 5

Probabilistic Social Sequential Model for Tour Recommendation 旅游推荐的概率社会序列模型

Proceedings of the Tenth ACM International Conference on Web Search and Data Mining Pub Date : 2017-02-02 DOI: 10.1145/3018661.3018711

Vineeth Rakesh, Niranjan Jadhav, Alexander Kotov, C. Reddy

{"title":"Probabilistic Social Sequential Model for Tour Recommendation","authors":"Vineeth Rakesh, Niranjan Jadhav, Alexander Kotov, C. Reddy","doi":"10.1145/3018661.3018711","DOIUrl":"https://doi.org/10.1145/3018661.3018711","url":null,"abstract":"The pervasive growth of location-based services such as Foursquare and Yelp has enabled researchers to incorpo- rate better personalization into recommendation models by leveraging the geo-temporal breadcrumbs left by a plethora of travelers. In this paper, we explore Travel path recommendation, which is one of the applications of intelligent urban navigation that aims in recommending sequence of point of interest (POIs) to tourists. Currently, travelers rely on a tedious and time-consuming process of searching the web, browsing through websites such as Trip Advisor, and reading travel blogs to compile an itinerary. On the other hand, people who do not plan ahead of their trip find it extremely difficult to do this in real-time since there are no automated systems that can provide personalized itinerary for travelers. To tackle this problem, we propose a tour recommendation model that uses a probabilistic generative framework to incorporate user's categorical preference, influence from their social circle, the dynamic travel transitions (or patterns) and the popularity of venues to recommend sequence of POIs for tourists. Through comprehensive experiments over a rich dataset of travel patterns from Foursquare, we show that our model is capable of outperforming the state-of-the-art probabilistic tour recommendation model by providing contextual and meaningful recommendation for travelers.","PeriodicalId":344017,"journal":{"name":"Proceedings of the Tenth ACM International Conference on Web Search and Data Mining","volume":"96 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-02-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132901999","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 31

Beyond Query Logs: Recommendation and Evaluation 查询日志之外:推荐和评估

Proceedings of the Tenth ACM International Conference on Web Search and Data Mining Pub Date : 2017-02-02 DOI: 10.1145/3018661.3022747

M. Mitsui

引用次数: 1

Adapting Information Retrieval to User Signals via Stochastic Models 基于随机模型的用户信号信息检索

Proceedings of the Tenth ACM International Conference on Web Search and Data Mining Pub Date : 2017-02-02 DOI: 10.1145/3018661.3022753

Maria Maistro

引用次数: 0

Statistical Spoken Dialogue Systems and the Challenges for Machine Learning 统计口语对话系统和机器学习的挑战

Proceedings of the Tenth ACM International Conference on Web Search and Data Mining Pub Date : 2017-02-02 DOI: 10.1145/3018661.3022746

S. Young

引用次数: 4

Trustworthy Analysis of Online A/B Tests: Pitfalls, challenges and solutions 在线A/B测试的可信分析:陷阱、挑战和解决方案

Proceedings of the Tenth ACM International Conference on Web Search and Data Mining Pub Date : 2017-02-02 DOI: 10.1145/3018661.3018677

Alex Deng, Jiannan Lu, Jonthan Litz

{"title":"Trustworthy Analysis of Online A/B Tests: Pitfalls, challenges and solutions","authors":"Alex Deng, Jiannan Lu, Jonthan Litz","doi":"10.1145/3018661.3018677","DOIUrl":"https://doi.org/10.1145/3018661.3018677","url":null,"abstract":"A/B tests (or randomized controlled experiments) play an integral role in the research and development cycles of technology companies. As in classic randomized experiments (e.g., clinical trials), the underlying statistical analysis of A/B tests is based on assuming the randomization unit is independent and identically distributed (iid). However, the randomization mechanisms utilized in online A/B tests can be quite complex and may render this assumption invalid. Analysis that unjustifiably relies on this assumption can yield untrustworthy results and lead to incorrect conclusions. Motivated by challenging problems arising from actual online experiments, we propose a new method of variance estimation that relies only on practically plausible assumptions, is directly applicable to a wide of range of randomization mechanisms, and can be implemented easily. We examine its performance and illustrate its advantages over two commonly used methods of variance estimation on both simulated and empirical datasets. Our results lead to a deeper understanding of the conditions under which the randomization unit can be treated as iid In particular, we show that for purposes of variance estimation, the randomization unit can be approximated as iid when the individual treatment effect variation is small; however, this approximation can lead to variance under-estimation when the individual treatment effect variation is large.","PeriodicalId":344017,"journal":{"name":"Proceedings of the Tenth ACM International Conference on Web Search and Data Mining","volume":"51 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-02-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122593376","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 36

Anticipating Information Needs Based on Check-in Activity 基于签到活动预测信息需求

Proceedings of the Tenth ACM International Conference on Web Search and Data Mining Pub Date : 2017-02-02 DOI: 10.1145/3018661.3018679

Jan R. Benetka, K. Balog, K. Nørvåg

引用次数: 20

Quantifying and Bursting the Online Filter Bubble 量化和打破在线过滤泡沫

Proceedings of the Tenth ACM International Conference on Web Search and Data Mining Pub Date : 2017-02-02 DOI: 10.1145/3018661.3024933

Venkata Rama Kiran Garimella

引用次数: 3

Summarizing Answers in Non-Factoid Community Question-Answering 总结非事实性社区问答中的答案

Proceedings of the Tenth ACM International Conference on Web Search and Data Mining Pub Date : 2017-02-02 DOI: 10.1145/3018661.3018704

Hongya Song, Z. Ren, Shangsong Liang, Piji Li, Jun Ma, M. de Rijke

{"title":"Summarizing Answers in Non-Factoid Community Question-Answering","authors":"Hongya Song, Z. Ren, Shangsong Liang, Piji Li, Jun Ma, M. de Rijke","doi":"10.1145/3018661.3018704","DOIUrl":"https://doi.org/10.1145/3018661.3018704","url":null,"abstract":"We aim at summarizing answers in community question-answering (CQA). While most previous work focuses on factoid question-answering, we focus on the non-factoid question-answering. Unlike factoid CQA, non-factoid question-answering usually requires passages as answers. The shortness, sparsity and diversity of answers form interesting challenges for summarization. To tackle these challenges, we propose a sparse coding-based summarization strategy that includes three core ingredients: short document expansion, sentence vectorization, and a sparse-coding optimization framework. Specifically, we extend each answer in a question-answering thread to a more comprehensive representation via entity linking and sentence ranking strategies. From answers extended in this manner, each sentence is represented as a feature vector trained from a short text convolutional neural network model. We then use these sentence representations to estimate the saliency of candidate sentences via a sparse-coding framework that jointly considers candidate sentences and Wikipedia sentences as reconstruction items. Given the saliency vectors for all candidate sentences, we extract sentences to generate an answer summary based on a maximal marginal relevance algorithm. Experimental results on a benchmark data collection confirm the effectiveness of our proposed method in answer summarization of non-factoid CQA, and moreover, its significant improvement compared to state-of-the-art baselines in terms of ROUGE metrics.","PeriodicalId":344017,"journal":{"name":"Proceedings of the Tenth ACM International Conference on Web Search and Data Mining","volume":"102 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-02-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125675375","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 42

Utilizing Knowledge Graphs in Text-centric Information Retrieval 知识图在以文本为中心的信息检索中的应用

Proceedings of the Tenth ACM International Conference on Web Search and Data Mining Pub Date : 2017-02-02 DOI: 10.1145/3018661.3022756

Laura Dietz, Alexander Kotov, E. Meij

引用次数: 39