2017 IEEE International Conference on Data Mining Workshops (ICDMW): Latest Papers (Page 8)

Discovery of Informal Topics from Post Traumatic Stress Disorder Forums

2017 IEEE International Conference on Data Mining Workshops (ICDMW) Pub Date : 2017-11-01 DOI: 10.1109/ICDMW.2017.65

Reilly Grant, David Kucher, A. Leon, Jonathan F. Gemmell, D. Raicu

Abstract: Post Traumatic Stress Disorder (PTSD) is a public health problem afflicting millions of people each year. It is especially prominent among military veterans. Understanding the language, attitudes, and topics associated with PTSD presents an important and challenging problem. Based on their expertise, mental health professionals have constructed a formal definition of PTSD. However, even the most assiduous mental health professionals can care for only a small fraction of those suffering from PTSD, limiting their perspective on the disorder. As social networking sites have grown in acceptance, users have begun to express personal thoughts and feelings, such as those related to PTSD. This wealth of content can be viewed as an enormous collective description of PTSD and its related issues. We automatically extract informal latent topics from thousands of social media posts in which users describe their experience with PTSD and compare these topics to the formal description generated by mental health professionals. We then explore the patterns and associations of these topics. Our informal topic discovery evaluation reveals that we can successfully identify meaningful topics in PTSD-related social media data. When comparing our topics to the criteria included in the Diagnostic and Statistical Manual of Mental Disorders (DSM), we found that we were able to automatically reproduce many of the criteria. We also discovered new topics that were not mentioned in the DSM but were prevalent across the collaborative narrative of thousands of users' experiences with PTSD.

Citations: 4

A Big Data Analysis Framework Using Apache Spark and Deep Learning

2017 IEEE International Conference on Data Mining Workshops (ICDMW) Pub Date : 2017-11-01 DOI: 10.1109/ICDMW.2017.9

Anand Gupta, H. Thakur, Ritvik Shrivastava, Pulkit Kumar, Sreyashi Nag

Abstract: With the spreading prevalence of Big Data, many advances have recently been made in this field. Frameworks such as Apache Hadoop and Apache Spark have gained a lot of traction over the past decade and have become massively popular, especially in industry. It is becoming increasingly evident that effective big data analysis is key to solving artificial intelligence problems. Thus, a multi-algorithm library called MLlib was implemented in the Spark framework. While this library supports multiple machine learning algorithms, there is still scope to use the Spark setup efficiently for highly time-intensive and computationally expensive procedures like deep learning. In this paper, we propose a novel framework that combines the distributed computing capabilities of Apache Spark and the advanced machine learning architecture of a deep multi-layer perceptron (MLP), using the popular concept of Cascade Learning. We conduct an empirical analysis of our framework on two real-world datasets. The results are encouraging and corroborate our proposed framework, in turn proving that it is an improvement over traditional big data analysis methods that use either Spark or deep learning as individual elements.

Citations: 46

Personal Identification by Pedestrians Behavior

2017 IEEE International Conference on Data Mining Workshops (ICDMW) Pub Date : 2017-11-01 DOI: 10.1109/ICDMW.2017.88

E. Kita, Xuanang Feng, Hiroki Shimokubo

Citations: 1

A Multilevel NER Framework for Automatic Clinical Name Entity Recognition

2017 IEEE International Conference on Data Mining Workshops (ICDMW) Pub Date : 2017-11-01 DOI: 10.1109/ICDMW.2017.161

T. Luu, R. Phan, Rachel Davey, G. Chetty

Citations: 6

Automated Storytelling Evaluation and Story Chain Generation

2017 IEEE International Conference on Data Mining Workshops (ICDMW) Pub Date : 2017-11-01 DOI: 10.1109/ICDMW.2017.15

J. Rigsby, Daniel Barbará

Abstract: Given a beginning and ending document, automated storytelling attempts to fill in intermediary documents to form a coherent story. This is a common problem for analysts; they often have two snippets of information and want to find the other pieces that relate them. Evaluation of the quality of the created stories is difficult and has routinely involved human judgment. This work extends the state of the art by providing quantitative methods of story quality evaluation which are shown to have good agreement with human judgment. Two methods of automated storytelling evaluation, dispersion and coherence, are developed. Dispersion, a measure of story flow, ascertains how well the generated story flows away from the beginning document and towards the ending document. Coherence measures how well the articles in the middle of the story provide information about the relationship of the beginning and ending document pair. Kullback-Leibler divergence (KLD) is used to measure the ability to encode the vocabulary of the beginning and ending story documents using the set of middle documents in the story. The dispersion and coherence methodologies developed here have the added benefit that they do not require parametrization or user inputs and are also easily automated. An automated storytelling algorithm is proposed as a multicriteria optimization problem that maximizes dispersion and coherence simultaneously. The developed storytelling methodologies will allow for the automated identification of information which associates disparate documents in support of literature-based discovery and link analysis tasks. In addition, the methods provide quantitative measures of the strength of these associations.
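
The abstract above says that Kullback-Leibler divergence is used to measure how well the middle documents encode the vocabulary of the beginning and ending documents. The following is only a rough sketch of that idea, not the authors' method: whitespace tokenization, additive smoothing with a constant `alpha`, the direction of the divergence, and the function names (`word_distribution`, `kl_divergence`, `endpoint_divergence`) are all assumptions made for illustration.

```python
import math
from collections import Counter


def word_distribution(docs, vocab, alpha=1.0):
    """Smoothed unigram distribution over vocab.

    alpha is an assumed additive-smoothing constant; it keeps every
    probability strictly positive so the divergence stays finite.
    """
    counts = Counter(w for d in docs for w in d.lower().split())
    total = sum(counts[w] for w in vocab) + alpha * len(vocab)
    return {w: (counts[w] + alpha) / total for w in vocab}


def kl_divergence(p, q):
    """KL(p || q) in nats; p and q share the same keys and q > 0."""
    return sum(p[w] * math.log(p[w] / q[w]) for w in p if p[w] > 0)


def endpoint_divergence(begin, end, middle):
    """Divergence of the endpoint vocabulary from the middle documents.

    Assumed formulation: P is built from the beginning and ending
    documents, Q from the middle documents, and the score is KL(P || Q).
    """
    docs = [begin, end] + list(middle)
    vocab = sorted({w for d in docs for w in d.lower().split()})
    p = word_distribution([begin, end], vocab)
    q = word_distribution(middle, vocab)
    return kl_divergence(p, q)


# Hypothetical toy documents, used only to exercise the functions.
begin = "storm hits coast"
end = "coast towns rebuild homes"
related = ["storm damages coast towns", "towns rebuild homes after storm"]
unrelated = ["market prices rise", "team wins final match"]

related_score = endpoint_divergence(begin, end, related)
unrelated_score = endpoint_divergence(begin, end, unrelated)
```

Under these assumptions a lower score means the middle documents cover the endpoint vocabulary better, so `related_score` comes out smaller than `unrelated_score` for the toy documents.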

Citations: 2

Sequential Heterogeneous Attribute Embedding for Item Recommendation

2017 IEEE International Conference on Data Mining Workshops (ICDMW) Pub Date : 2017-11-01 DOI: 10.1109/ICDMW.2017.107

Kuan Liu, Xing Shi, P. Natarajan

Abstract: Attributes, such as metadata and profiles, carry useful information which in principle can help improve accuracy in recommender systems. However, existing approaches have difficulty fully leveraging attribute information due to practical challenges such as heterogeneity and sparseness. These approaches also fail to incorporate recurrent neural networks, which have recently shown effectiveness in item recommendation in applications such as video and music browsing. To overcome these challenges and to harvest the advantages of sequence models, we present a novel approach, Heterogeneous Attribute Recurrent Neural Networks (HA-RNN), which incorporates heterogeneous attributes and captures sequential dependencies in both items and attributes. HA-RNN extends recurrent neural networks with 1) a hierarchical attribute combination input layer and 2) an output attribute embedding layer. Experiments on two large-scale datasets show significant improvements over the state-of-the-art models. Ablation experiments demonstrate the importance of the two components in addressing heterogeneous attribute challenges, including variable lengths and attribute sparseness. Furthermore, our exploratory studies also shed light on why sequence modeling works well.

Citations: 5

Inside the Atoms: Mining a Network of Networks and Beyond

2017 IEEE International Conference on Data Mining Workshops (ICDMW) Pub Date : 2017-11-01 DOI: 10.1109/ICDMW.2017.138

Hanghang Tong

Citations: 0

Factor Analysis for Anonymization

2017 IEEE International Conference on Data Mining Workshops (ICDMW) Pub Date : 2017-11-01 DOI: 10.1109/ICDMW.2017.139

Aida Calvino, Palmira Aldeguer, J. Domingo-Ferrer

Citations: 1

Multiple Queries of Information Retrieval Using Krylov Subspace Method

2017 IEEE International Conference on Data Mining Workshops (ICDMW) Pub Date : 2017-11-01 DOI: 10.1109/ICDMW.2017.75

Youzuo Lin

Citations: 0

Probable Biomarker Identification Using Recursive Feature Extraction and Network Analysis

2017 IEEE International Conference on Data Mining Workshops (ICDMW) Pub Date : 2017-11-01 DOI: 10.1109/ICDMW.2017.67

Arpita Mishra, Abhishek Gupta, Umesh Maheswari, Laeeq Siddique

Citations: 1