2016 IEEE International Conference on Data Science and Advanced Analytics (DSAA): Latest Papers, Page 6

Customer Simulation for Direct Marketing Experiments

2016 IEEE International Conference on Data Science and Advanced Analytics (DSAA) Pub Date : 2016-10-01 DOI: 10.1109/DSAA.2016.59

Yegor Tkachenko, Mykel J. Kochenderfer, Krzysztof Kluza

Citations: 9

Sparse Linear Discriminant Analysis in Structured Covariates Space

2016 IEEE International Conference on Data Science and Advanced Analytics (DSAA) Pub Date : 2016-10-01 DOI: 10.1002/sam.11376

S. Safo, Q. Long

Abstract: Classification with high-dimensional variables is a popular goal in many modern statistical studies. Fisher's linear discriminant analysis (LDA) is a common and effective tool for classifying entities into existing groups. It is well known that classification using Fisher's discriminant for high-dimensional data is as bad as random guessing due to the many noise features that increase the misclassification rate. It has recently been acknowledged that complex biological mechanisms occur through multiple features working together, though individually these features may contribute to noise accumulation in the data. In view of this, it is important to perform classification with discriminant vectors that use a subset of important variables, while also utilizing prior biological relationships among features. We tackle this problem in this article and propose methods that incorporate variable selection into the classification problem, for the identification of important biomarkers. Furthermore, we incorporate into the LDA problem prior information on the relationships among variables using undirected graphs in order to identify functionally meaningful biomarkers. We compare our methods to existing sparse LDA approaches via simulation studies and real data analysis.

Citations: 4

Senpy: A Pragmatic Linked Sentiment Analysis Framework

2016 IEEE International Conference on Data Science and Advanced Analytics (DSAA) Pub Date : 2016-10-01 DOI: 10.1109/DSAA.2016.79

J. F. Sánchez-Rada, C. Iglesias, Ignacio Corcuera, Óscar Araque

Abstract: Sentiment and emotion analysis technologies have quickly gained momentum in industry and academia. This popularity has spawned a myriad of services and tools. Due to the lack of common interfaces and models, each of these services imposes specific interfaces and representation models. Heterogeneity makes it costly to integrate different services, evaluate them or switch between them. This work aims to remedy heterogeneity by providing an extensible framework and an API aligned with the NIF service specification. It also includes a reference implementation, a first step towards a successful and cost-effective adoption. The specific contributions in this paper are: (i) the Senpy framework, (ii) an architecture for the framework that follows a plug-in approach, (iii) a reference open source implementation of the architecture, (iv) the use and validation of the framework and architecture in a big data sentiment analysis European project. Our aim is to foster the development of a new generation of emotion-aware services by isolating the development of new algorithms from the representation of results and the deployment of services.

Citations: 12

Data-Driven Sales Leads Prediction for Everything-as-a-Service in the Cloud

2016 IEEE International Conference on Data Science and Advanced Analytics (DSAA) Pub Date : 2016-10-01 DOI: 10.1109/DSAA.2016.83

Chul Sung, Bo Zhang, Chunhui Y. Higgins, Y. Choe

Abstract: A cloud platform website, offering a catalog of services, operates under a freemium business model or a free trial business model, aggressively marketing to customers who have previously visited. In such a cloud platform or service business, accurate identification of high-profile customers is central to the success of the business. However, existing approaches have several limitations because of the following challenges: (1) heavy customer traffic flows, (2) the noise in user behaviors, (3) a lack of collaboration across stakeholders, (4) class-imbalanced customer data (few paying customers vs. high numbers of freemium or trial customers), and (5) unpredictable business environments. In this paper, we propose a data-driven iterative sales lead prediction framework for cloud everything as a service (XaaS), including a cloud platform or software. In this framework, from the BizDevOps process we collaborate to extract business insights from multiple business stakeholders. From these business insights, we calculate service usage scores using our RFDL (Recency, Frequency, Duration, and Lifetime) analysis and estimate sales lead prediction based on the usage scores in a supervised manner. Our framework adapts to a continuously changing environment through iterations of the whole process, maintains its performance of sales lead prediction, and finally shares the prediction results with the sales or marketing team effectively. A three-month pilot implementation of the framework led to more than 300 paying customers and more than $200K increase in revenue. We expect our scalable, iterative sales lead prediction approach to be widely applicable to online or cloud business domains where there is a constant flux of customer traffic.
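The abstract above names the four RFDL components (Recency, Frequency, Duration, Lifetime) but does not define how they are computed. The following is only a minimal sketch under assumed definitions: the function name `rfdl_scores`, the `(day, minutes)` session format, and the four formulas are hypothetical and are not taken from the paper.

```python
from datetime import date

def rfdl_scores(sessions, today):
    """Compute assumed RFDL values for one customer and one service.

    sessions: list of (day, minutes) tuples, one per recorded usage session.
    The definitions below are illustrative assumptions, not the paper's.
    """
    if not sessions:
        return {"recency": None, "frequency": 0, "duration": 0.0, "lifetime": 0}
    days = [day for day, _ in sessions]
    first, last = min(days), max(days)
    return {
        "recency": (today - last).days,                 # days since the most recent session
        "frequency": len(sessions),                     # number of recorded sessions
        "duration": sum(mins for _, mins in sessions),  # total minutes of use
        "lifetime": (last - first).days + 1,            # first-to-last span in days, inclusive
    }

sessions = [
    (date(2016, 3, 1), 30.0),
    (date(2016, 3, 5), 45.0),
    (date(2016, 3, 10), 15.0),
]
scores = rfdl_scores(sessions, today=date(2016, 3, 12))
print(scores)
```

Values of this kind could then serve as input features for a supervised lead classifier, which is the role the abstract assigns to the usage scores.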

Citations: 5

Efficient Sampling-Based ADMM for Distributed Data

2016 IEEE International Conference on Data Science and Advanced Analytics (DSAA) Pub Date : 2016-10-01 DOI: 10.1109/DSAA.2016.41

Jun-Kun Wang, Shou-de Lin

Citations: 0

On the Role of Mentions on Tweet Virality

2016 IEEE International Conference on Data Science and Advanced Analytics (DSAA) Pub Date : 2016-10-01 DOI: 10.1109/DSAA.2016.28

Soumajit Pramanik, Qinna Wang, Maximilien Danisch, Sumanth Bandi, Anand Kumar, Jean-Loup Guillaume, Bivas Mitra

Citations: 12

Temporal Network Change Detection Using Network Centralities

2016 IEEE International Conference on Data Science and Advanced Analytics (DSAA) Pub Date : 2016-10-01 DOI: 10.1109/DSAA.2016.13

Yoshitaro Yonamoto, K. Morino, K. Yamanishi

Abstract: In this paper, we propose a novel change detection method for temporal networks. In usual change detection algorithms, change scores are generated from an observed time series. When this change score reaches a threshold, an alert is raised to declare the change. Our method aggregates these change scores and alerts based on network centralities. Many types of changes in a network can be discovered from changes to the network structure. Thus, nodes and links should be monitored in order to recognize changes. However, it is difficult to focus on the appropriate nodes and links when there is little information regarding the dataset. Network centrality such as PageRank measures the importance of nodes in a network based on certain criteria. Therefore, it is natural to apply network centralities in order to improve the accuracy of change detection methods. Our analysis reveals how and when network centrality works well in terms of change detection. Based on this understanding, we propose an aggregating algorithm that emphasizes the appropriate network centralities. Our evaluation of the proposed aggregation algorithm showed highly accurate predictions for an artificial dataset and two real datasets. Our method contributes to extending the field of change detection in temporal networks by utilizing network centralities.
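To make the idea of aggregating per-node change scores by centrality concrete, here is a minimal sketch. It assumes a plain centrality-weighted average with PageRank as the only centrality; the abstract does not give the paper's actual aggregation rule, so `aggregate_change_score`, the toy graph, and the per-node scores are all hypothetical.

```python
def pagerank(adj, damping=0.85, iters=100):
    """Power-iteration PageRank on a dict {node: [out-neighbours]}.

    Every neighbour must also appear as a key of adj.
    """
    nodes = list(adj)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iters):
        new = {v: (1.0 - damping) / n for v in nodes}
        for v in nodes:
            targets = adj[v] or nodes  # a dangling node spreads its rank uniformly
            share = damping * rank[v] / len(targets)
            for w in targets:
                new[w] += share
        rank = new
    return rank

def aggregate_change_score(node_scores, centrality):
    """Centrality-weighted average of per-node change scores (an assumed rule)."""
    total = sum(centrality[v] for v in node_scores)
    return sum(node_scores[v] * centrality[v] for v in node_scores) / total

# Toy snapshot: node "a" is central and has a high change score.
graph = {"a": ["b"], "b": ["c"], "c": ["a"], "d": ["a"]}
ranks = pagerank(graph)
node_scores = {"a": 0.9, "b": 0.1, "c": 0.1, "d": 0.1}
network_score = aggregate_change_score(node_scores, ranks)
unweighted = sum(node_scores.values()) / len(node_scores)
print(network_score, unweighted)
```

In this toy case the weighted score exceeds the unweighted mean because the change is concentrated on a high-PageRank node. A network-level alert would then be raised when the aggregated score crosses a threshold.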

Citations: 2

A Distributed Decision Tree Algorithm and Its Implementation on Big Data Platforms

2016 IEEE International Conference on Data Science and Advanced Analytics (DSAA) Pub Date : 2016-10-01 DOI: 10.1109/DSAA.2016.64

Jingxiang Chen, Tao Wang, Ralph Abbey, J. Pingenot

Citations: 5

Behavior-Oriented Time Segmentation for Mining Individualized Rules of Mobile Phone Users

2016 IEEE International Conference on Data Science and Advanced Analytics (DSAA) Pub Date : 2016-10-01 DOI: 10.1109/DSAA.2016.60

Iqbal H. Sarker, A. Colman, M. A. Kabir, Jun Han

Abstract: Mobile or cellular phones can record various types of context data related to a user's phone call activities. In this paper, we present an approach to discovering individualized behavior rules for mobile users from their phone call records, based on the temporal context in which a user accepts, rejects or misses a call. One of the determinants of an individual's phone behavior is the various activities undertaken at various times of a day and days of the week. In many cases, such behavior will follow temporal patterns. Currently, researchers modeling user behavior using temporal context statically segment time into arbitrary categories (e.g., morning, evening) or periods (e.g., 1 hour). However, such time categorization does not necessarily map to the patterns of individual user activity and subsequent behavior. Therefore, we propose a behavior-oriented time segmentation (BOTS) technique that dynamically identifies diverse time segments for an individual user's behaviors based on the phone call records. Experiments on real datasets show that our proposed technique better captures the user's dominant call response behavior at various times of the day and week, thereby enabling more appropriate rules to be created for the purpose of automated handling of incoming calls, in an intelligent call interruption management system.
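The abstract above contrasts static time categories with segments derived from the user's own behavior, but it does not describe the BOTS procedure itself. The sketch below is therefore not BOTS; it is one simple, assumed way to obtain behavior-driven segments: start from fixed slots, find the dominant call response in each, and merge adjacent slots that share it. The names `merge_slots` and `dominant` and the sample call log are hypothetical.

```python
from collections import Counter

def dominant(behaviors):
    """Most frequent behavior label in a list."""
    return Counter(behaviors).most_common(1)[0][0]

def merge_slots(calls, slot_minutes=60):
    """Build (start_minute, end_minute, behavior) segments from a call log.

    calls: list of (minute_of_day, behavior) pairs for a single user.
    Adjacent slots with the same dominant behavior are merged; slots
    without any calls are left out and break a segment.
    """
    by_slot = {}
    for minute, behavior in calls:
        by_slot.setdefault(minute // slot_minutes, []).append(behavior)
    segments = []
    for slot in sorted(by_slot):
        label = dominant(by_slot[slot])
        start, end = slot * slot_minutes, (slot + 1) * slot_minutes
        if segments and segments[-1][2] == label and segments[-1][1] == start:
            segments[-1] = (segments[-1][0], end, label)  # extend the previous segment
        else:
            segments.append((start, end, label))
    return segments

calls = [
    (545, "reject"), (560, "reject"),                   # 09:05, 09:20
    (610, "reject"), (650, "accept"), (655, "reject"),  # 10:10, 10:50, 10:55
    (725, "accept"), (730, "accept"),                   # 12:05, 12:10
]
segments = merge_slots(calls)
print(segments)
```

Each resulting segment can be read as a candidate rule such as "between 09:00 and 11:00 this user mostly rejects calls".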

Citations: 19

Anomaly Detection in Automobile Control Network Data with Long Short-Term Memory Networks

2016 IEEE International Conference on Data Science and Advanced Analytics (DSAA) Pub Date : 2016-10-01 DOI: 10.1109/DSAA.2016.20

Adrian Taylor, Sylvain P. Leblanc, N. Japkowicz

Abstract: Modern automobiles have been proven vulnerable to hacking by security researchers. By exploiting vulnerabilities in the car's external interfaces, such as Wi-Fi, Bluetooth, and physical connections, they can access a car's controller area network (CAN) bus. On the CAN bus, commands can be sent to control the car, for example cutting the brakes or stopping the engine. While securing the car's interfaces to the outside world is an important part of mitigating this threat, the last line of defence is detecting malicious behaviour on the CAN bus. We propose an anomaly detector based on a Long Short-Term Memory neural network to detect CAN bus attacks. The detector works by learning to predict the next data word originating from each sender on the bus. Highly surprising bits in the actual next word are flagged as anomalies. We evaluate the detector by synthesizing anomalies with modified CAN bus data. The synthesized anomalies are designed to mimic attacks reported in the literature. We show that the detector can detect anomalies we synthesized with low false alarm rates. Additionally, the granularity of the bit predictions can provide forensic investigators clues as to the nature of flagged anomalies.
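The flagging step described in the abstract above (highly surprising bits in the actual next word) can be illustrated without the LSTM itself. The sketch below assumes that some predictor has already produced, for each bit of the next data word, the probability that the bit is 1. The surprise measure (negative log2-probability of the observed bit), the threshold of 4 bits, and the function names are assumptions for illustration, not the paper's scoring rule.

```python
import math

def bit_surprise(predicted_p1, actual_bits):
    """Per-bit surprise: -log2 of the probability assigned to the observed bit.

    predicted_p1[i] is the predicted probability that bit i equals 1.
    """
    eps = 1e-12  # guards against log(0) when the predictor is fully confident
    surprises = []
    for p1, bit in zip(predicted_p1, actual_bits):
        p_observed = p1 if bit == 1 else 1.0 - p1
        surprises.append(-math.log2(max(p_observed, eps)))
    return surprises

def flag_anomalous_bits(predicted_p1, actual_bits, threshold=4.0):
    """Indices of bits whose surprise exceeds the (assumed) threshold."""
    surprises = bit_surprise(predicted_p1, actual_bits)
    return [i for i, s in enumerate(surprises) if s > threshold]

# Hypothetical 4-bit example: bits 1 and 3 contradict confident predictions.
predicted = [0.99, 0.01, 0.50, 0.98]
observed = [1, 1, 0, 0]
flagged = flag_anomalous_bits(predicted, observed)
print(flagged)
```

Because the score is kept per bit, the flagged indices show which part of the data word deviated, which is the kind of forensic clue the abstract mentions.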

Citations: 268