2016 IEEE International Conference on Data Science and Advanced Analytics (DSAA): Latest Papers (Page 4)

Exploiting a Bootstrapping Approach for Automatic Annotation of Emotions in Texts

2016 IEEE International Conference on Data Science and Advanced Analytics (DSAA) Pub Date : 2016-10-01 DOI: 10.1109/DSAA.2016.78

Lea Canales, C. Strapparava, E. Boldrini, P. Martínez-Barco

Abstract: The objective of this research is to develop a technique to automatically annotate emotional corpora. The automatic annotation of emotional corpora still presents numerous challenges, and thus there is a need for a technique that allows us to tackle the annotation task. The relevance of this research is demonstrated by the fact that people's emotions, and the patterns of these emotions, provide great value for business, individuals, society, and politics. Hence, the creation of a robust emotion detection system becomes crucial. Due to the subjectivity of emotions, the main challenge in creating emotional resources is the annotation process. Thus, with this starting point in mind, the objective of our paper is to illustrate an innovative and effective bootstrapping process for the automatic annotation of emotional corpora. The evaluations carried out confirm the soundness of the proposed approach and allow us to consider the bootstrapping process an appropriate way to create resources, such as an emotional corpus, that can be employed in supervised machine learning towards the improvement of emotion detection systems.

Citations: 14

Role Models: Mining Role Transitions Data in IT Project Management

2016 IEEE International Conference on Data Science and Advanced Analytics (DSAA) Pub Date : 2016-10-01 DOI: 10.1109/DSAA.2016.62

G. Palshikar, Sachin Pawar, Nitin Ramrakhiyani

Abstract: The notion of roles is crucial in project management across various domains. A role indicates a broad set of tasks, activities, deliverables, and responsibilities that a person needs to carry out within a project. Assigning roles to team members clarifies the work items each is expected to deliver and structures the team's interactions, both among themselves and with external stakeholders. This paper analyzes a sizeable real-life dataset on the actual usage of roles in software development and maintenance projects in a large multinational IT organization. The paper introduces and formalizes concepts such as the seniority level of a role, career progression, and career lines; formulates various business questions related to role-based project management; proposes analytics techniques to answer them; and outlines the actual results produced. The business questions concern dependencies between roles, patterns in role assignments and durations, predicting role changes, discovering insights useful for meeting career aspirations, and interesting role sequences. The proposed analytics algorithms are based on Markov models, sequence mining, classification, and survival analysis.
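The abstract above names Markov models as one basis for analyzing role transitions but gives no details. As a rough illustration only, and not the authors' algorithm, the following sketch estimates first-order transition probabilities between roles by counting consecutive pairs. The function name `transition_probabilities` and the role histories are hypothetical and invented for this example.

```python
from collections import Counter, defaultdict


def transition_probabilities(sequences):
    """Maximum-likelihood estimate of P(next role | current role)."""
    counts = defaultdict(Counter)
    for seq in sequences:
        # Count each consecutive (current, next) pair in a role history.
        for current, nxt in zip(seq, seq[1:]):
            counts[current][nxt] += 1
    # Normalize each row of counts so it sums to 1.
    return {
        role: {nxt: c / sum(followers.values()) for nxt, c in followers.items()}
        for role, followers in counts.items()
    }


# Hypothetical role histories (assumed for illustration, not from the paper).
histories = [
    ["Developer", "Senior Developer", "Team Lead"],
    ["Developer", "Tester", "Senior Developer"],
    ["Developer", "Senior Developer", "Architect"],
]
probs = transition_probabilities(histories)
for role, followers in probs.items():
    print(role, followers)
```

A table like this could support questions about which role most often follows another. Roles that never appear as a predecessor (here "Team Lead" and "Architect") simply have no row.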

Citations: 4

Using Players' Gameplay Action-Decision Profiles to Prescribe Training: Reducing Training Costs with Serious Games Analytics

2016 IEEE International Conference on Data Science and Advanced Analytics (DSAA) Pub Date : 2016-10-01 DOI: 10.1109/DSAA.2016.74

C. S. Loh, I. Li

Citations: 3

Dilation of Chisini-Jensen-Shannon Divergence

2016 IEEE International Conference on Data Science and Advanced Analytics (DSAA) Pub Date : 2016-10-01 DOI: 10.1109/DSAA.2016.25

P. Sharma, Gary Holness

Citations: 6

Fraud Detection in Energy Consumption: A Supervised Approach

2016 IEEE International Conference on Data Science and Advanced Analytics (DSAA) Pub Date : 2016-10-01 DOI: 10.1109/DSAA.2016.19

Bernat Coma-Puig, J. Carmona, Ricard Gavaldà, Santiago Alcoverro, Victor Martin

Citations: 55

Meeting Health Care Research Needs in a Kimball Integrated Data Warehouse

2016 IEEE International Conference on Data Science and Advanced Analytics (DSAA) Pub Date : 2016-10-01 DOI: 10.1109/DSAA.2016.91

R. Hart, A. Kuo

Citations: 4

A Framework for Description and Analysis of Sampling-Based Approximate Triangle Counting Algorithms

2016 IEEE International Conference on Data Science and Advanced Analytics (DSAA) Pub Date : 2016-10-01 DOI: 10.1109/DSAA.2016.15

M. H. Chehreghani

Citations: 0

Projecting "Better Than Randomly": How to Reduce the Dimensionality of Very Large Datasets in a Way That Outperforms Random Projections

2016 IEEE International Conference on Data Science and Advanced Analytics (DSAA) Pub Date : 2016-10-01 DOI: 10.1109/DSAA.2016.26

M. Wojnowicz, Di Zhang, Glenn Chisholm, Xuan Zhao, M. Wolff

Abstract: For very large datasets, random projections (RP) have become the tool of choice for dimensionality reduction. This is due to the computational complexity of principal component analysis. However, the recent development of randomized principal component analysis (RPCA) has opened up the possibility of obtaining approximate principal components on very large datasets. In this paper, we compare the performance of RPCA and RP in dimensionality reduction for supervised learning. In Experiment 1, we study a malware classification task on a dataset with over 10 million samples, almost 100,000 features, and over 25 billion non-zero values, with the goal of reducing the dimensionality to a compressed representation of 5,000 features. In order to apply RPCA to this dataset, we develop a new algorithm called large sample RPCA (LS-RPCA), which extends the RPCA algorithm to work on datasets with arbitrarily many samples. We find that classification performance is much higher when using LS-RPCA for dimensionality reduction than when using random projections. In particular, across a range of target dimensionalities, we find that using LS-RPCA reduces classification error by between 37% and 54%. Experiment 2 generalizes the phenomenon to multiple datasets, feature representations, and classifiers. These findings have implications for a large number of research projects in which random projections were used as a preprocessing step for dimensionality reduction. As long as accuracy is at a premium and the target dimensionality is sufficiently less than the numeric rank of the dataset, randomized PCA may be a superior choice. Moreover, if the dataset has a large number of samples, then LS-RPCA will provide a method for obtaining the approximate principal components.
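To make the contrast in the abstract above concrete, here is a minimal NumPy sketch of the two generic techniques it compares: a Gaussian random projection, which ignores the data, and a randomized PCA built on a random range sketch with a few power iterations. This is not the paper's LS-RPCA algorithm. The function names, the oversampling and iteration parameters, and the small synthetic dataset are all assumptions made for illustration.

```python
import numpy as np


def random_projection(X, k, rng):
    """Project rows of X onto k random Gaussian directions (data-independent)."""
    d = X.shape[1]
    R = rng.normal(0.0, 1.0 / np.sqrt(k), size=(d, k))
    return X @ R


def randomized_pca(X, k, rng, oversample=10, n_iter=2):
    """Approximate the top-k principal components with a randomized range finder."""
    Xc = X - X.mean(axis=0)                      # PCA operates on centered data
    width = min(k + oversample, min(Xc.shape))   # sketch width, slightly above k
    # Sketch the column space of Xc with a random Gaussian test matrix.
    Q, _ = np.linalg.qr(Xc @ rng.normal(size=(Xc.shape[1], width)))
    for _ in range(n_iter):
        # Power iterations sharpen the sketch; QR keeps it numerically stable.
        Q, _ = np.linalg.qr(Xc.T @ Q)
        Q, _ = np.linalg.qr(Xc @ Q)
    # A small SVD of the projected matrix gives approximate right singular vectors.
    _, _, Vt = np.linalg.svd(Q.T @ Xc, full_matrices=False)
    components = Vt[:k]                          # shape (k, d), orthonormal rows
    return Xc @ components.T, components


rng = np.random.default_rng(0)
# Synthetic data (assumed for illustration): rank-5 signal plus small noise.
X = rng.normal(size=(200, 5)) @ rng.normal(size=(5, 50))
X = X + 0.01 * rng.normal(size=(200, 50))

Z_rp = random_projection(X, 5, rng)
Z_pca, components = randomized_pca(X, 5, rng)

Xc = X - X.mean(axis=0)
explained = (Z_pca ** 2).sum() / (Xc ** 2).sum()
print(f"fraction of variance retained by randomized PCA: {explained:.4f}")
```

The random projection costs a single matrix product but is blind to where the variance lies, whereas the randomized PCA spends a few extra passes over the data to align its directions with the dominant subspace. Both functions here assume the data fits in memory.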

Citations: 10

On the Evaluation of Outlier Detection and One-Class Classification Methods

2016 IEEE International Conference on Data Science and Advanced Analytics (DSAA) Pub Date : 2016-10-01 DOI: 10.1109/DSAA.2016.8

Lorne Swersky, Henrique O. Marques, J. Sander, R. Campello, A. Zimek

Citations: 52

A Symbolic Tree Model for Oil and Gas Production Prediction Using Time-Series Production Data

2016 IEEE International Conference on Data Science and Advanced Analytics (DSAA) Pub Date : 2016-10-01 DOI: 10.1109/DSAA.2016.36

Bingjie Wei, Helen Pinto, Xin Wang

Citations: 3