2019 IEEE 20th International Conference on Information Reuse and Integration for Data Science (IRI): Latest Papers

Example-Based Feature Tweaking Using Random Forests

2019 IEEE 20th International Conference on Information Reuse and Integration for Data Science (IRI) Pub Date : 2019-07-01 DOI: 10.1109/IRI.2019.00022

Tony Lindgren, P. Papapetrou, Isak Samsten, L. Asker

Citations: 1

Title Page i

2019 IEEE 20th International Conference on Information Reuse and Integration for Data Science (IRI) Pub Date : 2019-07-01 DOI: 10.1109/iri.2019.00001

Citations: 0

DDM: Data-Driven Modeling of Physical Phenomenon with Application to METOC

2019 IEEE 20th International Conference on Information Reuse and Integration for Data Science (IRI) Pub Date : 2019-07-01 DOI: 10.1109/IRI.2019.00057

S. Rubin, Lydia Bouzar-Benlabiod

Abstract: The problem addressed by this paper pertains to the representation, acquisition, and randomization of experiential knowledge for autonomous systems in expert reconnaissance. Such systems are characterized by the requirement to render proper decisions not explicitly programmed for. Cases are defined to consist of domain-specific data (e.g., heterogeneous sensory data), which may not be fully general because (a) they include extraneous predicates and/or (b) their predicates are overly specific. Rules satisfy the definition of cases and result from cases (rules) that have undergone at least one of the aforementioned generalizations. Extraneous antecedent predicates may be discovered from cases (rules) sharing a common consequent, if binary tautologies are found in case (rule) pairings, or if higher tautologies are found in a multiplicity of such cases (rules). Eliminating such extraneous antecedent predicates allows for the discovery of possible additional extraneous antecedent predicates - where the antecedent of one is a proper subset of the other. Candidate rules are formed from the intersection of combinations of two or more case (rule) antecedent sets implying a common consequent. The removed antecedent subsets are acquired as new rules implying the common consequent, which are conditioned to fire by the non-monotonic actions of their common antecedent (i.e., by way of an embedded antecedent predicate) - reducing the specificity of the parents by generalizing them into smaller, more reusable rules. Similarly, more general consequent sequences are formed from common subsequences shared by two or more consequent sequences that are non-deterministically implied by a common antecedent. The removed consequent subsequences are acquired as new rules, which are set to fire before or after their parent's common dependency - reducing the specificity of the parents by generalizing them into smaller, more reusable rules. The rule to fire first will non-monotonically trigger the rule to fire next. This process iterates, since randomization of one side may enable further randomization of the other side. Tautologies are extracted and common subsets or subsequences form candidate rules as previously described (i.e., without creating duplicate productions). The context for the transformations is provided by the cases (rules), which are effectively acquired as previously described. Knowledge is segmented on the basis of whether it is a case or a rule. Knowledge is further dynamically segmented on the basis of maximally shared left-hand sides (LHS) and maximally shared right-hand sides (RHS) - using logical pointers to minimize space-time requirements. It is proven that the allowance for non-determinism is required, which implies that candidate rules cannot be invalidated by syntactically checking them for contradiction with a known valid dependency. A possibility metric is provided for each production, which cumulatively tracks the similarity of t…
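The step in which candidate rules are formed from intersections of antecedent sets sharing a common consequent can be illustrated with a small sketch. This is not the authors' implementation: the rule representation (a set of predicates paired with a consequent label), the function name `candidate_rules`, and the restriction to pairwise intersections are all assumptions made for illustration, whereas the abstract allows combinations of two or more rules.

```python
from itertools import combinations


def candidate_rules(rules):
    """Form candidate rules from pairwise antecedent intersections.

    `rules` is a list of (antecedent, consequent) pairs, where each
    antecedent is an iterable of predicate names (hypothetical format).
    Returns a set of (frozenset_of_predicates, consequent) candidates.
    """
    existing = {(frozenset(a), c) for a, c in rules}

    # Group antecedent sets by the consequent they imply.
    by_consequent = {}
    for antecedent, consequent in rules:
        by_consequent.setdefault(consequent, []).append(frozenset(antecedent))

    candidates = set()
    for consequent, antecedents in by_consequent.items():
        for first, second in combinations(antecedents, 2):
            common = first & second
            # Skip empty intersections and rules that already exist,
            # so no duplicate productions are created.
            if common and (common, consequent) not in existing:
                candidates.add((common, consequent))
    return candidates


rules = [
    ({"cloudy", "humid", "windy"}, "rain"),
    ({"cloudy", "humid", "cold"}, "rain"),
    ({"sunny"}, "dry"),
]
print(candidate_rules(rules))
```

Here the two rules implying "rain" share the predicates "cloudy" and "humid", so a single, more general candidate is proposed; the rule implying "dry" has no partner and contributes nothing.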

Citations: 0

Eye Tracking Area of Interest in the Context of Working Memory Capacity Tasks

2019 IEEE 20th International Conference on Information Reuse and Integration for Data Science (IRI) Pub Date : 2019-07-01 DOI: 10.1109/IRI.2019.00042

Gavindya Jayawardena, Anne M. P. Michalek, S. Jayarathna

Abstract: Adults diagnosed with Attention-Deficit / Hyperactivity Disorder (ADHD) have reduced working memory capacity, indicating attention control deficits. Such deficits affect the characteristic movements of human gaze, thus making it a potential avenue to investigate attention disorders. This paper presents a converging operations approach toward the objective detection of neurocognitive indices of ADHD symptomatology that is grounded in the cognitive neuroscience literature of ADHD. The development of these objective measures of ADHD will facilitate its diagnosis. We hypothesize that the characteristic movements of human gaze within specific areas of interest (AOIs) may be used to estimate psychometric measures and that distinct eye movement scan patterns can be used to better understand ADHD. The results of this feasibility study confirm the utility of a combined fixation and saccade feature set, captured within specific AOIs indexing Working Memory Capacity (WMC), as a predictor of a diagnosis of ADHD in adults. Tree-based classifiers performed best in terms of predicting ADHD, with 86% accuracy, using physiological measures of sustained visual attention during a WMC task.

Citations: 11

Evaluating Model Predictive Performance: A Medicare Fraud Detection Case Study

2019 IEEE 20th International Conference on Information Reuse and Integration for Data Science (IRI) Pub Date : 2019-07-01 DOI: 10.1109/IRI.2019.00016

Richard A. Bauder, Matthew Herland, T. Khoshgoftaar

Abstract: Evaluating a machine learning model's predictive performance is vital for establishing its practical usability in real-world applications. The use of separate training and test datasets, and cross-validation, are common when evaluating machine learning models. The former uses two distinct datasets, whereas cross-validation splits a single dataset into smaller training and test subsets. In real-world production applications, it is critical to establish a model's usefulness by validating it on completely new input data, and not just using the cross-validation results on a single historical dataset. In this paper, we present results for both evaluation methods, including performance comparisons. In order to provide meaningful comparative analyses between methods, we perform real-world fraud detection experiments using 2013 to 2016 Medicare durable medical equipment claims data. This Medicare dataset is split into training (2013 to 2015 individual years) and test (2016 only). Using this Medicare case study, we assess the fraud detection performance, across three learners, for both model evaluation methods. We find that using the separate training and test sets generally outperforms cross-validation, indicating a better real-world model performance evaluation. Even so, cross-validation has comparable, but conservative, fraud detection results.
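The two evaluation schemes contrasted in this abstract can be sketched as index-splitting routines. This is a minimal illustration only: the record layout (a `year` field), the function names, and the use of contiguous, unshuffled folds are assumptions and do not reflect the authors' experimental code.

```python
def temporal_split(records, test_year):
    """Separate train/test sets: earlier years train, one later year tests."""
    train = [r for r in records if r["year"] < test_year]
    test = [r for r in records if r["year"] == test_year]
    return train, test


def k_fold_splits(n_items, k):
    """Yield (train_indices, test_indices) for k contiguous folds.

    A single dataset of n_items rows is partitioned so that every row
    is used for testing exactly once. No shuffling or stratification
    is done here, which a real evaluation would normally add.
    """
    fold_sizes = [n_items // k + (1 if i < n_items % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_items))
        yield train_idx, test_idx
        start += size


# Hypothetical claim records tagged by year.
records = [{"year": y, "id": i} for i, y in enumerate([2013, 2014, 2015, 2016, 2016])]
train, test = temporal_split(records, test_year=2016)
print(len(train), len(test))

for train_idx, test_idx in k_fold_splits(10, 3):
    print(test_idx)
```

The temporal split mirrors the idea of validating on completely new input data, while the k-fold routine reuses one historical dataset for both roles.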

Citations: 6

Enhancing Software Requirements Cluster Labeling Using Wikipedia

2019 IEEE 20th International Conference on Information Reuse and Integration for Data Science (IRI) Pub Date : 2019-07-01 DOI: 10.1109/IRI.2019.00031

S. Reddivari

Abstract: Clustering plays an important role in reusable requirements retrieval from the ever-growing software project repositories. The literature on requirements cluster labeling is still emerging. Researchers have investigated clustering to support various software engineering activities such as requirements prioritization, feature identification, automated tracing, and code navigation. The primary task in analyzing the clustering results is to "label" the clusters by means of some representative words to summarize and comprehend the requirements data. Despite the development of automatic cluster labeling techniques for software requirements, very little is understood about enhancing the cluster labels using external knowledge sources such as Wikipedia. In this paper, we review the literature on enhancing cluster labeling, present a framework for requirements cluster labeling, and conduct an experiment to evaluate how the Wikipedia-based enhancement performs in labeling requirements clusters. The results show that Wikipedia-based labeling outperforms traditional Information Retrieval (IR) techniques. Our work sheds light on improving automated ways to support information reuse and management in the context of requirements engineering (RE).

Citations: 2

EmoWei: Emotion-Oriented Personalized Weight Management System Based on Sentiment Analysis

2019 IEEE 20th International Conference on Information Reuse and Integration for Data Science (IRI) Pub Date : 2019-07-01 DOI: 10.1109/IRI.2019.00060

Jihyeon Kim, Uran Oh

Citations: 1

Texture Image Categorization in Wavelet Domain via Naive Bayes Classifier Based on Laplace and Generalized Gaussian Distribution

2019 IEEE 20th International Conference on Information Reuse and Integration for Data Science (IRI) Pub Date : 2019-07-01 DOI: 10.1109/IRI.2019.00034

Muhammad Azam, N. Bouguila

Abstract: In this paper, we investigate a recently proposed feature extraction technique for texture image representation. In the introduced method, features are extracted via a bounded Laplace mixture model (BLMM) in the wavelet domain. Because wavelet coefficients can be modeled accurately with a Laplace distribution, it is proposed to apply classifiers based on this distribution, which leads us to introduce a Naive Bayes classifier with a Laplace distribution for image categorization. The proposed approach is validated through experiments on different texture image datasets, and it has shown very good results compared to the model based on the Gaussian distribution. The generalized Gaussian distribution is a generalization of both the Laplace and Gaussian distributions; thus we also introduce a Naive Bayes classifier with a generalized Gaussian distribution to achieve better performance compared to the above two models. The proposed approach is also validated through extensive experiments, and it is observed that, by taking into account the nature of the data, the proposed models have very good performance. Classification results are presented using different performance metrics to demonstrate the effectiveness of the proposed algorithms in texture image classification.
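To make the idea of a Naive Bayes classifier with Laplace class-conditional densities concrete, the following is a minimal sketch on generic numeric features. It is not the authors' method: it omits the BLMM-based wavelet feature extraction and the generalized Gaussian variant, and the function names and toy data are hypothetical. Each feature is modeled per class with density (1 / 2b) exp(-|x - mu| / b), where the maximum-likelihood estimates are the median for mu and the mean absolute deviation from the median for b.

```python
import math
from statistics import median


def fit_laplace_nb(X, y):
    """Estimate a log prior and per-feature Laplace (mu, b) for each class."""
    model = {}
    n_samples = len(y)
    for label in set(y):
        rows = [x for x, target in zip(X, y) if target == label]
        params = []
        for column in zip(*rows):
            mu = median(column)
            b = sum(abs(v - mu) for v in column) / len(column)
            # Floor the scale so a constant feature does not divide by zero.
            params.append((mu, max(b, 1e-9)))
        model[label] = (math.log(len(rows) / n_samples), params)
    return model


def predict_laplace_nb(model, x):
    """Return the class with the highest log posterior (up to a constant)."""
    def score(label):
        log_prior, params = model[label]
        log_likelihood = sum(
            -math.log(2 * b) - abs(v - mu) / b for v, (mu, b) in zip(x, params)
        )
        return log_prior + log_likelihood

    return max(model, key=score)


# Toy two-class data standing in for extracted texture features.
X = [[0.0, 1.0], [0.2, 0.8], [-0.1, 1.1], [5.0, -3.0], [5.2, -2.9], [4.9, -3.2]]
y = ["a", "a", "a", "b", "b", "b"]
model = fit_laplace_nb(X, y)
print(predict_laplace_nb(model, [0.1, 0.9]))
print(predict_laplace_nb(model, [5.1, -3.0]))
```

Swapping the per-feature log-density for a Gaussian one gives the baseline the abstract compares against; the rest of the classifier is unchanged.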

Citations: 0

Fake News Detection Using Bayesian Inference

2019 IEEE 20th International Conference on Information Reuse and Integration for Data Science (IRI) Pub Date : 2019-07-01 DOI: 10.1109/IRI.2019.00066

Fatma Najar, Nuha Zamzami, N. Bouguila

Citations: 9

Machine Learning for Classification of Economic Recessions

2019 IEEE 20th International Conference on Information Reuse and Integration for Data Science (IRI) Pub Date : 2019-07-01 DOI: 10.1109/IRI.2019.00019

Bruce Jackson, M. Rege

Citations: 1