2019 IEEE 20th International Conference on Information Reuse and Integration for Data Science (IRI): Latest Papers

Example-Based Feature Tweaking Using Random Forests
Tony Lindgren, P. Papapetrou, Isak Samsten, L. Asker
{"title":"Example-Based Feature Tweaking Using Random Forests","authors":"Tony Lindgren, P. Papapetrou, Isak Samsten, L. Asker","doi":"10.1109/IRI.2019.00022","DOIUrl":"https://doi.org/10.1109/IRI.2019.00022","url":null,"abstract":"In certain application areas when using predictive models, it is not enough to make an accurate prediction for an example; instead, it might be more important to change a prediction from an undesired class into a desired class. In this paper we investigate methods for changing predictions of examples. To this end, we introduce a novel algorithm for changing predictions of examples and we compare this novel method to an existing method and a baseline method. In an empirical evaluation we compare the three methods on a total of 22 datasets. The results show that the novel method and the baseline method can change an example from an undesired class into a desired class in more cases than the competitor method (and in some cases this difference is statistically significant). We also show that the distance, as measured by the Euclidean norm, is higher for the novel and baseline methods (and in some cases this difference is statistically significant) than for the state-of-the-art method. The methods and their proposed changes are also evaluated subjectively in a medical domain with interesting results.","PeriodicalId":295028,"journal":{"name":"2019 IEEE 20th International Conference on Information Reuse and Integration for Data Science (IRI)","volume":"30 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134486125","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
Title Page i
{"title":"Title Page i","authors":"","doi":"10.1109/iri.2019.00001","DOIUrl":"https://doi.org/10.1109/iri.2019.00001","url":null,"abstract":"","PeriodicalId":295028,"journal":{"name":"2019 IEEE 20th International Conference on Information Reuse and Integration for Data Science (IRI)","volume":"15 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132812253","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
DDM: Data-Driven Modeling of Physical Phenomenon with Application to METOC
S. Rubin, Lydia Bouzar-Benlabiod
{"title":"DDM: Data-Driven Modeling of Physical Phenomenon with Application to METOC","authors":"S. Rubin, Lydia Bouzar-Benlabiod","doi":"10.1109/IRI.2019.00057","DOIUrl":"https://doi.org/10.1109/IRI.2019.00057","url":null,"abstract":"The problem addressed by this paper pertains to the representation, acquisition, and randomization of experiential knowledge for autonomous systems in expert reconnaissance. Such systems are characterized by the requirement to render proper decisions not explicitly programmed for. Cases are defined to consist of domain-specific data (e.g., heterogeneous sensory data), which may not be fully general due to the inclusion of (a) extraneous predicates and/or because (b) the predicates are overly specific. Rules satisfy the definition of cases and result from cases (rules), which have undergone at least one of the aforementioned generalizations. Extraneous antecedent predicates may be discovered from cases (rules) sharing a common consequent, if binary tautologies are found in case (rule) pairings, or if higher tautologies are found in a multiplicity of such cases (rules). Eliminating such extraneous antecedent predicates allows for the discovery of possible additional extraneous antecedent predicates - where the antecedent of one is a proper subset of the other. Candidate rules are formed from the intersection of combinations of two or more case (rule) antecedent sets implying a common consequent. The removed antecedent subsets are acquired as new rules implying the common consequent, which are conditioned to fire by the non-monotonic actions of their common antecedent (i.e., by way of an embedded antecedent predicate) - reducing the specificity of the parents by generalizing them into smaller, more reusable rules. Similarly, more general consequent sequences are formed from common subsequences shared by two or more consequent sequences being non-deterministically implied by a common antecedent. The removed consequent subsequences are acquired as new rules, which are set to fire before or after that of its parent's common dependency - reducing the specificity of the parents by generalizing them into smaller, more reusable rules. The rule to fire first will non-monotonically trigger the rule to fire next. This process iterates, since randomization of one side may enable further randomization of the other side. Tautologies are extracted and common subsets or subsequences form candidate rules as previously described (i.e., without creating duplicate productions). The context for the transformations is provided by the cases (rules), which are effectively acquired as previously described. Knowledge is segmented on the basis of whether it is a case, or a rule. Knowledge is further dynamically segmented on the basis of maximally shared left-hand sides (LHS) and maximally shared right-hand sides (RHS) - using logical pointers to minimize space-time requirements. It is proven that the allowance for non determinism is required, which implies that candidate rules cannot be invalidated by syntactically checking them for contradiction with a known valid dependency. A possibility metric is provided for each production, which cumulatively tracks the similarity of t","PeriodicalId":295028,"journal":{"name":"2019 IEEE 20th International Conference on Information Reuse and Integration for Data Science (IRI)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134194356","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Eye Tracking Area of Interest in the Context of Working Memory Capacity Tasks
Gavindya Jayawardena, Anne M. P. Michalek, S. Jayarathna
{"title":"Eye Tracking Area of Interest in the Context of Working Memory Capacity Tasks","authors":"Gavindya Jayawardena, Anne M. P. Michalek, S. Jayarathna","doi":"10.1109/IRI.2019.00042","DOIUrl":"https://doi.org/10.1109/IRI.2019.00042","url":null,"abstract":"Adults diagnosed with Attention-Deficit/Hyperactivity Disorder (ADHD) have reduced working memory capacity, indicating attention control deficits. Such deficits affect the characteristic movements of human gaze, thus making it a potential avenue to investigate attention disorders. This paper presents a converging operations approach toward the objective detection of neurocognitive indices of ADHD symptomatology that is grounded in the cognitive neuroscience literature of ADHD. The development of these objective measures of ADHD will facilitate its diagnosis. We hypothesize that the characteristic movements of human gaze within specific areas of interest (AOIs) may be used to estimate psychometric measures and that distinct eye movement scan patterns can be used to better understand ADHD. The results of this feasibility study confirm the utility of a combined fixation and saccade feature set captured within specific AOIs indexing Working Memory Capacity (WMC) as a predictor of a diagnosis of ADHD in adults. Tree-based classifiers performed best in terms of predicting ADHD with 86% accuracy using physiological measures of sustained visual attention during a WMC task.","PeriodicalId":295028,"journal":{"name":"2019 IEEE 20th International Conference on Information Reuse and Integration for Data Science (IRI)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130353672","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 11
Evaluating Model Predictive Performance: A Medicare Fraud Detection Case Study
Richard A. Bauder, Matthew Herland, T. Khoshgoftaar
{"title":"Evaluating Model Predictive Performance: A Medicare Fraud Detection Case Study","authors":"Richard A. Bauder, Matthew Herland, T. Khoshgoftaar","doi":"10.1109/IRI.2019.00016","DOIUrl":"https://doi.org/10.1109/IRI.2019.00016","url":null,"abstract":"Evaluating a machine learning model's predictive performance is vital for establishing the practical usability in real-world applications. Using separate training and test datasets and using cross-validation are both common when evaluating machine learning models. The former uses two distinct datasets, whereas cross-validation splits a single dataset into smaller training and test subsets. In real-world production applications, it is critical to establish a model's usefulness by validating it on completely new input data, and not just using the cross-validation results on a single historical dataset. In this paper, we present results for both evaluation methods, including performance comparisons. In order to provide meaningful comparative analyses between methods, we perform real-world fraud detection experiments using 2013 to 2016 Medicare durable medical equipment claims data. This Medicare dataset is split into training (2013 to 2015 individual years) and test (2016 only). Using this Medicare case study, we assess the fraud detection performance, across three learners, for both model evaluation methods. We find that using the separate training and test sets generally outperforms cross-validation, indicating a better real-world model performance evaluation. Even so, cross-validation has comparable, but conservative, fraud detection results.","PeriodicalId":295028,"journal":{"name":"2019 IEEE 20th International Conference on Information Reuse and Integration for Data Science (IRI)","volume":"27 34","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114044032","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 6
Enhancing Software Requirements Cluster Labeling Using Wikipedia
S. Reddivari
{"title":"Enhancing Software Requirements Cluster Labeling Using Wikipedia","authors":"S. Reddivari","doi":"10.1109/IRI.2019.00031","DOIUrl":"https://doi.org/10.1109/IRI.2019.00031","url":null,"abstract":"Clustering plays an important role in reusable requirements retrieval from the ever-growing software project repositories. The literature on requirements cluster labeling is still emerging. Researchers have investigated clustering to support various software engineering activities such as requirements prioritization, feature identification, automated tracing, and code navigation. The primary task in analyzing the clustering results is to \"label\" the clusters by means of some representative words to summarize and comprehend the requirements data. Despite the development of automatic cluster labeling techniques for software requirements, very little is understood about enhancing the cluster labels using external knowledge sources such as Wikipedia. In this paper, we review the literature on enhancing cluster labeling, present a framework for requirements cluster labeling and conduct an experiment to evaluate how the Wikipedia-based enhancement performs in labeling requirements clusters. The results show that Wikipedia-based labeling outperforms traditional Information Retrieval (IR) techniques. Our work sheds light on improving automated ways to support information reuse and management in the context of requirements engineering (RE).","PeriodicalId":295028,"journal":{"name":"2019 IEEE 20th International Conference on Information Reuse and Integration for Data Science (IRI)","volume":"146 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124667562","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
EmoWei : Emotion-Oriented Personalized Weight Management System Based on Sentiment Analysis
Jihyeon Kim, Uran Oh
{"title":"EmoWei : Emotion-Oriented Personalized Weight Management System Based on Sentiment Analysis","authors":"Jihyeon Kim, Uran Oh","doi":"10.1109/IRI.2019.00060","DOIUrl":"https://doi.org/10.1109/IRI.2019.00060","url":null,"abstract":"A number of online communities and commercial apps exist to assist people with weight management. However, these systems are limited to logging and tracking meals or workouts without considering one's emotional state, which is known to have a strong impact on health (e.g., stress-related eating). To confirm the feasibility of monitoring emotion from personal logs such as online posts, we first conducted a Recurrent Neural Network (RNN) based sentiment analysis on 17,735 weight loss-related tweets and 200 posts from an online weight management community called FatSecret in comparison to general tweets. The results suggest that we can infer one's emotion based on their written text and their progress in managing weight. Based on the findings, we propose EmoWei, a new weight management system that integrates users' emotions to provide personalized assistance to achieve their weight loss goals with minimum stress.","PeriodicalId":295028,"journal":{"name":"2019 IEEE 20th International Conference on Information Reuse and Integration for Data Science (IRI)","volume":"51 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125211972","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
Texture Image Categorization in Wavelet Domain via Naive Bayes Classifier Based on Laplace and Generalized Gaussian Distribution
Muhammad Azam, N. Bouguila
{"title":"Texture Image Categorization in Wavelet Domain via Naive Bayes Classifier Based on Laplace and Generalized Gaussian Distribution","authors":"Muhammad Azam, N. Bouguila","doi":"10.1109/IRI.2019.00034","DOIUrl":"https://doi.org/10.1109/IRI.2019.00034","url":null,"abstract":"In this paper, we investigate a recently proposed feature extraction technique for texture image representation. In this method, features are extracted via a bounded Laplace mixture model (BLMM) in the wavelet domain. Because wavelet coefficients can be modeled accurately with a Laplace distribution, we propose classifiers based on this distribution, which leads us to introduce a Naive Bayes classifier with a Laplace distribution for image categorization. The proposed approach is validated through experiments on different texture image datasets, where it shows very good results compared to the model based on the Gaussian distribution. Because the generalized Gaussian distribution generalizes both the Laplace and Gaussian distributions, we also introduce a Naive Bayes classifier with a generalized Gaussian distribution to achieve better performance than the above two models. This approach is also validated through extensive experiments, and we observe that, by taking the nature of the data into account, the proposed models perform very well. Classification results are reported using different performance metrics to demonstrate the effectiveness of the proposed algorithms in texture image classification.","PeriodicalId":295028,"journal":{"name":"2019 IEEE 20th International Conference on Information Reuse and Integration for Data Science (IRI)","volume":"12 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115284841","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Fake News Detection Using Bayesian Inference
Fatma Najar, Nuha Zamzami, N. Bouguila
{"title":"Fake News Detection Using Bayesian Inference","authors":"Fatma Najar, Nuha Zamzami, N. Bouguila","doi":"10.1109/IRI.2019.00066","DOIUrl":"https://doi.org/10.1109/IRI.2019.00066","url":null,"abstract":"Given the huge volume of information available on social media, distinguishing false information from real information is a challenging task. In fact, several statistical models dealing with this problem are based on multinomial distributions. However, a new family of distributions that is an exponential family approximation to the Dirichlet Compound Multinomial (EDCM) has been introduced to be more adjustable to high-dimensional data and to overcome the drawbacks of the multinomial assumption. Thus, in this paper, we tackle the problem of fake news detection using finite mixture models of EDCM distributions. In particular, we develop a Bayesian approach based on Markov Chain Monte Carlo and the Metropolis-Hastings algorithm for the learning of these mixture models. The proposed method is validated via extensive simulations and a comparison with multinomial-based mixture models is provided.","PeriodicalId":295028,"journal":{"name":"2019 IEEE 20th International Conference on Information Reuse and Integration for Data Science (IRI)","volume":"122 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115433151","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 9
Machine Learning for Classification of Economic Recessions
Bruce Jackson, M. Rege
{"title":"Machine Learning for Classification of Economic Recessions","authors":"Bruce Jackson, M. Rege","doi":"10.1109/IRI.2019.00019","DOIUrl":"https://doi.org/10.1109/IRI.2019.00019","url":null,"abstract":"The ability to quickly and accurately classify economic activity into periods of recession and expansion is of great interest to economists and policy makers. Machine Learning methods can potentially be applied to the classification of business cycles. This paper describes two machine learning methods, K-Nearest Neighbor and Neural Networks, and compares them to a Dynamic Factor Markov Switching model for determining business cycle turning points. We conclude that machine learning techniques can offer more accurate classifiers that are worthy of additional study.","PeriodicalId":295028,"journal":{"name":"2019 IEEE 20th International Conference on Information Reuse and Integration for Data Science (IRI)","volume":"20 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114776401","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1