2017 16th IEEE International Conference on Machine Learning and Applications (ICMLA): Latest Papers

Conformal Prediction Using Random Survival Forests
Henrik Boström, L. Asker, R. Gurung, Isak Karlsson, Tony Lindgren, P. Papapetrou
{"title":"Conformal Prediction Using Random Survival Forests","authors":"Henrik Boström, L. Asker, R. Gurung, Isak Karlsson, Tony Lindgren, P. Papapetrou","doi":"10.1109/ICMLA.2017.00-57","DOIUrl":"https://doi.org/10.1109/ICMLA.2017.00-57","url":null,"abstract":"Random survival forests constitute a robust approach to survival modeling, i.e., predicting the probability that an event will occur before or on a given point in time. Similar to most standard predictive models, no guarantee for the prediction error is provided for this model, which instead typically is empirically evaluated. Conformal prediction is a rather recent framework, which allows the error of a model to be determined by a user specified confidence level, something which is achieved by considering set rather than point predictions. The framework, which has been applied to some of the most popular classification and regression techniques, is here for the first time applied to survival modeling, through random survival forests. An empirical investigation is presented where the technique is evaluated on datasets from two real-world applications; predicting component failure in trucks using operational data and predicting survival and treatment of heart failure patients from administrative healthcare data. The experimental results show that the error levels indeed are very close to the provided confidence levels, as guaranteed by the conformal prediction framework, and that the error for predicting each outcome, i.e., event or no-event, can be controlled separately. The latter may, however, lead to less informative predictions, i.e., larger prediction sets, in case the class distribution is heavily imbalanced.","PeriodicalId":6636,"journal":{"name":"2017 16th IEEE International Conference on Machine Learning and Applications (ICMLA)","volume":"96 1","pages":"812-817"},"PeriodicalIF":0.0,"publicationDate":"2017-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85230241","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 7
Predicting Hotel Bookings Cancellation with a Machine Learning Classification Model
N. António, Ana de Almeida, Luís Nunes
{"title":"Predicting Hotel Bookings Cancellation with a Machine Learning Classification Model","authors":"N. António, Ana de Almeida, Luís Nunes","doi":"10.1109/ICMLA.2017.00-11","DOIUrl":"https://doi.org/10.1109/ICMLA.2017.00-11","url":null,"abstract":"Booking cancellations have significant impact on demand-management decisions in the hospitality industry. To mitigate the effect of cancellations, hotels implement rigid cancellation policies and overbooking tactics, which in turn can have a negative impact on revenue and on the hotel reputation. To reduce this impact, a machine learning based system prototype was developed. It makes use of the hotel’s Property Management Systems data and trains a classification model every day to predict which bookings are “likely to cancel” and with that calculate net demand. This prototype, deployed in a production environment in two hotels, by enforcing A/B testing, also enables the measurement of the impact of actions taken to act upon bookings predicted as “likely to cancel”. Results indicate good prototype performance and provide important indications for research progress whilst evidencing that bookings contacted by hotels cancel less than bookings not contacted.","PeriodicalId":6636,"journal":{"name":"2017 16th IEEE International Conference on Machine Learning and Applications (ICMLA)","volume":"23 1","pages":"1049-1054"},"PeriodicalIF":0.0,"publicationDate":"2017-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84690681","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 21
Modeling Over-Dispersion for Network Data Clustering
Lu Wang, D. Zhu, Ming Dong, Yan Li
{"title":"Modeling Over-Dispersion for Network Data Clustering","authors":"Lu Wang, D. Zhu, Ming Dong, Yan Li","doi":"10.1109/ICMLA.2017.0-180","DOIUrl":"https://doi.org/10.1109/ICMLA.2017.0-180","url":null,"abstract":"Over-dispersed network data mining has emerged as a central theme in data science, evident by a sharp increase in the volume of real-world network data with imbalanced clusters. While most of existing clustering methods are designed for discovering the number of clusters and class specific connectivity patterns, few methods are available to uncover the imbalanced clusters, commonly existing in network communities and image segments. In this paper, we propose a generalized probabilistic modeling framework, SizeConnectivity, to estimate over-dispersed cluster size distribution together with class specific connectivity patterns from network data. We performed extensive synthetic and real-world experiments on clustering social network data and image data for detecting network communities and image segments. Our results demonstrate a superior performance of our SizeConnectivity clustering method in recovering the hidden structure of network data via modeling over-dispersion.","PeriodicalId":6636,"journal":{"name":"2017 16th IEEE International Conference on Machine Learning and Applications (ICMLA)","volume":"55 1","pages":"42-49"},"PeriodicalIF":0.0,"publicationDate":"2017-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"87848311","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 3
MapReduce Based Classification for Fault Detection in Big Data Applications
M. O. Shafiq, Maryam Fekri, Rami Ibrahim
{"title":"MapReduce Based Classification for Fault Detection in Big Data Applications","authors":"M. O. Shafiq, Maryam Fekri, Rami Ibrahim","doi":"10.1109/ICMLA.2017.00-89","DOIUrl":"https://doi.org/10.1109/ICMLA.2017.00-89","url":null,"abstract":"Recently emerging software applications are large, complex, distributed and data-intensive, i.e., big data applications. That makes the monitoring of such applications a challenging task due to the lack of standards and techniques for modeling and analysis of execution data (i.e., logs) produced by such applications. Another challenge imposed by big data applications is that the execution data produced by such applications also has high volume, velocity, and variety, and requires high veracity and value. In this paper, we present our monitoring solution that performs real-time fault detection in big data applications. Our solution is two-fold. First, we prescribe a standard model for structuring execution logs. Second, we prescribe a Bayesian classification based analysis solution that is MapReduce compliant, distributed, parallel, single pass and incremental. That makes it possible for our proposed solution to be deployed and executed on cloud computing platforms to process logs produced by big data applications. We have carried out complexity, scalability, and usability analyses of our proposed solution to assess how efficiently and effectively it can perform fault detection in big data applications.","PeriodicalId":6636,"journal":{"name":"2017 16th IEEE International Conference on Machine Learning and Applications (ICMLA)","volume":"87 1","pages":"637-642"},"PeriodicalIF":0.0,"publicationDate":"2017-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85884154","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
Comparing Transfer Learning and Traditional Learning Under Domain Class Imbalance
Karl R. Weiss, T. Khoshgoftaar
{"title":"Comparing Transfer Learning and Traditional Learning Under Domain Class Imbalance","authors":"Karl R. Weiss, T. Khoshgoftaar","doi":"10.1109/ICMLA.2017.0-138","DOIUrl":"https://doi.org/10.1109/ICMLA.2017.0-138","url":null,"abstract":"Transfer learning is a subclass of machine learning, which uses training data (source) drawn from a different domain than that of the testing data (target). A transfer learning environment is characterized by the unavailability of labeled data from the target domain, due to data being rare or too expensive to obtain. However, there exists abundant labeled data from a different, but similar domain. These two domains are likely to have different distribution characteristics. Transfer learning algorithms attempt to align the distribution characteristics of the source and target domains to create high-performance classifiers. This paper provides comparative performance analysis between state-of-the-art transfer learning algorithms and traditional machine learning algorithms under the domain class imbalance condition. The domain class imbalance condition is characterized by the source and target domains having different class probabilities, which can create marginal distribution differences between the source and target data. Statistical analysis is provided to show the significance of the results.","PeriodicalId":6636,"journal":{"name":"2017 16th IEEE International Conference on Machine Learning and Applications (ICMLA)","volume":"85 1","pages":"337-343"},"PeriodicalIF":0.0,"publicationDate":"2017-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"76173440","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 11
Feature Extraction and K-means Clustering Approach to Explore Important Features of Urban Identity
Mei-Chih Chang, Peter Bus, G. Schmitt
{"title":"Feature Extraction and K-means Clustering Approach to Explore Important Features of Urban Identity","authors":"Mei-Chih Chang, Peter Bus, G. Schmitt","doi":"10.1109/ICMLA.2017.00015","DOIUrl":"https://doi.org/10.1109/ICMLA.2017.00015","url":null,"abstract":"Public spaces play an important role in the processes of formation, generation and change of urban identity. Under present day conditions, the identities of cities are rapidly deteriorating and vanishing. Therefore, the importance of urban design, which is a means of designing urban spaces and their physical and social aspects, is ever growing. This paper proposes a novel methodology that uses Principal Component Analysis (PCA) and a K-means clustering approach to find important features of the urban identity from public space. K. Lynch’s work and Space Syntax theory are reconstructed and integrated with POI (Points of Interest) to quantify the quality of the public space. A case study of Zürich city is used to test these redefinitions and features of urban identity. The results show that PCA and the K-means clustering approach can identify the urban identity and explore important features. This strategy could help to improve planning and design processes and generation of new urban patterns with more appropriate features and qualities.","PeriodicalId":6636,"journal":{"name":"2017 16th IEEE International Conference on Machine Learning and Applications (ICMLA)","volume":"21 1","pages":"1139-1144"},"PeriodicalIF":0.0,"publicationDate":"2017-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80071373","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 14
Histogram-Based Asymmetric Relabeling for Learning from Only Positive and Unlabeled Data
Tom Arjannikov, G. Tzanetakis
{"title":"Histogram-Based Asymmetric Relabeling for Learning from Only Positive and Unlabeled Data","authors":"Tom Arjannikov, G. Tzanetakis","doi":"10.1109/ICMLA.2017.000-8","DOIUrl":"https://doi.org/10.1109/ICMLA.2017.000-8","url":null,"abstract":"In this paper, we demonstrate how to use asymmetric data relabeling based on feature histograms as a pre-processing step for improving the overall classification performance of different classifiers in situations when only positive and unlabeled data is available. Additionally, this strategy can be used to identify with some level of confidence those data instances that should probably be labeled as positive. Moreover, this approach can be adapted to assess the quality of a given dataset, in terms of how many positive instances are not labeled. We examine our approach using synthetic data and demonstrate its applicability using real, publicly available data.","PeriodicalId":6636,"journal":{"name":"2017 16th IEEE International Conference on Machine Learning and Applications (ICMLA)","volume":"21 1","pages":"1065-1070"},"PeriodicalIF":0.0,"publicationDate":"2017-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"77118183","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Two-phase Parallel Learning to Identify Similar Structures Among Relational Databases
Debora G. Reis, Rommel N. Carvalho, Ricardo Silva Carvalho, M. Ladeira
{"title":"Two-phase Parallel Learning to Identify Similar Structures Among Relational Databases","authors":"Debora G. Reis, Rommel N. Carvalho, Ricardo Silva Carvalho, M. Ladeira","doi":"10.1109/ICMLA.2017.00-17","DOIUrl":"https://doi.org/10.1109/ICMLA.2017.00-17","url":null,"abstract":"The need for efficient techniques for dealing with large databases increases as the number of large databases grows. We propose a new two-phase parallel learning approach to quickly identify similar structures of relational databases. Each phase represents a level of relational metadata aggregation. To test the approach, we conducted an experiment with several large databases of the Ministry of Social Development of Brazil to classify which relational databases have a similar structure of tables and columns, based on their metadata. The similarity measures considered were Levenshtein and cosine. Generalized Linear Model, Random Forest, and Gradient Boost Machines (GBM) techniques are applied to develop the model. Each model was executed with sequential and parallel processing, and the performance was compared. The parallel execution of GBM was at least ten times faster than the sequential processing. The results encourage further applications of the propositional parallel learning in relational databases.","PeriodicalId":6636,"journal":{"name":"2017 16th IEEE International Conference on Machine Learning and Applications (ICMLA)","volume":"458 1","pages":"1020-1023"},"PeriodicalIF":0.0,"publicationDate":"2017-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"77045276","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 3
A Noise Prediction and Time-Domain Subtraction Approach to Deep Neural Network Based Speech Enhancement
B. O. Odelowo, David V. Anderson
{"title":"A Noise Prediction and Time-Domain Subtraction Approach to Deep Neural Network Based Speech Enhancement","authors":"B. O. Odelowo, David V. Anderson","doi":"10.1109/ICMLA.2017.0-133","DOIUrl":"https://doi.org/10.1109/ICMLA.2017.0-133","url":null,"abstract":"Deep neural networks (DNNs) have recently been successfully applied to the speech enhancement task; however, the low signal-to-noise ratio (SNR) performance of DNN-based speech enhancement systems remains less than desirable. In this paper, we study an approach to DNN-based speech enhancement based on noise prediction. Three speech enhancement models based on noise prediction are proposed, and their performance is compared to that of conventional spectral-mapping models in seen and unseen noise tests. Objective test results show that the proposed noise prediction models perform well in enhancing speech quality in seen noise conditions and in enhancing high SNR speech signals. They also perform well in enhancing speech intelligibility in both seen and unseen noise conditions, but do not outperform the conventional models on quality metrics in unseen noise conditions. Further analysis of the enhanced speech signals is undertaken to explain the observed results.","PeriodicalId":6636,"journal":{"name":"2017 16th IEEE International Conference on Machine Learning and Applications (ICMLA)","volume":"24 1","pages":"372-377"},"PeriodicalIF":0.0,"publicationDate":"2017-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86914077","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 3
Forward Looking Sonar Scene Matching Using Deep Learning
P. Ribeiro, M. Santos, Paulo L. J. Drews-Jr, S. Botelho
{"title":"Forward Looking Sonar Scene Matching Using Deep Learning","authors":"P. Ribeiro, M. Santos, Paulo L. J. Drews-Jr, S. Botelho","doi":"10.1109/ICMLA.2017.00-99","DOIUrl":"https://doi.org/10.1109/ICMLA.2017.00-99","url":null,"abstract":"Optical images display drastically reduced visibility due to underwater turbidity conditions. Sonar imaging presents an alternative form of environment perception for underwater vehicles navigation, mapping and localization. In this work we present a novel method for Acoustic Scene Matching. Therefore, we developed and trained a new Deep Learning architecture designed to compare two acoustic images and decide if they correspond to the same underwater scene. The network is named Sonar Matching Network (SMNet). The acoustic images used in this paper were obtained by a Forward Looking Sonar during a Remotely Operated Vehicle (ROV) mission. A Geographic Positioning System provided the ROV position for the ground truth score which is used in the learning process of our network. The proposed method uses 36.000 samples of real data for validation. From a binary classification perspective, our method achieved 98% of accuracy when two given scenes have more than ten percent of intersection.","PeriodicalId":6636,"journal":{"name":"2017 16th IEEE International Conference on Machine Learning and Applications (ICMLA)","volume":"59 1","pages":"574-579"},"PeriodicalIF":0.0,"publicationDate":"2017-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90617255","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 17