2015 IEEE 14th International Conference on Machine Learning and Applications (ICMLA): Latest Articles

Multi-query Optimization in Federated Databases Using Evolutionary Algorithm
Sameen Mansha, F. Kamiran
{"title":"Multi-query Optimization in Federated Databases Using Evolutionary Algorithm","authors":"Sameen Mansha, F. Kamiran","doi":"10.1109/ICMLA.2015.125","DOIUrl":"https://doi.org/10.1109/ICMLA.2015.125","url":null,"abstract":"Multi Query Optimization in federated database systems is a well-studied area. Studies have shown that similar problem arises in wide range of applications, e.g., distributed stream processing systems and wireless sensor networks. In this paper, a general distributed multiquery processing problem motivated by the need to speedup data acquisition in federated databases using evolutionary algorithm is studied. We setup a simple framework in which each individual in population is evolved in terms of cost, uniform labeling of hyper edges and validity of resource constraints through a number of generations. Variations of our general problem can be shown to be NP-Hard. Our extensive empirical evaluation over five different synthetic datasets shows a significant improvement of 8 percent in results as compared to the state-of-the-art methods.","PeriodicalId":288427,"journal":{"name":"2015 IEEE 14th International Conference on Machine Learning and Applications (ICMLA)","volume":"27 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127831019","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 6
Population Migration Using Dominance in Multi-population Cultural Algorithms
Santosh Upadhyayula, Ziad Kobti
{"title":"Population Migration Using Dominance in Multi-population Cultural Algorithms","authors":"Santosh Upadhyayula, Ziad Kobti","doi":"10.1109/ICMLA.2015.102","DOIUrl":"https://doi.org/10.1109/ICMLA.2015.102","url":null,"abstract":"In this study we introduce a new method to enable the migration of individuals from one population to another using the concept of dominance in Multi-Population Cultural Algorithms (MPCA's). The MPCA's artificial population comprises of agents that belong to a certain sub-population. Multiple sub-populations are generated, each running its own Cultural Algorithm (CA). In this work we create a dominance-MPCA (D-MPCA) with a network of populations that implements a dominance strategy. We hypothesize that the evolutionary advantage of dominance can help improve the performance of MPCA in general optimization problems. The Sphere function from the CEC 2013 benchmark optimization functions is used to calculate the fitness value of the individuals. We observe how the populations adapt to the changes. Preliminary results show improved performance in our proposed D-MPCA over traditional MPCA.","PeriodicalId":288427,"journal":{"name":"2015 IEEE 14th International Conference on Machine Learning and Applications (ICMLA)","volume":"411 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127599696","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
Probabilistic Graphical Models and Deep Belief Networks for Prognosis of Breast Cancer
M. Khademi, N. Nedialkov
{"title":"Probabilistic Graphical Models and Deep Belief Networks for Prognosis of Breast Cancer","authors":"M. Khademi, N. Nedialkov","doi":"10.1109/ICMLA.2015.196","DOIUrl":"https://doi.org/10.1109/ICMLA.2015.196","url":null,"abstract":"We propose a probabilistic graphical model (PGM) for prognosis and diagnosis of breast cancer. PGMs are suitable for building predictive models in medical applications, as they are powerful tools for making decisions under uncertainty from big data with missing attributes and noisy evidence. Previous work relied mostly on clinical data to create a predictive model. Moreover, practical knowledge of an expert was needed to build the structure of a model, which may not be accurate. In our opinion, since cancer is basically a genetic disease, the integration of microarray and clinical data can improve the accuracy of a predictive model. However, since microarray data is high-dimensional, including genomic variables may lead to poor results for structure and parameter learning due to the curse of dimensionality and small sample size problems. We address these problems by applying manifold learning and a deep belief network (DBN) to microarray data. First, we construct a PGM and a DBN using clinical and microarray data, and extract the structure of the clinical model automatically by applying a structure learning algorithm to the clinical data. Then, we integrate these two models using softmax nodes. Extensive experiments using real-world databases, such as METABRIC and NKI, show promising results in comparison to Support Vector Machines (SVMs) and k-Nearest Neighbors (k-NN) classifiers, for classifying tumors and predicting events like recurrence and metastasis.","PeriodicalId":288427,"journal":{"name":"2015 IEEE 14th International Conference on Machine Learning and Applications (ICMLA)","volume":"69 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127400397","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 47
A Hybrid Method for Intrusion Detection
Yavuz Canbay, Ş. Sağiroğlu
{"title":"A Hybrid Method for Intrusion Detection","authors":"Yavuz Canbay, Ş. Sağiroğlu","doi":"10.1109/ICMLA.2015.197","DOIUrl":"https://doi.org/10.1109/ICMLA.2015.197","url":null,"abstract":"Intrusion Detection Systems (IDSs) are used to detect malicious actions on information systems such as computing and networking systems. Abnormal behaviors or activities on the network systems could be detected by security systems. But, conventional security systems such as anti-virus and firewall cannot be successful in many malicious actions. To overcome this problem, better and more intelligent IDS solutions are required. In this study, a hybrid approach was proposed to use to detect network attacks. Genetic Algorithm (GA) and K-Nearest Neighbor (KNN) methods were combined to model and detect the attacks. KNN was employed to classify the attacks and GA was used to select k neighbors of an attack sample. This hybrid system was first applied in intrusion detection field. The system provides advantages such as, decreasing dependency of full training data set and providing plausible solution for intrusion detection. The results showed that the proposed system provides better results than single system.","PeriodicalId":288427,"journal":{"name":"2015 IEEE 14th International Conference on Machine Learning and Applications (ICMLA)","volume":"76 2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114942266","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 30
The Effect of Dataset Size on Training Tweet Sentiment Classifiers
Joseph D. Prusa, T. Khoshgoftaar, Naeem Seliya
{"title":"The Effect of Dataset Size on Training Tweet Sentiment Classifiers","authors":"Joseph D. Prusa, T. Khoshgoftaar, Naeem Seliya","doi":"10.1109/ICMLA.2015.22","DOIUrl":"https://doi.org/10.1109/ICMLA.2015.22","url":null,"abstract":"Using automated methods of labeling tweet sentiment, large volumes of tweets can be labeled and used to train classifiers. Millions of tweets could be used to train a classifier, however, doing so is computationally expensive. Thus, it is valuable to establish how many tweets should be utilized to train a classifier, since using additional instances with no gain in performance is a waste of resources. In this study, we seek to find out how many tweets are needed before no significant improvements are observed for sentiment analysis when adding additional instances. We train and evaluate classifiers using C4.5 decision tree, Naïve Bayes, 5 Nearest Neighbor and Radial Basis Function Network, with seven datasets varying from 1000 to 243,000 instances. Models are trained using four runs of 5-fold cross validation. Additionally, we conduct statistical tests to verify our observations and examine the impact of limiting features using frequency. All learners were found to improve with dataset size, with Naïve Bayes being the best performing learner. We found that Naïve Bayes did not significantly benefit from using more than 81,000 instances. To the best of our knowledge, this is the first study to investigate how learners scale in respect to dataset size with results verified using statistical tests and multiple models trained for each learner and dataset size. Additionally, we investigated using feature frequency to greatly reduce data grid size with either a small increase or decrease in classifier performance depending on choice of learner.","PeriodicalId":288427,"journal":{"name":"2015 IEEE 14th International Conference on Machine Learning and Applications (ICMLA)","volume":"94 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133588966","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 42
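The abstract of the entry above mentions training with "four runs of 5-fold cross validation". As a rough illustration of what such an evaluation schedule looks like, here is a minimal sketch that only generates the index splits. It is not the authors' code: the function name `repeated_kfold`, the seed, and the round-robin fold assignment are assumptions made for this example.

```python
import random

def repeated_kfold(n_samples, n_folds=5, n_runs=4, seed=0):
    """Yield (run, fold, train_idx, test_idx) tuples.

    Hypothetical helper: each run reshuffles the sample indices,
    then deals them round-robin into n_folds disjoint folds.
    """
    rng = random.Random(seed)
    for run in range(n_runs):
        indices = list(range(n_samples))
        rng.shuffle(indices)
        # Fold i takes every n_folds-th index starting at position i.
        folds = [indices[i::n_folds] for i in range(n_folds)]
        for fold, test_idx in enumerate(folds):
            # Training set is the union of all other folds.
            train_idx = [i for j, f in enumerate(folds) if j != fold for i in f]
            yield run, fold, train_idx, test_idx

splits = list(repeated_kfold(1000))
print(len(splits))  # 4 runs x 5 folds
```

With the default arguments each learner would be trained and scored 20 times per dataset size, which is what makes a statistical comparison across sizes possible.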
BreakFast: Analyzing Celerity of News
Shuguang Wang, Eui-Hong Han
{"title":"BreakFast: Analyzing Celerity of News","authors":"Shuguang Wang, Eui-Hong Han","doi":"10.1109/ICMLA.2015.25","DOIUrl":"https://doi.org/10.1109/ICMLA.2015.25","url":null,"abstract":"In the hypercompetitive news market, news outlets race to break news first. In order to provide better breaking news service and improve the reader experience, news agencies need to understand how to identify bottlenecks and streamline their reporting and delivery processes. With that in mind, we built a system, BreakFast, to measure and compare the speed of delivery of breaking news from various news sources to readers. One of the primary challenges of this comparison is how to identify which breaking news items are about the same emerging event but reported by different news agencies with different headlines and content. To tackle this problem, we extracted keywords automatically from the content, identified important topics, and then developed a classification model. The model identifies the same breaking stories from multiple news sources with an accuracy of approximately 90%. We also proposed new metrics to evaluate the speed of breaking news services and built real-time dashboards to monitor performance over time. We deployed BreakFast into the breaking news service at The Washington Post. This integrated system narrowed in on bottlenecks in its breaking news generation and delivery process, and improved its breaking news service in terms of time by more than 50%.","PeriodicalId":288427,"journal":{"name":"2015 IEEE 14th International Conference on Machine Learning and Applications (ICMLA)","volume":"28 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116575222","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
Data-Driven Kernels via Semi-supervised Clustering on the Manifold
Jared Lundell, Charles DuHadway, D. Ventura
{"title":"Data-Driven Kernels via Semi-supervised Clustering on the Manifold","authors":"Jared Lundell, Charles DuHadway, D. Ventura","doi":"10.1109/ICMLA.2015.135","DOIUrl":"https://doi.org/10.1109/ICMLA.2015.135","url":null,"abstract":"We present an approach to transductive learning that employs semi-supervised clustering of all available data (both labeled and unlabeled) to produce a data-dependent SVM kernel. In the general case where the domain includes irrelevant or redundant attributes, we constrain the clustering to occur on the manifold prescribed by the data (both labeled and unlabeled). Empirical results show that the approach performs comparably to more traditional kernels while providing significant reduction in the number of support vectors used. Further, the kernel construction technique provides some of the benefits that would normally be provided by dimensionality reduction preprocessing step.","PeriodicalId":288427,"journal":{"name":"2015 IEEE 14th International Conference on Machine Learning and Applications (ICMLA)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131735096","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
A Family of Chisini Mean Based Jensen-Shannon Divergence Kernels
P. Sharma, Gary Holness, Y. Markushin, N. Melikechi
{"title":"A Family of Chisini Mean Based Jensen-Shannon Divergence Kernels","authors":"P. Sharma, Gary Holness, Y. Markushin, N. Melikechi","doi":"10.1109/ICMLA.2015.86","DOIUrl":"https://doi.org/10.1109/ICMLA.2015.86","url":null,"abstract":"Jensen-Shannon divergence is an effective method for measuring the distance between two probability distributions. When the difference between these two distributions is subtle, Jensen-Shannon divergence does not provide adequate separation to draw distinctions from subtly different distributions. We extend Jensen-Shannon divergence by reformulating it using alternate operators that provide different properties concerning robustness. Furthermore, we prove a number of important properties for this extension: the lower limits of its range, and its relationship to Shannon Entropy and Kullback-Leibler divergence. Finally, we propose a family of new kernels, based on Chisini mean Jensen-Shannon divergence, and demonstrate its utility in providing better SVM classification accuracy over RBF kernels for amino acid spectra. Because spectral methods capture phenomenon at subatomic levels, differences between complex compounds can often be subtle. While the impetus behind this work began with spectral data, the methods are generally applicable to domains where subtle differences are important.","PeriodicalId":288427,"journal":{"name":"2015 IEEE 14th International Conference on Machine Learning and Applications (ICMLA)","volume":"79 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134301803","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 12
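For readers unfamiliar with the quantity that the entry above extends, the following is a minimal sketch of the standard Jensen-Shannon divergence, which averages two Kullback-Leibler divergences against the arithmetic-mean mixture of the two distributions. It does not implement the Chisini mean variants from the paper, whose exact formulation is not given in the abstract; the function names and the choice of base-2 logarithms are assumptions made for this example.

```python
import math

def kl_divergence(p, q):
    """Kullback-Leibler divergence KL(p || q) in bits.

    Terms with p_i == 0 contribute nothing, by the usual convention.
    """
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def js_divergence(p, q):
    """Standard Jensen-Shannon divergence (arithmetic-mean mixture)."""
    # m is the pointwise arithmetic mean of the two distributions.
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)

print(js_divergence([0.5, 0.5], [0.5, 0.5]))  # identical distributions
print(js_divergence([1.0, 0.0], [0.0, 1.0]))  # disjoint support
```

With base-2 logarithms the value lies between 0 (identical distributions) and 1 (disjoint support). Two nearly identical distributions therefore score close to 0, which is the limited-separation behaviour the abstract sets out to address.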
Patient Identification for Telehealth Programs
Martha Ganser, Sauptik Dhar, Unmesh Kurup, Carlos Cunha, Aca Gacic
{"title":"Patient Identification for Telehealth Programs","authors":"Martha Ganser, Sauptik Dhar, Unmesh Kurup, Carlos Cunha, Aca Gacic","doi":"10.1109/ICMLA.2015.100","DOIUrl":"https://doi.org/10.1109/ICMLA.2015.100","url":null,"abstract":"Telehealth provides an opportunity to reduce healthcare costs through remote patient monitoring, but is not appropriate for all individuals. Our goal was to identify the patients for whom telehealth has the greatest impact. Challenges included the high variability of medical costs and the effect of selection bias on the cost difference between intervention patients and controls. Using Medicare claims data, we computed cost savings by comparing each telehealth patient to a group of control patients who had similar healthcare resource utilization. These estimates were then used to train a predictive model using logistic regression. Filtering the patients based on the model resulted in an average cost savings of $10K, an improvement over the current expected loss of $2K (without filtering).","PeriodicalId":288427,"journal":{"name":"2015 IEEE 14th International Conference on Machine Learning and Applications (ICMLA)","volume":"70 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133214603","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
NewsCubeSum: A Personalized Multidimensional News Update Summarization System
Dingding Wang, Lei Li, Tao Li
{"title":"NewsCubeSum: A Personalized Multidimensional News Update Summarization System","authors":"Dingding Wang, Lei Li, Tao Li","doi":"10.1109/ICMLA.2015.129","DOIUrl":"https://doi.org/10.1109/ICMLA.2015.129","url":null,"abstract":"Popular online publishers produce huge amount of news articles every day, so it is important to summarize the most up-to-the-minute information to help users quickly know the progresses of their interested news events. In this paper, we develop NewsCubeSum, a novel personalized news summarization system utilizing OLAP and supervised sentence selection techniques to generate brief summaries delivering news updates in multiple dimensions (such as time, entity, and topic). An illustrative case study and experimental results on summarization performance comparisons are provided to show the effectiveness of NewsCubeSum.","PeriodicalId":288427,"journal":{"name":"2015 IEEE 14th International Conference on Machine Learning and Applications (ICMLA)","volume":"466 2","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"113982412","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0