Two approaches of using heavy tails in high dimensional EDA
Momodou L. Sanyang, Hanno Muehlbrandt, A. Kabán
2014 IEEE International Conference on Data Mining Workshop, published 2014-12-01
DOI: 10.1109/ICDMW.2014.184 (https://doi.org/10.1109/ICDMW.2014.184)
Citations: 2
Abstract
We consider the problem of high-dimensional black-box optimisation via Estimation of Distribution Algorithms (EDAs). Most EDA methods use the Gaussian distribution as their search operator. However, there are indications in the literature that heavy-tailed distributions may perform better because of their greater capacity for exploration. Univariate heavy-tailed distributions have already been proposed for high-dimensional problems. In 2D problems, it has been reported that a multivariate heavy-tailed search distribution (such as the Cauchy) can combine the strengths of multivariate modelling with high exploration power. In this paper, we study whether a similar scheme works well in high-dimensional search problems. To get around the difficulty of multivariate model building in high dimensions, we employ a recently proposed random projections (RP) ensemble-based approach, which we modify to draw samples from a multivariate Cauchy using the scale-mixture representation of the Cauchy distribution. Our experiments show that the resulting RP-based multivariate Cauchy EDA consistently improves on the performance of the univariate Cauchy search distribution. Intriguingly, however, the RP-based multivariate Gaussian EDA performs best among these methods. It appears that the highly explorative nature of multivariate Cauchy sampling is exacerbated in high-dimensional search spaces, so that the population-based search loses its focus and effectiveness. Finally, we present an idea for increasing exploration while maintaining exploitation and focus: using the RP-based multivariate Gaussian EDA with RP matrices whose entries are drawn i.i.d. from a heavy-tailed distribution. This achieves improved performance and is competitive with the state of the art.
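The scale-mixture representation mentioned in the abstract can be illustrated with a minimal sketch. It relies on the standard fact that a multivariate Cauchy is a multivariate Student-t with one degree of freedom, so a Gaussian draw divided by the square root of an independent chi-squared(1) variable has a multivariate Cauchy distribution. The function name `sample_multivariate_cauchy`, its parameters, and the use of NumPy are assumptions made for illustration. The sketch covers only the sampling step and does not attempt to reproduce the paper's RP ensemble procedure.

```python
import numpy as np


def sample_multivariate_cauchy(mu, sigma, n_samples, rng=None):
    """Draw samples from a multivariate Cauchy via a Gaussian scale mixture.

    mu        : location vector, shape (d,)
    sigma     : positive semi-definite scale matrix, shape (d, d)
    n_samples : number of points to draw
    rng       : optional numpy.random.Generator (hypothetical convenience argument)
    """
    rng = np.random.default_rng() if rng is None else rng
    mu = np.asarray(mu, dtype=float)
    d = mu.shape[0]

    # Gaussian component: z ~ N(0, sigma), one row per sample.
    z = rng.multivariate_normal(np.zeros(d), sigma, size=n_samples)

    # Mixing variable: u ~ chi-squared with 1 degree of freedom.
    # One scalar per sample, shared by all coordinates of that sample.
    u = rng.chisquare(df=1, size=n_samples)

    # x = mu + z / sqrt(u) is multivariate Student-t with 1 d.o.f., i.e. Cauchy.
    return mu + z / np.sqrt(u)[:, None]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = sample_multivariate_cauchy(np.zeros(3), np.eye(3), 5, rng=rng)
    print(x.shape)
```

Because a single mixing variable rescales the whole Gaussian vector, the coordinates of each sample are dependent even when `sigma` is diagonal. That distinguishes this from drawing each coordinate from an independent univariate Cauchy.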