Extreme Large Margin Distribution Machine and its applications for biomedical datasets
Zhiyong Yang, Jingcheng Lu, Taohong Zhang
2016 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), 2016-12-01
DOI: 10.1109/BIBM.2016.7822751 (https://doi.org/10.1109/BIBM.2016.7822751)
Citations: 6
Abstract
Classification methods have become increasingly popular for biomedical and bioinformatics data analysis. However, because data acquisition is difficult, sometimes only small-scale datasets are available, which may lead to unsatisfactory generalization performance. For SVM-like algorithms, large margin theory offers a way out of this dilemma. Recent studies on large margin theory show that, besides maximizing the minimum margin on a given training dataset, it is also necessary to optimize the margin distribution to improve overall generalization ability. Accordingly, an SVM-like algorithm called the Large Margin Distribution Machine (LDM) realizes this idea by simultaneously maximizing the margin mean and minimizing the margin variance, and a series of applications has since been reported. Another well-known machine learning algorithm, the Extreme Learning Machine (ELM), shares a similar framework with SVM. We believe that ELM could also benefit from margin distribution optimization. With this in mind, this paper proposes a novel algorithm called the Extreme Large Margin Distribution Machine (ELDM), which combines the advantages of ELM and LDM. An efficient extension of ELDM to multi-class classification under the One-vs-All scheme is then proposed. Finally, experimental results on both benchmark datasets and biomedical classification datasets show the effectiveness of the proposed algorithm.
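The abstract does not state the ELDM optimization problem, so the following is only a minimal sketch of the general idea, not the authors' formulation. It assumes a sigmoid ELM hidden layer with fixed random weights, and a simplified objective, 0.5*||beta||^2 + lam_var * (margin variance) - lam_mean * (margin mean), that omits any hinge-loss term such as the one used in LDM. All function names and the parameters `lam_var` and `lam_mean` are hypothetical. Under these assumptions the output weights have a closed form, and the One-vs-All extension trains one output vector per class.

```python
import numpy as np


def elm_features(X, W, b):
    """ELM-style hidden layer: fixed random weights, sigmoid activation."""
    return 1.0 / (1.0 + np.exp(-(X @ W + b)))


def margin_stats(H, y, beta):
    """Mean and (population) variance of the margins y_i * (h_i . beta)."""
    margins = y * (H @ beta)
    return margins.mean(), margins.var()


def objective(H, y, beta, lam_var=1.0, lam_mean=1.0):
    """Simplified objective (assumption): 0.5*||beta||^2 + lam_var*var - lam_mean*mean."""
    mean, var = margin_stats(H, y, beta)
    return 0.5 * beta @ beta + lam_var * var - lam_mean * mean


def fit_binary(H, y, lam_var=1.0, lam_mean=1.0):
    """Closed-form minimizer of the simplified objective; labels must be -1 or +1."""
    n, n_hidden = H.shape
    Z = y[:, None] * H            # row i is y_i * h_i, so margins = Z @ beta
    z_bar = Z.mean(axis=0)        # margin mean = z_bar . beta
    Zc = Z - z_bar                # margin variance = beta^T (Zc^T Zc / n) beta
    # Zero gradient: (I + 2*lam_var*Zc^T Zc / n) beta = lam_mean * z_bar
    A = np.eye(n_hidden) + 2.0 * lam_var * (Zc.T @ Zc) / n
    return np.linalg.solve(A, lam_mean * z_bar)


def fit_one_vs_all(H, labels, **kw):
    """Train one output vector per class, each as class-vs-rest."""
    classes = np.unique(labels)
    B = np.column_stack(
        [fit_binary(H, np.where(labels == c, 1.0, -1.0), **kw) for c in classes]
    )
    return classes, B


def predict(H, classes, B):
    """Assign each sample to the class with the largest output score."""
    return classes[np.argmax(H @ B, axis=1)]
```

Because the simplified objective is strictly convex in `beta`, the linear solve returns its unique minimizer. A fuller treatment would add a loss term on misclassified samples, which generally removes the closed form and requires an iterative solver.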