Authors: Salma Abdelmonem, Dina Elreedy, Samir I. Shaheen
Journal: Knowledge-Based Systems (JCR Q1, Computer Science, Artificial Intelligence; impact factor 7.2)
Published: 2024-09-07
DOI: 10.1016/j.knosys.2024.112500
URL: https://www.sciencedirect.com/science/article/pii/S0950705124011341
CIRA: Class imbalance resilient adaptive Gaussian process classifier
Class imbalance is pervasive across real-world applications and biases machine learning classifiers towards the majority classes. Algorithm-level balancing approaches adapt the learning algorithm itself to learn from imbalanced datasets while preserving the data’s original distribution. The Gaussian process classifier is a powerful classification algorithm; however, as with other standard classifiers, its classification performance can be degraded by class imbalance. In this work, we propose the Class Imbalance Resilient Adaptive Gaussian process classifier (CIRA), an algorithm-level adaptation of the binary Gaussian process classifier that alleviates class imbalance. To the best of our knowledge, CIRA is the first adaptive method that enables the Gaussian process classifier to handle unbalanced data. CIRA consists of two balancing modifications to the original classifier. The first modification balances the posterior mean approximation to learn a more balanced decision boundary between the majority and minority classes. The second modification adopts an asymmetric conditional prediction model to give more emphasis to the minority points during training. We conduct extensive experiments and statistical significance tests on forty-two real-world unbalanced datasets. In these experiments, CIRA surpasses six popular data sampling methods by an average of 2.29%, 3.25%, 3.67%, and 1.81% in terms of the geometric mean, F1-measure, Matthews correlation coefficient, and area under the receiver operating characteristic curve, respectively.
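The four metrics named in the abstract can be illustrated with a minimal pure-Python sketch. This is not the authors' evaluation code: the function names, the convention of returning 0.0 when a denominator is zero, and the toy labels are all assumptions made for illustration, and the minority class is assumed to be labelled 1.

```python
import math


def confusion_counts(y_true, y_pred, positive=1):
    """Count TP, FP, TN, FN, treating `positive` as the minority class."""
    tp = fp = tn = fn = 0
    for t, p in zip(y_true, y_pred):
        if p == positive:
            if t == positive:
                tp += 1
            else:
                fp += 1
        else:
            if t == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def g_mean(tp, fp, tn, fn):
    """Geometric mean of sensitivity and specificity."""
    sens = tp / (tp + fn) if tp + fn else 0.0
    spec = tn / (tn + fp) if tn + fp else 0.0
    return math.sqrt(sens * spec)


def f1_score(tp, fp, tn, fn):
    """Harmonic mean of precision and recall for the positive class."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def mcc(tp, fp, tn, fn):
    """Matthews correlation coefficient (0.0 by assumption if undefined)."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


def roc_auc(y_true, scores, positive=1):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair. Quadratic in the sample size,
    which is fine for a small illustration.
    """
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    if not pos or not neg:
        raise ValueError("AUC needs at least one sample from each class")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy data (hypothetical): 2 minority points, 8 majority points.
    y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
    y_pred = [0] * 10  # a classifier that always predicts the majority
    counts = confusion_counts(y_true, y_pred)
    accuracy = (counts[0] + counts[2]) / len(y_true)
    print("accuracy:", accuracy)
    print("G-mean:", g_mean(*counts))
    print("F1:", f1_score(*counts))
    print("MCC:", mcc(*counts))
```

On this toy data the always-majority predictor reaches 0.8 accuracy while its G-mean, F1, and MCC are all 0, which is one reason imbalance studies tend to report these metrics rather than plain accuracy.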
About the journal:
Knowledge-Based Systems, an international and interdisciplinary journal in artificial intelligence, publishes original, innovative, and creative research results in the field. It focuses on systems built on knowledge-based and other artificial intelligence techniques. The journal aims to support human prediction and decision-making through data science and computation techniques, provide balanced coverage of theory and practical study, and encourage the development and implementation of knowledge-based intelligence models, methods, systems, and software tools. Applications in business, government, education, engineering, and healthcare are emphasized.