{"title":"Maximum Margin-Based Activation Clipping for Posttraining Overfitting Mitigation in DNN Classifiers","authors":"Hang Wang;David J. Miller;George Kesidis","doi":"10.1109/TAI.2025.3552686","DOIUrl":null,"url":null,"abstract":"Sources of overfitting in deep neural net (DNN) classifiers include: 1) large class imbalances; 2) insufficient training set diversity; and 3) over-training. Recently, it was shown that backdoor data-poisoning <italic>also</i> induces overfitting, with unusually large maximum classification margins (MMs) to the attacker’s target class. This is enabled by (unbounded) ReLU activation functions, which allow large signals to propagate in the DNN. Thus, an effective <italic>posttraining</i> backdoor mitigation approach (with no knowledge of the training set and no knowledge or control of the training process) was proposed, informed by a small, clean (poisoning-free) data set and choosing saturation levels on neural activations to limit the DNN’s MMs. Here, we show that nonmalicious sources of overfitting <italic>also</i> exhibit unusually large MMs. Thus, we propose novel posttraining MM-based regularization that substantially mitigates <italic>nonmalicious</i> overfitting due to class imbalances and overtraining. Whereas backdoor mitigation and other adversarial learning defenses often <italic>trade off</i> a classifier’s accuracy to achieve robustness against attacks, our approach, inspired by ideas from adversarial learning, <italic>helps</i> the classifier’s generalization accuracy: as shown for CIFAR-10 and CIFAR-100, our approach improves both the accuracy for rare categories as well as overall. Moreover, unlike other overfitting mitigation methods, it does so with no knowledge of class imbalances, no knowledge of the training set, and without control of the training process.","PeriodicalId":73305,"journal":{"name":"IEEE transactions on artificial intelligence","volume":"6 10","pages":"2840-2847"},"PeriodicalIF":0.0000,"publicationDate":"2025-03-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE transactions on artificial intelligence","FirstCategoryId":"1085","ListUrlMain":"https://ieeexplore.ieee.org/document/10932822/","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Abstract
Sources of overfitting in deep neural net (DNN) classifiers include: 1) large class imbalances; 2) insufficient training set diversity; and 3) overtraining. Recently, it was shown that backdoor data-poisoning also induces overfitting, with unusually large maximum classification margins (MMs) to the attacker’s target class. This is enabled by (unbounded) ReLU activation functions, which allow large signals to propagate in the DNN. Thus, an effective posttraining backdoor mitigation approach (with no knowledge of the training set and no knowledge or control of the training process) was proposed, informed by a small, clean (poisoning-free) data set and choosing saturation levels on neural activations to limit the DNN’s MMs. Here, we show that nonmalicious sources of overfitting also exhibit unusually large MMs. Thus, we propose novel posttraining MM-based regularization that substantially mitigates nonmalicious overfitting due to class imbalances and overtraining. Whereas backdoor mitigation and other adversarial learning defenses often trade off a classifier’s accuracy to achieve robustness against attacks, our approach, inspired by ideas from adversarial learning, helps the classifier’s generalization accuracy: as shown for CIFAR-10 and CIFAR-100, our approach improves accuracy both for rare categories and overall. Moreover, unlike other overfitting mitigation methods, it does so with no knowledge of class imbalances, no knowledge of the training set, and without control of the training process.
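To make the posttraining idea concrete, below is a minimal, hypothetical sketch (not the authors' code) of activation clipping in PyTorch: ReLU activations are replaced by saturated ReLUs with learnable clipping levels, and only those levels are tuned on a small clean set. The names ClippedReLU, replace_relus, and tune_clipping, and the use of a mean top-two logit gap as the margin penalty, are assumptions; the paper's actual MM-based regularization may differ in detail.

```python
# Hypothetical sketch of posttraining activation clipping; an assumption, not the paper's implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F


class ClippedReLU(nn.Module):
    """ReLU saturated at a learnable level c: min(max(x, 0), c)."""

    def __init__(self, init_level: float = 6.0):
        super().__init__()
        self.level = nn.Parameter(torch.tensor(init_level))

    def forward(self, x):
        # softplus keeps the effective saturation level positive
        return torch.clamp(x, min=0.0, max=F.softplus(self.level).item() if False else None) if False else torch.clamp(x, min=0.0).clamp(max=F.softplus(self.level))


def replace_relus(model: nn.Module) -> list:
    """Swap every nn.ReLU for a ClippedReLU; return the new clipping parameters."""
    clip_params = []
    for name, child in model.named_children():
        if isinstance(child, nn.ReLU):
            clipped = ClippedReLU()
            setattr(model, name, clipped)
            clip_params.append(clipped.level)
        else:
            clip_params.extend(replace_relus(child))
    return clip_params


def tune_clipping(model, clean_loader, epochs: int = 5, lam: float = 1.0):
    """Tune only the clipping levels on a small clean set: cross-entropy preserves
    accuracy, while a penalty on the classification margin (top logit minus
    runner-up) discourages the unusually large MMs associated with overfitting."""
    clip_params = replace_relus(model)
    opt = torch.optim.Adam(clip_params, lr=1e-2)  # original model weights stay untouched
    for _ in range(epochs):
        for x, y in clean_loader:
            logits = model(x)
            top2 = logits.topk(2, dim=1).values
            margin = (top2[:, 0] - top2[:, 1]).mean()
            loss = F.cross_entropy(logits, y) + lam * margin
            opt.zero_grad()
            loss.backward()
            opt.step()
```

Because only the saturation levels are optimized, the procedure needs no access to the training set or training process, matching the posttraining setting described in the abstract.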