提高少数群体比例对泛化有何影响?关于群体失衡的单隐藏层神经网络理论研究

IF 8.7 1区 工程技术 Q1 ENGINEERING, ELECTRICAL & ELECTRONIC
Hongkang Li;Shuai Zhang;Yihua Zhang;Meng Wang;Sijia Liu;Pin-Yu Chen
{"title":"提高少数群体比例对泛化有何影响?关于群体失衡的单隐藏层神经网络理论研究","authors":"Hongkang Li;Shuai Zhang;Yihua Zhang;Meng Wang;Sijia Liu;Pin-Yu Chen","doi":"10.1109/JSTSP.2024.3374593","DOIUrl":null,"url":null,"abstract":"Group imbalance has been a known problem in empirical risk minimization (ERM), where the achieved high \n<italic>average</i>\n accuracy is accompanied by low accuracy in a \n<italic>minority</i>\n group. Despite algorithmic efforts to improve the minority group accuracy, a theoretical generalization analysis of ERM on individual groups remains elusive. By formulating the group imbalance problem with the Gaussian Mixture Model, this paper quantifies the impact of individual groups on the sample complexity, the convergence rate, and the average and group-level testing performance. Although our theoretical framework is centered on binary classification using a one-hidden-layer neural network, to the best of our knowledge, we provide the first theoretical analysis of the group-level generalization of ERM in addition to the commonly studied average generalization performance. Sample insights of our theoretical results include that when all group-level co-variance is in the medium regime and all mean are close to zero, the learning performance is most desirable in the sense of a small sample complexity, a fast training rate, and a high average and group-level testing accuracy. Moreover, we show that increasing the fraction of the minority group in the training data does not necessarily improve the generalization performance of the minority group. Our theoretical results are validated on both synthetic and empirical datasets, such as CelebA and CIFAR-10 in image classification.","PeriodicalId":13038,"journal":{"name":"IEEE Journal of Selected Topics in Signal Processing","volume":"18 2","pages":"216-231"},"PeriodicalIF":8.7000,"publicationDate":"2024-03-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"How Does Promoting the Minority Fraction Affect Generalization? A Theoretical Study of One-Hidden-Layer Neural Network on Group Imbalance\",\"authors\":\"Hongkang Li;Shuai Zhang;Yihua Zhang;Meng Wang;Sijia Liu;Pin-Yu Chen\",\"doi\":\"10.1109/JSTSP.2024.3374593\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Group imbalance has been a known problem in empirical risk minimization (ERM), where the achieved high \\n<italic>average</i>\\n accuracy is accompanied by low accuracy in a \\n<italic>minority</i>\\n group. Despite algorithmic efforts to improve the minority group accuracy, a theoretical generalization analysis of ERM on individual groups remains elusive. By formulating the group imbalance problem with the Gaussian Mixture Model, this paper quantifies the impact of individual groups on the sample complexity, the convergence rate, and the average and group-level testing performance. Although our theoretical framework is centered on binary classification using a one-hidden-layer neural network, to the best of our knowledge, we provide the first theoretical analysis of the group-level generalization of ERM in addition to the commonly studied average generalization performance. Sample insights of our theoretical results include that when all group-level co-variance is in the medium regime and all mean are close to zero, the learning performance is most desirable in the sense of a small sample complexity, a fast training rate, and a high average and group-level testing accuracy. Moreover, we show that increasing the fraction of the minority group in the training data does not necessarily improve the generalization performance of the minority group. Our theoretical results are validated on both synthetic and empirical datasets, such as CelebA and CIFAR-10 in image classification.\",\"PeriodicalId\":13038,\"journal\":{\"name\":\"IEEE Journal of Selected Topics in Signal Processing\",\"volume\":\"18 2\",\"pages\":\"216-231\"},\"PeriodicalIF\":8.7000,\"publicationDate\":\"2024-03-07\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEE Journal of Selected Topics in Signal Processing\",\"FirstCategoryId\":\"5\",\"ListUrlMain\":\"https://ieeexplore.ieee.org/document/10462147/\",\"RegionNum\":1,\"RegionCategory\":\"工程技术\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"ENGINEERING, ELECTRICAL & ELECTRONIC\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Journal of Selected Topics in Signal Processing","FirstCategoryId":"5","ListUrlMain":"https://ieeexplore.ieee.org/document/10462147/","RegionNum":1,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"ENGINEERING, ELECTRICAL & ELECTRONIC","Score":null,"Total":0}
引用次数: 0

摘要

群体不平衡一直是经验风险最小化(ERM)中的一个已知问题,即在获得高平均准确率的同时,少数群体的准确率却很低。尽管在算法上努力提高少数群体的准确率,但在单个群体上对 ERM 的理论概括分析仍未实现。通过用高斯混杂模型提出组不平衡问题,本文量化了单个组对样本复杂度、收敛速度、平均测试性能和组级测试性能的影响。虽然我们的理论框架以使用单隐层神经网络的二元分类为中心,但据我们所知,除了通常研究的平均泛化性能外,我们还首次对 ERM 的组级泛化进行了理论分析。我们的理论结果的样本启示包括:当所有组级共变都处于中等水平且所有均值都接近于零时,学习性能最理想,即样本复杂度小、训练速度快、平均和组级测试精度高。此外,我们还证明,增加训练数据中少数群体的比例并不一定能提高少数群体的泛化性能。我们的理论结果在合成数据集和经验数据集上都得到了验证,如图像分类中的 CelebA 和 CIFAR-10。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
How Does Promoting the Minority Fraction Affect Generalization? A Theoretical Study of One-Hidden-Layer Neural Network on Group Imbalance
Group imbalance has been a known problem in empirical risk minimization (ERM), where the achieved high average accuracy is accompanied by low accuracy in a minority group. Despite algorithmic efforts to improve the minority group accuracy, a theoretical generalization analysis of ERM on individual groups remains elusive. By formulating the group imbalance problem with the Gaussian Mixture Model, this paper quantifies the impact of individual groups on the sample complexity, the convergence rate, and the average and group-level testing performance. Although our theoretical framework is centered on binary classification using a one-hidden-layer neural network, to the best of our knowledge, we provide the first theoretical analysis of the group-level generalization of ERM in addition to the commonly studied average generalization performance. Sample insights of our theoretical results include that when all group-level co-variance is in the medium regime and all mean are close to zero, the learning performance is most desirable in the sense of a small sample complexity, a fast training rate, and a high average and group-level testing accuracy. Moreover, we show that increasing the fraction of the minority group in the training data does not necessarily improve the generalization performance of the minority group. Our theoretical results are validated on both synthetic and empirical datasets, such as CelebA and CIFAR-10 in image classification.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
IEEE Journal of Selected Topics in Signal Processing
IEEE Journal of Selected Topics in Signal Processing 工程技术-工程:电子与电气
CiteScore
19.00
自引率
1.30%
发文量
135
审稿时长
3 months
期刊介绍: The IEEE Journal of Selected Topics in Signal Processing (JSTSP) focuses on the Field of Interest of the IEEE Signal Processing Society, which encompasses the theory and application of various signal processing techniques. These techniques include filtering, coding, transmitting, estimating, detecting, analyzing, recognizing, synthesizing, recording, and reproducing signals using digital or analog devices. The term "signal" covers a wide range of data types, including audio, video, speech, image, communication, geophysical, sonar, radar, medical, musical, and others. The journal format allows for in-depth exploration of signal processing topics, enabling the Society to cover both established and emerging areas. This includes interdisciplinary fields such as biomedical engineering and language processing, as well as areas not traditionally associated with engineering.
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信