Latest Publications: AAAI Workshop on Artificial Intelligence with Biased or Scarce Data (AIBSD)

Statement of Peer Review
AAAI Workshop on Artificial Intelligence with Biased or Scarce Data (AIBSD) Pub Date: 2022-05-31 DOI: 10.3390/cmsf2022003012
Kuan-Chuan Peng, Ziyan Wu
Citations: 0
Age Should Not Matter: Towards More Accurate Pedestrian Detection via Self-Training
AAAI Workshop on Artificial Intelligence with Biased or Scarce Data (AIBSD) Pub Date: 2022-05-24 DOI: 10.3390/cmsf2022003011
Shunsuke Kogure, Kai Watabe, Ryosuke Yamada, Y. Aoki, Akio Nakamura, Hirokatsu Kataoka
Abstract: Why is there a disparity in the miss rates of pedestrian detection between different age attributes? In this study, we propose to (i) improve the accuracy of pedestrian detection using our pre-trained model and (ii) explore the causes of this disparity. To improve detection accuracy, we extend a pedestrian detection pre-training dataset, the Weakly Supervised Pedestrian Dataset (WSPD), by means of self-training to construct our Self-Trained Person Dataset (STPD). Moreover, we hypothesise that the miss-rate disparity stems from three biases: (1) the apparent bias towards "adults" versus "children," (2) the training-data quantity bias against "children," and (3) the scale bias of the bounding box. In addition, we constructed an evaluation dataset by manually annotating "adult" and "child" bounding boxes in the INRIA Person Dataset. As a result, we confirm that the miss rate was reduced by up to 0.4% for adults and up to 3.9% for children. Finally, we discuss the impact of the size and appearance of the bounding boxes on the disparity in miss rates and provide an outlook for future research.
Citations: 3
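The listing does not spell out the self-training step; as a rough illustration only, here is a minimal generic sketch of confidence-thresholded pseudo-labeling for a detector. The `Detector` interface, `train` helper, dataset variables, and the 0.9 cutoff are all assumptions, not the authors' pipeline:

```python
# Generic self-training sketch: grow a detection training set with
# pseudo-labels taken from the model's own confident predictions.
# `train`, `detector.predict`, and the data structures are hypothetical.

CONF_THRESHOLD = 0.9  # assumed cutoff for keeping a pseudo-label

def self_train(detector, labeled_set, unlabeled_images, train, rounds=3):
    for _ in range(rounds):
        detector = train(detector, labeled_set)      # fit on current labels
        pseudo_labeled = []
        for image in unlabeled_images:
            boxes = detector.predict(image)          # candidate person boxes
            confident = [b for b in boxes if b.score >= CONF_THRESHOLD]
            if confident:
                pseudo_labeled.append((image, confident))
        labeled_set = labeled_set + pseudo_labeled   # extend the dataset
    return detector, labeled_set
```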
Dual Complementary Prototype Learning for Few-Shot Segmentation
AAAI Workshop on Artificial Intelligence with Biased or Scarce Data (AIBSD) Pub Date: 2022-04-29 DOI: 10.3390/cmsf2022003008
Q. Ren, Jie Chen
Abstract: Few-shot semantic segmentation aims to transfer knowledge from base classes with sufficient data to represent novel classes with limited few-shot samples. Recent methods follow a metric-learning framework with prototypes for foreground representation. However, they still face the challenge of segmenting novel classes due to inadequate representation of the foreground and a lack of discriminability between foreground and background. To address this problem, we propose the Dual Complementary prototype Network (DCNet). Firstly, we design a training-free Complementary Prototype Generation (CPG) module to extract comprehensive information from the mask region in the support image. Secondly, we design Background Guided Learning (BGL) as a complementary branch of the foreground segmentation branch, which enlarges the difference between the foreground and its corresponding background so that the representation of a novel class in the foreground can be more discriminative. Extensive experiments on PASCAL-5i and COCO-20i demonstrate that our DCNet achieves state-of-the-art performance.
Citations: 2
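The abstract does not detail the CPG module; for orientation, here is the standard prototype construction used in this line of work (masked average pooling plus cosine matching), sketched under the assumption of a single foreground prototype, not DCNet's dual complementary design:

```python
import torch
import torch.nn.functional as F

def masked_average_prototype(feat, mask):
    """Average support features inside the mask (masked average pooling).

    feat: (C, H, W) support feature map; mask: (H, W) binary foreground mask.
    """
    mask = mask.unsqueeze(0)                      # (1, H, W) for broadcasting
    proto = (feat * mask).sum(dim=(1, 2)) / mask.sum().clamp(min=1)
    return proto                                   # (C,) class prototype

def cosine_score_map(query_feat, proto):
    """Per-pixel cosine similarity between query features and a prototype."""
    q = F.normalize(query_feat, dim=0)            # (C, H, W), unit-norm channels
    p = F.normalize(proto, dim=0).view(-1, 1, 1)  # (C, 1, 1)
    return (q * p).sum(dim=0)                     # (H, W) similarity map
```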
Extracting Salient Facts from Company Reviews with Scarce Labels
AAAI Workshop on Artificial Intelligence with Biased or Scarce Data (AIBSD) Pub Date: 2022-04-29 DOI: 10.3390/cmsf2022003009
Jinfeng Li, Nikita Bhutani, Alexander Whedon, Chieh-Yang Huang, Estevam Hruschka, Yoshihiko Suhara
Abstract: In this paper, we propose the task of extracting salient facts from online company reviews. Salient facts present unique and distinctive information about a company, which helps users decide whether to apply to the company. We formulate salient fact extraction as a text classification problem and leverage pretrained language models to tackle it. However, the scarcity of salient facts in company reviews causes a serious label imbalance issue, which hinders taking full advantage of pretrained language models. To address the issue, we developed two data enrichment methods: first, representation enrichment, which highlights uncommon tokens by appending special tokens, and second, label propagation, which iteratively creates pseudo-positive examples from unlabeled data. Experimental results on an online company review corpus show that our approach improves the performance of pretrained language models by up to 0.24 in F1 score. We also confirm that our approach performs competitively against the state-of-the-art data augmentation method on the SemEval 2019 benchmark even when trained with only 20% of the training data.
Citations: 0
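As a toy illustration of the representation-enrichment idea (highlighting uncommon tokens by appending special tokens), here is a frequency-based sketch; the `[RARE]` marker and the cutoff of 5 are invented for illustration, since the listing does not specify them:

```python
from collections import Counter

def build_frequencies(corpus_tokens):
    """Count token occurrences across a tokenized corpus."""
    return Counter(t for doc in corpus_tokens for t in doc)

def enrich(tokens, freq, cutoff=5, marker="[RARE]"):
    """Append a special marker after each uncommon token."""
    out = []
    for t in tokens:
        out.append(t)
        if freq[t] < cutoff:        # uncommon token: highlight it
            out.append(marker)
    return out

freq = build_frequencies([["free", "snacks"], ["great", "pay"], ["snacks"]])
# All three input tokens are rare in this tiny corpus, so each gets a marker:
print(enrich(["free", "onsite", "gym"], freq))
# ['free', '[RARE]', 'onsite', '[RARE]', 'gym', '[RARE]']
```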
Super-Resolution for Brain MR Images from a Significantly Small Amount of Training Data
AAAI Workshop on Artificial Intelligence with Biased or Scarce Data (AIBSD) Pub Date: 2022-04-27 DOI: 10.3390/cmsf2022003007
Kumpei Ikuta, H. Iyatomi, K. Oishi, on behalf of the Alzheimer's Disease Neuroimaging Initiative
Data used in the preparation of this article were obtained from the Alzheimer's Disease Neuroimaging Initiative (ADNI); the investigators within the ADNI contributed to the design and implementation of ADNI and/or provided data but did not participate in the analysis or writing of this report.
Abstract: We propose two essential techniques to effectively train generative adversarial network-based super-resolution networks for brain magnetic resonance images, even when only a small number of training samples are available. First, stochastic patch sampling is proposed, which increases training samples by sampling many small patches from the input image. However, sampling patches and combining them causes unpleasant artifacts around patch boundaries. The second proposed method, an artifact-suppressing discriminator, suppresses the artifacts by taking a two-channel input containing an original high-resolution image and a generated image. With the introduction of the proposed techniques, the network achieved generation of natural-looking MR images from only ~40 training images and improved the area-under-curve score on Alzheimer's disease from 76.17% to 81.57%.
Citations: 1
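A minimal sketch of stochastic patch sampling as described (many random small crops per image); the array shapes, the 32-pixel patch size, and the patch count are assumptions:

```python
import numpy as np

def sample_patches(image, patch_size=32, n_patches=64, rng=None):
    """Draw many random small patches from one 2D image (or slice)."""
    rng = rng or np.random.default_rng()
    h, w = image.shape[:2]
    patches = []
    for _ in range(n_patches):
        y = rng.integers(0, h - patch_size + 1)   # random top-left corner
        x = rng.integers(0, w - patch_size + 1)
        patches.append(image[y:y + patch_size, x:x + patch_size])
    return np.stack(patches)                      # (n_patches, size, size)

# The artifact-suppressing discriminator would then see a two-channel
# input, e.g. np.stack([high_res_patch, generated_patch]): one real
# reference channel and one generated channel, so boundary artifacts
# introduced by recombining patches can be penalized.
```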
Quantifying Bias in a Face Verification System
AAAI Workshop on Artificial Intelligence with Biased or Scarce Data (AIBSD) Pub Date: 2022-04-20 DOI: 10.3390/cmsf2022003006
Megan Frisella, Pooya Khorrami, J. Matterer, K. Kratkiewicz, P. Torres-Carrasquillo
Abstract: Machine learning models perform face verification (FV) for a variety of highly consequential applications, such as biometric authentication, face identification, and surveillance. Many state-of-the-art FV systems suffer from unequal performance across demographic groups, which is commonly overlooked by evaluation measures that do not assess population-specific performance. Deployed systems with bias may result in serious harm against individuals or groups who experience underperformance. We explore several fairness definitions and metrics, attempting to quantify bias in Google's FaceNet model. In addition to statistical fairness metrics, we analyze clustered face embeddings produced by the FV model. We link well-clustered embeddings (well-defined, dense clusters) for a demographic group to biased model performance against that group. We present the intuition that FV systems underperform on protected demographic groups because they are less sensitive to differences between features within those groups, as evidenced by clustered embeddings. We show how this performance discrepancy results from a combination of representation and aggregation bias.
[Results excerpt] H0 death times for White face embeddings are later than for the other race groups (p < 0.05 for W×A, W×I, and W×B t-tests), indicating that White embeddings are more dispersed in the embedding space. The other race groups have peak death times that are taller and earlier than the White race group's. The shorter and wider peak for the White subgroup means that there is more variety (higher variance) in H0 death times, rather than the consistent peak around 0.8 with less variance seen for the other race groups. This shows that there is more variance in the White face distribution in the embedding space compared to other race groups, a trend that was not present in the centroid-distance distribution for race groups, which showed four bell-shaped density plots. Thus, our analysis of the H0 death times supports previous findings that the White race group is clustered differently from the other race groups. We note that there is less inequality in H0 death times for female vs. male faces, despite our p-value indicating that this discrepancy may be significant (p < 0.05).
Citations: 1
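One concrete form of the population-specific evaluation the abstract calls for is computing verification error rates per demographic group. The sketch below (per-group false non-match and false match rates at a fixed threshold) is an assumed example of such a metric, not the paper's exact metric suite:

```python
import numpy as np

def group_rates(scores, same_person, group, threshold):
    """Per-group FV error rates.

    scores: similarity score per face pair; same_person: bool per pair;
    group: demographic label per pair. Assumes every group contains
    both genuine and impostor pairs.
    """
    rates = {}
    for g in set(group):
        idx = [i for i, gi in enumerate(group) if gi == g]
        genuine = [scores[i] for i in idx if same_person[i]]
        impostor = [scores[i] for i in idx if not same_person[i]]
        fnmr = np.mean([s < threshold for s in genuine])    # missed matches
        fmr = np.mean([s >= threshold for s in impostor])   # false accepts
        rates[g] = (fnmr, fmr)
    return rates   # unequal rates across groups indicate biased performance
```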
DAP-SDD: Distribution-Aware Pseudo Labeling for Small Defect Detection
AAAI Workshop on Artificial Intelligence with Biased or Scarce Data (AIBSD) Pub Date: 2022-04-20 DOI: 10.3390/cmsf2022003005
Xiaoyan Zhuo, Wolfgang Rahfeldt, Xiaoqian Zhang, Ted Doros, S. Son
Abstract: Detecting defects, especially small ones, in the early manufacturing stages is critical to achieving a high yield in industrial applications. While numerous modern deep learning models can improve detection performance, they become less effective in detecting small defects in practical applications due to the scarcity of labeled data and significant class imbalance in multiple dimensions. In this work, we propose a distribution-aware pseudo-labeling method (DAP-SDD) to detect small defects accurately while using limited labeled data effectively. Specifically, we apply bootstrapping on the limited labeled data and then utilize the approximated label distribution to guide pseudo-label propagation. Moreover, we propose to use the t-distribution confidence interval for threshold setting to generate more pseudo-labels with high confidence. DAP-SDD also incorporates data augmentation to enhance the model's performance and robustness. We conduct extensive experiments on various datasets to validate the proposed method. Our evaluation results show that, overall, our proposed method requires less than 10% of labeled data to achieve results comparable to using a fully labeled (100%) dataset, and it outperforms state-of-the-art methods. For a dataset of wafer images, our proposed model can achieve an AP (average precision) above 0.93 with only four labeled images (i.e., 2% of the labeled data).
Citations: 1
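The t-distribution confidence-interval thresholding can be sketched directly with SciPy; the 95% level and the choice of the interval's lower bound as the pseudo-label threshold are assumptions:

```python
import numpy as np
from scipy import stats

def t_threshold(confidences, level=0.95):
    """Set a pseudo-label threshold from a t-based confidence interval.

    The t-distribution is appropriate here because only a handful of
    labeled examples are available (small-sample regime).
    """
    confidences = np.asarray(confidences, dtype=float)
    n = len(confidences)
    mean = confidences.mean()
    sem = stats.sem(confidences)              # standard error of the mean
    lo, hi = stats.t.interval(level, df=n - 1, loc=mean, scale=sem)
    return lo                                 # accept pseudo-labels above this

# e.g., model confidences on four labeled wafer images:
scores_on_labeled = [0.91, 0.88, 0.95, 0.90]
print(t_threshold(scores_on_labeled))
```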
The Details Matter: Preventing Class Collapse in Supervised Contrastive Learning
AAAI Workshop on Artificial Intelligence with Biased or Scarce Data (AIBSD) Pub Date: 2022-04-15 DOI: 10.3390/cmsf2022003004
Daniel Y. Fu, Mayee F. Chen, Michael Zhang, K. Fatahalian, C. Ré
Abstract: Supervised contrastive learning optimizes a loss that pushes together embeddings of points from the same class while pulling apart embeddings of points from different classes. Class collapse, in which every point from the same class has the same embedding, minimizes this loss but loses critical information that is not encoded in the class labels. For instance, the "cat" label does not capture unlabeled categories such as breeds, poses, or backgrounds (which we call "strata"). As a result, class collapse produces embeddings that are less useful for downstream applications such as transfer learning, and it achieves suboptimal generalization error when there are strata. We explore a simple modification to the supervised contrastive loss that aims to prevent class collapse by uniformly pulling apart individual points from the same class. We seek to understand the effects of this loss by examining how it embeds strata of different sizes, finding that it clusters larger strata more tightly than smaller strata. As a result, our loss function produces embeddings that better distinguish strata in embedding space, which produces lift on three downstream applications: 4.4 points on coarse-to-fine transfer learning, 2.5 points on worst-group robustness, and 1.0 points on minimal coreset construction. Our loss also produces more accurate models, with up to 4.0 points of lift across 9 tasks.
Citations: 3
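A rough PyTorch sketch of the idea: the usual supervised contrastive attraction plus a term that uniformly pushes apart individual same-class embeddings so they do not collapse to one point. This formulation and the weight `alpha` are illustrative guesses, not the paper's exact loss:

```python
import torch
import torch.nn.functional as F

def anti_collapse_loss(z, labels, temp=0.1, alpha=0.5):
    """Supervised contrastive loss with a same-class spreading term."""
    z = F.normalize(z, dim=1)                 # (N, D) unit embeddings
    sim = z @ z.t() / temp                    # pairwise similarities
    n = z.size(0)
    eye = torch.eye(n, dtype=torch.bool)
    same = labels.unsqueeze(0) == labels.unsqueeze(1)
    pos = same & ~eye                         # same-class pairs, self excluded

    # Supervised contrastive part: pull same-class pairs together.
    log_prob = sim - torch.logsumexp(sim.masked_fill(eye, -1e9),
                                     dim=1, keepdim=True)
    sup_con = -(log_prob * pos).sum() / pos.sum().clamp(min=1)

    # Spreading part: penalize same-class pairs that are too similar,
    # which discourages every point in a class sharing one embedding.
    spread = (sim * pos).sum() / pos.sum().clamp(min=1)
    return sup_con + alpha * spread
```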
Measuring Embedded Human-Like Biases in Face Recognition Models
AAAI Workshop on Artificial Intelligence with Biased or Scarce Data (AIBSD) Pub Date: 2022-04-11 DOI: 10.3390/cmsf2022003002
Sangeun Lee, Soyoung Oh, Minji Kim, Eunil Park
Abstract: Recent works in machine learning have focused on understanding and mitigating bias in data and algorithms. Because pre-trained models are trained on large real-world data, they are known to learn the implicit biases that humans have unconsciously constructed over a long time. However, there has been little discussion of social biases in pre-trained face recognition models. Thus, this study investigates the robustness of these models against racial, gender, age, and intersectional biases. We also present racial bias results for an ethnicity other than White and Black: Asian. In detail, we introduce the Face Embedding Association Test (FEAT) to measure social biases in image vectors of faces of different races, genders, and ages. It measures social bias in a face recognition model under the hypothesis that a specific group is more likely to be associated with a particular attribute in a biased manner. The presence of these biases within DeepFace, DeepID, VGGFace, FaceNet, OpenFace, and ArcFace critically undermines fairness in our society.
Citations: 2
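FEAT follows the template of the Word Embedding Association Test; below is a sketch of a WEAT-style effect size computed on face embeddings, with illustrative variable names (the paper's exact target and attribute sets are not given in the listing):

```python
import numpy as np

def cos(a, b):
    """Cosine similarity between two embedding vectors."""
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

def association(w, A, B):
    """How much more w associates with attribute set A than with B."""
    return np.mean([cos(w, a) for a in A]) - np.mean([cos(w, b) for b in B])

def feat_effect_size(X, Y, A, B):
    """X, Y: two target groups of face embeddings (e.g., two races);
    A, B: two attribute embedding sets. Larger |d| means stronger bias."""
    x_assoc = [association(x, A, B) for x in X]
    y_assoc = [association(y, A, B) for y in Y]
    pooled = np.std(x_assoc + y_assoc, ddof=1)   # pooled std of associations
    return (np.mean(x_assoc) - np.mean(y_assoc)) / pooled
```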
Measuring Gender Bias in Contextualized Embeddings
AAAI Workshop on Artificial Intelligence with Biased or Scarce Data (AIBSD) Pub Date: 2022-04-11 DOI: 10.3390/cmsf2022003003
Styliani Katsarou, Borja Rodríguez-Gálvez, Jesse Shanahan
Abstract: Transformer models are now increasingly being used in real-world applications. Indiscriminately using these models as automated tools may propagate biases in ways we do not realize. To responsibly direct actions that will combat this problem, it is of crucial importance that we detect and quantify these biases. Robust methods have been developed to measure bias in non-contextualized embeddings; nevertheless, these methods fail to apply to contextualized embeddings due to their mutable nature. Our study focuses on the detection and measurement of stereotypical biases associated with gender in the embeddings of T5 and mT5. We quantify bias by measuring the gender polarity of T5's word embeddings for various professions. To measure gender polarity, we use a stable gender direction that we detect in the model's embedding space. We also measure gender bias with respect to a specific downstream task, compare Swedish with English, and compare various sizes of the T5 model and its multilingual variant. The insights from our exploration indicate that the use of a stable gender direction, even in a Transformer's mutable embedding space, can be a robust method for measuring bias. We show that higher-status professions are associated more with the male gender than the female gender. In addition, our method suggests that the Swedish language carries less gender bias than English, and that gender bias manifests more strongly in larger language models.
Citations: 4
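A minimal sketch of the gender-polarity measurement: estimate a gender direction from definitional word pairs, then project profession embeddings onto it. The pair list and the simple pair-averaging are assumptions; the paper derives its stable direction inside T5/mT5's own embedding space:

```python
import numpy as np

def gender_direction(embed, pairs=(("she", "he"),
                                   ("woman", "man"),
                                   ("her", "him"))):
    """Unit vector pointing from male to female word embeddings.

    embed: dict-like mapping tokens to vectors (e.g., from a T5 encoder).
    """
    diffs = [embed[f] - embed[m] for f, m in pairs]
    d = np.mean(diffs, axis=0)
    return d / np.linalg.norm(d)

def gender_polarity(word, embed, direction):
    """Projection of a word's embedding onto the gender direction:
    positive leans female, negative leans male."""
    v = embed[word]
    return float(v @ direction / np.linalg.norm(v))

# Hypothetical usage, given an `embed` lookup built from a T5 model:
# polarity = gender_polarity("nurse", embed, gender_direction(embed))
```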