Candice Schumann, Susanna Ricco, Utsav Prabhu, V. Ferrari, C. Pantofaru
{"title":"迈向更包容的人的一步公平注释","authors":"Candice Schumann, Susanna Ricco, Utsav Prabhu, V. Ferrari, C. Pantofaru","doi":"10.1145/3461702.3462594","DOIUrl":null,"url":null,"abstract":"The Open Images Dataset contains approximately 9 million images and is a widely accepted dataset for computer vision research. As is common practice for large datasets, the annotations are not exhaustive, with bounding boxes and attribute labels for only a subset of the classes in each image. In this paper, we present a new set of annotations on a subset of the Open Images dataset called the MIAP (More Inclusive Annotations for People) subset, containing bounding boxes and attributes for all of the people visible in those images. The attributes and labeling methodology for the MIAP subset were designed to enable research into model fairness. In addition, we analyze the original annotation methodology for the person class and its subclasses, discussing the resulting patterns in order to inform future annotation efforts. By considering both the original and exhaustive annotation sets, researchers can also now study how systematic patterns in training annotations affect modeling.","PeriodicalId":197336,"journal":{"name":"Proceedings of the 2021 AAAI/ACM Conference on AI, Ethics, and Society","volume":"34 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-05-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"38","resultStr":"{\"title\":\"A Step Toward More Inclusive People Annotations for Fairness\",\"authors\":\"Candice Schumann, Susanna Ricco, Utsav Prabhu, V. Ferrari, C. Pantofaru\",\"doi\":\"10.1145/3461702.3462594\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"The Open Images Dataset contains approximately 9 million images and is a widely accepted dataset for computer vision research. As is common practice for large datasets, the annotations are not exhaustive, with bounding boxes and attribute labels for only a subset of the classes in each image. In this paper, we present a new set of annotations on a subset of the Open Images dataset called the MIAP (More Inclusive Annotations for People) subset, containing bounding boxes and attributes for all of the people visible in those images. The attributes and labeling methodology for the MIAP subset were designed to enable research into model fairness. In addition, we analyze the original annotation methodology for the person class and its subclasses, discussing the resulting patterns in order to inform future annotation efforts. 
By considering both the original and exhaustive annotation sets, researchers can also now study how systematic patterns in training annotations affect modeling.\",\"PeriodicalId\":197336,\"journal\":{\"name\":\"Proceedings of the 2021 AAAI/ACM Conference on AI, Ethics, and Society\",\"volume\":\"34 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2021-05-05\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"38\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the 2021 AAAI/ACM Conference on AI, Ethics, and Society\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/3461702.3462594\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 2021 AAAI/ACM Conference on AI, Ethics, and Society","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3461702.3462594","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 38
A Step Toward More Inclusive People Annotations for Fairness

Abstract
The Open Images Dataset contains approximately 9 million images and is a widely accepted dataset for computer vision research. As is common practice for large datasets, the annotations are not exhaustive, with bounding boxes and attribute labels for only a subset of the classes in each image. In this paper, we present a new set of annotations on a subset of the Open Images dataset called the MIAP (More Inclusive Annotations for People) subset, containing bounding boxes and attributes for all of the people visible in those images. The attributes and labeling methodology for the MIAP subset were designed to enable research into model fairness. In addition, we analyze the original annotation methodology for the person class and its subclasses, discussing the resulting patterns in order to inform future annotation efforts. By considering both the original and exhaustive annotation sets, researchers can also now study how systematic patterns in training annotations affect modeling.
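To make the structure of such annotations concrete, below is a minimal sketch of how a MIAP-style person-box annotation file might be loaded and summarized per perceived-presentation group, e.g. as a first step in a fairness audit. The file name and column names (ImageID, GenderPresentation, AgePresentation, XMin/XMax/YMin/YMax) are assumptions about the released CSV format and are not specified by this abstract.

```python
# Minimal sketch (not from the paper): count person bounding boxes per
# perceived-presentation attribute in a MIAP-style annotation CSV.
# Column and file names below are assumptions, not confirmed by the abstract.
import pandas as pd


def attribute_counts(csv_path: str) -> pd.DataFrame:
    """Return the number of person boxes grouped by presentation attributes."""
    boxes = pd.read_csv(csv_path)
    # Each row is assumed to be one person bounding box with its attribute labels.
    return (
        boxes.groupby(["GenderPresentation", "AgePresentation"])
        .size()
        .rename("num_boxes")
        .reset_index()
    )


if __name__ == "__main__":
    # Hypothetical path to a MIAP-style annotation file.
    print(attribute_counts("open_images_extended_miap_boxes_train.csv"))
```

Comparing these counts between the original Open Images person annotations and the exhaustive MIAP set is one way to study the systematic annotation patterns the paper describes.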