基于无标签图像特征学习和概率校准的海洋生物鲁棒检测

IF 6.3 2区 物理与天体物理 Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE
Tobias Schanz, Klas Ove Möller, S. Rühl, D. S. Greenberg
{"title":"基于无标签图像特征学习和概率校准的海洋生物鲁棒检测","authors":"Tobias Schanz, Klas Ove Möller, S. Rühl, D. S. Greenberg","doi":"10.1088/2632-2153/ace417","DOIUrl":null,"url":null,"abstract":"Advances in in situ marine life imaging have significantly increased the size and quality of available datasets, but automatic image analysis has not kept pace. Machine learning has shown promise for image processing, but its effectiveness is limited by several open challenges: the requirement for large expert-labeled training datasets, disagreement among experts, under-representation of various species and unreliable or overconfident predictions. To overcome these obstacles for automated underwater imaging, we combine and test recent developments in deep classifier networks and self-supervised feature learning. We use unlabeled images for pretraining deep neural networks to extract task-relevant image features, allowing learning algorithms to cope with scarcity in expert labels, and carefully evaluate performance in subsequent label-based tasks. Performance on rare classes is improved by applying data rebalancing together with a Bayesian correction to avoid biasing inferred in situ class frequencies. A divergence-based loss allows training on multiple, conflicting labels for the same image, leading to better estimates of uncertainty which we quantify with a novel accuracy measure. Together, these techniques can reduce the required label counts ∼100-fold while maintaining the accuracy of standard supervised training, shorten training time, cope with expert disagreement and reduce overconfidence.","PeriodicalId":33757,"journal":{"name":"Machine Learning Science and Technology","volume":" ","pages":""},"PeriodicalIF":6.3000,"publicationDate":"2023-07-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Robust detection of marine life with label-free image feature learning and probability calibration\",\"authors\":\"Tobias Schanz, Klas Ove Möller, S. Rühl, D. S. Greenberg\",\"doi\":\"10.1088/2632-2153/ace417\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Advances in in situ marine life imaging have significantly increased the size and quality of available datasets, but automatic image analysis has not kept pace. Machine learning has shown promise for image processing, but its effectiveness is limited by several open challenges: the requirement for large expert-labeled training datasets, disagreement among experts, under-representation of various species and unreliable or overconfident predictions. To overcome these obstacles for automated underwater imaging, we combine and test recent developments in deep classifier networks and self-supervised feature learning. We use unlabeled images for pretraining deep neural networks to extract task-relevant image features, allowing learning algorithms to cope with scarcity in expert labels, and carefully evaluate performance in subsequent label-based tasks. Performance on rare classes is improved by applying data rebalancing together with a Bayesian correction to avoid biasing inferred in situ class frequencies. A divergence-based loss allows training on multiple, conflicting labels for the same image, leading to better estimates of uncertainty which we quantify with a novel accuracy measure. Together, these techniques can reduce the required label counts ∼100-fold while maintaining the accuracy of standard supervised training, shorten training time, cope with expert disagreement and reduce overconfidence.\",\"PeriodicalId\":33757,\"journal\":{\"name\":\"Machine Learning Science and Technology\",\"volume\":\" \",\"pages\":\"\"},\"PeriodicalIF\":6.3000,\"publicationDate\":\"2023-07-04\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Machine Learning Science and Technology\",\"FirstCategoryId\":\"101\",\"ListUrlMain\":\"https://doi.org/10.1088/2632-2153/ace417\",\"RegionNum\":2,\"RegionCategory\":\"物理与天体物理\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Machine Learning Science and Technology","FirstCategoryId":"101","ListUrlMain":"https://doi.org/10.1088/2632-2153/ace417","RegionNum":2,"RegionCategory":"物理与天体物理","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
引用次数: 0

摘要

原位海洋生物成像的进步显著提高了可用数据集的大小和质量,但自动图像分析没有跟上步伐。机器学习在图像处理方面显示出了前景,但其有效性受到几个公开挑战的限制:对大型专家标记的训练数据集的需求、专家之间的分歧、各种物种的代表性不足以及不可靠或过于自信的预测。为了克服自动水下成像的这些障碍,我们结合并测试了深度分类器网络和自监督特征学习的最新发展。我们使用未标记的图像来预训练深度神经网络,以提取与任务相关的图像特征,使学习算法能够应对专家标签的稀缺性,并仔细评估后续基于标签的任务的性能。通过应用数据再平衡和贝叶斯校正来提高稀有类的性能,以避免对推断的原位类频率产生偏差。基于散度的损失允许对同一图像的多个冲突标签进行训练,从而更好地估计不确定性,我们用一种新的精度度量来量化不确定性。总之,这些技术可以将所需的标签数量减少约100倍,同时保持标准监督训练的准确性,缩短训练时间,应对专家的分歧,减少过度自信。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
Robust detection of marine life with label-free image feature learning and probability calibration
Advances in in situ marine life imaging have significantly increased the size and quality of available datasets, but automatic image analysis has not kept pace. Machine learning has shown promise for image processing, but its effectiveness is limited by several open challenges: the requirement for large expert-labeled training datasets, disagreement among experts, under-representation of various species and unreliable or overconfident predictions. To overcome these obstacles for automated underwater imaging, we combine and test recent developments in deep classifier networks and self-supervised feature learning. We use unlabeled images for pretraining deep neural networks to extract task-relevant image features, allowing learning algorithms to cope with scarcity in expert labels, and carefully evaluate performance in subsequent label-based tasks. Performance on rare classes is improved by applying data rebalancing together with a Bayesian correction to avoid biasing inferred in situ class frequencies. A divergence-based loss allows training on multiple, conflicting labels for the same image, leading to better estimates of uncertainty which we quantify with a novel accuracy measure. Together, these techniques can reduce the required label counts ∼100-fold while maintaining the accuracy of standard supervised training, shorten training time, cope with expert disagreement and reduce overconfidence.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
Machine Learning Science and Technology
Machine Learning Science and Technology Computer Science-Artificial Intelligence
CiteScore
9.10
自引率
4.40%
发文量
86
审稿时长
5 weeks
期刊介绍: Machine Learning Science and Technology is a multidisciplinary open access journal that bridges the application of machine learning across the sciences with advances in machine learning methods and theory as motivated by physical insights. Specifically, articles must fall into one of the following categories: advance the state of machine learning-driven applications in the sciences or make conceptual, methodological or theoretical advances in machine learning with applications to, inspiration from, or motivated by scientific problems.
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信