BioEncoder: A metric learning toolkit for comparative organismal biology
Moritz D. Lürig, Emanuela Di Martino, Arthur Porto
Ecology Letters, volume 27, issue 8. Published 2024-08-13. DOI: 10.1111/ele.14495
https://onlinelibrary.wiley.com/doi/10.1111/ele.14495
Abstract
In the realm of biological image analysis, deep learning (DL) has become a core toolkit, for example for segmentation and classification. However, conventional DL methods are challenged by large biodiversity datasets characterized by unbalanced classes and hard-to-distinguish phenotypic differences between them. Here we present BioEncoder, a user-friendly toolkit for metric learning, which overcomes these challenges by focussing on learning relationships between individual data points rather than on the separability of classes. BioEncoder is released as a Python package, created for ease of use and flexibility across diverse datasets. It features taxon-agnostic data loaders, custom augmentation options, and simple hyperparameter adjustments through text-based configuration files. The toolkit's significance lies in its potential to unlock new research avenues in biological image analysis while democratizing access to advanced deep metric learning techniques. BioEncoder addresses the urgent need for toolkits bridging the gap between complex DL pipelines and practical applications in biological research.
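To make the idea of "learning relationships between individual data points" concrete, the sketch below computes a triplet loss, one common metric-learning objective. It is an illustration only: the abstract does not say which loss BioEncoder uses, and the function names (`euclidean`, `triplet_loss`) and the toy 2-D embeddings are hypothetical, not part of the BioEncoder package.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge-style triplet loss (illustrative, not BioEncoder's API).

    The loss is zero once the negative sample lies at least `margin`
    farther from the anchor than the positive sample does.
    """
    d_pos = euclidean(anchor, positive)
    d_neg = euclidean(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)


# Hypothetical 2-D embeddings of three images.
anchor = [0.0, 0.0]
positive = [0.0, 1.0]  # same class as the anchor, distance 1
negative = [3.0, 4.0]  # different class, distance 5

# Positive is already much closer than the negative: no penalty.
well_separated = triplet_loss(anchor, positive, negative)

# Swapping the roles violates the margin and is penalized.
violating = triplet_loss(anchor, negative, positive)
```

Because the loss depends only on relative distances between samples, it needs no fixed set of class outputs, which is one reason objectives of this kind are often considered for unbalanced datasets with subtle differences between classes.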
About the journal:
Ecology Letters serves as a platform for the rapid publication of innovative research in ecology. It considers manuscripts across all taxa, biomes, and geographic regions, prioritizing papers that investigate clearly stated hypotheses. The journal publishes concise papers of high originality and general interest, contributing to new developments in ecology. Purely descriptive papers and those that only confirm or extend previous results are discouraged.