Integrating Deep Learning Derived Morphological Traits and Molecular Data for Total-Evidence Phylogenetics: Lessons from Digitized Collections
Roberta Hunt, José L Reyes-Hernández, Josh Jenkins Shaw, Alexey Solodovnikov, Kim Steenstrup Pedersen
Systematic Biology, published 2025-01-18
DOI: 10.1093/sysbio/syae072 (https://doi.org/10.1093/sysbio/syae072)
Abstract
Deep learning has previously been shown to automatically generate morphological traits that carry a phylogenetic signal. In this paper, we explore combining molecular data with deep-learning-derived morphological traits from images of pinned insects to generate total-evidence phylogenies, and we identify the challenges involved. Deep-learning-derived morphological traits, while informative, underperform molecular analyses when used in isolation. However, they can improve molecular results in total-evidence settings. We use a dataset of rove beetle images to compare the effect of different dataset splits and deep metric loss functions on morphological and total-evidence results. We find a slight preference for the cladistic dataset split and the contrastive loss function. Additionally, we explore the effect of varying the number of genes used in inference and find that the gene combinations that perform best on their own differ from those that perform best in total-evidence analysis. Despite the promise of integrating deep learning techniques with molecular data, challenges remain regarding the strength of the phylogenetic signal and the resource demands of data acquisition. We suggest that future work focus on improved trait extraction and on the development of disentangled networks to better interpret the derived traits, thus expanding the applicability of these methods in phylogenetic studies.
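As an illustration of the kind of deep metric loss mentioned in the abstract, the following is a minimal sketch of a pairwise contrastive loss in its common margin-based form. The abstract does not give the formulation the authors used, so the function names, the 0.5 scaling, the margin value, and the toy embeddings are all assumptions made for this example.

```python
import math


def euclidean_distance(a, b):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def contrastive_loss(emb_a, emb_b, same_group, margin=1.0):
    """Margin-based contrastive loss for one pair of embeddings.

    Assumed form (not taken from the paper):
      same group:      0.5 * d^2
      different group: 0.5 * max(0, margin - d)^2
    where d is the Euclidean distance between the two embeddings.
    """
    d = euclidean_distance(emb_a, emb_b)
    if same_group:
        # Pull embeddings of the same group (e.g. same taxon) together.
        return 0.5 * d ** 2
    # Push embeddings of different groups at least `margin` apart.
    return 0.5 * max(0.0, margin - d) ** 2


# Hypothetical 2-D embeddings of three specimen images.
specimen_a = [0.0, 0.0]
specimen_b = [3.0, 4.0]   # distance 5 from specimen_a
specimen_c = [0.6, 0.8]   # distance about 1 from specimen_a

loss_same = contrastive_loss(specimen_a, specimen_b, same_group=True)
loss_far = contrastive_loss(specimen_a, specimen_b, same_group=False)
loss_near = contrastive_loss(specimen_a, specimen_c, same_group=False, margin=2.0)
```

In this sketch, a same-group pair is penalized in proportion to its squared distance, while a different-group pair contributes nothing once it is farther apart than the margin. In practice such a loss would be averaged over many image pairs while training an embedding network.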
About the journal:
Systematic Biology is the bimonthly journal of the Society of Systematic Biologists. Papers for the journal are original contributions to the theory, principles, and methods of systematics as well as phylogeny, evolution, morphology, biogeography, paleontology, genetics, and the classification of all living things. A Points of View section offers a forum for discussion, while book reviews and announcements of general interest are also featured.