DeepDiveR – A software for deep learning estimation of palaeodiversity from fossil occurrences
Rebecca Brown Cooper, Bethany J Allen, Daniele Silvestro
bioRxiv - Paleontology, published 2024-09-06
DOI: 10.1101/2024.09.03.610960 (https://doi.org/10.1101/2024.09.03.610960)
Abstract
The incompleteness of the fossil record, in particular variation in preservation and sampling through space and time, hinders the estimation of biodiversity change, and standard statistical methods struggle to account for it. Here we present DeepDiveR, an R package for the DeepDive program that enables estimation of biodiversity from fossil occurrence data. The method uses a simulation-trained deep neural network to predict biodiversity change through time while accounting for temporal, spatial and taxonomic heterogeneities in preservation. DeepDiveR can be readily used to explore the extinct biodiversity of different clades. We demonstrate the pipeline for building and customising analyses, including consideration of changes in biogeography. We also extend the model to integrate information about modern diversity for extant clades, and introduce a function that automatically adjusts the parameterization of the simulations so that the training data reflect the distribution of empirical datasets. To demonstrate the software, we analyse the fossil record of the order Carnivora through the Cenozoic, finding a peak in diversity in the Late Miocene and a 37% species loss since the Pleistocene. Our implementation generates summary statistics and plots for evaluating model performance and diversity estimates, along with a configuration file that captures all parameters required for full reproducibility of the results.