A realistic phantom dataset for benchmarking cryo-ET data annotation
Ariana Peck, Yue Yu, Jonathan Schwartz, Anchi Cheng, Utz Heinrich Ermel, Joshua Hutchings, Saugat Kandel, Dari Kimanius, Elizabeth A. Montabana, Daniel Serwas, Hannah Siems, Feng Wang, Zhuowen Zhao, Shawn Zheng, Matthias Haury, David A. Agard, Clinton S. Potter, Bridget Carragher, Kyle Harrington, Mohammadreza Paraan
Nature Methods, volume 22, issue 9, pages 1819-1823. Published 2025-08-26. DOI: 10.1038/s41592-025-02800-5
Article: https://www.nature.com/articles/s41592-025-02800-5
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12446061/pdf/
Citations: 0
Abstract
Cryo-electron tomography (cryo-ET) is a powerful technique for imaging molecular complexes in their native cellular environments. However, identifying the vast majority of molecular species in cellular tomograms remains prohibitively difficult. Machine learning (ML) methods provide an opportunity to automate the annotation process, but algorithm development has been hindered by the lack of large, standardized datasets. Here we present an experimental phantom dataset with comprehensive ground-truth annotations for six molecular species to spur new algorithm development and benchmark existing tools. This annotated dataset is available on the CryoET Data Portal with infrastructure to streamline access for methods developers across fields.
A standardized, realistic phantom dataset consisting of ground-truth annotations for six diverse molecular species is provided as a community resource for cryo-electron-tomography algorithm benchmarking.
About the journal:
Nature Methods is a monthly journal that focuses on publishing innovative methods and substantial enhancements to fundamental life sciences research techniques. Geared towards a diverse, interdisciplinary readership of researchers in academia and industry engaged in laboratory work, the journal offers new tools for research and emphasizes the immediate practical significance of the featured work. It publishes primary research papers and reviews recent technical and methodological advancements, with a particular interest in primary methods papers relevant to the biological and biomedical sciences. This includes methods rooted in chemistry with practical applications for studying biological problems.