Minkyu Jeon, Rishwanth Raghu, Miro Astore, Geoffrey Woollard, Ryan Feathers, Alkin Kaz, Sonya M. Hanson, Pilar Cossio, Ellen D. Zhong
arXiv - QuanBio - Biomolecules, 2024-08-10. DOI: arxiv-2408.05526 (https://doi.org/arxiv-2408.05526)
CryoBench: Diverse and challenging datasets for the heterogeneity problem in cryo-EM
Cryo-electron microscopy (cryo-EM) is a powerful technique for determining
high-resolution 3D biomolecular structures from imaging data. As this technique
can capture dynamic biomolecular complexes, 3D reconstruction methods are
increasingly being developed to resolve this intrinsic structural
heterogeneity. However, the absence of standardized benchmarks with ground
truth structures and validation metrics limits the advancement of the field.
Here, we propose CryoBench, a suite of datasets, metrics, and performance
benchmarks for heterogeneous reconstruction in cryo-EM. We introduce five
datasets representing different sources of heterogeneity and degrees of
difficulty. These include conformational heterogeneity generated from simple
motions and random configurations of antibody complexes and from tens of
thousands of structures sampled from a molecular dynamics simulation. We also
design datasets containing compositional heterogeneity from mixtures of
ribosome assembly states and 100 common complexes present in cells. We then
perform a comprehensive analysis of state-of-the-art heterogeneous
reconstruction tools, including neural and non-neural methods, and their
sensitivity to noise, and propose new metrics for quantitative comparison of
methods. We hope that this benchmark will be a foundational resource for
analyzing existing methods and new algorithmic development in both the cryo-EM
and machine learning communities.