Gregory A Book, Vince D Calhoun, Michael C Stevens, Godfrey D Pearlson
Neuroinformatics, vol. 23, no. 3, p. 37 (published 2025-07-03). DOI: 10.1007/s12021-025-09732-7. https://doi.org/10.1007/s12021-025-09732-7
Sharing Neuroimaging Data with Squirrel - A Relational Data Format to Store Raw to Analyzed Data and Everything in Between.
Reproducibility of neuroimaging analyses and aggregation of heterogeneous datasets are significant challenges in human subjects imaging research. This stems in part from the lack of an easy-to-use, universal data format that encompasses all steps of neuroimaging. The BIDS format has become widely adopted; however, it is increasingly complex to implement as features are added, with the documentation now exceeding 500 pages. As such, there is a need for standards that can handle the complexity of the data while minimizing the complexity of the format. Here we present a simple but generalizable and flexible specification for sharing imaging data, called the squirrel format (not related to the squirrel programming language). It is so named because squirrels are effective at storing significant quantities of food and knowing exactly where and when to find it. The design objectives of the format specification are to 1) store subject information, experimental parameters, raw data, analyzed data, and analysis methods; 2) organize data in a human-readable hierarchy; and 3) enable easy sharing and dissemination of data packages. We developed a relational hierarchy with a structured representation of all steps of neuroimaging data collection and analysis, and a generalizable specification to store any modality of neuroimaging data, which together satisfy the design objectives. Additionally, redundancy is minimized by using relational database principles. The specification allows all research data to be classified into one of ten object types, thus simplifying the sharing of neuroimaging data. Just as squirrels employ 'chunking', the squirrel format chunks data into a manageable number of object types. The squirrel format was developed to share neuroimaging data but can be generalized to share data from any imaging research.
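The relational idea described in the abstract can be illustrated with a minimal sketch. The class names (Package, Subject, Study, Series), the protocol_id field, and the example values below are hypothetical and are not taken from the squirrel specification, whose ten object types are not listed in this abstract. The sketch only shows how a hierarchy that stores shared parameters once, and refers to them by key, avoids duplicating them in every record.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterator, List


@dataclass
class Series:
    number: int
    modality: str
    protocol_id: str  # hypothetical key into Package.protocols, not a copy of the parameters


@dataclass
class Study:
    number: int
    series: List[Series] = field(default_factory=list)


@dataclass
class Subject:
    subject_id: str
    studies: List[Study] = field(default_factory=list)


@dataclass
class Package:
    name: str
    # Acquisition parameters are stored once here and referenced by key.
    protocols: Dict[str, dict] = field(default_factory=dict)
    subjects: List[Subject] = field(default_factory=list)

    def protocol_for(self, series: Series) -> dict:
        """Resolve a series' protocol reference to the shared parameter record."""
        return self.protocols[series.protocol_id]


def iter_series(package: Package) -> Iterator[Series]:
    """Walk the subject -> study -> series hierarchy in order."""
    for subject in package.subjects:
        for study in subject.studies:
            yield from study.series


def build_example() -> Package:
    """Build a tiny package: two subjects sharing one protocol record."""
    package = Package(name="demo")
    package.protocols["T1w"] = {"repetition_time_ms": 2300}  # illustrative value only
    for subject_id in ("S001", "S002"):
        series = Series(number=1, modality="MR", protocol_id="T1w")
        study = Study(number=1, series=[series])
        package.subjects.append(Subject(subject_id=subject_id, studies=[study]))
    return package
```

In this sketch both series resolve to the same protocol record, so a change to the protocol is made in one place rather than once per subject.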
About the journal:
Neuroinformatics publishes original articles and reviews with an emphasis on data structure and software tools related to analysis, modeling, integration, and sharing in all areas of neuroscience research. The editors particularly invite contributions on: (1) Theory and methodology, including discussions on ontologies, modeling approaches, database design, and meta-analyses; (2) Descriptions of developed databases and software tools, and of the methods for their distribution; (3) Relevant experimental results, such as reports accompanied by the release of massive data sets; (4) Computational simulations of models integrating and organizing complex data; and (5) Neuroengineering approaches, including hardware, robotics, and information theory studies.