The Open Connectome Project Data Cluster: Scalable Analysis and Vision for High-Throughput Neuroscience.

Scientific and statistical database management : International Conference, SSDBM ... : proceedings. International Conference on Scientific and Statistical Database Management Pub Date : 2013-01-01 DOI:10.1145/2484838.2484870

Randal Burns, William Gray Roncal, Dean Kleissas, Kunal Lillaney, Priya Manavalan, Eric Perlman, Daniel R Berger, Davi D Bock, Kwanghun Chung, Logan Grosenick, Narayanan Kasthuri, Nicholas C Weiler, Karl Deisseroth, Michael Kazhdan, Jeff Lichtman, R Clay Reid, Stephen J Smith, Alexander S Szalay, Joshua T Vogelstein, R Jacob Vogelstein


Citations: 61

Abstract

We describe a scalable database cluster for the spatial analysis and annotation of high-throughput brain imaging data, initially for 3-D electron microscopy image stacks, but for time-series and multi-channel data as well. The system was designed primarily for workloads that build connectomes (neural connectivity maps of the brain) using the parallel execution of computer vision algorithms on high-performance compute clusters. These services and open-science data sets are publicly available at openconnecto.me. The system design inherits much from NoSQL scale-out and data-intensive computing architectures. We distribute data to cluster nodes by partitioning a spatial index. We direct I/O to different systems (reads to parallel disk arrays and writes to solid-state storage) to avoid I/O interference and maximize throughput. All programming interfaces are RESTful Web services, which are simple and stateless, improving scalability and usability. We include a performance evaluation of the production system, highlighting the effectiveness of spatial data organization.
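
As a rough illustration of what distributing data by partitioning a spatial index can look like, the sketch below assumes a Morton (Z-order) key over 3-D cuboid coordinates and assigns contiguous key ranges to nodes. The abstract does not name the index or the partitioning scheme, so the curve choice, the function names `morton_key` and `node_for_cuboid`, and the `bits` parameter are all assumptions made for illustration.

```python
def morton_key(x: int, y: int, z: int, bits: int = 10) -> int:
    """Interleave the low `bits` bits of x, y, z into one Z-order key.

    Hypothetical helper: the Z-order curve is an assumption, not taken
    from the abstract.
    """
    key = 0
    for i in range(bits):
        key |= ((x >> i) & 1) << (3 * i)        # x bit goes to position 3i
        key |= ((y >> i) & 1) << (3 * i + 1)    # y bit goes to position 3i+1
        key |= ((z >> i) & 1) << (3 * i + 2)    # z bit goes to position 3i+2
    return key


def node_for_cuboid(x: int, y: int, z: int, num_nodes: int, bits: int = 10) -> int:
    """Map a cuboid to a node by splitting the key space into equal,
    contiguous ranges (one range per node)."""
    total_keys = 1 << (3 * bits)
    return morton_key(x, y, z, bits) * num_nodes // total_keys


if __name__ == "__main__":
    # Cuboids that are close in space tend to have close keys,
    # so they tend to be placed on the same node.
    for coord in [(0, 0, 0), (1, 1, 1), (512, 512, 512), (1023, 1023, 1023)]:
        print(coord, morton_key(*coord), node_for_cuboid(*coord, num_nodes=4))
```

Contiguous key ranges keep spatially nearby cuboids together, which favors region queries. A hash of the key would balance load more evenly at the cost of that locality.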

