Tim Cadman, Mariska K Slofstra, Marije A van der Geest, Demetris Avraam, Tom R P Bishop, Tommy de Boer, Liesbeth Duijts, Sido Haakma, Eleanor Hyde, Vincent Jaddoe, Tarik Karramass, Fleur Kelpin, Yannick Marcon, Angela Pinot de Moira, Dick Postma, Clemens Tolboom, Ruben L Veenstra, Stuart Wheater, Marieke Welten, Rebecca C Wilson, Erik Zwart, Morris Swertz
Bioinformatics (Oxford, England). Published 2024-12-26. DOI: 10.1093/bioinformatics/btae726 (https://doi.org/10.1093/bioinformatics/btae726). Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11734753/pdf/
MOLGENIS Armadillo: a lightweight server for federated analysis using DataSHIELD.
Abstract
Summary: Extensive human health data from cohort studies, national registries, and biobanks can reveal life-course risk factors that affect health. Combining these sources offers increased statistical power, detection of rare outcomes, replication of findings, and extended study periods. Traditionally, this required either transferring data to a central location or running separate analyses at each partner and pooling the summary statistics, both of which raise ethical, legal, and time constraints. Federated analysis, which involves analysing data remotely without sharing individual-level data, is a promising alternative. One such solution is DataSHIELD (https://datashield.org/), an open-source R-based implementation. To enable federated analysis, data owners need a user-friendly way to install the federated infrastructure and manage users and data. Here, we present MOLGENIS Armadillo: a lightweight server for federated analysis solutions such as DataSHIELD.
Availability and implementation: Armadillo is implemented as a collection of three packages freely available under the open-source licence LGPLv3: two R packages downloadable from the Comprehensive R Archive Network (CRAN) ("MolgenisArmadillo" and "DSMolgenisArmadillo") and one Java application ("ArmadilloService") available as JAR and Docker images via GitHub (https://github.com/molgenis/molgenis-service-armadillo).
Supplementary information: In the supplementary material, we provide screenshots of the user interface (UI) to illustrate how Armadillo is used.
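The idea of federated analysis described in the Summary, in which only aggregates leave each site and individual-level records stay with the data owner, can be illustrated with a minimal sketch. This is not DataSHIELD's or Armadillo's API: the names `SiteSummary`, `summarise_locally`, and `pool`, as well as the cohort values, are hypothetical and chosen only for illustration. The sketch assumes each site is willing to release a count, a sum, and a sum of squares, from which the analyst can recover the pooled mean and sample variance.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SiteSummary:
    """Aggregates a site releases; no individual-level rows are included."""
    n: int
    total: float
    total_sq: float


def summarise_locally(values):
    """Runs at the data owner's site (hypothetical helper)."""
    return SiteSummary(
        n=len(values),
        total=sum(values),
        total_sq=sum(v * v for v in values),
    )


def pool(summaries):
    """Runs on the analyst's side and sees only the per-site aggregates."""
    n = sum(s.n for s in summaries)
    total = sum(s.total for s in summaries)
    total_sq = sum(s.total_sq for s in summaries)
    mean = total / n
    # Sample variance recovered from the pooled sum of squares.
    variance = (total_sq - n * mean ** 2) / (n - 1)
    return mean, variance


# Made-up measurements held at two separate sites.
cohort_a = [3.1, 3.4, 2.9, 3.6]
cohort_b = [3.3, 3.0, 3.8]

summaries = [summarise_locally(cohort_a), summarise_locally(cohort_b)]
pooled_mean, pooled_variance = pool(summaries)
print(f"pooled mean={pooled_mean:.4f}, pooled variance={pooled_variance:.4f}")
```

For these statistics the pooled result matches what a central analysis of the combined rows would give, which is the appeal of the approach. A production system would also need authentication, user management, and disclosure checks on what each site returns, none of which this sketch attempts.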