The Sickle Africa Data Coordinating Centre (SADaCC): a data science hub for interdisciplinary sickle cell disease research and training.

Impact factor 3.6; Zone 4 (Biology); JCR Q1, Mathematical & Computational Biology
Ambroise Wonkam, Nchangwi Syntia Munung, Mario Jonas, Wilson Mupfurirwa, Arthemon Nguweneza, Kevin Esoh, Chandre Oosterwyk-Liu, Zimkita Magangana, Khuthala Mnika, Valentina Ngo Bitoungui, Martha Kamkuemah, Kambe Banda, Nabeelah Samie, Jade Hotchkis, Victoria Nembaware, Andre-Pascal Kengne, Nicola Mulder
DOI: 10.1093/database/baag007 (https://doi.org/10.1093/database/baag007)
Journal: Database: The Journal of Biological Databases and Curation, volume 2026
Publication date: 2026-01-15
Publication type: Journal Article
Full-text PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12923167/pdf/
Citation count: 0

Abstract

Sickle cell disease (SCD) is one of the most prevalent monogenic disorders worldwide, with the highest burden in Africa, where ~75% of the 7.74 million global cases occur. Scientific progress in understanding its epidemiology, clinical heterogeneity, and treatment outcomes has been constrained by heterogeneous, non-standardized, and non-interoperable datasets that limit data integration and cross-country analyses. To address this, the Sickle Africa Data Coordinating Centre (SADaCC) was established as the data science hub of the SickleInAfrica consortium to support the development and expansion of a Pan-African SCD registry. SADaCC now coordinates one of the largest patient-consented SCD datasets globally, with data from over 40 000 persons living with SCD in seven countries (Ghana, Mali, Nigeria, Tanzania, Uganda, Zambia, and Zimbabwe) within the Sickle Pan-African Research Consortium (SPARCo), as well as genomic data from SADaCC satellite sites in Cameroon, South Africa, and Malawi. The registry is built on a FAIR-compliant architecture and the Sickle Cell Disease Ontology, and is powered by a suite of digital platforms such as REDCap, NextCloud, RStudio, GitHub, Docker, and Jupyter. In partnership with SPARCo, SADaCC is also piloting a biobank that will link biospecimens with data in the registry to advance multi-omics research. Beyond infrastructure, SADaCC leads training and/or research in big data analytics, genomics, bioethics, implementation science, qualitative research, and psychosocial studies. Ethical, legal, and social considerations are embedded across all operations, with an emphasis on equitable intra-African collaboration and patient involvement in research. Looking ahead, SADaCC will integrate real-time data streams, AI-driven analytics, and multi-omics data to drive big data and genetic medicine research for SCD in Africa.
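
To make the interoperability problem named in the abstract concrete, the following is a minimal sketch of how site-specific labels might be mapped onto shared term codes before records are pooled across countries. It is not SADaCC's actual pipeline: the field names, the site names, the `harmonize` and `pool` helpers, and the `TERM:` codes are all hypothetical placeholders, and the codes are not real Sickle Cell Disease Ontology identifiers.

```python
from collections import Counter

# Hypothetical mapping from site-specific labels to shared term codes.
# These codes are placeholders, NOT real Sickle Cell Disease Ontology IDs.
LABEL_TO_TERM = {
    "hbss": "TERM:0001",
    "ss": "TERM:0001",
    "sickle cell anaemia": "TERM:0001",
    "hbsc": "TERM:0002",
    "sc": "TERM:0002",
}


def harmonize(record):
    """Return a copy of the record with a standardized 'genotype_term' field."""
    raw = record.get("genotype", "").strip().lower()
    term = LABEL_TO_TERM.get(raw)  # None when the label has no known mapping
    return {**record, "genotype_term": term}


def pool(site_records):
    """Harmonize records from every site and count records per shared term."""
    harmonized = [
        harmonize(record)
        for records in site_records.values()
        for record in records
    ]
    counts = Counter(record["genotype_term"] for record in harmonized)
    return harmonized, counts


# Illustrative records only; no real patient data.
site_records = {
    "site_a": [
        {"id": "A1", "genotype": "HbSS"},
        {"id": "A2", "genotype": "SC"},
    ],
    "site_b": [
        {"id": "B1", "genotype": "Sickle cell anaemia"},
        {"id": "B2", "genotype": "unknown"},
    ],
}

harmonized, counts = pool(site_records)
```

Records whose label has no mapping keep `genotype_term` set to `None`, so they can be counted and reviewed rather than silently dropped from a cross-site analysis.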

Source journal
Database: The Journal of Biological Databases and Curation (Mathematical & Computational Biology)
CiteScore: 9.00
Self-citation rate: 3.40%
Publication volume: 100
Review time: >12 weeks
About the journal: Huge volumes of primary data are archived in numerous open-access databases, and with new generation technologies becoming more common in laboratories, large datasets will become even more prevalent. The archiving, curation, analysis and interpretation of all of these data are a challenge. Database development and biocuration are at the forefront of the endeavor to make sense of this mounting deluge of data. Database: The Journal of Biological Databases and Curation provides an open access platform for the presentation of novel ideas in database research and biocuration, and aims to help strengthen the bridge between database developers, curators, and users.