{"title":"Omilayers: a Python package for efficient data management to support multi-omic analysis.","authors":"Dimitrios Kioroglou","doi":"10.1186/s12859-025-06067-7","DOIUrl":null,"url":null,"abstract":"<p><p>Multi-omic integration involves the management of diverse omic datasets. Conducting an effective analysis of these datasets necessitates a data management system that meets a specific set of requirements, such as rapid storage and retrieval of data with varying numbers of features and mixed data-types, ensurance of reliable and secure database transactions, extension of stored data row and column-wise and facilitation of data distribution. SQLite and DuckDB are embedded databases that fulfil these requirements. However, they utilize the structured query language (SQL) that hinders their implementation by the uninitiated user, and complicates their use in repetitive tasks due to the necessity of writing SQL queries. This study offers Omilayers, a Python package that encapsulates these two databases and exposes a subset of their functionality that is geared towards frequent and repetitive analytical procedures. Synthetic data were used to demonstrate the use of Omilayers and compare the performance of SQLite and DuckDB.</p>","PeriodicalId":8958,"journal":{"name":"BMC Bioinformatics","volume":"26 1","pages":"40"},"PeriodicalIF":2.9000,"publicationDate":"2025-02-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11800426/pdf/","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"BMC Bioinformatics","FirstCategoryId":"99","ListUrlMain":"https://doi.org/10.1186/s12859-025-06067-7","RegionNum":3,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"BIOCHEMICAL RESEARCH METHODS","Score":null,"Total":0}
引用次数: 0
Abstract
Multi-omic integration involves the management of diverse omic datasets. Conducting an effective analysis of these datasets necessitates a data management system that meets a specific set of requirements, such as rapid storage and retrieval of data with varying numbers of features and mixed data-types, ensurance of reliable and secure database transactions, extension of stored data row and column-wise and facilitation of data distribution. SQLite and DuckDB are embedded databases that fulfil these requirements. However, they utilize the structured query language (SQL) that hinders their implementation by the uninitiated user, and complicates their use in repetitive tasks due to the necessity of writing SQL queries. This study offers Omilayers, a Python package that encapsulates these two databases and exposes a subset of their functionality that is geared towards frequent and repetitive analytical procedures. Synthetic data were used to demonstrate the use of Omilayers and compare the performance of SQLite and DuckDB.
期刊介绍:
BMC Bioinformatics is an open access, peer-reviewed journal that considers articles on all aspects of the development, testing and novel application of computational and statistical methods for the modeling and analysis of all kinds of biological data, as well as other areas of computational biology.
BMC Bioinformatics is part of the BMC series which publishes subject-specific journals focused on the needs of individual research communities across all areas of biology and medicine. We offer an efficient, fair and friendly peer review service, and are committed to publishing all sound science, provided that there is some advance in knowledge presented by the work.