Enhancing data interoperability in human biomonitoring studies: the HBM data toolkit
Ruben Peeters, Laura Rodriguez Martin, Fen Zhang, Hanny Willems, Liese Gilles, Jan Theunis, Jos Bessems, Caio Mescouto Terra de Souza, Stijn Baken, Dirk Devriendt, Eva Govarts
International Journal of Hygiene and Environmental Health, volume 270, Article 114669. Published 2025-09-01. DOI: 10.1016/j.ijheh.2025.114669
https://www.sciencedirect.com/science/article/pii/S1438463925001518
Abstract
Harmonization and aggregation of heterogeneous data from Human Biomonitoring (HBM) studies are critical to enhance the reliability of conclusions and move towards FAIR (i.e., Findable, Accessible, Interoperable, Reusable) data. We introduce the HBM Data Toolkit, developed by the Flemish Institute for Technological Research (Vlaamse Instelling voor Technologisch Onderzoek, VITO) with the primary goal of optimizing data integrity and interoperability, key steps towards FAIR, while using flexible templates and ensuring data confidentiality. The HBM Data Toolkit was built in 2023–2024 and made available to stakeholders (via https://hbm.vito.be/tools) within the Partnership for the Assessment of Risks from Chemicals (PARC, eu-parc.eu). The toolkit consists of four modules: data harmonization, data validation, derived variables calculation, and summary statistics calculation. A Python package was created to interpret the templates, making validation and transformation possible. Using Pyodide and WebAssembly, the toolkit runs entirely in the web browser, enabling secure, local execution of Python code without uploading any data. In the validation module, input files in a common format (i.e., Excel) were used to configure data templates, aligning with the standards and formats specified under the HBM4EU project (hbm4eu.eu) and PARC. The HBM Data Toolkit allows harmonized data storage in the Personal Exposure and Health (PEH) data platform. Formatted and validated HBM data were made compatible with the Monte Carlo Risk Assessment (MCRA) platform. In the derived variables calculation module, the toolkit also allows users to impute censored data and to standardize/normalize the biomarker data. Furthermore, summary statistics (e.g., geometric mean, percentiles) can be calculated, visualized in the European HBM dashboard, and integrated into the Information Platform for Chemical Monitoring (IPCHEM).
In conclusion, the current toolkit proves effective in advancing data quality, harmonization, and aggregation in HBM studies. With local execution, user-friendly codebooks, and standardized schemas, it supports a unified framework that enables consistent analysis and interpretation across diverse studies and datasets.
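As an illustration of the derived variables and summary statistics steps described in the abstract, the following is a minimal Python sketch, not the toolkit's actual code or API. The abstract does not state which imputation method the toolkit applies; the sketch assumes the widely used substitution of values below the limit of detection (LOD) with LOD/√2. The function names (`impute_censored`, `percentile`, `summarize`) and the sample values are hypothetical.

```python
import math
from statistics import geometric_mean


def impute_censored(values, lod):
    """Replace censored values (None or below lod) with lod / sqrt(2).

    Assumption: simple substitution; the toolkit may use another method.
    """
    fill = lod / math.sqrt(2)
    return [fill if v is None or v < lod else v for v in values]


def percentile(values, p):
    """Percentile (p in 0..100) by linear interpolation between ranks."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("percentile() needs at least one value")
    rank = (len(ordered) - 1) * p / 100
    lower = math.floor(rank)
    upper = math.ceil(rank)
    weight = rank - lower
    return ordered[lower] * (1 - weight) + ordered[upper] * weight


def summarize(values, lod):
    """Impute censored values, then compute basic summary statistics."""
    imputed = impute_censored(values, lod)
    return {
        "n": len(imputed),
        "n_below_lod": sum(1 for v in values if v is None or v < lod),
        "geometric_mean": geometric_mean(imputed),
        "p50": percentile(imputed, 50),
        "p95": percentile(imputed, 95),
    }


# Hypothetical biomarker concentrations; None marks a non-detect.
result = summarize([None, 0.5, 1.0, 2.0, 4.0], lod=0.2)
```

Only the Python standard library is used here, which keeps the sketch compatible with a browser-based runtime such as Pyodide, where code runs locally without uploading data.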
About the journal:
The International Journal of Hygiene and Environmental Health serves as a multidisciplinary forum for original reports on exposure assessment and the reactions to and consequences of human exposure to the biological, chemical, and physical environment. Research reports, short communications, reviews, scientific comments, technical notes, and editorials will be peer-reviewed before acceptance for publication. Priority will be given to articles on epidemiological aspects of environmental toxicology, health risk assessments, susceptible (sub)populations, sanitation and clean water, human biomonitoring, environmental medicine, and public health aspects of exposure-related outcomes.