Adeel Ansari, Marisa Conte, Allen Flynn, Avanti Paturkar
{"title":"一个原型ETL管道,在部署纯函数以丰富知识图谱患者数据时使用HL7 FHIR RDF资源。","authors":"Adeel Ansari, Marisa Conte, Allen Flynn, Avanti Paturkar","doi":"10.1186/s13326-025-00335-4","DOIUrl":null,"url":null,"abstract":"<p><strong>Background: </strong>For clinical care and research, knowledge graphs with patient data can be enriched by extracting parameters from a knowledge graph and then using them as inputs to compute new patient features with pure functions. Systematic and transparent methods for enriching knowledge graphs with newly computed patient features are of interest. When enriching the patient data in knowledge graphs this way, existing ontologies and well-known data resource standards can help promote semantic interoperability.</p><p><strong>Results: </strong>We developed and tested a new data processing pipeline for extracting, computing, and returning newly computed results to a large knowledge graph populated with electronic health record and patient survey data. We show that RDF data resource types already specified by Health Level 7's FHIR RDF effort can be programmatically validated and then used by this new data processing pipeline to represent newly derived patient-level features.</p><p><strong>Conclusions: </strong>Knowledge graph technology can be augmented with standards-based semantic data processing pipelines for deploying and tracing the use of pure functions to derive new patient-level features from existing data. Semantic data processing pipelines enable research enterprises to report on new patient-level computations of interest with linked metadata that details the origin and background of every new computation.</p>","PeriodicalId":15055,"journal":{"name":"Journal of Biomedical Semantics","volume":"16 1","pages":"16"},"PeriodicalIF":2.0000,"publicationDate":"2025-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12400713/pdf/","citationCount":"0","resultStr":"{\"title\":\"A prototype ETL pipeline that uses HL7 FHIR RDF resources when deploying pure functions to enrich knowledge graph patient data.\",\"authors\":\"Adeel Ansari, Marisa Conte, Allen Flynn, Avanti Paturkar\",\"doi\":\"10.1186/s13326-025-00335-4\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p><strong>Background: </strong>For clinical care and research, knowledge graphs with patient data can be enriched by extracting parameters from a knowledge graph and then using them as inputs to compute new patient features with pure functions. Systematic and transparent methods for enriching knowledge graphs with newly computed patient features are of interest. When enriching the patient data in knowledge graphs this way, existing ontologies and well-known data resource standards can help promote semantic interoperability.</p><p><strong>Results: </strong>We developed and tested a new data processing pipeline for extracting, computing, and returning newly computed results to a large knowledge graph populated with electronic health record and patient survey data. We show that RDF data resource types already specified by Health Level 7's FHIR RDF effort can be programmatically validated and then used by this new data processing pipeline to represent newly derived patient-level features.</p><p><strong>Conclusions: </strong>Knowledge graph technology can be augmented with standards-based semantic data processing pipelines for deploying and tracing the use of pure functions to derive new patient-level features from existing data. Semantic data processing pipelines enable research enterprises to report on new patient-level computations of interest with linked metadata that details the origin and background of every new computation.</p>\",\"PeriodicalId\":15055,\"journal\":{\"name\":\"Journal of Biomedical Semantics\",\"volume\":\"16 1\",\"pages\":\"16\"},\"PeriodicalIF\":2.0000,\"publicationDate\":\"2025-09-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12400713/pdf/\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Journal of Biomedical Semantics\",\"FirstCategoryId\":\"5\",\"ListUrlMain\":\"https://doi.org/10.1186/s13326-025-00335-4\",\"RegionNum\":3,\"RegionCategory\":\"工程技术\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q3\",\"JCRName\":\"MATHEMATICAL & COMPUTATIONAL BIOLOGY\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of Biomedical Semantics","FirstCategoryId":"5","ListUrlMain":"https://doi.org/10.1186/s13326-025-00335-4","RegionNum":3,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"MATHEMATICAL & COMPUTATIONAL BIOLOGY","Score":null,"Total":0}
A prototype ETL pipeline that uses HL7 FHIR RDF resources when deploying pure functions to enrich knowledge graph patient data.
Background: For clinical care and research, knowledge graphs with patient data can be enriched by extracting parameters from a knowledge graph and then using them as inputs to compute new patient features with pure functions. Systematic and transparent methods for enriching knowledge graphs with newly computed patient features are of interest. When enriching the patient data in knowledge graphs this way, existing ontologies and well-known data resource standards can help promote semantic interoperability.
Results: We developed and tested a new data processing pipeline for extracting, computing, and returning newly computed results to a large knowledge graph populated with electronic health record and patient survey data. We show that RDF data resource types already specified by Health Level 7's FHIR RDF effort can be programmatically validated and then used by this new data processing pipeline to represent newly derived patient-level features.
Conclusions: Knowledge graph technology can be augmented with standards-based semantic data processing pipelines for deploying and tracing the use of pure functions to derive new patient-level features from existing data. Semantic data processing pipelines enable research enterprises to report on new patient-level computations of interest with linked metadata that details the origin and background of every new computation.
期刊介绍:
Journal of Biomedical Semantics addresses issues of semantic enrichment and semantic processing in the biomedical domain. The scope of the journal covers two main areas:
Infrastructure for biomedical semantics: focusing on semantic resources and repositories, meta-data management and resource description, knowledge representation and semantic frameworks, the Biomedical Semantic Web, and semantic interoperability.
Semantic mining, annotation, and analysis: focusing on approaches and applications of semantic resources; and tools for investigation, reasoning, prediction, and discoveries in biomedicine.