An Intelligent Search & Retrieval System (IRIS) and Clinical and Research Repository for Decision Support Based on Machine Learning and Joint Kernel-based Supervised Hashing.

Impact factor: 2.4; JCR Q2, Mathematical & Computational Biology
Cancer Informatics. Publication date: 2024-02-04. eCollection date: 2024-01-01. DOI: 10.1177/11769351231223806
David J Foran, Wenjin Chen, Tahsin Kurc, Rajarshi Gupta, Jakub Roman Kaczmarzyk, Luke Austin Torre-Healy, Erich Bremer, Samuel Ajjarapu, Nhan Do, Gerald Harris, Antoinette Stroup, Eric Durbin, Joel H Saltz
Citation count: 0

Abstract

Large-scale, multi-site collaboration is becoming indispensable for a wide range of research and clinical activities in oncology. To facilitate the next generation of advances in cancer biology, precision oncology and the population sciences, it will be necessary to develop and implement data management and analytic tools that empower investigators to reliably and objectively detect, characterize and chronicle the phenotypic and genomic changes that occur during the transformation from the benign to the cancerous state and throughout the course of disease progression. To facilitate these efforts, it is incumbent upon the informatics community to establish the workflows and architectures that automate the aggregation and organization of a growing range and number of clinical data types and modalities, from new molecular and laboratory tests to sophisticated diagnostic imaging studies. In an attempt to meet those challenges, leading health care centers across the country are making steep investments to establish enterprise-wide data warehouses. A significant limitation of many data warehouses, however, is that they are designed to support only alphanumeric information. In contrast to those traditional designs, the system that we have developed supports automated collection and mining of multimodal data, including genomics, digital pathology and radiology images. In this paper, our team describes the design, development and implementation of a multimodal Clinical & Research Data Warehouse (CRDW) that is tightly integrated with a suite of computational and machine-learning tools to provide actionable insight into the underlying characteristics of the tumor environment that would not be revealed using standard methods and tools. The system features a flexible Extract, Transform and Load (ETL) interface that lets it aggregate data originating from different clinical and research sources, adapting to the specific EHR and other data sources used at a given deployment site.
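As an illustration of what a source-adaptable ETL interface can look like, the following is a minimal sketch and not the authors' implementation. It assumes each deployment site registers one small adapter per data source that maps raw rows into a shared record schema. All names here (`PatientRecord`, `from_ehr_a`, `from_pathology_archive`, `run_etl`) and the raw field names are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List


@dataclass(frozen=True)
class PatientRecord:
    """Hypothetical common schema that every source is mapped into."""
    patient_id: str
    modality: str   # e.g. "clinical", "genomics", "pathology", "radiology"
    payload: dict


# A source adapter turns one site-specific raw row into the common schema.
Adapter = Callable[[dict], PatientRecord]


def from_ehr_a(row: dict) -> PatientRecord:
    # Hypothetical EHR export that uses "mrn" and "dx" as field names.
    return PatientRecord(str(row["mrn"]), "clinical", {"diagnosis": row["dx"]})


def from_pathology_archive(row: dict) -> PatientRecord:
    # Hypothetical slide archive keyed by "patient" with an image path.
    return PatientRecord(str(row["patient"]), "pathology", {"slide": row["path"]})


def run_etl(sources: Dict[str, Iterable[dict]],
            adapters: Dict[str, Adapter]) -> List[PatientRecord]:
    """Extract rows per source, transform with that source's adapter, load into one list."""
    warehouse: List[PatientRecord] = []
    for name, rows in sources.items():
        adapter = adapters[name]  # site-specific mapping, looked up by source name
        warehouse.extend(adapter(row) for row in rows)
    return warehouse


# Usage: two differently shaped sources end up in one uniform collection.
records = run_etl(
    {"ehr_a": [{"mrn": 17, "dx": "C50.9"}],
     "slides": [{"patient": "17", "path": "s1.svs"}]},
    {"ehr_a": from_ehr_a, "slides": from_pathology_archive},
)
```

Under this design, supporting a new EHR at another site would only require writing and registering one more adapter; the loading step and the common schema stay unchanged.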

Source journal: Cancer Informatics (Medicine, Oncology)
CiteScore: 3.00
Self-citation rate: 5.00%
Articles published: 30
Review time: 8 weeks
Journal description: The field of cancer research relies on advances in many other disciplines, including omics technology, mass spectrometry, radio imaging, computer science, and biostatistics. Cancer Informatics provides open access to peer-reviewed high-quality manuscripts reporting bioinformatics analysis of molecular genetics and/or clinical data pertaining to cancer, emphasizing the use of machine learning, artificial intelligence, statistical algorithms, advanced imaging techniques, data visualization, and high-throughput technologies. As the leading journal dedicated exclusively to reporting the use of computational methods in cancer research and practice, Cancer Informatics leverages methodological improvements in systems biology, genomics, proteomics, metabolomics, and molecular biochemistry into the fields of cancer detection, treatment, classification, risk-prediction, prevention, outcome, and modeling.