A new process for healthcare big data warehouse integration

IF 0.4 Q4 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE
Nouha Arfaoui
Journal: International Journal of Data Mining Modelling and Management. Publication date: 2023-01-01. DOI: https://doi.org/10.1504/ijdmmm.2023.132974
Citations: 0

Abstract

The healthcare domain generates a huge amount of data from different, heterogeneous clinical data sources and devices, and this data supports the management of hospital performance. Because of the quantity and structural complexity of the data, we use a big healthcare data warehouse, first for storage and later for decision making. To achieve this goal, we propose a new process that deals with this type of data. It starts by unifying the different data, then extracts it, loads it into the big healthcare data warehouse, and finally performs the necessary transformations. For the first step, an ontology is used; we consider it the best solution to the problem of data source heterogeneity. We also use Hadoop and its ecosystem, including Hive, MapReduce and HDFS, to accelerate processing through parallelism, and we exploit ELT to ensure 'schema-on-read', where the data is stored before the transformation tasks are performed.
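The order of the steps in the abstract (unify, extract, load, then transform) can be illustrated with a minimal Python sketch. It is not the paper's implementation: the `CONCEPT_MAP` dictionary is a hypothetical stand-in for the ontology, the in-memory `raw_store` list stands in for files on HDFS, and the field names and sample records are invented for illustration. It only shows the ELT idea that records are stored raw and that types are applied when the data is read.

```python
import json

# Hypothetical term mapping standing in for an ontology: each source-specific
# field name is mapped to one shared concept name.
CONCEPT_MAP = {
    "pt_id": "patient_id",
    "patientId": "patient_id",
    "temp_c": "temperature_celsius",
    "bodyTemp": "temperature_celsius",
}


def unify(record):
    """Step 1: rename source-specific fields to shared concept names."""
    return {CONCEPT_MAP.get(key, key): value for key, value in record.items()}


def extract_and_load(sources, store):
    """Steps 2 and 3: extract records and store them raw, without casting."""
    for source in sources:
        for record in source:
            store.append(json.dumps(unify(record)))


def read_with_schema(store, schema):
    """Step 4: apply types only at read time (schema-on-read)."""
    for line in store:
        raw = json.loads(line)
        yield {
            field: cast(raw[field])
            for field, cast in schema.items()
            if field in raw
        }


# Two made-up sources that describe the same measurement differently.
source_a = [{"pt_id": "17", "temp_c": "37.2"}]
source_b = [{"patientId": 42, "bodyTemp": 38.5}]

raw_store = []  # stands in for raw files on HDFS
extract_and_load([source_a, source_b], raw_store)

# The schema is chosen by the reader, after the data has been stored.
schema = {"patient_id": int, "temperature_celsius": float}
rows = list(read_with_schema(raw_store, schema))
print(rows)
```

Because the stored lines keep the original values, a different schema can be applied later without reloading the data, which is the property that 'schema-on-read' refers to.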
Source journal: International Journal of Data Mining Modelling and Management (COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE)
CiteScore: 1.10
Self-citation rate: 0.00%
Articles published: 22
Journal description: Facilitating transformation from data to information to knowledge is paramount for organisations. Companies are flooded with data and conflicting information, but with limited real usable knowledge. However, rarely should a process be looked at from limited angles or in parts. Isolated islands of data mining, modelling and management (DMMM) should be connected. IJDMMM highlights integration of DMMM, statistics/machine learning/databases, each element of data chain management, types of information, algorithms in software; from data pre-processing to post-processing; between theory and applications. Topics covered include: Artificial intelligence; Biomedical science; Business analytics/intelligence, process modelling; Computer science, database management systems; Data management, mining, modelling, warehousing; Engineering; Environmental science, environment (ecoinformatics); Information systems/technology, telecommunications/networking; Management science, operations research, mathematics/statistics; Social sciences; Business/economics, (computational) finance; Healthcare, medicine, pharmaceuticals; (Computational) chemistry, biology (bioinformatics); Sustainable mobility systems, intelligent transportation systems; National security.