A Comprehensive Review of Unstructured Data Management Approaches in Data Warehouse

Vedika Gupta, A. Gosain
2013 International Symposium on Computational and Business Intelligence. Published 2013-08-24. DOI: 10.1109/ISCBI.2013.20 (https://doi.org/10.1109/ISCBI.2013.20)
Citations: 2

Abstract

The volume of business data is large and keeps evolving, leading to a heterogeneous information base. The challenge is to access, analyze, and integrate various data sources in order to make intelligent decisions. Business data can be structured or unstructured. Structured data fits the row-column format easily, whereas unstructured data (USD) poses problems for this kind of tabular storage. Because USD is more than three times the volume of structured data, and because it is more valuable to the business and helps in charting strategies and making decisions, it becomes important to devise methods for handling USD in a data warehouse. Since the importance of USD has been recognized, various authors have discussed different ways to manage it and extract useful information from it. In this paper, we first comprehensively review and survey representative research works by various authors that demonstrate how unstructured data can be handled in the warehouse. Finally, we present and sort these works by various parameters and provide the comparison in tabular form.