Data Integration, Data Quality, and Data Governance

Wilfried Lemahieu, S. V. Broucke, B. Baesens
DOI: 10.1017/9781316888773.020 (https://doi.org/10.1017/9781316888773.020)
Source: Principles of Database Management
Publication date: 2018-07-01
Citations: 0

Abstract

Chapter Objectives

In this chapter, you will learn to:
• identify the key challenges and approaches for data and process integration;
• understand the basic mechanisms of searching unstructured data within an organization and across the World Wide Web;
• define data quality as a multidimensional concept and understand how master data management (MDM) can contribute to it;
• understand different frameworks and standards for data governance;
• highlight more recent approaches in data warehousing, data integration, and governance.

Opening Scenario

Things are going well at Sober. The company has set up a solid data environment based on a solid relational database management system used to support the bulk of its operations. Sober's mobile app development team has been using MongoDB as a scalable NoSQL DBMS to handle the increased workload coming from mobile app users and to provide back-end support for experimental features the team wants to test in new versions of their mobile app.

Sober's development and database team is already paying attention to various data quality and governance aspects: the RDBMS is the central source of truth, strongly focusing on solid schema design and regular quality checks being performed on the data. The NoSQL database is an additional support system to handle large query volumes from mobile users in real-time in a scalable manner, but where all data changes are still being propagated to the central RDBMS. This is done in a manual manner, which sometimes leads to the two data sources not being in agreement with each other. Sober's team therefore wants to consider better data quality approaches to implement more robust quality checks on data and make sure that changes to the NoSQL database are propagated to the RDBMS system in a timely and correct manner. Sober also wants to understand how their data flows can be better integrated with their business processes.

In this chapter we will look at some managerial and technical aspects of data integration. We will zoom in on data integration techniques, data quality, and data governance. As companies often end up with many information systems and databases over time, the concept of data integration becomes increasingly important to consolidate a company's data to provide one, unified view to applications and users.