{"title":"面向大数据的数据质量评估","authors":"Oumaima Reda, Imad Sassi, A. Zellou, S. Anter","doi":"10.1145/3419604.3419803","DOIUrl":null,"url":null,"abstract":"In recent years, as more and more data sources have become available and the volumes of data potentially accessible have increased, the assessment of data quality has taken a central role whether at the academic, professional or any other sector. Given that users are often concerned with the need to filter a large amount of data to better satisfy their requirements and needs, and that data analysis can be based on inaccurate, incomplete, ambiguous, duplicated and of poor quality, it makes everyone wonder what the results of these analyses will really be like. However, there is a very complex process involved in the identification of new, valid, potentially useful and meaningful data from a large data collection and various information systems, and is critically dependent on a number of measures to be developed to ensure data quality. To this end, the main objective of this paper is to introduce a general study on data quality related with big data, by providing what other researchers came up with on that subject. The paper will be finalized by a comparative study between the different existing data quality models.","PeriodicalId":250715,"journal":{"name":"Proceedings of the 13th International Conference on Intelligent Systems: Theories and Applications","volume":"9 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2020-09-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"8","resultStr":"{\"title\":\"Towards a Data Quality Assessment in Big Data\",\"authors\":\"Oumaima Reda, Imad Sassi, A. Zellou, S. Anter\",\"doi\":\"10.1145/3419604.3419803\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"In recent years, as more and more data sources have become available and the volumes of data potentially accessible have increased, the assessment of data quality has taken a central role whether at the academic, professional or any other sector. Given that users are often concerned with the need to filter a large amount of data to better satisfy their requirements and needs, and that data analysis can be based on inaccurate, incomplete, ambiguous, duplicated and of poor quality, it makes everyone wonder what the results of these analyses will really be like. However, there is a very complex process involved in the identification of new, valid, potentially useful and meaningful data from a large data collection and various information systems, and is critically dependent on a number of measures to be developed to ensure data quality. To this end, the main objective of this paper is to introduce a general study on data quality related with big data, by providing what other researchers came up with on that subject. The paper will be finalized by a comparative study between the different existing data quality models.\",\"PeriodicalId\":250715,\"journal\":{\"name\":\"Proceedings of the 13th International Conference on Intelligent Systems: Theories and Applications\",\"volume\":\"9 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2020-09-23\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"8\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the 13th International Conference on Intelligent Systems: Theories and Applications\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/3419604.3419803\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 13th International Conference on Intelligent Systems: Theories and Applications","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3419604.3419803","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
In recent years, as more and more data sources have become available and the volumes of data potentially accessible have increased, the assessment of data quality has taken a central role whether at the academic, professional or any other sector. Given that users are often concerned with the need to filter a large amount of data to better satisfy their requirements and needs, and that data analysis can be based on inaccurate, incomplete, ambiguous, duplicated and of poor quality, it makes everyone wonder what the results of these analyses will really be like. However, there is a very complex process involved in the identification of new, valid, potentially useful and meaningful data from a large data collection and various information systems, and is critically dependent on a number of measures to be developed to ensure data quality. To this end, the main objective of this paper is to introduce a general study on data quality related with big data, by providing what other researchers came up with on that subject. The paper will be finalized by a comparative study between the different existing data quality models.