Farnaz Davardoost, Amin Babazadeh Sangar, K. Majidzadeh
Canadian Journal of Electrical and Computer Engineering-Revue Canadienne De Genie Electrique et Informatique. Published 2020-02-07. DOI: https://doi.org/10.1109/CJECE.2019.2953049
Extracting OLAP Cubes From Document-Oriented NoSQL Database Based on Parallel Similarity Algorithms
Relational databases are poorly suited to managing today's data because of its large variety and volume, and because most of it is untrusted. NoSQL has therefore attracted the attention of companies. Although NoSQL is a proper choice for managing large volumes of varied data, performing online analytical processing (OLAP) on it is difficult because it is schema-less. This article introduces a model that overcomes null values when converting document-oriented NoSQL databases into relational databases using parallel similarity techniques. The proposed model has four phases: shingling, chunk, minhashing, and locality-sensitive hashing MapReduce (LSHMR). Each phase applies its own processing step to the input NoSQL database. LSHMR combines locality-sensitive hashing (LSH) with MapReduce (MR): the LSH similarity search technique runs on the MR framework to extract OLAP cubes. LSH decreases the number of comparisons, and MR enables efficient distributed and parallel computing. The proposed model is an efficient and suitable approach for extracting OLAP cubes from a NoSQL database.
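The abstract names the phases but does not describe their internals, so the following is only a generic sketch of how shingling, minhashing, and LSH banding can group schema-less documents with similar field sets, not the authors' LSHMR implementation. Several things here are assumptions: field names are treated as the shingled tokens, the "chunk" phase is omitted because the abstract does not say what it does, the MapReduce steps are imitated in a single process by the hypothetical functions `map_phase` and `reduce_phase`, and the parameters (16 hash functions, 4 bands) are arbitrary.

```python
import hashlib
from collections import defaultdict
from itertools import combinations


def shingles(doc, k=1):
    """Return the k-shingles over the sorted field names of one document."""
    keys = sorted(doc)
    if len(keys) <= k:
        return {tuple(keys)}
    return {tuple(keys[i:i + k]) for i in range(len(keys) - k + 1)}


def _hash(seed, shingle):
    """Deterministic 64-bit hash of a shingle, salted by the seed."""
    data = f"{seed}:{'|'.join(shingle)}".encode("utf-8")
    return int.from_bytes(hashlib.md5(data).digest()[:8], "big")


def minhash(shingle_set, num_hashes=16):
    """MinHash signature: the minimum hash value per seeded hash function."""
    return [min(_hash(seed, s) for s in shingle_set)
            for seed in range(num_hashes)]


def map_phase(doc_id, signature, bands=4):
    """Map step (simulated): emit one (band key, doc id) pair per band."""
    rows = len(signature) // bands
    for b in range(bands):
        yield (b, tuple(signature[b * rows:(b + 1) * rows])), doc_id


def reduce_phase(emitted):
    """Reduce step (simulated): documents sharing a band key become candidates."""
    buckets = defaultdict(list)
    for key, doc_id in emitted:
        buckets[key].append(doc_id)
    pairs = set()
    for ids in buckets.values():
        pairs.update(combinations(sorted(ids), 2))
    return pairs


def candidate_pairs(docs, num_hashes=16, bands=4):
    """Run shingling, minhashing, and LSH banding over a list of documents."""
    emitted = []
    for doc_id, doc in enumerate(docs):
        signature = minhash(shingles(doc), num_hashes)
        emitted.extend(map_phase(doc_id, signature, bands))
    return reduce_phase(emitted)
```

Calling `candidate_pairs` on a list of dictionaries returns pairs of document indices that landed in at least one common bucket. Documents with identical field sets always pair up, and only candidate pairs need a full comparison, which is how LSH reduces the number of comparisons. In a real MapReduce deployment the band keys would serve as shuffle keys, so each bucket could be reduced on a separate worker.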
About the journal:
The Canadian Journal of Electrical and Computer Engineering (ISSN-0840-8688), issued quarterly, has been publishing high-quality refereed scientific papers in all areas of electrical and computer engineering since 1976.