{"title":"CedCom:大数据应用的高性能架构","authors":"Tanguy Raynaud, R. Haque, H. Aït-Kaci","doi":"10.1109/AICCSA.2014.7073257","DOIUrl":null,"url":null,"abstract":"Distributed architecture is widely used for storing and processing Big Data. Operations on Big Data need first, locating the required data blocks and then, reading them. Data can be located in different types of memories in particular, cache memory, main memory, and secondary memory. Reading data from secondary memory to process Big Data jobs is not an ideal approach especially for high performance applications because, accessing data in secondary devices can be slow for processors. In addition, fetching data from main memory is time consuming due to limited I/O bandwidth. These system level issues are barriers for optimizing performance of Big Data applications. Simply put, for optimizing the application performance, it is not sufficient to have efficient algorithms only, an efficient architecture is needed to provide faster data access by the processors. The need for such an architecture has been documented in the literature, however, the state of the art is still missing an efficient architecture. This paper develops a promising architecture which caches data in main memory. It essentially transforms a main memory into a attraction memory which enables high-speed data access. Also, it enables automatic migration of data blocks and computations across the nodes contained in the clusters. It offers an exchange protocol for fast transfer of data blocks between the different physical nodes and speeds up job processing. The proposed architecture combines the power of Cache-Only Memory Architecture (COMA) and the structural principle of Hadoop.","PeriodicalId":412749,"journal":{"name":"2014 IEEE/ACS 11th International Conference on Computer Systems and Applications (AICCSA)","volume":"23 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2014-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"4","resultStr":"{\"title\":\"CedCom: A high-performance architecture for Big Data applications\",\"authors\":\"Tanguy Raynaud, R. Haque, H. Aït-Kaci\",\"doi\":\"10.1109/AICCSA.2014.7073257\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Distributed architecture is widely used for storing and processing Big Data. Operations on Big Data need first, locating the required data blocks and then, reading them. Data can be located in different types of memories in particular, cache memory, main memory, and secondary memory. Reading data from secondary memory to process Big Data jobs is not an ideal approach especially for high performance applications because, accessing data in secondary devices can be slow for processors. In addition, fetching data from main memory is time consuming due to limited I/O bandwidth. These system level issues are barriers for optimizing performance of Big Data applications. Simply put, for optimizing the application performance, it is not sufficient to have efficient algorithms only, an efficient architecture is needed to provide faster data access by the processors. The need for such an architecture has been documented in the literature, however, the state of the art is still missing an efficient architecture. This paper develops a promising architecture which caches data in main memory. It essentially transforms a main memory into a attraction memory which enables high-speed data access. 
Also, it enables automatic migration of data blocks and computations across the nodes contained in the clusters. It offers an exchange protocol for fast transfer of data blocks between the different physical nodes and speeds up job processing. The proposed architecture combines the power of Cache-Only Memory Architecture (COMA) and the structural principle of Hadoop.\",\"PeriodicalId\":412749,\"journal\":{\"name\":\"2014 IEEE/ACS 11th International Conference on Computer Systems and Applications (AICCSA)\",\"volume\":\"23 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2014-11-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"4\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2014 IEEE/ACS 11th International Conference on Computer Systems and Applications (AICCSA)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/AICCSA.2014.7073257\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2014 IEEE/ACS 11th International Conference on Computer Systems and Applications (AICCSA)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/AICCSA.2014.7073257","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
CedCom: A high-performance architecture for Big Data applications
Distributed architectures are widely used for storing and processing Big Data. Operations on Big Data first require locating the needed data blocks and then reading them. Data can reside in different types of memory, in particular cache memory, main memory, and secondary memory. Reading data from secondary memory to process Big Data jobs is not an ideal approach, especially for high-performance applications, because accessing data on secondary devices is slow for processors. In addition, fetching data from main memory is time-consuming due to limited I/O bandwidth. These system-level issues are barriers to optimizing the performance of Big Data applications. Simply put, efficient algorithms alone are not sufficient to optimize application performance; an efficient architecture is also needed to provide processors with faster data access. The need for such an architecture has been documented in the literature; however, the state of the art still lacks one. This paper develops a promising architecture that caches data in main memory. It essentially transforms main memory into an attraction memory, which enables high-speed data access. It also enables automatic migration of data blocks and computations across the nodes contained in the cluster. It offers an exchange protocol for fast transfer of data blocks between different physical nodes and speeds up job processing. The proposed architecture combines the power of Cache-Only Memory Architecture (COMA) with the structural principles of Hadoop.
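The core mechanism the abstract describes is the COMA-style attraction memory: a data block has no fixed home, and whichever node processes it pulls the block into its own main memory so that later accesses are local. The following is a minimal, hypothetical Java sketch of that idea only; the class and method names are assumptions for illustration and are not the paper's actual implementation or API.

```java
// Hypothetical sketch of an attraction-memory node (not from the paper).
// A block read first checks local main memory; on a miss the block migrates
// from a remote node via the exchange protocol and is cached locally.
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class AttractionMemoryNode {

    /** Blocks currently attracted into this node's main memory, keyed by block id. */
    private final Map<String, byte[]> attractionMemory = new ConcurrentHashMap<>();

    /** Stand-in for the inter-node exchange protocol that transfers blocks. */
    private final RemoteBlockFetcher fetcher;

    public AttractionMemoryNode(RemoteBlockFetcher fetcher) {
        this.fetcher = fetcher;
    }

    /**
     * Returns the requested block. On a local hit the data is served from main
     * memory; on a miss the block is fetched from a remote node and cached here,
     * so the computation and its data end up co-located.
     */
    public byte[] readBlock(String blockId) {
        return attractionMemory.computeIfAbsent(blockId, fetcher::fetch);
    }

    /** Abstraction over the block transfer between physical nodes. */
    public interface RemoteBlockFetcher {
        byte[] fetch(String blockId);
    }

    public static void main(String[] args) {
        // Fake remote fetch; in a real cluster this would be a network transfer.
        AttractionMemoryNode node =
            new AttractionMemoryNode(id -> ("data-for-" + id).getBytes());
        System.out.println(new String(node.readBlock("block-42"))); // miss: block migrates
        System.out.println(new String(node.readBlock("block-42"))); // hit: served locally
    }
}
```

The design point this sketch is meant to convey is that, unlike reading from secondary storage on every job, repeated accesses to the same block are served from the processing node's main memory after the first migration.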