{"title":"用于实时流分析的列存储引擎","authors":"Alex Skidanov, Anders J. Papito, A. Prout","doi":"10.1109/ICDE.2016.7498332","DOIUrl":null,"url":null,"abstract":"This paper describes novel aspects of the column store implemented in the MemSQL database engine and describes the design choices made to support real-time streaming workloads. Column stores have traditionally been restricted to data warehouse scenarios where low latency queries are a secondary goal, and where restricting data ingestion to be offline, batched, append-only, or some combination thereof is acceptable. In contrast, the MemSQL column store implementation treats low latency queries and ongoing writes as first class citizens, with a focus on avoiding interference between read, ingest, update, and storage optimization workloads through the use of fragmented snapshot transactions and optimistic storage reordering. This implementation broadens the range of serviceable column store workloads to include those with more stringent demands on query and data latency, such as those backing operational systems used by adtech, financial services, fraud detection and other real-time or data streaming applications.","PeriodicalId":6883,"journal":{"name":"2016 IEEE 32nd International Conference on Data Engineering (ICDE)","volume":"139 1","pages":"1287-1297"},"PeriodicalIF":0.0000,"publicationDate":"2016-05-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"10","resultStr":"{\"title\":\"A column store engine for real-time streaming analytics\",\"authors\":\"Alex Skidanov, Anders J. Papito, A. Prout\",\"doi\":\"10.1109/ICDE.2016.7498332\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"This paper describes novel aspects of the column store implemented in the MemSQL database engine and describes the design choices made to support real-time streaming workloads. Column stores have traditionally been restricted to data warehouse scenarios where low latency queries are a secondary goal, and where restricting data ingestion to be offline, batched, append-only, or some combination thereof is acceptable. In contrast, the MemSQL column store implementation treats low latency queries and ongoing writes as first class citizens, with a focus on avoiding interference between read, ingest, update, and storage optimization workloads through the use of fragmented snapshot transactions and optimistic storage reordering. This implementation broadens the range of serviceable column store workloads to include those with more stringent demands on query and data latency, such as those backing operational systems used by adtech, financial services, fraud detection and other real-time or data streaming applications.\",\"PeriodicalId\":6883,\"journal\":{\"name\":\"2016 IEEE 32nd International Conference on Data Engineering (ICDE)\",\"volume\":\"139 1\",\"pages\":\"1287-1297\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2016-05-16\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"10\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2016 IEEE 32nd International Conference on Data Engineering (ICDE)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ICDE.2016.7498332\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2016 IEEE 32nd International Conference on Data Engineering (ICDE)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICDE.2016.7498332","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
A column store engine for real-time streaming analytics
This paper describes novel aspects of the column store implemented in the MemSQL database engine and describes the design choices made to support real-time streaming workloads. Column stores have traditionally been restricted to data warehouse scenarios where low latency queries are a secondary goal, and where restricting data ingestion to be offline, batched, append-only, or some combination thereof is acceptable. In contrast, the MemSQL column store implementation treats low latency queries and ongoing writes as first class citizens, with a focus on avoiding interference between read, ingest, update, and storage optimization workloads through the use of fragmented snapshot transactions and optimistic storage reordering. This implementation broadens the range of serviceable column store workloads to include those with more stringent demands on query and data latency, such as those backing operational systems used by adtech, financial services, fraud detection and other real-time or data streaming applications.