Zaheer Chothia, J. Liagouris, D. Dimitrova, Timothy Roscoe
{"title":"联机重建数据中心日志结构信息","authors":"Zaheer Chothia, J. Liagouris, D. Dimitrova, Timothy Roscoe","doi":"10.1145/3064176.3064195","DOIUrl":null,"url":null,"abstract":"Well-run datacenter application architectures are heavily instrumented to provide detailed traces of messages and remote invocations. Reconstructing user sessions, call graphs, transaction trees, and other structural information from these messages, a process known as sessionization, is the foundation for a variety of diagnostic, profiling, and monitoring tasks essential to the operation of the datacenter. We present the design and implementation of a system which processes log streams at gigabits per second and reconstructs user sessions comprising millions of transactions per second in real time with modest compute resources, while dealing with clock skew, message loss, and other real-world phenomena that make such a task challenging. Our system is based on the Timely Dataflow framework for low latency, data-parallel computation, and we demonstrate its utility with a number of use-cases and traces from a large, operational, mission-critical enterprise data center.","PeriodicalId":262089,"journal":{"name":"Proceedings of the Twelfth European Conference on Computer Systems","volume":"11 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2017-04-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"13","resultStr":"{\"title\":\"Online Reconstruction of Structural Information from Datacenter Logs\",\"authors\":\"Zaheer Chothia, J. Liagouris, D. Dimitrova, Timothy Roscoe\",\"doi\":\"10.1145/3064176.3064195\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Well-run datacenter application architectures are heavily instrumented to provide detailed traces of messages and remote invocations. Reconstructing user sessions, call graphs, transaction trees, and other structural information from these messages, a process known as sessionization, is the foundation for a variety of diagnostic, profiling, and monitoring tasks essential to the operation of the datacenter. We present the design and implementation of a system which processes log streams at gigabits per second and reconstructs user sessions comprising millions of transactions per second in real time with modest compute resources, while dealing with clock skew, message loss, and other real-world phenomena that make such a task challenging. Our system is based on the Timely Dataflow framework for low latency, data-parallel computation, and we demonstrate its utility with a number of use-cases and traces from a large, operational, mission-critical enterprise data center.\",\"PeriodicalId\":262089,\"journal\":{\"name\":\"Proceedings of the Twelfth European Conference on Computer Systems\",\"volume\":\"11 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2017-04-23\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"13\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the Twelfth European Conference on Computer Systems\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/3064176.3064195\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the Twelfth European Conference on Computer Systems","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3064176.3064195","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Online Reconstruction of Structural Information from Datacenter Logs
Well-run datacenter application architectures are heavily instrumented to provide detailed traces of messages and remote invocations. Reconstructing user sessions, call graphs, transaction trees, and other structural information from these messages, a process known as sessionization, is the foundation for a variety of diagnostic, profiling, and monitoring tasks essential to the operation of the datacenter. We present the design and implementation of a system which processes log streams at gigabits per second and reconstructs user sessions comprising millions of transactions per second in real time with modest compute resources, while dealing with clock skew, message loss, and other real-world phenomena that make such a task challenging. Our system is based on the Timely Dataflow framework for low latency, data-parallel computation, and we demonstrate its utility with a number of use-cases and traces from a large, operational, mission-critical enterprise data center.