{"title":"High Throughput Total Order Broadcast for Cluster Environments","authors":"R. Guerraoui, Ron R. Levy, Bastian Pochon, Vivien Quéma","doi":"10.1109/DSN.2006.37","DOIUrl":"https://doi.org/10.1109/DSN.2006.37","url":null,"abstract":"Total order broadcast is a fundamental communication primitive that plays a central role in bringing cheap software-based high availability to a wide array of services. This paper studies the practical performance of such a primitive on a cluster of homogeneous machines. We present FSR, a (uniform) total order broadcast protocol that provides high throughput, regardless of message broadcast patterns. FSR is based on a ring topology, only relies on point-to-point inter-process communication, and has a linear latency with respect to the total number of processes in the system. Moreover, it is fair in the sense that each process has an equal opportunity of having its messages delivered by all processes. On a cluster of Itanium based machines, FSR achieves a throughput of 79 Mbit/s on a 100 Mbit/s switched Ethernet network","PeriodicalId":228470,"journal":{"name":"International Conference on Dependable Systems and Networks (DSN'06)","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-06-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133884667","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Mitigating Active Attacks Towards Client Networks Using the Bitmap Filter","authors":"Chun-Ying Huang, Kuan-Ta Chen, C. Lei","doi":"10.1109/DSN.2006.54","DOIUrl":"https://doi.org/10.1109/DSN.2006.54","url":null,"abstract":"With the emergence of active worms, the targets of attacks have been moved from well-known Internet servers to generic Internet hosts, and since the rate at which patches can be applied is always much slower than the spread of a worm, an Internet worm can usually attack or infect millions of hosts in a short time. It is difficult to eliminate Internet attacks globally; thus, protecting client networks from being attacked or infected is a relatively critical issue. In this paper, we propose a method that protects client networks from being attacked by people who try to scan, attack, or infect hosts in local networks via unpatched vulnerabilities. Based on the symmetry of network traffic in both temporal and spatial domains, a bitmap filter is installed at the entry point of a client network to filter out possible attack traffic. Our evaluation shows that with a small amount of memory (less than 1 megabyte), more than 95% of attack traffic can be filtered out in a small- or medium-scale client network","PeriodicalId":228470,"journal":{"name":"International Conference on Dependable Systems and Networks (DSN'06)","volume":"8 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-06-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128426769","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"In-Register Duplication: Exploiting Narrow-Width Value for Improving Register File Reliability","authors":"Jie S. Hu, Shuai Wang, Sotirios G. Ziavras","doi":"10.1109/DSN.2006.43","DOIUrl":"https://doi.org/10.1109/DSN.2006.43","url":null,"abstract":"Protecting the register value and its data buses is crucial to reliable computing in high-performance microprocessors due to the increasing susceptibility of CMOS circuitry to soft errors induced by high-energy particle strikes. Since the register file is in the critical path of the processor pipeline, any reliable design that increases either the pressure on the register file or the register file access latency is not desirable. In this paper, we propose to exploit narrow-width register values, which present the majority of the generated values, for duplicating a copy of the value within the same data item, called in-register duplication (IRD), eliminating the requirement of additional copy registers. The datapath pipeline is augmented to efficiently incorporate parity encoding and parity checking such that error recovery is seamlessly supported in IRD and the parity checking is overlapped with the execution stage to avoid increasing the critical path. Our experimental evaluation using the SPEC CINT2000 benchmark suite shows that IRD provides superior read-with-duplicate (RWD) and error detection/recovery rates under heavy error injection as compared to previous reliability schemes","PeriodicalId":228470,"journal":{"name":"International Conference on Dependable Systems and Networks (DSN'06)","volume":"43 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-06-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126135599","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Exploring Fault-Tolerant Network-on-Chip Architectures","authors":"Dongkook Park, C. Nicopoulos, Jongman Kim, N. Vijaykrishnan, C. Das","doi":"10.1109/DSN.2006.35","DOIUrl":"https://doi.org/10.1109/DSN.2006.35","url":null,"abstract":"The advent of deep sub-micron technology has exacerbated reliability issues in on-chip interconnects. In particular, single event upsets, such as soft errors, and hard faults are rapidly becoming a force to be reckoned with. This spiraling trend highlights the importance of detailed analysis of these reliability hazards and the incorporation of comprehensive protection measures into all network-on-chip (NoC) designs. In this paper, we examine the impact of transient failures on the reliability of on-chip interconnects and develop comprehensive counter-measures to either prevent or recover from them. In this regard, we propose several novel schemes to remedy various kinds of soft error symptoms, while keeping area and power overhead at a minimum. Our proposed solutions are architected to fully exploit the available infrastructures in an NoC and enable versatile reuse of valuable resources. The effectiveness of the proposed techniques has been validated using a cycle-accurate simulator","PeriodicalId":228470,"journal":{"name":"International Conference on Dependable Systems and Networks (DSN'06)","volume":"67 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-06-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121019056","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Component-Level Path Composition Approach for Efficient Transient Analysis of Large CTMCs","authors":"V. Lam, W. Sanders, P. Buchholz","doi":"10.1109/DSN.2006.1","DOIUrl":"https://doi.org/10.1109/DSN.2006.1","url":null,"abstract":"Path-based techniques make the analysis of very large Markov models feasible by trading off high computational complexity for low space complexity. Often, a drawback in these techniques is that they have to evaluate many paths in order to compute reasonably tight bounds on the exact solutions of the models. In this paper, we present a path composition algorithm to speed up path evaluation significantly. It works by quickly composing subpaths that are precomputed locally at the component level. The algorithm is computationally efficient since individual subpaths are precomputed only once, and the results are reused many times in the computation of all composed paths. To the best of our knowledge, this work is the first to propose the idea of path composition for the analysis of Markov models. A practical implementation of the algorithm makes it feasible to solve even larger models, since it helps not only in evaluating more paths faster but also in computing long paths efficiently by composing them from short ones. In addition to presenting the algorithm, we demonstrate its application and evaluate its performance in computing the reliability and availability of a large distributed information service system in the presence of fault propagation and in computing the probabilities of buffer overflow and buffer flushing in a media multicast system with varying system configurations","PeriodicalId":228470,"journal":{"name":"International Conference on Dependable Systems and Networks (DSN'06)","volume":"33 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-06-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121217900","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Consensus with Byzantine Failures and Little System Synchrony","authors":"M. Aguilera, C. Delporte-Gallet, H. Fauconnier, S. Toueg","doi":"10.1109/DSN.2006.22","DOIUrl":"https://doi.org/10.1109/DSN.2006.22","url":null,"abstract":"We study consensus in a message-passing system where only some of the n^2 links exhibit some synchrony. This problem was previously studied for systems with process crashes; we now consider Byzantine failures. We show that consensus can be solved in a system where there is at least one non-faulty process whose links are eventually timely; all other links can be arbitrarily slow. We also show that, in terms of problem solvability, such a system is strictly weaker than one where all links are eventually timely","PeriodicalId":228470,"journal":{"name":"International Conference on Dependable Systems and Networks (DSN'06)","volume":"235 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-06-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114262800","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Solving Atomic Broadcast with Indirect Consensus","authors":"Richard Ekwall, A. Schiper","doi":"10.1109/DSN.2006.65","DOIUrl":"https://doi.org/10.1109/DSN.2006.65","url":null,"abstract":"In previous work, it has been shown how to solve atomic broadcast by reduction to consensus on messages. While this solution is theoretically correct, it has its limitations in practice, since executing consensus on large messages can quickly saturate the system. The problem can be addressed by executing consensus on message identifiers instead of the full messages, in order to decouple the size of the messages from the size of the data sent by the consensus algorithm. In this paper, we study the impact of executing consensus on message identifiers instead of on the full messages, in the context of solving atomic broadcast. We also discuss the implications of executing consensus on message identifiers on the consensus and atomic broadcast algorithms","PeriodicalId":228470,"journal":{"name":"International Conference on Dependable Systems and Networks (DSN'06)","volume":"14 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-06-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122937582","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Using Attack Injection to Discover New Vulnerabilities","authors":"N. Neves, João Antunes, M. Correia, P. Veríssimo, R. Neves","doi":"10.1109/DSN.2006.72","DOIUrl":"https://doi.org/10.1109/DSN.2006.72","url":null,"abstract":"Due to our increasing reliance on computer systems, security incidents and their causes are important problems that need to be addressed. To contribute to this objective, the paper describes a new tool for the discovery of security vulnerabilities on network-connected servers. The AJECT tool uses a specification of the server's communication protocol to automatically generate a large number of attacks according to some predefined test classes. Then, while it performs these attacks through the network, it monitors the behavior of the server both from a client perspective and inside the target machine. The observation of an incorrect behavior indicates a successful attack and the potential existence of a vulnerability. To demonstrate the usefulness of this approach, a considerable number of experiments were carried out with several IMAP servers. The results show that AJECT can discover several kinds of vulnerabilities, including a previously unknown vulnerability","PeriodicalId":228470,"journal":{"name":"International Conference on Dependable Systems and Networks (DSN'06)","volume":"13 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-06-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124255238","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Tracking Probabilistic Correlation of Monitoring Data for Fault Detection in Complex Systems","authors":"Zhen Guo, Guofei Jiang, Haifeng Chen, K. Yoshihira","doi":"10.1109/DSN.2006.70","DOIUrl":"https://doi.org/10.1109/DSN.2006.70","url":null,"abstract":"Due to their growing complexity, it becomes extremely difficult to detect and isolate faults in complex systems. While a large amount of monitoring data can be collected from such systems for fault analysis, one challenge is how to correlate the data effectively across distributed systems and observation time. Much of the internal monitoring data reacts to the volume of user requests accordingly when user requests flow through distributed systems. In this paper, we use Gaussian mixture models to characterize probabilistic correlation between flow-intensities measured at multiple points. A novel algorithm derived from the expectation-maximization (EM) algorithm is proposed to learn the \"likely\" boundary of normal data relationship, which is further used as an oracle in anomaly detection. Our recursive algorithm can adaptively estimate the boundary of dynamic data relationship and detect faults in real time. Our approach is tested in a real system with injected faults and the results demonstrate its feasibility","PeriodicalId":228470,"journal":{"name":"International Conference on Dependable Systems and Networks (DSN'06)","volume":"43 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-06-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122097841","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Lucky Read/Write Access to Robust Atomic Storage","authors":"R. Guerraoui, Ron R. Levy, M. Vukolic","doi":"10.1109/DSN.2006.50","DOIUrl":"https://doi.org/10.1109/DSN.2006.50","url":null,"abstract":"This paper establishes tight bounds on the best-case time-complexity of distributed atomic read/write storage implementations that tolerate worst-case conditions. We study asynchronous robust implementations where a writer and a set of reader processes (clients) access an atomic storage implemented over a set of 2t+b+1 server processes of which t can fail: b of these can be malicious and the rest can crash. We define a lucky operation (read or write) as one that runs synchronously and without contention. It is often argued in practice that lucky operations are the most frequent. We determine the exact conditions under which a lucky operation can be fast, namely expedited in one-communication round-trip with no data authentication. We show that every lucky write (resp., read) can be fast despite fw (resp., fr) actual failures, if and only if fw+fr ≤ t-b","PeriodicalId":228470,"journal":{"name":"International Conference on Dependable Systems and Networks (DSN'06)","volume":"8 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-06-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123726875","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}