P. Chung, Adam Woei-Jyh Lee, Joanne Shih, S. Yajnik, Yennun Huang
Proceedings of the International Symposium on Distributed Objects and Applications, 1999-09-05. DOI: 10.1109/DOA.1999.793991 (https://doi.org/10.1109/DOA.1999.793991)
Fault-injection experiments for distributed objects
This paper discusses experiments that study the behavior of distributed objects in the presence of failures. The work is motivated by a practical need in designing object-based distributed systems: system developers need to understand how objects fail and how to handle these failures in their designs. We consider two distributed object platforms, DCOM and IONA's Orbix (an implementation of CORBA), and investigate nine potential failure scenarios, corresponding to three failure types (hanging, abnormal termination, and crashes) applied to three system components (threads, processes, and machines). We design experiments to inject failures into server object executions and present the results as perceived by clients when these failures occur in the server objects. We apply the results to evaluate the effectiveness of a set of simple monitoring and recovery mechanisms and to suggest improvements to the current DCOM and Orbix implementations.
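
The nine scenarios in the abstract are the cross product of three failure types and three system components. A minimal sketch of that scenario matrix follows. The names `FailureType`, `Component`, and `failure_scenarios` are hypothetical and chosen only for illustration; the abstract does not describe how the authors actually organized or injected the failures.

```python
from enum import Enum
from itertools import product


class FailureType(Enum):
    """The three failure types named in the abstract."""
    HANG = "hanging"
    ABNORMAL_TERMINATION = "abnormal termination"
    CRASH = "crash"


class Component(Enum):
    """The three system components named in the abstract."""
    THREAD = "thread"
    PROCESS = "process"
    MACHINE = "machine"


def failure_scenarios():
    """Return every (component, failure type) pair: 3 x 3 = 9 scenarios."""
    return list(product(Component, FailureType))


if __name__ == "__main__":
    # Print one line per scenario, e.g. "thread: hanging".
    for component, failure in failure_scenarios():
        print(f"{component.value}: {failure.value}")
```

Each pair would correspond to one experiment run against a server object on each platform, with the outcome recorded from the client's point of view.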