{"title":"Evaluation of fault-tolerant designs implemented on SRAM-based FPGAs","authors":"H. Asadi, S. Miremadi, H. Zarandi, A. Ejlali","doi":"10.1109/PRDC.2004.1276583","DOIUrl":"https://doi.org/10.1109/PRDC.2004.1276583","url":null,"abstract":"The technology of SRAM-based devices is sensitive to single event upsets (SEUs) that may be induced mainly by high energy heavy ions and neutrons. We present a framework for the evaluation of fault-tolerant designs implemented on SRAM-based FPGAs using emulated SEUs. The SEU injection process is performed by inserting emulated SEUs in the device using its configuration bitstream file. An Altera FPGA, i.e. the Flex10K200, and the ITC'99 benchmark circuits are used to experimentally evaluate the method. The results show that between 32 and 45 percent of SEUs injected into the device propagate to the output terminals of the device.","PeriodicalId":383639,"journal":{"name":"10th IEEE Pacific Rim International Symposium on Dependable Computing, 2004. Proceedings.","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-03-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126413910","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Systematic comparisons of RDT communication-induced checkpointing protocols","authors":"Jichiang Tsai","doi":"10.1109/PRDC.2004.1276554","DOIUrl":"https://doi.org/10.1109/PRDC.2004.1276554","url":null,"abstract":"Rollback-dependency trackability (RDT) is a property stating that all rollback dependencies between local checkpoints are online trackable by using a transitive dependency vector. Since the RDT property was introduced, many communication-induced checkpointing protocols satisfying such a property have been proposed in the literature. Most protocols can be classified as three families, PCM family, EPSCM family and PMM family, according to their underlying RDT characterizations. Up to now, several theoretical analyses on comparing the performance of RDT protocols were addressed, but simulation studies on this topic are rare and not comprehensive. We present a simulation study for comparing RDT protocols systematically. The simulation is carried out in different computational environments. We will not only verify the results from existing theoretical analyses, but also explore the impact of optimizations on protocols. Our results can provide guidelines for understanding the efficiency of RDT protocols.","PeriodicalId":383639,"journal":{"name":"10th IEEE Pacific Rim International Symposium on Dependable Computing, 2004. Proceedings.","volume":"24 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-03-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132226743","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Availabilities and costs of reliable Fat-Btrees","authors":"Jun Miyazaki, Yohei Abe, H. Yokota","doi":"10.1109/PRDC.2004.1276567","DOIUrl":"https://doi.org/10.1109/PRDC.2004.1276567","url":null,"abstract":"The Fat-Btree is expected to be used as a parallel directory structure for high performance and highly reliable data intensive systems, such as databases and file systems. Though the Fat-Btree is essentially high performance, it needs to be highly reliable as well. In order to achieve high reliability, we introduce five configuration methods. We show not only their steady-state and computational availabilities but also cost related comparisons, so as to evaluate the properties of these configurations.","PeriodicalId":383639,"journal":{"name":"10th IEEE Pacific Rim International Symposium on Dependable Computing, 2004. Proceedings.","volume":"573 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-03-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116297422","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Aspects for improvement of performance in fault-tolerant software","authors":"Diana Szentiványi, S. Nadjm-Tehrani","doi":"10.1109/PRDC.2004.1276578","DOIUrl":"https://doi.org/10.1109/PRDC.2004.1276578","url":null,"abstract":"We describe the use of aspect-oriented programming to improve performance of fault-tolerant (FT) servers built with middleware support. Its contribution is to shift method call logging from middleware to application level in primary-backup replication. The novelty consists in no burden being placed on application writers, except for a simple component description aiding automatic generation of aspect code. The approach is illustrated by describing how synchronization aspects are weaved in an application, and modifications of an FT-CORBA platform to avoid middleware level logging. Evaluation is performed using a telecom application enriched with aspects, running on top of the aspect-supporting platform. We compare overheads with earlier results from runs on the base-line platform. Experiments show a drop of around 40% of original overheads. This is due to methods starting execution before previous ones end, in contrast to ordering enforced at middleware level where methods are executed sequentially, not adapting to application knowledge.","PeriodicalId":383639,"journal":{"name":"10th IEEE Pacific Rim International Symposium on Dependable Computing, 2004. Proceedings.","volume":"23 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-03-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127100342","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Connecting network partitions with location-assisted forwarding nodes in mobile ad hoc environments","authors":"Chia-Ho Ou, K. Ssu, H. C. Jiau","doi":"10.1109/PRDC.2004.1276574","DOIUrl":"https://doi.org/10.1109/PRDC.2004.1276574","url":null,"abstract":"A mobile ad hoc network consisting of a set of mobile nodes does not have any infrastructure support or central management. Data packets are routed by the mobile nodes in the network instead of base stations. Due to node mobility or geographical limitations, the ad hoc network topology may be partitioned. Data communication and information availability are thus affected. The disconnection problem makes a critical strike on ad hoc routing protocols because most protocols typically assume that the ad hoc network is always connected. We describe an approach to recovering the disconnected mobile ad hoc networks by deploying forwarding nodes. The forwarding nodes can automatically move to appropriate locations for interconnecting network partitions. The mechanism is distributed and self-organized and can be integrated with other routing protocols. The mechanism has been implemented in the network simulator ns-2. The simulation results show that the algorithm can efficiently improve information availability and recover network partitions.","PeriodicalId":383639,"journal":{"name":"10th IEEE Pacific Rim International Symposium on Dependable Computing, 2004. Proceedings.","volume":"12 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-03-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129076191","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Improving AI systems' dependability by utilizing historical knowledge","authors":"R. Knauf, S. Tsuruta, H. Ihara, Avelino J. Gonzalez, Torsten Kurbad","doi":"10.1109/PRDC.2004.1276590","DOIUrl":"https://doi.org/10.1109/PRDC.2004.1276590","url":null,"abstract":"A Turing test is a promising way to validate AI systems which usually have no way to prove correctness. However, human experts (validators) are often too busy to participate in it and sometimes have different opinions per person as well as per validation session. To cope with these and increase the validation dependability, a validation knowledge base (VKB) in Turing test-like validation is proposed. The VKB is constructed and maintained across various validation sessions. Primary benefits are (1) decreasing validators' workload, (2) refining the methodology itself, e.g. selecting dependable validators using VKB, and (3) increasing AI systems' dependabilities through dependable validation, e.g. support to identify optimal solutions. Finally, validation experts software agents (VESA) are introduced to further break limitations of human validator's dependability. Each VESA is a software agent corresponding to a particular human validator. This suggests the ability to systematically \"construct\" human-like validators by keeping personal validation knowledge per corresponding validator. This will bring a new dimension towards dependable AI systems.","PeriodicalId":383639,"journal":{"name":"10th IEEE Pacific Rim International Symposium on Dependable Computing, 2004. Proceedings.","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-03-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129259665","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Measuring notification loss in publish/subscribe communication systems","authors":"R. Baldoni, R. Beraldi, S. Piergiovanni, A. Virgillito","doi":"10.1109/PRDC.2004.1276556","DOIUrl":"https://doi.org/10.1109/PRDC.2004.1276556","url":null,"abstract":"A publish/subscribe communication system (PSS) realizes a many-to-many anonymous interaction among its participants. Producers of information (publishers) issue notifications to the PSS. These are delivered by the PSS to all subscribers that declared interest in it. However, this decoupled form of interaction introduces delays between i) the production of a notification and its delivery to subscribers (diffusion delay) and ii) the declaration of interest by a subscriber and its registration in the PSS (subscription/unsubscription delay). Such delays could lead to notification loss scenarios where an event is not delivered to an intended subscriber even though it was issued when the subscription was active. We studied this notification loss phenomenon by presenting a simulation study of a PSS and an analytical model. The latter measures the percentage of notifications guaranteed by a PSS implementation to a subscriber. This addresses a QoS issue. The model is based on a formal framework of a distributed computation. The framework abstracts the PSS through the two delays, defining safety and liveness properties that precisely characterize the semantics of the PSS.","PeriodicalId":383639,"journal":{"name":"10th IEEE Pacific Rim International Symposium on Dependable Computing, 2004. Proceedings.","volume":"2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-03-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127678762","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Protecting wavelet lifting transforms","authors":"G. Redinbo, Cung Nguyen","doi":"10.1109/PRDC.2004.1276573","DOIUrl":"https://doi.org/10.1109/PRDC.2004.1276573","url":null,"abstract":"Wavelet transforms are central to many applications in image processing and data compression. They have banks of multirate filters that are difficult to protect from computer-induced numerical errors. An efficient algorithm-based fault tolerance approach is proposed for detecting arithmetic errors in the output data. Concurrent weighted parity values are designed to detect the effects of a single numerical error within the transform structure. The parity calculations use weighted sums of data, where the input parity weighting is related to the weighting used on the output data. Each parity computation is properly viewed as an inner product between weighting values and the data, motivating the use of dual space functionals related to the error gain matrices that describe error propagations to the output. The parity weighting values are defined by a combination of dual space functionals. An iterative procedure for evaluating the design of the parity weights has been incorporated in Matlab code.","PeriodicalId":383639,"journal":{"name":"10th IEEE Pacific Rim International Symposium on Dependable Computing, 2004. Proceedings.","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-03-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126002262","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Analysis and evaluation of topological and application characteristics of unreliable mobile wireless ad-hoc network","authors":"Serdar Cabuk, N. Malhotra, Longbi Lin, S. Bagchi, N. Shroff","doi":"10.1109/PRDC.2004.1276575","DOIUrl":"https://doi.org/10.1109/PRDC.2004.1276575","url":null,"abstract":"We present a study of topological characteristics of mobile wireless ad-hoc networks. The characteristics studied are connectivity, coverage, and diameter. Knowledge of topological characteristics of a network aids in the design and performance prediction of network protocols. We introduce intelligent goal-directed mobility algorithms for achieving desired topological characteristics. A simulation-based study shows that to achieve low, medium and high network QoS defined in terms of combined requirements of the three metrics, the network needs respectively 8, 16, and 40 nodes. If nodes can fail, the requirements increase to 8, 36 and 60 nodes respectively. We present a theoretical derivation of the improvement due to the mobility models and the sufficient condition for 100% connectivity and coverage. Next, we show the effect of improved topological characteristics in enhancing QoS of an application level protocol, namely, a location determination protocol called Hop-Terrain. The study shows that the error in location estimation is reduced by up to 68% with goal-directed mobility.","PeriodicalId":383639,"journal":{"name":"10th IEEE Pacific Rim International Symposium on Dependable Computing, 2004. Proceedings.","volume":"28 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-03-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129867257","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Evaluation of memory built-in self repair techniques for high defect density technologies","authors":"L. Anghel, Nadir Achouri, M. Nicolaidis","doi":"10.1109/PRDC.2004.1276581","DOIUrl":"https://doi.org/10.1109/PRDC.2004.1276581","url":null,"abstract":"Memory built-in self repair (BISR) has been gaining importance for several years. New fault tolerance approaches are mandatory to cope with increasing defect levels affecting memories produced with current and upcoming nanometric CMOS processes. This problem will be exacerbated with nanotechnologies, where defect densities are predicted to reach levels that are several orders of magnitude higher than in current CMOS technologies. This work presents an evaluation of the area cost and yield of BISR architectures addressing memories affected by high defect densities. Statistical fault injection simulations were conducted on several memories. The obtained results show that BISR architectures can be used for future high defect technologies, providing close to 100% memory yield at reasonable hardware cost.","PeriodicalId":383639,"journal":{"name":"10th IEEE Pacific Rim International Symposium on Dependable Computing, 2004. Proceedings.","volume":"42 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-03-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125085046","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}