Scaling and Self-repair of Linux Based Services Using a Novel Distributed Computing Model Exploiting Parallelism

2011 IEEE 20th International Workshops on Enabling Technologies: Infrastructure for Collaborative Enterprises. Pub Date: 2011-06-27. DOI: 10.1109/WETICE.2011.18

G. Morana, Rao V. Mikkilineni


Citations: 28

Abstract

This paper describes a prototype that implements a high degree of fault tolerance, reliability and resilience in distributed software systems. The prototype incorporates fault, configuration, accounting, performance and security (FCAPS) management using a signaling network overlay, and allows dynamic control of a set of networked nodes called Distributed Intelligent Managed Elements (DIMEs). Each DIME is a computing entity (implemented on Linux, with a Windows port planned) endowed with self-management and signaling capabilities that let it collaborate with other DIMEs in the network. The prototype incorporates a new computing model proposed by Mikkilineni in 2010, which places a signaling network overlay over the computing network and allows parallelism in resource monitoring, analysis and reconfiguration. A workflow is implemented as a set of tasks organized in a directed acyclic graph (DAG) and executed by a managed network of DIMEs. Distributed DIME networks provide a network computing model for creating distributed computing clouds and executing distributed managed workflows with a high degree of agility, availability, reliability, performance and security.
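The idea of a workflow as a DAG of tasks, run in parallel and repaired on failure, can be illustrated with a minimal Python sketch. This is not the authors' DIME implementation: the function name `run_workflow`, its parameters, and the retry policy are all assumptions made for illustration. A thread pool stands in for the network of worker nodes, and resubmitting a failed task stands in for a supervisor reassigning work to a healthy node. The signaling overlay itself is not modeled.

```python
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def run_workflow(dag, tasks, max_workers=4, max_retries=1):
    """Run tasks in dependency order; independent tasks run in parallel.

    dag:   {task_name: set of task names it depends on}
    tasks: {task_name: callable taking a dict of its dependencies' results}
    All names here are hypothetical, not taken from the paper.
    """
    sorter = TopologicalSorter(dag)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    results, attempts, running = {}, {}, {}

    def submit(pool, name):
        # Hand the task the results of the tasks it depends on.
        deps = {d: results[d] for d in dag.get(name, ())}
        running[pool.submit(tasks[name], deps)] = name

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while sorter.is_active():
            # Every task whose dependencies are complete can start now.
            for name in sorter.get_ready():
                submit(pool, name)
            done, _ = wait(running, return_when=FIRST_COMPLETED)
            for future in done:
                name = running.pop(future)
                error = future.exception()
                if error is None:
                    results[name] = future.result()
                    sorter.done(name)  # unblocks dependent tasks
                elif attempts.get(name, 0) < max_retries:
                    # Crude "self-repair": run the failed task again.
                    attempts[name] = attempts.get(name, 0) + 1
                    submit(pool, name)
                else:
                    raise error
    return results


if __name__ == "__main__":
    dag = {"fetch": set(), "parse": {"fetch"}, "index": {"fetch"},
           "report": {"parse", "index"}}
    tasks = {
        "fetch": lambda deps: [3, 1, 2],
        "parse": lambda deps: sorted(deps["fetch"]),
        "index": lambda deps: len(deps["fetch"]),
        "report": lambda deps: (deps["parse"], deps["index"]),
    }
    print(run_workflow(dag, tasks)["report"])
```

In this sketch `parse` and `index` depend only on `fetch`, so they may run concurrently, while `report` waits for both. In a design like the one the abstract describes, failure detection and reassignment would travel over a separate signaling channel rather than being handled inside the scheduling loop as they are here.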

