{"title":"迈向透明高效的软件分布式共享内存","authors":"D. Scales, K. Gharachorloo","doi":"10.1145/268998.266673","DOIUrl":null,"url":null,"abstract":"Despite a large research effort, software distributed shared memory systems have not been widely used to run parallel applications across clusters of computers. The higher performance of hardware multiprocessors makes them the preferred platform for developing and executing applications. In addition, most applications are distributed only in binary format for a handful of popular hardware systems. Due to their limited functionality, software systems cannot directly execute the applications developed for hardware platforms. We have developed a system called Shasta that attempts to address the issues of efficiency and transparency that have hindered wider acceptance of software systems. Shasta is a distributed shared memory system that supports coherence at a fine granularity in software and can efficiently exploit small-scale SMP nodes by allowing processes on the same node to share data at hardware speeds. Thispaper focuses onourgoal oftappingintolarge classes of commercially available applications by transparently executing the same binaries that run on hardware platforms. We discuss the issues involved in achieving transparent execution of binaries, which include supportingthe full instruction set architecture, implementing an appropriate memory consistency model, and extending OS services across separate nodes. We also describe the techniques used in Shasta to solve the above problems. The Shasta system is fully functional on a prototype cluster of Alpha multiprocessors connected through Digital’s Memory Channel network and can transparentlyrunparallelapplicationsontheclusterthatwere compiled to run on a single shared-memory multiprocessor. As an example of Shasta’s flexibility, it can execute Oracle 7.3, a commercial database engine, across the cluster, includingworkloadsmodeled after theTPC-B and TPC-D database benchmarks. To characterize the performance of the system and the cost of providing complete transparency, we present performance results for microbenchmarks and applications runningon the cluster, include preliminary results for Oracle runs.","PeriodicalId":340271,"journal":{"name":"Proceedings of the sixteenth ACM symposium on Operating systems principles","volume":"55 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"1997-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"63","resultStr":"{\"title\":\"Towards transparent and efficient software distributed shared memory\",\"authors\":\"D. Scales, K. Gharachorloo\",\"doi\":\"10.1145/268998.266673\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Despite a large research effort, software distributed shared memory systems have not been widely used to run parallel applications across clusters of computers. The higher performance of hardware multiprocessors makes them the preferred platform for developing and executing applications. In addition, most applications are distributed only in binary format for a handful of popular hardware systems. Due to their limited functionality, software systems cannot directly execute the applications developed for hardware platforms. We have developed a system called Shasta that attempts to address the issues of efficiency and transparency that have hindered wider acceptance of software systems. 
Shasta is a distributed shared memory system that supports coherence at a fine granularity in software and can efficiently exploit small-scale SMP nodes by allowing processes on the same node to share data at hardware speeds. Thispaper focuses onourgoal oftappingintolarge classes of commercially available applications by transparently executing the same binaries that run on hardware platforms. We discuss the issues involved in achieving transparent execution of binaries, which include supportingthe full instruction set architecture, implementing an appropriate memory consistency model, and extending OS services across separate nodes. We also describe the techniques used in Shasta to solve the above problems. The Shasta system is fully functional on a prototype cluster of Alpha multiprocessors connected through Digital’s Memory Channel network and can transparentlyrunparallelapplicationsontheclusterthatwere compiled to run on a single shared-memory multiprocessor. As an example of Shasta’s flexibility, it can execute Oracle 7.3, a commercial database engine, across the cluster, includingworkloadsmodeled after theTPC-B and TPC-D database benchmarks. To characterize the performance of the system and the cost of providing complete transparency, we present performance results for microbenchmarks and applications runningon the cluster, include preliminary results for Oracle runs.\",\"PeriodicalId\":340271,\"journal\":{\"name\":\"Proceedings of the sixteenth ACM symposium on Operating systems principles\",\"volume\":\"55 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"1997-10-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"63\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the sixteenth ACM symposium on Operating systems principles\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/268998.266673\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the sixteenth ACM symposium on Operating systems principles","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/268998.266673","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Towards transparent and efficient software distributed shared memory
Despite a large research effort, software distributed shared memory systems have not been widely used to run parallel applications across clusters of computers. The higher performance of hardware multiprocessors makes them the preferred platform for developing and executing applications. In addition, most applications are distributed only in binary format for a handful of popular hardware systems. Due to their limited functionality, software systems cannot directly execute the applications developed for hardware platforms. We have developed a system called Shasta that attempts to address the issues of efficiency and transparency that have hindered wider acceptance of software systems. Shasta is a distributed shared memory system that supports coherence at a fine granularity in software and can efficiently exploit small-scale SMP nodes by allowing processes on the same node to share data at hardware speeds. This paper focuses on our goal of tapping into large classes of commercially available applications by transparently executing the same binaries that run on hardware platforms. We discuss the issues involved in achieving transparent execution of binaries, which include supporting the full instruction set architecture, implementing an appropriate memory consistency model, and extending OS services across separate nodes. We also describe the techniques used in Shasta to solve the above problems. The Shasta system is fully functional on a prototype cluster of Alpha multiprocessors connected through Digital's Memory Channel network and can transparently run parallel applications on the cluster that were compiled to run on a single shared-memory multiprocessor. As an example of Shasta's flexibility, it can execute Oracle 7.3, a commercial database engine, across the cluster, including workloads modeled after the TPC-B and TPC-D database benchmarks. To characterize the performance of the system and the cost of providing complete transparency, we present performance results for microbenchmarks and applications running on the cluster, including preliminary results for the Oracle runs.
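The fine-grain coherence that the abstract highlights is typically implemented by guarding shared loads and stores with short inline state checks over fixed-size coherence "lines". The C sketch below illustrates that general idea only; the 64-byte line size, the table layout, and the dsm_read_miss/dsm_write_miss stubs are hypothetical stand-ins for Shasta's actual protocol code, which the paper itself describes.

```c
/*
 * A minimal, hypothetical sketch of fine-grain software access checks in
 * the style of Shasta (illustrative only; not the actual Shasta code).
 * Shared memory is divided into fixed-size "lines", and a state table
 * records whether each line is INVALID, SHARED, or EXCLUSIVE locally.
 * Shasta inserts such checks into the application binary itself, which
 * is what lets unmodified executables run across the cluster.
 */
#include <stdint.h>
#include <stdio.h>

#define LINE_SHIFT 6          /* assumed 64-byte coherence lines */
#define NUM_LINES  1024       /* assumed size of the shared region */

enum line_state { INVALID, SHARED, EXCLUSIVE };

static enum line_state state_table[NUM_LINES];               /* all INVALID */
static uint64_t shared_region[NUM_LINES << (LINE_SHIFT - 3)]; /* 64 B/line */

/* Stubs for the protocol engine; a real system would exchange messages
 * with the line's home node here before changing the local state. */
static void dsm_read_miss(size_t line)  { state_table[line] = SHARED; }
static void dsm_write_miss(size_t line) { state_table[line] = EXCLUSIVE; }

/* Check conceptually inlined before every potentially shared load. */
static uint64_t checked_load(const uint64_t *addr) {
    size_t line = ((uintptr_t)addr - (uintptr_t)shared_region) >> LINE_SHIFT;
    if (state_table[line] == INVALID)   /* miss: fetch the line's data */
        dsm_read_miss(line);
    return *addr;                       /* hit: proceed at full speed */
}

/* Check conceptually inlined before every potentially shared store. */
static void checked_store(uint64_t *addr, uint64_t val) {
    size_t line = ((uintptr_t)addr - (uintptr_t)shared_region) >> LINE_SHIFT;
    if (state_table[line] != EXCLUSIVE) /* miss: obtain ownership first */
        dsm_write_miss(line);
    *addr = val;
}

int main(void) {
    checked_store(&shared_region[0], 42);  /* write miss, then local store */
    printf("%llu\n", (unsigned long long)checked_load(&shared_region[0]));
    return 0;
}
```

In the common hit case the guard is just a shift, a table lookup, and a compare-and-branch, so most accesses proceed at nearly full hardware speed; only misses fall into software protocol handlers. This is one reason a fine-grain software scheme can remain efficient while still running the same binaries a hardware shared-memory machine would.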