Parallel implementation of 3D FMA using MPI
E. Lu, D. Okunbor
Proceedings. Second MPI Developer's Conference, 1996-07-01
DOI: 10.1109/MPIDC.1996.534102 (https://doi.org/10.1109/MPIDC.1996.534102)

The simulation of N-body systems has been used extensively in biophysics and chemistry to investigate the dynamics of biomolecules, and in astrophysics to study the chaotic characteristics of the galactic system. However, the long-range force calculation has a time complexity of O(N^2), where N is the number of particles in the system. The fast multipole algorithm (FMA), proposed by Greengard and Rokhlin (1987), reduces the time complexity to O(N). Our goal is to build a parallel FMA library that is portable, scalable, and efficient. We use the Message Passing Interface (MPI) as the communication back-end. The library also implements an effective communication scheme to reduce communication overhead and a partitioning technique to achieve good load balance among the processors.
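
For reference, the O(N^2) cost quoted above comes from evaluating every particle pair directly. The following is a minimal serial sketch of that baseline, not the authors' library and not the FMA itself. It assumes a Coulomb-like 1/r potential with physical constants omitted, and the names `direct_potentials` and `pair_count` are hypothetical.

```python
import math


def direct_potentials(positions, charges):
    """Potential at each particle by direct pairwise summation.

    Assumption: a 1/r (Coulomb-like) kernel with constants dropped.
    Every pair (i, j) is visited once, so the work grows as O(N^2).
    """
    n = len(positions)
    potentials = [0.0] * n
    for i in range(n):
        for j in range(i + 1, n):
            r = math.dist(positions[i], positions[j])  # Euclidean distance
            potentials[i] += charges[j] / r  # contribution of j at i
            potentials[j] += charges[i] / r  # contribution of i at j
    return potentials


def pair_count(n):
    """Number of distinct pairs the direct method evaluates."""
    return n * (n - 1) // 2


if __name__ == "__main__":
    positions = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0)]
    charges = [1.0, 1.0, 2.0]
    print(direct_potentials(positions, charges))
    # Doubling N roughly quadruples the number of pair evaluations.
    print(pair_count(1000), pair_count(2000))
```

Because the pair count grows quadratically, doubling the particle count roughly quadruples the work in this baseline, which is the cost the FMA is meant to avoid.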