{"title":"Parking Search in Urban Street Networks: Taming Down the Complexity of the Search-Time Problem via a Coarse-Graining Approach","authors":"Léo Bulckaen, Nilankur Dutta, Alexandre Nicolas","doi":"10.1007/978-3-031-30445-3_39","DOIUrl":"https://doi.org/10.1007/978-3-031-30445-3_39","url":null,"abstract":"","PeriodicalId":431607,"journal":{"name":"Parallel Processing and Applied Mathematics","volume":"167 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-04-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125526126","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Kokkos-Based Implementation of MPCD on Heterogeneous Nodes","authors":"R. Halver, Christoph Junghans, G. Sutmann","doi":"10.1007/978-3-031-30445-3_1","DOIUrl":"https://doi.org/10.1007/978-3-031-30445-3_1","url":null,"abstract":"","PeriodicalId":431607,"journal":{"name":"Parallel Processing and Applied Mathematics","volume":"03 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-12-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129958367","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Acceptance Rates of Invertible Neural Networks on Electron Spectra from Near-Critical Laser-Plasmas: A Comparison","authors":"T. Miethlinger, N. Hoffmann, T. Kluge","doi":"10.1007/978-3-031-30445-3_23","DOIUrl":"https://doi.org/10.1007/978-3-031-30445-3_23","url":null,"abstract":"","PeriodicalId":431607,"journal":{"name":"Parallel Processing and Applied Mathematics","volume":"18 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-12-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128259174","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Distributed Work Stealing in a Task-Based Dataflow Runtime","authors":"Joseph John, Joshua Milthorpe, P. Strazdins","doi":"10.48550/arXiv.2211.00838","DOIUrl":"https://doi.org/10.48550/arXiv.2211.00838","url":null,"abstract":"The task-based dataflow programming model has emerged as an alternative to the process-centric programming model for extreme-scale applications. However, load balancing is still a challenge in task-based dataflow runtimes. In this paper, we present extensions to the PaRSEC runtime to demonstrate that distributed work stealing is an effective load-balancing method for task-based dataflow runtimes. In contrast to shared-memory work stealing, we find that each process should consider future tasks and the expected waiting time for execution when determining whether to steal. We demonstrate the effectiveness of the proposed work-stealing policies for a sparse Cholesky factorization, which shows a speedup of up to 35% compared to a static division of work.","PeriodicalId":431607,"journal":{"name":"Parallel Processing and Applied Mathematics","volume":"12 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-11-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130967655","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Breaking Down the Parallel Performance of GROMACS, a High-Performance Molecular Dynamics Software","authors":"Måns I. Andersson, N. A. Murugan, Artur Podobas, S. Markidis","doi":"10.1007/978-3-031-30442-2_25","DOIUrl":"https://doi.org/10.1007/978-3-031-30442-2_25","url":null,"abstract":"","PeriodicalId":431607,"journal":{"name":"Parallel Processing and Applied Mathematics","volume":"61 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-08-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115617024","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Distributed Objective Function Evaluation for Optimization of Radiation Therapy Treatment Plans","authors":"Felix Liu, Måns I. Andersson, A. Fredriksson, S. Markidis","doi":"10.1007/978-3-031-30442-2_29","DOIUrl":"https://doi.org/10.1007/978-3-031-30442-2_29","url":null,"abstract":"","PeriodicalId":431607,"journal":{"name":"Parallel Processing and Applied Mathematics","volume":"18 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-08-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115453300","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Neural Nets with a Newton Conjugate Gradient Method on Multiple GPUs","authors":"Severin Reiz, T. Neckel, H. Bungartz","doi":"10.48550/arXiv.2208.02017","DOIUrl":"https://doi.org/10.48550/arXiv.2208.02017","url":null,"abstract":"Training deep neural networks consumes increasing computational resource shares in many compute centers. Often, a brute force approach to obtain hyperparameter values is employed. Our goal is (1) to enhance this by enabling second-order optimization methods with fewer hyperparameters for large-scale neural networks and (2) to perform a survey of the performance optimizers for specific tasks to suggest users the best one for their problem. We introduce a novel second-order optimization method that requires the effect of the Hessian on a vector only and avoids the huge cost of explicitly setting up the Hessian for large-scale networks. We compare the proposed second-order method with two state-of-the-art optimizers on five representative neural network problems, including regression and very deep networks from computer vision or variational autoencoders. For the largest setup, we efficiently parallelized the optimizers with Horovod and applied it to an 8 GPU NVIDIA P100 (DGX-1) machine.","PeriodicalId":431607,"journal":{"name":"Parallel Processing and Applied Mathematics","volume":"18 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-08-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121337011","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"MD-Bench: A generic proxy-app toolbox for state-of-the-art molecular dynamics algorithms","authors":"R. Machado, Jan Eitzinger, H. Köstler, G. Wellein","doi":"10.48550/arXiv.2207.13094","DOIUrl":"https://doi.org/10.48550/arXiv.2207.13094","url":null,"abstract":"Proxy-apps, or mini-apps, are simple self-contained benchmark codes with performance-relevant kernels extracted from real applications. Initially used to facilitate software-hardware co-design, they are a crucial ingredient for serious performance engineering, especially when dealing with large-scale production codes. MD-Bench is a new proxy-app in the area of classical short-range molecular dynamics. In contrast to existing proxy-apps in MD (e.g. miniMD and coMD) it does not resemble a single application code, but implements state-of-the-art algorithms from multiple applications (currently LAMMPS and GROMACS). The MD-Bench source code is understandable, extensible and suited for teaching, benchmarking and researching MD algorithms. Primary design goals are transparency and simplicity; a developer is able to tinker with the source code down to the assembly level. This paper introduces MD-Bench, explains its design and structure, covers implemented optimization variants, and illustrates its usage on three examples.","PeriodicalId":431607,"journal":{"name":"Parallel Processing and Applied Mathematics","volume":"17 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-07-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128367727","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Exploring Techniques for the Analysis of Spontaneous Asynchronicity in MPI-Parallel Applications","authors":"Ayesha Afzal, G. Hager, G. Wellein, S. Markidis","doi":"10.1007/978-3-031-30442-2_12","DOIUrl":"https://doi.org/10.1007/978-3-031-30442-2_12","url":null,"abstract":"","PeriodicalId":431607,"journal":{"name":"Parallel Processing and Applied Mathematics","volume":"38 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-05-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128508072","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}