A Generic Software Framework for Distributed Topological Analysis Pipelines
Eve Le Guillou, Michael Will, Pierre Guillou, Jonas Lukasczyk, Pierre Fortin, Christoph Garth, Julien Tierny
arXiv:2310.08339 (arXiv - CS - Mathematical Software, 2023-10-12)
Abstract
This system paper presents a software framework supporting topological analysis pipelines in a distributed-memory model. While several recent papers have introduced topology-based approaches for distributed-memory environments, they report experiments obtained with tailored, mono-algorithm implementations. In contrast, this paper describes a general-purpose, generic framework for topological analysis pipelines, i.e., sequences of topological algorithms interacting with each other, possibly on distinct numbers of processes. Specifically, we instantiate our framework with the MPI model, within the Topology ToolKit (TTK). While developing this framework, we faced several algorithmic and software engineering challenges, which we document in this paper. We provide a taxonomy of the distributed-memory topological algorithms supported by TTK, according to their communication needs, and give examples of hybrid MPI+thread parallelizations. Detailed performance analyses show that parallel efficiencies range from 20% to 80% (depending on the algorithm), and that the MPI-specific preconditioning introduced by our framework induces a negligible computation time overhead. We illustrate the new distributed-memory capabilities of TTK with an example of an advanced analysis pipeline combining multiple algorithms, run on the largest publicly available dataset we could find (120 billion vertices) on a standard cluster with 64 nodes (1,536 cores in total). Finally, we provide a roadmap for the completion of TTK's MPI extension, along with generic recommendations for each algorithm communication category.
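
To make the hybrid MPI+thread parallelization mentioned in the abstract concrete, the following is a minimal, self-contained C++ sketch of that general pattern, not code taken from TTK: each MPI process reduces its own block of a distributed scalar field with OpenMP threads, and a single MPI collective then combines the per-process results. The synthetic data and the global-minimum computation are placeholders standing in for an actual topological algorithm.

// Minimal sketch (not TTK source code) of the hybrid MPI+thread pattern:
// each MPI process owns one block of a distributed scalar field, reduces it
// locally with OpenMP threads, and a single MPI collective combines the
// per-process results. The synthetic data and the global-minimum computation
// are placeholders standing in for an actual topological algorithm.
#include <mpi.h>
#include <algorithm>
#include <cmath>
#include <cstdio>
#include <limits>
#include <vector>

int main(int argc, char **argv) {
  // Threads only compute; MPI calls are issued by the main thread.
  int provided;
  MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);

  int rank, size;
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);
  MPI_Comm_size(MPI_COMM_WORLD, &size);

  // Hypothetical local block of a distributed scalar field.
  std::vector<double> localScalars(1 << 20);
  for(std::size_t i = 0; i < localScalars.size(); ++i)
    localScalars[i] = std::sin(0.001 * (rank * localScalars.size() + i));

  // Shared-memory layer: thread-parallel reduction over the local block.
  double localMin = std::numeric_limits<double>::max();
#pragma omp parallel for reduction(min : localMin)
  for(std::size_t i = 0; i < localScalars.size(); ++i)
    localMin = std::min(localMin, localScalars[i]);

  // Distributed-memory layer: one collective over all processes.
  double globalMin;
  MPI_Allreduce(&localMin, &globalMin, 1, MPI_DOUBLE, MPI_MIN, MPI_COMM_WORLD);

  if(rank == 0)
    std::printf("global minimum: %f (%d processes)\n", globalMin, size);

  MPI_Finalize();
  return 0;
}

Such a sketch would typically be compiled with an MPI wrapper (e.g., mpicxx -fopenmp hybrid.cpp) and launched with mpirun -np 4 ./a.out after setting OMP_NUM_THREADS; the file name and launch parameters are illustrative only.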