A Partition-insensitive Parallel Framework for Distributed Model Fitting
Authors: Xiaofei Wu, Rongmei Liang, Fabio Roli, Marcello Pelillo, Jing Yuan
Identifier: arxiv-2406.00703
Venue: arXiv - STAT - Computation
Publication date: 2024-06-02
Citations: 0
Abstract
Distributed model fitting refers to the process of fitting a mathematical or
statistical model to the data using distributed computing resources, such that
computing tasks are divided among multiple interconnected computers or nodes,
often organized in a cluster or network. Most existing methods for
distributed model fitting formulate it as a consensus optimization
problem and then build algorithms based on the alternating direction method
of multipliers (ADMM). This paper introduces a novel parallel framework for
distributed model fitting. In contrast to previous consensus
frameworks, the introduced parallel framework offers two notable advantages.
First, it is insensitive to sample partitioning: the
solution of the algorithm is unaffected by changes in the number of
slave nodes and/or the amount of data each node carries. Second, fewer
variables need to be updated at each iteration, so the proposed
parallel framework is more succinct and efficient and adapts to
high-dimensional data. In addition, we prove that the algorithms under the new
parallel framework have a worst-case linear convergence rate.
Numerical experiments confirm the generality, robustness, and accuracy of our
proposed parallel framework.
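
For orientation, the conventional consensus formulation that the abstract contrasts against can be sketched as follows. This is not the framework proposed in the paper; it is a minimal illustration of standard scaled-form consensus ADMM, assuming a one-parameter least-squares model y ≈ beta * x. The function name `consensus_admm_ls`, the penalty `rho`, the iteration count, and the toy data are all hypothetical choices made for this example.

```python
def consensus_admm_ls(partitions, rho=10.0, iters=500):
    """Fit y ~ beta * x by consensus ADMM (scaled dual form).

    partitions: list of (xs, ys) pairs, one pair per node (hypothetical layout).
    Returns the consensus variable z after `iters` iterations.
    """
    m = len(partitions)
    # Each node computes its own sufficient statistics once, locally.
    sxx = [sum(x * x for x in xs) for xs, _ in partitions]
    sxy = [sum(x * y for x, y in zip(xs, ys)) for xs, ys in partitions]

    beta = [0.0] * m  # local copies of the parameter, one per node
    u = [0.0] * m     # scaled dual variables, one per node
    z = 0.0           # consensus variable kept by the central node

    for _ in range(iters):
        # Local step: each node minimizes its own squared loss plus
        # (rho / 2) * (beta_i - z + u_i)^2, which has a closed form here.
        beta = [(sxy[i] + rho * (z - u[i])) / (sxx[i] + rho) for i in range(m)]
        # Central step: average the local copies plus duals.
        z = sum(beta[i] + u[i] for i in range(m)) / m
        # Dual step: accumulate each node's disagreement with the consensus.
        u = [u[i] + beta[i] - z for i in range(m)]
    return z


# Toy data split across three nodes (made up for illustration).
parts = [
    ([1.0, 2.0], [2.1, 3.9]),
    ([3.0, 4.0], [6.2, 7.8]),
    ([5.0, 6.0], [10.1, 12.0]),
]
print(consensus_admm_ls(parts))
```

Note that each iteration maintains a local copy `beta[i]` and a dual `u[i]` for every node in addition to the shared `z`, so the number of updated variables grows with the number of nodes. This is the per-iteration bookkeeping that the abstract says the proposed framework reduces.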