Clusterwise linear regression using a probabilistic branch and bound algorithm under Gaussianity
A. Fois, L. Insolia, L. Consolini, F. Laurini, M. Locatelli, M. Riani
Computers & Operations Research, Volume 189, Article 107375, published 2026-05-01 (Epub 2026-01-19)
DOI: 10.1016/j.cor.2025.107375
https://www.sciencedirect.com/science/article/pii/S0305054825004046
Citations: 0
Abstract
Clusterwise Linear Regression (CLR) combines classical linear regression with cluster analysis to model heterogeneous data. It overcomes the limitations of a single global model by simultaneously partitioning the data points into distinct clusters and fitting each cluster separately. However, since the underlying point-to-cluster assignments are unknown, the estimation process typically leads to a computationally challenging combinatorial problem. In this work, we introduce a new reformulation of the CLR problem under Gaussian assumptions, and propose a probabilistic branch-and-bound algorithm called pclustreg. This algorithm gives, with high probability, solutions that are at least as good as the (unknown) ground truth in terms of log-likelihood, bridging the gap between existing likelihood-based heuristic and global methods. Moreover, by limiting the number of expanded nodes, it can also be used as an effective heuristic algorithm. We highlight the performance of pclustreg on both synthetic and real-world datasets, comparing it against the state-of-the-art likelihood-based heuristic method, and show that it achieves comparable or better results both in terms of solution accuracy and computing times.
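To make the combinatorial structure of CLR concrete, the following is a minimal Python sketch of a basic alternating heuristic: refit one Gaussian linear regression per cluster, then reassign each point to the cluster under which its log-density is highest. This is not pclustreg and not the reformulation proposed in the paper; it is only an illustration of the kind of likelihood-based local search the abstract alludes to. The function names (`fit_cluster`, `point_loglik`, `clr_alternating`), the use of cluster-specific variances, and the absence of mixing weights are all assumptions made for this example.

```python
import numpy as np


def fit_cluster(X, y):
    """Least-squares coefficients and ML variance estimate for one cluster."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    # Floor the variance so a perfect fit cannot produce log(0).
    sigma2 = max(float(resid @ resid) / len(y), 1e-12)
    return beta, sigma2


def point_loglik(X, y, beta, sigma2):
    """Gaussian log-density of every response under one cluster's model."""
    resid = y - X @ beta
    return -0.5 * (np.log(2 * np.pi * sigma2) + resid**2 / sigma2)


def clr_alternating(X, y, k, n_iter=50, seed=0):
    """Alternate between per-cluster refits and point reassignment.

    Hypothetical helper: returns (labels, params, total log-likelihood).
    It can stop at a local optimum; it gives no global guarantee.
    """
    rng = np.random.default_rng(seed)
    n, p = X.shape
    labels = rng.integers(k, size=n)  # random initial partition
    for _ in range(n_iter):
        params = []
        for j in range(k):
            mask = labels == j
            if mask.sum() <= p:
                # Too few points to fit: re-seed with a random half of the
                # data (assumes n is much larger than p).
                mask = rng.random(n) < 0.5
            params.append(fit_cluster(X[mask], y[mask]))
        # n-by-k matrix of log-densities, one column per cluster.
        ll = np.column_stack([point_loglik(X, y, b, s2) for b, s2 in params])
        new_labels = ll.argmax(axis=1)
        if np.array_equal(new_labels, labels):
            break  # assignments are stable
        labels = new_labels
    total = float(ll[np.arange(n), labels].sum())
    return labels, params, total


# Usage on synthetic data drawn from two regression lines.
rng = np.random.default_rng(1)
x = rng.uniform(-1, 1, 200)
X = np.column_stack([np.ones(200), x])  # intercept plus one covariate
z = rng.integers(2, size=200)           # hidden ground-truth assignment
y = np.where(z == 0, 3 + 2 * x, -3 - 2 * x) + 0.1 * rng.standard_normal(200)
labels, params, total = clr_alternating(X, y, k=2)
```

The result of such a local search depends on the random starting partition, which is the limitation that motivates methods with guarantees on the returned log-likelihood.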
About the journal:
Operations research and computers meet in a large number of scientific fields, many of which are of vital current concern to our troubled society. These include, among others, ecology, transportation, safety, reliability, urban planning, economics, inventory control, investment strategy and logistics (including reverse logistics). Computers & Operations Research provides an international forum for the application of computers and operations research techniques to problems in these and related fields.