Learning to branch: Generalization guarantees and limits of data-independent discretization

IF 2.3 · Zone 2, Computer Science · JCR Q2: COMPUTER SCIENCE, HARDWARE & ARCHITECTURE

Journal of the ACM Pub Date : 2023-12-25 DOI:10.1145/3637840

Maria-Florina Balcan, Travis Dick, Tuomas Sandholm, Ellen Vitercik



Abstract

Tree search algorithms, such as branch-and-bound, are the most widely used tools for solving combinatorial and non-convex problems. For example, they are the foremost method for solving (mixed) integer programs and constraint satisfaction problems. Tree search algorithms come with a variety of tunable parameters that are notoriously challenging to tune by hand. A growing body of research has demonstrated the power of using a data-driven approach to automatically optimize the parameters of tree search algorithms. These techniques use a training set of integer programs sampled from an application-specific instance distribution to find a parameter setting that has strong average performance over the training set. However, with too few samples, a parameter setting may have strong average performance on the training set but poor expected performance on future integer programs from the same application. Our main contribution is to provide the first sample complexity guarantees for tree search parameter tuning. These guarantees bound the number of samples sufficient to ensure that the average performance of tree search over the samples nearly matches its future expected performance on the unknown instance distribution. In particular, the parameters we analyze weight scoring rules used for variable selection. Proving these guarantees is challenging because tree size is a volatile function of these parameters: we prove that for any discretization (uniform or not) of the parameter space, there exists a distribution over integer programs such that every parameter setting in the discretization results in a tree with exponential expected size, yet there exist parameter settings between the discretized points that result in trees of constant size. In addition, we provide data-dependent guarantees that depend on the volatility of these tree-size functions: our guarantees improve if the tree-size functions can be well-approximated by simpler functions. 
Finally, via experiments, we illustrate that learning an optimal weighting of scoring rules reduces tree size.
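The tuning setup described in the abstract can be illustrated with a toy sketch. Everything here is an assumption made for illustration and is not the authors' implementation: the problem is a small 0/1 knapsack rather than a general integer program, the two scoring rules (value density and relative weight) are hypothetical stand-ins, and the function names are invented. A single parameter `mu` weights the two rules, tree size is the number of nodes explored, and `mu` is chosen by minimizing average tree size over sampled training instances. The grid over `mu` is used only for brevity; the abstract's negative result says a fixed grid can miss good settings in the worst case.

```python
import random
from itertools import product


def lp_bound(values, weights, capacity, fixed):
    """Fractional-knapsack upper bound for a node; None if the node is infeasible."""
    total = 0.0
    remaining = capacity
    for i, x in fixed.items():
        if x == 1:
            total += values[i]
            remaining -= weights[i]
    if remaining < 0:
        return None
    free = sorted(
        (i for i in range(len(values)) if i not in fixed),
        key=lambda i: values[i] / weights[i],
        reverse=True,
    )
    for i in free:
        if weights[i] <= remaining:
            total += values[i]
            remaining -= weights[i]
        else:
            total += values[i] * remaining / weights[i]  # fractional item
            break
    return total


def branch_and_bound(values, weights, capacity, mu):
    """Return (optimal value, nodes explored) for a 0/1 knapsack.

    The branching variable maximizes mu * score1 + (1 - mu) * score2, where
    score1 is value density and score2 is relative weight (toy rules).
    """
    n = len(values)
    max_w = max(weights)
    best, nodes = 0.0, 0
    stack = [{}]
    while stack:
        fixed = stack.pop()
        nodes += 1
        bound = lp_bound(values, weights, capacity, fixed)
        if bound is None or bound <= best:
            continue  # infeasible, or cannot beat the incumbent
        if len(fixed) == n:
            best = bound  # leaf: the bound is the exact objective value
            continue
        free = [i for i in range(n) if i not in fixed]
        var = max(
            free,
            key=lambda i: mu * values[i] / weights[i] + (1 - mu) * weights[i] / max_w,
        )
        stack.append({**fixed, var: 0})
        stack.append({**fixed, var: 1})  # popped first (depth-first)
    return int(round(best)), nodes


def brute_force(values, weights, capacity):
    """Exhaustive optimum, used only to sanity-check the tree search."""
    best = 0
    for x in product((0, 1), repeat=len(values)):
        if sum(w * xi for w, xi in zip(weights, x)) <= capacity:
            best = max(best, sum(v * xi for v, xi in zip(values, x)))
    return best


def sample_instance(rng, n=8):
    """Draw one instance from a hypothetical application-specific distribution."""
    values = [rng.randint(1, 30) for _ in range(n)]
    weights = [rng.randint(1, 20) for _ in range(n)]
    return values, weights, sum(weights) // 2


def average_tree_size(instances, mu):
    return sum(branch_and_bound(v, w, c, mu)[1] for v, w, c in instances) / len(instances)


rng = random.Random(0)
train = [sample_instance(rng) for _ in range(20)]
grid = [k / 10 for k in range(11)]
avg = {mu: average_tree_size(train, mu) for mu in grid}
best_mu = min(avg, key=avg.get)
print(f"best mu on training set: {best_mu}, average tree size: {avg[best_mu]:.1f}")
```

The optimal value is the same for every `mu`; only the tree size changes. The sample complexity question in the abstract is how large `train` must be before `avg[mu]` is close to the expected tree size on fresh instances for every `mu` simultaneously.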


Source journal

Journal of the ACM (Engineering & Technology; Computer Science: Theory & Methods)

CiteScore

7.50

Self-citation rate

0.00%

Articles published

Review time

3 months

About the journal: The best indicator of the scope of the journal is provided by the areas covered by its Editorial Board. These areas change from time to time, as the field evolves. The following areas are currently covered by a member of the Editorial Board: Algorithms and Combinatorial Optimization; Algorithms and Data Structures; Algorithms, Combinatorial Optimization, and Games; Artificial Intelligence; Complexity Theory; Computational Biology; Computational Geometry; Computer Graphics and Computer Vision; Computer-Aided Verification; Cryptography and Security; Cyber-Physical, Embedded, and Real-Time Systems; Database Systems and Theory; Distributed Computing; Economics and Computation; Information Theory; Logic and Computation; Logic, Algorithms, and Complexity; Machine Learning and Computational Learning Theory; Networking; Parallel Computing and Architecture; Programming Languages; Quantum Computing; Randomized Algorithms and Probabilistic Analysis of Algorithms; Scientific Computing and High Performance Computing; Software Engineering; Web Algorithms and Data Mining