Advancing non-convex and constrained learning: challenges and opportunities

AI matters | Pub Date: 2019-12-06 | DOI: 10.1145/3362077.3362085
Tianbao Yang
{"title":"Advancing non-convex and constrained learning: challenges and opportunities","authors":"Tianbao Yang","doi":"10.1145/3362077.3362085","DOIUrl":null,"url":null,"abstract":"As data gets more complex and applications of machine learning (ML) algorithms for decision-making broaden and diversify, traditional ML methods by minimizing an unconstrained or simply constrained convex objective are becoming increasingly unsatisfactory. To address this new challenge, recent ML research has sparked a paradigm shift in learning predictive models into non-convex learning and heavily constrained learning. Non-Convex Learning (NCL) refers to a family of learning methods that involve optimizing non-convex objectives. Heavily Constrained Learning (HCL) refers to a family of learning methods that involve constraints that are much more complicated than a simple norm constraint (e.g., data-dependent functional constraints, non-convex constraints), as in conventional learning. This paradigm shift has already created many promising outcomes: (i) non-convex deep learning has brought breakthroughs for learning representations from large-scale structured data (e.g., images, speech) (LeCun, Bengio, & Hinton, 2015; Krizhevsky, Sutskever, & Hinton, 2012; Amodei et al., 2016; Deng & Liu, 2018); (ii) non-convex regularizers (e.g., for enforcing sparsity or low-rank) could be more effective than their convex counterparts for learning high-dimensional structured models (C.-H. Zhang & Zhang, 2012; J. Fan & Li, 2001; C.-H. Zhang, 2010; T. Zhang, 2010); (iii) constrained learning is being used to learn predictive models that satisfy various constraints to respect social norms (e.g., fairness) (B. E. Woodworth, Gunasekar, Ohannessian, & Srebro, 2017; Hardt, Price, Srebro, et al., 2016; Zafar, Valera, Gomez Rodriguez, & Gummadi, 2017; A. Agarwal, Beygelzimer, Dudík, Langford, & Wallach, 2018), to improve the interpretability (Gupta et al., 2016; Canini, Cotter, Gupta, Fard, & Pfeifer, 2016; You, Ding, Canini, Pfeifer, & Gupta, 2017), to enhance the robustness (Globerson & Roweis, 2006a; Sra, Nowozin, & Wright, 2011; T. Yang, Mahdavi, Jin, Zhang, & Zhou, 2012), etc. In spite of great promises brought by these new learning paradigms, they also bring emerging challenges to the design of computationally efficient algorithms for big data and the analysis of their statistical properties.","PeriodicalId":91445,"journal":{"name":"AI matters","volume":"5 1","pages":"29-39"},"PeriodicalIF":0.0000,"publicationDate":"2019-12-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1145/3362077.3362085","citationCount":"11","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"AI matters","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3362077.3362085","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 11

Abstract

As data grow more complex and applications of machine learning (ML) algorithms to decision-making broaden and diversify, traditional ML methods that minimize an unconstrained or simply constrained convex objective are becoming increasingly unsatisfactory. To address this challenge, recent ML research has sparked a paradigm shift in learning predictive models toward non-convex learning and heavily constrained learning. Non-Convex Learning (NCL) refers to a family of learning methods that involve optimizing non-convex objectives. Heavily Constrained Learning (HCL) refers to a family of learning methods whose constraints (e.g., data-dependent functional constraints, non-convex constraints) are much more complicated than the simple norm constraints of conventional learning. This paradigm shift has already produced many promising outcomes: (i) non-convex deep learning has brought breakthroughs in learning representations from large-scale structured data (e.g., images, speech) (LeCun, Bengio, & Hinton, 2015; Krizhevsky, Sutskever, & Hinton, 2012; Amodei et al., 2016; Deng & Liu, 2018); (ii) non-convex regularizers (e.g., for enforcing sparsity or low rank) can be more effective than their convex counterparts for learning high-dimensional structured models (C.-H. Zhang & Zhang, 2012; J. Fan & Li, 2001; C.-H. Zhang, 2010; T. Zhang, 2010); (iii) constrained learning is being used to learn predictive models that satisfy various constraints in order to respect social norms (e.g., fairness) (B. E. Woodworth, Gunasekar, Ohannessian, & Srebro, 2017; Hardt, Price, Srebro, et al., 2016; Zafar, Valera, Gomez Rodriguez, & Gummadi, 2017; A. Agarwal, Beygelzimer, Dudík, Langford, & Wallach, 2018), to improve interpretability (Gupta et al., 2016; Canini, Cotter, Gupta, Fard, & Pfeifer, 2016; You, Ding, Canini, Pfeifer, & Gupta, 2017), and to enhance robustness (Globerson & Roweis, 2006a; Sra, Nowozin, & Wright, 2011; T. Yang, Mahdavi, Jin, Zhang, & Zhou, 2012). Despite the great promise of these new learning paradigms, they also bring emerging challenges in designing computationally efficient algorithms for big data and in analyzing their statistical properties.
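To ground these definitions, here is a minimal NCL sketch (not from the article): sparse logistic regression with the non-convex MCP penalty of C.-H. Zhang (2010), solved by proximal gradient descent with a firm-thresholding proximal step. The penalty parameters, step size, and synthetic data below are illustrative assumptions. An HCL problem would instead take the generic form min_w f(w) subject to g_i(w; data) <= 0, with possibly non-convex f and g_i.

```python
import numpy as np

def logistic_loss_grad(w, X, y):
    """Gradient of the average logistic loss log(1 + exp(-y * Xw)); labels y in {-1, +1}."""
    z = y * (X @ w)
    s = -y * 0.5 * (1.0 - np.tanh(0.5 * z))  # numerically stable -y / (1 + exp(z))
    return X.T @ s / len(y)

def mcp_prox(v, lam, gamma, step):
    """Coordinate-wise proximal (firm-thresholding) operator of the MCP penalty.

    Valid for gamma > step. Inside |v| <= gamma*lam it is a rescaled soft-threshold;
    beyond that the penalty is flat, so large coefficients pass through unshrunk.
    """
    shrunk = np.sign(v) * np.maximum(np.abs(v) - lam * step, 0.0) / (1.0 - step / gamma)
    return np.where(np.abs(v) <= gamma * lam, shrunk, v)

def prox_gd(X, y, lam=0.1, gamma=3.0, step=0.5, iters=300):
    """Proximal gradient descent on logistic loss + MCP penalty (a non-convex objective)."""
    w = np.zeros(X.shape[1])
    for _ in range(iters):
        w = mcp_prox(w - step * logistic_loss_grad(w, X, y), lam, gamma, step)
    return w

# Illustrative synthetic data: 5 of 50 features are truly active.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 50))
w_true = np.zeros(50)
w_true[:5] = 2.0
y = np.sign(X @ w_true + 0.1 * rng.standard_normal(200))
w_hat = prox_gd(X, y)
print("recovered nonzeros:", int(np.sum(np.abs(w_hat) > 1e-6)))
```

The firm-thresholding step is where the non-convexity pays off: unlike the L1 norm, MCP stops shrinking coefficients once they exceed gamma*lam, which is the reduced-bias property that motivates item (ii) of the abstract.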