Title: Towards sharper excess risk bounds for differentially private pairwise learning
Authors: Yilin Kang, Jian Li, Yong Liu, Weiping Wang
Journal: Neurocomputing (JCR Q1, Computer Science, Artificial Intelligence)
DOI: 10.1016/j.neucom.2024.128610
Publication date: 2024-09-17
Citations: 0
Abstract
Pairwise learning is an important paradigm in machine learning. It depends on pairs of training instances and is a natural fit for modeling relationships between samples. However, as a data-driven paradigm, it faces serious privacy risks. Differential privacy (DP) is a useful tool for protecting privacy in machine learning, but the excess population risk bounds in existing DP pairwise learning analyses are loose. In this paper, we propose a gradient perturbation algorithm for pairwise learning that achieves sharper risk bounds under the Polyak–Łojasiewicz (PL) condition, covering both convex and non-convex cases. Specifically, for the risk bound in expectation, the previous best results are of rates $\mathcal{O}\left(\frac{p}{n^2\epsilon^2}+\frac{1}{n}\right)$ and $\mathcal{O}\left(\frac{\sqrt{p}}{n\epsilon}+\frac{1}{\sqrt{n}}\right)$ under strongly convex and convex conditions, respectively. In this paper, we use on-average stability to achieve an $\mathcal{O}\left(\min\left\{\frac{\sqrt{p}}{n^{1.5}\epsilon}+\frac{p^{1.5}}{n^{2.5}\epsilon^3},\;\frac{p}{n^2\epsilon^2}+\frac{1}{n}\right\}\right)$ bound, significantly improving the previous bounds. For the high-probability risk bound, the previous best results are obtained via uniform stability, yielding $\mathcal{O}\left(\beta_n^U+\frac{\sqrt{p}}{\sqrt{n}\epsilon}\right)$ excess population risk bounds under strongly convex or convex conditions, where $\beta_n^U$ is the traditional pairwise uniform stability parameter; it is large because it considers the worst case of the loss sensitivity. In this paper, we propose pairwise locally elastic stability and improve the high-probability bound to $\mathcal{O}\left(\frac{\beta_{\mathbb{E}}}{\sqrt{n}}+\frac{\sqrt{p}}{\sqrt{n}\epsilon}\right)$, in which the pairwise locally elastic stability parameter satisfies $\beta_{\mathbb{E}} \ll \beta_n^U$ because it considers the average sensitivity of the pairwise loss function.
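To make the gradient-perturbation idea concrete, the following is a minimal illustrative sketch, not the paper's exact algorithm: pairwise gradient descent with a logistic ranking loss, per-pair gradient clipping to enforce a Lipschitz bound `L`, and Gaussian noise added to each released gradient. The noise calibration shown (an advanced-composition-style scale over `T` steps) and all function and parameter names are assumptions for illustration.

```python
import numpy as np

def dp_pairwise_gd(X, y, epsilon, delta, T=50, lr=0.1, L=1.0, seed=0):
    """Illustrative DP gradient-perturbation sketch for pairwise learning.

    Loss: logistic pairwise ranking loss over all pairs with y_i != y_j.
    Each per-pair gradient is clipped to norm L, and Gaussian noise
    (simplified composition-based calibration; assumed, not from the paper)
    is added to the averaged gradient at every step.
    """
    rng = np.random.default_rng(seed)
    n, p = X.shape
    w = np.zeros(p)
    # Noise scale for T adaptive Gaussian-mechanism releases; the 2L/n factor
    # reflects that changing one sample perturbs O(n) of the O(n^2) pairs.
    sigma = (2.0 * L / n) * np.sqrt(2.0 * T * np.log(1.25 * T / delta)) / epsilon
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n) if y[i] != y[j]]
    for _ in range(T):
        grad = np.zeros(p)
        for i, j in pairs:
            d = X[i] - X[j]
            s = 1.0 if y[i] > y[j] else -1.0
            g = -s * d / (1.0 + np.exp(s * w.dot(d)))  # logistic ranking gradient
            norm = np.linalg.norm(g)
            if norm > L:                                # clip to the Lipschitz bound
                g *= L / norm
            grad += g
        grad /= max(len(pairs), 1)
        grad += rng.normal(0.0, sigma, size=p)          # gradient perturbation
        w -= lr * grad
    return w
```

The privacy-utility trade-off in the abstract's bounds shows up directly here: `sigma` shrinks as `n` and `epsilon` grow, so the perturbed iterates track the non-private ones more closely on larger datasets or with weaker privacy.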
Journal description:
Neurocomputing publishes articles describing recent fundamental contributions in the field of neurocomputing. Neurocomputing theory, practice, and applications are the essential topics covered.