On Different Formulations of a Continuous CTA Model
Authors: Goran Lesaja, Ionut Iacob, Anna Oganian
Published in: Privacy in statistical databases. PSD (Conference : 2004- ), volume 12276, pages 166-179
Publication date: 2020-09-01 (Epub 2020-09-16)
DOI: https://doi.org/10.1007/978-3-030-57521-2_12
Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8057307/pdf/nihms-1676971.pdf
Citations: 0
Abstract
In this paper, we consider a Controlled Tabular Adjustment (CTA) model for statistical disclosure limitation of tabular data. The goal of the CTA model is to find the safe (masked) table that is closest to the original table, which contains sensitive information. Closeness is usually measured using the ℓ1 or ℓ2 norm. However, the norm-based CTA model offers no control over how well the statistical properties of the data in the original table are preserved in the masked table. Hence, we propose a different criterion of "closeness" between the masked and original tables, one that attempts to minimally change certain statistics used in the analysis of the table. The Chi-square statistic is among the most widely used measures for the analysis of data in two-dimensional tables. We therefore propose a Chi-square CTA model, which minimizes an objective function that depends on the difference between the Chi-square statistics of the original and masked tables. The model is non-linear and non-convex and therefore harder to solve, which prompted us to also consider a modification of this model that can be transformed into a linear programming model and solved more efficiently. We present numerical results for a two-dimensional table that illustrate our novel approach and provide a comparison with norm-based CTA models.
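To make the two notions of closeness concrete, the following minimal Python sketch computes the ℓ1 and ℓ2 distances between an original and a masked table alongside the Pearson Chi-square statistic of each. It is an illustration only: the 2×2 tables, the function names, and the use of the absolute difference of the two statistics are assumptions made here, not the objective function or data from the paper, whose exact formulation the abstract does not give.

```python
def chi_square_statistic(table):
    """Pearson Chi-square statistic of a two-dimensional table of counts.

    Assumes every row total and column total is positive, so that all
    expected counts are non-zero.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / grand_total
            stat += (observed - expected) ** 2 / expected
    return stat


def l1_distance(a, b):
    """Sum of absolute cell-wise differences between two tables."""
    return sum(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def l2_distance(a, b):
    """Euclidean (square root of summed squares) distance between two tables."""
    return sum((x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb)) ** 0.5


# Hypothetical data: the masked table keeps the row and column totals of the
# original table but changes every interior cell by one unit.
original = [[10, 20], [30, 40]]
masked = [[11, 19], [29, 41]]

norm_l1 = l1_distance(original, masked)
norm_l2 = l2_distance(original, masked)
chi_original = chi_square_statistic(original)
chi_masked = chi_square_statistic(masked)
# Illustrative measure of how much the statistic moved (an assumption,
# not the paper's objective function).
chi_gap = abs(chi_original - chi_masked)

print(f"l1 distance: {norm_l1}")
print(f"l2 distance: {norm_l2}")
print(f"Chi-square (original): {chi_original:.6f}")
print(f"Chi-square (masked):   {chi_masked:.6f}")
print(f"Chi-square gap:        {chi_gap:.6f}")
```

In this toy example the masked table is close to the original in both norms, yet its Chi-square statistic is only a quarter of the original value. That is the kind of change in a table's statistical properties that a purely norm-based criterion does not control.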