Generation of Prototypes for Masking Sequences of Events
A. Valls, Cristina Gómez-Alonso, V. Torra
2009 International Conference on Availability, Reliability and Security
Published: 2009-03-16
DOI: 10.1109/ARES.2009.55 (https://doi.org/10.1109/ARES.2009.55)
Citations: 5
Abstract
Sequences of categorical data are commonly used to represent sequences of events. To transfer such data to third parties for analysis, masking methods can be applied to satisfy privacy laws and avoid the disclosure of sensitive information. Masking methods distort the data so that privacy is preserved at the expense of some information loss. Different methods exist, each trying to find a good trade-off between the risk of disclosure and the information loss. Microaggregation is one of the existing masking methods: small clusters are built automatically, and the values of the members of each cluster are replaced by the values of that cluster's prototype. Because microaggregation is an NP-hard problem, heuristic approaches have been developed. Existing methods are mainly devoted to numerical and categorical data. Extending them to sequences of categorical data requires special algorithms for clustering and prototyping. Artificial Intelligence offers techniques and tools that are appropriate for symbolic data. Since the sequences in our context are defined in terms of categorical (symbolic) values, such AI techniques are especially relevant. In this paper, we use them to propose a new method for generating the prototype of a small group of sequences of categorical values. These results can later be used in, e.g., microaggregation.
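To make the microaggregation step concrete, the following minimal Python sketch groups event sequences into clusters of at least k members and replaces each member with a cluster prototype. It is an illustration under stated assumptions, not the method proposed in the paper: the prototype here is simply the medoid under edit distance, the greedy grouping is a simplification rather than a specific published heuristic, and all names (`edit_distance`, `medoid`, `microaggregate`) are hypothetical.

```python
from typing import List, Optional, Sequence, Tuple

EventSeq = Tuple[str, ...]


def edit_distance(a: Sequence[str], b: Sequence[str]) -> int:
    """Levenshtein distance between two sequences of categorical events."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,             # delete x
                curr[j - 1] + 1,         # insert y
                prev[j - 1] + (x != y),  # substitute x by y (free if equal)
            ))
        prev = curr
    return prev[-1]


def medoid(cluster: List[EventSeq]) -> EventSeq:
    """Assumed stand-in prototype: the member closest to all other members."""
    return min(cluster, key=lambda s: sum(edit_distance(s, t) for t in cluster))


def microaggregate(data: List[EventSeq], k: int) -> List[EventSeq]:
    """Greedily form clusters of at least k sequences and mask each
    sequence with the prototype of its cluster."""
    if k < 1 or len(data) < k:
        raise ValueError("need k >= 1 and at least k sequences")
    remaining = list(range(len(data)))
    masked: List[Optional[EventSeq]] = [None] * len(data)
    while remaining:
        if len(remaining) < 2 * k:
            # Too few records left for two clusters: keep them together.
            group = list(remaining)
        else:
            # Take the first unassigned record and its k - 1 nearest neighbours.
            seed = remaining[0]
            group = sorted(
                remaining, key=lambda i: edit_distance(data[seed], data[i])
            )[:k]
        prototype = medoid([data[i] for i in group])
        for i in group:
            masked[i] = prototype
        remaining = [i for i in remaining if i not in group]
    return masked  # type: ignore[return-value]


if __name__ == "__main__":
    sequences = [
        ("login", "view", "buy"),
        ("login", "view", "logout"),
        ("search", "view"),
        ("login", "view", "buy"),
        ("search", "view", "view"),
        ("search", "buy"),
    ]
    for original, released in zip(sequences, microaggregate(sequences, k=3)):
        print(original, "->", released)
```

After masking, every released sequence is shared by at least k records, which is the source of the privacy protection; the information loss is the distance between each original sequence and its prototype. A medoid always returns one of the original sequences, whereas a constructed prototype need not, which is one reason dedicated prototype-generation methods are of interest.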