PRIVACY BASED CLASSIFICATION MODEL OF PUBLIC DATA BY UTILIZING TWO-STEPS VALIDATION APPROACH

Asia-Pacific Journal of Information Technology and Multimedia. Pub Date: 2022-12-31. DOI: 10.17576/apjitm-2022-0101-10

M. Hussin, R. A. Raja Mahmood


Abstract

Digital information is now integral to modernizing and leveraging resources in Information Technology (IT). Vast amounts of data and information can be obtained anytime and anywhere through ICT facilities. Such data is considered public because it is shared publicly, for example on social media. Public data can be arranged according to various criteria and formats. Users have a right to understand which data can be shared publicly and which should remain private. However, people often misunderstand which data needs to be secured and which can be shared. This becomes more critical once public data has been exposed to data breaches and data theft. In this work, we propose a data privacy classification approach for public data residing on digital platforms. It aims to inform the public about the privacy level of their data before they reveal it on open and free digital platforms. We employed three privacy classes: low, medium, and high. We identified entities of public data, namely digital information platforms such as websites, mobile apps, and online systems, and then examined the data attributes of each entity. The public data attributes were sorted and passed to respondents, who indicated which privacy class they considered suitable for each attribute. Based on the respondents' input, we then used a Naive Bayesian classifier to generate probability weightings for re-assigning the data attributes to the most suitable privacy class. This two-level data classification stage brings better perspectives on data privacy. The modified privacy classification was then verified by the respondents to analyze their preferences and measure user satisfaction. According to the results, our public data privacy classification model meets public expectations. We are optimistic that well-organized data classification contributes to better data practices.
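
The re-assignment step can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the exact formulation, so the sketch assumes a single-feature categorical Naive Bayes over respondent votes with Laplace smoothing. The function names (class_weights, assign_classes), the smoothing parameter alpha, and the sample attributes are all hypothetical.

```python
from collections import Counter

# The three privacy classes named in the abstract.
CLASSES = ("low", "medium", "high")


def class_weights(votes, alpha=1.0):
    """Return P(class | attribute) for every attribute.

    votes: dict mapping an attribute name to the list of privacy classes
           chosen by respondents (hypothetical survey format).
    alpha: Laplace smoothing constant (an assumption, not from the paper).
    """
    attributes = list(votes)
    # Votes per (attribute, class) pair and per class across all attributes.
    pair_counts = {a: Counter(votes[a]) for a in attributes}
    class_totals = Counter()
    for counts in pair_counts.values():
        class_totals.update(counts)
    total = sum(class_totals.values())

    weights = {}
    for a in attributes:
        scores = {}
        for c in CLASSES:
            # Smoothed prior P(c) and likelihood P(attribute | c).
            prior = (class_totals[c] + alpha) / (total + alpha * len(CLASSES))
            likelihood = (pair_counts[a][c] + alpha) / (
                class_totals[c] + alpha * len(attributes)
            )
            scores[c] = prior * likelihood
        norm = sum(scores.values())
        # Normalize so the three weights sum to 1 for each attribute.
        weights[a] = {c: s / norm for c, s in scores.items()}
    return weights


def assign_classes(weights):
    """Pick the class with the highest posterior weight per attribute."""
    return {a: max(w, key=w.get) for a, w in weights.items()}


if __name__ == "__main__":
    # Made-up survey responses, for illustration only.
    sample_votes = {
        "email": ["low", "low", "medium"],
        "home_address": ["high", "high", "medium"],
        "national_id": ["high", "high", "high"],
    }
    print(assign_classes(class_weights(sample_votes)))
```

In this sketch the per-attribute weights play the role of the "probability weighting", and the class with the largest weight becomes the re-assigned privacy class that respondents would then be asked to verify in the second step.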
