Diana Tsyrkaniuk, V. Sokolov, N. Mazur, V. Kozachok, V. Astapenya
{"title":"METHOD OF MARKETPLACE LEGITIMATE USER AND ATTACKER PROFILING","authors":"Diana Tsyrkaniuk, V. Sokolov, N. Mazur, V. Kozachok, V. Astapenya","doi":"10.28925/2663-4023.2021.14.5067","DOIUrl":null,"url":null,"abstract":"The number and complexity of cybercrime are constantly growing. New types of attacks and competition are emerging. The number of systems is growing faster than new cybersecurity professionals are learning, making it increasingly difficult to track users' actions in real-time manually. E-commerce is incredibly active. Not all retailers have enough resources to maintain their online stores, so they are forced to work with intermediaries. Unique trading platforms increasingly perform the role of intermediaries with their electronic catalogs (showcases), payment and logistics services, quality control - marketplaces. The article considers the problem of protecting the personal data of marketplace users. The article aims to develop a mathematical behavior model to increase the protection of the user's data to counter fraud (antifraud). Profiling can be built in two directions: profiling a legitimate user and an attacker (profitability and scoring issues are beyond the scope of this study). User profiling is based on typical behavior, amounts, and quantities of goods, the speed of filling the electronic cart, the number of refusals and returns, etc. A proprietary model for profiling user behavior based on the Python programming language and the Scikit-learn library using the method of random forest, linear regression, and decision tree was proposed, metrics were used using an error matrix, and algorithms were evaluated. As a result of comparing the evaluation of these algorithms of three methods, the linear regression method showed the best results: A is 98.60%, P is 0.01%, R is 0.54%, F is 0.33%. 2% of violators have been correctly identified, which positively affects the protection of personal data.","PeriodicalId":198390,"journal":{"name":"Cybersecurity: Education, Science, Technique","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Cybersecurity: Education, Science, Technique","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.28925/2663-4023.2021.14.5067","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 2
Abstract
The number and complexity of cybercrime are constantly growing. New types of attacks and competition are emerging. The number of systems is growing faster than new cybersecurity professionals are learning, making it increasingly difficult to track users' actions in real-time manually. E-commerce is incredibly active. Not all retailers have enough resources to maintain their online stores, so they are forced to work with intermediaries. Unique trading platforms increasingly perform the role of intermediaries with their electronic catalogs (showcases), payment and logistics services, quality control - marketplaces. The article considers the problem of protecting the personal data of marketplace users. The article aims to develop a mathematical behavior model to increase the protection of the user's data to counter fraud (antifraud). Profiling can be built in two directions: profiling a legitimate user and an attacker (profitability and scoring issues are beyond the scope of this study). User profiling is based on typical behavior, amounts, and quantities of goods, the speed of filling the electronic cart, the number of refusals and returns, etc. A proprietary model for profiling user behavior based on the Python programming language and the Scikit-learn library using the method of random forest, linear regression, and decision tree was proposed, metrics were used using an error matrix, and algorithms were evaluated. As a result of comparing the evaluation of these algorithms of three methods, the linear regression method showed the best results: A is 98.60%, P is 0.01%, R is 0.54%, F is 0.33%. 2% of violators have been correctly identified, which positively affects the protection of personal data.