Prediction of Public Procurement Corruption Indices using Machine Learning Methods

International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management. Pub Date: 2019-09-17. DOI: 10.5220/0008353603330340

K. Rabuzin, Nikola Modrusan


Citations: 24


Prediction of Public Procurement Corruption Indices using Machine Learning Methods

Protecting citizens’ public financial resources through advanced models for detecting corruption in public procurement has become an almost unavoidable topic and the subject of numerous studies. Because this work almost always focuses on predicting corrupt competition and on calculating various indices and indicators of corruption, the data itself is very difficult to come by. The available data sets usually contain very few observations, and even fewer that are accurately labelled. Preventing or detecting compromised public procurement processes is a crucial step, and it relates to the initial phase of public procurement, i.e., the publication of the notice. The aim of this paper is to compare prediction models that use text-mining techniques and machine-learning methods to detect suspicious tenders, and to develop a model for detecting suspicious one-bid tenders. To this end, we analyzed the tender documentation for particular tenders, extracted the content of interest about the levels of all bids, and grouped it by procurement lot using machine-learning methods. The resulting model uses the most common text classification algorithms for prediction: naive Bayes, logistic regression and support vector machines. The results showed that knowledge contained in the tender documentation can be used to detect suspicious tenders.
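To make the classification step concrete, here is a minimal sketch of one of the three algorithms the abstract names, multinomial naive Bayes, applied to tender text. It is not the authors' model: the class name `NaiveBayesTextClassifier`, the whitespace tokenizer, the labels `"suspicious"` and `"regular"`, and the six training sentences are all invented for illustration. A real study would use labelled tender documentation and richer text features.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately simple assumption)."""
    return text.lower().split()


class NaiveBayesTextClassifier:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, documents, labels):
        # Count documents per class and word occurrences per class.
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for text, label in zip(documents, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocabulary = {w for c in self.word_counts.values() for w in c}
        return self

    def log_scores(self, text):
        """Return the unnormalized log-posterior of each class for `text`."""
        total_docs = sum(self.class_counts.values())
        vocab_size = len(self.vocabulary)
        scores = {}
        for label, doc_count in self.class_counts.items():
            counts = self.word_counts[label]
            total_words = sum(counts.values())
            score = math.log(doc_count / total_docs)  # class prior
            for word in tokenize(text):
                if word not in self.vocabulary:
                    continue  # skip words never seen during training
                score += math.log((counts[word] + 1) / (total_words + vocab_size))
            scores[label] = score
        return scores

    def predict(self, text):
        scores = self.log_scores(text)
        return max(scores, key=scores.get)


# Hypothetical toy data: invented snippets standing in for tender documentation.
TRAIN_TEXTS = [
    "supplier must hold certificate issued by one specific manufacturer",
    "delivery within two days only authorised dealer of named brand",
    "bidder must have completed identical project for this authority",
    "open to all qualified bidders equivalent products accepted",
    "standard technical specification equivalent brands are accepted",
    "any registered supplier may submit an offer within thirty days",
]
TRAIN_LABELS = ["suspicious"] * 3 + ["regular"] * 3

model = NaiveBayesTextClassifier().fit(TRAIN_TEXTS, TRAIN_LABELS)
restrictive = model.predict("only authorised dealer of one specific manufacturer")
open_call = model.predict("equivalent products from all qualified bidders accepted")
print(restrictive, open_call)
```

On this toy data, the restrictive wording is labelled `suspicious` and the open wording `regular`, simply because their words occur only in the matching training class. Logistic regression and support vector machines, the other two algorithms mentioned, could be compared on the same bag-of-words representation.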

