Using Clustering and Text Mining to Create a Reference Price Database

Rommel N. Carvalho, E. D. Paiva, Henrique A. Da Rocha, Gilson Libório Mendes
Learning and Nonlinear Models. DOI: 10.21528/LNLM-VOL12-NO1-ART3 (https://doi.org/10.21528/LNLM-VOL12-NO1-ART3)
Citations: 5

Abstract

Since 2004, Brazil’s Office of the Comptroller General (CGU) has published data on government expenditures in the Transparency Portal. In 2010, CGU began publishing, on a daily basis, every financial statement produced by the Federal Government. Nevertheless, inconsistencies that hinder accountability have been found in this database. This paper presents how CGU uses clustering and text mining techniques to retrieve information essential for sound accountability, including what was bought, the price paid per item, and a reference price per product. This analysis has allowed CGU to draw some preliminary conclusions, which are presented to illustrate the research results. This information will eventually be incorporated into the Transparency Portal, allowing every citizen to understand how much the Government is really paying, in general, for products. This will improve social control and provide solid accountability not only to CGU, as an internal control agency, but also to Brazil’s citizens who, in the end, are the ones paying the bill.
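
The abstract does not specify which clustering or text mining algorithms CGU uses, so the following is only a minimal sketch of the general idea: group free-text purchase descriptions that appear to refer to the same product, then take a robust statistic of the unit prices in each group as a reference price. Everything here is an assumption made for illustration, including the token-set Jaccard similarity, the greedy single-pass clustering, the 0.5 threshold, the median as the reference price, and the sample data.

```python
import re
from statistics import median

def tokens(description):
    """Lowercase a free-text description and return its set of word tokens."""
    return set(re.findall(r"[a-z0-9]+", description.lower()))

def jaccard(a, b):
    """Jaccard similarity between two token sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def cluster_purchases(purchases, threshold=0.5):
    """Greedy single-pass clustering (an illustrative choice, not the paper's).

    Each purchase joins the first cluster whose seed description is at least
    `threshold` similar; otherwise it starts a new cluster.
    """
    clusters = []  # each cluster: {"seed": token set, "prices": [unit prices]}
    for description, unit_price in purchases:
        toks = tokens(description)
        for cluster in clusters:
            if jaccard(toks, cluster["seed"]) >= threshold:
                cluster["prices"].append(unit_price)
                break
        else:
            clusters.append({"seed": toks, "prices": [unit_price]})
    return clusters

def reference_prices(clusters):
    """Return (sorted seed tokens, median unit price) for each cluster."""
    return [(sorted(c["seed"]), median(c["prices"])) for c in clusters]

# Hypothetical purchase records: (description, unit price).
purchases = [
    ("A4 paper ream 500 sheets", 12.0),
    ("Paper A4 ream 500 sheets white", 14.0),
    ("ream A4 paper 500 sheets", 13.0),
    ("Black ballpoint pen", 1.5),
    ("ballpoint pen black ink", 2.5),
]

clusters = cluster_purchases(purchases)
for seed, price in reference_prices(clusters):
    print(seed, price)
```

The median is used rather than the mean so that a single overpriced purchase does not shift the reference price much. A production pipeline would likely need language-specific normalization, such as stemming and unit extraction for Portuguese descriptions, which this sketch omits.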