The Process of Unit Price Extraction from Public Sector Contracts
T. Bruckner, Filip Vencovský
Acta Informatica Pragensia, published 2020-12-31. DOI: 10.18267/J.AIP.139 (https://doi.org/10.18267/J.AIP.139)
Czech government institutions commissioned research on extracting usual unit prices from public IT contracts to aid the sizing of future public tenders. The goal of the project is to obtain millions of contracts from the public register, convert them to full text, extract unit prices from the text, and publish a price list of man-day prices in the IT industry. This paper designs the process and method of price extraction, demonstrates and evaluates the results over five iterations of extraction, and discusses the experience gained over two years of the project. The process is designed as a set of repeatable workflows with specified activity and role descriptions. The method combines automated and manual actions. Because the documents involved vary widely in format and content and the tolerance for mistakes is low, the scope for automated extraction of unit prices from full-text contracts is limited, and human validation is crucial.
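To make the combination of automated and manual actions concrete, the following is a minimal sketch of what a single automated extraction step followed by a human validation flag could look like. It is not the authors' implementation: the regular expression, the `Candidate` class, and the function names are all hypothetical, and the pattern covers only a few illustrative phrasings of a man-day price.

```python
import re
from dataclasses import dataclass

# Hypothetical pattern (an assumption, not taken from the paper): an amount,
# a currency marker, a separator, and a man-day unit. Real contracts vary far
# more than this, which is why every match is only a candidate.
PRICE_PER_MANDAY = re.compile(
    r"(?<![\d.,])(?P<amount>\d+(?:[ \u00a0.]\d{3})*(?:,\d+)?)\s*"
    r"(?P<currency>Kč|CZK)\s*"
    r"(?:/|za|per)\s*"
    r"(?P<unit>MD|man-?day|člověkoden)",
    re.IGNORECASE,
)


@dataclass
class Candidate:
    amount: float
    currency: str
    snippet: str             # surrounding text shown to the human reviewer
    validated: bool = False  # set by a reviewer, never by the extractor


def parse_amount(raw: str) -> float:
    """Parse an amount written with space or dot thousands separators
    and a decimal comma, e.g. '1.500,50' or '12 000'."""
    cleaned = raw.replace("\u00a0", "").replace(" ", "").replace(".", "")
    return float(cleaned.replace(",", "."))


def extract_candidates(full_text: str, context: int = 40) -> list[Candidate]:
    """Return unvalidated unit price candidates found in the contract text."""
    candidates = []
    for match in PRICE_PER_MANDAY.finditer(full_text):
        start = max(0, match.start() - context)
        end = min(len(full_text), match.end() + context)
        candidates.append(
            Candidate(
                amount=parse_amount(match.group("amount")),
                currency=match.group("currency"),
                snippet=full_text[start:end],
            )
        )
    return candidates


if __name__ == "__main__":
    sample = "Cena činí 12 000 Kč / MD bez DPH."
    for candidate in extract_candidates(sample):
        print(candidate)
```

The extractor deliberately leaves `validated` as `False`. In a workflow with low tolerance for mistakes, a price would enter the published price list only after a reviewer has checked the snippet against the source document.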