Cost Estimation Method for Engineering Equipment Based on Analysis of Important Features

Impact factor: 1.7 | JCR quartile: Q2 | Category: Engineering, Multidisciplinary
Nataliya Boyko, Oleksii Lukash
Journal of Engineering, published 2023-09-22. DOI: 10.1155/2023/8833753 (https://doi.org/10.1155/2023/8833753)
Citations: 0

Abstract

Methodology for Estimating the Cost of Construction Equipment Based on the Analysis of Important Characteristics Using Machine Learning Methods
This paper considers the current pace of the market, which demands a corresponding competitive advantage. The study forecasts the cost of heavy machinery as a function of geolocation and of the characteristics that are essential to its field of activity. Specific categories of heavy machinery are analyzed for the characteristics that matter most to price; these essential characteristics were identified by keywords in the text description, and a dataset was formed from the resulting data. The research objective is to collect and structure data from web resources for the sale of heavy equipment. The paper describes the preliminary data processing in detail. The main preprocessing stages are detection and handling of missing data, removal of anomalous data, encoding of categorical data, and scaling. Gaps were filled with the mean value of a specific grouped set, chosen according to the characteristics and the available data; the mode of the grouped items was also used to fill gaps. Anomalies were detected using the interquartile range and the standard deviation, and the assessment was applied separately to each group of records sharing the same parameters. The Kolmogorov–Smirnov, KS_Test, and Lilliefors tests were used to check the data for normality. Models were built and analyzed using machine learning methods: linear and polynomial regression, decision trees, random forest, support vector machine, and a neural network. Two encoding methods, Label Encoder and One Hot Encoder, were used to achieve maximum model accuracy; the parameter encoded was the geolocation of the heavy equipment. The operation of each algorithm is examined on the created dataset. The study pays additional attention to the characteristics of heavy machinery that are specific to each sector of the economy, and it analyzes existing methods and tools for forecasting price from equipment characteristics. The practical significance of the work lies in developing an algorithm that predicts the cost of heavy machinery by assessing several parameters.
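The preprocessing steps named in the abstract (group-wise gap filling, per-group interquartile-range filtering, and label versus one-hot encoding of geolocation) can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the field names (`category`, `location`, `price`), the sample listings, the 1.5 x IQR threshold, and all function names are hypothetical, and the sketch uses only the Python standard library.

```python
from collections import defaultdict
from statistics import mean, quantiles

# Hypothetical listings; None marks a missing price.
listings = (
    [{"category": "excavator", "location": "TX", "price": p}
     for p in (100.0, 102.0, 98.0, 101.0, 99.0, 103.0, 97.0, None, 500.0)]
    + [{"category": "loader", "location": "CA", "price": p}
       for p in (50.0, None, 54.0)]
)

def fill_missing_with_group_mean(rows, group_key, value_key):
    """Return copies of rows, replacing missing values with the group mean."""
    observed = defaultdict(list)
    for row in rows:
        if row[value_key] is not None:
            observed[row[group_key]].append(row[value_key])
    filled = []
    for row in rows:
        new_row = dict(row)
        # Only fill when the group has at least one observed value.
        if new_row[value_key] is None and observed[row[group_key]]:
            new_row[value_key] = mean(observed[row[group_key]])
        filled.append(new_row)
    return filled

def drop_iqr_outliers(rows, group_key, value_key, k=1.5):
    """Within each group, keep rows inside [Q1 - k*IQR, Q3 + k*IQR]."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_key]].append(row)
    kept = []
    for members in groups.values():
        values = [r[value_key] for r in members]
        if len(values) < 4:  # too few points for meaningful quartiles
            kept.extend(members)
            continue
        q1, _, q3 = quantiles(values, n=4)
        iqr = q3 - q1
        low, high = q1 - k * iqr, q3 + k * iqr
        kept.extend(r for r in members if low <= r[value_key] <= high)
    return kept

def label_encode(values):
    """Map each distinct value to an integer code (sorted order)."""
    codes = {level: i for i, level in enumerate(sorted(set(values)))}
    return [codes[v] for v in values]

def one_hot_encode(values):
    """Map each value to a 0/1 vector with one column per distinct value."""
    levels = sorted(set(values))
    return [[int(v == level) for level in levels] for v in values]

filled = fill_missing_with_group_mean(listings, "category", "price")
cleaned = drop_iqr_outliers(filled, "category", "price")
location_labels = label_encode([r["location"] for r in cleaned])
location_one_hot = one_hot_encode([r["location"] for r in cleaned])
```

One design point the sketch makes visible: filling gaps before filtering lets an extreme price pull the imputed mean upward, so the order of the two steps matters. Label encoding also imposes an arbitrary ordering on locations, which one-hot encoding avoids at the cost of one column per location.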
Source journal
Journal of Engineering (Engineering, Multidisciplinary)
CiteScore: 4.20
Self-citation rate: 0.00%
Articles published: 68
About the journal: Journal of Engineering is a peer-reviewed, Open Access journal that publishes original research articles as well as review articles in several areas of engineering. The subject areas covered by the journal are: Chemical Engineering, Civil Engineering, Computer Engineering, Electrical Engineering, Industrial Engineering, and Mechanical Engineering.