An Application with Python Software for the Classification of Chemical Data
Gonca Ertürk, O. Akpolat
Journal of Data Applications, published 2023-07-25. DOI: 10.26650/joda.1264915 (https://doi.org/10.26650/joda.1264915)
Analytical instruments in chemistry now produce large volumes of data that can be stored digitally. By evaluating these data with data mining algorithms, it is possible to uncover the relationships among them and to use those relationships to make predictions for newly measured data. Environmental chemistry is one of the areas in which large amounts of data are produced. Most of the pollution in wastewater consists of detergents, organic substances, and oils. The main aims of wastewater treatment are to remove (1) biodegradable organic matter, (2) suspended solids, (3) harmful heavy metals and toxic compounds, (4) nitrogen and phosphorus, depending on the ambient conditions, and (5) pathogenic organisms. Monitoring and controlling wastewater treatment processes is based on the continuous determination of wastewater and activated sludge characteristics. The basic parameters measured to characterize wastewater are biochemical oxygen demand (BOD5), chemical oxygen demand (COD), total organic carbon (TOC), and dissolved oxygen (DO). Among these parameters, a BOD5 measurement takes at least 5 days, whereas the others can be measured within 1 to 2 hours. If BOD5 values could be related mathematically to the other parameters, BOD5 could be estimated from them and the process could be controlled in a much shorter time. In this study, a data set was created by measuring these parameters in 334 samples taken from a treatment plant, and the interactions among the parameters in the data set were examined with a decision tree method. The study thus attempts to estimate the weight of each parameter on the BOD5 value of the samples. The data mining algorithm selected for this modelling was implemented in Python, and its performance in estimating BOD5 from the other parameters was examined by extracting the decision tree rules.
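
The abstract does not give the implementation, so the following is only a minimal sketch of the general idea: fit a small decision tree that estimates BOD5 from COD, TOC, and DO, read the tree back as rules, and derive a rough weight for each parameter. Everything specific here is an assumption made for illustration: the data are synthetic (not the study's 334 measurements), the relationship between BOD5 and the other parameters is invented, the tree is a dependency-free CART-style regression tree rather than whatever algorithm or library the authors used, and all function names are hypothetical.

```python
import random

FEATURES = ["COD", "TOC", "DO"]  # predictor names taken from the abstract


def make_synthetic_samples(n=334, seed=0):
    """Generate made-up (features, BOD5) pairs; NOT the study's measurements."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        cod = rng.uniform(100.0, 800.0)
        toc = rng.uniform(30.0, 300.0)
        do = rng.uniform(0.5, 8.0)
        # Assumed, invented relationship used only to give the tree a signal.
        bod5 = 0.5 * cod + 0.2 * toc + rng.gauss(0.0, 10.0)
        rows.append(([cod, toc, do], bod5))
    return rows


def sse(values):
    """Sum of squared errors of the values around their mean."""
    if not values:
        return 0.0
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values)


def best_split(rows):
    """Return (gain, feature_index, threshold) of the split reducing SSE most."""
    parent = sse([y for _, y in rows])
    best = (0.0, None, None)
    for j in range(len(FEATURES)):
        values = sorted({x[j] for x, _ in rows})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2.0  # candidate threshold between neighbours
            left = [y for x, y in rows if x[j] <= t]
            right = [y for x, y in rows if x[j] > t]
            gain = parent - sse(left) - sse(right)
            if gain > best[0]:
                best = (gain, j, t)
    return best


def build_tree(rows, depth=0, max_depth=2, min_samples=10):
    """Recursively grow a regression tree stored as nested dicts."""
    gain, j, t = 0.0, None, None
    if depth < max_depth and len(rows) >= min_samples:
        gain, j, t = best_split(rows)
    if j is None:  # leaf: predict the mean BOD5 of the samples that reach it
        return {"value": sum(y for _, y in rows) / len(rows)}
    left = [r for r in rows if r[0][j] <= t]
    right = [r for r in rows if r[0][j] > t]
    return {"feature": j, "threshold": t, "gain": gain,
            "left": build_tree(left, depth + 1, max_depth, min_samples),
            "right": build_tree(right, depth + 1, max_depth, min_samples)}


def extract_rules(node, path=()):
    """Turn every root-to-leaf path into one readable if/then rule."""
    if "value" in node:
        cond = " and ".join(path) or "always"
        return [f"if {cond} then BOD5 ~= {node['value']:.1f}"]
    name, t = FEATURES[node["feature"]], node["threshold"]
    return (extract_rules(node["left"], path + (f"{name} <= {t:.1f}",))
            + extract_rules(node["right"], path + (f"{name} > {t:.1f}",)))


def feature_weights(node, totals=None):
    """Sum the SSE reduction credited to each feature over all splits."""
    if totals is None:
        totals = {name: 0.0 for name in FEATURES}
    if "value" not in node:
        totals[FEATURES[node["feature"]]] += node["gain"]
        feature_weights(node["left"], totals)
        feature_weights(node["right"], totals)
    return totals


samples = make_synthetic_samples()
tree = build_tree(samples)
rules = extract_rules(tree)
raw = feature_weights(tree)
total_gain = sum(raw.values())
weights = {name: gain / total_gain for name, gain in raw.items()}

for rule in rules:
    print(rule)
print(weights)
```

Because the synthetic BOD5 values are driven mostly by COD, the splits in this sketch concentrate on COD; with real measurements the rules and weights would reflect whatever structure the data actually contain. In practice a library implementation of decision trees would usually be preferred over hand-written code like this.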