Positional Approach to the Voting Function Formation of Random Forest Trees as an Example of Solving the Differentiating Tuberculosis Forms Problem
O. Matviichuk, Oksana Biloshytska, O. Horodetska, V. Pavlov, M. Linnik, I. Nastenko
2022 IEEE 17th International Conference on Computer Sciences and Information Technologies (CSIT), published 2022-11-10
DOI: 10.1109/CSIT56902.2022.10000450 (https://doi.org/10.1109/CSIT56902.2022.10000450)
Citation count: 1
Abstract
The paper proposes a new voting technology for random forest trees: the Positional Approach to the Voting Function Formation (PAVFF). In contrast to existing ways of organizing voting among random forest trees, the approach changes the subjects of voting: the voters are individual finite elements of a tree, weighted according to the competence of each new voting unit. Each forest tree is represented in the voting function by an individual branch (voting unit), whose competence level is assigned when the tree's finite elements are verified. We also propose several mechanisms for organizing the resulting units in the voting process. The effectiveness of the new mechanism is demonstrated on the problem of differentiating drug-sensitive and drug-resistant forms of tuberculosis. The feature space for the task is formed from textural characteristics of regions of interest (ROI) in the patient's lung CT scan; the initial feature space was composed of the elements of several textural characteristic matrices. From over half a million input features, a few optimal ensembles were selected to form the random forest trees. For this purpose we used intra- and inter-class variance selection techniques, with the final selection made by a genetic algorithm using a combined correlation criterion. After verification of the voting units (finite elements of trees), three variants of voting by competence were formed: voting by the most competent unit, weighted-average participation of all units, and group voting with coefficient re-evaluation by the Group Method of Data Handling. The results were compared with the analogous voting by random forest trees and show a 5% improvement in classification quality.
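The first two voting variants named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that each tree contributes one voting unit per sample (the branch the sample falls into), that a unit carries a predicted class and a competence score in [0, 1] obtained at verification, and that weighted voting sums competences per class. The function names and the sample data are hypothetical, and the third variant (coefficient re-evaluation by the Group Method of Data Handling) is not shown.

```python
from collections import defaultdict

# Each voting unit is a (predicted_class, competence) pair, one per tree.
# Competence values here are made up for illustration only.

def vote_most_competent(units):
    """Return the class predicted by the single most competent unit."""
    label, _ = max(units, key=lambda unit: unit[1])
    return label

def vote_weighted(units):
    """Return the class with the largest sum of unit competences."""
    totals = defaultdict(float)
    for label, competence in units:
        totals[label] += competence
    return max(totals, key=totals.get)

def vote_majority(units):
    """Baseline: plain majority vote that ignores competence."""
    counts = defaultdict(int)
    for label, _ in units:
        counts[label] += 1
    return max(counts, key=counts.get)

# Hypothetical units for one patient sample from a three-tree forest.
units = [("resistant", 0.95), ("sensitive", 0.40), ("sensitive", 0.45)]

print(vote_most_competent(units))
print(vote_weighted(units))
print(vote_majority(units))
```

With these made-up values the two competence-based rules follow the single high-competence unit, while the plain majority vote follows the two low-competence units, which is the kind of disagreement the competence weighting is meant to resolve.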