Analytical models combining methodology with classification model example
M. Gorawski, E. Płuciennik
2008 1st International Conference on Information Technology, 2008-05-18
DOI: 10.1109/INFTECH.2008.4621623 (https://doi.org/10.1109/INFTECH.2008.4621623)
Citations: 7
Abstract
Distributed computing is nowadays almost ubiquitous. So is data mining, the time- and hardware-intensive process of building analytical models of data. The authors propose a methodology for combining local analytical models (built in parallel on the nodes of a distributed computer system) into a global one, without the need to construct a distributed version of the data mining algorithm. The basic assumptions of the proposed solution are (i) complete horizontal data fragmentation and (ii) a model form that is understandable to humans. All steps of the combining methodology are presented using the example of a classification model in the form of a rule set. The authors define and consider the problems that arise when combining the rules of local classification models into one final set of global model rules, including conflicting rules, sub-rules, partial sub-rules, and unclassified objects. Algorithms for different combining strategies are also presented, together with their test results. Tests were conducted with data sets from the UCI Machine Learning Repository.
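To make the idea of combining local rule sets concrete, the following is a minimal sketch and not the authors' algorithm. It assumes a rule is a set of attribute-value conditions plus a class label, that two rules conflict when they share a premise but predict different classes, and that a sub-rule is a rule whose premise strictly extends the premise of another rule with the same label. The names `Rule`, `rule`, and `combine`, and the strategy of setting conflicting rules aside, are hypothetical choices made only for illustration; partial sub-rules and unclassified objects are not handled.

```python
from typing import FrozenSet, List, NamedTuple, Tuple

Condition = Tuple[str, str]  # (attribute, value), e.g. ("outlook", "sunny")


class Rule(NamedTuple):
    """Hypothetical rule form: IF all conditions in premise hold THEN label."""
    premise: FrozenSet[Condition]
    label: str


def rule(label: str, **conditions: str) -> Rule:
    """Convenience constructor: rule("yes", outlook="overcast")."""
    return Rule(frozenset(conditions.items()), label)


def combine(local_models: List[List[Rule]]) -> Tuple[List[Rule], List[Tuple[Rule, Rule]]]:
    """Merge local rule sets into one global set (illustrative strategy only).

    Returns the kept rules and the list of conflicting rule pairs.
    """
    # Step 1: union of all local rules, dropping exact duplicates.
    merged: List[Rule] = []
    for model in local_models:
        for r in model:
            if r not in merged:
                merged.append(r)

    # Step 2: assumed conflict definition: same premise, different label.
    conflicts = [
        (a, b)
        for i, a in enumerate(merged)
        for b in merged[i + 1:]
        if a.premise == b.premise and a.label != b.label
    ]
    conflicting = {r for pair in conflicts for r in pair}

    # Step 3: assumed sub-rule definition: a more general rule with the
    # same label already covers this rule, so the specific one is dropped.
    candidates = [r for r in merged if r not in conflicting]
    kept = [
        r
        for r in candidates
        if not any(o.label == r.label and o.premise < r.premise for o in candidates)
    ]
    return kept, conflicts


# Two local models, as if built on two nodes holding horizontal fragments.
node_a = [rule("yes", outlook="overcast"),
          rule("no", outlook="sunny", humidity="high")]
node_b = [rule("yes", outlook="overcast", windy="false"),
          rule("yes", outlook="sunny", humidity="high")]

global_rules, conflicts = combine([node_a, node_b])
print(global_rules)
print(conflicts)
```

In this sketch the conflicting pair is returned separately instead of being resolved, since the choice of resolution strategy is exactly what the abstract says the paper compares.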