CONSTRUCTION OF REGRESSION TREES ON INTERVAL-VALUED SYMBOLIC VARIABLES
Asanao Shimokawa, Y. Kawasaki, E. Miyaoka
Journal of the Japanese Society of Computational Statistics, published 2014-12-20
DOI: 10.5183/JJSCS.1405001_211 (https://doi.org/10.5183/JJSCS.1405001_211)
Citation count: 2
Abstract
Analysis based on interval-valued symbolic variables, which are given as p-dimensional hyperrectangles in R^p, is appropriate in some scenarios. However, methods for analyzing such variables are not as well studied as those for classical variables, which are given as single points in R^p. The regression tree constructed with the CART algorithm is one such method, and it is the subject of this paper. Several models can be considered for constructing a regression tree on interval-valued symbolic variables. The proposed model differs from the others in that a concept can be included in several terminal nodes of a tree. Constructing a regression tree with the proposed model requires addressing several problems, such as how to represent the predictive model in each node and how to search for an optimal splitting point within interval values. We address these problems and present an application of the model to data on HIV-1-infected patients.
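To make the idea of a concept belonging to several terminal nodes concrete, the following is a minimal sketch in Python and not the authors' algorithm. It assumes a single interval-valued predictor, assumes each interval is spread uniformly between its bounds so that a cut sends a fractional weight to each child, and assumes candidate cuts are the midpoints between consecutive distinct interval endpoints. The names `left_weight`, `weighted_sse`, and `best_split` are hypothetical and chosen only for illustration.

```python
from typing import List, Optional, Tuple

# One interval-valued observation: (lower bound, upper bound).
Interval = Tuple[float, float]


def left_weight(interval: Interval, cut: float) -> float:
    """Share of the interval at or below `cut`.

    ASSUMPTION: values are spread uniformly over the interval, so an
    interval that straddles the cut belongs partly to both children.
    """
    lo, hi = interval
    if hi <= cut:
        return 1.0  # entirely in the left child (also covers lo == hi)
    if lo >= cut:
        return 0.0  # entirely in the right child
    return (cut - lo) / (hi - lo)


def weighted_sse(ys: List[float], ws: List[float]) -> float:
    """Weighted sum of squared errors around the weighted mean."""
    total = sum(ws)
    if total == 0.0:
        return 0.0  # empty child contributes no error
    mean = sum(w * y for w, y in zip(ws, ys)) / total
    return sum(w * (y - mean) ** 2 for w, y in zip(ws, ys))


def best_split(xs: List[Interval], ys: List[float]) -> Optional[Tuple[float, float]]:
    """Return (cut, cost) minimising the summed weighted SSE of both children.

    ASSUMPTION: only midpoints between consecutive distinct endpoints are
    tried; a finer search inside the intervals may find a better cut.
    """
    endpoints = sorted({bound for interval in xs for bound in interval})
    best: Optional[Tuple[float, float]] = None
    for a, b in zip(endpoints, endpoints[1:]):
        cut = (a + b) / 2.0
        w_left = [left_weight(interval, cut) for interval in xs]
        w_right = [1.0 - w for w in w_left]
        cost = weighted_sse(ys, w_left) + weighted_sse(ys, w_right)
        if best is None or cost < best[1]:
            best = (cut, cost)
    return best


if __name__ == "__main__":
    # Illustrative toy data, not taken from the paper.
    xs = [(0.0, 1.0), (1.0, 2.0), (4.0, 5.0), (5.0, 6.0)]
    ys = [1.0, 1.0, 5.0, 5.0]
    print(best_split(xs, ys))
```

On the toy data above, the cut at 3.0 separates the two groups of intervals completely, so no concept is shared between children. A cut at 1.5 would instead give the interval (1.0, 2.0) a weight of 0.5 in each child, which is the situation where one concept contributes to more than one node.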