Title: Analysis and visualization of category membership distribution in multivariate data
Authors: Y. Pao, B. Duan, Y.L. Zhao, S. LeClair
Venue: Proceedings of the Second International Conference on Intelligent Processing and Manufacturing of Materials. IPMM'99 (Cat. No.99EX296)
DOI: 10.1109/IPMM.1999.791565
Citations: 6
Abstract
This paper reports advances in generic data processing procedures, with a focus on a specific materials discovery and design task: predicting whether a new ternary materials system will be compound-forming or not, based on knowledge of many known exemplars. The activities and results of three related efforts are described. In the first, a combination of clustering and mapping procedures attained an accuracy of more than 99% in predicting the category status of new ternary systems. The second addressed how to identify redundant or superfluous features, and produced a procedure for identifying the extent of functional dependency among features. The third addressed how to obtain reduced-dimension representations of multivariate data, albeit at the cost of losing some information. Visualizations of low-dimensional representations can help build holistic views of the data space, for use in the exploration and discovery of new materials.
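The abstract does not say which algorithms were used for the redundancy check or the dimension reduction, so the following is only a minimal sketch with stand-in techniques. It assumes linear dependency (the R^2 of a least-squares fit of each feature on the others) as a rough proxy for functional dependency, and principal component analysis as the reduced-dimension mapping. The function names `linear_dependency_scores` and `pca_project`, and the synthetic data, are hypothetical and are not taken from the paper.

```python
import numpy as np

def linear_dependency_scores(X):
    """Return, for each feature, the R^2 of a least-squares fit on the other features.

    A score near 1 means the feature is (linearly) predictable from the rest.
    Assumption: linear dependency only; nonlinear dependency is not detected.
    """
    X = np.asarray(X, dtype=float)
    n, d = X.shape
    scores = np.zeros(d)
    for j in range(d):
        y = X[:, j]
        others = np.delete(X, j, axis=1)
        A = np.column_stack([others, np.ones(n)])  # add an intercept column
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ coef
        ss_tot = np.sum((y - y.mean()) ** 2)
        scores[j] = 1.0 - np.sum(resid ** 2) / ss_tot if ss_tot > 0 else 1.0
    return scores

def pca_project(X, k=2):
    """Project centered data onto its top-k principal directions (via SVD).

    Also returns the fraction of total variance retained, which quantifies
    the information lost by the reduction.
    """
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    explained = (s[:k] ** 2).sum() / (s ** 2).sum()
    return Xc @ Vt[:k].T, explained

# Synthetic data (hypothetical): the third feature is an exact linear
# function of the first two; the fourth is independent noise.
rng = np.random.default_rng(0)
a = rng.normal(size=200)
b = rng.normal(size=200)
c = rng.normal(size=200)
X = np.column_stack([a, b, 2 * a - b, c])

scores = linear_dependency_scores(X)
Z, explained = pca_project(X, k=2)
```

With an exact dependency, every feature taking part in it scores close to 1, so the score flags a dependent group rather than naming the single redundant member. The two columns of `Z` can be plotted as a scatter, colored by category label, to get the kind of low-dimensional view the abstract describes.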