A Hybrid Learning Algorithm in Automated Text Categorization of Legacy Data
Authors: Dali Wang, Ying Bai, David Hamblin
International Journal of Artificial Intelligence & Applications, published 2019-09-30
DOI: https://doi.org/10.5121/ijaia.2019.10504
The goal of this research is to develop an algorithm that automatically classifies measurement types in NASA’s airborne measurement data archive. The product has to meet specific metrics in terms of accuracy, robustness, and usability, because the initial decision-tree-based development proved to be of limited applicability owing to its resource-intensive nature. We have developed a solution that is much more efficient while offering comparable performance. As in many industrial applications, the available data are noisy and correlated, and a wide range of features is associated with the type of measurement to be identified. The proposed algorithm uses a decision tree to select features and determine their weights. A weighted Naive Bayes classifier is then used because of the presence of highly correlated inputs. The development has been successfully deployed at industrial scale, and the results show that it is well balanced in terms of performance and resource requirements.
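The two-stage idea in the abstract (tree-derived feature weights feeding a weighted Naive Bayes) can be illustrated with a minimal sketch. The abstract does not give the authors' weighting formula, so everything here is an assumption: the weight of each feature is its information gain for a single-level split (the criterion an ID3-style decision tree uses to pick splits), normalized by the largest gain, and the classifier raises each conditional probability to the power of that weight. The feature tuples (unit, keyword, platform), the class labels, and all function names are hypothetical toy values, not data or code from the paper.

```python
import math
from collections import Counter, defaultdict


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(rows, labels, j):
    """Entropy reduction from splitting the data on feature j (one tree level)."""
    groups = defaultdict(list)
    for row, label in zip(rows, labels):
        groups[row[j]].append(label)
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def feature_weights(rows, labels):
    """Assumed weighting: per-feature gain divided by the largest gain."""
    gains = [information_gain(rows, labels, j) for j in range(len(rows[0]))]
    top = max(gains)
    if top <= 0:
        return [1.0] * len(gains)  # no informative feature: fall back to equal weights
    return [g / top for g in gains]


class WeightedNaiveBayes:
    """Categorical Naive Bayes scoring log P(c) + sum_j w_j * log P(x_j | c)."""

    def fit(self, rows, labels, weights, alpha=1.0):
        self.weights = weights
        self.alpha = alpha  # Laplace smoothing constant
        self.class_counts = Counter(labels)
        self.total = len(labels)
        self.values = [set(row[j] for row in rows) for j in range(len(rows[0]))]
        self.counts = defaultdict(int)  # (class, feature index, value) -> count
        for row, label in zip(rows, labels):
            for j, value in enumerate(row):
                self.counts[(label, j, value)] += 1
        return self

    def log_score(self, row, c):
        score = math.log(self.class_counts[c] / self.total)
        for j, value in enumerate(row):
            num = self.counts[(c, j, value)] + self.alpha
            den = self.class_counts[c] + self.alpha * len(self.values[j])
            score += self.weights[j] * math.log(num / den)
        return score

    def predict(self, row):
        return max(self.class_counts, key=lambda c: self.log_score(row, c))


# Hypothetical toy records: (unit, keyword, platform) -> measurement type.
rows = [
    ("ppbv", "ozone", "DC8"),
    ("ppbv", "o3", "P3B"),
    ("K", "temp", "DC8"),
    ("K", "temperature", "P3B"),
    ("ppbv", "ozone", "P3B"),
    ("K", "temp", "DC8"),
]
labels = ["O3", "O3", "Temperature", "Temperature", "O3", "Temperature"]

weights = feature_weights(rows, labels)
model = WeightedNaiveBayes().fit(rows, labels, weights)
prediction = model.predict(("K", "temp", "P3B"))
```

In this toy set the unit and keyword features separate the classes perfectly, while the platform feature is close to uninformative, so it receives a small weight and contributes little to the score. A production system would more plausibly derive the weights from a full fitted tree; the single-split gain is used here only to keep the sketch short and dependency-free.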