Unsupervised feature selection for linked data
Rachana T. Nemade, R. Makhijani
International Conference on Recent Advances and Innovations in Engineering (ICRAIE-2014), 2014-05-09
DOI: 10.1109/ICRAIE.2014.6909131 (https://doi.org/10.1109/ICRAIE.2014.6909131)

The widespread use of social media websites produces high-dimensional linked data. Feature subset selection is an effective way to reduce the amount and dimensionality of such data: it selects features that correlate well with the target class. However, high-dimensional linked data from social media websites typically lacks label information, so feature selection for linked data remains a challenging task. Feature relevance is therefore assessed using the link information instead. In this paper, we propose UFSLD, an algorithm for unsupervised feature selection from linked data. UFSLD works in three steps. In the first step, the interdependency among the linked data is exploited to select the relevant features. In the second step, the features from the first step are grouped into clusters using a graph-theoretic clustering method. In the third step, the most representative feature from each cluster is selected to form the final feature subset. A minimum spanning tree (MST) clustering method is used to keep the algorithm efficient. Experiments compare UFSLD with one unsupervised and one supervised feature selection algorithm to evaluate its effectiveness.
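
The three steps can be illustrated with a minimal Python sketch. The abstract does not specify the relevance measure, the feature distance, or how the MST is partitioned, so everything concrete here is an assumption made for illustration: relevance is a link-smoothness score (how little a feature changes across linked instances relative to its variance), feature distance is 1 - |Pearson correlation|, clusters are formed by cutting MST edges longer than a threshold, and the representative of each cluster is its most relevant feature. The names `link_relevance`, `select_features`, `min_relevance`, and `cut` are hypothetical and do not come from the paper.

```python
import math
from statistics import mean, pvariance

def column(X, f):
    return [row[f] for row in X]

def link_relevance(X, links, f):
    """Assumed relevance: 1 - (mean squared difference over links) / (2 * variance).
    Close to 1 when linked instances agree on the feature, about 0 or below otherwise.
    Requires at least one link."""
    col = column(X, f)
    var = pvariance(col)
    if var == 0:
        return 0.0  # a constant feature carries no information
    msd = mean((col[i] - col[j]) ** 2 for i, j in links)
    return 1.0 - msd / (2.0 * var)

def abs_corr(a, b):
    """Absolute Pearson correlation of two equal-length columns."""
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    na = math.sqrt(sum((x - ma) ** 2 for x in a))
    nb = math.sqrt(sum((y - mb) ** 2 for y in b))
    return abs(cov / (na * nb)) if na and nb else 0.0

def mst_edges(nodes, dist):
    """Prim's algorithm on the complete graph over `nodes`; returns (weight, u, v) edges."""
    if not nodes:
        return []
    best = {v: (dist(nodes[0], v), nodes[0]) for v in nodes[1:]}
    edges = []
    while best:
        v = min(best, key=lambda k: best[k][0])
        w, u = best.pop(v)
        edges.append((w, u, v))
        for k in best:
            d = dist(v, k)
            if d < best[k][0]:
                best[k] = (d, v)
    return edges

def select_features(X, links, min_relevance=0.2, cut=0.5):
    n_features = len(X[0])
    # Step 1: keep features that are relevant with respect to the link structure.
    rel = {f: link_relevance(X, links, f) for f in range(n_features)}
    kept = [f for f in range(n_features) if rel[f] > min_relevance]
    # Step 2: build an MST over the kept features and cut long edges to form clusters.
    cols = {f: column(X, f) for f in kept}
    dist = lambda a, b: 1.0 - abs_corr(cols[a], cols[b])
    parent = {f: f for f in kept}
    def find(f):
        while parent[f] != f:
            parent[f] = parent[parent[f]]
            f = parent[f]
        return f
    for w, u, v in mst_edges(kept, dist):
        if w <= cut:
            parent[find(u)] = find(v)
    clusters = {}
    for f in kept:
        clusters.setdefault(find(f), []).append(f)
    # Step 3: pick the most relevant feature of each cluster as its representative.
    return sorted(max(c, key=lambda f: rel[f]) for c in clusters.values())

# Toy data: 8 instances linked in pairs, 4 features.
X = [
    [1.0, 0.9, 0.0, 1.0],
    [1.0, 1.1, 1.0, 1.0],
    [1.0, 1.0, 1.0, 0.0],
    [1.0, 1.0, 0.0, 0.0],
    [0.0, 0.1, 0.0, 1.0],
    [0.0, -0.1, 1.0, 1.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]
links = [(0, 1), (2, 3), (4, 5), (6, 7)]
selected = select_features(X, links)
print(selected)  # [0, 3]
```

In the toy data, feature 2 disagrees across every link and is dropped in the first step. Features 0 and 1 are nearly identical, so they land in one cluster and only feature 0 is kept. Feature 3 is uncorrelated with both and forms its own cluster.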