Metric Learning Based Similarity Measure For Attribute Description Identification Of Energy Data
Authors: Guo-Jing Liu, Hao Chen, Lin-Yu Wang, Di Zhu
Venue: 2020 International Conference on Machine Learning and Cybernetics (ICMLC)
Published: 2020-12-02
DOI: 10.1109/ICMLC51923.2020.9469547 (https://doi.org/10.1109/ICMLC51923.2020.9469547)
Combining the yearbooks of different cities in China is important for investigating and planning energy usage. However, because cities may prepare their yearbooks according to different habits and regulations, the same attribute may be described differently. Identifying the same attribute across different yearbooks is therefore an important problem. Manual processing is undesirable because it is inefficient and inaccurate. This study proposes an automatic approach based on a machine learning model. The model applies metric learning to quantify the similarity between attribute descriptions of energy-related data. Each attribute description is first converted from text to a Boolean vector by a bag-of-words method. An embedding layer is then applied to address the sparsity of the Boolean vector. Finally, a metric learning model is trained to construct a similarity metric over the descriptions. The experimental results indicate that the proposed method outperforms the same approach without metric learning.
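The pipeline described in the abstract (Boolean bag-of-words vector, embedding layer, learned metric) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not state the tokenizer, the loss, or any hyperparameters, so the whitespace tokenizer, the contrastive loss, the function names, and the example descriptions below are all assumptions chosen for illustration.

```python
import random

def tokenize(text):
    # Assumption: whitespace tokenization; the paper's tokenizer is not given.
    return text.lower().split()

def build_vocab(descriptions):
    words = sorted({tok for d in descriptions for tok in tokenize(d)})
    return {tok: i for i, tok in enumerate(words)}

def boolean_bow(text, vocab):
    # Boolean bag of words: 1 if the word occurs at all, else 0.
    vec = [0] * len(vocab)
    for tok in tokenize(text):
        if tok in vocab:
            vec[vocab[tok]] = 1
    return vec

def embed(vec, E):
    # Embedding layer: sum the rows of E for the words that are present,
    # turning a sparse Boolean vector into a short dense one.
    out = [0.0] * len(E[0])
    for i, present in enumerate(vec):
        if present:
            for k, value in enumerate(E[i]):
                out[k] += value
    return out

def learned_distance(a, b, vocab, E):
    ea = embed(boolean_bow(a, vocab), E)
    eb = embed(boolean_bow(b, vocab), E)
    return sum((x - y) ** 2 for x, y in zip(ea, eb))

def train(pairs, vocab, dim=4, margin=1.0, lr=0.05, epochs=200, seed=0):
    # Assumed objective (contrastive loss): squared distance for matching
    # pairs, max(0, margin - squared distance) for non-matching pairs.
    rng = random.Random(seed)
    E = [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in vocab]
    for _ in range(epochs):
        for a, b, same in pairs:
            va, vb = boolean_bow(a, vocab), boolean_bow(b, vocab)
            diff = [x - y for x, y in zip(embed(va, E), embed(vb, E))]
            d2 = sum(x * x for x in diff)
            if same:
                sign = 1.0    # pull matching descriptions together
            elif d2 < margin:
                sign = -1.0   # push non-matching descriptions apart
            else:
                continue      # already farther apart than the margin
            for i in range(len(vocab)):
                w = va[i] - vb[i]  # words shared by both sides cancel out
                if w:
                    for k in range(dim):
                        E[i][k] -= lr * sign * 2.0 * diff[k] * w
    return E

# Hypothetical labeled pairs: (description A, description B, same attribute?)
pairs = [
    ("total electricity consumption", "electricity usage total", True),
    ("coal output", "coal production", True),
    ("total electricity consumption", "coal production", False),
    ("electricity usage total", "coal output", False),
]
vocab = build_vocab([text for a, b, _ in pairs for text in (a, b)])
E = train(pairs, vocab)
d_same = learned_distance("coal output", "coal production", vocab, E)
d_diff = learned_distance("coal output", "electricity usage total", vocab, E)
```

After training, `d_same` should be smaller than `d_diff`, which is the property a learned metric adds over comparing the raw Boolean vectors directly. A practical version would more likely use a neural network library and a tokenizer suited to the language of the yearbooks.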