Fahad Ahmed Satti, Musarrat Hussain, Sungyoung Lee, T. Chung
Published in: 2022 16th International Conference on Ubiquitous Information Management and Communication (IMCOM), 2022-01-03
DOI: https://doi.org/10.1109/imcom53663.2022.9721780
Significance of Syntactic Type Identification in Embedding Vector based Schema Matching
Data interoperability provides a bridge between information systems so that they can store, exchange, and consume heterogeneous data. Schema maps provide the necessary foundation for achieving this goal. Traditional solutions rely on expert-generated rules, ontologies, and syntactic matching to identify the similarity between attributes in different data schemas. Having previously presented the effectiveness of transformer-based models and unsupervised learning for calculating attribute similarities, in this paper we present the additional application of a naive syntactic similarity measurement. We evaluated the agreement between the computed and human-annotated results in terms of the Matthews Correlation Coefficient (MCC). These results indicate that, on weighted comparison, there is no positive effect of including naive syntactic similarity in addition to semantic similarity.
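
The evaluation idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the character-based string ratio used as the "naive syntactic similarity", the cosine similarity over embedding vectors, the linear weighting scheme, and all function names (`mcc`, `cosine`, `syntactic_similarity`, `combined_similarity`) are assumptions made for illustration. Only the MCC formula itself is standard.

```python
import math
from difflib import SequenceMatcher


def mcc(predicted, annotated):
    """Matthews Correlation Coefficient between two equal-length
    sequences of boolean match decisions (computed vs. human-annotated)."""
    pairs = list(zip(predicted, annotated))
    tp = sum(1 for p, a in pairs if p and a)
    tn = sum(1 for p, a in pairs if not p and not a)
    fp = sum(1 for p, a in pairs if p and not a)
    fn = sum(1 for p, a in pairs if not p and a)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0 when any marginal total is zero.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


def cosine(u, v):
    """Cosine similarity between two embedding vectors (plain lists)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return 0.0 if norm == 0 else dot / norm


def syntactic_similarity(name_a, name_b):
    """Hypothetical stand-in for a naive syntactic measure:
    case-insensitive character-level ratio of two attribute names."""
    return SequenceMatcher(None, name_a.lower(), name_b.lower()).ratio()


def combined_similarity(name_a, vec_a, name_b, vec_b, weight=0.5):
    """Assumed linear weighting: `weight` on the semantic (embedding)
    score and `1 - weight` on the syntactic score."""
    semantic = cosine(vec_a, vec_b)
    syntactic = syntactic_similarity(name_a, name_b)
    return weight * semantic + (1 - weight) * syntactic
```

With `weight=1.0` the combined score reduces to the semantic score alone, so comparing the MCC obtained at different weights against the same human annotations is one way to check whether the syntactic term helps.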