{"title":"A Cross-Domain Semantic Similarity Measure and Multi-Source Domain Adaptation in Sentiment Analysis","authors":"Dipak Patel, Kiran R. Amin","doi":"10.1109/ICAISS55157.2022.10011051","DOIUrl":null,"url":null,"abstract":"Domain adaptation becomes crucial when there is a lack of labelled data in various domains. The accuracy of traditional machine learning models degrades largely if they are trained on one domain (called the source or training domain) and classify the data of a different domain (called the target domain or test domain, which is different from the source domain). The machine needs to train on a corresponding domain to improve the classification accuracy, but labelling each new domain is a complex and time-consuming task. Hence, the domain adaptation technique is required to solve the issue of data labeling. The similarity measure plays a vital role in selecting important pivot features from the target domain that match source domains. This research article has introduced an enhanced cross-entropy measure for matching the normalized frequency distribution of different domains and found an important domain-specific feature set. In addition, the technique of using enhanced cross entropy measures is proposed in the multi-source domain adaptation model to effectively classify the target domain data. 
The result shows that there is an improvement of 3.66% to 9.09% using our approach.","PeriodicalId":243784,"journal":{"name":"2022 International Conference on Augmented Intelligence and Sustainable Systems (ICAISS)","volume":"76 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-11-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2022 International Conference on Augmented Intelligence and Sustainable Systems (ICAISS)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICAISS55157.2022.10011051","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 2
Abstract
Domain adaptation becomes crucial when labelled data is scarce in some domains. The accuracy of traditional machine learning models degrades substantially when they are trained on one domain (the source, or training, domain) and then classify data from a different domain (the target, or test, domain). Training on the target domain itself would improve classification accuracy, but labelling each new domain is a complex and time-consuming task. Domain adaptation is therefore needed to work around the labelling problem. The similarity measure plays a vital role in selecting important pivot features from the target domain that match the source domains. This article introduces an enhanced cross-entropy measure that matches the normalized frequency distributions of different domains and identifies an important domain-specific feature set. In addition, it proposes using the enhanced cross-entropy measure in a multi-source domain adaptation model to classify target-domain data effectively. The results show an improvement of 3.66% to 9.09% with the proposed approach.
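The abstract does not give the form of the enhanced measure, so the following is only a minimal sketch of the underlying idea: comparing normalized term-frequency distributions of domains with the standard cross-entropy H(p, q) = -Σ p(w) log q(w). The toy reviews, the helper names (`normalized_freq`, `cross_entropy`), and the additive smoothing constant `alpha` are assumptions made for illustration and are not taken from the paper.

```python
import math
from collections import Counter


def normalized_freq(tokens, vocab, alpha=1.0):
    """Relative frequency of each vocabulary term, with additive smoothing.

    Smoothing (an assumption of this sketch) keeps every probability
    positive so that log q(w) is always defined.
    """
    counts = Counter(tokens)
    total = sum(counts[w] for w in vocab) + alpha * len(vocab)
    return {w: (counts[w] + alpha) / total for w in vocab}


def cross_entropy(p, q):
    """Standard cross-entropy H(p, q) = -sum_w p(w) * log q(w), in nats."""
    return -sum(p[w] * math.log(q[w]) for w in p)


# Hypothetical toy corpora: two labelled source domains, one target domain.
source_a = "great battery great screen poor camera".split()
source_b = "boring plot great acting poor ending".split()
target = "great screen poor battery great camera".split()

# Shared vocabulary so that all distributions have the same support.
vocab = sorted(set(source_a) | set(source_b) | set(target))
p_target = normalized_freq(target, vocab)

# Lower cross-entropy means the source distribution matches the target better.
scores = {
    name: cross_entropy(p_target, normalized_freq(tokens, vocab))
    for name, tokens in {"source_a": source_a, "source_b": source_b}.items()
}
closest = min(scores, key=scores.get)
print(closest, scores)
```

In this toy setup `source_a` uses the same words as the target, so it receives the lower score. In a multi-source setting, scores of this kind could plausibly be used to weight or select source domains, though how the paper does so is not stated in the abstract.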