Triveni Lal Pa, Madhu Kumari, Tajinder Singh, Mohammad Ahsan
{"title":"Semantic Representations in Text Data","authors":"Triveni Lal Pa, Madhu Kumari, Tajinder Singh, Mohammad Ahsan","doi":"10.14257/ijgdc.2018.11.9.06","DOIUrl":null,"url":null,"abstract":"Automatic text mining processes and other sophisticated natural language processing constructs need realistic representations of text/documents which embed semantics efficiently. All the representations work on the notion that every data contains different explanatory factors (attributes). In this article, we exploit these explanatory factors to study and compare various semantic representation methods for text documents. The article critically reviews recent trends in the area of semi-supervised semantic representations, covering cutting-edge methods in distributed representations such as embeddings. This article gives a broad and synthesized description of various forms of text representations, presented in their chronological order ranging from BoW models to the most recent embeddings learning. Conclusively, various findings taken together provide valuable pointers for researchers looking to work in the field of semantic representations. In addition, the article also shows that one need to develop a model for learning universal embeddings in unsupervised/semi-supervised settings that incorporate contextual as well as word-order information, with language independent features and which would be feasible for large dataset.","PeriodicalId":46000,"journal":{"name":"International Journal of Grid and Distributed Computing","volume":" ","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2018-09-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"International Journal of Grid and Distributed Computing","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.14257/ijgdc.2018.11.9.06","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 2
Abstract
Automatic text mining processes and other sophisticated natural language processing constructs need realistic representations of text/documents which embed semantics efficiently. All the representations work on the notion that every data contains different explanatory factors (attributes). In this article, we exploit these explanatory factors to study and compare various semantic representation methods for text documents. The article critically reviews recent trends in the area of semi-supervised semantic representations, covering cutting-edge methods in distributed representations such as embeddings. This article gives a broad and synthesized description of various forms of text representations, presented in their chronological order ranging from BoW models to the most recent embeddings learning. Conclusively, various findings taken together provide valuable pointers for researchers looking to work in the field of semantic representations. In addition, the article also shows that one need to develop a model for learning universal embeddings in unsupervised/semi-supervised settings that incorporate contextual as well as word-order information, with language independent features and which would be feasible for large dataset.
期刊介绍:
IJGDC aims to facilitate and support research related to control and automation technology and its applications. Our Journal provides a chance for academic and industry professionals to discuss recent progress in the area of control and automation. To bridge the gap of users who do not have access to major databases where one should pay for every downloaded article; this online publication platform is open to all readers as part of our commitment to global scientific society. Journal Topics: -Architectures and Fabrics -Autonomic and Adaptive Systems -Cluster and Grid Integration -Creation and Management of Virtual Enterprises and Organizations -Dependable and Survivable Distributed Systems -Distributed and Large-Scale Data Access and Management -Distributed Multimedia Systems -Distributed Trust Management -eScience and eBusiness Applications -Fuzzy Algorithm -Grid Economy and Business Models -Histogram Methodology -Image or Speech Filtering -Image or Speech Recognition -Information Services -Large-Scale Group Communication -Metadata, Ontologies, and Provenance -Middleware and Toolkits -Monitoring, Management and Organization Tools -Networking and Security -Novel Distributed Applications -Performance Measurement and Modeling -Pervasive Computing -Problem Solving Environments -Programming Models, Tools and Environments -QoS and resource management -Real-time and Embedded Systems -Security and Trust in Grid and Distributed Systems -Sensor Networks -Utility Computing on Global Grids -Web Services and Service-Oriented Architecture -Wireless and Mobile Ad Hoc Networks -Workflow and Multi-agent Systems