Title: Latent Semantic Scaling: A Semisupervised Text Analysis Technique for New Domains and Languages
Author: Kohei Watanabe
Journal: Communication Methods and Measures, vol. 15, no. 1, pp. 81-102 (JCR Q1, Communication; impact factor 6.3)
Published: 2020-11-01
DOI: 10.1080/19312458.2020.1832976 (https://doi.org/10.1080/19312458.2020.1832976)
Citations: 48
Abstract
Many social scientists recognize that quantitative text analysis is a useful research methodology, but its application is still concentrated in documents written in European languages, especially English, and in a few sub-fields of political science, such as comparative politics and legislative studies. This seems to be due to the absence of flexible and cost-efficient methods that can be used to analyze documents in different domains and languages. To solve this problem, this paper proposes a semisupervised document scaling technique, called Latent Semantic Scaling (LSS), which can locate documents on various pre-defined dimensions. LSS achieves this by combining user-provided seed words and latent semantic analysis (word embedding). The article demonstrates its flexibility and efficiency in large-scale sentiment analysis of New York Times articles on the economy and Asahi Shimbun articles on politics. These examples show that LSS can produce results comparable to those of the Lexicoder Sentiment Dictionary (LSD) in both English and Japanese with only small sets of sentiment seed words. The article also presents a new heuristic that helps LSS users choose a near-optimal number of singular values, so that the resulting word vectors best capture differences between documents on the target dimensions.
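The abstract describes LSS only at a high level: seed words are combined with word vectors obtained from latent semantic analysis. The following is a minimal sketch of that general idea, not the author's implementation. It assumes word vectors come from a truncated SVD of a raw document-term count matrix, that a word's polarity is the mean signed cosine similarity to the seed words, and that a document's score is the frequency-weighted mean polarity of its known words. The function names `lss_word_scores` and `lss_doc_score`, the parameter `k`, and the toy corpus are all hypothetical.

```python
import numpy as np


def lss_word_scores(docs, seeds, k=2):
    """Assign a polarity score to every word in a tokenized corpus.

    docs  : list of token lists (assumed already tokenized)
    seeds : dict mapping seed word -> +1.0 or -1.0 (all seeds must occur in docs)
    k     : number of singular values to keep (an illustrative default)
    """
    vocab = sorted({w for doc in docs for w in doc})
    col = {w: j for j, w in enumerate(vocab)}

    # Document-term count matrix: one row per document, one column per word.
    X = np.zeros((len(docs), len(vocab)))
    for i, doc in enumerate(docs):
        for w in doc:
            X[i, col[w]] += 1.0

    # Truncated SVD: the first k right singular vectors give k-dimensional word vectors.
    _, _, vt = np.linalg.svd(X, full_matrices=False)
    vecs = vt[:k].T

    # Normalize rows to unit length so that dot products are cosine similarities.
    norms = np.linalg.norm(vecs, axis=1, keepdims=True)
    vecs = vecs / np.where(norms == 0.0, 1.0, norms)

    # Polarity of a word = mean over seeds of (seed sign * cosine similarity to the seed).
    polarity = np.zeros(len(vocab))
    for seed, sign in seeds.items():
        polarity += sign * (vecs @ vecs[col[seed]])
    polarity /= len(seeds)
    return dict(zip(vocab, polarity.tolist()))


def lss_doc_score(tokens, word_scores):
    """Frequency-weighted mean polarity of the tokens that have a score."""
    known = [word_scores[w] for w in tokens if w in word_scores]
    return sum(known) / len(known) if known else 0.0


# Toy corpus (made up for illustration): two "positive" and two "negative" documents.
docs = [
    ["good", "strong", "growth"],
    ["good", "strong", "gain"],
    ["bad", "weak", "loss"],
    ["bad", "weak", "slump"],
]
seeds = {"good": 1.0, "bad": -1.0}
scores = lss_word_scores(docs, seeds, k=2)

# Words that co-occur with "good" score positive; words that co-occur with "bad" score negative.
print(lss_doc_score(["strong", "growth"], scores))
print(lss_doc_score(["weak", "slump"], scores))
```

In this sketch the choice of `k` is arbitrary. The abstract notes that the article offers a heuristic for choosing a near-optimal number of singular values, which is not reproduced here.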
About the journal:
Communication Methods and Measures aims to achieve several goals in the field of communication research. Firstly, it aims to bring attention to and showcase developments in both qualitative and quantitative research methodologies to communication scholars. This journal serves as a platform for researchers across the field to discuss and disseminate methodological tools and approaches.
Additionally, Communication Methods and Measures seeks to improve research design and analysis practices by offering suggestions for improvement. It aims to introduce new methods of measurement that are valuable to communication scientists or enhance existing methods. The journal encourages submissions that focus on methods for enhancing research design and theory testing, employing both quantitative and qualitative approaches.
Furthermore, the journal is open to articles devoted to exploring the epistemological aspects relevant to communication research methodologies. It welcomes well-written manuscripts that demonstrate the use of methods and articles that highlight the advantages of lesser-known or newer methods over those traditionally used in communication.
In summary, Communication Methods and Measures strives to advance the field of communication research by showcasing and discussing innovative methodologies, improving research practices, and introducing new measurement methods.