Stig Hebbelstrup Rye Rasmussen, A. Bor, Mathias Osmundsen, M. B. Petersen
British Journal of Political Science. Journal Article, published 2023-04-24. DOI: https://doi.org/10.1017/s0007123423000042
‘Super-Unsupervised’ Classification for Labelling Text: Online Political Hostility as an Illustration
We live in a world of text. Yet the sheer magnitude of social media data, coupled with a need to measure complex psychological constructs, has made this important source of data difficult to use. Researchers often engage in costly hand coding of thousands of texts using supervised techniques or rely on unsupervised techniques where the measurement of predefined constructs is difficult. We propose a novel approach that we call ‘super-unsupervised’ learning and demonstrate its usefulness by measuring the psychologically complex construct of online political hostility based on a large corpus of tweets. This approach accomplishes the feat by combining the best features of supervised and unsupervised learning techniques: measurements of complex psychological constructs without a single labelled data source. We first outline the approach before conducting a diverse series of tests that include: (i) face validity, (ii) convergent and discriminant validity, (iii) criterion validity, (iv) external validity, and (v) ecological validity.
About the journal:
The British Journal of Political Science is a broadly based journal aiming to cover developments across a wide range of countries and specialisms. Contributions are drawn from all fields of political science (including political theory, political behaviour, public policy and international relations), and articles from scholars in related disciplines (sociology, social psychology, economics and philosophy) appear frequently. With a reputation established over nearly 40 years of publication, the British Journal of Political Science is widely recognised as one of the premier journals in its field.