Text Classification in Organizational Research – A Hybrid Approach Combining Dictionary Content Analysis and Supervised Machine Learning Techniques

Heiko Hossfeld, Martin Wolfslast
Management Revue, published 2022-01-01 (journal article). DOI: 10.5771/0935-9915-2022-1-59. https://doi.org/10.5771/0935-9915-2022-1-59

Abstract

Big Data is an emerging field in organizational research: it provides new types of data, and technologies such as digitization and web scraping make it possible to study huge amounts of data. Since large parts of digital data consist of unstructured text, text classification - assigning texts (or parts of texts) to predefined categories - is a central task. Text classification makes it possible not only to identify relevant texts in a jumble of data but also to extract information from texts, such as sentiments, topics, and intentions. However, large amounts of textual data require automated text mining methods, which are mostly uncharted territory in organizational research. We therefore outline and discuss the two existing approaches to text classification, one originating from social science (dictionary content analysis), the other from computer science (supervised machine learning). Since both approaches have advantages and disadvantages, we combine ideas from both to develop a hybrid approach that reduces existing issues and requires significantly less knowledge of programming and computer science than supervised machine learning. To illustrate our approach, we develop a classifier that identifies critical media coverage of organizational actions.
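The abstract does not spell out how the hybrid approach is built, so the following is only a minimal sketch of one way dictionary content analysis and a supervised step could be combined, not the authors' procedure. It assumes a small hand-made dictionary of criticism-related terms (`CRITICAL_TERMS`, hypothetical), scores each text by the share of dictionary hits, and then fits a score cutoff on a handful of labeled example texts (also invented for illustration).

```python
import re

# Hypothetical dictionary of criticism-related terms (illustrative assumption,
# not taken from the article).
CRITICAL_TERMS = {"scandal", "criticized", "lawsuit", "protest", "accused"}


def dictionary_score(text):
    """Dictionary step: share of tokens in the text that are dictionary terms."""
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    hits = sum(1 for token in tokens if token in CRITICAL_TERMS)
    return hits / len(tokens)


def fit_threshold(texts, labels):
    """Supervised step: choose the score cutoff with the best training accuracy."""
    scores = [dictionary_score(t) for t in texts]
    best_cut, best_acc = 0.0, -1.0
    for cut in sorted(set(scores)):
        preds = [s >= cut for s in scores]
        acc = sum(p == y for p, y in zip(preds, labels)) / len(labels)
        if acc > best_acc:  # keep the lowest cutoff among equally good ones
            best_cut, best_acc = cut, acc
    return best_cut


def classify(text, cutoff):
    """Label a text as critical coverage if its dictionary score reaches the cutoff."""
    return dictionary_score(text) >= cutoff


# Invented labeled examples (True = critical coverage of an organization).
train_texts = [
    "Unions accused the company of unfair dismissals and announced a protest.",
    "The firm was criticized after the lawsuit became public.",
    "The company opened a new plant and hired two hundred workers.",
    "Quarterly revenue rose slightly, the firm reported on Monday.",
]
train_labels = [True, True, False, False]

cutoff = fit_threshold(train_texts, train_labels)
print(classify("Activists staged a protest after the scandal.", cutoff))
print(classify("The company published its annual report.", cutoff))
```

In this sketch the dictionary supplies an interpretable feature, while the labeled examples decide where to draw the line between categories. A real application would need a validated dictionary, far more labeled texts, and evaluation on held-out data.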
Source journal: Management Revue
CiteScore: 1.20
Self-citation rate: 0.00%
Articles published: 7
About the journal: Management Revue - Socio-Economic Studies is an interdisciplinary European journal that undergoes peer review. It publishes qualitative and quantitative work, along with purely theoretical papers, contributing to the study of management, organization, and industrial relations. The journal welcomes contributions from various disciplines, including business and public administration, organizational behavior, economics, sociology, and psychology. Regular features include reviews of books relevant to management and organization studies. Special issues provide a unique perspective on specific research fields. Organized by selected guest editors, each special issue includes at least two overview articles from leaders in the field, along with at least three new empirical papers and up to ten book reviews related to the topic. The journal aims to offer in-depth insights into selected research topics, presenting potentially controversial perspectives, new theoretical insights, valuable empirical analysis, and brief reviews of key publications. Its objective is to establish Management Revue - Socio-Economic Studies as a top-quality symposium journal for the international academic community.