Topic modelling applied on innovation studies of Flemish companies

IF 1.7, JCR Q3 (Computer Science, Information Systems)
Annelien Crijns, Victor Vanhullebusch, Manon Reusens, Michael Reusens, B. Baesens
Journal of Business Analytics. Published 2023-03-03. Journal article.
DOI: 10.1080/2573234X.2023.2186274
https://doi.org/10.1080/2573234X.2023.2186274
Citations: 1

Abstract

Mapping innovation in companies for official statistics is usually done through business surveys. This traditional approach has several drawbacks, including low response rates, response bias, low frequency, and high costs. As an alternative, text-based models trained on text scraped from company websites have been developed to complement or replace traditional business surveys. This paper uses web scraping and text-based models to map business innovation in Flanders, with a focus on identifying different types of innovation through topic modelling. More specifically, the scraped web texts are used to identify innovative economic sectors or topics and to classify firms into these topics using Top2Vec and Lbl2Vec. We conclude that the two models can be successfully combined to discover topics (or sectors) and to classify companies into them, which yields an additional parameter for mapping innovation in different regions.
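The classification step described in the abstract can be illustrated with a toy sketch. This is not the authors' pipeline and it does not use the Top2Vec or Lbl2Vec libraries: it swaps their learned embeddings for plain bag-of-words vectors and assigns each company text to the topic whose keyword list it is most similar to by cosine similarity. The topic names, keyword lists, and company texts below are hypothetical and chosen only for illustration.

```python
import math
import re
from collections import Counter


def bow(text):
    """Return a lower-cased bag-of-words vector as token counts."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors (0.0 if either is empty)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    norm = norm_a * norm_b
    return dot / norm if norm else 0.0


def classify(text, topic_keywords):
    """Return the best-matching topic and the similarity score for every topic."""
    doc = bow(text)
    scores = {
        topic: cosine(doc, Counter(words))
        for topic, words in topic_keywords.items()
    }
    best = max(scores, key=scores.get)
    return best, scores


# Hypothetical topics and keywords (assumption: not taken from the paper).
TOPIC_KEYWORDS = {
    "biotech": ["vaccine", "clinical", "protein", "laboratory"],
    "software": ["cloud", "software", "platform", "data"],
}

# Hypothetical scraped website texts (assumption: invented examples).
WEBSITE_TEXTS = {
    "company_a": "Our laboratory develops protein based vaccine candidates for clinical trials.",
    "company_b": "We build a cloud platform that turns raw data into software products.",
}

labels = {
    name: classify(text, TOPIC_KEYWORDS)[0]
    for name, text in WEBSITE_TEXTS.items()
}
print(labels)
```

In an embedding-based setup the count vectors would be replaced by jointly learned document and word embeddings, so that texts can match a topic without sharing its exact keywords.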
Source journal
Journal of Business Analytics (Business, Management and Accounting: Management Information Systems)
CiteScore: 2.50
Self-citation rate: 0.00%
Articles published: 13