The Value of Publicly Available, Textual and Non-Textual Information for Startup Performance Prediction

Entrepreneurship & Finance eJournal. Pub Date: 2020-11-01. DOI: 10.2139/ssrn.3570379

Ulrich Kaiser, J. Kuhn


Citations: 14

Abstract

We use administrative textual and non-textual data retrieved from publicly available archives to predict the performance of Danish startups at the time of foundation. The performance outcomes we consider are survival, high employment growth, a return on assets above 20 percent, new patent applications, and participation in an innovation subsidy program. Every specification includes a base set of variables for legal form, region, ownership, and industry; to this base we add variable sets representing firm names, business purpose statements (BPSs), and founder and startup characteristics. To forecast the two innovation-related outcomes well, we only need to add a set of variables derived from the BPS texts to the base variables, whereas an accurate prediction of startup survival requires combining the firm name and BPS variables with founder characteristics. An accurate forecast of high employment growth requires combining the BPS variables with the founder characteristics. All information our forecasts require is likely to be easily obtainable, since reporting it is mandatory upon business registration in many countries. The substantial accuracy of our predictions for survival, employment growth, new patents, and participation in innovation subsidy programs indicates ample scope for algorithmic scoring models as an additional pillar of funding and innovation support decisions.
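The nested design described in the abstract (a base set of variables present in every specification, with further variable sets added on top) can be illustrated with a minimal sketch. The identifiers below are hypothetical labels chosen for illustration, not names from the paper, and the abstract does not state whether the authors evaluate every possible combination; the sketch simply enumerates all of them.

```python
from itertools import combinations

# Base variables that the abstract says are included in all specifications.
# The snake_case labels are hypothetical, chosen only for this illustration.
BASE = ("legal_form", "region", "ownership", "industry")

# Additional variable sets named in the abstract (labels are again hypothetical).
OPTIONAL_SETS = (
    "firm_name",
    "bps_text",
    "founder_characteristics",
    "startup_characteristics",
)


def enumerate_specifications(base=BASE, optional=OPTIONAL_SETS):
    """Return every specification: the base set plus any subset of optional sets."""
    specs = []
    for k in range(len(optional) + 1):
        for combo in combinations(optional, k):
            specs.append(tuple(base) + combo)
    return specs


specs = enumerate_specifications()
```

Each entry in `specs` could then be passed to whatever classifier is being compared, so that the gain from adding, say, the BPS variables to the base set can be read off directly.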
