Deciphering Entrepreneurial Pitches: A Multimodal Deep Learning Approach to Predict Probability of Investment

Companion Publication of the 2020 International Conference on Multimodal Interaction. Pub Date: 2023-10-09. DOI: 10.1145/3577190.3614146

Pepijn Van Aken, Merel M. Jung, Werner Liebregts, Itir Onal Ertugrul


Citations: 1

Abstract

Acquiring early-stage investments for the purpose of developing a business is a fundamental aspect of the entrepreneurial process, which regularly entails pitching the business proposal to potential investors. Previous research suggests that business viability data and the perception of the entrepreneur play an important role in the investment decision-making process. This perception of the entrepreneur is shaped by verbal and non-verbal behavioral cues produced in investor-entrepreneur interactions. This study explores the impact of such cues on decisions that involve investing in a startup on the basis of a pitch. A multimodal approach is developed in which acoustic and linguistic features are extracted from recordings of entrepreneurial pitches to predict the likelihood of investment. The acoustic and linguistic modalities are represented using both hand-crafted and deep features. The capabilities of deep learning models are exploited to capture the temporal dynamics of the inputs. The findings show promising results for the prediction of the likelihood of investment using a multimodal architecture consisting of acoustic and linguistic features. Models based on deep features generally outperform hand-crafted representations. Experiments with an explainable model provide insights about the important features. The most predictive model is found to be a multimodal one that combines deep acoustic and linguistic features using an early fusion strategy and achieves an MAE of 13.91.
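The early fusion strategy and the MAE metric mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the feature dimensions, the synthetic data, the 0 to 100 target scale, and the ridge regressor (a simple stand-in for the deep temporal models used in the paper) are all assumptions made for illustration, as are the helper names `early_fusion`, `fit_ridge`, `predict`, and `mean_absolute_error`.

```python
import numpy as np


def early_fusion(acoustic, linguistic):
    """Early fusion: concatenate per-pitch feature vectors of both modalities."""
    return np.concatenate([acoustic, linguistic], axis=1)


def fit_ridge(X, y, alpha=1.0):
    """Closed-form ridge regression (stand-in model, not the paper's network)."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])  # append a bias column
    penalty = alpha * np.eye(Xb.shape[1])
    penalty[-1, -1] = 0.0  # leave the bias term unpenalized
    return np.linalg.solve(Xb.T @ Xb + penalty, Xb.T @ y)


def predict(X, w):
    """Apply the fitted weights (including bias) to new fused features."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    return Xb @ w


def mean_absolute_error(y_true, y_pred):
    """MAE: average absolute difference between targets and predictions."""
    return float(np.mean(np.abs(y_true - y_pred)))


# Synthetic placeholders (assumption): 60 pitches, 8 acoustic and 5 linguistic
# features each, with investment likelihood on an assumed 0-100 scale.
rng = np.random.default_rng(0)
n_pitches = 60
acoustic = rng.normal(size=(n_pitches, 8))
linguistic = rng.normal(size=(n_pitches, 5))
likelihood = rng.uniform(0, 100, size=n_pitches)

fused = early_fusion(acoustic, linguistic)  # one joint vector per pitch
weights = fit_ridge(fused[:40], likelihood[:40])  # train on the first 40
mae = mean_absolute_error(likelihood[40:], predict(fused[40:], weights))
print(f"MAE on synthetic hold-out: {mae:.2f}")
```

Because the data here is random, the printed MAE carries no meaning; the sketch only shows where fusion happens (before the model sees either modality) and how the reported metric is computed.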

