Predicting Role Relevance with Minimal Domain Expertise in a Financial Domain

Proceedings of the 3rd International Workshop on Data Science for Macro-Modeling with Financial and Economic Datasets

Pub Date: 2017-04-19

DOI: 10.1145/3077240.3077249

M. Kejriwal

Citations: 2

Abstract

Word embeddings have made enormous inroads in recent years in a wide variety of text mining applications. In this extended abstract, we explore a word embedding-based architecture for predicting the relevance of a role between two financial entities within the context of natural language sentences. We propose a pooled approach that trains word embeddings on a collection of sentences using the skip-gram word2vec architecture. We use the word embeddings to obtain context vectors, each of which is assigned one or more labels based on manual annotations. We then train a machine learning classifier on the labeled context vectors and use the trained classifier to predict contextual role relevance on test data. Our approach serves as a good minimal-expertise baseline for the task: it is simple and intuitive, uses open-source modules, requires little feature-crafting effort, and performs well across roles.
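The pipeline described above (embeddings, then labeled context vectors, then a classifier) might be sketched roughly as follows. This is a minimal illustration under several assumptions, not the author's implementation: the tiny `EMBEDDINGS` table is a hand-written stand-in for vectors that would come from a trained skip-gram word2vec model, averaging word vectors is only one plausible way to build a context vector (the abstract does not say how they are derived), a nearest-centroid rule is swapped in for the unspecified machine learning classifier, and the sentences and the `relevant` / `irrelevant` labels are hypothetical.

```python
from collections import defaultdict
from math import sqrt

# Hypothetical 3-dimensional vectors standing in for trained skip-gram
# word2vec embeddings; real embeddings would be learned from a corpus.
EMBEDDINGS = {
    "acts": [0.9, 0.1, 0.0],
    "as": [0.5, 0.5, 0.0],
    "trustee": [1.0, 0.0, 0.1],
    "for": [0.4, 0.4, 0.1],
    "the": [0.3, 0.3, 0.3],
    "trust": [0.8, 0.1, 0.1],
    "is": [0.3, 0.4, 0.3],
    "located": [0.0, 0.9, 0.2],
    "near": [0.1, 1.0, 0.1],
    "office": [0.1, 0.8, 0.3],
}


def context_vector(tokens, embeddings, dim=3):
    """Average the embeddings of known tokens (an assumed way to build a context vector)."""
    known = [embeddings[t] for t in tokens if t in embeddings]
    if not known:
        return [0.0] * dim
    return [sum(vec[i] for vec in known) / len(known) for i in range(dim)]


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def train_centroids(examples, embeddings):
    """Compute one mean context vector per label from (tokens, label) pairs."""
    grouped = defaultdict(list)
    for tokens, label in examples:
        grouped[label].append(context_vector(tokens, embeddings))
    return {
        label: [sum(column) / len(vectors) for column in zip(*vectors)]
        for label, vectors in grouped.items()
    }


def predict(tokens, centroids, embeddings):
    """Return the label whose centroid is most similar to the context vector."""
    vec = context_vector(tokens, embeddings)
    return max(centroids, key=lambda label: cosine(vec, centroids[label]))


# Hypothetical manually annotated training contexts.
TRAIN = [
    ("acts as trustee for the trust".split(), "relevant"),
    ("office is located near the office".split(), "irrelevant"),
]

centroids = train_centroids(TRAIN, EMBEDDINGS)
print(predict("trustee for trust".split(), centroids, EMBEDDINGS))
print(predict("located near".split(), centroids, EMBEDDINGS))
```

In a fuller version, the embedding table would be replaced by vectors from an open-source word2vec implementation trained on the pooled sentences, and the centroid rule by any standard classifier; the surrounding structure would stay the same.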

