Authors: Jouyon Park, Hyunsouk Cho, Seung-won Hwang
Published in: Proceedings of the 3rd International Workshop on Data Science for Macro-Modeling with Financial and Economic Datasets, 2017-05-14
DOI: https://doi.org/10.1145/3077240.3077250
Understanding Relations using Concepts and Semantics
The Financial Entity Identification and Information Integration (FEIII) task addresses the problem of understanding relationships among financial entities and their roles, using three sentences extracted from each financial contract that contain the target word. The FEIII task poses two challenges: 1) data sparseness, since the training set is small (9% of the test data), and 2) context sparseness, since the context is limited to three sentences. Existing statistical approaches, such as Bayes and TF-IDF, cannot evaluate the importance of words unobserved in the training data, which makes them vulnerable to both challenges. We overcome each challenge by considering 1) the concepts of words from knowledge bases (Probase) in addition to the words themselves (conceptual feature) and 2) word semantics from distributed representations such as word2vec (semantic feature). We empirically evaluate the proposed classification model on a four-class classification task (highly relevant, relevant, neutral, and irrelevant) and show that it improves the F1-score by 18% over the statistical baselines.
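The idea of backing off from unseen words to concepts and embeddings can be illustrated with a minimal sketch. This is not the authors' implementation: the lookup tables `TOY_CONCEPTS` and `TOY_VECTORS` are hypothetical stand-ins for Probase and word2vec, and the `featurize` helper and example sentences are invented for illustration only.

```python
from collections import Counter

# Hypothetical stand-in for a Probase-style word -> concepts lookup.
TOY_CONCEPTS = {
    "trustee": ["role"],
    "custodian": ["role"],
    "bank": ["financial institution"],
    "insurer": ["financial institution"],
}

# Hypothetical 2-d vectors standing in for word2vec embeddings.
TOY_VECTORS = {
    "trustee": [0.9, 0.1],
    "custodian": [0.8, 0.2],
    "bank": [0.1, 0.9],
    "insurer": [0.2, 0.8],
}
DIM = 2


def featurize(tokens):
    """Return (sparse word+concept counts, mean embedding) for a token list."""
    sparse = Counter()
    for tok in tokens:
        sparse["word=" + tok] += 1
        # Conceptual feature: add each concept the word belongs to.
        for concept in TOY_CONCEPTS.get(tok, []):
            sparse["concept=" + concept] += 1
    # Semantic feature: average the vectors of tokens that have one.
    known = [TOY_VECTORS[t] for t in tokens if t in TOY_VECTORS]
    if known:
        dense = [sum(v[i] for v in known) / len(known) for i in range(DIM)]
    else:
        dense = [0.0] * DIM
    return sparse, dense


# Invented "training" and "test" contexts with different entity/role words.
train_sparse, train_dense = featurize(["bank", "acts", "as", "trustee"])
test_sparse, test_dense = featurize(["insurer", "acts", "as", "custodian"])

# Features the two contexts have in common.
shared = sorted(set(train_sparse) & set(test_sparse))
print(shared)
print(train_dense, test_dense)
```

In this toy case the words "insurer" and "custodian" never appear in the training context, so word-level features alone share nothing informative with it. The concept features (`concept=role`, `concept=financial institution`) and the averaged embeddings still overlap, which is the kind of signal the conceptual and semantic features are meant to provide.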