Using Semantic Linking to Understand Persons’ Networks Extracted from Text
Alessio Palmero Aprosio, Sara Tonelli, S. Menini, Giovanni Moretti
Frontiers Digit. Humanit., published 2017-11-16. DOI: 10.3389/fdigh.2017.00022 (https://doi.org/10.3389/fdigh.2017.00022)
Abstract
In this work, we describe a methodology for interpreting large persons' networks extracted from text by classifying cliques using the DBpedia ontology. The approach combines NLP, Semantic Web technologies and network analysis. The classification methodology, which starts from single nodes and then generalises to cliques, is effective and can also handle nodes that are not linked to Wikipedia. The manually developed gold standard used for evaluation shows that groups of co-occurring entities in most cases share a category that can be assigned automatically. This holds for both languages considered in this study. The outcome of this work may help enhance the readability of large networks and provide an additional semantic layer on top of cliques, which would greatly help humanities scholars dealing with large amounts of textual data that need to be interpreted or categorised. Furthermore, it represents an unsupervised approach to automatically extending DBpedia starting from a corpus.
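
The idea of going from single nodes to cliques can be illustrated with a minimal Python sketch. It is not the authors' implementation: the class hierarchy is a toy fragment standing in for the DBpedia ontology, the person names and co-occurrence edges are invented, and the `min_share` coverage threshold is an assumption made for illustration. Each linked node carries one ontology class, and a clique receives the most specific class that covers enough of its linked members. Unlinked nodes (here `Neri`) have no class of their own and can be read through the label of the clique they belong to.

```python
from collections import Counter

# Toy fragment of a class hierarchy (child -> parent); illustrative only,
# standing in for the DBpedia ontology.
PARENT = {
    "President": "Politician",
    "Senator": "Politician",
    "Politician": "Person",
    "SoccerPlayer": "Athlete",
    "Athlete": "Person",
}


def lineage(cls):
    """Return cls followed by its ancestors, most specific first."""
    chain = [cls]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain


def build_adjacency(edges):
    """Turn undirected co-occurrence edges into an adjacency map."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def maximal_cliques(adj):
    """Enumerate maximal cliques with the basic Bron-Kerbosch recursion."""
    found = []

    def expand(r, p, x):
        if not p and not x:
            found.append(frozenset(r))
            return
        for v in list(p):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(adj), set())
    return found


def classify_clique(clique, node_classes, min_share=0.6):
    """Pick the most specific class covering at least min_share of linked nodes."""
    linked = [n for n in clique if n in node_classes]
    if not linked:
        return None
    # Each node votes for its own class and for every ancestor of that class.
    counts = Counter(c for n in linked for c in lineage(node_classes[n]))
    eligible = [c for c, k in counts.items() if k / len(linked) >= min_share]
    if not eligible:
        return None
    # Deeper classes have longer lineages; prefer them, break ties by name.
    return max(eligible, key=lambda c: (len(lineage(c)), c))


def label_cliques(edges, node_classes, min_share=0.6):
    """Map every maximal clique in the network to a class label (or None)."""
    cliques = maximal_cliques(build_adjacency(edges))
    return {c: classify_clique(c, node_classes, min_share) for c in cliques}


# Hypothetical co-occurrence network; "Neri" is not linked to any class.
edges = [
    ("Rossi", "Bianchi"), ("Rossi", "Verdi"), ("Bianchi", "Verdi"),
    ("Rossi", "Neri"), ("Bianchi", "Neri"), ("Verdi", "Neri"),
    ("Costa", "Greco"),
]
node_classes = {
    "Rossi": "President", "Bianchi": "Senator", "Verdi": "Politician",
    "Costa": "SoccerPlayer", "Greco": "SoccerPlayer",
}

labels = label_cliques(edges, node_classes)
for clique, label in labels.items():
    print(sorted(clique), "->", label)
```

Voting over each class together with its ancestors lets a clique of mixed but related classes fall back to a shared superclass instead of remaining unlabelled. The threshold value and the tie-breaking rule are design choices of this sketch, not details reported in the abstract.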