Simone Angioni, Angelo Salatino, Francesco Osborne, Diego Reforgiato, Recupero, E. Motta
{"title":"AIDA:关于学术界和工业界研究动态的知识图谱","authors":"Simone Angioni, Angelo Salatino, Francesco Osborne, Diego Reforgiato, Recupero, E. Motta","doi":"10.1162/qss_a_00162","DOIUrl":null,"url":null,"abstract":"Abstract Academia and industry share a complex, multifaceted, and symbiotic relationship. Analyzing the knowledge flow between them, understanding which directions have the biggest potential, and discovering the best strategies to harmonize their efforts is a critical task for several stakeholders. Research publications and patents are an ideal medium to analyze this space, but current data sets of scholarly data cannot be used for such a purpose because they lack a high-quality characterization of the relevant research topics and industrial sectors. In this paper, we introduce the Academia/Industry DynAmics (AIDA) Knowledge Graph, which describes 21 million publications and 8 million patents according to the research topics drawn from the Computer Science Ontology. 5.1 million publications and 5.6 million patents are further characterized according to the type of the author’s affiliations and 66 industrial sectors from the proposed Industrial Sectors Ontology (INDUSO). AIDA was generated by an automatic pipeline that integrates data from Microsoft Academic Graph, Dimensions, DBpedia, the Computer Science Ontology, and the Global Research Identifier Database. It is publicly available under CC BY 4.0 and can be downloaded as a dump or queried via a triplestore. We evaluated the different parts of the generation pipeline on a manually crafted gold standard yielding competitive results.","PeriodicalId":34021,"journal":{"name":"Quantitative Science Studies","volume":"2 1","pages":"1356-1398"},"PeriodicalIF":4.1000,"publicationDate":"2021-11-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"22","resultStr":"{\"title\":\"AIDA: A knowledge graph about research dynamics in academia and industry\",\"authors\":\"Simone Angioni, Angelo Salatino, Francesco Osborne, Diego Reforgiato, Recupero, E. Motta\",\"doi\":\"10.1162/qss_a_00162\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Abstract Academia and industry share a complex, multifaceted, and symbiotic relationship. Analyzing the knowledge flow between them, understanding which directions have the biggest potential, and discovering the best strategies to harmonize their efforts is a critical task for several stakeholders. Research publications and patents are an ideal medium to analyze this space, but current data sets of scholarly data cannot be used for such a purpose because they lack a high-quality characterization of the relevant research topics and industrial sectors. In this paper, we introduce the Academia/Industry DynAmics (AIDA) Knowledge Graph, which describes 21 million publications and 8 million patents according to the research topics drawn from the Computer Science Ontology. 5.1 million publications and 5.6 million patents are further characterized according to the type of the author’s affiliations and 66 industrial sectors from the proposed Industrial Sectors Ontology (INDUSO). AIDA was generated by an automatic pipeline that integrates data from Microsoft Academic Graph, Dimensions, DBpedia, the Computer Science Ontology, and the Global Research Identifier Database. It is publicly available under CC BY 4.0 and can be downloaded as a dump or queried via a triplestore. We evaluated the different parts of the generation pipeline on a manually crafted gold standard yielding competitive results.\",\"PeriodicalId\":34021,\"journal\":{\"name\":\"Quantitative Science Studies\",\"volume\":\"2 1\",\"pages\":\"1356-1398\"},\"PeriodicalIF\":4.1000,\"publicationDate\":\"2021-11-05\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"22\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Quantitative Science Studies\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1162/qss_a_00162\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"INFORMATION SCIENCE & LIBRARY SCIENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Quantitative Science Studies","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1162/qss_a_00162","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"INFORMATION SCIENCE & LIBRARY SCIENCE","Score":null,"Total":0}
引用次数: 22
摘要
学术界和产业界有着复杂的、多方面的、共生的关系。分析他们之间的知识流动,了解哪个方向具有最大的潜力,并发现协调他们努力的最佳策略是几个利益相关者的关键任务。研究出版物和专利是分析这一领域的理想媒介,但目前的学术数据集不能用于这一目的,因为它们缺乏对相关研究主题和工业部门的高质量描述。本文引入了学术界/行业动态(AIDA)知识图谱,该图谱根据从计算机科学本体中提取的研究主题描述了2100万份出版物和800万项专利,并根据作者所属单位的类型和提出的工业部门本体(INDUSO)中的66个工业部门进一步描述了510万份出版物和560万项专利。AIDA是由一个自动管道生成的,该管道集成了来自微软学术图、维度、DBpedia、计算机科学本体和全球研究标识数据库的数据。它在CC BY 4.0下公开提供,可以作为转储文件下载或通过triplestore查询。我们在手工制作的黄金标准上评估了生成管道的不同部分,产生了具有竞争力的结果。
AIDA: A knowledge graph about research dynamics in academia and industry
Abstract Academia and industry share a complex, multifaceted, and symbiotic relationship. Analyzing the knowledge flow between them, understanding which directions have the biggest potential, and discovering the best strategies to harmonize their efforts is a critical task for several stakeholders. Research publications and patents are an ideal medium to analyze this space, but current data sets of scholarly data cannot be used for such a purpose because they lack a high-quality characterization of the relevant research topics and industrial sectors. In this paper, we introduce the Academia/Industry DynAmics (AIDA) Knowledge Graph, which describes 21 million publications and 8 million patents according to the research topics drawn from the Computer Science Ontology. 5.1 million publications and 5.6 million patents are further characterized according to the type of the author’s affiliations and 66 industrial sectors from the proposed Industrial Sectors Ontology (INDUSO). AIDA was generated by an automatic pipeline that integrates data from Microsoft Academic Graph, Dimensions, DBpedia, the Computer Science Ontology, and the Global Research Identifier Database. It is publicly available under CC BY 4.0 and can be downloaded as a dump or queried via a triplestore. We evaluated the different parts of the generation pipeline on a manually crafted gold standard yielding competitive results.