Inference of Researchers' Area of Expertise

2018 7th Brazilian Conference on Intelligent Systems (BRACIS). Pub Date: 2018-10-01. DOI: 10.1109/bracis.2018.00020

Felipe Penhorate Carvalho da Fonseca, Luciano Antonio Digiampietri




Inference of Researchers' Area of Expertise

A wide range of academic data is now available on the web. This data supports tasks such as discovering specialists in a given area, identifying potential scholarship holders, and suggesting collaborators. The success of these tasks, however, depends on the quality of the data, since incorrect or incomplete records tend to impair the performance of the algorithms applied to them. This work uses machine learning techniques to infer researchers' areas from the data registered in the Lattes Platform, using subareas as a case study. Inferring subareas is a more challenging variant of the original problem because the number of classes is larger. The goal of this paper is to analyze how factors such as social network metrics, the language of the titles, and the hierarchical structure of the areas contribute to the performance of the algorithms, and to propose a new approach combining different characteristics. The proposed approach can be applied to other academic data, but data from the Lattes Platform was used to test and validate it. The results show that social network metrics and numerical representations of the data improved inference accuracy compared with state-of-the-art techniques, and that using the hierarchical structure information achieved even better results.
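The abstract gives no implementation details, so the following is only a minimal sketch of the general idea of combining title text, co-authorship information, and an area hierarchy. The toy data, every name (TITLES, SUBAREA, PARENT, COAUTHORS, predict), and the simple additive scoring are assumptions made for illustration; this is not the authors' method and does not use Lattes data.

```python
from collections import Counter

# Hypothetical toy data (not from the Lattes Platform).
TITLES = {
    "ana": ["graph mining for social networks", "community detection in graphs"],
    "bruno": ["deep learning for image classification"],
    "carla": ["protein folding simulation", "gene expression analysis"],
    "davi": ["link prediction in coauthorship graphs"],
}
# Known subareas; "davi" is unlabelled and is the one we try to infer.
SUBAREA = {"ana": "Data Mining", "bruno": "Machine Learning", "carla": "Bioinformatics"}
# Assumed two-level hierarchy: subarea -> area.
PARENT = {
    "Data Mining": "Computer Science",
    "Machine Learning": "Computer Science",
    "Bioinformatics": "Biology",
}
# Undirected co-authorship edges.
COAUTHORS = [("ana", "davi"), ("bruno", "davi"), ("ana", "bruno")]


def title_words(researcher):
    """Set of lower-cased words over all titles of one researcher."""
    return {w for title in TITLES[researcher] for w in title.lower().split()}


def title_scores(researcher):
    """Per subarea, count title words shared with labelled researchers."""
    mine = title_words(researcher)
    scores = Counter()
    for other, label in SUBAREA.items():
        if other != researcher:
            scores[label] += len(mine & title_words(other))
    return scores


def neighbor_votes(researcher):
    """Per subarea, count labelled co-authors (a simple network feature)."""
    votes = Counter()
    for a, b in COAUTHORS:
        if researcher in (a, b):
            other = b if a == researcher else a
            if other in SUBAREA:
                votes[SUBAREA[other]] += 1
    return votes


def predict(researcher):
    """Pick the best area first, then the best subarea inside that area."""
    # Counter addition keeps only subareas with a positive combined score.
    scores = title_scores(researcher) + neighbor_votes(researcher)
    area_scores = Counter()
    for subarea, score in scores.items():
        area_scores[PARENT[subarea]] += score
    area = max(area_scores, key=area_scores.get)
    subarea = max((s for s in scores if PARENT[s] == area), key=scores.get)
    return area, subarea


print(predict("davi"))
```

Choosing the area before the subarea is one simple way to use the hierarchy: evidence for sibling subareas is pooled at the area level, and the final choice is restricted to children of the selected area. A real system would replace the word-overlap and vote counts with learned features and a trained classifier.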
