Determining research priorities using machine learning

IF 1.9 · Zone 4 (Physics and Astrophysics) · JCR Q2 (Astronomy & Astrophysics)
B.A. Thomas, A. Buonomo, H. Thronson, L. Barbier
Astronomy and Computing, volume 49, Article 100879. Published 2024-09-10. DOI: 10.1016/j.ascom.2024.100879
https://www.sciencedirect.com/science/article/pii/S2213133724000945
Citations: 0

Abstract

We summarize our exploratory investigation into whether Machine Learning (ML) techniques applied to publicly available professional text can substantially augment strategic planning for astronomy. We find that an approach based on Latent Dirichlet Allocation (LDA), using content drawn from astronomy journal papers, can be used to infer high-priority research areas. Although the LDA models are challenging to interpret on their own, we find that they may be strongly associated with meaningful keywords and scientific papers, which allows for human interpretation of the topic models.
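
To make the LDA step concrete, here is a minimal sketch in Python, assuming a toy corpus of tokenized text and a plain collapsed Gibbs sampler. The abstract does not describe the implementation, so the function names (`fit_lda`, `top_words`), the hyperparameters, and the example documents are illustrative assumptions and not the authors' pipeline.

```python
import random


def fit_lda(docs, n_topics, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA by collapsed Gibbs sampling on tokenized documents.

    Returns (doc_topic, topic_word, vocab): doc_topic[d][k] is the smoothed
    share of topic k in document d, and topic_word[k][w] is the smoothed
    probability of vocabulary word w under topic k.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    n_words = len(vocab)

    # Count tables: document-topic, topic-word, and per-topic totals.
    ndk = [[0] * n_topics for _ in docs]
    nkw = [[0] * n_words for _ in range(n_topics)]
    nk = [0] * n_topics

    # Start from a random topic assignment for every token.
    z = []
    for d, doc in enumerate(docs):
        z.append([rng.randrange(n_topics) for _ in doc])
        for w, k in zip(doc, z[d]):
            ndk[d][k] += 1
            nkw[k][word_id[w]] += 1
            nk[k] += 1

    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                wid, k = word_id[w], z[d][i]
                # Remove this token's current assignment from the counts.
                ndk[d][k] -= 1
                nkw[k][wid] -= 1
                nk[k] -= 1
                # Unnormalized conditional probability of each topic.
                weights = [
                    (ndk[d][t] + alpha) * (nkw[t][wid] + beta)
                    / (nk[t] + n_words * beta)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                z[d][i] = k
                ndk[d][k] += 1
                nkw[k][wid] += 1
                nk[k] += 1

    doc_topic = [
        [(ndk[d][k] + alpha) / (len(doc) + n_topics * alpha)
         for k in range(n_topics)]
        for d, doc in enumerate(docs)
    ]
    topic_word = [
        [(nkw[k][w] + beta) / (nk[k] + n_words * beta) for w in range(n_words)]
        for k in range(n_topics)
    ]
    return doc_topic, topic_word, vocab


def top_words(topic_word, vocab, n=3):
    """Return the n most probable words for each topic."""
    return [
        [vocab[i] for i in sorted(range(len(vocab)), key=lambda i: -row[i])[:n]]
        for row in topic_word
    ]


# Hypothetical toy corpus standing in for text drawn from journal papers.
docs = [
    "exoplanet transit atmosphere spectrum exoplanet".split(),
    "exoplanet atmosphere transit habitable spectrum".split(),
    "galaxy cluster dark matter lensing galaxy".split(),
    "dark matter halo galaxy lensing cluster".split(),
]
doc_topic, topic_word, vocab = fit_lda(docs, n_topics=2)
keywords = top_words(topic_word, vocab, n=3)
```

The keyword lists and the per-document topic shares are the kind of handle the abstract refers to when it says topics can be tied to meaningful keywords and papers. A production analysis would more likely use an established library than a hand-written sampler.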

A significant correlation is found between the results of applying these models to the previous decade of astronomical research (“1998–2010” corpus) and the contents of the Science Frontier Panels report, which contains the high-priority research areas identified by the 2010 National Academies’ Astronomy and Astrophysics Decadal Survey (“DS2010” corpus). Significant correlations also exist between the model results for the 1998–2010 corpus and the whitepapers submitted to the Decadal Survey (“whitepapers” corpus). Importantly, we derive predictive metrics from these results that can provide leading indicators of which content modeled by the topic models will become highly cited in the future. Using these metrics and the associations between papers and topic models, it is possible to identify important papers for planners to consider.
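
One way to picture a correlation between corpora is to compare how strongly each topic is represented in each of them. The sketch below assumes that every corpus has been reduced to one weight per topic and uses a Spearman rank correlation. The abstract does not name the statistic the authors used, so the choice of Spearman, the helper names, and the numbers are assumptions made purely for illustration.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1 (smallest), giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for m in range(i, j + 1):
            ranks[order[m]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Made-up per-topic weights for two corpora (not data from the paper).
weights_journal_corpus = [0.30, 0.25, 0.20, 0.15, 0.10]
weights_survey_corpus = [0.28, 0.22, 0.24, 0.16, 0.10]
rho = spearman(weights_journal_corpus, weights_survey_corpus)
```

A value of `rho` near 1 would mean the two corpora rank the topics in nearly the same order of prominence. Judging significance would additionally require a p-value or a permutation test, which this sketch omits.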

A preliminary version of our work was presented by Thronson et al. (2021) and Thomas et al. (2022).

Source journal
Astronomy and Computing
Subject categories: Astronomy & Astrophysics; Computer Science, Interdisciplinary Applications
CiteScore: 4.10
Self-citation rate: 8.00%
Articles published: 67
About the journal: Astronomy and Computing is a peer-reviewed journal that focuses on the broad area between astronomy, computer science and information technology. The journal aims to publish the work of scientists and (software) engineers in all aspects of astronomical computing, including the collection, analysis, reduction, visualisation, preservation and dissemination of data, and the development of astronomical software and simulations. The journal covers applications of academic computer science techniques to astronomy, as well as novel applications of information technologies within astronomy.