B.A. Thomas, A. Buonomo, H. Thronson, L. Barbier
Astronomy and Computing, volume 49, Article 100879 (published 2024-09-10). DOI: 10.1016/j.ascom.2024.100879
https://www.sciencedirect.com/science/article/pii/S2213133724000945
Determining research priorities using machine learning
We summarize our exploratory investigation into whether Machine Learning (ML) techniques applied to publicly available professional text can substantially augment strategic planning for astronomy. We find that an approach based on Latent Dirichlet Allocation (LDA), using content drawn from astronomy journal papers, can be used to infer high-priority research areas. While the LDA models are challenging to interpret, we find that they may be strongly associated with meaningful keywords and scientific papers, which allows for human interpretation of the topic models.
Significant correlation is found between the results of applying these models to the previous decade of astronomical research (“1998–2010” corpus) and the contents of the Science Frontier Panels report, which contains the high-priority research areas identified by the 2010 National Academies’ Astronomy and Astrophysics Decadal Survey (“DS2010” corpus). Significant correlations also exist between model results for the 1998–2010 corpus and the whitepapers submitted to the Decadal Survey (“whitepapers” corpus). Importantly, we derive predictive metrics from these results that can provide leading indicators of which content modeled by the topic models will become highly cited in the future. Using these metrics and the associations between papers and topic models, it is possible to identify important papers for planners to consider.
A preliminary version of our work was presented by Thronson et al. (2021) and Thomas et al. (2022).
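As an illustration of the kind of topic modeling the abstract refers to, the following is a minimal sketch of LDA fitted by collapsed Gibbs sampling in plain Python. It is not the authors' implementation: the abstract does not say which inference method, library, or hyperparameters were used, and the function name `fit_lda`, the `alpha` and `beta` values, and the toy token lists are all assumptions made for this example.

```python
import random


def fit_lda(docs, n_topics, n_iters=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA by collapsed Gibbs sampling (illustrative sketch only).

    docs: list of documents, each a list of string tokens.
    Returns (vocab, theta, phi), where theta[d][k] is the estimated weight
    of topic k in document d and phi[k][v] is the estimated probability of
    vocabulary word v under topic k.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    n_words = len(vocab)
    word_id = {w: i for i, w in enumerate(vocab)}

    # Count tables maintained by the sampler.
    doc_topic = [[0] * n_topics for _ in docs]
    topic_word = [[0] * n_words for _ in range(n_topics)]
    topic_total = [0] * n_topics

    # Start from a random topic assignment for every token.
    assignments = []
    for d, doc in enumerate(docs):
        z_d = []
        for w in doc:
            k = rng.randrange(n_topics)
            z_d.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[w]] += 1
            topic_total[k] += 1
        assignments.append(z_d)

    for _ in range(n_iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v = word_id[w]
                # Remove the token's current assignment from the counts.
                k = assignments[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][v] -= 1
                topic_total[k] -= 1
                # Conditional probability of each topic for this token,
                # up to a normalizing constant.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][v] + beta)
                    / (topic_total[t] + n_words * beta)
                    for t in range(n_topics)
                ]
                # Resample the topic and restore the counts.
                k = rng.choices(range(n_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][v] += 1
                topic_total[k] += 1

    # Smoothed point estimates from the final counts.
    theta = [
        [(c + alpha) / (len(doc) + n_topics * alpha) for c in doc_topic[d]]
        for d, doc in enumerate(docs)
    ]
    phi = [
        [(c + beta) / (topic_total[t] + n_words * beta) for c in topic_word[t]]
        for t in range(n_topics)
    ]
    return vocab, theta, phi


# Toy, made-up token lists standing in for preprocessed abstracts.
toy_docs = [
    ["exoplanet", "transit", "atmosphere", "exoplanet", "spectrum"],
    ["galaxy", "cluster", "dark", "matter", "lensing"],
    ["exoplanet", "atmosphere", "spectrum", "transit"],
    ["dark", "matter", "galaxy", "lensing", "cluster"],
]

vocab, theta, phi = fit_lda(toy_docs, n_topics=2)

# Show the three highest-probability words for each topic.
for t, row in enumerate(phi):
    top = sorted(range(len(vocab)), key=lambda v: row[v], reverse=True)[:3]
    print(f"topic {t}:", [vocab[v] for v in top])
```

Each row of `theta` is a document's mixture over topics, and each row of `phi` is a topic's distribution over the vocabulary. Inspecting the highest-probability words per topic is one way a human reader might interpret a topic, in the spirit of the keyword associations the abstract describes.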
Astronomy and Computing
Categories: ASTRONOMY & ASTROPHYSICS; COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS
CiteScore: 4.10
Self-citation rate: 8.00%
Articles published: 67
About the journal:
Astronomy and Computing is a peer-reviewed journal that focuses on the broad area between astronomy, computer science and information technology. The journal aims to publish the work of scientists and (software) engineers in all aspects of astronomical computing, including the collection, analysis, reduction, visualisation, preservation and dissemination of data, and the development of astronomical software and simulations. The journal covers applications of academic computer science techniques to astronomy, as well as novel applications of information technologies within astronomy.