Unsupervised learning and natural language processing highlight research trends in a superbug

Impact factor 3.0, JCR Q2 (Computer Science, Artificial Intelligence)
Carlos-Francisco Méndez-Cruz, Joel Rodríguez-Herrera, A. Varela-Vega, Valeria Mateo-Estrada, Santiago Castillo-Ramírez
Frontiers in Artificial Intelligence, journal article, published 2024-03-21
DOI: 10.3389/frai.2024.1336071 (https://doi.org/10.3389/frai.2024.1336071)
Citations: 0

Abstract

Antibiotic-resistant Acinetobacter baumannii is a major nosocomial pathogen worldwide. Thousands of studies have been conducted on this pathogen. However, no attempt has been made to use all this information to highlight the research trends concerning it. Here we use unsupervised learning and natural language processing (NLP), two areas of Artificial Intelligence, to analyse the most extensive database of articles created (5,500+ articles, from 851 different journals, published over 3 decades). K-means clustering found 113 theme clusters. Each cluster was defined by representative terms obtained automatically with topic modelling, and together the clusters summarise different research areas. The biggest clusters, all with over 100 articles, are biased toward multidrug resistance, carbapenem resistance, clinical treatment, and nosocomial infections. However, we also found that some research areas, such as ecology and non-human infections, have received very little attention. This approach allowed us to study research themes over time, unveiling those of recent interest, such as the use of cefiderocol (a recently approved antibiotic) against A. baumannii. In a broader context, our results show that unsupervised learning, NLP and topic modelling can be used to describe and analyse the research themes for important infectious diseases. This strategy should be very useful for analysing other ESKAPE pathogens or any other pathogens relevant to Public Health.
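The clustering step described in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: it assumes whitespace tokenisation, a smoothed TF-IDF weighting, and a plain k-means loop written with the Python standard library only. The highest-weighted centroid terms stand in for the topic-modelling step, which the abstract does not specify in detail. All function names (`tfidf_vectors`, `kmeans`, `top_terms`) and the toy documents are hypothetical.

```python
import math
import random
from collections import Counter


def tfidf_vectors(docs):
    """Return (vocabulary, L2-normalised TF-IDF vectors) for whitespace-tokenised docs."""
    tokenised = [doc.lower().split() for doc in docs]
    vocab = sorted({tok for toks in tokenised for tok in toks})
    index = {tok: i for i, tok in enumerate(vocab)}
    # Document frequency: number of documents containing each term.
    doc_freq = Counter(tok for toks in tokenised for tok in set(toks))
    n_docs = len(docs)
    vectors = []
    for toks in tokenised:
        vec = [0.0] * len(vocab)
        for tok, count in Counter(toks).items():
            # Smoothed inverse document frequency (an assumption of this sketch).
            idf = math.log((1 + n_docs) / (1 + doc_freq[tok])) + 1.0
            vec[index[tok]] = count * idf
        norm = math.sqrt(sum(x * x for x in vec)) or 1.0
        vectors.append([x / norm for x in vec])
    return vocab, vectors


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, k, n_iter=50, seed=0):
    """Plain k-means with random initial centroids drawn from the data."""
    rng = random.Random(seed)
    centroids = [list(v) for v in rng.sample(vectors, k)]
    labels = None
    for _ in range(n_iter):
        # Assignment step: each vector goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(v, centroids[c]))
            for v in vectors
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


def top_terms(vocab, centroid, n=3):
    """Return the n vocabulary terms with the highest centroid weight."""
    ranked = sorted(range(len(vocab)), key=lambda i: centroid[i], reverse=True)
    return [vocab[i] for i in ranked[:n]]


# Toy stand-ins for article abstracts (hypothetical data).
docs = [
    "carbapenem resistance hospital outbreak",
    "carbapenem resistance gene detection",
    "cefiderocol treatment clinical trial",
    "cefiderocol treatment susceptibility testing",
]
vocab, vectors = tfidf_vectors(docs)
labels, centroids = kmeans(vectors, k=2)
for cluster_id, centroid in enumerate(centroids):
    print(cluster_id, top_terms(vocab, centroid))
```

Because k-means depends on its initial centroids, the cluster assignments can differ between seeds. On a real corpus, the number of clusters (113 in the study) would have to be chosen by a separate criterion that this sketch does not cover.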
Source journal metrics: CiteScore 6.10; self-citation rate 2.50%; articles published 272; review time 13 weeks