Identifying the most important data for research in the field of infectious diseases: thinking on the basis of artificial intelligence.

IF 1.9 | Zone 4, Medicine | Q4 MICROBIOLOGY
Revista Espanola De Quimioterapia | Pub Date: 2023-12-01 | Epub Date: 2023-08-12 | DOI: 10.37201/req/032.2023
A Téllez Santoyo, C Lopera, A Ladino Vásquez, F Seguí Fernández, I Grafiá Pérez, M Chumbita, T F Aiello, P Monzó, O Peyrony, P Puerta-Alcalde, C Cardozo, N Garcia-Pouton, P Castro, S Fernández Méndez, J M Nicolas Arfelis, A Soriano, C Garcia-Vidal
Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10710675/pdf/
Citations: 0

Abstract

Objective: Clinical data on which artificial intelligence (AI) algorithms are trained and tested provide the basis to improve diagnosis or treatment of infectious diseases (ID). We aimed to identify important data for ID research to prioritise efforts being undertaken in AI programmes.

Methods: We searched for 1,000 articles from high-impact ID journals on PubMed, selecting 288 of the latest articles from 10 top journals. We classified the variables reported in these articles as structured or unstructured data. Variables were homogenised and grouped into the following categories: epidemiology, admission, demographics, comorbidities, clinical manifestations, laboratory, microbiology, other diagnoses, treatment, outcomes and other non-categorisable variables.
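
The grouping described in the Methods can be pictured as a small tally over labelled variables. The following is a minimal sketch only: the category names come from the Methods, while the `Variable` class, the `tally` helper and the four example entries are hypothetical and are not data or code from the study.

```python
from collections import Counter
from dataclasses import dataclass

# Category names as listed in the Methods; "other" stands in for
# "other non-categorisable variables".
CATEGORIES = (
    "epidemiology", "admission", "demographics", "comorbidities",
    "clinical manifestations", "laboratory", "microbiology",
    "other diagnoses", "treatment", "outcomes", "other",
)


@dataclass(frozen=True)
class Variable:
    """Hypothetical record for one homogenised variable."""
    name: str         # homogenised variable name
    category: str     # must be one of CATEGORIES
    structured: bool  # True = structured data, False = unstructured

    def __post_init__(self):
        if self.category not in CATEGORIES:
            raise ValueError(f"unknown category: {self.category!r}")


def tally(variables):
    """Count variables per (category, structured) pair."""
    return Counter((v.category, v.structured) for v in variables)


# Illustrative entries only; the labels are assumptions, not study data.
example = [
    Variable("age", "demographics", True),
    Variable("diabetes mellitus", "comorbidities", True),
    Variable("cough", "clinical manifestations", False),
    Variable("fever", "clinical manifestations", False),
]
counts = tally(example)
```

Keeping the category and the structured/unstructured flag on the same record means one pass over the variables yields both breakdowns reported in the Results.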

Results: A total of 4,488 individual variables were collected from the 288 articles. Of these, 3,670 (81.8%) were classified as structured data and 818 (18.2%) as unstructured data. Of the structured variables, 2,319 (63.2%) were classified as directly retrievable from electronic health records, whilst 1,351 (36.8%) were indirect. The most frequent unstructured data were related to clinical manifestations and were repeated across articles. Data on demographics, comorbidities and microbiology constituted the most frequent groups of variables.
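
The percentages in the Results follow directly from the reported counts. A short check, using only those counts (the `pct` helper and the variable names are illustrative assumptions):

```python
# Counts as reported in the Results.
total = 4488
structured, unstructured = 3670, 818
direct, indirect = 2319, 1351


def pct(part, whole):
    """Share of `whole` as a percentage, rounded to one decimal place."""
    return round(100 * part / whole, 1)


# Each pair of subgroups should add up to its parent group.
assert structured + unstructured == total
assert direct + indirect == structured

shares = {
    "structured": pct(structured, total),      # share of all variables
    "unstructured": pct(unstructured, total),  # share of all variables
    "direct": pct(direct, structured),         # share of structured variables
    "indirect": pct(indirect, structured),     # share of structured variables
}
print(shares)
```

Note that the direct and indirect shares are taken over the 3,670 structured variables, not over all 4,488.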

Conclusions: Structured variables have comprised the most important data used in research to generate knowledge in the field of ID. Extracting these data should be a priority when a medical centre intends to start an AI programme for ID. The most important unstructured data in this field are those related to clinical manifestations. Such data could easily be given some structure through semi-structured medical records focusing on a few symptoms.

Source journal
CiteScore: 2.90
Self-citation rate: 10.50%
Articles published: 146
Review time: >12 weeks
About the journal: The official journal of the Sociedad Española de Quimioterapia (Spanish Society of Chemotherapy) publishes articles that further knowledge and advance the science and application of antimicrobial chemotherapy with antibiotics and antifungal, antiviral and antiprotozoal agents, primarily in human medicine. Authors sign an exclusive license agreement under which they retain copyright but license exclusive rights in their article to the Publisher. All manuscripts are free open access. Revista Española de Quimioterapia includes the following sections: reviews, original articles, brief reports, letters, and consensus documents.