Creating a dermatologic database for artificial intelligence, a Chilean experience, and advice from ChatGPT

Leonel Hidalgo, María Paz Salinas, Javiera Sepúlveda, Karina Carrasco, Pamela Romero, Alma Pedro, Soledad Vidaurre, Domingo Mery, Cristian Navarrete-Dechent
JEADV Clinical Practice, 4(1), 296-298. Published 2024-10-10. DOI: 10.1002/jvc2.546. https://onlinelibrary.wiley.com/doi/10.1002/jvc2.546

Abstract

Artificial intelligence (AI) has shown wide applicability to skin cancer diagnosis, so creating comprehensive image datasets is key [1-4]. The number of available databases is increasing, but they have low representation of higher phototypes and certain ethnic groups, and limited metadata [5]. Excluding specific populations perpetuates healthcare disparities in the AI era [6]. Because diverse datasets are lacking, external use and validation of AI algorithms is not currently possible in our population. We started a project to create a Chilean AI database: the 'Trawa' database ('skin' in Mapuzungun, a native Chilean language). This study aims to describe the current characteristics of our dataset along with the limitations encountered during its creation.

This was a retrospective study approved by the local Institutional Review Board (IRB). The images were collected from January 2019 to December 2020, from four dermatologists working in a tertiary care academic hospital. Clinical and dermoscopy images were obtained with various smartphones. All included lesions are biopsy-proven. Metadata (i.e., age, sex, anatomical location, histopathological details, relevant past medical history, and phototype) were obtained from the electronic medical records. Cases were coded in a specific folder. All data were stored on a Health Insurance Portability and Accountability Act (HIPAA)-compliant web hosting service.
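As an illustration only, the metadata fields listed above could be captured in a per-case record such as the following Python sketch. The class name `CaseRecord`, the field names, the age bounds, the assumption that phototype follows the Fitzpatrick I to VI scale, and the sample values are all hypothetical and are not the actual Trawa schema.

```python
from dataclasses import dataclass, field
from typing import List

# Assumed Fitzpatrick-style labels (I to VI); the source only says "phototype".
PHOTOTYPES = ("I", "II", "III", "IV", "V", "VI")


@dataclass
class CaseRecord:
    """Hypothetical per-case metadata record; field names are illustrative."""
    case_id: str                # de-identified folder code, not a patient identifier
    age: int
    sex: str                    # e.g. "F" or "M"
    anatomical_location: str
    histopathology: str         # diagnosis text from the biopsy report
    medical_history: str
    phototype: str
    image_paths: List[str] = field(default_factory=list)

    def validate(self) -> List[str]:
        """Return human-readable problems; an empty list means none were found."""
        problems = []
        if not 0 <= self.age <= 120:
            problems.append(f"implausible age: {self.age}")
        if self.phototype not in PHOTOTYPES:
            problems.append(f"unknown phototype: {self.phototype}")
        if not self.histopathology.strip():
            problems.append("missing histopathology (lesions must be biopsy-proven)")
        if not self.image_paths:
            problems.append("no images attached")
        return problems


# Made-up example values, not taken from the dataset.
example = CaseRecord(
    case_id="case_0001",
    age=54,
    sex="F",
    anatomical_location="head and neck",
    histopathology="basal cell carcinoma",
    medical_history="",
    phototype="III",
    image_paths=["case_0001/clinical_01.jpg", "case_0001/dermoscopy_01.jpg"],
)
print(example.validate())
```

Running a check like `validate()` at intake is one possible way to catch incomplete cases before they reach a training set; the specific rules would depend on the real data dictionary.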

During the study period, we collected 860 individual cases comprising 4435 clinical and dermoscopy images (Figure 1), organised into seven categories: actinic keratosis, basal cell carcinoma, cutaneous squamous cell carcinoma, melanoma, naevus, seborrhoeic keratosis, and others (angiomas, warts, etc.) (Table 1). Regarding metadata, 52.6% were women, the average age was 54 years, 32.8% had photodamage, and 70.2% were phototype III. Most cases were located on the head and neck (50.6%), and 26.8% of the diagnoses were malignant.
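The reported figures can be turned into rough counts with simple arithmetic, as in the sketch below. It assumes that every percentage is a fraction of the 860 cases, which the text does not state explicitly, so the counts are approximations and not values reported by the authors. The names `REPORTED_PERCENT` and `approx_count` are hypothetical.

```python
# Totals reported in the text above.
TOTAL_CASES = 860
TOTAL_IMAGES = 4435

# Reported percentages; assumed here to be fractions of the 860 cases.
REPORTED_PERCENT = {
    "women": 52.6,
    "photodamage": 32.8,
    "phototype III": 70.2,
    "head and neck": 50.6,
    "malignant": 26.8,
}


def approx_count(percent: float, total: int = TOTAL_CASES) -> int:
    """Convert a reported percentage into a rounded, approximate count."""
    return round(percent / 100 * total)


# Mean number of images per case across the whole dataset.
images_per_case = TOTAL_IMAGES / TOTAL_CASES
print(f"mean images per case: {images_per_case:.2f}")

for label, pct in REPORTED_PERCENT.items():
    print(f"{label}: about {approx_count(pct)} of {TOTAL_CASES}")
```

Under that assumption, the dataset averages roughly 5.2 images per case, and about 230 of the 860 cases would be malignant.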

Finally, we also suggest working with multidisciplinary teams composed of dermatologists and computer science professionals. Creating and improving databases will augment the performance of AI algorithms [9], and for us, this is a necessary step towards collaborative work with other countries in the region (e.g., Latin America) [3]. Potential applications of the current database include training algorithms fine-tuned for local data and comparing the performance of different algorithms across diverse databases. The main limitation of our database is its relatively small size. Organising lesions requires a large team and multiple resources. Also, we have included only lesions with histopathological confirmation, which biases the database towards more 'suspicious' lesions. Noninvasive imaging technologies such as reflectance confocal microscopy could offer a way to include nonbiopsied benign lesions [10].

Acquisition, analysis, and interpretation of data: Leonel Hidalgo, María Paz Salinas, Javiera Sepúlveda, Karina Carrasco, Pamela Romero, Alma Pedro, Soledad Vidaurre, Domingo Mery and Cristian Navarrete-Dechent. Drafting and revising the article: Leonel Hidalgo, María Paz Salinas, Javiera Sepúlveda, Karina Carrasco, Pamela Romero, Alma Pedro, Soledad Vidaurre, Domingo Mery and Cristian Navarrete-Dechent. Final approval of the version to be published: Leonel Hidalgo, María Paz Salinas, Javiera Sepúlveda, Karina Carrasco, Pamela Romero, Alma Pedro, Soledad Vidaurre, Domingo Mery and Cristian Navarrete-Dechent.

This work was funded in part by ANID—Millennium Science Initiative Programme ICN2021_004.

The authors declare no conflict of interest.

Reviewed and approved by Scientific Ethical Committee for Health Sciences of Pontificia Universidad Católica de Chile; approval #211213001.
