Authors: Berk B Ozmen, Ibrahim Berber, Graham S Schwarz
Journal: Aesthetic Surgery Journal (Impact Factor 3.0; JCR Q1, Surgery)
Published: 2025-04-08 (Journal Article)
DOI: https://doi.org/10.1093/asj/sjaf049
Citations: 0
Abstract
Initial Proof-of-Concept Study for a Plastic Surgery Specific Artificial Intelligence Large Language Model: PlasticSurgeryGPT.
Background: The advent of general-purpose large language models (LLMs) like ChatGPT (OpenAI, San Francisco, CA) has revolutionized natural language processing, but their applicability in specialized medical fields like plastic surgery remains limited due to a lack of domain-specific knowledge.
Objectives: This study aims to develop and evaluate PlasticSurgeryGPT, a dedicated LLM fine-tuned on plastic surgery literature, to enhance performance in clinical decision support, surgical education, and research within the field.
Methods: A comprehensive dataset of 25,389 plastic surgery research abstracts published between January 1, 2010, and January 1, 2024, was retrieved from PubMed. The abstracts underwent rigorous preprocessing, including text cleaning and tokenization. We fine-tuned the pre-trained GPT-2 model on this dataset using the PyTorch and HuggingFace frameworks. The performance of PlasticSurgeryGPT was evaluated against the default GPT-2 model using BLEU, METEOR, and ROUGE-1 metrics.
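As an illustration of one of the evaluation metrics named above, the following is a minimal sketch of ROUGE-1 as clipped unigram overlap between a generated text and a reference. It is not the authors' evaluation code: the abstract does not say which implementation, tokenizer, or ROUGE-1 variant (precision, recall, or F1) was used, so the whitespace tokenization, the function name `rouge1`, and the example sentences are all assumptions made for illustration.

```python
from collections import Counter


def rouge1(candidate: str, reference: str) -> dict:
    """ROUGE-1 precision, recall and F1 from clipped unigram overlap.

    Assumption: lowercase + whitespace tokenization; real toolkits may
    apply stemming or a different tokenizer.
    """
    cand_counts = Counter(candidate.lower().split())
    ref_counts = Counter(reference.lower().split())

    # Counter intersection keeps the minimum count per token, i.e. clipping.
    overlap = sum((cand_counts & ref_counts).values())

    cand_total = sum(cand_counts.values())
    ref_total = sum(ref_counts.values())

    precision = overlap / cand_total if cand_total else 0.0
    recall = overlap / ref_total if ref_total else 0.0
    f1 = 0.0
    if precision + recall > 0:
        f1 = 2 * precision * recall / (precision + recall)
    return {"precision": precision, "recall": recall, "f1": f1}


# Hypothetical sentences, not taken from the study's dataset.
generated = "the flap survived without complications"
reference = "the flap survived with no complications"
scores = rouge1(generated, reference)
print(scores)
```

Here four of the five generated tokens appear in the six-token reference, so precision is 4/5, recall is 4/6, and F1 is 8/11.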
Results: The fine-tuned model, named PlasticSurgeryGPT, demonstrated substantial improvements over the generic GPT-2 model in capturing the semantic nuances of plastic surgery text. PlasticSurgeryGPT outperformed GPT-2 across BLEU, METEOR, and ROUGE-1 metrics, with scores of 0.135519, 0.583554, and 0.216813, respectively, compared to GPT-2's scores of 0.130179, 0.550498, and 0.215494.
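To put the reported scores in proportion, the short sketch below computes the absolute and relative gain of PlasticSurgeryGPT over the GPT-2 baseline for each metric. The six numbers are copied from the Results paragraph; the names `REPORTED` and `relative_gain` are illustrative assumptions, and nothing beyond simple arithmetic is implied.

```python
# (baseline GPT-2 score, PlasticSurgeryGPT score) as reported in the abstract.
REPORTED = {
    "BLEU": (0.130179, 0.135519),
    "METEOR": (0.550498, 0.583554),
    "ROUGE-1": (0.215494, 0.216813),
}


def relative_gain(baseline: float, tuned: float) -> float:
    """Relative improvement of the tuned score over the baseline, in percent."""
    return (tuned - baseline) / baseline * 100.0


gains = {name: relative_gain(base, tuned) for name, (base, tuned) in REPORTED.items()}

for name, (base, tuned) in REPORTED.items():
    print(f"{name}: +{tuned - base:.6f} absolute, +{gains[name]:.2f}% relative")
```

The relative gains come out at roughly 4.1% for BLEU, 6.0% for METEOR, and 0.6% for ROUGE-1. The abstract reports no variance estimates or significance tests, so these figures are descriptive only.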
Conclusions: PlasticSurgeryGPT represents the first plastic surgery-specific LLM, demonstrating enhanced performance in generating relevant and accurate content compared to a general-purpose model. This work underscores the potential of domain-specific LLMs in improving clinical practice, surgical education, and research in plastic surgery. Future studies should focus on incorporating full-text articles, multimodal data, and larger models to further enhance performance and applicability.
About the journal:
Aesthetic Surgery Journal is a peer-reviewed international journal focusing on scientific developments and clinical techniques in aesthetic surgery. The official publication of The Aesthetic Society, ASJ is also the official English-language journal of many major international societies of plastic, aesthetic and reconstructive surgery representing South America, Central America, Europe, Asia, and the Middle East. It is also the official journal of the British Association of Aesthetic Plastic Surgeons, the Canadian Society for Aesthetic Plastic Surgery and The Rhinoplasty Society.