Comparative Analysis of Data Generation Techniques for Breast Cancer Research Using Artificial Intelligence.
Tia M Pope, Ahmad Patooghy
AMIA ... Annual Symposium proceedings. AMIA Symposium, 2024, pp. 910-919. Journal Article. Published 2025-05-22.
Full text: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12099320/pdf/
This study investigates the use of ChatGPT to support clinical teams with limited expertise in generating synthetic data for breast cancer research. It assesses ChatGPT's application, focusing on effective prompting and best practices for creating high-fidelity synthetic data. The research compares the generated synthetic data to the Wisconsin Breast Cancer Dataset through statistical analysis, structural similarity metrics, and machine learning performance. Results indicate that the quality of prompts and generation techniques significantly affects the data's fidelity. The study highlights the critical role of prompt engineering and data synthesis techniques in producing accurate synthetic data for healthcare research, underscoring the need for precise prompts and generation methods to maintain data integrity in sensitive areas like cancer research.
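The abstract does not say which statistical tests were used, so the following is only a minimal sketch of one plausible fidelity check: a two-sample Kolmogorov-Smirnov statistic comparing a single feature column between real and synthetic data. The function name `ks_statistic`, the choice of the KS test, and all numbers are assumptions made for illustration. The samples are random placeholders, not values from the Wisconsin Breast Cancer Dataset and not ChatGPT output.

```python
import random
from bisect import bisect_right


def ks_statistic(real, synthetic):
    """Two-sample Kolmogorov-Smirnov statistic.

    Returns the largest absolute gap between the empirical CDFs of the
    two samples: 0.0 means identical distributions, 1.0 means no overlap.
    """
    a, b = sorted(real), sorted(synthetic)
    if not a or not b:
        raise ValueError("both samples must be non-empty")
    gap = 0.0
    # The largest gap always occurs at one of the observed values.
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        gap = max(gap, abs(cdf_a - cdf_b))
    return gap


# Placeholder data (assumption): one hypothetical numeric feature.
rng = random.Random(0)
real_column = [rng.gauss(14.0, 3.5) for _ in range(200)]
faithful_synthetic = [rng.gauss(14.0, 3.5) for _ in range(200)]
shifted_synthetic = [rng.gauss(18.0, 3.5) for _ in range(200)]

print("faithful synthetic column:", round(ks_statistic(real_column, faithful_synthetic), 3))
print("shifted synthetic column: ", round(ks_statistic(real_column, shifted_synthetic), 3))
```

A smaller statistic means the synthetic column tracks the real one more closely. A per-feature check like this covers only marginal distributions, so it would complement the structural similarity and machine learning comparisons the abstract mentions rather than replace them.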