Bernadette Cornelison, David R Axon, Bryan Abbott, Carter Bishop, Cindy Jebara, Anjali Kumar, Kristen A Root
{"title":"ChatGPT-3.5用于评估孕期非处方药使用的准确性和安全性:一项描述性比较研究。","authors":"Bernadette Cornelison, David R Axon, Bryan Abbott, Carter Bishop, Cindy Jebara, Anjali Kumar, Kristen A Root","doi":"10.3390/pharmacy13040104","DOIUrl":null,"url":null,"abstract":"<p><p>As artificial intelligence (AI) becomes increasingly utilized to perform tasks requiring human intelligence, patients who are pregnant may turn to AI for advice on over-the-counter (OTC) medications. However, medications used in pregnancy may pose profound safety concerns limited by data availability. This study focuses on a chatbot's ability to accurately provide information regarding OTC medications as it relates to patients that are pregnant. A prospective, descriptive design was used to compare the responses generated by the Chat Generative Pre-Trained Transformer 3.5 (ChatGPT-3.5) to the information provided by UpToDate<sup>®</sup>. Eighty-seven of the top pharmacist-recommended OTC drugs in the United States (U.S.) as identified by Pharmacy Times were assessed for safe use in pregnancy using ChatGPT-3.5. A piloted, standard prompt was input into ChatGPT-3.5, and the responses were recorded. Two groups independently rated the responses compared to UpToDate on their correctness, completeness, and safety using a 5-point Likert scale. After independent evaluations, the groups discussed the findings to reach a consensus, with a third independent investigator giving final ratings. For correctness, the median score was 5 (interquartile range [IQR]: 5-5). For completeness, the median score was 4 (IQR: 4-5). For safety, the median score was 5 (IQR: 5-5). Despite high overall scores, the safety errors in 9% of the evaluations (<i>n</i> = 8), including omissions that pose a risk of serious complications, currently renders the chatbot an unsafe standalone resource for this purpose.</p>","PeriodicalId":30544,"journal":{"name":"Pharmacy","volume":"13 4","pages":""},"PeriodicalIF":1.8000,"publicationDate":"2025-07-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12389367/pdf/","citationCount":"0","resultStr":"{\"title\":\"Accuracy and Safety of ChatGPT-3.5 in Assessing Over-the-Counter Medication Use During Pregnancy: A Descriptive Comparative Study.\",\"authors\":\"Bernadette Cornelison, David R Axon, Bryan Abbott, Carter Bishop, Cindy Jebara, Anjali Kumar, Kristen A Root\",\"doi\":\"10.3390/pharmacy13040104\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p><p>As artificial intelligence (AI) becomes increasingly utilized to perform tasks requiring human intelligence, patients who are pregnant may turn to AI for advice on over-the-counter (OTC) medications. However, medications used in pregnancy may pose profound safety concerns limited by data availability. This study focuses on a chatbot's ability to accurately provide information regarding OTC medications as it relates to patients that are pregnant. A prospective, descriptive design was used to compare the responses generated by the Chat Generative Pre-Trained Transformer 3.5 (ChatGPT-3.5) to the information provided by UpToDate<sup>®</sup>. Eighty-seven of the top pharmacist-recommended OTC drugs in the United States (U.S.) as identified by Pharmacy Times were assessed for safe use in pregnancy using ChatGPT-3.5. A piloted, standard prompt was input into ChatGPT-3.5, and the responses were recorded. Two groups independently rated the responses compared to UpToDate on their correctness, completeness, and safety using a 5-point Likert scale. After independent evaluations, the groups discussed the findings to reach a consensus, with a third independent investigator giving final ratings. For correctness, the median score was 5 (interquartile range [IQR]: 5-5). For completeness, the median score was 4 (IQR: 4-5). For safety, the median score was 5 (IQR: 5-5). Despite high overall scores, the safety errors in 9% of the evaluations (<i>n</i> = 8), including omissions that pose a risk of serious complications, currently renders the chatbot an unsafe standalone resource for this purpose.</p>\",\"PeriodicalId\":30544,\"journal\":{\"name\":\"Pharmacy\",\"volume\":\"13 4\",\"pages\":\"\"},\"PeriodicalIF\":1.8000,\"publicationDate\":\"2025-07-30\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12389367/pdf/\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Pharmacy\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.3390/pharmacy13040104\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q3\",\"JCRName\":\"PHARMACOLOGY & PHARMACY\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Pharmacy","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.3390/pharmacy13040104","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"PHARMACOLOGY & PHARMACY","Score":null,"Total":0}
Accuracy and Safety of ChatGPT-3.5 in Assessing Over-the-Counter Medication Use During Pregnancy: A Descriptive Comparative Study.
As artificial intelligence (AI) becomes increasingly utilized to perform tasks requiring human intelligence, patients who are pregnant may turn to AI for advice on over-the-counter (OTC) medications. However, medications used in pregnancy may pose profound safety concerns limited by data availability. This study focuses on a chatbot's ability to accurately provide information regarding OTC medications as it relates to patients that are pregnant. A prospective, descriptive design was used to compare the responses generated by the Chat Generative Pre-Trained Transformer 3.5 (ChatGPT-3.5) to the information provided by UpToDate®. Eighty-seven of the top pharmacist-recommended OTC drugs in the United States (U.S.) as identified by Pharmacy Times were assessed for safe use in pregnancy using ChatGPT-3.5. A piloted, standard prompt was input into ChatGPT-3.5, and the responses were recorded. Two groups independently rated the responses compared to UpToDate on their correctness, completeness, and safety using a 5-point Likert scale. After independent evaluations, the groups discussed the findings to reach a consensus, with a third independent investigator giving final ratings. For correctness, the median score was 5 (interquartile range [IQR]: 5-5). For completeness, the median score was 4 (IQR: 4-5). For safety, the median score was 5 (IQR: 5-5). Despite high overall scores, the safety errors in 9% of the evaluations (n = 8), including omissions that pose a risk of serious complications, currently renders the chatbot an unsafe standalone resource for this purpose.