Kathryn S. Konrad, Laura Betz, Sandra McBride, Keith R. Shockley, Georgia Roberts, Helen Cunny, G. Jean Harry
Journal: Neurotoxicology and Teratology, volume 112, Article 107562
Published: 2025-09-26
DOI: 10.1016/j.ntt.2025.107562
URL: https://www.sciencedirect.com/science/article/pii/S0892036225001394
Impact factor: 2.8 (JCR Q3, Neurosciences)
Assessing replicability and power estimates of behavioral performance of control rats across standardized pre-clinical and toxicology studies
Behavioral assays are critical in evaluating impacts on nervous system function in rodents due to genetic or environmental factors and are frequently incorporated into regulatory decision-making studies. Despite numerous sources of guidance for such studies, results across behavioral assays are reputed to be highly variable with questionable replicability. Behavioral data obtained from control rats within four contract laboratory studies were used to evaluate replicability across studies, calculate the level of statistical power, and estimate the number of animals required for a specific effect size. For the three behaviors evaluated here (motor activity, acoustic startle response, and learning and memory), control rats from all studies showed the expected pattern of behavior, e.g., open field acclimation, startle habituation, % prepulse inhibition (PPI) over prepulse intensities, and acquisition and goal quadrant preference in the Morris Water Maze (MWM). For selected representative individual endpoints, power analyses were conducted to evaluate sample size requirements. Across all endpoints, power dropped as the difference between two groups became smaller. Power analysis of multiple representative endpoints suggested that a sample size of 20 may detect a 30% effect with 80% power. Sample size requirements changed with the effect size, and achieving 80% power with a 20% effect size generally required a sample size of 30 rats. While the behavioral performance was replicated over the Study Cohorts, power analyses suggested a need for moderation of expectations regarding detectable differences if decisions relied on single endpoints or small effect sizes. Reporting results from a low-powered study can have significant and wide-ranging impacts, including undermining confidence in data interpretation, misleading future research, and failing to adhere to the ethical framework of the 3Rs.
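The relationship between effect size, sample size, and power described in the abstract can be illustrated with a minimal sketch. This is not the authors' analysis: it assumes a two-group comparison of means, a two-sided test at alpha = 0.05, and a normal approximation to the two-sample t-test. The function names `n_per_group` and `approx_power` and the coefficient of variation used below (0.30) are hypothetical choices for illustration, not values taken from the study.

```python
from math import ceil, sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, mean 0 and sd 1


def n_per_group(effect, cv, alpha=0.05, power=0.80):
    """Approximate animals per group needed to detect a relative change.

    effect: difference between group means as a fraction of the control
            mean (0.30 means a 30% effect).
    cv:     coefficient of variation of the endpoint (sd / mean).
            ASSUMPTION: the caller supplies this; it is not from the paper.
    Uses the normal approximation, which slightly understates the n
    a t-test would need at small sample sizes.
    """
    z_alpha = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    z_power = _STD_NORMAL.inv_cdf(power)
    d = effect / cv  # standardized effect size (Cohen's d)
    return ceil(2 * ((z_alpha + z_power) / d) ** 2)


def approx_power(n, effect, cv, alpha=0.05):
    """Approximate power with n animals per group (same assumptions).

    The negligible contribution of the opposite rejection tail is ignored.
    """
    z_alpha = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    d = effect / cv
    return _STD_NORMAL.cdf(d * sqrt(n / 2) - z_alpha)


if __name__ == "__main__":
    hypothetical_cv = 0.30  # illustrative value only
    for effect in (0.30, 0.20, 0.10):
        n = n_per_group(effect, hypothetical_cv)
        p = approx_power(20, effect, hypothetical_cv)
        print(f"effect {effect:.0%}: n per group = {n}, power at n=20 = {p:.2f}")
```

Because the required sample size scales with the inverse square of the standardized effect, halving the detectable effect roughly quadruples the number of animals. The actual requirement for any endpoint depends on its observed variability, which this sketch takes as an input rather than estimating.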
About the journal:
Neurotoxicology and Teratology provides a forum for publishing new information regarding the effects of chemical and physical agents on the developing, adult or aging nervous system. In this context, the fields of neurotoxicology and teratology include studies of agent-induced alterations of nervous system function, with a focus on behavioral outcomes and their underlying physiological and neurochemical mechanisms. The Journal publishes original, peer-reviewed Research Reports of experimental, clinical, and epidemiological studies that address the neurotoxicity and/or functional teratology of pesticides, solvents, heavy metals, nanomaterials, organometals, industrial compounds, mixtures, drugs of abuse, pharmaceuticals, animal and plant toxins, atmospheric reaction products, and physical agents such as radiation and noise. These reports include traditional mammalian neurotoxicology experiments, human studies, studies using non-mammalian animal models, and mechanistic studies in vivo or in vitro. Special Issues, Reviews, Commentaries, Meeting Reports, and Symposium Papers provide timely updates on areas that have reached a critical point of synthesis, on aspects of a scientific field undergoing rapid change, or on areas that present special methodological or interpretive problems. Theoretical Articles address concepts and potential mechanisms underlying actions of agents of interest in the nervous system. The Journal also publishes Brief Communications that concisely describe a new method, technique, apparatus, or experimental result.