Accuracy and distribution of baseline categorical variables and p-values in spine randomized controlled trials.
Authors: Mark J Bolland, Alison Avenell, Andrew Grey
Journal: Royal Society Open Science, 12(1), 240170 (published 2025-01-15)
DOI: https://doi.org/10.1098/rsos.240170
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11739909/pdf/
Levayer and colleagues assessed integrity issues in randomized controlled trials (RCTs) in four spine journals using baseline p-values from categorical variables, concluding that there was no evidence of 'systemic fraudulent behaviour'. We used their published dataset to assess the accuracy of reported p-values and whether observed and expected distributions of frequency counts and p-values were consistent. In 51 out of 929 (5.5%) baseline variables, the sum of frequencies did not agree with the reported number of participants. For one-third of reported p-values (172 out of 522), we could not calculate a matching p-value using a range of statistical tests. Sparse data were common: for 22% (74 out of 332) of variables in which the reported p-value matched the p-value calculated from a chi-square test, the expected cell counts were smaller than recommended for the use of chi-square tests. There were 20-25% more two-arm trials with between-group differences in frequency counts of 1 or 2 than expected. There were small differences between observed and expected distributions of baseline p-values, but these depended on analysis methods. In summary, incorrectly reported p-values and incorrect statistical test usage were common, and there were differences between observed and expected distributions of baseline p-values and frequency counts, raising questions about the integrity of some RCTs in these journals.
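The three checks named in the abstract (frequency sums, recalculated p-values, and sparse expected cells) can be illustrated for the simplest case, a binary baseline variable in a two-arm trial. The following is a minimal sketch, not the authors' procedure: the function name, the input layout, the matching tolerance, and the expected-count threshold of 5 (a commonly cited rule of thumb that the abstract does not specify) are all assumptions made for illustration. It covers only an uncorrected Pearson chi-square test, whereas the abstract refers to a range of statistical tests.

```python
import math


def check_binary_baseline(counts_a, counts_b, n_a, n_b, reported_p, tol=0.01):
    """Re-check one binary baseline variable from a two-arm trial.

    counts_a / counts_b are [with_trait, without_trait] for each arm.
    All names and the tolerance are illustrative assumptions.
    """
    # Check 1: do the category frequencies add up to the reported arm sizes?
    sums_match = sum(counts_a) == n_a and sum(counts_b) == n_b

    # Expected counts under independence: row total * column total / grand total.
    observed = [list(counts_a), list(counts_b)]
    row_totals = [sum(row) for row in observed]
    col_totals = [observed[0][j] + observed[1][j] for j in range(2)]
    total = sum(row_totals)
    if min(row_totals + col_totals) == 0:
        raise ValueError("chi-square test is undefined when a margin is zero")
    expected = [[r * c / total for c in col_totals] for r in row_totals]

    # Check 2: Pearson chi-square statistic (no continuity correction).
    chi2 = sum(
        (observed[i][j] - expected[i][j]) ** 2 / expected[i][j]
        for i in range(2)
        for j in range(2)
    )
    # With 1 degree of freedom the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))

    # Check 3: flag sparse data (assumed rule of thumb: expected count below 5).
    sparse = min(min(row) for row in expected) < 5

    return {
        "sums_match": sums_match,
        "p_value": p_value,
        "matches_reported": abs(p_value - reported_p) <= tol,
        "sparse": sparse,
    }


# Hypothetical example: 30/50 versus 25/50 participants with the trait.
result = check_binary_baseline([30, 20], [25, 25], n_a=50, n_b=50, reported_p=0.31)
print(result)
```

When the sparse flag is set, a test that does not rely on the large-sample approximation (for example Fisher's exact test) is the usual alternative; variables with more than two categories would need a chi-square distribution with more degrees of freedom than this sketch handles.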
About the journal:
Royal Society Open Science is a new open journal publishing high-quality original research across the entire range of science on the basis of objective peer-review.
The journal covers the entire range of science and mathematics and will allow the Society to publish all the high-quality work it receives without the usual restrictions on scope, length or impact.