Ask, and it shall be given you – individual patient data and code availability for randomised controlled trials submitted for publication

IF 7.5 · CAS Tier 1 (Medicine) · Q1 ANESTHESIOLOGY
Anaesthesia Pub Date : 2024-12-05 DOI:10.1111/anae.16503
Paul Bramley
Anaesthesia 2024; 80(2): 205–206. https://onlinelibrary.wiley.com/doi/10.1111/anae.16503
Citations: 0

Abstract

Sharing data from clinical studies is now recognised to be an important part of the research process [1]. Since many research results cannot be replicated [2, 3], there has been growing interest in making study documents available [4, 5] to make it easier to reproduce existing results, detect false results, replicate findings and synthesise them into larger meta-studies. Randomised controlled trials (RCTs) are of particular interest, since they are expensive and time-consuming to run. Post-publication availability of study documents has been investigated previously, but their availability to journals at the point of manuscript submission, where they could be used as part of the review, has not been evaluated.

To address this, over a 9-month period (1 June 2023 to 29 February 2024), whenever an RCT was submitted to Anaesthesia and sent for peer review (i.e. not desk rejected), a member of the editorial team requested, via email, anonymised individual patient data (IPD) and statistical code from the corresponding author. We sent one further request if there was no initial response. I examined the submitted manuscript and any provided IPD and code for each RCT to determine: whether the IPD and code were stated to be available in the submitted manuscript; whether the IPD and code were provided in response to the journal's request to the authors; the IPD format (if it was provided in multiple formats, the least proprietary format was recorded); whether there was a data dictionary; whether IPD were presented in English; whether (using the manuscript and/or Google Translate) it was clear what the variable names in the IPD represented; whether the results of the manuscript could theoretically be reproduced with the provided documents (I did not actually compare the values); and whether authors changed their submitted manuscript following the request for IPD. I judged reproducibility to be 'possible' if code and IPD were available, unless a fully reproducible document (e.g. R Markdown) was provided. For proprietary files that could have contained code but that I was unable to open, I labelled code availability 'unclear'. The project was approved by the editorial board of Anaesthesia, and the host institution confirmed that ethical approval was not required, given that IPD were anonymised by the authors before transfer. I performed all data cleaning and analysis in R (R Foundation, Vienna, Austria) and all analysis was exploratory.
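The reproducibility judgement described above reduces to a simple decision rule. A minimal sketch follows (in Python for illustration only; the actual analysis was performed in R, and the function and argument names here are hypothetical):

```python
def classify_reproducibility(ipd_provided: bool,
                             code_provided: bool,
                             fully_reproducible_doc: bool) -> str:
    """Classify one submission as 'yes', 'possible' or 'no'.

    A fully reproducible document (e.g. R Markdown) earns 'yes';
    having both IPD and code earns 'possible', since the reported
    values were not actually re-computed; anything less is 'no'.
    """
    if fully_reproducible_doc:
        return "yes"
    if ipd_provided and code_provided:
        return "possible"
    return "no"
```

The 'unclear' label applied to unopenable proprietary files concerned code availability only, so it does not appear as a reproducibility category here.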

In the 9-month data collection window, 122 RCTs were submitted to Anaesthesia, 44 of which were desk rejected. Of the remaining 78, we missed the opportunity to request IPD for eight before the manuscripts were rejected. Two authorship groups provided IPD in a form we could not access (one because of concerns about malware, another because of access issues with a website), so these were also excluded. This left a cohort of 68 manuscripts for further analysis. After we requested data (without any other prompting), the authors of six (9%) RCTs reported finding errors in their manuscript which they wished to correct. Nine RCTs provided no IPD (see Table 1 and Fig. 1): one group refused, citing ethical concerns, and one refused to provide the data unless the manuscript was accepted. One authorship group withdrew their submission after the data request (without providing data).
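The cohort flow and headline percentages above can be checked with a few lines of arithmetic (a sketch; all counts are taken directly from the text):

```python
# Sanity-check the cohort flow and headline percentages reported in the text.
submitted = 122
desk_rejected = 44
sent_for_review = submitted - desk_rejected                 # 78 sent for peer review
missed_requests = 8                                         # rejected before IPD was requested
inaccessible = 2                                            # IPD provided but could not be opened
cohort = sent_for_review - missed_requests - inaccessible   # final analysis cohort

no_ipd = 9                                                  # RCTs providing no IPD
provided_ipd = cohort - no_ipd
errors_found = 6                                            # authors reporting errors after the request

assert cohort == 68
assert round(100 * provided_ipd / cohort) == 87             # 87% provided IPD on request
assert round(100 * errors_found / cohort) == 9              # 9% reported errors
```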

In contrast to previous studies investigating whether IPD were provided on request [6], I found that, despite most manuscripts not making a statement about data availability, 87% of authors would provide IPD on request. However, this could be explained by the stronger incentive for authors to provide IPD to a journal as part of a review than to other researchers post-publication. This is relevant because previous work on trial submissions has shown that many data integrity problems could only be detected with access to IPD [7]. Fewer authors provided statistical code, which was a surprise given that code has fewer confidentiality implications and is straightforward to export from most statistical packages. This may be due to a lack of technical expertise, a possibility supported by the fact that several groups provided documents labelled as code which were not statistical code. Despite this, more than half of submissions provided documents which could allow full reproduction of the results – though I lacked the resources to evaluate whether all results were in fact reproducible. It is also notable that the authors of 9% of all submissions found errors in their own work following a request for IPD. This suggests that the expectation of scrutiny prompts authors to check their work, and makes routine requests for IPD a potentially valuable tool for journals. Finally, it was reassuring that most datasets could be opened with freely available tools and that variable names could be interpreted in context, despite the limited availability of data dictionaries.

Table 1. Summary statistics for the collected variables.

All analysed submissions (n = 68)
  IPD availability stated in manuscript
    Not recorded                    53 (78%)
    On request                      12 (18%)
    Public database                  2 (3%)
    Stated unavailable               1 (1%)
  Code availability stated in manuscript
    Not recorded                    66 (97%)
    Public database                  1 (1%)
    On request                       1 (1%)
  IPD provided                      59 (87%)
  Code provided
    No                              31 (46%)
    Yes                             24 (35%)
    Unclear                         10 (15%)
    Partial                          3 (4%)

Submissions providing IPD (n = 59)
  IPD format
    CSV                              3 (5%)
    Excel                           51 (86%)
    SPSS                             3 (5%)
    Stata                            2 (3%)
  Data dictionary
    No                              42 (71%)
    Partial                         12 (20%)
    Yes                              5 (8%)
  IPD in English
    Yes                             47 (80%)
    Partial                          5 (8%)
    No                               7 (12%)
  IPD variables clearly labelled    54 (92%)
  Reproducible
    No                              27 (46%)
    Possible                        29 (49%)
    Yes                              3 (5%)

IPD, individual patient data.

Figure 1. Diagram showing the characteristics of the manuscripts: blue studies were reproducible, orange studies potentially reproducible and green studies not reproducible, based on the submitted documents. IPD, individual patient data.