Increasing Rigour in Online Health Surveys Through the Reduction of Fraudulent Data.

Impact Factor 5.8 · Zone 2 (Medicine) · Q1, Health Care Sciences & Services
Wen Zhi Ng, Sundarimaa Erdembileg, Jean Cj Liu, Joseph D Tucker, Rayner Kay Jin Tan
Journal of Medical Internet Research. Published 2025-06-26. Publication type: Journal Article. DOI: 10.2196/68092 (https://doi.org/10.2196/68092)
Citations: 0

Abstract

Online surveys have become a key tool of modern health research, offering a fast, cost-effective, and convenient means of data collection. They enable researchers to access diverse populations, such as those underrepresented in traditional studies, and facilitate the collection of data on stigmatized or sensitive behaviours through greater anonymity. However, the ease of participation also introduces significant challenges, particularly around data integrity and rigour. As fraudulent responses (whether from bots, repeat responders, or individuals misrepresenting themselves) become more sophisticated and pervasive, ensuring the rigour of online surveys has never been more crucial. This article provides a comprehensive synthesis of practical strategies that help to increase the rigour of online surveys through the detection and removal of fraudulent data. Drawing on recent literature and case studies, we outline several options that address the full research cycle, from pre-data collection strategies to post-data collection validation. We emphasize the integration of automated screening techniques (e.g. CAPTCHAs, honeypot questions) and attention checks (e.g. trap questions) into purposeful survey design. Robust recruitment procedures (e.g. concealed eligibility criteria, two-stage screening) and a proper incentive or compensation structure can also help to deter fraudulent participation. We examine the merits and limitations of different sampling methodologies, including river sampling, online panels, and crowdsourcing platforms, offering guidance on how to select samples based on specific research objectives. For the post-data collection stage, we discuss metadata-based techniques to detect fraudulent data (e.g. duplicate email or IP addresses, response time analysis), alongside methods to better screen for low-quality responses (e.g. inconsistent response patterns, improbable qualitative responses).
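The metadata-based checks mentioned above (duplicate email or IP addresses, response time analysis) can be illustrated with a minimal sketch. It is not the authors' procedure: the field names (`id`, `email`, `ip`, `seconds`), the function name `flag_responses`, and the 120-second threshold are all assumptions made for illustration.

```python
from collections import Counter


def flag_responses(responses, min_seconds=120):
    """Return {response id: list of flag names} for suspicious responses.

    Field names and the min_seconds threshold are hypothetical; a real
    threshold would be calibrated against pilot completion times.
    """
    # Normalise emails so case or whitespace changes still count as duplicates.
    email_counts = Counter(r["email"].strip().lower() for r in responses)
    ip_counts = Counter(r["ip"] for r in responses)

    flags = {}
    for r in responses:
        reasons = []
        if email_counts[r["email"].strip().lower()] > 1:
            reasons.append("duplicate_email")
        if ip_counts[r["ip"]] > 1:
            reasons.append("duplicate_ip")
        if r["seconds"] < min_seconds:
            reasons.append("too_fast")
        if reasons:
            flags[r["id"]] = reasons
    return flags


# Made-up sample data using documentation-only addresses.
sample = [
    {"id": 1, "email": "a@example.org", "ip": "192.0.2.1", "seconds": 400},
    {"id": 2, "email": "A@example.org ", "ip": "192.0.2.2", "seconds": 380},
    {"id": 3, "email": "b@example.org", "ip": "192.0.2.3", "seconds": 45},
    {"id": 4, "email": "c@example.org", "ip": "192.0.2.4", "seconds": 500},
]
flags = flag_responses(sample)
```

A flag here marks a response for manual review, not for automatic exclusion: legitimate participants in the same household or institution can share an IP address.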
The escalating sophistication of fraud tactics, particularly with the growth of Artificial Intelligence, demands that researchers continuously adapt and stay vigilant. We propose the use of dynamic protocols that combine multiple strategies into a multi-pronged approach, which can better filter out fraudulent data and evolve depending on the type of responses received across the data-collection process. However, there is still significant room for these strategies to develop, and this should be a key focus for upcoming research. As online surveys become increasingly integral to health research, investing in robust strategies to screen for fraudulent data and increasing the rigour of studies are key to upholding scientific integrity.
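One way to picture a multi-pronged, evolving protocol is as a list of independent checks whose failures are counted for each response. The sketch below is only an illustration under stated assumptions: the check functions, the field names (`hidden_field`, `trap_item`, `seconds`), and the review and exclusion thresholds are hypothetical and do not come from the article.

```python
def honeypot_filled(r):
    # A hidden field that human respondents never see should stay empty.
    return bool(r.get("hidden_field", "").strip())


def failed_attention_check(r):
    # Hypothetical trap item: respondents were instructed to select "agree".
    return r.get("trap_item") != "agree"


def too_fast(r, min_seconds=120):
    # A missing completion time is treated as 0 seconds and therefore flagged.
    return r.get("seconds", 0) < min_seconds


CHECKS = [honeypot_filled, failed_attention_check, too_fast]


def triage(r, checks=CHECKS, review_at=1, exclude_at=2):
    """Return (decision, names of failed checks) for one response."""
    failed = [check.__name__ for check in checks if check(r)]
    if len(failed) >= exclude_at:
        decision = "exclude"
    elif len(failed) >= review_at:
        decision = "review"
    else:
        decision = "keep"
    return decision, failed
```

Because `checks` is an ordinary list, a new check can be added partway through data collection (for example `triage(r, checks=CHECKS + [new_check])`) when a new fraud pattern appears, which is the sense in which such a protocol can evolve.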

Source journal
CiteScore: 14.40
Self-citation rate: 5.40%
Articles published: 654
Review time: 1 month
About the journal: The Journal of Medical Internet Research (JMIR), founded in 1999, is a publication in the field of health informatics and health services. The journal focuses on digital health, data science, health informatics, and emerging technologies for health, medicine, and biomedical research. It ranks in the first quartile (Q1) by Impact Factor in these disciplines and is ranked #1 on Google Scholar within the "Medical Informatics" discipline.