Population Recruitment Strategies in the Age of Bots: Insights from the What Is on Your Plate Study

Impact factor 3.8; JCR Q2, Nutrition & Dietetics
Emily G Elenio, Alison Tovar, John San Soucie, Maya K Vadiveloo
Current Developments in Nutrition, volume 9, issue 5, Article 107442. Published 2025-05-01. DOI: 10.1016/j.cdnut.2025.107442
https://www.sciencedirect.com/science/article/pii/S2475299125029026

Abstract

Background

Evaluating state-wide nutrition policies requires valid tools that can reach sufficiently large samples. Remote data collection, including web-based dietary assessment, is convenient for participants and researchers and enables faster, more diverse recruitment. However, it also presents challenges, including the risk that bots compromise data integrity.

Objectives

This study describes the technical survey design of an ongoing longitudinal study evaluating a state-wide Supplemental Nutrition Assistance Program (SNAP) incentive program; discusses strategies to prevent and identify bots, duplicates, fraudulent entries, and implausible data; and provides recommendations to improve future public health nutrition research.

Methods

From May to September 2023, SNAP participants from Rhode Island and Connecticut were recruited to complete an online food frequency questionnaire (FFQ) and a demographic survey. Given the large sample and online format, our interdisciplinary team designed the technical backend to optimize participants’ convenience while ensuring data quality through an automated system that assessed FFQ responses. To prevent bots and duplicates, we built application programming interfaces (APIs) to detect duplicate entries, called randomly selected participants, and evaluated reCAPTCHA (a Completely Automated Public Turing test to tell Computers and Humans Apart), geotags, and Internet Protocol (IP) addresses.
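The abstract does not describe how the duplicate checks were implemented. As a minimal sketch of the general idea, assuming each screener entry carries an IP address and a geotag, entries could be flagged for manual review as follows. The field names, the `flag_entries` helper, and the bounding box are hypothetical and are not taken from the study.

```python
from collections import Counter
from dataclasses import dataclass

# Rough bounding box around Rhode Island and Connecticut. This is an
# illustrative assumption only: it also covers slivers of neighboring
# states, and a real check would use proper state boundaries.
LAT_MIN, LAT_MAX = 40.9, 42.1
LON_MIN, LON_MAX = -73.8, -71.1


@dataclass
class ScreenerEntry:
    entry_id: str       # hypothetical identifier for one screener submission
    ip_address: str
    latitude: float
    longitude: float


def in_recruitment_area(entry: ScreenerEntry) -> bool:
    """True if the geotag falls inside the assumed bounding box."""
    return (LAT_MIN <= entry.latitude <= LAT_MAX
            and LON_MIN <= entry.longitude <= LON_MAX)


def flag_entries(entries: list[ScreenerEntry]) -> dict[str, list[str]]:
    """Return {entry_id: [reasons]} for entries that need manual review."""
    ip_counts = Counter(e.ip_address for e in entries)
    flags: dict[str, list[str]] = {}
    for e in entries:
        reasons = []
        if ip_counts[e.ip_address] > 1:
            reasons.append("duplicate_ip")
        if not in_recruitment_area(e):
            reasons.append("outside_recruitment_area")
        if reasons:
            flags[e.entry_id] = reasons
    return flags
```

A flag here is only a prompt for follow-up, not proof of bot activity: members of one household can legitimately share an IP address, and geotags can be imprecise.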

Results

Using a combination of text blasts and in-person recruitment, we enrolled 1367 participants, with text blasts proving the most effective strategy (∼60% of participants). Midway through recruitment, we identified 544 potential bots that had completed the screener; duplicate IP addresses and geotags from outside the recruitment area were strong indicators of bot activity. At baseline, 112 participants failed FFQ data quality checks, prompting follow-up by research assistants. Our automated duplicate-detection and FFQ APIs saved substantial staff time.
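The abstract does not state which criteria the FFQ data quality checks applied. The sketch below shows one plausible shape for such a check, using three rules often applied to self-reported dietary data: an implausible energy estimate, identical answers to every item, and an unusually short completion time. The function name, input fields, and every threshold are assumptions for illustration, not the study's values.

```python
def ffq_quality_issues(frequencies, energy_kcal, minutes_to_complete,
                       min_kcal=500, max_kcal=5000, min_minutes=5):
    """Return a list of quality issues for one FFQ response.

    frequencies: per-item frequency codes chosen by the participant
    energy_kcal: estimated daily energy intake derived from the FFQ
    minutes_to_complete: time spent on the questionnaire
    All thresholds are hypothetical defaults, not values from the study.
    """
    issues = []
    # Energy estimate outside an assumed plausible range.
    if not (min_kcal <= energy_kcal <= max_kcal):
        issues.append("implausible_energy")
    # The same answer chosen for every item ("straight-lining").
    if len(frequencies) > 1 and len(set(frequencies)) == 1:
        issues.append("straight_lining")
    # Completed faster than an assumed minimum time.
    if minutes_to_complete < min_minutes:
        issues.append("too_fast")
    return issues
```

An empty list means the response passes these checks. A non-empty list could route the participant to the kind of research assistant follow-up described above rather than to automatic exclusion.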

Conclusions

Remote data collection tools were critical for meeting recruitment goals and ensuring the authenticity of our data. A combination of strategies is necessary to mitigate bot activity and ensure plausible responses. Widely available built-in tools (e.g., reCAPTCHA) are helpful but insufficient on their own. Customized solutions such as our automated systems may be critical for future researchers seeking to maintain data integrity.

Source journal: Current Developments in Nutrition (Nutrition & Dietetics)
CiteScore: 5.30
Self-citation rate: 4.20%
Articles published: 1327
Review time: 8 weeks