Assessment of Fraud Deterrence and Detection Procedures Used in a Web-Based Survey Study With Adult Black Cisgender Women: Description of Lessons Learned and Recommendations.
Authors: Amber I Sophus, Jason W Mitchell
Journal: JMIR Formative Research, volume 9, e59955
Publication date: 2025-03-12
DOI: 10.2196/59955 (https://doi.org/10.2196/59955)
Full text: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11947628/pdf/
Abstract
Background: Online research studies enable engagement with more Black cisgender women in health-related research. However, fraudulent responses in online studies raise important concerns about data integrity, particularly when incentives are involved.
Objective: The purpose of this study was to assess the strengths and limitations of fraud deterrence and detection procedures implemented in an incentivized, cross-sectional, online study about HIV prevention and sexual health with Black cisgender women living in Texas.
Methods: Data for this study came from a cross-sectional web-based survey that examined factors associated with potential pre-exposure prophylaxis use among a convenience sample of adult Black cisgender women from 3 metropolitan areas in Texas. Each eligibility screener and associated survey entry was evaluated using 4 fraud deterrence features and 7 fraud detection benchmarks with corresponding decision rules.
Results: A total of 5862 respondents provided consent and initiated the eligibility screener, of whom 2150 (36.68%) were ineligible because they did not meet the inclusion criteria, and 131 (2.23%) completed less than 80% of the survey and were removed from further consideration. Other entries were removed for not passing level 1 fraud deterrence safeguards: duplicate entries with the same IP address (388/5862, 6.62%), same telephone number (69/5862, 1.18%), same email address (114/5862, 1.94%), and same telephone number and email address (17/5862, 0.29%). Of the remaining 2993 entries, 1652 entries were removed for not passing the first 2 items of the level 2 fraud detection benchmarks: screeners and surveys with latitude and longitude coordinates outside of the United States (347/2993, 11.59%) and survey completion time of less than 10 minutes (1305/2993, 43.60%). Of the remaining 1341 entries, 130 (9.69%) passed all 5 of the remaining level 2 data validation benchmarks, and 763 (56.89%) entries were removed for passing fewer than 3. An additional 423 entries (423/1341, 33.4%) were removed after passing 4 of the 5 remaining validation benchmarks, being contacted to verify survey information, and not providing legitimate contact information or being unable to confirm personal information. The final enrolled sample in this online study consisted of 155 respondents who provided consent, were deemed eligible, and passed fraud deterrence features and fraud detection benchmarks. In this paper, we discuss the lessons learned and provide recommendations for leveraging available features in survey software programs to help deter bots and for enhancing fraud detection procedures beyond the options that survey software provides.
Conclusions: Effectively identifying fraudulent responses in online surveys is an ongoing challenge. The data validation approach used in this study establishes a robust protocol for identifying genuine participants, thereby contributing to the removal of false data from study findings. By sharing experiences and implementing thorough fraud deterrence and detection protocols, researchers can maintain data validity and contribute to best practices in web-based research.
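The screening sequence reported in the Results can be sketched as a small set of decision rules. The Python below is an illustration only, not the authors' implementation: the `Entry` fields, the `screen` function, and the status labels are hypothetical names. It also makes three assumptions that the abstract does not confirm: every entry sharing an IP address, telephone number, or email address is removed (rather than keeping one copy), entries passing exactly 3 of the 5 remaining benchmarks are handled like those passing 4, and eligibility and the 80% completion check have already been applied upstream.

```python
from collections import Counter
from dataclasses import dataclass

MIN_MINUTES = 10  # completion-time threshold stated in the Results


@dataclass
class Entry:
    """One screener/survey entry. All field names are hypothetical."""
    ip: str
    phone: str
    email: str
    in_us: bool              # geolocation coordinates fall inside the United States
    minutes: float           # survey completion time in minutes
    benchmarks_passed: int   # how many of the 5 remaining level 2 benchmarks passed
    contact_verified: bool = False  # outcome of follow-up contact, if any


def screen(entries):
    """Return one status string per entry, in input order."""
    ip_counts = Counter(e.ip for e in entries)
    phone_counts = Counter(e.phone for e in entries)
    email_counts = Counter(e.email for e in entries)

    statuses = []
    for e in entries:
        # Level 1: duplicate IP address, telephone number, or email address.
        # Assumption: all entries sharing a value are removed.
        if ip_counts[e.ip] > 1 or phone_counts[e.phone] > 1 or email_counts[e.email] > 1:
            statuses.append("removed: duplicate")
        # Level 2, first two benchmarks: location and completion time.
        elif not e.in_us:
            statuses.append("removed: outside US")
        elif e.minutes < MIN_MINUTES:
            statuses.append("removed: too fast")
        # Level 2, remaining five benchmarks.
        elif e.benchmarks_passed == 5:
            statuses.append("enrolled")
        elif e.benchmarks_passed < 3:
            statuses.append("removed: failed benchmarks")
        # Assumption: 3 or 4 passed means the respondent is contacted to verify.
        elif e.contact_verified:
            statuses.append("enrolled")
        else:
            statuses.append("removed: unverified")
    return statuses


if __name__ == "__main__":
    sample = [
        Entry("10.0.0.1", "555-0001", "a@example.org", True, 22, 5),
        Entry("10.0.0.2", "555-0002", "b@example.org", True, 6, 5),
    ]
    for entry, status in zip(sample, screen(sample)):
        print(entry.email, "->", status)
```

Applying the rules in a fixed order means each removed entry gets exactly one reason, which matches how the Results report non-overlapping counts at each stage. In real data, missing telephone numbers or email addresses would need separate handling so that blank values are not counted as duplicates of one another.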