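Identifying and Managing Fraudulent Respondents in Online Stated Preferences Surveys: A Case Example from Best-Worst Scaling in Health Preferences Research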

IF 3.1; Region 3 (Medicine); JCR Q1, HEALTH CARE SCIENCES & SERVICES

Patient-Patient Centered Outcomes Research; Pub Date: 2025-07-01; Epub Date: 2025-05-03; Pages: 373-390; DOI: 10.1007/s40271-025-00740-y

Karen V MacDonald, Geoffrey C Nguyen, Maida J Sewitch, Deborah A Marshall


Citations: 0

Abstract


Identifying and Managing Fraudulent Respondents in Online Stated Preferences Surveys: A Case Example from Best-Worst Scaling in Health Preferences Research.

Background: There is limited evidence and guidance in health preferences research to prevent, identify, and manage fraudulent respondents and data fraud, especially for best-worst scaling (BWS) and discrete choice experiments with nonordered attributes. Using an example from a BWS survey in which we experienced data fraud, we aimed to: (1) develop an approach to identify, verify, and categorize fraudulent respondents; (2) assess the impact of fraudulent respondents on data and results; and (3) identify variables associated with fraudulent respondents.

Methods: An online BWS survey on healthcare services for inflammatory bowel disease (IBD) was administered to Canadian IBD patients. We used a three-step approach to identify, verify, and categorize respondents as likely fraudulent (LF), likely real (LR), and unsure. First, responses to 12 "red flag" variables (variables identified as indicators of fraud) were coded 0 (pass) or 1 (fail) then summed to generate a "fraudulent response score" (FRS; range: 0-12 (most likely fraudulent)) used to categorize respondents. Second, respondents categorized LR or unsure underwent age verification. Third, categorization was updated on the basis of age verification results. BWS data were analyzed using conditional logit and latent class analysis. Subgroup analysis was done by final categorization, FRS, and red flag variables.
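The scoring step described in the Methods can be sketched as follows. This is a minimal illustration, not the authors' code: the abstract specifies only that 12 red flag variables are coded 0 (pass) or 1 (fail) and summed. The variable names (other than the suspicious-email flag mentioned in the Results), the function names, and the `lf_cutoff` threshold are hypothetical, since the abstract does not report the cutoffs used for categorization.

```python
from typing import Mapping

N_RED_FLAGS = 12  # the abstract specifies 12 red flag variables


def fraudulent_response_score(flags: Mapping[str, int]) -> int:
    """Sum 0/1 red flag codes (0 = pass, 1 = fail) into an FRS in the range 0-12."""
    if len(flags) != N_RED_FLAGS:
        raise ValueError(f"expected {N_RED_FLAGS} red flag variables, got {len(flags)}")
    if any(value not in (0, 1) for value in flags.values()):
        raise ValueError("each red flag must be coded 0 (pass) or 1 (fail)")
    return sum(flags.values())


def initial_category(frs: int, lf_cutoff: int) -> str:
    """Split respondents into 'likely fraudulent' vs. 'needs further review'.

    ASSUMPTION: lf_cutoff is a hypothetical threshold; the abstract does not
    state which FRS values led to which category.
    """
    return "likely fraudulent" if frs >= lf_cutoff else "needs further review"


# Hypothetical respondent: variable names are placeholders for illustration.
names = ["suspicious_email"] + [f"red_flag_{i:02d}" for i in range(2, 13)]
example = {name: 0 for name in names}
example["suspicious_email"] = 1
example["red_flag_02"] = 1
example["red_flag_03"] = 1

frs = fraudulent_response_score(example)
print(frs, initial_category(frs, lf_cutoff=3))  # cutoff of 3 is illustrative only
```

Keeping the cutoff as an explicit argument, rather than a default, makes it visible that the threshold is a study-specific decision that this sketch does not know.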

Results: Overall, n = 4334 respondents underwent initial categorization resulting in 24% (n = 1019) LF and 76% (n = 3315) needing further review. After review, 75% (n = 3258) were categorized as LF and n = 484 underwent age verification. Respondent categorization was updated on the basis of age verification, with final categorization of 76% (n = 3297) LF, 14% (n = 592) unsure, 10% (n = 442) LR, and < 1% (n = 3) duplicates of LR. BWS item rankings differed most by respondent category. Latent class analysis demonstrated final categorization was significantly associated with class membership; class 1 had characteristics consistent with LR respondents and item ranking order for class 1 closely aligned with LR respondent conditional logit results. Suspicious email was the most frequently failed red flag variable and was associated with fraudulent respondents.
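The counts reported in the Results can be checked for internal consistency with a few lines of arithmetic. The sketch below uses only the numbers stated in the abstract; the dictionary layout and the helper name `percent_shares` are illustrative choices, not part of the study.

```python
TOTAL = 4334  # respondents who underwent initial categorization

# Counts as reported in the abstract.
initial = {"LF": 1019, "needs further review": 3315}
final = {"LF": 3297, "unsure": 592, "LR": 442, "duplicates of LR": 3}


def percent_shares(counts: dict[str, int], total: int) -> dict[str, float]:
    """Return each category's share of the total, in percent."""
    return {label: 100 * n / total for label, n in counts.items()}


# Both the initial and the final categorization should account for everyone.
assert sum(initial.values()) == TOTAL
assert sum(final.values()) == TOTAL

final_shares = percent_shares(final, TOTAL)
for label, pct in final_shares.items():
    print(f"{label}: {pct:.1f}%")
```

Both sets of counts sum to 4334, and the computed shares round to the percentages given in the abstract (24% and 76% initially; 76%, 14%, 10%, and under 1% in the final categorization).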

Conclusions: Additional steps to review data and verify age resulted in better categorization than only FRS or single red flag variables. Email authentication, single use/unique survey links, and built-in identification verification may be most effective for fraud prevention. Guidance is needed on good research practices for most effective and efficient approaches for preventing, identifying, and managing fraudulent data in health preferences research, specifically in studies with nonordered attributes.


Source journal

Patient-Patient Centered Outcomes Research (HEALTH CARE SCIENCES & SERVICES)

CiteScore: 6.60

Self-citation rate: 8.30%

Articles published: not listed

Review time: >12 weeks

About the journal: The Patient provides a venue for scientifically rigorous, timely, and relevant research to promote the development, evaluation and implementation of therapies, technologies, and innovations that will enhance the patient experience. It is an international forum for research that advances and/or applies qualitative or quantitative methods to promote the generation, synthesis, or interpretation of evidence. The journal has specific interest in receiving original research, reviews and commentaries related to qualitative and mixed methods research, stated-preference methods, patient reported outcomes, and shared decision making. Advances in regulatory science, patient-focused drug development, patient-centered benefit-risk and health technology assessment will also be considered. Additional digital features (including animated abstracts, video abstracts, slide decks, audio slides, instructional videos, infographics, podcasts and animations) can be published with articles; these are designed to increase the visibility, readership and educational value of the journal’s content. In addition, articles published in The Patient may be accompanied by plain language summaries to assist readers who have some knowledge of, but not in-depth expertise in, the area to understand important medical advances. All manuscripts are subject to peer review by international experts.