Karen V MacDonald, Geoffrey C Nguyen, Maida J Sewitch, Deborah A Marshall
Patient-Patient Centered Outcomes Research. Journal Article. Published 2025-05-03. DOI: 10.1007/s40271-025-00740-y (https://doi.org/10.1007/s40271-025-00740-y)
Identifying and Managing Fraudulent Respondents in Online Stated Preferences Surveys: A Case Example from Best-Worst Scaling in Health Preferences Research.
Background: There is limited evidence and guidance in health preferences research to prevent, identify, and manage fraudulent respondents and data fraud, especially for best-worst scaling (BWS) and discrete choice experiments with nonordered attributes. Using an example from a BWS survey in which we experienced data fraud, we aimed to: (1) develop an approach to identify, verify, and categorize fraudulent respondents; (2) assess the impact of fraudulent respondents on data and results; and (3) identify variables associated with fraudulent respondents.
Methods: An online BWS survey on healthcare services for inflammatory bowel disease (IBD) was administered to Canadian IBD patients. We used a three-step approach to identify, verify, and categorize respondents as likely fraudulent (LF), likely real (LR), or unsure. First, responses to 12 "red flag" variables (variables identified as indicators of fraud) were coded 0 (pass) or 1 (fail), then summed to generate a "fraudulent response score" (FRS; range: 0-12, with 12 most likely fraudulent) used to categorize respondents. Second, respondents categorized as LR or unsure underwent age verification. Third, categorization was updated on the basis of age verification results. BWS data were analyzed using conditional logit and latent class analysis. Subgroup analysis was done by final categorization, FRS, and red flag variables.
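The scoring step described in the Methods can be illustrated with a minimal Python sketch. It only shows the sum of 12 binary red flags. The flag names are hypothetical placeholders (the abstract names only "suspicious email"), and no categorization cutoffs are applied because the abstract does not report them.

```python
from typing import Mapping

N_RED_FLAGS = 12  # the abstract states that 12 red flag variables were used


def fraudulent_response_score(flags: Mapping[str, int]) -> int:
    """Sum 0/1 red flag codes (0 = pass, 1 = fail) into a score from 0 to 12."""
    if len(flags) != N_RED_FLAGS:
        raise ValueError(f"expected {N_RED_FLAGS} red flags, got {len(flags)}")
    if any(value not in (0, 1) for value in flags.values()):
        raise ValueError("each red flag must be coded 0 (pass) or 1 (fail)")
    return sum(flags.values())


# Hypothetical respondent: every flag name except "suspicious_email" is an
# invented placeholder, not a variable reported in the abstract.
example_flags = {f"red_flag_{i:02d}": 0 for i in range(1, 12)}
example_flags["suspicious_email"] = 1
example_flags["red_flag_03"] = 1
example_flags["red_flag_07"] = 1

example_score = fraudulent_response_score(example_flags)
print(example_score)
```

A higher score means more failed checks. How a score maps to LF, LR, or unsure would depend on thresholds defined in the full paper.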
Results: Overall, n = 4334 respondents underwent initial categorization resulting in 24% (n = 1019) LF and 76% (n = 3315) needing further review. After review, 75% (n = 3258) were categorized as LF and n = 484 underwent age verification. Respondent categorization was updated on the basis of age verification, with final categorization of 76% (n = 3297) LF, 14% (n = 592) unsure, 10% (n = 442) LR, and < 1% (n = 3) duplicates of LR. BWS item rankings differed most by respondent category. Latent class analysis demonstrated final categorization was significantly associated with class membership; class 1 had characteristics consistent with LR respondents and item ranking order for class 1 closely aligned with LR respondent conditional logit results. Suspicious email was the most frequently failed red flag variable and was associated with fraudulent respondents.
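The reported counts and percentages can be cross-checked with a few lines of Python. This is a plain arithmetic sketch using only the numbers given in the Results. The variable and function names are illustrative choices.

```python
TOTAL = 4334  # respondents who underwent initial categorization

# Counts as reported in the Results paragraph.
initial_counts = {"likely fraudulent": 1019, "needing further review": 3315}
final_counts = {
    "likely fraudulent": 3297,
    "unsure": 592,
    "likely real": 442,
    "duplicates of likely real": 3,
}


def percent(n: int, total: int = TOTAL) -> float:
    """Return n as a percentage of total."""
    return 100 * n / total


# Both categorizations should account for every respondent.
initial_sum = sum(initial_counts.values())
final_sum = sum(final_counts.values())

for label, n in final_counts.items():
    print(f"{label}: n = {n} ({percent(n):.1f}%)")
```

Both sets of counts sum to 4334, and the rounded percentages match those stated in the abstract.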
Conclusions: Additional steps to review data and verify age resulted in better categorization than the FRS alone or any single red flag variable. Email authentication, single-use/unique survey links, and built-in identification verification may be most effective for fraud prevention. Guidance is needed on good research practices for the most effective and efficient approaches to preventing, identifying, and managing fraudulent data in health preferences research, specifically in studies with nonordered attributes.
About the journal:
The Patient provides a venue for scientifically rigorous, timely, and relevant research to promote the development, evaluation and implementation of therapies, technologies, and innovations that will enhance the patient experience. It is an international forum for research that advances and/or applies qualitative or quantitative methods to promote the generation, synthesis, or interpretation of evidence.
The journal has specific interest in receiving original research, reviews and commentaries related to qualitative and mixed methods research, stated-preference methods, patient reported outcomes, and shared decision making.
Advances in regulatory science, patient-focused drug development, patient-centered benefit-risk and health technology assessment will also be considered.
Additional digital features (including animated abstracts, video abstracts, slide decks, audio slides, instructional videos, infographics, podcasts and animations) can be published with articles; these are designed to increase the visibility, readership and educational value of the journal’s content. In addition, articles published in The Patient may be accompanied by plain language summaries to assist readers who have some knowledge of, but not in-depth expertise in, the area to understand important medical advances.
All manuscripts are subject to peer review by international experts.