Richard J. E. James & Lucy Hitcham. Addiction 2025; 120(4): 642–644. doi: 10.1111/add.16764. Published 2025-01-11.
Commentary on Sun and Tang: Measurement assessment and validity in problematic smartphone use
The thoughtful choice of estimation procedures for confirmatory factor analysis (CFA) and invariance testing deserves particular attention. Many assessment studies use maximum likelihood (ML, or MLR with robust standard errors) for CFA despite well-known limitations when applied to ordinal data [7]. A popular alternative is limited-information estimation, for example mean- and variance-adjusted weighted least squares (WLSMV), to overcome these limitations. However, doing so comes with major drawbacks of its own, most notably when assessing measurement invariance [8, 9]. Sun and Tang [5] carefully balance the strengths of both MLR and WLSMV to validate the Problematic Smartphone Use Scale among Chinese college students (PSUS-C). These considerations are valuable across the entirety of addiction research, especially in domains or populations where endorsement of indicators might be skewed (e.g. gambling, certain forms of substance use and general population samples). To illustrate why these problems matter: CFA studies have repeatedly produced inconsistent evidence of structural validity for prominent scales such as the Problem Gambling Severity Index [10, 11]. However, closer examination suggests that most of this inconsistency is an artifact of using ML on ordinal questionnaire items in general population samples, where the distribution of responses is often skewed. When the data are analyzed using an approach that balances the strengths of both ML and WLSMV, these inconsistencies disappear [11, 12].
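To make the attenuation problem concrete, here is a minimal simulation (ours, not from the commentary) of what happens when two correlated latent traits are measured with skewed four-category ordinal items: the Pearson correlation that ML-based CFA operates on understates the latent correlation, which is precisely what polychoric-based estimators such as WLSMV are designed to avoid. The thresholds below are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 100_000
latent_r = 0.6  # true correlation between two latent continuous traits

# Draw bivariate normal latent responses
cov = [[1.0, latent_r], [latent_r, 1.0]]
x, y = rng.multivariate_normal([0.0, 0.0], cov, size=n).T

# Discretize into four ordinal categories with skewed thresholds,
# mimicking items that most of a general-population sample does not endorse
thresholds = [1.0, 1.8, 2.5]
x_ord = np.digitize(x, thresholds)
y_ord = np.digitize(y, thresholds)

# Pearson correlation of the ordinal scores (what ML-based CFA works from)
observed_r = np.corrcoef(x_ord, y_ord)[0, 1]
print(f"latent r = {latent_r:.2f}, observed ordinal r = {observed_r:.2f}")
```

The observed correlation is noticeably smaller than the latent one, and the attenuation worsens as the response distribution becomes more skewed, which is how spurious misfit can arise in ML-based CFA of ordinal items.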
The findings also highlight an important tension between identifying the best-fitting factor structure and deciding how a scale should be used. Both exploratory factor analysis (EFA) and CFA rejected a single-factor model in this study, yet a sum score was used to assess criterion validity. We raise this to promote the benefits of testing models that specify either a second-order or a bifactor structure, because these can assess whether a single score is appropriate [13]. This is an issue across the PSU field, where many scales have been validated as multi-dimensional but are used as a single score. This tension is a source of analytic flexibility and a potential threat to the validity of many findings, especially when methods such as structural equation modelling are used.
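As a sketch of how a bifactor model can speak to the sum-score question, the following computes coefficient omega hierarchical, the share of sum-score variance attributable to the general factor, from a purely hypothetical set of standardized loadings (the values are illustrative and are not estimates from the PSUS-C):

```python
import numpy as np

# Hypothetical standardized bifactor loadings for six items: one general
# factor plus two specific factors (items 1-3 and 4-6). Illustrative only.
general = np.array([0.6, 0.6, 0.5, 0.6, 0.5, 0.5])
spec1   = np.array([0.4, 0.4, 0.3, 0.0, 0.0, 0.0])
spec2   = np.array([0.0, 0.0, 0.0, 0.4, 0.3, 0.4])

# Residual (unique) variances implied by the standardized loadings
uniq = 1 - general**2 - spec1**2 - spec2**2

# Variance of the unit-weighted sum score, then omega coefficients
var_sum = general.sum()**2 + spec1.sum()**2 + spec2.sum()**2 + uniq.sum()
omega_total = (general.sum()**2 + spec1.sum()**2 + spec2.sum()**2) / var_sum
omega_h = general.sum()**2 / var_sum  # general-factor share of sum-score variance

print(f"omega_total = {omega_total:.2f}, omega_hierarchical = {omega_h:.2f}")
```

A high omega hierarchical (conventions in the reliability literature put the bar around 0.7-0.8) supports interpreting a single total score even when the fitted structure is multi-dimensional; a low value suggests the sum score conflates distinct dimensions.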
Our final reflection underscores the importance of invariance testing. Despite concluding in favor of strict invariance, there does not appear to be a comparison of latent mean differences that would allow a stronger test of group differences. Our examination of the descriptive data suggests the absence of a substantial sex difference in PSUS-C scores in this large, externally representative sample. We calculated the standardized effect size (d) using the mean (M) and SD statistics reported in Table 1 (men: M = 58.05, SD = 18.09; women: M = 57.52, SD = 16.12). The difference observed in this study is practically indistinguishable from zero (d = 0.03). This finding contrasts with a large, disparate literature that has inconsistently found sex differences in the severity and prevalence of problematic smartphone behaviors (e.g. Cohen's d for women > men = 0.16 [14], 0.39 [15], 0.22 [16], 0.10 [17] and 0.21 [17]). This is further complicated by a fixation on creating novel instruments or adapting scales from other behavioral addictions instead of improving and refining existing measures [18]. Ultimately, the absence of appropriate psychometric validation in many PSU and behavioral addiction measures makes it impossible to determine whether the group differences observed elsewhere reflect genuine differences or bias caused by sampling, specific measurement scales or specific questionnaire items. Sun and Tang's study [5] offers insights on how to move forward with the assessment and validation of behavioral addiction measurement scales. Rigorous testing is essential to establish whether addiction constructs are equivalent across diverse groups of people, so that valid group comparisons and inferences can be made [19].
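The effect size above can be reproduced from the reported descriptives. Because group sizes are not used here, the denominator is the unweighted root mean square of the two SDs, a common convention when per-group n is unavailable:

```python
import math

def cohens_d(m1: float, sd1: float, m2: float, sd2: float) -> float:
    """Standardized mean difference between two groups.

    Uses the unweighted root mean square of the two SDs as the
    denominator, since per-group sample sizes are not reported here.
    """
    pooled_sd = math.sqrt((sd1**2 + sd2**2) / 2)
    return (m1 - m2) / pooled_sd

# Means and SDs for men and women as reported in Table 1
d = cohens_d(58.05, 18.09, 57.52, 16.12)
print(f"d = {d:.2f}")  # prints "d = 0.03"
```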
Richard J. E. James: Writing—original draft (equal). Lucy Hitcham: Writing—original draft (equal).
R.J. has received funding for gambling research projects in the last 3 years from GREO Evidence Insights and the Academic Forum for the Study of Gambling. These funds are sourced from regulatory settlements levied by the Gambling Commission in lieu of penalties. L.H. is funded by the Engineering and Physical Sciences Research Council (EPSRC) on a PhD Studentship scholarship (EP/S023305/1).