Is Less Sometimes More? An Experimental Comparison of Four Measures of Perceived Usability.

IF 3.3 | Zone 3, Psychology | JCR Q1, Behavioral Sciences

Human Factors Pub Date : 2025-01-01 Epub Date: 2024-03-14 DOI:10.1177/00187208241237862

Elisa Gräve, Axel Buchner

Human Factors, pp. 32-48. Journal Article. Full-text PDF (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11555902/pdf/

Citations: 0

Abstract

Objective: In usability studies, the subjective component of usability, perceived usability, is often of interest besides the objective usability components, efficiency and effectiveness. Perceived usability is typically investigated using questionnaires. Our goal was to assess experimentally which of four perceived-usability questionnaires differing in length best reflects the difference in perceived usability between systems.

Background: Conventional measurement wisdom strongly favors multi-item questionnaires, as measures based on more items supposedly yield better results. However, this assumption is controversial. Single-item questionnaires also have distinct advantages and it has been shown repeatedly that single-item measures can be viable alternatives to multi-item measures.

Method: N = 1089 (Experiment 1) and N = 1095 (Experiment 2) participants rated the perceived usability of a good or a poor web-based mobile phone contract system using the 35-item ISONORM 9241/10 (Experiment 1 only), the 10-item System Usability Scale (SUS), the 4-item Usability Metric for User Experience (UMUX), and the single-item Adjective Rating Scale.
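For readers unfamiliar with how the two shorter multi-item instruments are usually turned into a single number, the following is a minimal sketch of the conventional SUS and UMUX scoring rules. The abstract does not describe the authors' scoring procedure, so this illustrates the standard published conventions only, not their analysis code. The function names `sus_score` and `umux_score` are hypothetical.

```python
def sus_score(responses):
    """Conventional SUS scoring: 10 items rated 1-5, result on a 0-100 scale.

    Odd-numbered items are positively worded (contribution = rating - 1),
    even-numbered items are negatively worded (contribution = 5 - rating).
    """
    if len(responses) != 10:
        raise ValueError("SUS expects exactly 10 item ratings")
    total = 0
    for item_number, rating in enumerate(responses, start=1):
        if not 1 <= rating <= 5:
            raise ValueError("SUS ratings must lie between 1 and 5")
        total += (rating - 1) if item_number % 2 == 1 else (5 - rating)
    return total * 2.5  # rescale the 0-40 sum to 0-100


def umux_score(responses):
    """Conventional UMUX scoring: 4 items rated 1-7, result on a 0-100 scale.

    Odd-numbered items contribute rating - 1, even-numbered items 7 - rating.
    """
    if len(responses) != 4:
        raise ValueError("UMUX expects exactly 4 item ratings")
    total = 0
    for item_number, rating in enumerate(responses, start=1):
        if not 1 <= rating <= 7:
            raise ValueError("UMUX ratings must lie between 1 and 7")
        total += (rating - 1) if item_number % 2 == 1 else (7 - rating)
    return total * 100 / 24  # rescale the 0-24 sum to 0-100
```

Both functions map the most favorable response pattern to 100 and a fully neutral pattern to 50, which puts scores from instruments of different length on a common scale.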

Results: The Adjective Rating Scale represented the perceived-usability difference between the two systems at least as well as, or significantly better than, the multi-item questionnaires (significantly better than the UMUX and the ISONORM 9241/10 in Experiment 1, significantly better than the SUS in Experiment 2).
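One common way to express how strongly a questionnaire separates two systems is a standardized mean difference such as Cohen's d. The abstract does not state which statistic the authors used, so the sketch below is an assumption for illustration only. The function name `cohens_d` and the ratings in the usage lines are hypothetical, not data from the study.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Standardized mean difference between two independent groups (pooled SD)."""
    n_a, n_b = len(group_a), len(group_b)
    if n_a < 2 or n_b < 2:
        raise ValueError("each group needs at least two ratings")
    # Pool the two sample variances, weighting each by its degrees of freedom.
    pooled_variance = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_variance)


# Hypothetical 0-100 ratings for a good and a poor system (not study data).
good_system = [80, 90, 100]
poor_system = [40, 50, 60]
effect = cohens_d(good_system, poor_system)
```

Under this reading, a questionnaire that yields a larger d for the same pair of systems reflects the difference between them more clearly.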

Conclusion: The single-item Adjective Rating Scale is a viable alternative to multi-item perceived-usability questionnaires.

Application: Extremely short instruments can be recommended to measure perceived usability, at least for simple user interfaces that can be considered concrete-singular in the sense that raters understand which entity is being rated and that what is being rated is reasonably homogeneous.


Source journal: Human Factors (Management Science, Behavioral Sciences)

CiteScore: 10.60

Self-citation rate: 6.10%

Review time: 6-12 weeks

About the journal: Human Factors: The Journal of the Human Factors and Ergonomics Society publishes peer-reviewed scientific studies in human factors/ergonomics that present theoretical and practical advances concerning the relationship between people and technologies, tools, environments, and systems. Papers published in Human Factors leverage fundamental knowledge of human capabilities and limitations, and the basic understanding of cognitive, physical, behavioral, physiological, social, developmental, affective, and motivational aspects of human performance, to yield design principles; enhance training, selection, and communication; and ultimately improve human-system interfaces and sociotechnical systems that lead to safer and more effective outcomes.