Measuring Trust in a Simulated Human Agent Team Task

C. Ficke, Arianna Addis, Daniela Nguyen, Kendall Carmody, Amanda L. Thayer, Jessica L. Wildman, M. Carroll
{"title":"模拟人工代理团队任务中的信任度量","authors":"C. Ficke, Arianna Addis, Daniela Nguyen, Kendall Carmody, Amanda L. Thayer, Jessica L. Wildman, M. Carroll","doi":"10.54941/ahfe1003560","DOIUrl":null,"url":null,"abstract":"Due to improvements in agent capabilities through technological advancements, the prevalence of human-agent teams (HATs) are expanding into more dynamic and complex environments. Prior research suggests that human trust in agents plays a pivotal role in the team’s success and mission effectiveness (Yu et al., 2019; Kohn et al., 2020). Therefore, understanding and being able to accurately measure trust in HATs is critical. The literature presents numerous approaches to capture and measure trust in HATs, including behavioral indicators, self-report survey items, and physiological measures to capture and quantify trust. However, deciding when and which measures to use can be an overwhelming and tedious process. To combat this issue, we previously developed a theoretical framework to guide researchers in what measures to use and when to use them in a HAT context (Ficke et al., 2022). More specifically, we evaluated common measures of trust in HATs according to eight criteria and demonstrated the utility of different types of measures in various scenarios according to how dynamic trust was expected to be and how often teammates interacted with one another. In the current work, we operationalize this framework in a simulation-based research setting. In particular, we developed a simulated search and rescue task paradigm in which a human teammate interacts with two subteams of autonomous agents to identify and respond to targets such as enemies, IEDs and trapped civilians. Using the Ficke et al. (2022) framework as a guide, we identified self-report, behavioral, and physiological measures to capture human trust in their autonomous agent counterparts, at the individual, subteam, and full team levels. Measures included single-item and multi-item self report surveys, chosen due to their accessibility and prevalence across research domains, as well as their simplistic ability to assess multifaceted constructs. These self-report measures will also be used to assess convergent validity of newly developed unobtrusive (i.e., behavioral, physiological) measures of trust. Further, using the six-step Rational Approach to Developing Systems-based Measures (RADSM) process, we cross-referenced theory on trust with available data from the paradigm to develop context-appropriate behavioral measures of trust. The RADSM process differs from traditional data-led approaches in that it is simultaneously a top-down (data-driven) and bottom-up (theory-driven) approach (Orvis, et al., 2013). Through this process, we identified a range of measures such as usage behaviors (to use or misuse an entity), monitoring behaviors, response time, and other context-specific actions to capture trust. We also incorporated tools to capture physiological responses, including electrocardiogram readings and galvanic skin responses. These measures will be utilized in a series of simulation-based experiments examining the effect of trust violation and repair strategies on trust as well as to evaluate the validity and reliability of the measurement framework. 
This paper will describe the methods used to identify, develop and/or implement these measures, the resulting measure implementation and how the resulting measurement toolbox maps onto the evaluation criteria (e.g., temporal resolution, diagnosticity), and guidance for implementation in other domains.","PeriodicalId":102446,"journal":{"name":"Human Factors and Simulation","volume":"45 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Measuring Trust in a Simulated Human Agent Team Task\",\"authors\":\"C. Ficke, Arianna Addis, Daniela Nguyen, Kendall Carmody, Amanda L. Thayer, Jessica L. Wildman, M. Carroll\",\"doi\":\"10.54941/ahfe1003560\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Due to improvements in agent capabilities through technological advancements, the prevalence of human-agent teams (HATs) are expanding into more dynamic and complex environments. Prior research suggests that human trust in agents plays a pivotal role in the team’s success and mission effectiveness (Yu et al., 2019; Kohn et al., 2020). Therefore, understanding and being able to accurately measure trust in HATs is critical. The literature presents numerous approaches to capture and measure trust in HATs, including behavioral indicators, self-report survey items, and physiological measures to capture and quantify trust. However, deciding when and which measures to use can be an overwhelming and tedious process. To combat this issue, we previously developed a theoretical framework to guide researchers in what measures to use and when to use them in a HAT context (Ficke et al., 2022). More specifically, we evaluated common measures of trust in HATs according to eight criteria and demonstrated the utility of different types of measures in various scenarios according to how dynamic trust was expected to be and how often teammates interacted with one another. In the current work, we operationalize this framework in a simulation-based research setting. In particular, we developed a simulated search and rescue task paradigm in which a human teammate interacts with two subteams of autonomous agents to identify and respond to targets such as enemies, IEDs and trapped civilians. Using the Ficke et al. (2022) framework as a guide, we identified self-report, behavioral, and physiological measures to capture human trust in their autonomous agent counterparts, at the individual, subteam, and full team levels. Measures included single-item and multi-item self report surveys, chosen due to their accessibility and prevalence across research domains, as well as their simplistic ability to assess multifaceted constructs. These self-report measures will also be used to assess convergent validity of newly developed unobtrusive (i.e., behavioral, physiological) measures of trust. Further, using the six-step Rational Approach to Developing Systems-based Measures (RADSM) process, we cross-referenced theory on trust with available data from the paradigm to develop context-appropriate behavioral measures of trust. The RADSM process differs from traditional data-led approaches in that it is simultaneously a top-down (data-driven) and bottom-up (theory-driven) approach (Orvis, et al., 2013). 
Through this process, we identified a range of measures such as usage behaviors (to use or misuse an entity), monitoring behaviors, response time, and other context-specific actions to capture trust. We also incorporated tools to capture physiological responses, including electrocardiogram readings and galvanic skin responses. These measures will be utilized in a series of simulation-based experiments examining the effect of trust violation and repair strategies on trust as well as to evaluate the validity and reliability of the measurement framework. This paper will describe the methods used to identify, develop and/or implement these measures, the resulting measure implementation and how the resulting measurement toolbox maps onto the evaluation criteria (e.g., temporal resolution, diagnosticity), and guidance for implementation in other domains.\",\"PeriodicalId\":102446,\"journal\":{\"name\":\"Human Factors and Simulation\",\"volume\":\"45 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"1900-01-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Human Factors and Simulation\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.54941/ahfe1003560\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Human Factors and Simulation","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.54941/ahfe1003560","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

Abstract

As technological advances improve agent capabilities, human-agent teams (HATs) are expanding into more dynamic and complex environments. Prior research suggests that human trust in agents plays a pivotal role in a team's success and mission effectiveness (Yu et al., 2019; Kohn et al., 2020). Therefore, understanding and accurately measuring trust in HATs is critical. The literature presents numerous approaches to capturing and quantifying trust in HATs, including behavioral indicators, self-report survey items, and physiological measures. However, deciding which measures to use, and when, can be an overwhelming and tedious process. To address this issue, we previously developed a theoretical framework to guide researchers in what measures to use and when to use them in a HAT context (Ficke et al., 2022). More specifically, we evaluated common measures of trust in HATs according to eight criteria and demonstrated the utility of different types of measures in various scenarios, depending on how dynamic trust was expected to be and how often teammates interacted with one another.

In the current work, we operationalize this framework in a simulation-based research setting. In particular, we developed a simulated search and rescue task paradigm in which a human teammate interacts with two subteams of autonomous agents to identify and respond to targets such as enemies, IEDs, and trapped civilians. Using the Ficke et al. (2022) framework as a guide, we identified self-report, behavioral, and physiological measures to capture human trust in autonomous agent teammates at the individual, subteam, and full-team levels. Measures included single-item and multi-item self-report surveys, chosen for their accessibility and prevalence across research domains, as well as their ability to assess multifaceted constructs in a simple manner. These self-report measures will also be used to assess the convergent validity of newly developed unobtrusive (i.e., behavioral and physiological) measures of trust.

Further, using the six-step Rational Approach to Developing Systems-based Measures (RADSM) process, we cross-referenced trust theory with the data available from the paradigm to develop context-appropriate behavioral measures of trust. The RADSM process differs from traditional data-led approaches in that it is simultaneously top-down (theory-driven) and bottom-up (data-driven) (Orvis et al., 2013). Through this process, we identified a range of measures, such as usage behaviors (using or misusing an entity), monitoring behaviors, response times, and other context-specific actions, to capture trust. We also incorporated tools to capture physiological responses, including electrocardiogram readings and galvanic skin responses.

These measures will be used in a series of simulation-based experiments examining the effect of trust violation and repair strategies on trust, as well as to evaluate the validity and reliability of the measurement framework. This paper describes the methods used to identify, develop, and/or implement these measures; the resulting measure implementation; how the resulting measurement toolbox maps onto the evaluation criteria (e.g., temporal resolution, diagnosticity); and guidance for implementation in other domains.
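
To make the behavioral scoring concrete, the sketch below shows one way indicators such as usage behaviors, monitoring behaviors, and response time could be computed from a simulation event log. It is a minimal illustration, not the paradigm's actual implementation: the Event schema, the event kinds, and the function names are all assumptions made for this example.

```python
# Illustrative sketch only: the event schema and field names below are
# hypothetical, not the logging format used in the actual task paradigm.
from dataclasses import dataclass
from statistics import mean
from typing import Optional

@dataclass
class Event:
    t: float                       # simulation time (seconds)
    actor: str                     # "human" or an agent id, e.g. "uav_1"
    kind: str                      # "recommendation", "accept", "monitor", "alert", "response", ...
    target: Optional[str] = None   # which agent a human action refers to

def usage_rate(events: list[Event], agent: str) -> float:
    """Fraction of the agent's recommendations the human accepted
    (a common behavioral proxy for reliance, and hence trust)."""
    recs = [e for e in events if e.actor == agent and e.kind == "recommendation"]
    accepts = [e for e in events if e.actor == "human" and e.kind == "accept" and e.target == agent]
    return len(accepts) / len(recs) if recs else float("nan")

def monitoring_rate(events: list[Event], agent: str, duration_s: float) -> float:
    """Monitoring checks of the agent per minute; frequent monitoring is
    often interpreted as lower trust."""
    checks = [e for e in events if e.actor == "human" and e.kind == "monitor" and e.target == agent]
    return 60.0 * len(checks) / duration_s if duration_s > 0 else float("nan")

def mean_response_time(events: list[Event], agent: str) -> float:
    """Mean latency (seconds) between an agent alert and the human's next response to it."""
    latencies, pending = [], None
    for e in sorted(events, key=lambda ev: ev.t):
        if e.actor == agent and e.kind == "alert":
            pending = e.t
        elif pending is not None and e.actor == "human" and e.kind == "response" and e.target == agent:
            latencies.append(e.t - pending)
            pending = None
    return mean(latencies) if latencies else float("nan")
```

In a design like the one described above, such scores would presumably be computed per agent, and aggregated to the subteam and full-team levels, within each scenario block so that they can be compared against the corresponding self-report items.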
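
Likewise, the convergent-validity check mentioned in the abstract could, in its simplest form, correlate self-report trust scores with an unobtrusive behavioral indicator across participants. The sketch below uses a Pearson correlation on placeholder values; the variable names and data are purely illustrative.

```python
# Illustrative sketch only: the per-participant values are placeholders.
import numpy as np
from scipy.stats import pearsonr

# Hypothetical per-participant scores: multi-item self-report trust (1-7 scale)
# and behavioral usage rate (0-1) toward the same agent subteam.
self_report_trust = np.array([5.8, 4.2, 6.1, 3.5, 5.0, 6.4, 4.8, 5.5])
usage = np.array([0.82, 0.55, 0.90, 0.40, 0.70, 0.93, 0.61, 0.77])

r, p = pearsonr(self_report_trust, usage)
print(f"Convergent validity check: Pearson r = {r:.2f}, p = {p:.3f}")
```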