Rinta Hasegawa, Yosuke Fukuchi, Kohei Okuoka, M. Imai
Advantage Mapping: Learning Operation Mapping for User-Preferred Manipulation by Extracting Scenes with Advantage Function
Proceedings of the 10th International Conference on Human-Agent Interaction, 2022-12-05
https://doi.org/10.1145/3527188.3561917
Citations: 0
Abstract
When a user manipulates a system, an input given through an interface, which we call an operation, is converted into the user’s intended action according to a mapping that links operations to actions, which we call an “operation mapping”. Although many operation mappings are created by designers based on assumptions about how a typical user would operate the system, the optimal operation mapping may vary from user to user, and a designer cannot prepare all possible operation mappings in advance. One approach to this problem is to learn an operation mapping autonomously while the user operates the system. However, existing methods require the scenes used for learning mappings to be prepared manually. We propose advantage mapping, which enables efficient learning of operation mappings. Working from the idea that scenes in which the user’s desired action is predictable are useful for learning operation mappings, advantage mapping extracts scenes according to the magnitude of entropy in the output of the action value function acquired through reinforcement learning. In our experiment, the user’s ideal operation mapping was obtained more accurately from the scenes selected by advantage mapping than from learning through actual play.
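The entropy-based scene extraction described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes that the action values of each scene are turned into a distribution with a softmax, that low entropy marks a scene where the desired action is predictable, and that the `k` lowest-entropy scenes are kept. The names `softmax`, `entropy`, `select_scenes`, and the `temperature` parameter are hypothetical and chosen only for illustration.

```python
import math


def softmax(q_values, temperature=1.0):
    """Turn action values into a probability distribution (assumed step)."""
    m = max(q_values)  # subtract the maximum for numerical stability
    exps = [math.exp((q - m) / temperature) for q in q_values]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy in nats; zero-probability terms contribute nothing."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def select_scenes(scene_q_values, k, temperature=1.0):
    """Return the indices of the k scenes with the lowest entropy.

    Assumption: low entropy over the action values means one action clearly
    dominates, so the user's desired action in that scene is predictable.
    """
    scored = [
        (entropy(softmax(q, temperature)), i)
        for i, q in enumerate(scene_q_values)
    ]
    scored.sort()
    return [i for _, i in scored[:k]]


# Hypothetical action values for three scenes with four actions each.
scenes = [
    [1.0, 1.0, 1.0, 1.0],  # all actions equally good: maximal entropy
    [5.0, 0.0, 0.0, 0.0],  # one clearly best action: low entropy
    [2.0, 1.5, 0.0, 0.0],  # two plausible actions: intermediate entropy
]
selected = select_scenes(scenes, k=2)
```

With these made-up values, the scene with a single dominant action ranks first and the scene with uniform action values ranks last. Whether the method in the paper thresholds the entropy, ranks scenes, or uses the advantage function named in the title instead of raw action values is not stated in the abstract, so the ranking used here is only one possible reading.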