Rinta Hasegawa, Yosuke Fukuchi, Kohei Okuoka, M. Imai
Proceedings of the 10th International Conference on Human-Agent Interaction. Published 2022-12-05. DOI: 10.1145/3527188.3561917
Advantage Mapping: Learning Operation Mapping for User-Preferred Manipulation by Extracting Scenes with Advantage Function
When a user manipulates a system, an input given through an interface (an operation) is converted into the user's intended action according to the mapping that links operations to actions, which we call an "operation mapping". Although many operation mappings are created by designers who assume how a typical user would operate the system, the optimal mapping may vary from user to user, and a designer cannot prepare every possible operation mapping in advance. One approach to this problem is to learn an operation mapping autonomously during operation; however, existing methods require the scenes used for learning the mapping to be prepared manually. We propose advantage mapping, which enables efficient learning of operation mappings. Based on the idea that scenes in which the user's desired action is predictable are useful for learning operation mappings, advantage mapping extracts scenes according to the magnitude of the entropy in the output of the action-value function acquired through reinforcement learning. In our experiment, the user's ideal operation mapping was obtained more accurately from the scenes selected by advantage mapping than from learning through actual play.
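The core selection step described in the abstract, ranking scenes by the entropy of the policy induced by the action-value function and keeping the most predictable ones, can be sketched as follows. This is a minimal illustration, not the authors' implementation; the function names, the softmax temperature, and the toy Q-values are assumptions introduced for the example.

```python
import math

def entropy_of_q(q_values, temperature=1.0):
    """Shannon entropy of the softmax distribution over action values.
    Low entropy means one action clearly dominates, i.e. the user's
    desired action in this scene is predictable."""
    m = max(q / temperature for q in q_values)  # for numerical stability
    exps = [math.exp(q / temperature - m) for q in q_values]
    z = sum(exps)
    probs = [e / z for e in exps]
    return -sum(p * math.log(p) for p in probs if p > 0)

def extract_scenes(scene_q_values, k):
    """Return indices of the k scenes with the lowest action-distribution
    entropy, as candidates for learning the operation mapping."""
    ranked = sorted(range(len(scene_q_values)),
                    key=lambda i: entropy_of_q(scene_q_values[i]))
    return ranked[:k]

# Toy example: three scenes, each with Q-values for three actions.
scenes = [
    [5.0, 0.1, 0.2],   # one clearly best action -> low entropy
    [1.0, 1.0, 1.0],   # all actions equal -> maximal entropy
    [2.0, 1.5, 0.5],   # moderately peaked
]
print(extract_scenes(scenes, 2))  # -> [0, 2]
```

The sketch assumes Q-values are already available per scene (e.g. from a trained RL agent); in practice the entropy threshold or the number of scenes k would be tuned to how much labeling effort the user can afford.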