量子领域的自动化小工具发现

IF 6.3 2区物理与天体物理 Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Machine Learning Science and Technology Pub Date : 2023-08-15 DOI:10.1088/2632-2153/acf098

Lea M. Trenkwalder, Andrea López-Incera, Hendrik Poulsen Nautrup, Fulvio Flamini, H. Briegel

{"title":"量子领域的自动化小工具发现","authors":"Lea M. Trenkwalder, Andrea López-Incera, Hendrik Poulsen Nautrup, Fulvio Flamini, H. Briegel","doi":"10.1088/2632-2153/acf098","DOIUrl":null,"url":null,"abstract":"In recent years, reinforcement learning (RL) has become increasingly successful in its application to the quantum domain and the process of scientific discovery in general. However, while RL algorithms learn to solve increasingly complex problems, interpreting the solutions they provide becomes ever more challenging. In this work, we gain insights into an RL agent’s learned behavior through a post-hoc analysis based on sequence mining and clustering. Specifically, frequent and compact subroutines, used by the agent to solve a given task, are distilled as gadgets and then grouped by various metrics. This process of gadget discovery develops in three stages: First, we use an RL agent to generate data, then, we employ a mining algorithm to extract gadgets and finally, the obtained gadgets are grouped by a density-based clustering algorithm. We demonstrate our method by applying it to two quantum-inspired RL environments. First, we consider simulated quantum optics experiments for the design of high-dimensional multipartite entangled states where the algorithm finds gadgets that correspond to modern interferometer setups. Second, we consider a circuit-based quantum computing environment where the algorithm discovers various gadgets for quantum information processing, such as quantum teleportation. This approach for analyzing the policy of a learned agent is agent and environment agnostic and can yield interesting insights into any agent’s policy.","PeriodicalId":33757,"journal":{"name":"Machine Learning Science and Technology","volume":" ","pages":""},"PeriodicalIF":6.3000,"publicationDate":"2023-08-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Automated gadget discovery in the quantum domain\",\"authors\":\"Lea M. Trenkwalder, Andrea López-Incera, Hendrik Poulsen Nautrup, Fulvio Flamini, H. Briegel\",\"doi\":\"10.1088/2632-2153/acf098\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"In recent years, reinforcement learning (RL) has become increasingly successful in its application to the quantum domain and the process of scientific discovery in general. However, while RL algorithms learn to solve increasingly complex problems, interpreting the solutions they provide becomes ever more challenging. In this work, we gain insights into an RL agent’s learned behavior through a post-hoc analysis based on sequence mining and clustering. Specifically, frequent and compact subroutines, used by the agent to solve a given task, are distilled as gadgets and then grouped by various metrics. This process of gadget discovery develops in three stages: First, we use an RL agent to generate data, then, we employ a mining algorithm to extract gadgets and finally, the obtained gadgets are grouped by a density-based clustering algorithm. We demonstrate our method by applying it to two quantum-inspired RL environments. First, we consider simulated quantum optics experiments for the design of high-dimensional multipartite entangled states where the algorithm finds gadgets that correspond to modern interferometer setups. Second, we consider a circuit-based quantum computing environment where the algorithm discovers various gadgets for quantum information processing, such as quantum teleportation. This approach for analyzing the policy of a learned agent is agent and environment agnostic and can yield interesting insights into any agent’s policy.\",\"PeriodicalId\":33757,\"journal\":{\"name\":\"Machine Learning Science and Technology\",\"volume\":\" \",\"pages\":\"\"},\"PeriodicalIF\":6.3000,\"publicationDate\":\"2023-08-15\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Machine Learning Science and Technology\",\"FirstCategoryId\":\"101\",\"ListUrlMain\":\"https://doi.org/10.1088/2632-2153/acf098\",\"RegionNum\":2,\"RegionCategory\":\"物理与天体物理\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Machine Learning Science and Technology","FirstCategoryId":"101","ListUrlMain":"https://doi.org/10.1088/2632-2153/acf098","RegionNum":2,"RegionCategory":"物理与天体物理","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}

引用次数: 0

摘要

近年来，强化学习(RL)在量子领域和科学发现过程中的应用越来越成功。然而，当强化学习算法学会解决越来越复杂的问题时，解释它们提供的解决方案变得越来越具有挑战性。在这项工作中，我们通过基于序列挖掘和聚类的事后分析深入了解了RL代理的学习行为。具体来说，代理用于解决给定任务的频繁和紧凑的子例程被提取为小工具，然后按各种指标分组。小工具发现的过程分为三个阶段:首先使用RL代理生成数据，然后使用挖掘算法提取小工具，最后使用基于密度的聚类算法对获得的小工具进行分组。我们通过将其应用于两个量子激发RL环境来演示我们的方法。首先，我们考虑设计高维多部纠缠态的模拟量子光学实验，其中算法找到与现代干涉仪设置相对应的小部件。其次，我们考虑了一个基于电路的量子计算环境，其中算法发现了量子信息处理的各种小工具，例如量子隐形传态。这种分析学习代理策略的方法是与代理和环境无关的，可以对任何代理的策略产生有趣的见解。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

Automated gadget discovery in the quantum domain

In recent years, reinforcement learning (RL) has become increasingly successful in its application to the quantum domain and the process of scientific discovery in general. However, while RL algorithms learn to solve increasingly complex problems, interpreting the solutions they provide becomes ever more challenging. In this work, we gain insights into an RL agent’s learned behavior through a post-hoc analysis based on sequence mining and clustering. Specifically, frequent and compact subroutines, used by the agent to solve a given task, are distilled as gadgets and then grouped by various metrics. This process of gadget discovery develops in three stages: First, we use an RL agent to generate data, then, we employ a mining algorithm to extract gadgets and finally, the obtained gadgets are grouped by a density-based clustering algorithm. We demonstrate our method by applying it to two quantum-inspired RL environments. First, we consider simulated quantum optics experiments for the design of high-dimensional multipartite entangled states where the algorithm finds gadgets that correspond to modern interferometer setups. Second, we consider a circuit-based quantum computing environment where the algorithm discovers various gadgets for quantum information processing, such as quantum teleportation. This approach for analyzing the policy of a learned agent is agent and environment agnostic and can yield interesting insights into any agent’s policy.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

Machine Learning Science and Technology Computer Science-Artificial Intelligence

CiteScore

9.10

自引率

4.40%

发文量

审稿时长

5 weeks

期刊介绍： Machine Learning Science and Technology is a multidisciplinary open access journal that bridges the application of machine learning across the sciences with advances in machine learning methods and theory as motivated by physical insights. Specifically, articles must fall into one of the following categories: advance the state of machine learning-driven applications in the sciences or make conceptual, methodological or theoretical advances in machine learning with applications to, inspiration from, or motivated by scientific problems.