{"title":"人工智能工程可证明目标可长期保持","authors":"Adeniyi Fasoro","doi":"10.1002/aaai.12167","DOIUrl":null,"url":null,"abstract":"<p>I argue that ensuring artificial intelligence (AI) retains alignment with human values over time is critical yet understudied. Most research focuses on static alignment, neglecting crucial retention dynamics enabling stability during learning and autonomy. This paper elucidates limitations constraining provable retention, arguing key gaps include formalizing dynamics, transparency of advanced systems, participatory scaling, and risks of uncontrolled recursive self-improvement. I synthesize technical and ethical perspectives into a conceptual framework grounded in control theory and philosophy to analyze dynamics. I argue priorities should shift towards capability modulation, participatory design, and advanced modeling to verify enduring alignment. Overall, I argue that realizing AI safely aligned throughout its lifetime necessitates translating principles into formal methods, demonstrations, and systems integrating technical and humanistic rigor.</p>","PeriodicalId":7854,"journal":{"name":"Ai Magazine","volume":"45 2","pages":"256-266"},"PeriodicalIF":2.5000,"publicationDate":"2024-03-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1002/aaai.12167","citationCount":"0","resultStr":"{\"title\":\"Engineering AI for provable retention of objectives over time\",\"authors\":\"Adeniyi Fasoro\",\"doi\":\"10.1002/aaai.12167\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p>I argue that ensuring artificial intelligence (AI) retains alignment with human values over time is critical yet understudied. Most research focuses on static alignment, neglecting crucial retention dynamics enabling stability during learning and autonomy. 
This paper elucidates limitations constraining provable retention, arguing key gaps include formalizing dynamics, transparency of advanced systems, participatory scaling, and risks of uncontrolled recursive self-improvement. I synthesize technical and ethical perspectives into a conceptual framework grounded in control theory and philosophy to analyze dynamics. I argue priorities should shift towards capability modulation, participatory design, and advanced modeling to verify enduring alignment. Overall, I argue that realizing AI safely aligned throughout its lifetime necessitates translating principles into formal methods, demonstrations, and systems integrating technical and humanistic rigor.</p>\",\"PeriodicalId\":7854,\"journal\":{\"name\":\"Ai Magazine\",\"volume\":\"45 2\",\"pages\":\"256-266\"},\"PeriodicalIF\":2.5000,\"publicationDate\":\"2024-03-23\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://onlinelibrary.wiley.com/doi/epdf/10.1002/aaai.12167\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Ai Magazine\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://onlinelibrary.wiley.com/doi/10.1002/aaai.12167\",\"RegionNum\":4,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q3\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Ai Magazine","FirstCategoryId":"94","ListUrlMain":"https://onlinelibrary.wiley.com/doi/10.1002/aaai.12167","RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
Engineering AI for provable retention of objectives over time
I argue that ensuring artificial intelligence (AI) retains alignment with human values over time is critical yet understudied. Most research focuses on static alignment at deployment, neglecting the retention dynamics that keep objectives stable during continued learning and autonomous operation. This paper examines the limitations that currently prevent provable retention, arguing that the key gaps are formalizing retention dynamics, achieving transparency in advanced systems, scaling participatory oversight, and containing the risks of uncontrolled recursive self-improvement. I synthesize technical and ethical perspectives into a conceptual framework, grounded in control theory and philosophy, for analyzing these dynamics. I argue that priorities should shift toward capability modulation, participatory design, and advanced modeling to verify enduring alignment. Overall, realizing AI that remains safely aligned throughout its lifetime requires translating principles into formal methods, demonstrations, and systems that integrate technical and humanistic rigor.
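The abstract frames objective retention as a control problem and names "capability modulation" as a priority. The sketch below is entirely an illustration of that framing, not anything from the paper (which is conceptual and gives no algorithm): it treats the agent's objective as a vector perturbed by noisy learning updates, and modulates each update so that alignment with a fixed reference objective, measured by cosine similarity, never drops below a threshold. The reference vector, threshold, drift model, and bisection step are all assumptions chosen for demonstration.

```python
import math
import random

random.seed(0)

REFERENCE = [1.0, 0.0, 0.0]   # assumed fixed "human-value" direction
THRESHOLD = 0.95              # assumed minimum allowed cosine similarity


def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))


def retained_update(obj, delta, threshold=THRESHOLD):
    """Scale back a learning update so alignment with REFERENCE never
    falls below the threshold -- a crude analogue of 'capability
    modulation' expressed as a control constraint on objective drift."""
    step = lambda t: [o + t * d for o, d in zip(obj, delta)]
    if cosine(step(1.0), REFERENCE) >= threshold:
        return step(1.0)              # full update is safe
    lo, hi = 0.0, 1.0                 # bisect on step size; t = 0 is
    for _ in range(40):               # safe by the loop invariant
        mid = (lo + hi) / 2
        if cosine(step(mid), REFERENCE) >= threshold:
            lo = mid
        else:
            hi = mid
    return step(lo)


# Simulate many noisy learning updates; the constraint holds throughout.
objective = list(REFERENCE)
for _ in range(1000):
    drift = [random.gauss(0.0, 0.05) for _ in range(3)]
    objective = retained_update(objective, drift)

print(cosine(objective, REFERENCE))   # >= 0.95 by construction
```

The design choice worth noting is that retention here is an invariant maintained at every update, not a property checked once after training, which mirrors the abstract's distinction between static alignment and retention dynamics.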
Journal introduction:
AI Magazine publishes original articles that are reasonably self-contained and aimed at a broad spectrum of the AI community. Technical content should be kept to a minimum. In general, the magazine does not publish articles that have been published elsewhere in whole or in part. The magazine welcomes the contribution of articles on the theory and practice of AI as well as general survey articles, tutorial articles on timely topics, conference or symposia or workshop reports, and timely columns on topics of interest to AI scientists.