arXiv - CS - Emerging Technologies: Latest Papers

Decentralized Intelligence Health Network (DIHN)
arXiv - CS - Emerging Technologies Pub Date : 2024-08-12 DOI: arxiv-2408.06240
Abraham Nash
Abstract: Decentralized Intelligence Health Network (DIHN) is a theoretical framework addressing significant challenges of health data sovereignty and AI utilization in healthcare caused by data fragmentation across providers and institutions. It establishes a sovereign architecture for healthcare provision as a prerequisite to a sovereign health network, then facilitates effective AI utilization by overcoming barriers to accessing diverse medical data sources. This comprehensive framework leverages: 1) self-sovereign identity architecture coupled with a personal health record (PHR) as a prerequisite for health data sovereignty; 2) a scalable federated learning (FL) protocol implemented on a public blockchain for decentralized AI training in healthcare, where health data remains with participants and only model parameter updates are shared; and 3) a scalable, trustless rewards mechanism to incentivize participation and ensure fair reward distribution. This framework ensures that no entity can prevent or control access to training on health data offered by participants or determine financial benefits, as these processes operate on a public blockchain with an immutable record and without a third party. It supports effective AI training in healthcare, allowing patients to maintain control over their health data, benefit financially, and contribute to a decentralized, scalable ecosystem that leverages collective AI to develop beneficial healthcare algorithms. Patients receive rewards into their digital wallets as an incentive to opt in to the FL protocol, with a long-term roadmap to funding decentralized insurance solutions. This approach introduces a novel, self-financed healthcare model that adapts to individual needs, complements existing systems, and redefines universal coverage. It highlights the potential to transform healthcare data management and AI utilization while empowering patients.
Citations: 0
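The abstract above describes federated learning in which health data stays with participants and only model parameter updates are shared. A minimal sketch of that idea follows, assuming a FedAvg-style weighted average as the aggregation rule; the abstract does not say which rule DIHN uses, and the function name `federated_average` and its arguments are hypothetical. The blockchain and rewards layers are not modelled.

```python
from typing import List, Sequence


def federated_average(
    updates: Sequence[Sequence[float]],
    num_samples: Sequence[int],
) -> List[float]:
    """Combine per-participant parameter vectors into one global vector.

    Assumption: FedAvg-style weighting by local sample count. Raw health
    records never appear here; only parameter vectors are exchanged.
    """
    if not updates or len(updates) != len(num_samples):
        raise ValueError("need exactly one sample count per update")
    total = sum(num_samples)
    if total <= 0:
        raise ValueError("total sample count must be positive")

    dim = len(updates[0])
    aggregated = [0.0] * dim
    for params, n in zip(updates, num_samples):
        if len(params) != dim:
            raise ValueError("all updates must have the same length")
        for i, value in enumerate(params):
            # Each participant contributes in proportion to its data size.
            aggregated[i] += value * n / total
    return aggregated


if __name__ == "__main__":
    # Two hypothetical participants holding 1 and 3 local samples.
    print(federated_average([[1.0, 2.0], [3.0, 4.0]], [1, 3]))
```

Weighting by sample count is one common choice; an unweighted mean or a robust aggregator could be substituted without changing the data-stays-local property.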
Comparative Evaluation of Memory Technologies for Synaptic Crossbar Arrays- Part 2: Design Knobs and DNN Accuracy Trends
arXiv - CS - Emerging Technologies Pub Date : 2024-08-11 DOI: arxiv-2408.05857
Jeffry Victor, Chunguang Wang, Sumeet K. Gupta
Abstract: Crossbar memory arrays have been touted as the workhorse of in-memory computing (IMC)-based acceleration of Deep Neural Networks (DNNs), but the associated hardware non-idealities limit their efficacy. To address this, cross-layer design solutions that reduce the impact of hardware non-idealities on DNN accuracy are needed. In Part 1 of this paper, we established the co-optimization strategies for various memory technologies and their crossbar arrays, and conducted a comparative technology evaluation in the context of IMC robustness. In this part, we analyze various design knobs such as array size and bit-slice (number of bits per device) and their impact on the performance of 8T SRAM, ferroelectric transistor (FeFET), Resistive RAM (ReRAM) and spin-orbit-torque magnetic RAM (SOT-MRAM) in the context of inference accuracy at the 7 nm technology node. Further, we study the effect of circuit design solutions such as Partial Wordline Activation (PWA) and custom ADC reference levels that reduce the hardware non-idealities, and comparatively analyze the response of each technology to such accuracy-enhancing techniques. Our results on ResNet-20 (with CIFAR-10) show that PWA increases accuracy by up to 32.56% while custom ADC reference levels yield up to 31.62% accuracy enhancement. We observe that compared to the other technologies, FeFET, by virtue of its small layout height and high distinguishability of its memory states, is best suited for large arrays. For higher bit-slices and a more complex dataset (ResNet-50 with CIFAR-100) we found that ReRAM matches the performance of FeFET.
Citations: 0
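The "bit-slice" design knob in the abstract above is the number of bits stored per memory device. As a rough illustration, the sketch below splits an unsigned integer weight into per-device slices and recombines them. The helper names `slice_weight` and `combine_slices` are hypothetical, and the paper's actual weight encoding (signed values, device non-idealities) is not modelled.

```python
from typing import List, Sequence


def slice_weight(weight: int, total_bits: int, bits_per_device: int) -> List[int]:
    """Split an unsigned weight into slices, least significant slice first."""
    if bits_per_device <= 0 or total_bits % bits_per_device != 0:
        raise ValueError("total_bits must be a positive multiple of bits_per_device")
    if not 0 <= weight < (1 << total_bits):
        raise ValueError("weight does not fit in total_bits")
    mask = (1 << bits_per_device) - 1
    return [
        (weight >> shift) & mask
        for shift in range(0, total_bits, bits_per_device)
    ]


def combine_slices(slices: Sequence[int], bits_per_device: int) -> int:
    """Recombine slices by shifting each one back to its bit position."""
    return sum(s << (i * bits_per_device) for i, s in enumerate(slices))


if __name__ == "__main__":
    w = 0b10110110  # 182 as an 8-bit weight
    slices = slice_weight(w, total_bits=8, bits_per_device=2)
    print(slices)                     # four 2-bit devices
    print(combine_slices(slices, 2))  # recovers the original weight
```

A larger bit-slice means fewer devices per weight but more levels each device must distinguish, which is the trade-off the abstract evaluates across technologies.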
A Collaborative PIM Computing Optimization Framework for Multi-Tenant DNN
arXiv - CS - Emerging Technologies Pub Date : 2024-08-09 DOI: arxiv-2408.04812
Bojing Li, Duo Zhong, Xiang Chen, Chenchen Liu
Abstract: Modern Artificial Intelligence (AI) applications are increasingly utilizing multi-tenant deep neural networks (DNNs), which lead to a significant rise in computing complexity and the need for computing parallelism. ReRAM-based processing-in-memory (PIM) computing, with its high density and low power consumption characteristics, holds promising potential for supporting the deployment of multi-tenant DNNs. However, direct deployment of complex multi-tenant DNNs on existing ReRAM-based PIM designs poses challenges. Resource contention among different tenants can result in severe under-utilization of on-chip computing resources. Moreover, area-intensive operators and computation-intensive operators require excessively large on-chip areas and long processing times, leading to high overall latency during parallel computing. To address these challenges, we propose a novel ReRAM-based in-memory computing framework that enables efficient deployment of multi-tenant DNNs on ReRAM-based PIM designs. Our approach tackles the resource contention problems by iteratively partitioning the PIM hardware at the tenant level. In addition, we construct a fine-grained reconstructed processing pipeline at the operator level to handle area-intensive operators. Compared to direct deployments on traditional ReRAM-based PIM designs, our proposed PIM computing framework achieves significant improvements in speed (ranging from 1.75x to 60.43x) and energy (up to 1.89x).
Citations: 0
Conceptual Design and Implementation of FIDO2 compatible Smart Card for Decentralized Financial Transaction System
arXiv - CS - Emerging Technologies Pub Date : 2024-08-09 DOI: arxiv-2408.04977
Anisha Ghosh, Aditya Mitra, Sibi Chakkaravarthy Sethuraman, Aswani Kumar Cherukuri
Abstract: Given the security challenges and limitations in the fintech industry, the need for data protection is growing. However, existing passwordless and password-based peer-to-peer transactions in online banking systems are vulnerable to advanced forms of digital attacks. The influx of modern data protection methods keeps better records of the transactions, but it still does not address the issue of authentication and account takeovers during transactions. To address this issue, this paper proposes a novel and robust peer-to-peer transaction system which employs best cloud security practices, proper use of cryptography and trusted computing to mitigate common vulnerabilities. We implement a FIDO2-compatible smart card to securely authenticate the user with a physical smart card and store the records in the cloud, which enables access control by allowing access only when access is requested. The standard incorporates multiple layers of security on cloud computing models to ensure secrecy of the said data. Services of the standard adhere to regulations provided by the government and ensure the privacy of the information of the payee or the end user. The whole system has been implemented in an Internet of Things scenario.
Citations: 0
Quantum Key Storage for Efficient Key Management
arXiv - CS - Emerging Technologies Pub Date : 2024-08-08 DOI: arxiv-2408.04598
Emir Dervisevic, Amina Tankovic, Enio Kaljic, Miroslav Voznak, Miralem Mehic
Abstract: In the ongoing discourse surrounding integrating QKD networks as a service for critical infrastructures, key storage design often receives insufficient attention. Nonetheless, it bears crucial significance as it profoundly impacts the efficiency of QKD network services, thereby shaping its suitability for diverse applications. In this article, we analyze the effectiveness of key storage designs developed through practical testbeds and propose a novel key storage design to increase the effectiveness of key creation and supply. All key storage designs underwent analysis using network simulation tools, and the findings demonstrate that the novel key storage design surpasses existing approaches in terms of performance.
Citations: 0
A Systematic Literature Map on Big Data
arXiv - CS - Emerging Technologies Pub Date : 2024-08-08 DOI: arxiv-2408.05253
Rogerio Rossi, Kechi Hirama, Eduardo Ferreira Franco
Abstract: The paradigm of Big Data has been established as a solid field of study in many areas such as healthcare, science, transport, education, and government services, among others. Despite being widely discussed, there is no agreed definition of the paradigm, although many concepts have been proposed by academia and industry. This work aims to provide an analytical view of the studies conducted and published regarding the Big Data paradigm. The approach used is the systematic map of the literature, combining bibliometric analysis and content analysis to depict the panorama of research works, identifying patterns, trends, and gaps. The results indicate that there is still a long way to go, both in research and in concepts, such as building and defining adequate infrastructures and standards, to meet future challenges and for the paradigm to become effective and bring the expected benefits.
Citations: 0
C-Nash: A Novel Ferroelectric Computing-in-Memory Architecture for Solving Mixed Strategy Nash Equilibrium
arXiv - CS - Emerging Technologies Pub Date : 2024-08-08 DOI: arxiv-2408.04169
Yu Qian, Kai Ni, Thomas Kämpfe, Cheng Zhuo, Xunzhao Yin
Abstract: The concept of Nash equilibrium (NE), pivotal within game theory, has garnered widespread attention across numerous industries. Recent advancements introduced several quantum Nash solvers aimed at identifying pure strategy NE solutions (i.e., binary solutions) by integrating slack terms into the objective function, commonly referred to as slack-quadratic unconstrained binary optimization (S-QUBO). However, incorporation of slack terms into the quadratic optimization results in changes of the objective function, which may cause incorrect solutions. Furthermore, these quantum solvers only identify a limited subset of pure strategy NE solutions, and fail to address mixed strategy NE (i.e., decimal solutions), leaving many solutions undiscovered. In this work, we propose C-Nash, a novel ferroelectric computing-in-memory (CiM) architecture that can efficiently handle both pure and mixed strategy NE solutions. The proposed architecture consists of (i) a transformation method that converts quadratic optimization into a MAX-QUBO form without introducing additional slack variables, thereby avoiding objective function changes; (ii) a ferroelectric FET (FeFET) based bi-crossbar structure for storing payoff matrices and accelerating the core vector-matrix-vector (VMV) multiplications of QUBO form; (iii) a winner-takes-all (WTA) tree implementing the MAX form and a two-phase based simulated annealing (SA) logic for searching NE solutions. Evaluations show that C-Nash has up to 68.6% increase in the success rate for identifying NE solutions, finding all pure and mixed NE solutions rather than only a portion of pure NE solutions, compared to D-Wave based quantum approaches. Moreover, C-Nash boasts a reduction of up to 157.9X/79.0X in time-to-solution compared to D-Wave 2000 Q6 and D-Wave Advantage 4.1, respectively.
Citations: 0
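The abstract above centers on vector-matrix-vector (VMV) products over payoff matrices and on mixed strategy NE. The sketch below is only the textbook definitional check of a mixed NE in a two-player game, written with VMV products in plain Python; it is not the paper's MAX-QUBO transformation, WTA tree, or simulated annealing logic. The names `vmv` and `is_nash` and the matching-pennies payoffs are illustrative assumptions.

```python
from typing import Sequence

Matrix = Sequence[Sequence[float]]


def vmv(p: Sequence[float], m: Matrix, q: Sequence[float]) -> float:
    """Vector-matrix-vector product p^T M q (expected payoff)."""
    return sum(
        p[i] * m[i][j] * q[j]
        for i in range(len(p))
        for j in range(len(q))
    )


def is_nash(a: Matrix, b: Matrix, p: Sequence[float], q: Sequence[float],
            tol: float = 1e-9) -> bool:
    """Check whether (p, q) is a Nash equilibrium of the bimatrix game (a, b).

    a holds the row player's payoffs, b the column player's. A profile is an
    equilibrium if no pure-strategy deviation improves either player's payoff.
    """
    row_value = vmv(p, a, q)
    col_value = vmv(p, b, q)
    best_row = max(
        sum(a[i][j] * q[j] for j in range(len(q))) for i in range(len(p))
    )
    best_col = max(
        sum(p[i] * b[i][j] for i in range(len(p))) for j in range(len(q))
    )
    return best_row <= row_value + tol and best_col <= col_value + tol


if __name__ == "__main__":
    # Matching pennies: no pure NE, one mixed NE at (0.5, 0.5) for both players.
    A = [[1.0, -1.0], [-1.0, 1.0]]
    B = [[-1.0, 1.0], [1.0, -1.0]]
    print(is_nash(A, B, [0.5, 0.5], [0.5, 0.5]))
    print(is_nash(A, B, [1.0, 0.0], [1.0, 0.0]))
```

Matching pennies is a convenient test case because a solver restricted to binary (pure) strategies finds no equilibrium at all, which is the gap the abstract attributes to S-QUBO based approaches.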
Employing Vector Field Techniques on the Analysis of Memristor Cellular Nonlinear Networks Cell Dynamics
arXiv - CS - Emerging Technologies Pub Date : 2024-08-06 DOI: arxiv-2408.03260
Chandan Singh, Vasileios Ntinas, Dimitrios Prousalis, Yongmin Wang, Ahmet Samil Demirkol, Ioannis Messaris, Vikas Rana, Stephan Menzel, Alon Ascoli, Ronald Tetzlaff
Abstract: This paper introduces an innovative graphical analysis tool for investigating the dynamics of Memristor Cellular Nonlinear Networks (M-CNNs) featuring 2nd-order processing elements, known as M-CNN cells. In the era of specialized hardware catering to the demands of intelligent autonomous systems, the integration of memristors within Cellular Nonlinear Networks (CNNs) has emerged as a promising paradigm due to their exceptional characteristics. However, the standard Dynamic Route Map (DRM) analysis, applicable to 1st-order systems, fails to address the intricacies of 2nd-order M-CNN cell dynamics, and the 2nd-order DRM (DRM2) is limited in its graphical illustration of local dynamical properties of the M-CNN cells, e.g. the magnitude of the state derivative. To address this limitation, we propose a novel integration of the M-CNN cell vector field into the cell's phase portrait, enhancing the analysis efficacy and enabling efficient M-CNN cell design. A comprehensive exploration of M-CNN cell dynamics is presented, showcasing the utility of the proposed graphical tool for various scenarios, including bistable and monostable behavior, and demonstrating its superior ability to reveal subtle variations in cell behavior. Through this work, we offer a refined perspective on the analysis and design of M-CNNs, paving the way for advanced applications in edge computing and specialized hardware.
Citations: 0
Narrowband-IoT (NB-IoT) and IoT Use Cases in Universities, Campuses, and Educational Institutions: A Research Analysis
arXiv - CS - Emerging Technologies Pub Date : 2024-08-06 DOI: arxiv-2408.03157
Lyberius Ennio F. Taruc, Arvin R. De La Cruz
Abstract: The main objective of this research paper is to analyze the available use cases of Narrowband-IoT and IoT in universities, campuses, and educational institutions. A literature review was conducted using multiple databases such as IEEE Xplore, ACM Digital Library, and Scopus. The study explores the benefits of IoT adoption in higher education. Various use cases of NB-IoT in educational institutions were analyzed, including smart campus management, asset tracking, monitoring, and safety and security systems. Of the six use cases assessed, three focused on the deployment of IoT Things, while three focused on NB-IoT Connectivity. The research paper concludes that NB-IoT technology has significant potential to enhance various aspects of educational institutions, from smart campus management to improving safety and security systems. The study recommends further exploration and implementation of NB-IoT technology in educational settings to improve efficiency, security, and overall campus management. The research highlights the potential applications of NB-IoT in universities and educational institutions, paving the way for future studies in this area. The social implications of this research could involve enhancing the overall learning experience for students, improving campus safety, and promoting technological advancements in educational settings. Keywords: narrowband-IoT, Internet-of-Things, smart campus, smart institutions
Citations: 0
Large Model Strategic Thinking, Small Model Efficiency: Transferring Theory of Mind in Large Language Models
arXiv - CS - Emerging Technologies Pub Date : 2024-08-05 DOI: arxiv-2408.05241
Nunzio Lore, Alireza (Sepehr) Ilami, Babak Heydari
Abstract: As the performance of larger, newer Large Language Models continues to improve for strategic Theory of Mind (ToM) tasks, the demand for these state-of-the-art models increases commensurately. However, their deployment is costly both in terms of processing power and time. In this paper, we investigate the feasibility of creating smaller, simulation-ready agents by way of fine-tuning. To do this, we present a large pre-trained model with 20 unique scenarios that combine a social context with a social dilemma, record its answers, and use them for Q&A fine-tuning on a smaller model of the same family. Our focus is on in-context game-theoretic decision-making, the same domain within which human interaction occurs and that requires both a theory of mind (or a semblance thereof) and an understanding of social dynamics. We find that the fine-tuned smaller language model performed significantly closer to its larger relative, and that its improvements extended to areas and contexts beyond the ones provided in the training examples. On average for all games, through fine-tuning, the smaller model showed a 46% improvement in aligning with the behavior of the larger model, with 100% representing complete alignment. This suggests that our pipeline represents an efficient method to transmit some form of theory of mind to smaller models, creating improved and cheaply deployable algorithms in the process. Despite their simplicity and their associated shortcomings and limitations, our findings represent a stepping stone in the pursuit and training of specialized models for strategic and social decision making.
Citations: 0