ArXiv Latest Papers, Page 3

ArXiv Pub Date : 2024-03-11 DOI: 10.1609/aaai.v38i5.28314

Shuai Tan, Bin Ji, Yu Ding, Ye Pan

Title: Say Anything with Any Style

Abstract: Generating stylized talking heads with diverse head motions is crucial for achieving natural-looking videos but still remains challenging. Previous works either adopt a regressive method to capture the speaking style, resulting in a coarse style that is averaged across all training data, or employ a universal network to synthesize videos with different styles, which causes suboptimal performance. To address these, we propose a novel dynamic-weight method, namely Say Anything with Any Style (SAAS), which queries the discrete style representation via a generative model with a learned style codebook. Specifically, we develop a multi-task VQ-VAE that incorporates three closely related tasks to learn a style codebook as a prior for style extraction. This discrete prior, along with the generative model, enhances the precision and robustness when extracting the speaking styles of the given style clips. By utilizing the extracted style, a residual architecture comprising a canonical branch and a style-specific branch is employed to predict the mouth shapes conditioned on any driving audio while transferring the speaking style from the source to any desired one. To adapt to different speaking styles, we steer clear of employing a universal network by exploring an elaborate HyperStyle to produce the style-specific weight offsets for the style branch. Furthermore, we construct a pose generator and a pose codebook to store the quantized pose representation, allowing us to sample diverse head motions aligned with the audio and the extracted style. Experiments demonstrate that our approach surpasses state-of-the-art methods in terms of both lip-synchronization and stylized expression. Besides, we extend SAAS to the video-driven style editing field and achieve satisfactory performance as well.
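Querying a discrete representation from a learned codebook, as the abstract describes, can be pictured as a nearest-neighbour lookup. The sketch below illustrates generic vector quantization only and is not the SAAS implementation: the `quantize` helper, the 2-D features, and the codebook values are all hypothetical.

```python
import math


def quantize(feature, codebook):
    """Return (index, entry) of the codebook entry closest to `feature`.

    Closeness is Euclidean distance, the usual choice in VQ-style models.
    """
    best_index = min(
        range(len(codebook)),
        key=lambda i: math.dist(feature, codebook[i]),
    )
    return best_index, codebook[best_index]


# Hypothetical 2-D "style" codebook with three entries.
style_codebook = [(0.0, 0.0), (1.0, 1.0), (-1.0, 0.5)]

# A continuous feature is replaced by its nearest discrete entry.
index, entry = quantize((0.9, 0.8), style_codebook)
print(index, entry)
```

In a trained model the codebook entries are learned jointly with the encoder; here they are fixed by hand so that the lookup step can be read in isolation.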

Citations: 2

Born to Run, Programmed to Play: Mapping the Extended Reality Exergames Landscape

ArXiv Pub Date : 2024-03-11 DOI: 10.1145/3613904.3642124

Sukran Karaosmanoglu, S. Cmentowski, Lennart E. Nacke, Frank Steinicke

Citations: 0

What Makes Quantization for Large Language Model Hard? An Empirical Study from the Lens of Perturbation

ArXiv Pub Date : 2024-03-11 DOI: 10.1609/aaai.v38i16.29765

Zhuocheng Gong, Jiahao Liu, Jingang Wang, Xunliang Cai, Dongyan Zhao, Rui Yan

Abstract: Quantization has emerged as a promising technique for improving the memory and computational efficiency of large language models (LLMs). Though the trade-off between performance and efficiency is well-known, there is still much to be learned about the relationship between quantization and LLM performance. To shed light on this relationship, we propose a new perspective on quantization, viewing it as perturbations added to the weights and activations of LLMs. We call this approach "the lens of perturbation". Using this lens, we conduct experiments with various artificial perturbations to explore their impact on LLM performance. Our findings reveal several connections between the properties of perturbations and LLM performance, providing insights into the failure cases of uniform quantization and suggesting potential solutions to improve the robustness of LLM quantization. To demonstrate the significance of our findings, we implement a simple non-uniform quantization approach based on our insights. Our experiments show that this approach achieves minimal performance degradation on both 4-bit weight quantization and 8-bit quantization for weights and activations. These results validate the correctness of our approach and highlight its potential to improve the efficiency of LLMs without sacrificing performance.
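Viewing quantization as a perturbation amounts to writing the quantized weights as W_q = W + Δ and studying Δ. The following is a minimal sketch, assuming plain symmetric round-to-nearest uniform quantization, which the abstract does not specify. The helpers `uniform_quantize` and `perturbation` and the weight values are hypothetical and do not reproduce the paper's experiments.

```python
def uniform_quantize(weights, bits):
    """Symmetric uniform quantization onto signed `bits`-bit levels."""
    max_abs = max(abs(w) for w in weights)
    if max_abs == 0.0:
        return list(weights)
    levels = 2 ** (bits - 1) - 1      # e.g. 7 positive levels for 4 bits
    scale = max_abs / levels          # step size between adjacent levels
    return [round(w / scale) * scale for w in weights]


def perturbation(weights, bits):
    """Return delta such that quantized = weights + delta, element-wise."""
    quantized = uniform_quantize(weights, bits)
    return [q - w for q, w in zip(quantized, weights)]


# Hypothetical weight vector; the last value is much larger than the rest.
weights = [0.02, -0.15, 0.4, -0.9, 7.0]

delta4 = perturbation(weights, 4)
delta8 = perturbation(weights, 8)
print(max(abs(d) for d in delta4), max(abs(d) for d in delta8))
```

With round-to-nearest, each element of Δ is bounded by half the step size, so the perturbation shrinks as the bit width grows.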

Citations: 0

Thought Graph: Generating Thought Process for Biological Reasoning

ArXiv Pub Date : 2024-03-11 DOI: 10.1145/3589335.3651572

Chi-Yang Hsu, Kyle Cox, Jiawei Xu, Zhen Tan, Tianhua Zhai, Mengzhou Hu, Dexter Pratt, Tianlong Chen, Ziniu Hu, Ying Ding

Citations: 0

Deriving Dependently-Typed OOP from First Principles - Extended Version with Additional Appendices

ArXiv Pub Date : 2024-03-11 DOI: 10.1145/3649846

David Binder, Ingo Skupin, Tim Süberkrüb, Klaus Ostermann

Abstract: The expression problem describes how most types can easily be extended with new ways to produce the type or new ways to consume the type, but not both. When abstract syntax trees are defined as an algebraic data type, for example, they can easily be extended with new consumers, such as print or eval, but adding a new constructor requires the modification of all existing pattern matches. The expression problem is one way to elucidate the difference between functional or data-oriented programs (easily extendable by new consumers) and object-oriented programs (easily extendable by new producers). This difference between programs which are extensible by new producers or new consumers also exists for dependently typed programming, but with one core difference: Dependently-typed programming almost exclusively follows the functional programming model and not the object-oriented model, which leaves an interesting space in the programming language landscape unexplored. In this paper, we explore the field of dependently-typed object-oriented programming by deriving it from first principles using the principle of duality. That is, we do not extend an existing object-oriented formalism with dependent types in an ad-hoc fashion, but instead start from a familiar data-oriented language and derive its dual fragment by the systematic use of defunctionalization and refunctionalization. Our central contribution is a dependently typed calculus which contains two dual language fragments. We provide type- and semantics-preserving transformations between these two language fragments: defunctionalization and refunctionalization. We have implemented this language and these transformations and use this implementation to explain the various ways in which constructions in dependently typed programming can be explained as special instances of the phenomenon of duality.
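The two dual decompositions behind the expression problem can be written side by side. This is an untyped Python sketch with hypothetical names (`Lit`, `Add`, `eval_data`, `LitObj`, `AddObj`); it shows neither dependent types nor the paper's calculus, only the data-oriented and object-oriented views of the same tiny expression language.

```python
from dataclasses import dataclass


# Data-oriented fragment: constructors are fixed, consumers are open.
@dataclass(frozen=True)
class Lit:
    value: int


@dataclass(frozen=True)
class Add:
    left: object
    right: object


def eval_data(expr):
    """Consumer written as one function that matches on every constructor."""
    if isinstance(expr, Lit):
        return expr.value
    if isinstance(expr, Add):
        return eval_data(expr.left) + eval_data(expr.right)
    raise TypeError(f"unknown constructor: {expr!r}")


def show_data(expr):
    """A second consumer, added without touching Lit or Add."""
    if isinstance(expr, Lit):
        return str(expr.value)
    if isinstance(expr, Add):
        return f"({show_data(expr.left)} + {show_data(expr.right)})"
    raise TypeError(f"unknown constructor: {expr!r}")


# Object-oriented fragment: observations are fixed, producers are open.
class LitObj:
    def __init__(self, value):
        self.value = value

    def eval(self):
        return self.value

    def show(self):
        return str(self.value)


class AddObj:
    def __init__(self, left, right):
        self.left = left
        self.right = right

    def eval(self):
        return self.left.eval() + self.right.eval()

    def show(self):
        return f"({self.left.show()} + {self.right.show()})"


data_expr = Add(Lit(1), Add(Lit(2), Lit(3)))
obj_expr = AddObj(LitObj(1), AddObj(LitObj(2), LitObj(3)))
print(eval_data(data_expr), obj_expr.eval())
print(show_data(data_expr), obj_expr.show())
```

Each `isinstance` branch in the first half corresponds to one method body in the second half. Moving the code from one arrangement to the other is the kind of systematic regrouping that defunctionalization and refunctionalization perform.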

Citations: 0

Data Cubes in Hand: A Design Space of Tangible Cubes for Visualizing 3D Spatio-Temporal Data in Mixed Reality

ArXiv Pub Date : 2024-03-11 DOI: 10.1145/3613904.3642740

Shuqi He, Haonan Yao, Luyan Jiang, Kaiwen Li, Nan Xiang, Yue Li, Hai-Ning Liang, Lingyun Yu

Abstract: Tangible interfaces in mixed reality (MR) environments allow for intuitive data interactions. Tangible cubes, with their rich interaction affordances, high maneuverability, and stable structure, are particularly well-suited for exploring multi-dimensional data types. However, the design potential of these cubes is underexplored. This study introduces a design space for tangible cubes in MR, focusing on interaction space, visualization space, sizes, and multiplicity. Using spatio-temporal data, we explored the interaction affordances of these cubes in a workshop (N=24). We identified unique interactions like rotating, tapping, and stacking, which are linked to augmented reality (AR) visualization commands. Integrating user-identified interactions, we created a design space for tangible-cube interactions and visualization. A prototype visualizing global health spending with small cubes was developed and evaluated, supporting both individual and combined cube manipulation. This research enhances our grasp of tangible interaction in MR, offering insights for future design and application in diverse data contexts.

Citations: 0

Transparent AI Disclosure Obligations: Who, What, When, Where, Why, How

ArXiv Pub Date : 2024-03-11 DOI: 10.1145/3613905.3650750

Abdallah El Ali, Karthikeya Puttur Venkatraj, Sophie Morosoli, Laurens Naudts, Natali Helberger, Pablo César

Citations: 0

MAP-Elites with Transverse Assessment for Multimodal Problems in Creative Domains

ArXiv Pub Date : 2024-03-11 DOI: 10.1007/978-3-031-56992-0_26

Marvin Zammit, Antonios Liapis, Georgios N. Yannakakis

Citations: 0

Chart4Blind: An Intelligent Interface for Chart Accessibility Conversion

ArXiv Pub Date : 2024-03-11 DOI: 10.1145/3640543.3645175

Omar Moured, Morris Baumgarten-Egemole, Alina Roitberg, Karin Müller, Thorsten Schwarz, Rainer Stiefelhagen

Abstract: In a world driven by data visualization, ensuring the inclusive accessibility of charts for Blind and Visually Impaired (BVI) individuals remains a significant challenge. Charts are usually presented as raster graphics without the textual and visual metadata needed for an equivalent exploration experience for BVI people. Additionally, converting these charts into accessible formats requires considerable effort from sighted individuals. Digitizing charts with metadata extraction is just one aspect of the issue; transforming them into accessible modalities, such as tactile graphics, presents another difficulty. To address these disparities, we propose Chart4Blind, an intelligent user interface that converts bitmap image representations of line charts into universally accessible formats. Chart4Blind achieves this transformation by generating Scalable Vector Graphics (SVG), Comma-Separated Values (CSV), and alternative text exports, all of which comply with established accessibility standards. Through interviews and a formal user study, we demonstrate that even inexperienced sighted users can make charts accessible in an average of 4 minutes using Chart4Blind, achieving a System Usability Scale rating of 90%. In comparison to existing approaches, Chart4Blind provides a comprehensive solution, generating end-to-end accessible SVGs suitable for assistive technologies such as embossed prints (papers and laser cut), 2D tactile displays, and screen readers. For additional information, including open-source code and demos, please visit our project page https://moured.github.io/chart4blind/.

Citations: 0

SoniWeight Shoes: Investigating Effects and Personalization of a Wearable Sound Device for Altering Body Perception and Behavior

ArXiv Pub Date : 2024-03-11 DOI: 10.1145/3613904.3642651

A. D'Adamo, M. Roel-Lesur, L. Turmo-Vidal, Mohammad Mahdi Dehshibi, D. D. L. Prida, J. R. Díaz-Durán, L. A. Azpicueta-Ruiz, A. Valjamae

Affiliations (as extracted): A. T. I. Lab, Dei Interactive Systems Group, Department of Computer Science, Engineering, U. C. I. Madrid, Madrid, Spain; Department of Quantitative Theory, Communications; Johan Skytte Institute of Political Studies, University of Tartu, Tartu, Estonia; UCL Interaction Centre, University College London, London, United Kingdom.

Abstract: Changes in body perception influence behavior and emotion and can be induced through multisensory feedback. Auditory feedback to one's actions can trigger such alterations; however, it is unclear which individual factors modulate these effects. We employ and evaluate SoniWeight Shoes, a wearable device based on literature for altering one's weight perception through manipulated footstep sounds. In a healthy population sample across a spectrum of individuals (n=84) with varying degrees of eating disorder symptomatology, physical activity levels, body concerns, and mental imagery capacities, we explore the effects of three sound conditions (low-frequency, high-frequency and control) on extensive body perception measures (demographic, behavioral, physiological, psychological, and subjective). Analyses revealed an impact of individual differences in each of these dimensions. Besides replicating previous findings, we reveal and highlight the role of individual differences in body perception, offering avenues for personalized sonification strategies. Datasets, technical refinements, and novel body map quantification tools are provided.

Citations: 0