ArXivPub Date : 2024-03-10DOI: 10.1609/aaai.v38i15.29675
Pedro Zuidberg Dos Martires
{"title":"Probabilistic Neural Circuits","authors":"Pedro Zuidberg Dos Martires","doi":"10.1609/aaai.v38i15.29675","DOIUrl":"https://doi.org/10.1609/aaai.v38i15.29675","url":null,"abstract":"Probabilistic circuits (PCs) have gained prominence in recent years as a versatile framework for discussing probabilistic models that support tractable queries and are yet expressive enough to model complex probability distributions. Nevertheless, tractability comes at a cost: PCs are less expressive than neural networks. In this paper we introduce probabilistic neural circuits (PNCs), which strike a balance between PCs and neural nets in terms of tractability and expressive power. Theoretically, we show that PNCs can be interpreted as deep mixtures of Bayesian networks. Experimentally, we demonstrate that PNCs constitute powerful function approximators.","PeriodicalId":513202,"journal":{"name":"ArXiv","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-03-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140396453","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
ArXivPub Date : 2024-03-09DOI: 10.1109/icassp48485.2024.10448469
Qiaochu Huang, Xu He, Boshi Tang, Hao-Wen Zhuang, Liyang Chen, Shuochen Gao, Zhiyong Wu, Haozhi Huang, Helen M. Meng
{"title":"Enhancing Expressiveness in Dance Generation via Integrating Frequency and Music Style Information","authors":"Qiaochu Huang, Xu He, Boshi Tang, Hao-Wen Zhuang, Liyang Chen, Shuochen Gao, Zhiyong Wu, Haozhi Huang, Helen M. Meng","doi":"10.1109/icassp48485.2024.10448469","DOIUrl":"https://doi.org/10.1109/icassp48485.2024.10448469","url":null,"abstract":"Dance generation, as a branch of human motion generation, has attracted increasing attention. Recently, a few works attempt to enhance dance expressiveness, which includes genre matching, beat alignment, and dance dynamics, from certain aspects. However, the enhancement is quite limited as they lack comprehensive consideration of the aforementioned three factors. In this paper, we propose ExpressiveBailando, a novel dance generation method designed to generate expressive dances, concurrently taking all three factors into account. Specifically, we mitigate the issue of speed homogenization by incorporating frequency information into VQ-VAE, thus improving dance dynamics. Additionally, we integrate music style information by extracting genre- and beat-related features with a pre-trained music model, hence achieving improvements in the other two factors. Extensive experimental results demonstrate that our proposed method can generate dances with high expressiveness and outperforms existing methods both qualitatively and quantitatively.","PeriodicalId":513202,"journal":{"name":"ArXiv","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-03-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140396462","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Jie Liu, Zhongyuan Zhao, Zijian Ding, Benjamin Brock, Hongbo Rong, Zhiru Zhang
{"title":"UniSparse: An Intermediate Language for General Sparse Format Customization","authors":"Jie Liu, Zhongyuan Zhao, Zijian Ding, Benjamin Brock, Hongbo Rong, Zhiru Zhang","doi":"10.1145/3649816","DOIUrl":"https://doi.org/10.1145/3649816","url":null,"abstract":"The ongoing trend of hardware specialization has led to a growing use of custom data formats when processing sparse workloads, which are typically memory-bound. These formats facilitate optimized software/hardware implementations by utilizing sparsity pattern- or target-aware data structures and layouts to enhance memory access latency and bandwidth utilization. However, existing sparse tensor programming models and compilers offer little or no support for productively customizing the sparse formats. Additionally, because these frameworks represent formats using a limited set of per-dimension attributes, they lack the flexibility to accommodate numerous new variations of custom sparse data structures and layouts. To overcome this deficiency, we propose UniSparse, an intermediate language that provides a unified abstraction for representing and customizing sparse formats. Unlike the existing attribute-based frameworks, UniSparse decouples the logical representation of the sparse tensor (i.e., the data structure) from its low-level memory layout, enabling the customization of both. As a result, a rich set of format customizations can be succinctly expressed in a small set of well-defined query, mutation, and layout primitives. We also develop a compiler leveraging the MLIR infrastructure, which supports adaptive customization of formats, and automatic code generation of format conversion and compute operations for heterogeneous architectures. We demonstrate the efficacy of our approach through experiments running commonly-used sparse linear algebra operations with specialized formats on multiple different hardware targets, including an Intel CPU, an NVIDIA GPU, an AMD Xilinx FPGA, and a simulated processing-in-memory (PIM) device.","PeriodicalId":513202,"journal":{"name":"ArXiv","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-03-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140396579","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
ArXivPub Date : 2024-03-09DOI: 10.1145/3613905.3650882
Jie Cai, Aashka Patel, Azadeh Naderi, D. Y. Wohn
{"title":"Content Moderation Justice and Fairness on Social Media: Comparisons Across Different Contexts and Platforms","authors":"Jie Cai, Aashka Patel, Azadeh Naderi, D. Y. Wohn","doi":"10.1145/3613905.3650882","DOIUrl":"https://doi.org/10.1145/3613905.3650882","url":null,"abstract":"Social media users may perceive moderation decisions by the platform differently, which can lead to frustration and dropout. This study investigates users' perceived justice and fairness of online moderation decisions when they are exposed to various illegal versus legal scenarios, retributive versus restorative moderation strategies, and user-moderated versus commercially moderated platforms. We conduct an online experiment on 200 American social media users of Reddit and Twitter. Results show that retributive moderation delivers higher justice and fairness for commercially moderated than for user-moderated platforms in illegal violations; restorative moderation delivers higher fairness for legal violations than illegal ones. We discuss the opportunities for platform policymaking to improve moderation system design.","PeriodicalId":513202,"journal":{"name":"ArXiv","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-03-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140396474","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
ArXivPub Date : 2024-03-09DOI: 10.1109/icassp48485.2024.10446945
Qu Yang, Qianhui Liu, Nan Li, Meng Ge, Zeyang Song, Haizhou Li
{"title":"sVAD: A Robust, Low-Power, and Light-Weight Voice Activity Detection with Spiking Neural Networks","authors":"Qu Yang, Qianhui Liu, Nan Li, Meng Ge, Zeyang Song, Haizhou Li","doi":"10.1109/icassp48485.2024.10446945","DOIUrl":"https://doi.org/10.1109/icassp48485.2024.10446945","url":null,"abstract":"Speech applications are expected to be low-power and robust under noisy conditions. An effective Voice Activity Detection (VAD) front-end lowers the computational need. Spiking Neural Networks (SNNs) are known to be biologically plausible and power-efficient. However, SNN-based VADs have yet to achieve noise robustness and often require large models for high performance. This paper introduces a novel SNN-based VAD model, referred to as sVAD, which features an auditory encoder with an SNN-based attention mechanism. Particularly, it provides effective auditory feature representation through SincNet and 1D convolution, and improves noise robustness with attention mechanisms. The classifier utilizes Spiking Recurrent Neural Networks (sRNN) to exploit temporal speech information. Experimental results demonstrate that our sVAD achieves remarkable noise robustness and meanwhile maintains low power consumption and a small footprint, making it a promising solution for real-world VAD applications.","PeriodicalId":513202,"journal":{"name":"ArXiv","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-03-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140396488","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
ArXivPub Date : 2024-03-09DOI: 10.1145/3613905.3651057
Yao Lyu, He Zhang, Shuo Niu, Jie Cai
{"title":"A Preliminary Exploration of YouTubers' Use of Generative-AI in Content Creation","authors":"Yao Lyu, He Zhang, Shuo Niu, Jie Cai","doi":"10.1145/3613905.3651057","DOIUrl":"https://doi.org/10.1145/3613905.3651057","url":null,"abstract":"Content creators increasingly utilize generative artificial intelligence (Gen-AI) on platforms such as YouTube, TikTok, Instagram, and various blogging sites to produce imaginative images, AI-generated videos, and articles using Large Language Models (LLMs). Despite its growing popularity, there remains an underexplored area concerning the specific domains where AI-generated content is being applied, and the methodologies content creators employ with Gen-AI tools during the creation process. This study initially explores this emerging area through a qualitative analysis of 68 YouTube videos demonstrating Gen-AI usage. Our research focuses on identifying the content domains, the variety of tools used, the activities performed, and the nature of the final products generated by Gen-AI in the context of user-generated content.","PeriodicalId":513202,"journal":{"name":"ArXiv","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-03-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140396324","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
ArXivPub Date : 2024-03-09DOI: 10.1145/3613904.3642631
Christopher Katins, Pawel W. Wo'zniak, Aodi Chen, Ihsan Tumay, Luu Viet Trinh Le, John Uschold, Thomas Kosch
{"title":"Assessing User Apprehensions About Mixed Reality Artifacts and Applications: The Mixed Reality Concerns (MRC) Questionnaire","authors":"Christopher Katins, Pawel W. Wo'zniak, Aodi Chen, Ihsan Tumay, Luu Viet Trinh Le, John Uschold, Thomas Kosch","doi":"10.1145/3613904.3642631","DOIUrl":"https://doi.org/10.1145/3613904.3642631","url":null,"abstract":"Current research in Mixed Reality (MR) presents a wide range of novel use cases for blending virtual elements with the real world. This yet-to-be-ubiquitous technology challenges how users currently work and interact with digital content. While offering many potential advantages, MR technologies introduce new security, safety, and privacy challenges. Thus, it is relevant to understand users' apprehensions towards MR technologies, ranging from security concerns to social acceptance. To address this challenge, we present the Mixed Reality Concerns (MRC) Questionnaire, designed to assess users' concerns towards MR artifacts and applications systematically. The development followed a structured process considering previous work, expert interviews, iterative refinements, and confirmatory tests to analytically validate the questionnaire. The MRC Questionnaire offers a new method of assessing users' critical opinions to compare and assess novel MR artifacts and applications regarding security, privacy, social implications, and trust.","PeriodicalId":513202,"journal":{"name":"ArXiv","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-03-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140396331","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
ArXivPub Date : 2024-03-09DOI: 10.1145/3614419.3644026
In-Soon Kang, Maruf Ahmed Mridul, Abraham Sanders, Yao Ma, Thilanka Munasinghe, Aparna Gupta, O. Seneviratne
{"title":"Deciphering Crypto Twitter","authors":"In-Soon Kang, Maruf Ahmed Mridul, Abraham Sanders, Yao Ma, Thilanka Munasinghe, Aparna Gupta, O. Seneviratne","doi":"10.1145/3614419.3644026","DOIUrl":"https://doi.org/10.1145/3614419.3644026","url":null,"abstract":"Cryptocurrency is a fast-moving space, with a continuous influx of new projects every year. However, an increasing number of incidents in the space, such as hacks and security breaches, threaten the growth of the community and the development of technology. This dynamic and often tumultuous landscape is vividly mirrored and shaped by discussions within Crypto Twitter, a key digital arena where investors, enthusiasts, and skeptics converge, revealing real-time sentiments and trends through social media interactions. We present our analysis on a Twitter dataset collected during a formative period of the cryptocurrency landscape. We collected 40 million tweets using cryptocurrency-related keywords and performed a nuanced analysis that involved grouping the tweets by semantic similarity and constructing a tweet and user network. We used sentence-level embeddings and autoencoders to create K-means clusters of tweets and identified six groups of tweets and their topics to examine different cryptocurrency-related interests and the change in sentiment over time. Moreover, we discovered sentiment indicators that point to real-life incidents in the crypto world, such as the FTX incident of November 2022. We also constructed and analyzed different networks of tweets and users in our dataset by considering the reply and quote relationships and analyzed the largest components of each network. Our networks reveal a structure of bot activity in Crypto Twitter and suggest that they can be detected and handled using a network-based approach. Our work sheds light on the potential of social media signals to detect and understand crypto events, benefiting investors, regulators, and curious observers alike, as well as the potential for bot detection in Crypto Twitter using a network-based approach.","PeriodicalId":513202,"journal":{"name":"ArXiv","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-03-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140396380","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
ArXivPub Date : 2024-03-09DOI: 10.1109/icassp48485.2024.10446756
Yanyi Zhang, Qi Jia, Xin Fan, Yu Liu, Ran He
{"title":"CSCNET: Class-Specified Cascaded Network for Compositional Zero-Shot Learning","authors":"Yanyi Zhang, Qi Jia, Xin Fan, Yu Liu, Ran He","doi":"10.1109/icassp48485.2024.10446756","DOIUrl":"https://doi.org/10.1109/icassp48485.2024.10446756","url":null,"abstract":"Attribute and object (A-O) disentanglement is a fundamental and critical problem for Compositional Zero-shot Learning (CZSL), whose aim is to recognize novel A-O compositions based on foregone knowledge. Existing methods based on disentangled representation learning lose sight of the contextual dependency between the A-O primitive pairs. Inspired by this, we propose a novel A-O disentangled framework for CZSL, namely Class-specified Cascaded Network (CSCNet). The key insight is to firstly classify one primitive and then specifies the predicted class as a priori for guiding another primitive recognition in a cascaded fashion. To this end, CSCNet constructs Attribute-to-Object and Object-to-Attribute cascaded branches, in addition to a composition branch modeling the two primitives as a whole. Notably, we devise a parametric classifier (ParamCls) to improve the matching between visual and semantic embeddings. By improving the A-O disentanglement, our framework achieves superior results than previous competitive methods.","PeriodicalId":513202,"journal":{"name":"ArXiv","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-03-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140396359","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
ArXivPub Date : 2024-03-08DOI: 10.1609/aaai.v38i4.28144
Chengxu Liu, Xuan Wang, Yuanting Fan, Shuai Li, Xueming Qian
{"title":"Decoupling Degradations with Recurrent Network for Video Restoration in Under-Display Camera","authors":"Chengxu Liu, Xuan Wang, Yuanting Fan, Shuai Li, Xueming Qian","doi":"10.1609/aaai.v38i4.28144","DOIUrl":"https://doi.org/10.1609/aaai.v38i4.28144","url":null,"abstract":"Under-display camera (UDC) systems are the foundation of full-screen display devices in which the lens mounts under the display. The pixel array of light-emitting diodes used for display diffracts and attenuates incident light, causing various degradations as the light intensity changes. Unlike general video restoration which recovers video by treating different degradation factors equally, video restoration for UDC systems is more challenging that concerns removing diverse degradation over time while preserving temporal consistency. In this paper, we introduce a novel video restoration network, called D2RNet, specifically designed for UDC systems. It employs a set of Decoupling Attention Modules (DAM) that effectively separate the various video degradation factors. More specifically, a soft mask generation function is proposed to formulate each frame into flare and haze based on the diffraction arising from incident light of different intensities, followed by the proposed flare and haze removal components that leverage long- and short-term feature learning to handle the respective degradations. Such a design offers an targeted and effective solution to eliminating various types of degradation in UDC systems. We further extend our design into multi-scale to overcome the scale-changing of degradation that often occur in long-range videos. To demonstrate the superiority of D2RNet, we propose a large-scale UDC video benchmark by gathering HDR videos and generating realistically degraded videos using the point spread function measured by a commercial UDC system. Extensive quantitative and qualitative evaluations demonstrate the superiority of D2RNet compared to other state-of-the-art video restoration and UDC image restoration methods.","PeriodicalId":513202,"journal":{"name":"ArXiv","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-03-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140396710","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}