{"title":"Revisiting Unsupervised Local Descriptor Learning","authors":"Wu‐ru Wang, Lei Zhang, Hua Huang","doi":"10.1609/aaai.v37i3.25367","DOIUrl":"https://doi.org/10.1609/aaai.v37i3.25367","url":null,"abstract":"Constructing accurate training tuples is crucial for unsupervised local descriptor learning, yet challenging due to the absence of patch labels. The state-of-the-art approach constructs tuples with heuristic rules, which struggle to precisely depict real-world patch transformations, in spite of enabling fast model convergence. A possible solution to alleviate the problem is the clustering-based approach, which can capture realistic patch variations and learn more accurate class decision boundaries, but suffers from slow model convergence. This paper presents HybridDesc, an unsupervised approach that learns powerful local descriptor models with fast convergence speed by combining the rule-based and clustering-based approaches to construct training tuples. In addition, HybridDesc also contributes two concrete enhancing mechanisms: (1) a Differentiable Hyperparameter Search (DHS) strategy to find the optimal hyperparameter setting of the rule-based approach so as to provide accurate prior for the clustering-based approach, (2) an On-Demand Clustering (ODC) method to reduce the clustering overhead of the clustering-based approach without eroding its advantage. Extensive experimental results show that HybridDesc can efficiently learn local descriptors that surpass existing unsupervised local descriptors and even rival competitive supervised ones.","PeriodicalId":74506,"journal":{"name":"Proceedings of the ... AAAI Conference on Artificial Intelligence. AAAI Conference on Artificial Intelligence","volume":"26 1","pages":"2680-2688"},"PeriodicalIF":0.0,"publicationDate":"2023-06-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"75665780","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"FC-TrackNet: Fast Convergence Net for 6D Pose Tracking in Synthetic Domains","authors":"Di Jia, Qianqian Wang, Jun Cao, Peng Cai, Zhiyang Jin","doi":"10.1609/aaai.v37i13.27077","DOIUrl":"https://doi.org/10.1609/aaai.v37i13.27077","url":null,"abstract":"In this work, we propose a fast convergence track net, or FC-TrackNet, based on a synthetic data-driven approach to maintaining long-term 6D pose tracking. Comparison experiments are performed on two different datasets. The results demonstrate that our approach can achieve a consistent tracking frequency of 90.9 Hz as well as higher accuracy than the state-of-the-art approaches.","PeriodicalId":74506,"journal":{"name":"Proceedings of the ... AAAI Conference on Artificial Intelligence. AAAI Conference on Artificial Intelligence","volume":"32 1","pages":"16455-16457"},"PeriodicalIF":0.0,"publicationDate":"2023-06-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"74441212","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"MoMusic: A Motion-Driven Human-AI Collaborative Music Composition and Performing System","authors":"Weizhen Bian, Yijin Song, Nianzhen Gu, Tin Yan Chan, Tsz To Lo, Tsun Sun Li, King Chak Wong, Wei Xue, R. Trillo","doi":"10.1609/aaai.v37i13.26907","DOIUrl":"https://doi.org/10.1609/aaai.v37i13.26907","url":null,"abstract":"The significant development of artificial neural network architectures has facilitated the increasing adoption of automated music composition models over the past few years. However, most existing systems feature algorithmic generative structures based on hard code and predefined rules, generally excluding interactive or improvised behaviors. We propose a motion-based music system, MoMusic, as an AI real-time music generation system. MoMusic features a partially randomized harmonic sequencing model based on a probabilistic analysis of tonal chord progressions, mathematically abstracted through musical set theory. This model is presented against a dual-dimension grid that produces resulting sounds through a posture recognition mechanism. A camera captures the movement and trajectories of the users' fingers, creating coherent, partially improvised harmonic progressions. MoMusic integrates several timbrical registers, from traditional classical instruments such as the piano to a new 'human voice instrument' created using a voice conversion technique. Our research demonstrates MoMusic's interactiveness, ability to inspire musicians, and ability to generate coherent musical material with various timbrical registers. MoMusic's capabilities could be easily expanded to incorporate different forms of posture-controlled timbrical transformation, rhythmic transformation, dynamic transformation, or even digital sound processing techniques.","PeriodicalId":74506,"journal":{"name":"Proceedings of the ... AAAI Conference on Artificial Intelligence. AAAI Conference on Artificial Intelligence","volume":"28 1","pages":"16057-16062"},"PeriodicalIF":0.0,"publicationDate":"2023-06-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"74526568","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Music-to-Facial Expressions: Emotion-Based Music Visualization for the Hearing Impaired","authors":"Yubo Wang, Fengzhou Pan, Danni Liu, Jiaxiong Hu","doi":"10.1609/aaai.v37i13.26912","DOIUrl":"https://doi.org/10.1609/aaai.v37i13.26912","url":null,"abstract":"While music is made to convey messages and emotions, auditory music is not equally accessible to everyone. Music visualization is a common approach to augment the listening experiences of the hearing users and to provide music experiences for the hearing-impaired. In this paper, we present a music visualization system that can turn the input of a piece of music into a series of facial expressions representative of the continuously changing sentiments in the music. The resulting facial expressions, recorded as action units, can later animate a static virtual avatar to be emotive synchronously with the music.","PeriodicalId":74506,"journal":{"name":"Proceedings of the ... AAAI Conference on Artificial Intelligence. AAAI Conference on Artificial Intelligence","volume":"43 2","pages":"16096-16102"},"PeriodicalIF":0.0,"publicationDate":"2023-06-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"72482483","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Disentangled-Attention Based Framework with Persona-Aware Prompt Learning for Dialogue Generation","authors":"Pingsheng Liu, Zhengjie Huang, Xiechi Zhang, Linlin Wang, Gerard de Melo, Xin Lin, Liang Pang, Liang He","doi":"10.1609/aaai.v37i11.26556","DOIUrl":"https://doi.org/10.1609/aaai.v37i11.26556","url":null,"abstract":"Endowing dialogue agents with personas is the key to delivering more human-like conversations. However, existing persona-grounded dialogue systems still lack informative details of human conversations and tend to reply with inconsistent and generic responses. One of the main underlying causes is that pre-defined persona sentences are generally short and merely superficial descriptions of personal attributes, making appropriate persona selection and understanding non-trivial. Another challenge is that it is crucial to consider the context and the conversation flow to dynamically determine when to invoke different types of persona signals. To address these problems, we propose a disentangled-attention based pre-training architecture, which incorporates persona-aware prompt learning to bridge the connection between the selected persona and response generation. Our model first exploits the conversation flow to select context-relevant personas, and subsequently enriches the superficial persona descriptions with extra personality traits through persona-aware prompting. Finally, the decoder leverages a disentangled-attention mechanism to flexibly control the reliance on personas and dialogue contexts, and incorporates A*-like keyword-based heuristic estimates for controllable generation. Extensive experiments show that our approach can outperform strong baselines and deliver more consistent and engaging responses on the PERSONA-CHAT dataset.","PeriodicalId":74506,"journal":{"name":"Proceedings of the ... AAAI Conference on Artificial Intelligence. AAAI Conference on Artificial Intelligence","volume":"10 1","pages":"13255-13263"},"PeriodicalIF":0.0,"publicationDate":"2023-06-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"72581913","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"On Total-Order HTN Plan Verification with Method Preconditions - An Extension of the CYK Parsing Algorithm","authors":"Songtuan Lin, G. Behnke, Simona Ondrčková, R. Barták, P. Bercher","doi":"10.1609/aaai.v37i10.26420","DOIUrl":"https://doi.org/10.1609/aaai.v37i10.26420","url":null,"abstract":"In this paper, we consider the plan verification problem for totally ordered (TO) HTN planning. The problem is proved to be solvable in polynomial time by recognizing its connection to the membership decision problem for context-free grammars. Currently, most HTN plan verification approaches do not have special treatments for the TO configuration, and the only one that features such an optimization still relies on an exhaustive search. Hence, we develop a new TOHTN plan verification approach in this paper by extending the standard CYK parsing algorithm, which acts as the best decision procedure in general.","PeriodicalId":74506,"journal":{"name":"Proceedings of the ... AAAI Conference on Artificial Intelligence. AAAI Conference on Artificial Intelligence","volume":"54 1","pages":"12041-12048"},"PeriodicalIF":0.0,"publicationDate":"2023-06-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"74561317","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Better and Faster: Adaptive Event Conversion for Event-Based Object Detection","authors":"Yan Peng, Yueyi Zhang, Peilin Xiao, Xiaoyan Sun, Feng Wu","doi":"10.1609/aaai.v37i2.25298","DOIUrl":"https://doi.org/10.1609/aaai.v37i2.25298","url":null,"abstract":"Event cameras are a kind of bio-inspired imaging sensor, which asynchronously collect sparse event streams with many advantages. In this paper, we focus on building better and faster event-based object detectors. To this end, we first propose a computationally efficient event representation Hyper Histogram, which adequately preserves both the polarity and temporal information of events. Then we devise an Adaptive Event Conversion module, which converts events into Hyper Histograms according to event density via an adaptive queue. Moreover, we introduce a novel event-based augmentation method Shadow Mosaic, which significantly improves the event sample diversity and enhances the generalization ability of detection models. We equip our proposed modules on three representative object detection models: YOLOv5, Deformable-DETR, and RetinaNet. Experimental results on three event-based detection datasets (1Mpx, Gen1, and MVSEC-NIGHTL21) demonstrate that our proposed approach outperforms other state-of-the-art methods by a large margin, while achieving a much faster running speed (< 14 ms and < 4 ms for 50 ms event data on the 1Mpx and Gen1 datasets).","PeriodicalId":74506,"journal":{"name":"Proceedings of the ... AAAI Conference on Artificial Intelligence. AAAI Conference on Artificial Intelligence","volume":"60 1","pages":"2056-2064"},"PeriodicalIF":0.0,"publicationDate":"2023-06-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78731664","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Learning Generalizable Batch Active Learning Strategies via Deep Q-networks (Student Abstract)","authors":"Yichen Li, Wen-Jie Shen, Boyu Zhang, Feng Mao, Zongzhang Zhang, Yang Yu","doi":"10.1609/aaai.v37i13.26989","DOIUrl":"https://doi.org/10.1609/aaai.v37i13.26989","url":null,"abstract":"To handle a large amount of unlabeled data, batch active learning (BAL) queries humans for the labels of a batch of the most valuable data points at every round. Most current BAL strategies are based on human-designed heuristics, such as uncertainty sampling or mutual information maximization. However, there exists a disagreement between these heuristics and the ultimate goal of BAL, i.e., optimizing the model's final performance within the query budgets. This disagreement leads to a limited generality of these heuristics. To this end, we formulate BAL as an MDP and propose a data-driven approach based on deep reinforcement learning. Our method learns the BAL strategy by maximizing the model's final performance. Experiments on the UCI benchmark show that our method can achieve competitive performance compared to existing heuristics-based approaches.","PeriodicalId":74506,"journal":{"name":"Proceedings of the ... AAAI Conference on Artificial Intelligence. AAAI Conference on Artificial Intelligence","volume":"32 1","pages":"16258-16259"},"PeriodicalIF":0.0,"publicationDate":"2023-06-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78768216","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Inconsistent Cores for ASP: The Perks and Perils of Non-monotonicity","authors":"J. Fichte, Markus Hecher, Stefan Szeider","doi":"10.1609/aaai.v37i5.25783","DOIUrl":"https://doi.org/10.1609/aaai.v37i5.25783","url":null,"abstract":"Answer Set Programming (ASP) is a prominent modeling and solving framework. An inconsistent core (IC) of an ASP program is an inconsistent subset of rules. In the case of inconsistent programs, a smallest or subset-minimal IC contains crucial rules for the inconsistency. In this work, we study finding minimal ICs of ASP programs and key fragments from a complexity-theoretic perspective. Interestingly, due to ASP’s non-monotonic behavior, consistent programs also admit ICs. It turns out that there is an entire landscape of problems involving ICs with a diverse range of complexities up to the fourth level of the Polynomial Hierarchy. Deciding the existence of an IC is, already for tight programs, on the second level of the Polynomial Hierarchy. Furthermore, we give encodings for IC-related problems on the fragment of tight programs and illustrate feasibility on small instance sets.","PeriodicalId":74506,"journal":{"name":"Proceedings of the ... AAAI Conference on Artificial Intelligence. AAAI Conference on Artificial Intelligence","volume":"18 1","pages":"6363-6371"},"PeriodicalIF":0.0,"publicationDate":"2023-06-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"74971850","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"LADA-Trans-NER: Adaptive Efficient Transformer for Chinese Named Entity Recognition Using Lexicon-Attention and Data-Augmentation","authors":"Jiguo Liu, Chao Liu, Nan Li, Shihao Gao, Mingqi Liu, Dali Zhu","doi":"10.1609/aaai.v37i11.26554","DOIUrl":"https://doi.org/10.1609/aaai.v37i11.26554","url":null,"abstract":"Recently, word enhancement has become very popular for Chinese Named Entity Recognition (NER), reducing segmentation errors and increasing the semantic and boundary information of Chinese words. However, these methods tend to ignore the semantic relationship before and after the sentence after integrating lexical information. Therefore, the regularity of word length information has not been fully explored in various word-character fusion methods. In this work, we propose a Lexicon-Attention and Data-Augmentation (LADA) method for Chinese NER. We discuss the challenges of using existing methods in incorporating word information for NER and show how our proposed methods could be leveraged to overcome those challenges. LADA is based on a Transformer Encoder that utilizes a lexicon to construct a directed graph and fuses word information through updating the optimal edge of the graph. Specifically, we introduce an advanced data augmentation method to obtain the optimal representation for the NER task. Experimental results show that the augmentation done using LADA can considerably boost the performance of our NER system and achieve significantly better results than previous state-of-the-art methods and variant models in the literature on four publicly available NER datasets, namely Resume, MSRA, Weibo, and OntoNotes v4. We also observe better generalization and application to a real-world setting from LADA on multi-source complex entities.","PeriodicalId":74506,"journal":{"name":"Proceedings of the ... AAAI Conference on Artificial Intelligence. AAAI Conference on Artificial Intelligence","volume":"21 1","pages":"13236-13245"},"PeriodicalIF":0.0,"publicationDate":"2023-06-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"75024279","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}