{"title":"DisGUIDE: Disagreement-Guided Data-Free Model Extraction","authors":"Jonathan Rosenthal, Eric Enouen, H. Pham, Lin Tan","doi":"10.1609/aaai.v37i8.26150","DOIUrl":"https://doi.org/10.1609/aaai.v37i8.26150","url":null,"abstract":"Recent model-extraction attacks on Machine Learning as a Service (MLaaS) systems have moved towards data-free approaches, showing the feasibility of stealing models trained with difficult-to-access data. However, these attacks are ineffective or limited due to the low accuracy of extracted models and the high number of queries to the models under attack. The high query cost makes such techniques infeasible for online MLaaS systems that charge per query. We create a novel approach to get higher accuracy and query efficiency than prior data-free model extraction techniques. Specifically, we introduce a novel generator training scheme that maximizes the disagreement loss between two clone models that attempt to copy the model under attack. This loss, combined with diversity loss and experience replay, enables the generator to produce better instances to train the clone models. Our evaluation on popular datasets CIFAR-10 and CIFAR-100 shows that our approach improves the final model accuracy by up to 3.42% and 18.48% respectively. The average number of queries required to achieve the accuracy of the prior state of the art is reduced by up to 64.95%. We hope this will promote future work on feasible data-free model extraction and defenses against such attacks.","PeriodicalId":74506,"journal":{"name":"Proceedings of the ... AAAI Conference on Artificial Intelligence. 
AAAI Conference on Artificial Intelligence","volume":"517 1","pages":"9614-9622"},"PeriodicalIF":0.0,"publicationDate":"2023-06-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"77148921","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Leveraging Sub-class Discimination for Compositional Zero-Shot Learning","authors":"Xiaoming Hu, Zilei Wang","doi":"10.1609/aaai.v37i1.25168","DOIUrl":"https://doi.org/10.1609/aaai.v37i1.25168","url":null,"abstract":"Compositional Zero-Shot Learning (CZSL) aims at identifying unseen compositions composed of previously seen attributes and objects during the test phase. In real images, the visual appearances of attributes and objects (primitive concepts) generally interact with each other. Namely, the visual appearances of an attribute may change when composed with different objects, and vice versa. But previous works overlook this important property. In this paper, we introduce a simple yet effective approach with leveraging sub-class discrimination. Specifically, we define the primitive concepts in different compositions as sub-classes, and then maintain the sub-class discrimination to address the above challenge. More specifically, inspired by the observation that the composed recognition models could account for the differences across sub-classes, we first propose to impose the embedding alignment between the composed and disentangled recognition to incorporate sub-class discrimination at the feature level. Then we develop the prototype modulator networks to adjust the class prototypes w.r.t. the composition information, which can enhance sub-class discrimination at the classifier level. We conduct extensive experiments on the challenging benchmark datasets, and the considerable performance improvement over state-of-the-art approaches is achieved, which indicates the effectiveness of our method. Our code is available at https://github.com/hxm97/SCD-CZSL.","PeriodicalId":74506,"journal":{"name":"Proceedings of the ... AAAI Conference on Artificial Intelligence. 
AAAI Conference on Artificial Intelligence","volume":"38 1","pages":"890-898"},"PeriodicalIF":0.0,"publicationDate":"2023-06-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80822271","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"MTDiag: An Effective Multi-Task Framework for Automatic Diagnosis","authors":"Zhenyu Hou, Yukuo Cen, Ziding Liu, Dongxue Wu, Baoyan Wang, Xuanhe Li, Lei Hong, Jie Tang","doi":"10.1609/aaai.v37i12.26666","DOIUrl":"https://doi.org/10.1609/aaai.v37i12.26666","url":null,"abstract":"Automatic diagnosis systems aim to probe for symptoms (i.e., symptom checking) and diagnose disease through multi-turn conversations with patients. Most previous works formulate it as a sequential decision process and use reinforcement learning (RL) to decide whether to inquire about symptoms or make a diagnosis. However, these RL-based methods heavily rely on the elaborate reward function and usually suffer from an unstable training process and low data efficiency. In this work, we propose an effective multi-task framework for automatic diagnosis called MTDiag. We first reformulate symptom checking as a multi-label classification task by direct supervision. Each medical dialogue is equivalently converted into multiple samples for classification, which can also help alleviate the data scarcity problem. Furthermore, we design a multi-task learning strategy to guide the symptom checking procedure with disease information and further utilize contrastive learning to better distinguish symptoms between diseases. Extensive experimental results show that our method achieves state-of-the-art performance on four public datasets with 1.7%~3.1% improvement in disease diagnosis, demonstrating the superiority of the proposed method. Additionally, our model is now deployed in an online medical consultant system as an assistant tool for real-life doctors.","PeriodicalId":74506,"journal":{"name":"Proceedings of the ... AAAI Conference on Artificial Intelligence. 
AAAI Conference on Artificial Intelligence","volume":"44 1","pages":"14241-14248"},"PeriodicalIF":0.0,"publicationDate":"2023-06-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81106184","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"See Your Emotion from Gait Using Unlabeled Skeleton Data","authors":"Haifeng Lu, Xiping Hu, B. Hu","doi":"10.1609/aaai.v37i2.25272","DOIUrl":"https://doi.org/10.1609/aaai.v37i2.25272","url":null,"abstract":"This paper focuses on contrastive learning for gait-based emotion recognition. The existing contrastive learning approaches are rarely suitable for learning skeleton-based gait representations, which suffer from limited gait diversity and inconsistent semantics. In this paper, we propose a Cross-coordinate contrastive learning framework utilizing Ambiguity samples for self-supervised Gait-based Emotion representation (CAGE). First, we propose ambiguity transform to push positive samples into ambiguous semantic space. By learning similarities between ambiguity samples and positive samples, our model can learn higher-level semantics of the gait sequences and maintain semantic diversity. Second, to encourage learning the semantic invariance, we uniquely propose cross-coordinate contrastive learning between the Cartesian coordinate and the Spherical coordinate, which brings rich supervisory signals to learn the intrinsic semantic consistency information. Exhaustive experiments show that CAGE improves existing self-supervised methods by 5%–10% accuracy, and it achieves comparable or even superior performance to supervised methods.","PeriodicalId":74506,"journal":{"name":"Proceedings of the ... AAAI Conference on Artificial Intelligence. 
AAAI Conference on Artificial Intelligence","volume":"33 1","pages":"1826-1834"},"PeriodicalIF":0.0,"publicationDate":"2023-06-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81187420","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"MEID: Mixture-of-Experts with Internal Distillation for Long-Tailed Video Recognition","authors":"Xinjie Li, Huijuan Xu","doi":"10.1609/aaai.v37i2.25230","DOIUrl":"https://doi.org/10.1609/aaai.v37i2.25230","url":null,"abstract":"The long-tailed video recognition problem is especially challenging, as videos tend to be long and untrimmed, and each video may contain multiple classes, causing frame-level class imbalance. The previous method tackles the long-tailed video recognition only through frame-level sampling for class re-balance without distinguishing the frame-level feature representation between head and tail classes. To improve the frame-level feature representation of tail classes, we modulate the frame-level features with an auxiliary distillation loss to reduce the distribution distance between head and tail classes. Moreover, we design a mixture-of-experts framework with two different expert designs, i.e., the first expert with an attention-based classification network handling the original long-tailed distribution, and the second expert dealing with the re-balanced distribution from class-balanced sampling. Notably, in the second expert, we specifically focus on the frames unsolved by the first expert through designing a complementary frame selection module, which inherits the attention weights from the first expert and selects frames with low attention weights, and we also enhance the motion feature representation for these selected frames. To highlight the multi-label challenge in long-tailed video recognition, we create two additional benchmarks based on Charades and CharadesEgo videos with the multi-label property, called CharadesLT and CharadesEgoLT. Extensive experiments are conducted on the existing long-tailed video benchmark VideoLT and the two new benchmarks to verify the effectiveness of our proposed method with state-of-the-art performance. 
The code and proposed benchmarks are released at https://github.com/VisionLanguageLab/MEID.","PeriodicalId":74506,"journal":{"name":"Proceedings of the ... AAAI Conference on Artificial Intelligence. AAAI Conference on Artificial Intelligence","volume":"26 1","pages":"1451-1459"},"PeriodicalIF":0.0,"publicationDate":"2023-06-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81211794","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"PGSS: Pitch-Guided Speech Separation","authors":"Xiang Li, Yiwen Wang, Yifan Sun, Xihong Wu, J. Chen","doi":"10.1609/aaai.v37i11.26542","DOIUrl":"https://doi.org/10.1609/aaai.v37i11.26542","url":null,"abstract":"Monaural speech separation aims to separate concurrent speakers from a single-microphone mixture recording. Inspired by the effect of pitch priming in auditory scene analysis (ASA) mechanisms, a novel pitch-guided speech separation framework is proposed in this work. The prominent advantage of this framework is that both the permutation problem and the unknown speaker number problem existing in general models can be avoided by using pitch contours as the primary means to guide the target speaker. In addition, adversarial training is applied, instead of a traditional time-frequency mask, to improve the perceptual quality of separated speech. Specifically, the proposed framework can be divided into two phases: pitch extraction and speech separation. The former aims to extract pitch contour candidates for each speaker from the mixture, modeling the bottom-up process in ASA mechanisms. Any pitch contour can be selected as the condition in the second phase to separate the corresponding speaker, where a conditional generative adversarial network (CGAN) is applied. The second phase models the effect of pitch priming in ASA. Experiments on the WSJ0-2mix corpus reveal that the proposed approaches can achieve higher pitch extraction accuracy and better separation performance, compared to the baseline models, and have the potential to be applied to SOTA architectures.","PeriodicalId":74506,"journal":{"name":"Proceedings of the ... AAAI Conference on Artificial Intelligence. 
AAAI Conference on Artificial Intelligence","volume":"29 1","pages":"13130-13138"},"PeriodicalIF":0.0,"publicationDate":"2023-06-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"82040326","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"BETA-CD: A Bayesian Meta-Learned Cognitive Diagnosis Framework for Personalized Learning","authors":"Haoyang Bi, Enhong Chen, Weidong He, Han Wu, Weihao Zhao, Shijin Wang, Jinze Wu","doi":"10.1609/aaai.v37i4.25629","DOIUrl":"https://doi.org/10.1609/aaai.v37i4.25629","url":null,"abstract":"Personalized learning is a promising educational approach that aims to provide high-quality personalized services for each student with minimum demands for practice data. The key to achieving that lies in the cognitive diagnosis task, which estimates the cognitive state of the student through his/her logged data of doing practice quizzes. Nevertheless, in the personalized learning scenario, existing cognitive diagnosis models suffer from the inability to (1) quickly adapt to new students using a small amount of data, and (2) measure the reliability of the diagnosis result to avoid improper services that mismatch the student's actual state. In this paper, we propose a general Bayesian mETA-learned Cognitive Diagnosis framework (BETA-CD), which addresses the two challenges by prior knowledge exploitation and model uncertainty quantification, respectively. Specifically, we firstly introduce Bayesian hierarchical modeling to associate each student's cognitive state with a shared prior distribution encoding prior knowledge and a personal posterior distribution indicating model uncertainty. Furthermore, we formulate a meta-learning objective to automatically exploit prior knowledge from historical students, and efficiently solve it with a gradient-based variational inference method. The code will be publicly available at https://github.com/AyiStar/pyat.","PeriodicalId":74506,"journal":{"name":"Proceedings of the ... AAAI Conference on Artificial Intelligence. 
AAAI Conference on Artificial Intelligence","volume":"72 1","pages":"5018-5026"},"PeriodicalIF":0.0,"publicationDate":"2023-06-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85703765","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Scene-Level Sketch-Based Image Retrieval with Minimal Pairwise Supervision","authors":"Ce Ge, Jingyu Wang, Q. Qi, Haifeng Sun, Tong Xu, Jianxin Liao","doi":"10.1609/aaai.v37i1.25141","DOIUrl":"https://doi.org/10.1609/aaai.v37i1.25141","url":null,"abstract":"The sketch-based image retrieval (SBIR) task has long been researched at the instance level, where both query sketches and candidate images are assumed to contain only one dominant object. This strong assumption constrains its application, especially with the increasingly popular intelligent terminals and human-computer interaction technology. In this work, a more general scene-level SBIR task is explored, where sketches and images can both contain multiple object instances. The new general task is extremely challenging due to several factors: (i) scene-level SBIR inherently shares sketch-specific difficulties with instance-level SBIR (e.g., sparsity, abstractness, and diversity), (ii) the cross-modal similarity is measured between two partially aligned domains (i.e., not all objects in images are drawn in scene sketches), and (iii) besides instance-level visual similarity, a more complex multi-dimensional scene-level feature matching problem is imposed (including appearance, semantics, layout, etc.). Addressing these challenges, a novel Conditional Graph Autoencoder model is proposed to deal with scene-level sketch-images retrieval. More importantly, the model can be trained with only pairwise supervision, which distinguishes our study from others in that elaborate instance-level annotations (for example, bounding boxes) are no longer required. Extensive experiments confirm the ability of our model to robustly retrieve multiple related objects at the scene level and exhibit superior performance beyond strong competitors.","PeriodicalId":74506,"journal":{"name":"Proceedings of the ... AAAI Conference on Artificial Intelligence. 
AAAI Conference on Artificial Intelligence","volume":"11 1","pages":"650-657"},"PeriodicalIF":0.0,"publicationDate":"2023-06-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84148600","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Electrophysiological Brain Source Imaging via Combinatorial Search with Provable Optimality","authors":"Guihong Wan, Meng Jiao, Xinglong Ju, Yu Zhang, H. Schweitzer, Feng Liu","doi":"10.1609/aaai.v37i10.26471","DOIUrl":"https://doi.org/10.1609/aaai.v37i10.26471","url":null,"abstract":"Electrophysiological Source Imaging (ESI) refers to reconstructing the underlying brain source activation from non-invasive Electroencephalography (EEG) and Magnetoencephalography (MEG) measurements on the scalp. Estimating the source locations and their extents is a fundamental tool in clinical and neuroscience applications. However, the estimation is challenging because of the ill-posedness and high coherence in the leadfield matrix as well as the noise in the EEG/MEG data. In this work, we proposed a combinatorial search framework to address the ESI problem with a provable optimality guarantee. Specifically, by exploiting the graph neighborhood information in the brain source space, we converted the ESI problem into a graph search problem and designed a combinatorial search algorithm under the framework of A* to solve it. The proposed algorithm is guaranteed to give an optimal solution to the ESI problem. Experimental results on both synthetic data and real epilepsy EEG data demonstrated that the proposed algorithm could faithfully reconstruct the source activation in the brain.","PeriodicalId":74506,"journal":{"name":"Proceedings of the ... AAAI Conference on Artificial Intelligence. 
AAAI Conference on Artificial Intelligence","volume":"62 1","pages":"12491-12499"},"PeriodicalIF":0.0,"publicationDate":"2023-06-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78339379","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Generating Transferable 3D Adversarial Point Cloud via Random Perturbation Factorization","authors":"Bangyan He, J. Liu, Yiming Li, Siyuan Liang, Jingzhi Li, Xiaojun Jia, Xiaochun Cao","doi":"10.1609/aaai.v37i1.25154","DOIUrl":"https://doi.org/10.1609/aaai.v37i1.25154","url":null,"abstract":"Recent studies have demonstrated that existing deep neural networks (DNNs) on 3D point clouds are vulnerable to adversarial examples, especially under the white-box settings where the adversaries have access to model parameters. However, adversarial 3D point clouds generated by existing white-box methods have limited transferability across different DNN architectures. They have only minor threats in real-world scenarios under the black-box settings where the adversaries can only query the deployed victim model. In this paper, we revisit the transferability of adversarial 3D point clouds. We observe that an adversarial perturbation can be randomly factorized into two sub-perturbations, which are also likely to be adversarial perturbations. It motivates us to consider the effects of the perturbation and its sub-perturbations simultaneously to increase the transferability for sub-perturbations also contain helpful information. In this paper, we propose a simple yet effective attack method to generate more transferable adversarial 3D point clouds. Specifically, rather than simply optimizing the loss of perturbation alone, we combine it with its random factorization. We conduct experiments on benchmark dataset, verifying our method's effectiveness in increasing transferability while preserving high efficiency.","PeriodicalId":74506,"journal":{"name":"Proceedings of the ... AAAI Conference on Artificial Intelligence. 
AAAI Conference on Artificial Intelligence","volume":"2 1","pages":"764-772"},"PeriodicalIF":0.0,"publicationDate":"2023-06-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78477112","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}