Title: LADA-Trans-NER: Adaptive Efficient Transformer for Chinese Named Entity Recognition Using Lexicon-Attention and Data-Augmentation
Authors: Jiguo Liu, Chao Liu, Nan Li, Shihao Gao, Mingqi Liu, Dali Zhu
DOI: 10.1609/aaai.v37i11.26554 (https://doi.org/10.1609/aaai.v37i11.26554)
Published: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 37, pp. 13236-13245, 2023-06-26.
Abstract: Recently, word enhancement has become very popular for Chinese Named Entity Recognition (NER), reducing segmentation errors and increasing the semantic and boundary information of Chinese words. However, once lexical information is integrated, these methods tend to ignore the semantic relationships before and after a word within the sentence, and the regularity of word-length information has not been fully explored by existing word-character fusion methods. In this work, we propose a Lexicon-Attention and Data-Augmentation (LADA) method for Chinese NER. We discuss the challenges of incorporating word information for NER with existing methods and show how our proposed method overcomes them. LADA is based on a Transformer encoder that uses a lexicon to construct a directed graph and fuses word information by updating the optimal edges of the graph. Specifically, we introduce an advanced data augmentation method to obtain the optimal representation for the NER task. Experimental results show that the augmentation done using LADA considerably boosts the performance of our NER system and achieves significantly better results than previous state-of-the-art methods and variant models on four publicly available NER datasets, namely Resume, MSRA, Weibo, and OntoNotes v4. We also observe that LADA generalizes better and applies well to a real-world setting with multi-source complex entities.
Title: Xaitk-Saliency: An Open Source Explainable AI Toolkit for Saliency
Authors: Brian Hu, Paul Tunison, Brandon Richard Webster, Anthony J. Hoogs
DOI: 10.1609/aaai.v37i13.26871 (https://doi.org/10.1609/aaai.v37i13.26871)
Published: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 37, pp. 15760-15766, 2023-06-26.
Abstract: Advances in artificial intelligence (AI) using techniques such as deep learning have fueled the recent progress in fields such as computer vision. However, these algorithms are still often viewed as "black boxes", which cannot easily explain how they arrived at their final output decisions. Saliency maps are one commonly used form of explainable AI (XAI), which indicate the input features an algorithm paid attention to during its decision process. Here, we introduce the open source xaitk-saliency package, an XAI framework and toolkit for saliency. We demonstrate its modular and flexible nature by highlighting two example use cases for saliency maps: (1) object detection model comparison and (2) doppelganger saliency for person re-identification. We also show how the xaitk-saliency package can be paired with visualization tools to support the interactive exploration of saliency maps. Our results suggest that saliency maps may play a critical role in the verification and validation of AI models, ensuring their trusted use and deployment. The code is publicly available at: https://github.com/xaitk/xaitk-saliency.
{"title":"Tackling Safe and Efficient Multi-Agent Reinforcement Learning via Dynamic Shielding (Student Abstract)","authors":"Wenli Xiao, Yiwei Lyu, J. Dolan","doi":"10.1609/aaai.v37i13.27041","DOIUrl":"https://doi.org/10.1609/aaai.v37i13.27041","url":null,"abstract":"Multi-agent Reinforcement Learning (MARL) has been increasingly used in safety-critical applications but has no safety guarantees, especially during training. In this paper, we propose dynamic shielding, a novel decentralized MARL framework to ensure safety in both training and deployment phases. Our framework leverages Shield, a reactive system running in parallel with the reinforcement learning algorithm to monitor and correct agents' behavior. In our algorithm, shields dynamically split and merge according to the environment state in order to maintain decentralization and avoid conservative behaviors while enjoying formal safety guarantees. We demonstrate the effectiveness of MARL with dynamic shielding in the mobile navigation scenario.","PeriodicalId":74506,"journal":{"name":"Proceedings of the ... AAAI Conference on Artificial Intelligence. AAAI Conference on Artificial Intelligence","volume":"30 7","pages":"16362-16363"},"PeriodicalIF":0.0,"publicationDate":"2023-06-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"72635396","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"InParformer: Evolutionary Decomposition Transformers with Interactive Parallel Attention for Long-Term Time Series Forecasting","authors":"Haizhou Cao, Zhenhao Huang, Tiechui Yao, Jue Wang, Hui He, Yangang Wang","doi":"10.1609/aaai.v37i6.25845","DOIUrl":"https://doi.org/10.1609/aaai.v37i6.25845","url":null,"abstract":"Long-term time series forecasting (LTSF) provides substantial benefits for numerous real-world applications, whereas places essential demands on the model capacity to capture long-range dependencies. Recent Transformer-based models have significantly improved LTSF performance. It is worth noting that Transformer with the self-attention mechanism was originally proposed to model language sequences whose tokens (i.e., words) are discrete and highly semantic. However, unlike language sequences, most time series are sequential and continuous numeric points. Time steps with temporal redundancy are weakly semantic, and only leveraging time-domain tokens is hard to depict the overall properties of time series (e.g., the overall trend and periodic variations). To address these problems, we propose a novel Transformer-based forecasting model named InParformer with an Interactive Parallel Attention (InPar Attention) mechanism. The InPar Attention is proposed to learn long-range dependencies comprehensively in both frequency and time domains. To improve its learning capacity and efficiency, we further design several mechanisms, including query selection, key-value pair compression, and recombination. Moreover, InParformer is constructed with evolutionary seasonal-trend decomposition modules to enhance intricate temporal pattern extraction. Extensive experiments on six real-world benchmarks show that InParformer outperforms the state-of-the-art forecasting Transformers.","PeriodicalId":74506,"journal":{"name":"Proceedings of the ... AAAI Conference on Artificial Intelligence. AAAI Conference on Artificial Intelligence","volume":"177 1","pages":"6906-6915"},"PeriodicalIF":0.0,"publicationDate":"2023-06-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"74963333","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Photogrammetry and VR for Comparing 2D and Immersive Linguistic Data Collection (Student Abstract)","authors":"Jacob Rubinstein, Cynthia Matuszek, Don Engel","doi":"10.1609/aaai.v37i13.27016","DOIUrl":"https://doi.org/10.1609/aaai.v37i13.27016","url":null,"abstract":"The overarching goal of this work is to enable the collection of language describing a wide variety of objects viewed in virtual reality. We aim to create full 3D models from a small number of ‘keyframe’ images of objects found in the publicly available Grounded Language Dataset (GoLD) using photogrammetry. We will then collect linguistic descriptions by placing our models in virtual reality and having volunteers describe them. To evaluate the impact of virtual reality immersion on linguistic descriptions of the objects, we intend to apply contrastive learning to perform grounded language learning, then compare the descriptions collected from images (in GoLD) versus our models.","PeriodicalId":74506,"journal":{"name":"Proceedings of the ... AAAI Conference on Artificial Intelligence. AAAI Conference on Artificial Intelligence","volume":"11 1","pages":"16312-16313"},"PeriodicalIF":0.0,"publicationDate":"2023-06-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"75127079","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Title: Towards Global Video Scene Segmentation with Context-Aware Transformer
Authors: Yang Yang, Yurui Huang, Weili Guo, Baohua Xu, Dingyin Xia
DOI: 10.1609/aaai.v37i3.25426 (https://doi.org/10.1609/aaai.v37i3.25426)
Published: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 37, pp. 3206-3213, 2023-06-26.
Abstract: Videos such as movies or TV episodes usually need their long storylines divided into cohesive units, i.e., scenes, to facilitate the understanding of video semantics. The key challenge lies in finding scene boundaries by comprehensively considering the complex temporal structure and semantic information. To this end, we introduce a novel Context-Aware Transformer (CAT) with a self-supervised learning framework to learn high-quality shot representations for generating well-bounded scenes. More specifically, we design CAT with local-global self-attentions, which can effectively consider both long-term and short-term context to improve shot encoding. For training CAT, we adopt a self-supervised learning schema. First, we leverage shot-to-scene-level pretext tasks to facilitate pre-training with pseudo boundaries, which guides CAT to learn discriminative shot representations that maximize intra-scene similarity and inter-scene discrimination in an unsupervised manner. Then, we transfer the contextual representations and fine-tune CAT with supervised data, which encourages CAT to accurately detect boundaries for scene segmentation. As a result, CAT is able to learn context-aware shot representations and provides global guidance for scene segmentation. Our empirical analyses show that CAT achieves state-of-the-art performance on the scene segmentation task on the MovieNet dataset, e.g., a 2.15-point improvement in AP.
Title: Phase-Informed Bayesian Ensemble Models Improve Performance of COVID-19 Forecasts
Authors: A. Adiga, Gursharn Kaur, Lijing Wang, Benjamin Hurt, P. Porebski, S. Venkatramanan, B. Lewis, M. Marathe
DOI: 10.1609/aaai.v37i13.26855 (https://doi.org/10.1609/aaai.v37i13.26855)
Published: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 37, pp. 15647-15653, 2023-06-26.
Abstract: Despite hundreds of methods published in the literature, forecasting epidemic dynamics remains challenging yet important. The challenges stem from multiple sources, including the need for timely data, the co-evolution of epidemic dynamics with behavioral and immunological adaptations, and the evolution of new pathogen strains. The ongoing COVID-19 pandemic highlighted these challenges; in an important article, Reich et al. presented a comprehensive analysis of many of them. In this paper, we take another step in critically evaluating existing epidemic forecasting methods. Our methods are based on a simple yet crucial observation: epidemic dynamics go through a number of phases (waves). Armed with this understanding, we propose a modification to our deployed Bayesian ensembling framework for case time-series forecasting. We show that ensembling methods that employ phase information and use different weighting schemes for each phase can produce improved forecasts. We evaluate our proposed method against both the currently deployed model and the COVID-19 Forecast Hub models. The overall performance of the proposed model is consistent across the pandemic, but more importantly, it is ranked third and first during two critical rapid-growth phases in cases, regimes where the performance of most models from the CDC forecasting hub dropped significantly.
Title: Fast and Accurate Binary Neural Networks Based on Depth-Width Reshaping
Authors: Ping Xue, Yang Lu, Jingfei Chang, Xing Wei, Zhenchun Wei
DOI: 10.1609/aaai.v37i9.26268 (https://doi.org/10.1609/aaai.v37i9.26268)
Published: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 37, pp. 10684-10692, 2023-06-26.
Abstract: Network binarization (i.e., binary neural networks, BNNs) can efficiently compress deep neural networks and accelerate model inference, but causes severe accuracy degradation. Existing BNNs are mainly implemented on top of commonly used full-precision network backbones, and the accuracy is then improved with various techniques. However, it is questionable whether full-precision network backbones are well adapted to BNNs. We start from the factors behind the performance degradation of BNNs and analyze the problems of directly using full-precision network backbones for BNNs: for a given computational budget, the backbone of a BNN may need to be shallower and wider than that of a full-precision network. With this in mind, Depth-Width Reshaping (DWR) is proposed to reshape the depth and width of existing full-precision network backbones and further optimize them by incorporating pruning techniques to better fit BNNs. Extensive experiments demonstrate the analytical result and the effectiveness of the proposed method. Compared with the original backbones, the DWR backbones constructed by the proposed method yield close to an O(√s) decrease in activations while achieving an absolute accuracy increase of up to 1.7% with comparable computational cost. Besides, by using the DWR backbones, existing methods can achieve new state-of-the-art (SOTA) accuracy (e.g., 67.2% on ImageNet with ResNet-18 as the original backbone). We hope this work provides a novel insight into the backbone design of BNNs. The code is available at https://github.com/pingxue-hfut/DWR.
Title: Multi-Level Wavelet Mapping Correlation for Statistical Dependence Measurement: Methodology and Performance
Authors: Yixin Ren, Hao Zhang, Yewei Xia, J. Guan, Shuigeng Zhou
DOI: 10.1609/aaai.v37i5.25799 (https://doi.org/10.1609/aaai.v37i5.25799)
Published: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 37, pp. 6499-6506, 2023-06-26.
Abstract: We propose a new criterion for measuring dependence between two real variables, namely Multi-level Wavelet Mapping Correlation (MWMC). MWMC can capture nonlinear dependencies between variables by measuring their correlation under different levels of wavelet mappings. We show that the empirical estimate of MWMC converges exponentially to its population quantity. To better support independence testing with MWMC, we further design a permutation test based on MWMC and prove that our test can not only control the type I error rate (the rate of false positives) well but also ensure that the type II error rate (the rate of false negatives) is upper bounded by O(1/n) (where n is the sample size) with finitely many permutations. Through extensive experiments on (conditional) independence tests and causal discovery, we show that our method outperforms existing independence test methods.
{"title":"Debiasing Intrinsic Bias and Application Bias Jointly via Invariant Risk Minimization (Student Abstract)","authors":"Yuzhou Mao, Liu Yu, Yi Yang, Fan Zhou, Ting Zhong","doi":"10.1609/aaai.v37i13.27000","DOIUrl":"https://doi.org/10.1609/aaai.v37i13.27000","url":null,"abstract":"Demographic biases and social stereotypes are common in pretrained language models (PLMs), while the fine-tuning in downstream applications can also produce new biases or amplify the impact of the original biases. Existing works separate the debiasing from the fine-tuning procedure, which results in a gap between intrinsic bias and application bias. In this work, we propose a debiasing framework CauDebias to eliminate both biases, which directly combines debiasing with fine-tuning and can be applied for any PLMs in downstream tasks. We distinguish the bias-relevant (non-causal factors) and label-relevant (causal factors) parts in sentences from a causal invariant perspective. Specifically, we perform intervention on non-causal factors in different demographic groups, and then devise an invariant risk minimization loss to trade-off performance between bias mitigation and task accuracy. Experimental results on three downstream tasks show that our CauDebias can remarkably reduce biases in PLMs while minimizing the impact on downstream tasks.","PeriodicalId":74506,"journal":{"name":"Proceedings of the ... AAAI Conference on Artificial Intelligence. AAAI Conference on Artificial Intelligence","volume":"197 1","pages":"16280-16281"},"PeriodicalIF":0.0,"publicationDate":"2023-06-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"79768635","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}