Lutong Qin;Lei Zhang;Chengrun Li;Chaoda Song;Dongzhou Cheng;Shuoyuan Wang;Hao Wu;Aiguo Song
{"title":"Towards Better Accuracy-Efficiency Trade-Offs: Dynamic Activity Inference via Collaborative Learning From Various Width-Resolution Configurations","authors":"Lutong Qin;Lei Zhang;Chengrun Li;Chaoda Song;Dongzhou Cheng;Shuoyuan Wang;Hao Wu;Aiguo Song","doi":"10.1109/TAI.2024.3489532","DOIUrl":"https://doi.org/10.1109/TAI.2024.3489532","url":null,"abstract":"Recently, deep neural networks have triumphed over a large variety of human activity recognition (HAR) applications on resource-constrained mobile devices. However, most existing works are static and ignore the fact that the computational budget usually changes drastically across various devices, which prevent real-world HAR deployment. It still remains a major challenge: how to adaptively and instantly tradeoff accuracy and latency at runtime for on-device activity inference using time series sensor data? To address this issue, this article introduces a new collaborative learning scheme by training a set of subnetworks executed at varying network widths when fueled with different sensor input resolutions as data augmentation, which can instantly switch on-the-fly at different width-resolution configurations for flexible and dynamic activity inference under varying resource budgets. Particularly, it offers a promising performance-boosting solution by utilizing self-distillation to transfer the unique knowledge among multiple width-resolution configuration, which can capture stronger feature representations for activity recognition. Extensive experiments and ablation studies on three public HAR benchmark datasets validate the effectiveness and efficiency of our approach. A real implementation is evaluated on a mobile device. This discovery opens up the possibility to directly access accuracy-latency spectrum of deep learning models in versatile real-world HAR deployments. 
Code is available at https://github.com/Lutong-Qin/Collaborative_HAR.","PeriodicalId":73305,"journal":{"name":"IEEE transactions on artificial intelligence","volume":"5 12","pages":"6723-6738"},"PeriodicalIF":0.0,"publicationDate":"2024-11-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142825900","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Learning Empirical Inherited Intelligent MPC for Switched Systems With Network Security Communication","authors":"Yiwen Qi;Yiwen Tang;Wenke Yu","doi":"10.1109/TAI.2024.3486276","DOIUrl":"https://doi.org/10.1109/TAI.2024.3486276","url":null,"abstract":"This article studies learning empirical inherited intelligent model predictive control (LEII-MPC) for switched systems. For complex environments and systems, an intelligent control method design with learning ability is necessary and meaningful. First, a switching law that coordinates the iterative learning control action is devised according to the average dwell time approach. Second, an intelligent MPC mechanism with the iteration learning experience is designed to optimize the control action. With the designed LEII-MPC, sufficient conditions for the switched systems stability equipped with the event-triggering schemes (ETSs) in both the time domain and the iterative domain are presented. The ETS in the iterative domain is to solve unnecessary iterative updates. The ETS in the time domain is to deal with potential denial of service (DoS) attacks, which includes two parts: 1) for detection, an attack-dependent event-triggering method is presented to determine attack sequence and reduce lost packets; and 2) for compensation, a buffer is used to ensure system performance during the attack period. 
Last, a numerical example shows the effectiveness of the proposed method.","PeriodicalId":73305,"journal":{"name":"IEEE transactions on artificial intelligence","volume":"5 12","pages":"6342-6355"},"PeriodicalIF":0.0,"publicationDate":"2024-10-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142810188","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Deep Learning-Based Dual Watermarking for Image Copyright Protection and Authentication","authors":"Sudev Kumar Padhi;Archana Tiwari;Sk. Subidh Ali","doi":"10.1109/TAI.2024.3485519","DOIUrl":"https://doi.org/10.1109/TAI.2024.3485519","url":null,"abstract":"Advancements in digital technologies make it easy to modify the content of digital images. Hence, ensuring digital images’ integrity and authenticity is necessary to protect them against various attacks that manipulate them. We present a deep learning (DL) based dual invisible watermarking technique for performing source authentication, content authentication, and protecting digital content copyright of images sent over the internet. Beyond securing images, the proposed technique demonstrates robustness to content-preserving image manipulation attacks. It is also impossible to imitate or overwrite watermarks because the cryptographic hash of the image and the dominant features of the image in the form of perceptual hash are used as watermarks. We highlighted the need for source authentication to safeguard image integrity and authenticity, along with identifying similar content for copyright protection. After exhaustive testing, our technique obtained a high peak signal-to-noise ratio (PSNR) and structural similarity index measure (SSIM), which implies there is a minute change in the original image after embedding our watermarks. 
Our trained model achieves high watermark extraction accuracy and satisfies two different objectives of verification and authentication on the same watermarked image.","PeriodicalId":73305,"journal":{"name":"IEEE transactions on artificial intelligence","volume":"5 12","pages":"6134-6145"},"PeriodicalIF":0.0,"publicationDate":"2024-10-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142810193","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"MTECC: A Multitask Learning Framework for Esophageal Cancer Analysis","authors":"Jianpeng An;Wenqi Li;Yunhao Bai;Huazhen Chen;Gang Zhao;Qing Cai;Zhongke Gao","doi":"10.1109/TAI.2024.3485524","DOIUrl":"https://doi.org/10.1109/TAI.2024.3485524","url":null,"abstract":"In the field of esophageal cancer diagnostics, the accurate identification and classification of tumors and adjacent tissues within whole slide images (WSIs) are critical. However, this task is complicated by the difficulty in annotating normal tissue on tumor-bearing slides, as the infiltration results in a blend of different tissue types, making annotation difficult for pathologists. To overcome this challenge, we introduce the multitask esophageal cancer classification (MTECC) framework, featuring an innovative dual-branch architecture that operates at both global and local levels. The framework initially employs a masked autoencoder (MAE) for self-supervised learning. A distinctive feature of MTECC is the integration of RandoMix, an innovative image augmentation technique that randomly exchanges patches between different images. This method significantly enhances the model's generalization ability, especially for recognizing tissues within cancerous slides. MTECC ingeniously integrates two tasks: tumor detection using global tokens, and fine-grained tissue classification at the patch level using local tokens. The empirical evaluation of the MTECC on our extensive esophageal cancer dataset substantiates its efficacy. The performance metrics indicate robust results, with an accuracy of 0.811, an F1 score of 0.735, and an AUC of 0.957. 
The MTECC method represents a significant advancement in applying deep learning to complex pathological image analysis, offering valuable tools for pathologists in diagnosing and treating esophageal cancer.","PeriodicalId":73305,"journal":{"name":"IEEE transactions on artificial intelligence","volume":"5 12","pages":"6739-6751"},"PeriodicalIF":0.0,"publicationDate":"2024-10-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142825947","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Jiang-Xing Cheng;Huibin Lin;Chun-Yang Zhang;C. L. Philip Chen
{"title":"Unsupervised Domain Adaptation on Point Clouds via High-Order Geometric Structure Modeling","authors":"Jiang-Xing Cheng;Huibin Lin;Chun-Yang Zhang;C. L. Philip Chen","doi":"10.1109/TAI.2024.3483199","DOIUrl":"https://doi.org/10.1109/TAI.2024.3483199","url":null,"abstract":"Point clouds can capture the precise geometric information of objects and scenes, which are an important source of 3-D data and one of the most popular 3-D geometric data structures for cognitions in many real-world applications like automatic driving and remote sensing. However, due to the influence of sensors and varieties of objects, the point clouds obtained by different devices may suffer obvious geometric changes, resulting in domain gaps that are prone to the neural networks trained in one domain failing to preserve the performance in other domains. To alleviate the above problem, this article proposes an unsupervised domain adaptation framework, named HO-GSM, as the first attempt to model high-order geometric structures of point clouds. First, we construct multiple self-supervised tasks to learn the invariant semantic and geometric features of the source and target domains, especially to capture the feature invariance of high-order geometric structures of point clouds. Second, the discriminative feature space of target domain is acquired by using contrastive learning to refine domain alignment to specific class level. 
Experiments on the PointDA-10 and GraspNetPC-10 collection of datasets show that the proposed HO-GSM can significantly outperform the state-of-the-art counterparts.","PeriodicalId":73305,"journal":{"name":"IEEE transactions on artificial intelligence","volume":"5 12","pages":"6121-6133"},"PeriodicalIF":0.0,"publicationDate":"2024-10-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142810191","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Xiaotian Song;Xiangning Xie;Zeqiong Lv;Gary G. Yen;Weiping Ding;Jiancheng Lv;Yanan Sun
{"title":"Efficient Evaluation Methods for Neural Architecture Search: A Survey","authors":"Xiaotian Song;Xiangning Xie;Zeqiong Lv;Gary G. Yen;Weiping Ding;Jiancheng Lv;Yanan Sun","doi":"10.1109/TAI.2024.3477457","DOIUrl":"https://doi.org/10.1109/TAI.2024.3477457","url":null,"abstract":"Neural architecture search (NAS) has received increasing attention because of its exceptional merits in automating the design of deep neural network (DNN) architectures. However, the performance evaluation process, as a key part of NAS, often requires training a large number of DNNs. This inevitably makes NAS computationally expensive. In past years, many efficient evaluation methods (EEMs) have been proposed to address this critical issue. In this article, we comprehensively survey these EEMs published up to date, and provide a detailed analysis to motivate the further development of this research direction. Specifically, we divide the existing EEMs into four categories based on the number of DNNs trained for constructing these EEMs. The categorization can reflect the degree of efficiency in principle, which can in turn help quickly grasp the methodological features. In surveying each category, we further discuss the design principles and analyze the strengths and weaknesses to clarify the landscape of existing EEMs, thus making easily understanding the research trends of EEMs. Furthermore, we also discuss the current challenges and issues to identify future research directions in this emerging topic. In summary, this survey provides a convenient overview of EEM for interested users, and they can easily select the proper EEM method for the tasks at hand. 
In addition, the researchers in the NAS field could continue exploring the future directions suggested in the article.","PeriodicalId":73305,"journal":{"name":"IEEE transactions on artificial intelligence","volume":"5 12","pages":"5990-6011"},"PeriodicalIF":0.0,"publicationDate":"2024-10-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142810406","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}