{"title":"Learning group interaction for sports video understanding from a perspective of athlete","authors":"Rui He, Zehua Fu, Qingjie Liu, Yunhong Wang, Xunxun Chen","doi":"10.1007/s11704-023-2525-y","DOIUrl":"https://doi.org/10.1007/s11704-023-2525-y","url":null,"abstract":"<p>Learning activities interactions between small groups is a key step in understanding team sports videos. Recent research focusing on team sports videos can be strictly regarded from the perspective of the audience rather than the athlete. For team sports videos such as volleyball and basketball videos, there are plenty of intra-team and inter-team relations. In this paper, a new task named Group Scene Graph Generation is introduced to better understand intra-team relations and inter-team relations in sports videos. To tackle this problem, a novel Hierarchical Relation Network is proposed. After all players in a video are finely divided into two teams, the feature of the two teams’ activities and interactions will be enhanced by Graph Convolutional Networks, which are finally recognized to generate Group Scene Graph. For evaluation, built on <i>Volleyball</i> dataset with additional 9660 team activity labels, a <i>Volleyball+</i> dataset is proposed. A baseline is set for better comparison and our experimental results demonstrate the effectiveness of our method. Moreover, the idea of our method can be directly utilized in another video-based task, Group Activity Recognition. Experiments show the priority of our method and display the link between the two tasks. 
Finally, from the athlete’s view, we elaborately present an interpretation that shows how to utilize Group Scene Graph to analyze teams’ activities and provide professional gaming suggestions.</p>","PeriodicalId":12640,"journal":{"name":"Frontiers of Computer Science","volume":"33 1","pages":""},"PeriodicalIF":4.2,"publicationDate":"2023-12-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138743512","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"$\\mathcal{Y}$-Tuning: an efficient tuning paradigm for large-scale pre-trained models via label representation learning","authors":"Yitao Liu, Chenxin An, Xipeng Qiu","doi":"10.1007/s11704-023-3131-8","DOIUrl":"https://doi.org/10.1007/s11704-023-3131-8","url":null,"abstract":"<p>With the current success of large-scale pre-trained models (PTMs), how to efficiently adapt PTMs to downstream tasks has attracted tremendous attention, especially for PTMs with billions of parameters. Previous work focuses on designing parameter-efficient tuning paradigms but needs to save and compute the gradient of the whole computational graph. In this paper, we propose <span>\\(\\mathcal{Y}\\)</span>-Tuning, an efficient yet effective paradigm to adapt frozen large-scale PTMs to specific downstream tasks. <span>\\(\\mathcal{Y}\\)</span>-Tuning learns dense representations for labels <span>\\(\\mathcal{Y}\\)</span> defined in a given task and aligns them to fixed feature representation. Without computing the gradients of text encoder at training phase, <span>\\(\\mathcal{Y}\\)</span>-Tuning is not only parameter-efficient but also training-efficient. Experimental results show that for DeBERTa<sub>XXL</sub> with 1.6 billion parameters, <span>\\(\\mathcal{Y}\\)</span>-Tuning achieves performance more than 96% of full fine-tuning on GLUE Benchmark with only 2% tunable parameters and much fewer training costs.</p>","PeriodicalId":12640,"journal":{"name":"Frontiers of Computer Science","volume":"38 1","pages":""},"PeriodicalIF":4.2,"publicationDate":"2023-12-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138743951","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A prompt-based approach to adversarial example generation and robustness enhancement","authors":"Yuting Yang, Pei Huang, Juan Cao, Jintao Li, Yun Lin, Feifei Ma","doi":"10.1007/s11704-023-2639-2","DOIUrl":"https://doi.org/10.1007/s11704-023-2639-2","url":null,"abstract":"<p>Recent years have seen the wide application of natural language processing (NLP) models in crucial areas such as finance, medical treatment, and news media, raising concerns about the model robustness and vulnerabilities. We find that prompt paradigm can probe special robust defects of pre-trained language models. Malicious prompt texts are first constructed for inputs and a pre-trained language model can generate adversarial examples for victim models via mask filling. Experimental results show that prompt paradigm can efficiently generate more diverse adversarial examples besides synonym substitution. Then, we propose a novel robust training approach based on prompt paradigm which incorporates prompt texts as the alternatives to adversarial examples and enhances robustness under a lightweight minimax-style optimization framework. Experiments on three real-world tasks and two deep neural models show that our approach can significantly improve the robustness of models to resist adversarial attacks.</p>","PeriodicalId":12640,"journal":{"name":"Frontiers of Computer Science","volume":"1 1","pages":""},"PeriodicalIF":4.2,"publicationDate":"2023-12-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138715451","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Quantum speedup and limitations on matroid property problems","authors":"Xiaowei Huang, Jingquan Luo, Lvzhou Li","doi":"10.1007/s11704-023-3130-9","DOIUrl":"https://doi.org/10.1007/s11704-023-3130-9","url":null,"abstract":"<p>Matroid theory has been developed to be a mature branch of mathematics and has extensive applications in combinatorial optimization, algorithm design and so on. On the other hand, quantum computing has attracted much attention and has been shown to surpass classical computing on solving some computational problems. Surprisingly, crossover studies of the two fields seem to be missing in the literature. This paper initiates the study of quantum algorithms for matroid property problems. It is shown that quadratic quantum speedup is possible for the calculation problem of finding the girth or the number of circuits (bases, flats, hyperplanes) of a matroid, and for the decision problem of deciding whether a matroid is uniform or Eulerian, by giving a uniform lower bound <span>\\(\\Omega\\left(\\sqrt{\\binom{n}{\\lfloor n/2 \\rfloor}}\\right)\\)</span> on the query complexity of all these problems. On the other hand, for the uniform matroid decision problem, an asymptotically optimal quantum algorithm is proposed which achieves the lower bound, and for the girth problem, an almost optimal quantum algorithm is given with query complexity <span>\\(O\\left(\\log n\\sqrt{\\binom{n}{\\lfloor n/2 \\rfloor}}\\right)\\)</span>. 
In addition, for the paving matroid decision problem, a lower bound <span>\\(\\Omega\\left(\\sqrt{\\binom{n}{\\lfloor n/2 \\rfloor}/n}\\right)\\)</span> on the query complexity is obtained, and an <span>\\(O\\left(\\sqrt{\\binom{n}{\\lfloor n/2 \\rfloor}}\\right)\\)</span> quantum algorithm is presented.</p>","PeriodicalId":12640,"journal":{"name":"Frontiers of Computer Science","volume":"9 1","pages":""},"PeriodicalIF":4.2,"publicationDate":"2023-12-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138743393","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"LMR-CBT: learning modality-fused representations with CB-Transformer for multimodal emotion recognition from unaligned multimodal sequences","authors":"Ziwang Fu, Feng Liu, Qing Xu, Xiangling Fu, Jiayin Qi","doi":"10.1007/s11704-023-2444-y","DOIUrl":"https://doi.org/10.1007/s11704-023-2444-y","url":null,"abstract":"<p>Learning modality-fused representations and processing unaligned multimodal sequences are meaningful and challenging in multimodal emotion recognition. Existing approaches use directional pairwise attention or a message hub to fuse language, visual, and audio modalities. However, these fusion methods are often quadratic in complexity with respect to the modal sequence length, bring redundant information and are not efficient. In this paper, we propose an efficient neural network to learn modality-fused representations with CB-Transformer (LMR-CBT) for multimodal emotion recognition from unaligned multi-modal sequences. Specifically, we first perform feature extraction for the three modalities respectively to obtain the local structure of the sequences. Then, we design an innovative asymmetric transformer with cross-modal blocks (CB-Transformer) that enables complementary learning of different modalities, mainly divided into local temporal learning, cross-modal feature fusion and global self-attention representations. In addition, we splice the fused features with the original features to classify the emotions of the sequences. Finally, we conduct word-aligned and unaligned experiments on three challenging datasets, IEMOCAP, CMU-MOSI, and CMU-MOSEI. The experimental results show the superiority and efficiency of our proposed method in both settings. 
Compared with the mainstream methods, our approach reaches the state-of-the-art with a minimum number of parameters.</p>","PeriodicalId":12640,"journal":{"name":"Frontiers of Computer Science","volume":"19 1","pages":""},"PeriodicalIF":4.2,"publicationDate":"2023-12-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138681628","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Communication-robust multi-agent learning by adaptable auxiliary multi-agent adversary generation","authors":"Lei Yuan, Feng Chen, Zongzhang Zhang, Yang Yu","doi":"10.1007/s11704-023-2733-5","DOIUrl":"https://doi.org/10.1007/s11704-023-2733-5","url":null,"abstract":"<p>Communication can promote coordination in cooperative Multi-Agent Reinforcement Learning (MARL). Existing works mainly focus on improving the communication efficiency of agents, neglecting that real-world communication is much more challenging as there may exist noise or potential attackers. Thus the robustness of the communication-based policies becomes an emergent and severe issue that needs more exploration. In this paper, we posit that the ego system trained with auxiliary adversaries may handle this limitation and propose an adaptable method of <b>M</b>ulti<b>-A</b>gent <b>A</b>uxiliary <b>A</b>dversaries Generation for robust <b>C</b>ommunication, dubbed MA3C, to obtain a robust communication-based policy. Specifically, we introduce a novel message-attacking approach that models the learning of the auxiliary attacker as a cooperative problem under a shared goal to minimize the coordination ability of the ego system, with which every information channel may suffer from distinct message attacks. Furthermore, as naive adversarial training may impede the generalization ability of the ego system, we design an attacker population generation approach based on evolutionary learning. Finally, the ego system is paired with an attacker population and then alternately trained against the continuously evolving attackers to improve its robustness, meaning that both the ego system and the attackers are adaptable. 
Extensive experiments on multiple benchmarks indicate that our proposed MA3C provides comparable or better robustness and generalization ability than other baselines.</p>","PeriodicalId":12640,"journal":{"name":"Frontiers of Computer Science","volume":"18 1","pages":""},"PeriodicalIF":4.2,"publicationDate":"2023-12-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138681760","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A survey on dynamic graph processing on GPUs: concepts, terminologies and systems","authors":"Hongru Gao, Xiaofei Liao, Zhiyuan Shao, Kexin Li, Jiajie Chen, Hai Jin","doi":"10.1007/s11704-023-2656-1","DOIUrl":"https://doi.org/10.1007/s11704-023-2656-1","url":null,"abstract":"<p>Graphs that are used to model real-world entities with vertices and relationships among entities with edges, have proven to be a powerful tool for describing real-world problems in applications. In most real-world scenarios, entities and their relationships are subject to constant changes. Graphs that record such changes are called dynamic graphs. In recent years, the widespread application scenarios of dynamic graphs have stimulated extensive research on dynamic graph processing systems that continuously ingest graph updates and produce up-to-date graph analytics results. As the scale of dynamic graphs becomes larger, higher performance requirements are demanded to dynamic graph processing systems. With the massive parallel processing power and high memory bandwidth, GPUs become mainstream vehicles to accelerate dynamic graph processing tasks. GPU-based dynamic graph processing systems mainly address two challenges: maintaining the graph data when updates occur (i.e., graph updating) and producing analytics results in time (i.e., graph computing). In this paper, we survey GPU-based dynamic graph processing systems and review their methods on addressing both graph updating and graph computing. To comprehensively discuss existing dynamic graph processing systems on GPUs, we first introduce the terminologies of dynamic graph processing and then develop a taxonomy to describe the methods employed for graph updating and graph computing. 
In addition, we discuss the challenges and future research directions of dynamic graph processing on GPUs.</p>","PeriodicalId":12640,"journal":{"name":"Frontiers of Computer Science","volume":"2 1","pages":""},"PeriodicalIF":4.2,"publicationDate":"2023-12-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138682019","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A MLP-Mixer and mixture of expert model for remaining useful life prediction of lithium-ion batteries","authors":"","doi":"10.1007/s11704-023-3277-4","DOIUrl":"https://doi.org/10.1007/s11704-023-3277-4","url":null,"abstract":"<h3>Abstract</h3> <p>Accurately predicting the Remaining Useful Life (RUL) of lithium-ion batteries is crucial for battery management systems. Deep learning-based methods have been shown to be effective in predicting RUL by leveraging battery capacity time series data. However, the representation learning of features such as long-distance sequence dependencies and mutations in capacity time series still needs to be improved. To address this challenge, this paper proposes a novel deep learning model, the MLP-Mixer and Mixture of Expert (MMMe) model, for RUL prediction. The MMMe model leverages the Gated Recurrent Unit and Multi-Head Attention mechanism to encode the sequential data of battery capacity to capture the temporal features and a re-zero MLP-Mixer model to capture the high-level features. Additionally, we devise an ensemble predictor based on a Mixture-of-Experts (MoE) architecture to generate reliable RUL predictions. The experimental results on public datasets demonstrate that our proposed model significantly outperforms other existing methods, providing more reliable and precise RUL predictions while also accurately tracking the capacity degradation process. 
Our code and dataset are available on GitHub.</p>","PeriodicalId":12640,"journal":{"name":"Frontiers of Computer Science","volume":"6 1","pages":""},"PeriodicalIF":4.2,"publicationDate":"2023-12-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138681697","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Representation learning: serial-autoencoder for personalized recommendation","authors":"Yi Zhu, Yishuai Geng, Yun Li, Jipeng Qiang, Xindong Wu","doi":"10.1007/s11704-023-2441-1","DOIUrl":"https://doi.org/10.1007/s11704-023-2441-1","url":null,"abstract":"<p>Nowadays, personalized recommendation has become a research hotspot for addressing information overload. Despite this, generating effective recommendations from sparse data remains a challenge. Recently, auxiliary information has been widely used to address data sparsity, but most models using auxiliary information are linear and have limited expressiveness. Due to the advantages of feature extraction and no-label requirements, autoencoder-based methods have become quite popular. However, most existing autoencoder-based methods discard the reconstruction of auxiliary information, which poses huge challenges for better representation learning and model scalability. To address these problems, we propose Serial-Autoencoder for Personalized Recommendation (SAPR), which aims to reduce the loss of critical information and enhance the learning of feature representations. Specifically, we first combine the original rating matrix and item attribute features and feed them into the first autoencoder for generating a higher-level representation of the input. Second, we use a second autoencoder to enhance the reconstruction of the data representation of the prediction rating matrix. The output rating information is used for recommendation prediction. 
Extensive experiments on the MovieTweetings and MovieLens datasets have verified the effectiveness of SAPR compared to state-of-the-art models.</p>","PeriodicalId":12640,"journal":{"name":"Frontiers of Computer Science","volume":"187 4 1","pages":""},"PeriodicalIF":4.2,"publicationDate":"2023-12-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138681699","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Robust AUC maximization for classification with pairwise confidence comparisons","authors":"Haochen Shi, Mingkun Xie, Shengjun Huang","doi":"10.1007/s11704-023-2709-5","DOIUrl":"https://doi.org/10.1007/s11704-023-2709-5","url":null,"abstract":"<p>Supervised learning often requires a large number of labeled examples, which has become a critical bottleneck in the case that manual annotating the class labels is costly. To mitigate this issue, a new framework called pairwise comparison (Pcomp) classification is proposed to allow training examples only weakly annotated with pairwise comparison, i.e., which one of two examples is more likely to be positive. The previous study solves Pcomp problems by minimizing the classification error, which may lead to less robust model due to its sensitivity to class distribution. In this paper, we propose a robust learning framework for Pcomp data along with a pairwise surrogate loss called Pcomp-AUC. It provides an unbiased estimator to equivalently maximize AUC without accessing the precise class labels. Theoretically, we prove the consistency with respect to AUC and further provide the estimation error bound for the proposed method. Empirical studies on multiple datasets validate the effectiveness of the proposed method.</p>","PeriodicalId":12640,"journal":{"name":"Frontiers of Computer Science","volume":"6 1","pages":""},"PeriodicalIF":4.2,"publicationDate":"2023-12-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138681625","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}