Title: A comprehensive survey of federated transfer learning: challenges, methods and applications
Authors: Wei Guo, Fuzhen Zhuang, Xiao Zhang, Yiqi Tong, Jin Dong
Frontiers of Computer Science, published 2024-07-23. DOI: 10.1007/s11704-024-40065-x

Abstract: Federated learning (FL) is a novel distributed machine learning paradigm that enables participants to collaboratively train a centralized model while preserving privacy, since it eliminates the need to share raw data. In practice, FL often involves multiple participants and requires a third party to aggregate global information to guide the update of the target participant. Many FL methods therefore perform poorly, because the training and test data of each participant may not be sampled from the same feature space or the same underlying distribution. Differences in local devices (system heterogeneity), the continuous influx of online data (incremental data), and the scarcity of labeled data can further degrade these methods. To address these problems, federated transfer learning (FTL), which integrates transfer learning (TL) into FL, has attracted the attention of numerous researchers. However, because FL shares knowledge among participants at every communication round while never exposing local data to other participants, FTL faces many unique challenges that are not present in TL. This survey categorizes and reviews the current progress on federated transfer learning and outlines corresponding solutions and applications. Common FTL settings, available datasets, and significant related research are also summarized.

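The aggregation step described above, in which a third party combines participants' updates into a global model, is commonly instantiated as FedAvg-style weighted averaging. The NumPy sketch below is only an illustration of that baseline setting; it is not code from the survey, and the toy model shapes and client sizes are made up.

```python
# Minimal FedAvg-style aggregation sketch (illustrative only, not from the survey).
import numpy as np

def fedavg(client_weights, client_sizes):
    """Weighted average of client model parameters, weighted by local data size."""
    total = float(sum(client_sizes))
    aggregated = [np.zeros_like(w) for w in client_weights[0]]
    for weights, n in zip(client_weights, client_sizes):
        for i, w in enumerate(weights):
            aggregated[i] += (n / total) * w
    return aggregated

# Example: three clients, each holding the parameters of a tiny two-layer model.
clients = [[np.random.randn(4, 2), np.random.randn(2)] for _ in range(3)]
sizes = [100, 250, 50]
global_model = fedavg(clients, sizes)
```
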
Title: DMFVAE: miRNA-disease associations prediction based on deep matrix factorization method with variational autoencoder
Authors: Pijing Wei, Qianqian Wang, Zhen Gao, Ruifen Cao, Chunhou Zheng
Frontiers of Computer Science, published 2024-07-12. DOI: 10.1007/s11704-023-3610-y

Abstract: MicroRNAs (miRNAs) are closely related to numerous complex human diseases; exploring miRNA-disease associations (MDAs) can therefore improve our understanding of complex disease mechanisms. An increasing number of computational methods have been developed to predict MDAs, but the sparsity of known MDAs may hinder their performance. In addition, many methods fail to capture the nonlinear relationships in the miRNA-disease network and inadequately leverage the features of the network and of neighbor nodes. In this study, we propose a deep matrix factorization model with a variational autoencoder (DMFVAE) to predict potential MDAs. DMFVAE first decomposes the original association matrix and an enhanced association matrix, obtained with a self-adjusting nearest-neighbor method, to produce sparse vectors and dense vectors, respectively. A variational encoder then maps the sparse vectors to nonlinear latent vectors for miRNAs and diseases, while node2vec produces network-structure embedding vectors for miRNAs and diseases from the dense vectors. Finally, sample features are formed by combining the latent vectors and the network-structure embedding vectors, and the final prediction is made by a convolutional neural network with channel attention. To evaluate DMFVAE, we conducted five-fold cross-validation on the HMDD v2.0 and HMDD v3.2 datasets, and the results show that DMFVAE performs well. Case studies on lung neoplasms, colon neoplasms, and esophageal neoplasms further confirm the ability of DMFVAE to identify potential disease-related miRNAs.

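To make the variational-encoder component mentioned in the abstract more concrete (mapping a sparse association vector to a nonlinear latent vector), here is a minimal PyTorch sketch. The input dimension, layer sizes, and latent size are hypothetical, and this is not the authors' implementation.

```python
# Illustrative variational encoder for a sparse association row (sketch, not DMFVAE code).
import torch
import torch.nn as nn

class VariationalEncoder(nn.Module):
    def __init__(self, in_dim, hidden_dim=256, latent_dim=64):
        super().__init__()
        self.backbone = nn.Sequential(nn.Linear(in_dim, hidden_dim), nn.ReLU())
        self.mu = nn.Linear(hidden_dim, latent_dim)      # mean of q(z|x)
        self.logvar = nn.Linear(hidden_dim, latent_dim)  # log-variance of q(z|x)

    def forward(self, x):
        h = self.backbone(x)
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterization trick
        return z, mu, logvar

# One (hypothetical) miRNA's binary association vector over 400 diseases.
x = torch.zeros(1, 400)
x[0, [5, 42, 108]] = 1.0
z, mu, logvar = VariationalEncoder(in_dim=400)(x)
```
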
Title: Graph foundation model
Authors: Chuan Shi, Junze Chen, Jiawei Liu, Cheng Yang
Frontiers of Computer Science, published 2024-07-05. DOI: 10.1007/s11704-024-40046-0

Abstract: Graph foundation models (GFMs) represent an evolving direction in graph machine learning. Drawing inspiration from the success of large language models in NLP, GFMs are designed to be trained on extensive graph data and adapted to a diverse array of downstream tasks. In this article, we introduce and explain the concept of GFMs, comparing them with language foundation models to highlight their similarities and differences. We identify the key technologies for building GFMs as the pre-training and adaptation techniques from the fields of GNNs and LLMs. We also discuss the potential for GFMs to have significant applications in various domains, ranging from social network analysis to bioinformatics and beyond.

Title: SEOE: an option graph based semantically embedding method for prenatal depression detection
Authors: Xiaosong Han, Mengchen Cao, Dong Xu, Xiaoyue Feng, Yanchun Liang, Xiaoduo Lang, Renchu Guan
Frontiers of Computer Science, published 2024-06-27. DOI: 10.1007/s11704-024-3612-4

Abstract: Prenatal depression, which can affect pregnant women's physical and psychological health and lead to postpartum depression, is increasing dramatically. It is therefore essential to detect prenatal depression early and to conduct an attribution analysis. Many studies screen for prenatal depression with questionnaires, but existing methods lack attributability. To detect early signs of prenatal depression and identify the key factors that may lead to it from questionnaire data, we present the semantically enhanced option embedding (SEOE) model for representing questionnaire options; it can quantitatively determine the relationships and patterns between options and depression. Because Word2Vec is highly dependent on context, SEOE first quantifies the options and reorders them so that options with little difference are gathered together; this reordering task is formulated as a traveling salesman problem. All questionnaire samples are then used to train the option vectors with Word2Vec. Finally, a fused LSTM-GRU model with a cyclical learning rate is constructed to detect whether a pregnant woman is suffering from depression. To verify the model, we compare it with other deep learning and traditional machine learning methods. The experimental results show that the proposed model accurately identifies pregnant women with depression, reaching an F1 score of 0.8. The most relevant depression factors found by SEOE are also corroborated by the literature. In addition, the model has low computational complexity and generalizes well, so it can be widely applied to questionnaire analyses of other psychiatric disorders.

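A minimal sketch of the Word2Vec step the abstract describes, treating each respondent's questionnaire as a "sentence" of option tokens (gensim API). The option names and answers below are made up for illustration, and the TSP-based reordering is assumed to have already been applied; this is not the authors' code.

```python
# Train option embeddings from questionnaire "sentences" (illustrative sketch).
from gensim.models import Word2Vec

# Each inner list is one hypothetical respondent's chosen options, already reordered
# so that near-equivalent options are adjacent, as SEOE proposes.
questionnaires = [
    ["Q1_opt2", "Q2_opt1", "Q3_opt3", "Q4_opt2"],
    ["Q1_opt1", "Q2_opt1", "Q3_opt2", "Q4_opt3"],
    ["Q1_opt3", "Q2_opt2", "Q3_opt3", "Q4_opt1"],
]

model = Word2Vec(questionnaires, vector_size=32, window=4, min_count=1, sg=1, epochs=50)
print(model.wv["Q1_opt2"][:5])                      # embedding of one option
print(model.wv.similarity("Q1_opt2", "Q2_opt1"))    # option-option similarity
```
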
Title: WPIA: accelerating DNN warm-up in Web browsers by precompiling WebGL programs
Authors: Deyu Tian, Yun Ma, Yudong Han, Qi Yang, Haochen Yang, Gang Huang
Frontiers of Computer Science, published 2024-06-25. DOI: 10.1007/s11704-024-40066-w

Abstract: In this paper, we study the long warm-up time of GPU-accelerated DNN inference in Web browsers. A measurement study of its causes reveals that compiling WebGL programs accounts for most of the warm-up time. Motivated by this finding, we propose WPIA, an approach that precompiles WebGL programs on the server side so that they do not have to be compiled in the browser. WPIA tackles the challenges of precompilation by merging WebGL programs and using a record-and-replay technique. Evaluation results show that WPIA shortens DNN warm-up time by up to an order of magnitude.

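WPIA itself operates on WebGL shader programs inside the browser; the short Python sketch below only illustrates the general precompile-and-reuse idea (pay the compilation cost once, offline, and reuse cached artifacts at warm-up). The timings and the "compile" step are simulated, and none of this reflects WPIA's actual implementation.

```python
# Toy illustration of precompile-and-reuse: compile once offline, reuse at warm-up.
import time

def compile_program(source):
    time.sleep(0.5)                     # stand-in for slow shader compilation
    return f"binary({source})"

def warm_up(sources, precompiled=None):
    start = time.time()
    programs = [(precompiled or {}).get(s) or compile_program(s) for s in sources]
    return programs, time.time() - start

sources = ["conv2d", "matmul", "softmax"]
_, cold_time = warm_up(sources)                               # compiles everything on the spot
precompiled = {s: compile_program(s) for s in sources}        # done ahead of time / server side
_, warm_time = warm_up(sources, precompiled)
print(f"cold warm-up {cold_time:.2f}s vs. precompiled {warm_time:.2f}s")
```
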
Title: FedTop: a constraint-loosed federated learning aggregation method against poisoning attack
Authors: Che Wang, Zhenhao Wu, Jianbo Gao, Jiashuo Zhang, Junjie Xia, Feng Gao, Zhi Guan, Zhong Chen
Frontiers of Computer Science, published 2024-06-25. DOI: 10.1007/s11704-024-3767-z

Abstract: In this paper, we develop FedTop, which substantially improves collaboration among benign participants without suffering significant negative impacts from malicious participants. FedTop can serve as an ordinary aggregation method for federated learning on benign data, and it can also withstand severe poisoning attacks, both targeted and untargeted, under looser preconditions. We experimentally demonstrate that the method significantly improves learning performance in a malicious environment. Our work is still limited in the choice of datasets, the choice of base models, and the number of malicious models considered. Our future work will therefore focus on experiments with more scenarios, such as increasing the number of participants or designing more complex poisoning attacks on more complex datasets.

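The abstract does not detail FedTop's aggregation rule, so the sketch below instead shows a standard poisoning-robust baseline, coordinate-wise median aggregation, purely to make the attack-and-defense setting concrete. It should not be read as FedTop itself, and the honest/poisoned updates are synthetic.

```python
# Generic poisoning-robust aggregation baseline (not the FedTop method).
import numpy as np

def coordinatewise_median(client_updates):
    """Aggregate client model updates by the per-parameter median, limiting the
    influence of a minority of poisoned updates."""
    stacked = np.stack(client_updates, axis=0)        # shape: (n_clients, n_params)
    return np.median(stacked, axis=0)

honest = [np.random.normal(0.0, 0.1, size=10) for _ in range(8)]
poisoned = [np.full(10, 50.0) for _ in range(2)]      # crude untargeted attack
print(coordinatewise_median(honest + poisoned))       # stays close to the honest updates
```
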
Title: Audio-guided self-supervised learning for disentangled visual speech representations
Authors: Dalu Feng, Shuang Yang, Shiguang Shan, Xilin Chen
Frontiers of Computer Science, published 2024-06-25. DOI: 10.1007/s11704-024-3787-8

Abstract: In this paper, we propose a novel two-branch framework that learns disentangled visual speech representations, motivated by two particular observations. The main idea is to introduce the audio signal to guide the learning of speech-relevant cues and to introduce a bottleneck that restricts the speech-irrelevant branch from learning high-frequency, fine-grained speech cues. Experiments on the word-level and sentence-level audio-visual speech datasets LRW and LRS2-BBC show the effectiveness of the approach. Our future work is to explore more explicit auxiliary tasks and constraints, beyond the reconstruction task shared by the speech-relevant and speech-irrelevant branches, to further improve the framework's ability to capture speech cues in video. It would also be worthwhile to combine multiple types of knowledge representations [10] to further strengthen the obtained speech representations, which is likewise left for future work.

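A schematic PyTorch sketch of the two-branch idea described above: one branch aligned with audio features (speech-relevant) and one squeezed through a narrow bottleneck (speech-irrelevant). The dimensions, feature extractors, and alignment loss are hypothetical stand-ins, not the paper's architecture.

```python
# Schematic two-branch encoder with an audio-guided relevant branch and a
# bottlenecked irrelevant branch (illustrative sketch only).
import torch
import torch.nn as nn

class TwoBranchEncoder(nn.Module):
    def __init__(self, visual_dim=512, relevant_dim=256, bottleneck_dim=16):
        super().__init__()
        self.relevant = nn.Linear(visual_dim, relevant_dim)      # speech-relevant cues
        self.irrelevant = nn.Linear(visual_dim, bottleneck_dim)  # narrow bottleneck

    def forward(self, visual_feat):
        return self.relevant(visual_feat), self.irrelevant(visual_feat)

encoder = TwoBranchEncoder()
visual_feat = torch.randn(8, 512)     # per-frame visual features (hypothetical)
audio_feat = torch.randn(8, 256)      # audio features used as the guidance target
speech, other = encoder(visual_feat)
guidance_loss = nn.functional.mse_loss(speech, audio_feat)  # align relevant branch with audio
```
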
Title: JAPO: learning join and pushdown order for cloud-native join optimization
Authors: Yuchen Yuan, Xiaoyue Feng, Bo Zhang, Pengyi Zhang, Jie Song
Frontiers of Computer Science, published 2024-06-25. DOI: 10.1007/s11704-024-3937-z

Abstract: In this paper, we introduce JAPO, which learns the join order and pushdown order through deep reinforcement learning (DRL). The main idea is that the DRL agent tries different actions and learns better decisions from experience by monitoring the resulting rewards and latencies. The results show that our method generates good plans for both join order and pushdown order. We also show experimentally that our method selects well-performing distributed index placements. In the future, we plan to deploy JAPO in real system executions and to consider more factors, such as different join types.

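A toy sketch of the trial-and-error loop the abstract alludes to: an agent tries candidate join orders, observes latency as a negative reward, and gradually prefers faster plans. A tabular epsilon-greedy bandit stands in for the paper's DRL agent, and the plan names and latencies are simulated.

```python
# Toy latency-driven plan selection loop (stand-in for JAPO's DRL agent).
import random

candidate_orders = ["ABC", "BCA", "CAB"]          # hypothetical join orders
q_value = {order: 0.0 for order in candidate_orders}
counts = {order: 0 for order in candidate_orders}

def execute_and_measure(order):
    """Stand-in for running the plan and timing it (seconds, lower is better)."""
    simulated = {"ABC": 2.0, "BCA": 0.8, "CAB": 1.5}[order]
    return simulated + random.uniform(-0.1, 0.1)

for step in range(200):
    if random.random() < 0.1:                     # explore
        order = random.choice(candidate_orders)
    else:                                         # exploit the best estimate so far
        order = max(q_value, key=q_value.get)
    reward = -execute_and_measure(order)          # faster plan -> higher reward
    counts[order] += 1
    q_value[order] += (reward - q_value[order]) / counts[order]   # incremental mean

print(max(q_value, key=q_value.get))              # converges to the fastest order, "BCA"
```
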
Title: TV100: a TV series dataset that pre-trained CLIP has not seen
Authors: Da-Wei Zhou, Zhi-Hong Qi, Han-Jia Ye, De-Chuan Zhan
Frontiers of Computer Science, published 2024-06-06. DOI: 10.1007/s11704-024-40217-z

Abstract: The era of pre-trained models has brought a wealth of new insights to the machine learning community. Among the many questions that arise, one of paramount importance is: do pre-trained models possess comprehensive knowledge? This paper seeks to address this inquiry. To that end, we publicly release a novel dataset of images from TV series released after 2021. The dataset holds significant potential for use in various research areas, including the evaluation of novel class discovery and long-tailed learning, among others.

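The kind of zero-shot probe such a dataset enables can be run with any public CLIP checkpoint. The sketch below uses the Hugging Face transformers API; the image path and candidate labels are placeholders, and the checkpoint name is just one commonly available choice.

```python
# Zero-shot CLIP probe on an image from a post-2021 TV series (illustrative sketch).
from transformers import CLIPModel, CLIPProcessor
from PIL import Image
import torch

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

image = Image.open("frame.jpg")                     # placeholder frame from a recent series
candidate_titles = ["a scene from Series X",        # made-up candidate labels
                    "a scene from Series Y"]
inputs = processor(text=candidate_titles, images=image, return_tensors="pt", padding=True)
with torch.no_grad():
    logits = model(**inputs).logits_per_image
print(logits.softmax(dim=-1))   # low-confidence or wrong predictions would support the paper's premise
```
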
Title: HeterMM: applying in-DRAM index to heterogeneous memory-based key-value stores
Authors: Yunhong Ji, Wentao Huang, Xuan Zhou
Frontiers of Computer Science, published 2024-04-08. DOI: 10.1007/s11704-024-3713-0

Abstract: We propose HeterMM, a versatile framework that leverages in-DRAM indexes in key-value (KV) stores on heterogeneous memory. HeterMM incorporates a plug-in programming model, allowing various types of indexes to be integrated. By keeping both the index and hot data in DRAM, HeterMM makes the most of DRAM's superior performance. Our evaluation demonstrates that HeterMM outperforms existing state-of-the-art frameworks that convert in-DRAM indexes to persistent ones, and that it can surpass NVM-specific KV stores when an appropriate index is selected for the given scenario.

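A conceptual Python sketch of the division of labor the abstract describes: the index and hot values stay in fast memory, while cold values go to a slower backing store. Here a dict plays the role of DRAM and a file-backed shelf stands in for NVM; this is not HeterMM's actual design or code, and the eviction policy is deliberately simplistic.

```python
# Conceptual tiered key-value store: in-memory index + hot data, slower cold store.
import shelve

class TieredKVStore:
    def __init__(self, path, hot_capacity=2):
        self.index = {}                    # "in-DRAM" index: key -> location tag
        self.hot = {}                      # hot values kept in fast memory
        self.cold = shelve.open(path)      # stand-in for slower NVM-resident data
        self.hot_capacity = hot_capacity

    def put(self, key, value):
        if key in self.hot or len(self.hot) < self.hot_capacity:
            self.hot[key] = value
            self.index[key] = "hot"
        else:
            self.cold[key] = value
            self.index[key] = "cold"

    def get(self, key):
        where = self.index.get(key)
        if where == "hot":
            return self.hot[key]
        if where == "cold":
            return self.cold[key]
        return None

store = TieredKVStore("cold.db")
store.put("a", 1); store.put("b", 2); store.put("c", 3)   # "c" spills to the cold tier
print(store.get("a"), store.get("c"))
```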