Tsinghua Science and Technology: Latest Articles

Classification Hardness Based Adaptive Sampling Ensemble for Imbalanced Data Classification
IF 6.6, Zone 1, Computer Science
Tsinghua Science and Technology Pub Date : 2025-07-04 DOI: 10.26599/TST.2024.9010149
Zenghao Cui;Ziyi Gao;Shuaibing Yue;Rui Wang;Haiyan Zhu
{"title":"Classification Hardness Based Adaptive Sampling Ensemble for Imbalanced Data Classification","authors":"Zenghao Cui;Ziyi Gao;Shuaibing Yue;Rui Wang;Haiyan Zhu","doi":"10.26599/TST.2024.9010149","DOIUrl":"https://doi.org/10.26599/TST.2024.9010149","url":null,"abstract":"Class imbalance can substantially affect classification tasks using traditional classifiers, especially when identifying instances of minority categories. In addition to class imbalance, other challenges can also hinder accurate classification. Researchers have explored various approaches to mitigate the effects of class imbalance. However, most studies focus only on processing correlations within a single category of samples. This paper introduces an ensemble framework called Inter- and Intra-Class Overlapping Ensemble (IICOE), which incorporates two sampling methods. The first method, which is based on classification hardness undersampling, targets majority category samples by using simple samples as the foundation for classification and improving performance by focusing on samples near classification boundaries. The second method addresses the issue of overfitting minority category samples in undersampling and ensemble learning. To mitigate this, an adaptive augment hybrid sampling method is proposed, which enhances the classification boundary of samples and reduces overfitting. 
This paper conducts multiple experiments on 15 public datasets and concludes that the IICOE ensemble framework outperforms other ensemble learning algorithms in classifying imbalanced data.","PeriodicalId":48690,"journal":{"name":"Tsinghua Science and Technology","volume":"30 6","pages":"2419-2433"},"PeriodicalIF":6.6,"publicationDate":"2025-07-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11072117","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144557916","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
FedCE: A Contrast Enhancement Federated Learning Method for Heterogeneous Medical Named Entity Recognition
IF 6.6, Zone 1, Computer Science
Tsinghua Science and Technology Pub Date : 2025-07-04 DOI: 10.26599/TST.2024.9010186
Kai Chang;Hailong Sun;Jindou Wan;Naiqian Zhang;Yiming Liu;Kuo Yang;Zixin Shu;Jianan Xia;Xuezhong Zhou
{"title":"FedCE: A Contrast Enhancement Federated Learning Method for Heterogeneous Medical Named Entity Recognition","authors":"Kai Chang;Hailong Sun;Jindou Wan;Naiqian Zhang;Yiming Liu;Kuo Yang;Zixin Shu;Jianan Xia;Xuezhong Zhou","doi":"10.26599/TST.2024.9010186","DOIUrl":"https://doi.org/10.26599/TST.2024.9010186","url":null,"abstract":"Medical Named Entity Recognition (NER) plays a crucial role in attaining precise patient portraits as well as providing support for intelligent diagnosis and treatment decisions. Federated Learning (FL) enables collaborative modeling and training across multiple endpoints without exposing the original data. However, the statistical heterogeneity exhibited by clinical medical text records poses a challenge for FL methods to support the training of NER models in such scenarios. We propose a Federated Contrast Enhancement (FedCE) method for NER to address the challenges faced by non-large-scale pre-trained models in FL for label-heterogeneous. The method leverages a multi-view encoder structure to capture both global and local semantic information, and employs contrastive learning to enhance the interoperability of global knowledge and local context. We evaluate the performance of the FedCE method on three real-world clinical record datasets. We investigate the impact of factors, such as pooling methods, maximum input text length, and training rounds on FedCE. Additionally, we assess how well FedCE adapts to the base NER models and evaluate its generalization performance. 
The experimental results show that the FedCE method has obvious advantages and can be effectively applied to various basic models, which is of great theoretical and practical significance for advancing FL in healthcare settings.","PeriodicalId":48690,"journal":{"name":"Tsinghua Science and Technology","volume":"30 6","pages":"2384-2398"},"PeriodicalIF":6.6,"publicationDate":"2025-07-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11072110","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144557926","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Total Contents
IF 6.6, Zone 1, Computer Science
Tsinghua Science and Technology Pub Date : 2025-07-04
{"title":"Total Contents","authors":"","doi":"","DOIUrl":"https://doi.org/","url":null,"abstract":"","PeriodicalId":48690,"journal":{"name":"Tsinghua Science and Technology","volume":"30 6","pages":"I-XIII"},"PeriodicalIF":6.6,"publicationDate":"2025-07-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11072058","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144557874","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Output Type Guided Random Test Case Generation for String Validation Routines
IF 6.6, Zone 1, Computer Science
Tsinghua Science and Technology Pub Date : 2025-07-04 DOI: 10.26599/TST.2024.9010023
Chenhui Cui;Rubing Huang;Jinfu Chen;Yunan Zhou
{"title":"Output Type Guided Random Test Case Generation for String Validation Routines","authors":"Chenhui Cui;Rubing Huang;Jinfu Chen;Yunan Zhou","doi":"10.26599/TST.2024.9010023","DOIUrl":"https://doi.org/10.26599/TST.2024.9010023","url":null,"abstract":"String validation routines have been widely used in many real-world applications, such as email validation and postcode validation. String test cases are adopted to test these validation routines, to identify potential defects and security risks. Random Testing (RT) is a well-known testing approach to randomly generate string test cases from the input domain (i.e., the set of all possible test inputs), which is simple to implement at a low cost. However, its testing effectiveness may be unsatisfactory for string validation routines. The main reason for this is that RT may have a high probability to generate invalid rather than valid string test cases, due to its randomness property. This research proposes a new RT approach based on the output types (i.e., valid and invalid strings) for string validation routines, namely Output-type-guided Random Testing (RT-O), which attempts to randomly generate both valid and invalid string test cases with a certain probability. This research performed an empirical study involving several real-world string validation routines collected from ten Java open-source projects, to investigate and compare testing performances of RT-O against the previous two widely-used RT methods. 
The results show that the generated string test cases by RT-O outperform test cases generated by other RT methods.","PeriodicalId":48690,"journal":{"name":"Tsinghua Science and Technology","volume":"30 6","pages":"2467-2486"},"PeriodicalIF":6.6,"publicationDate":"2025-07-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11072066","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144557876","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
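The RT-O abstract above describes choosing, with a set probability, whether each random test case should be a valid or an invalid string. A minimal Python sketch of that idea follows. The six-digit postcode validator, the two generators, and the p_valid parameter are hypothetical stand-ins chosen for illustration; the abstract does not give the paper's actual generators or probabilities.

```python
import random
import string

def is_valid_postcode(s: str) -> bool:
    """Hypothetical routine under test: accepts exactly six ASCII digits."""
    return len(s) == 6 and all(c in string.digits for c in s)

def gen_valid(rng: random.Random) -> str:
    """Draw a random string from the valid output class."""
    return "".join(rng.choice(string.digits) for _ in range(6))

def gen_invalid(rng: random.Random) -> str:
    """Draw a random string from the invalid class by breaking a valid one."""
    chars = list(gen_valid(rng))
    if rng.random() < 0.5:
        # Break the character-set rule: put a letter at a random position.
        chars[rng.randrange(6)] = rng.choice(string.ascii_letters)
    else:
        # Break the length rule: append a seventh digit.
        chars.append(rng.choice(string.digits))
    return "".join(chars)

def rt_o(n: int, p_valid: float = 0.5, seed: int = 0):
    """Generate n (test_string, expected_verdict) pairs, guided by output type."""
    rng = random.Random(seed)
    cases = []
    for _ in range(n):
        if rng.random() < p_valid:
            cases.append((gen_valid(rng), True))
        else:
            cases.append((gen_invalid(rng), False))
    return cases
```

Each generated pair carries its expected verdict, so a disagreement between is_valid_postcode(s) and that verdict would flag a defect in the routine under test. Plain random testing over all strings would, by contrast, almost never produce a valid six-digit input.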
Accelerating Distributed Training of Large Concurrent-Branch Models Through Bidirectional Pipeline Coordination
IF 6.6, Zone 1, Computer Science
Tsinghua Science and Technology Pub Date : 2025-07-04 DOI: 10.26599/TST.2024.9010233
Zan Zong;Yuyang Chen;Qi Zhang;Daming Zhao;Jianjiang Li;Yijun Jing;Jidong Zhai
{"title":"Accelerating Distributed Training of Large Concurrent-Branch Models Through Bidirectional Pipeline Coordination","authors":"Zan Zong;Yuyang Chen;Qi Zhang;Daming Zhao;Jianjiang Li;Yijun Jing;Jidong Zhai","doi":"10.26599/TST.2024.9010233","DOIUrl":"https://doi.org/10.26599/TST.2024.9010233","url":null,"abstract":"Large models have been widely used in the field of neural language processing, information retrieving, etc. With the development of the large models, not only is the parameter scale increased, but the model architecture has also become more complex. For example, the multi-modal transformer-based model mainly has concurrent branches, which we denoted as the concurrent branch model (CBM). Many CBMs have enlarged to tens of billions of parameters, and require distributed resources to train this kind of model. Existing distributed training systems cannot fully handle this type of model architecture because there are interactions between branches. Inspired by the unbalanced resource usage of pipeline parallelism, we prefer to organize different branches with a fine-grained bidirectional pipeline schedule of communication and computation. However, improper coordination between branches leads to idle time for computation and low training efficiency. In this paper, we present Flexpipe, a pipeline engine for concurrent-branch models. We first introduce a branch-aware pipeline parallelism (BAPP) to make full use of the concurrent characteristic of the model architecture. Then, based on a multi-branch pipeline simulator, we propose an adaptive interaction coordinator, which facilitates the low-overhead branch interactions during the distributed model training. We evaluate our approach on popular concurrent branch models combined with modern training systems. 
Compared with Chimera, the experimental results show that our method improves the end-to-end training throughput by 20% on average.","PeriodicalId":48690,"journal":{"name":"Tsinghua Science and Technology","volume":"30 6","pages":"2638-2652"},"PeriodicalIF":6.6,"publicationDate":"2025-07-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11072115","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144557924","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Novel Classification Scheme for Early Alzheimer's Disease (AD) Severity Diagnosis Using Deep Features of the Hybrid Cascade Attention Architecture: Early Detection of AD on MRI Scans
IF 6.6, Zone 1, Computer Science
Tsinghua Science and Technology Pub Date : 2025-07-04 DOI: 10.26599/TST.2024.9010080
Mohamadreza Khosravi;Hossein Parsaei;Khosro Rezaee
{"title":"Novel Classification Scheme for Early Alzheimer's Disease (AD) Severity Diagnosis Using Deep Features of the Hybrid Cascade Attention Architecture: Early Detection of AD on MRI Scans","authors":"Mohamadreza Khosravi;Hossein Parsaei;Khosro Rezaee","doi":"10.26599/TST.2024.9010080","DOIUrl":"https://doi.org/10.26599/TST.2024.9010080","url":null,"abstract":"In neuropathological diseases such as Alzheimer's Disease (AD), neuroimaging and Magnetic Resonance Imaging (MRI) play crucial roles in the realm of Artificial Intelligence of Medical Things (AIoMT) by leveraging edge intelligence resources. However, accurately classifying MRI scans based on neurodegenerative diseases faces challenges due to significant variability across classes and limited intra-class differences. To address this challenge, we propose a novel approach aimed at improving the early detection of AD through MRI imaging. This method integrates a Convolutional Neural Network (CNN) with a Cascade Attention Model (CAM-CNN). The CAM-CNN model outperforms traditional CNNs in AD classification accuracy and processing complexity. In this architecture, the attention mechanism is effectively implemented by utilizing two constraint cost functions and a cross-network with diverse pre-trained parameters for a two-stream architecture. Additionally, two new cost functions, Satisfied Rank Loss (SRL) and Cross-Network Similarity Loss (CNSL), are introduced to enhance collaboration and overall network performance. Finally, a unique entropy addition method is employed in the attention module for network integration, converting intermediate outcomes into the final prediction. These components are designed to work collaboratively and can be sequentially trained for optimal performance, thereby enhancing the effectiveness of AD stage classification and robustness to interference from MR images. 
Validation using the Kaggle dataset demonstrates the model's accuracy of 99.07% in multiclass classification, ensuring precise classification and early detection of all AD subtypes. Further validation across three feature categories with varying numbers confirms the robustness of the proposed approach, with deviations from the standard criteria of less than 1%. Applied in Alzheimer's patient care, this capability holds promise for enhancing value-based therapy and clinical decision-making. It aids in differentiating Alzheimer's patients from healthy individuals, thereby improving patient care and enabling more targeted therapies.","PeriodicalId":48690,"journal":{"name":"Tsinghua Science and Technology","volume":"30 6","pages":"2572-2591"},"PeriodicalIF":6.6,"publicationDate":"2025-07-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11072114","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144557913","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Improved Aerial Video Compression for UAV System Based on Historical Background Redundancy
IF 6.6, Zone 1, Computer Science
Tsinghua Science and Technology Pub Date : 2025-07-04 DOI: 10.26599/TST.2024.9010110
Chuanao Jiang;Jin Xu;Liuguo Yin
{"title":"Improved Aerial Video Compression for UAV System Based on Historical Background Redundancy","authors":"Chuanao Jiang;Jin Xu;Liuguo Yin","doi":"10.26599/TST.2024.9010110","DOIUrl":"https://doi.org/10.26599/TST.2024.9010110","url":null,"abstract":"In an increasing number of area inspection applications, such as powerline inspection and sewage disposal monitoring, Unmanned Aerial Vehicles (UAVs) are used for capturing and transmitting on-site videos. Existing UAV video compressions employ Advanced Video Coding (AVC) or High Efficiency Video Coding (HEVC) encoders to eliminate intra-frame and short-term inter-frame redundancy, while these methods still face challenges in achieving high compression efficiency due to the high captured video bitrate and limited transmission capacity. In this paper, we further consider that UAVs revisit the same area and capture videos from different viewpoints, hence the Long-term Historical Background Redundancy (LHBR) exists among revisited video clips. Thus, we leverage the LHBR caused by UAV revisits, and propose a high-efficiency aerial video compression for UAVs. Our method comprises three steps: Firstly, we propose a lightweight method based on a spatial correlation model to select the most correlated reference frames from historical video database. Then, we design a Historical Reference Background Frame (HBRF) generation algorithm by alternately using the keypoint-based and telemetry-assisted alignments to align the selected frames with current frame. Finally, we use the generated HBRF as a reference frame to eliminate the LHBR within I-frames. 
Our proposed method has been experimentally proven to reduce Bjøntegaard-Delta bitrate (BD-bitrate) by 42.83% or enhance Bjøntegaard-Delta Peak Signal-to-Noise Ratio (BD-PSNR) by 2.98 dB over original HEVC, and take 29.3% of the encoding time needed for existing LHBR based compressions.","PeriodicalId":48690,"journal":{"name":"Tsinghua Science and Technology","volume":"30 6","pages":"2366-2383"},"PeriodicalIF":6.6,"publicationDate":"2025-07-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11072118","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144557917","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Downlink Outage Probability and Channel Capacity for Cell-Free Massive MIMO Systems
IF 6.6, Zone 1, Computer Science
Tsinghua Science and Technology Pub Date : 2025-07-04 DOI: 10.26599/TST.2024.9010148
Danilo B. T. Almeida;Marcelo S. Alencar;Rafael M. Duarte;Francisco Madeiro;Waslon T. A. Lopes;Hugerles S. Silva;Ugo S. Dias;Wamberto J. L. Queiroz
{"title":"Downlink Outage Probability and Channel Capacity for Cell-Free Massive MIMO Systems","authors":"Danilo B. T. Almeida;Marcelo S. Alencar;Rafael M. Duarte;Francisco Madeiro;Waslon T. A. Lopes;Hugerles S. Silva;Ugo S. Dias;Wamberto J. L. Queiroz","doi":"10.26599/TST.2024.9010148","DOIUrl":"https://doi.org/10.26599/TST.2024.9010148","url":null,"abstract":"In Cell-Free (CF) systems, the users are served simultaneously by a large number of low-cost and low-power distributed antennas, taking advantage of spatial diversity. The scarcity of equations that accurately describe the system performance limits optimization techniques to applications of users Quality of Service (QoS) uniformization. Thus, to accurately characterize the performance of such systems, a simplified model for the downlink received signal is proposed and new expressions are derived for the users Outage Probability (OP) and average channel capacity taking into account the channel gain variations characteristics. Different cell-free scenarios are analyzed and several curves are presented for different parameters that characterize the channels. 
The new theoretical results are corroborated by Monte-Carlo simulations and compared to literature results, which confirm classical cell-free behavior as well as the saturation on channel capacity and OP curves, and reveal that the proposed expressions describe the systems more accurately.","PeriodicalId":48690,"journal":{"name":"Tsinghua Science and Technology","volume":"30 6","pages":"2557-2571"},"PeriodicalIF":6.6,"publicationDate":"2025-07-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11072069","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144557925","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
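The abstract above reports that the derived outage-probability expressions were checked against Monte-Carlo simulations. The sketch below shows what such a check can look like in Python for a deliberately simplified setting: independent Rayleigh fading (unit-mean exponential power gains), equal per-AP SNR, and power gains that simply add up at the user. These modelling choices, the function names, and the parameter values are assumptions for illustration, not the paper's system model or its expressions.

```python
import math
import random

def outage_mc(num_aps, snr_per_ap, rate, trials=20000, seed=0):
    """Estimate P(log2(1 + SNR) < rate) by simulation (toy model)."""
    rng = random.Random(seed)
    threshold = 2.0 ** rate - 1.0  # SNR needed to support the target rate
    outages = 0
    for _ in range(trials):
        # Assumed: each access point contributes an independent Exp(1) power gain.
        gain = sum(rng.expovariate(1.0) for _ in range(num_aps))
        if snr_per_ap * gain < threshold:
            outages += 1
    return outages / trials

def outage_single_ap(snr_per_ap, rate):
    """Closed form for one Rayleigh-faded link: 1 - exp(-threshold / SNR)."""
    return 1.0 - math.exp(-(2.0 ** rate - 1.0) / snr_per_ap)
```

Comparing outage_mc(1, 1.0, 1.0) with outage_single_ap(1.0, 1.0) is the single-link analogue of corroborating a closed-form expression by simulation. Raising num_aps in this toy model lowers the estimated outage, which mirrors the spatial-diversity benefit the abstract mentions.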
LLM4DEU: Fine Tuning Large Language Model for Medical Diagnosis in Outpatient and Emergency Department Visits of Neurosurgery
IF 6.6, Zone 1, Computer Science
Tsinghua Science and Technology Pub Date : 2025-07-04 DOI: 10.26599/TST.2024.9010125
Boran Wang;Yiming Liu;Haoyu Tian;Rui Hua;Kai Chang;Jianan Xia;Xinyu Dai;Zhuliang Gao;Sitong Liu;Rui Wang;Xuezhong Zhou;Wei Wei
{"title":"LLM4DEU: Fine Tuning Large Language Model for Medical Diagnosis in Outpatient and Emergency Department Visits of Neurosurgery","authors":"Boran Wang;Yiming Liu;Haoyu Tian;Rui Hua;Kai Chang;Jianan Xia;Xinyu Dai;Zhuliang Gao;Sitong Liu;Rui Wang;Xuezhong Zhou;Wei Wei","doi":"10.26599/TST.2024.9010125","DOIUrl":"https://doi.org/10.26599/TST.2024.9010125","url":null,"abstract":"Clinical diagnosis for complex disease conditions is a complicated decision process involving systematic inference and differentiation. Artificial Intelligence (AI) models have been a widely established approach to help improve the efficiency of various kinds of clinical decision tasks (e.g., diagnosis, treatment, and prognosis). However, due to the critical requirement of time efficiency, lack of sufficient information, and high probability of comorbid diseases in Outpatient and Emergency Settings (OES), it is still challenging to build clinically feasible AI models using the free text clinical records in OES for complex disease conditions, such as neurosurgery. Here we propose an AI diagnosis model, named LLM4DEU, for neurosurgery disease differentiations by fine-tuning a large language model (i.e., ChatGLM) using the Department of Neurosurgery, the Beijing Tiantan Hospital OES electronic health records. LLM4DEU obtained state-of-the-art performance on clinical diagnosis with a F1 score of 78.53%, which is superior to five well-known baselines (including deep learning models). In addition, we evaluated the actual performance of the model by case studies on the diagnosis of specific neurosurgical diseases (e.g., subdural hematoma, cerebral hemorrhage, and cerebral infarction). 
The experimental results show that the LLM4DEU model has significant advantages in diagnosing low-incidence disease conditions, and comparative analyses with clinical experts confirm the predictive power of the model in neurosurgical diagnosis.","PeriodicalId":48690,"journal":{"name":"Tsinghua Science and Technology","volume":"30 6","pages":"2487-2504"},"PeriodicalIF":6.6,"publicationDate":"2025-07-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11072112","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144557923","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Optimizing Multimodal Data Queries in Data Lakes
IF 6.6, Zone 1, Computer Science
Tsinghua Science and Technology Pub Date : 2025-07-04 DOI: 10.26599/TST.2025.9010022
Runqun Xiong;Shiyuan Zhao;Ciyuan Chen;Zhuqing Xu
{"title":"Optimizing Multimodal Data Queries in Data Lakes","authors":"Runqun Xiong;Shiyuan Zhao;Ciyuan Chen;Zhuqing Xu","doi":"10.26599/TST.2025.9010022","DOIUrl":"https://doi.org/10.26599/TST.2025.9010022","url":null,"abstract":"This paper addresses the challenge of efficiently querying multimodal related data in data lakes, a large-scale storage and management system that supports heterogeneous data formats, including structured, semi-structured, and unstructured data. Multimodal data queries are crucial because they enable seamless retrieval of related data across modalities, such as tables, images, and text, which has applications in fields like e-commerce, healthcare, and education. However, existing methods primarily focus on single-modality queries, such as joinable or unionable table discovery, and struggle to handle the heterogeneity and lack of metadata in data lakes while balancing accuracy and efficiency. To tackle these challenges, we propose a Multimodal data Query mechanism for Data Lakes (MQDL), which employs a modality-adaptive indexing mechanism and contrastive learning based embeddings to unify representations across modalities. Additionally, we introduce product quantization to optimize candidate verification during queries, reducing computational overhead while maintaining precision. We evaluate MQDL using a table-image dataset across multiple business scenarios, measuring metrics such as precision, recall, and F1-score. Results show that MQDL achieves an accuracy rate of approximately 90%, while demonstrating strong scalability and reduced query response time compared to traditional methods. 
These findings highlight MQDL's potential to enhance multimodal data retrieval in complex data lake environments.","PeriodicalId":48690,"journal":{"name":"Tsinghua Science and Technology","volume":"30 6","pages":"2625-2637"},"PeriodicalIF":6.6,"publicationDate":"2025-07-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11072065","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144557647","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
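The MQDL abstract above mentions product quantization as a way to cut the cost of candidate verification. As a rough illustration of the general technique (not MQDL's implementation), the Python sketch below splits each embedding into m sub-vectors, learns a small k-means codebook per sub-space, stores each vector as m centroid indices, and scores a query against a stored code with an asymmetric distance. All function names, the plain k-means loop, and the parameters m, k, and iters are assumptions; the vector length is assumed to be divisible by m.

```python
import random

def split(v, m):
    """Cut vector v into m equal-length sub-vectors (len(v) divisible by m)."""
    d = len(v) // m
    return [v[i * d:(i + 1) * d] for i in range(m)]

def sqdist(a, b):
    """Squared Euclidean distance between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(sub, centroids):
    """Index of the centroid closest to sub."""
    return min(range(len(centroids)), key=lambda c: sqdist(sub, centroids[c]))

def train_codebooks(data, m, k, iters=10, seed=0):
    """Learn one k-centroid codebook per sub-space with plain k-means."""
    rng = random.Random(seed)
    books = []
    for j in range(m):
        subs = [split(v, m)[j] for v in data]
        centroids = rng.sample(subs, k)
        for _ in range(iters):
            buckets = [[] for _ in range(k)]
            for s in subs:
                buckets[nearest(s, centroids)].append(s)
            for c, bucket in enumerate(buckets):
                if bucket:  # keep the old centroid if its bucket is empty
                    centroids[c] = [sum(col) / len(bucket) for col in zip(*bucket)]
        books.append(centroids)
    return books

def encode(v, books):
    """Compress v to one centroid index per sub-space."""
    return [nearest(s, cb) for s, cb in zip(split(v, len(books)), books)]

def adc(query, code, books):
    """Asymmetric distance: exact query against a quantized database vector."""
    parts = split(query, len(books))
    return sum(sqdist(q, cb[c]) for q, cb, c in zip(parts, books, code))
```

With m = 4 and k = 4, each stored vector shrinks to four small integers, and scoring a candidate costs four table-sized distance lookups instead of a full-dimension comparison. The approximation error is the price paid for that saving, which is why such a step is usually paired with a final exact check on the top candidates.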