{"title":"GoT-CQA: Graph-of-Thought guided compositional reasoning for chart question answering","authors":"Lingling Zhang , Muye Huang , Qianying Wang , Yaxian Wang , Wenjun Wu , Ziqi He , Jun Liu","doi":"10.1016/j.cviu.2026.104745","DOIUrl":"10.1016/j.cviu.2026.104745","url":null,"abstract":"<div><div>Chart Question Answering (CQA) aims at answering questions based on the visual chart content, which plays an important role in chart summarization, business data analysis, and data report generation. CQA is a challenging multi-modal task because of the strong context dependence and complex reasoning requirement. The former refers to answering this question strictly based on the analysis of the visual content or internal data of the given chart, while the latter emphasizes the various logical and numerical reasoning involved in answer prediction process. In this paper, we pay more attention on the complex reasoning in CQA task, and propose a novel Graph-of-Thought (GoT) guided compositional reasoning model called GoT-CQA to overcome this problem. At first, we transform the chart-oriented question into a directed acyclic GoT composed of multiple operator nodes, including localization, numerical and logical operator. It reflects the human brain’s solution process to this question intuitively. After that, we design an efficient auto-compositional reasoning framework guided by the GoT, to execute the multi-step reasoning operations in various types of questions. Comprehensive experiments on ChartQA and PlotQA-D datasets show that GoT-CQA achieves outstanding performance, especially in complex human-written and reasoning questions, comparing with the latest popular baselines.</div></div>","PeriodicalId":50633,"journal":{"name":"Computer Vision and Image Understanding","volume":"268 ","pages":"Article 104745"},"PeriodicalIF":3.5,"publicationDate":"2026-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147803114","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"SAFIR: Spatially-aware Activation Function for Implicit Neural Representations","authors":"Ehsan Zeraatkar, Jelena Tešić","doi":"10.1016/j.cviu.2026.104754","DOIUrl":"10.1016/j.cviu.2026.104754","url":null,"abstract":"<div><div>Neural Implicit Representations (INRs) have reshaped a range of vision tasks by modeling signals as continuous functions, yet existing architectures, such as SIREN, suffer from spectral bias due to their reliance on spatially uniform frequency parameters. This work introduces SAFIR, a novel spatially aware adaptive frequency modulation framework that learns coordinate-dependent frequency parameters via a convolutional omega-prediction network. SAFIR enables adaptive allocation of high- and low-frequency components across spatial regions, thereby addressing the persistent problem of over-smoothing or overfitting in conventional INRs. Extensive evaluations on five benchmark datasets demonstrate that SAFIR not only offers substantial improvements in PSNR and SSIM for 2D image reconstruction, super-resolution, and generative vision tasks, but also achieves superior parameter efficiency and faster convergence than state-of-the-art alternatives. By effectively bridging local signal complexity with adaptive neural modulation, SAFIR represents a practical step toward high-fidelity, efficient signal representation in modern computer vision applications.</div></div>","PeriodicalId":50633,"journal":{"name":"Computer Vision and Image Understanding","volume":"268 ","pages":"Article 104754"},"PeriodicalIF":3.5,"publicationDate":"2026-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147803106","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"EARS4SEE: A multimodal audio description system dedicated to blind and visually impaired users","authors":"Ruxandra Tapu , Bogdan Mocanu","doi":"10.1016/j.cviu.2026.104772","DOIUrl":"10.1016/j.cviu.2026.104772","url":null,"abstract":"<div><div>In recent years, automatic audio description (AD) generation has become an important research domain within accessibility and assistive technology, driven by its potential to enhance content understanding, social integration, and cognitive engagement for individuals with visual impairments (VI). In this paper, we introduce EARS4SEE, a novel multimodal framework for AD generation that integrates semantic video analysis, character tracking, and adaptive temporal segmentation to enhance contextual coherence and narrative fluency. The proposed system integrates multi-stream fusion strategy, leveraging visual, textual, and audio modalities for character-centric, semantically enriched AD. Textual descriptions are synthesized into natural-sounding speech using state-of-the-art text-to-speech (TTS) techniques for an immersive experience. A core contribution of the proposed methodology involves the tracking-based character recognition module, which ensures temporally consistent character identification using an adaptive temporal attention mechanism. The approach mitigates inconsistencies from motion blur, occlusions, and scale variations, improving referential continuity. Additionally, EARS4SEE introduces an automated multimodal video segmentation pipeline, capturing long-range temporal dependencies to improve scene boundary detection and contextual alignment. The experimental evaluation carried out on the MAD-Eval-Named and TV-AD datasets validates the effectiveness of the proposed methodology, which leads to average CIDEr and LLM-AD-eval scores of 24.1 and 3.02, respectively. In addition, when compared to state-of-the-art techniques, the proposed architecture shows superior performances in terms of the CIDEr, with gains in accuracy ranging in the [1.72%, 10.2%] interval and an 8% increase in LLM-AD-eval scores.</div></div>","PeriodicalId":50633,"journal":{"name":"Computer Vision and Image Understanding","volume":"268 ","pages":"Article 104772"},"PeriodicalIF":3.5,"publicationDate":"2026-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147803109","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Self-supervised learning for object detection in challenging settings: A survey","authors":"Alina Ciocarlan , Sidonie Lefebvre , Sylvie Le Hégarat-Mascle , Arnaud Woiselle","doi":"10.1016/j.cviu.2026.104783","DOIUrl":"10.1016/j.cviu.2026.104783","url":null,"abstract":"<div><div>Self-supervised learning (SSL) has shown great promise in computer vision, enabling networks to learn meaningful representations from large unlabeled datasets. SSL methods fall into two main categories: instance discrimination and image modeling. While instance discrimination is fundamental to SSL, it was originally designed for classification and may be less effective for downstream tasks that require fine-grained or spatially localized representations. In this focused survey, we study SSL for object detection under challenging practical conditions, with particular emphasis on small object detection, domain shift and few-shot learning.</div><div>Building upon previous surveys, we not only provide a detailed comparison of SSL strategies, but also assess their effectiveness for object detection using both CNN and ViT-based architectures. Our benchmark is performed fairly by fine-tuning a Faster R-CNN initialized with several exemplary SSL methods ourselves, including object-level Instance Discrimination and Masked Image Modeling methods, on the widely used COCO dataset, as well as on a domain-specific dataset focused on vehicle detection in infrared remote sensing imagery. We also evaluate the impact of pre-training on custom domain-specific datasets, highlighting how some SSL strategies are better suited for handling uncurated data. Furthermore, we assess the methods in few-shot settings and inference on noisy input, revealing important behavioral differences depending on the type of encoder used. Our findings highlight that combining approaches with complementary local and global biases improves performance across the evaluated object detection settings. Overall, this survey provides a practical guide for selecting optimal SSL strategies in different scenarios.</div></div>","PeriodicalId":50633,"journal":{"name":"Computer Vision and Image Understanding","volume":"268 ","pages":"Article 104783"},"PeriodicalIF":3.5,"publicationDate":"2026-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147803111","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"RAIN: An embarrassingly simple approach to debiasing attribution evaluation","authors":"Jiarui Duan , Jie Song , Haofei Zhang , Mengqi Xue , Huiqiong Wang , Mingli Song","doi":"10.1016/j.cviu.2026.104773","DOIUrl":"10.1016/j.cviu.2026.104773","url":null,"abstract":"<div><div>Model attribution, which assigns each input feature a value to indicate its contribution to the final predictions, has become the most widely studied paradigm towards explainable artificial intelligence. Evaluation of the existing attribution methods, however, still remains a challenge due to the unavailable ground truth. Prior human-agnostic evaluation approaches typically adopt feature ablation to test how the performance varies by dropping the features of different attribution values. However, these methods suffer from either the missingness bias incurred by the inconsistent data distribution, or the expensive computational cost caused by retraining a large number of models. In this work, we propose an embarrassingly simple yet surprisingly effective evaluation scheme, termed as <strong>Random-Ablation training for attrIbution-ablatioN test</strong> (RAIN), to balance the missingness bias and computational cost. The core idea underlying RAIN is approximating the data distribution after attribution-based feature ablation with a randomly-ablated data distribution. With the proposed RAIN, we systematically evaluate the existing attribution methods with extensive experiments. Results demonstrate that the proposed method not only significantly alleviates missingness bias issues, but also yields more consistent and reasonable attribution rankings, yet with much cheaper computational cost. Code and model will be made publicly available soon.</div></div>","PeriodicalId":50633,"journal":{"name":"Computer Vision and Image Understanding","volume":"268 ","pages":"Article 104773"},"PeriodicalIF":3.5,"publicationDate":"2026-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147803112","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"FDPAdapter : Adapting segment anything in challenging vision tasks via frequency-domain priors","authors":"Yufei Gao , Bin Fu , Lei Shi , Haibo Pang , Yucheng Shi , Yameng Zhang","doi":"10.1016/j.cviu.2026.104784","DOIUrl":"10.1016/j.cviu.2026.104784","url":null,"abstract":"<div><div>The Segment Anything Model (SAM) excels as a large-scale visual foundation model for natural image segmentation. However, due to significant differences between its training data domain and specific scenarios like medical imaging and camouflaged objects, its generalization capability is limited in challenging tasks involving fine-grained boundaries, low-contrast targets, and highly textured environments. Existing Parameter-Efficient Fine-Tuning approaches based on adapters can improve downstream performance to some extent. However, most solutions remain at the level of treating frequency-domain information as auxiliary input or shallow overlay. They fail to address ViT’s inherent weakness in perceiving local high-frequency textures at the architectural level, resulting in insufficient fusion depth or inadequate task-specific feature guidance. To address this, we propose FDPAdapter, a novel lightweight adaptation framework. Through the carefully designed Frequency-Domain Prior Extractor (FDPE) and Dual-CrossInvolution Fusion Module (DCFM), it achieves a transition from static frequency-domain injection to dynamic frequency spatial modulation. FDPE actively decouples and reconstructs task-relevant local high-frequency textures and edge priors from raw images, effectively compensating for ViT’s inherent low-pass filtering characteristics. DCFM establishes a bidirectional closed-loop interaction mechanism, transforming high-frequency priors from passive overlay to active guidance and iterative refinement of the global ViT feature stream. This approach significantly enhances the model’s segmentation capabilities in challenging scenarios with lightweight parameter overhead. The experimental results show that FDPAdapter is superior to the existing methods in two challenging fields: medical image segmentation and camouflaged object detection. Specifically, compared with SAM-Adapter on the Kvasir medical image dataset, mIoU increased by 2.1% and Dice increased by 1.2%. Compared with SAM-Adapter on the three camouflaged object detection datasets, <span><math><msub><mrow><mi>S</mi></mrow><mrow><mi>α</mi></mrow></msub></math></span> increased by 4% - 7.2%, and <span><math><msub><mrow><mi>E</mi></mrow><mrow><mi>ϕ</mi></mrow></msub></math></span> increased by 4.4%–7.1%.</div></div>","PeriodicalId":50633,"journal":{"name":"Computer Vision and Image Understanding","volume":"268 ","pages":"Article 104784"},"PeriodicalIF":3.5,"publicationDate":"2026-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147803110","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"JEMA: Joint Embedding of Multimodal and multi-view Alignment in human-centric embedding space for manufacturing","authors":"João Sousa , Roya Darabi , Armando Sousa , Frank Brueckner , Luís Paulo Reis , Ana Reis","doi":"10.1016/j.cviu.2026.104771","DOIUrl":"10.1016/j.cviu.2026.104771","url":null,"abstract":"<div><div>This work introduces JEMA (Joint Embedding with Multimodal and multi-view Alignment), a novel co-learning framework and loss function to combine multiple sensors and process parameters in Directed Energy Deposition (DED), a critical process in metal additive manufacturing. As Industry 5.0 advances in industrial applications, effective process monitoring becomes increasingly essential. However, the limited availability of data and the black-box nature of AI solutions present significant implementation challenges in industrial settings. JEMA addresses these limitations by leveraging multimodal data, including multi-view images and process parameters, to learn transferable semantic representations. By implementing a supervised regression contrastive loss function, JEMA shapes the embedding space to enable interpretable inference. Furthermore, the framework allows for simplified hardware requirements and reduced computational overhead during deployment by utilizing only the primary on-axis sensor. We evaluate the effectiveness of JEMA loss in DED process monitoring, with particular focus on its generalization capabilities for downstream tasks such as melt pool geometry prediction without extensive fine-tuning. Our empirical results demonstrate the effectiveness of JEMA, showing improvements of 29% and 20% in multimodal and unimodal settings, respectively, compared to models without any regularization loss. Additionally, JEMA outperforms supervised contrastive learning methods by 8% and 2% in the same settings. These improvements are also accompanied by a more structured and meaningful representation in the embedding space. Importantly, the learned embedding representation provides direct interpretability of the feature space, which can be utilized by both human operators and automated systems for process optimization, control, and anomaly detection based on defined thresholds. This human-centered approach ensures that operators can actively engage with the system, making informed decisions and enhancing their trust in the process. Our framework establishes a foundation for integrating multisensor data with metadata, enabling diverse downstream applications both within manufacturing processes and beyond, while keeping human expertise central to the loop.</div></div>","PeriodicalId":50633,"journal":{"name":"Computer Vision and Image Understanding","volume":"268 ","pages":"Article 104771"},"PeriodicalIF":3.5,"publicationDate":"2026-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147803108","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Enhancing feature representation in Siamese networks for object tracking with ranking-based loss","authors":"Sachin Sakthi K.S. , Young Hoon Joo , Jae Hoon Jeong","doi":"10.1016/j.cviu.2026.104785","DOIUrl":"10.1016/j.cviu.2026.104785","url":null,"abstract":"<div><div>Siamese networks have gained significant traction in single-object tracking due to their efficient feature comparison and matching capabilities. However, existing approaches often struggle to capture robust and discriminative representations under complex conditions such as occlusion, scale variation, and background clutter, leading to a trade-off between classification accuracy and localization precision. To overcome these limitations, this paper introduces SiamMFFRL, a novel Siamese-based tracking framework that integrates multi-feature fusion and ranking-based loss learning to achieve balanced and robust target estimation. The proposed multi-feature fusion module (MFF) aggregates hierarchical features from multiple levels of the backbone network, enhancing the discriminative capability of the model and improving its robustness against appearance variations and background distractions. We design a ranking-based loss that jointly optimizes classification and regression branches by enforcing a ranking margin between foreground and background responses, thereby harmonizing classification confidence with localization precision. Furthermore, a target-aware branch (TAB) is incorporated to refine bounding box predictions and mitigate inconsistencies between classification and localization outputs. By integrating these complementary components into a unified Siamese framework, SiamMFFRL achieves a robust balance between accuracy and stability in complex tracking environments. Extensive experiments conducted on major tracking benchmarks, including LaSOT, GOT-10K, VOT2018, VOT2019, OTB100, and UAV123, demonstrate that SiamMFFRL consistently outperforms state-of-the-art trackers in both precision and success rates, confirming its effectiveness and generalization capability.</div></div>","PeriodicalId":50633,"journal":{"name":"Computer Vision and Image Understanding","volume":"268 ","pages":"Article 104785"},"PeriodicalIF":3.5,"publicationDate":"2026-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147803113","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Human-in-the-loop RGB semantic segmentation for intuitive teaching of novel grasping tasks via few-shot adaptation of lightweight neural networks","authors":"Furkan Kaynar, Sudarshan Rajagopalan, Mahmoud Krichene, Eckehard Steinbach","doi":"10.1016/j.cviu.2025.104620","DOIUrl":"10.1016/j.cviu.2025.104620","url":null,"abstract":"<div><div>Assistive robots are starting to be a part of everyday human life, yet the cognitive abilities of the robots still need to be improved to successfully operate in human environments. Vision-based grasping systems require determining the correct semantic region in the image that is suitable for task-oriented grasping of target objects. The complexity of unstructured environments like households and healthcare facilities necessitates the ability to learn new segmentation tasks from human guidance and to dynamically extend the perceptual skills on a daily basis to improve autonomy. We propose to use a browser-based deep interactive RGB segmentation interface for receiving the human guidance intuitively, allowing non-experts to guide the robots from remote. We further leverage meta learning to quickly adapt lightweight neural networks for custom semantic segmentation tasks with a few human demonstrations, which facilitates a quick, cost-efficient onboard learning on the assistive robots. For training and evaluation of the few-shot learning module with grasp area affordances, we extended the GraspNet-1Billion benchmark with affordance segmentation labels and created a new dataset with grasp affordance masks corresponding to 82,944 cluttered scene images of GraspNet-1Billion. Comparative experiments on the datasets indicate the effectiveness of our few-shot learning modules by reaching high segmentation accuracies at a low computational cost. In addition, the robotic experiments with a 7-DOF manipulator show that the proposed method outperforms the baseline method by a large margin (22 to 25%) in task-oriented grasp precision and success rates. Lastly, the conducted user studies demonstrate the ease of use of the proposed demonstration interface. Consequently, our end-to-end pipeline facilitates real-world deployment of assistive robots in human environments by intuitively teaching new semantic segmentation tasks on a daily basis. The generated grasp affordance dataset can be found at <span><span>https://doi.org/10.14459/2026mp1839309</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":50633,"journal":{"name":"Computer Vision and Image Understanding","volume":"267 ","pages":"Article 104620"},"PeriodicalIF":3.5,"publicationDate":"2026-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147613116","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Escaping the big data paradigm in self-supervised representation learning","authors":"Carlos Vélez-García , Miguel Cazorla , Jorge Pomares","doi":"10.1016/j.cviu.2026.104698","DOIUrl":"10.1016/j.cviu.2026.104698","url":null,"abstract":"<div><div>The reliance on large-scale datasets and extensive computational resources has become a significant barrier to advancing representation learning from images, particularly in domains where data is scarce or expensive to obtain. In this paper, we address the critical question: <em>Can we escape the big data paradigm in self-supervised representation learning from images?</em> We introduce <strong>SCOTT</strong> (<strong>S</strong>parse <strong>Co</strong>nvolutional <strong>T</strong>okenizer for <strong>T</strong>ransformers), a shallow tokenization architecture that is compatible with Masked Image Modeling (MIM) tasks. SCOTT injects convolutional inductive biases into Vision Transformers (ViTs), enhancing their efficacy in small-scale data regimens. Alongside, we propose to train on a Joint-Embedding Predictive Architecture within a MIM framework (<strong>MIM-JEPA</strong>), operating in latent representation space to capture more semantic features. Our approach enables ViTs to be trained from scratch on datasets orders of magnitude smaller than traditionally required — without relying on massive external datasets for pretraining. We validate our method on three small-size, standard-resolution, fine-grained datasets: Oxford Flowers-102, Oxford IIIT Pets-37, and ImageNet-100. Despite the challenges of limited data and high intra-class similarity of these datasets, our frozen SCOTT models pretrained with MIM-JEPA significantly outperform fully supervised methods and achieve competitive results with state-of-the-art approaches that rely on large-scale pretraining, complex image augmentations and bigger model sizes. By demonstrating that robust off-the-shelf representations can be learned with limited data, compute, and model sizes, our work paves the way for computer applications in resource constrained environments such as medical imaging or robotics.</div></div>","PeriodicalId":50633,"journal":{"name":"Computer Vision and Image Understanding","volume":"266 ","pages":"Article 104698"},"PeriodicalIF":3.5,"publicationDate":"2026-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147426831","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}