Title: Concealed Object Segmentation with Hierarchical Coherence Modeling
Authors: Fengyang Xiao, Pan Zhang, Chunming He, Runze Hu, Yutao Liu
DOI: 10.48550/arXiv.2401.11767 | CAAI International Conference on Artificial Intelligence, 2024-01-22, pp. 16-27
Abstract: Concealed object segmentation (COS) is a challenging task that involves localizing and segmenting objects that are visually blended with their surrounding environments. Despite remarkable progress, existing COS segmenters still struggle to produce complete segmentation results in extremely concealed scenarios. In this paper, we propose a Hierarchical Coherence Modeling (HCM) segmenter for COS that addresses this incomplete-segmentation limitation. Specifically, HCM promotes feature coherence through intra-stage and cross-stage coherence modules, exploring feature correlations at both the single-stage and contextual levels. We also introduce a reversible re-calibration decoder that detects previously missed parts in low-confidence regions, further enhancing segmentation performance. Extensive experiments on three COS tasks, namely camouflaged object detection, polyp image segmentation, and transparent object detection, demonstrate the promising results achieved by the proposed HCM segmenter.
Title: Physical Logic Enhanced Network for Small-Sample Bi-Layer Metallic Tubes Bending Springback Prediction
Authors: Chang Sun, Zili Wang, Shuyou Zhang, Le Wang, Jianrong Tan
DOI: 10.48550/arXiv.2209.09870 | CAAI International Conference on Artificial Intelligence, 2022-09-20
Abstract: Bi-layer metallic tubes (BMTs) play a crucial role in engineering applications. Rotary draw bending (RDB) achieves high-precision bending, but the formed product subsequently springs back. Because of the complex structure of BMTs and the high cost of dataset acquisition, existing methods based on mechanism research and machine learning cannot meet the engineering requirements of springback prediction. Building on a preliminary mechanism analysis, we propose a physical logic enhanced network (PE-NET). The architecture comprises ES-NET, which maps the BMT to an equivalent single-layer tube, and SP-NET, which predicts the final springback from sufficient single-layer tube samples. In the first stage, ES-NET and SP-NET are constructed through theory-driven pre-exploration and data-driven pretraining, respectively. In the second stage, PE-NET is assembled from ES-NET and SP-NET under the physical logic and fine-tuned on the small-sample BMT dataset with a composite loss function. The validity and stability of the proposed method are verified on an FE simulation dataset, small-sample BMT springback angle prediction is achieved, and the method's potential for interpretability and engineering applications is demonstrated.
Title: Scene Text Recognition with Single-Point Decoding Network
Authors: Lei Chen, Haibo Qin, Shi-Xue Zhang, Chun Yang, Xucheng Yin
DOI: 10.48550/arXiv.2209.01914 | CAAI International Conference on Artificial Intelligence, 2022-09-05
Abstract: In recent years, attention-based scene text recognition methods have become very popular and attracted the interest of many researchers. Attention-based methods can adaptively focus on a small area or even a single point during decoding, so the attention matrix is nearly a one-hot distribution. Nevertheless, the whole feature map is weighted and summed by every attention matrix during inference, causing substantial redundant computation. In this paper, we propose an efficient attention-free Single-Point Decoding Network (dubbed SPDN) for scene text recognition, which can replace the traditional attention-based decoding network. Specifically, we propose a Single-Point Sampling Module (SPSM) that efficiently samples one key point on the feature map for decoding each character. In this way, our method not only precisely locates the key point of each character but also removes redundant computation. Based on SPSM, we design an efficient and novel single-point decoding network to replace the attention-based decoding network. Extensive experiments on publicly available benchmarks verify that SPDN greatly improves decoding efficiency without sacrificing performance.
{"title":"Cross-Camera Deep Colorization","authors":"Yaping Zhao, Haitian Zheng, Mengqi Ji, Ruqi Huang","doi":"10.48550/arXiv.2209.01211","DOIUrl":"https://doi.org/10.48550/arXiv.2209.01211","url":null,"abstract":". In this paper, we consider the color-plus-mono dual-camera system and propose an end-to-end convolutional neural network to align and fuse images from it in an efficient and cost-effective way. Our method takes cross-domain and cross-scale images as input, and consequently synthesizes HR colorization results to facilitate the trade-off between spatial-temporal resolution and color depth in the single-camera imaging system. In contrast to the previous colorization methods, ours can adapt to color and monochrome cameras with distinctive spatial-temporal resolutions, rendering the flexibility and robustness in practical applications. The key ingredient of our method is a cross-camera alignment module that generates multi-scale correspondences for cross-domain image alignment. Through extensive experiments on various datasets and multiple settings, we val-idate the flexibility and effectiveness of our approach. Remarkably, our method consistently achieves substantial improvements, i.e. , around 10dB PSNR gain, upon the state-of-the-art methods. Code is at: github.com/IndigoPurple/CCDC.","PeriodicalId":155654,"journal":{"name":"CAAI International Conference on Artificial Intelligence","volume":"136 9","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-08-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131551044","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Title: Chinese Word Sense Embedding with SememeWSD and Synonym Set
Authors: Yangxi Zhou, Junping Du, Zhe Xue, Ang Li, Zeli Guan
DOI: 10.48550/arXiv.2206.14388 | CAAI International Conference on Artificial Intelligence, 2022-06-29
Abstract: Word embedding is a fundamental natural language processing task that learns feature representations of words. However, most word embedding methods assign only one vector to a word, even though polysemous words have multiple senses. To address this limitation, we propose the SememeWSD Synonym (SWSDS) model, which assigns a different vector to every sense of a polysemous word with the help of word sense disambiguation (WSD) and the synonym set in OpenHowNet. We use the SememeWSD model, an unsupervised word sense disambiguation model based on OpenHowNet, to perform word sense disambiguation and annotate the polysemous word with a sense id. We then obtain the top 10 synonyms of that word sense from OpenHowNet and calculate the average vector of the synonyms as the vector of the word sense. In experiments, we evaluate the SWSDS model on semantic similarity calculation with Gensim's wmdistance method and observe improved accuracy. We also examine the SememeWSD model with different BERT models to find the more effective one.
Title: PHN: Parallel heterogeneous network with soft gating for CTR prediction
Authors: Ri-Qi Su, Alphonse Houssou Hounye, Cong Cao, Muzhou Hou
DOI: 10.48550/arXiv.2206.09184 | CAAI International Conference on Artificial Intelligence, 2022-06-18
Abstract: Click-through rate (CTR) prediction is a basic task in recommendation systems. Most previous CTR models build on the Wide & Deep structure and have gradually evolved into parallel structures with different modules. However, simply accumulating parallel structures leads to higher structural complexity and longer training time. Moreover, with a Sigmoid activation at the output layer, the linearly summed activations of the parallel structures tend to push samples into the weak-gradient interval during training, producing a weak-gradient phenomenon and reducing training effectiveness. To this end, this paper proposes a Parallel Heterogeneous Network (PHN) model, which constructs a parallel-structure network from three different interaction analysis methods and uses Soft Selection Gating (SSG) to fuse heterogeneous features from the different structures. Finally, residual links with trainable parameters are used in the network to mitigate the weak-gradient phenomenon. We demonstrate the effectiveness of PHN through extensive comparative experiments and visualize the model's behavior during training as well as its structure.
{"title":"A Transformer-based Network for Deformable Medical Image Registration","authors":"Yibo Wang, W. Qian, Xuming Zhang","doi":"10.1007/978-3-031-20497-5_41","DOIUrl":"https://doi.org/10.1007/978-3-031-20497-5_41","url":null,"abstract":"","PeriodicalId":155654,"journal":{"name":"CAAI International Conference on Artificial Intelligence","volume":"23 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-02-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126024431","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}