2021 International Conference on Multimedia Analysis and Pattern Recognition (MAPR) — Latest Publications

A robust framework for mathematical formula detection
2021 International Conference on Multimedia Analysis and Pattern Recognition (MAPR) Pub Date : 2021-10-01 DOI: 10.1109/MAPR53640.2021.9585197
M. Tran, Tri Pham, Tien Nguyen, Tien Do, T. Ngo
Mathematical formula identification is a crucial step in the pipeline of many tasks, such as mathematical information retrieval and the storage of digital scientific documents. All of these tasks require detecting the bounding boxes of mathematical expressions as a prerequisite. Deep learning-based object detection methods currently work well for mathematical formula detection (MFD). These methods fall into two categories: those that learn anchors from the data and those that use fixed, predefined anchors. Anchor-learning methods are effective when labeled data is plentiful but degrade with small label sets, whereas fixed-anchor methods cope better with small quantities. We therefore propose an algorithm that keeps the good predictions of each type and merges both into a final result. To test this hypothesis, we select two typical object detectors, YOLOv5 and Faster RCNN, as representatives of the two approaches and build an MFD framework around them. Our experiments on ICDAR2021-MFD show that the whole system achieves an F1 score of 89.3, while the single detectors reach only 74.2 (Faster RCNN) and 88.9 (YOLOv5), demonstrating the effectiveness of the proposal.
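The merging step described in the abstract could, for example, take the form of a cross-model non-maximum suppression over the union of both detectors' boxes. A minimal sketch under that assumption (the box format, score convention, and function names are illustrative, not the authors' implementation):

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union else 0.0

def merge_detections(dets_a, dets_b, iou_thr=0.5):
    """Greedy score-ordered NMS over the pooled boxes of two detectors.
    Each detection is a ((x1, y1, x2, y2), score) pair."""
    pool = sorted(dets_a + dets_b, key=lambda d: d[1], reverse=True)
    kept = []
    for box, score in pool:
        # keep a box only if it does not overlap a higher-scoring kept box
        if all(iou(box, k[0]) < iou_thr for k in kept):
            kept.append((box, score))
    return kept
```

For example, two nearly identical boxes found by both detectors collapse into the single higher-scoring one, while boxes found by only one detector survive.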
Citations: 0
Boundary delineation of reflux esophagitis lesions from endoscopic images using color and texture
2021 International Conference on Multimedia Analysis and Pattern Recognition (MAPR) Pub Date : 2021-10-01 DOI: 10.1109/MAPR53640.2021.9585290
Danh H. Vu, Long-Thuy Nguyen, Van-Tuan Nguyen, Thanh-Hai Tran, V. Dao, Hai Vu
Automatic assessment of medical images, and endoscopic images in particular, has been an attractive research topic in recent years. Achieving this goal requires many tasks, for example lesion detection, segmentation, and classification. To design suitable models for such tasks, it is preferable to first know: i) which characteristics differentiate a lesion from a normal region; and ii) how wide the boundary between these two regions can be while still allowing them to be distinguished. This paper presents an in-depth study of the role of color and texture features in delineating the boundary between a lesion region and the background. To this end, starting from the ground-truth contour of a manually segmented lesion, we expand two margins in opposite directions: an inner margin inside the lesion region and an outer margin in the background region. We then extract color features in different color spaces (HSV, RGB, Lab) and texture features (LBP, HOG, GLCM) on these two margins. Finally, we apply the Support Vector Machine (SVM) technique to classify the two classes (lesion and non-lesion). Extensive experiments on a dataset of endoscopic images answer the questions above and offer suggestions for designing suitable lesion-detection models in the future.
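Of the texture features mentioned, LBP is simple enough to sketch directly. A pure-Python version for grayscale images follows; the 3x3 neighbourhood and bit ordering here are one common convention, not necessarily the paper's exact setup:

```python
def lbp_code(patch):
    """8-neighbour local binary pattern code for the centre pixel of a 3x3 patch:
    each neighbour >= centre sets one bit."""
    c = patch[1][1]
    neighbours = [patch[0][0], patch[0][1], patch[0][2], patch[1][2],
                  patch[2][2], patch[2][1], patch[2][0], patch[1][0]]
    code = 0
    for i, n in enumerate(neighbours):
        if n >= c:
            code |= 1 << i
    return code

def lbp_histogram(image):
    """Normalized 256-bin LBP histogram over all interior pixels of a
    2-D grayscale image (list of lists of intensities)."""
    h = [0] * 256
    rows, cols = len(image), len(image[0])
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            patch = [row[c - 1:c + 2] for row in image[r - 1:r + 2]]
            h[lbp_code(patch)] += 1
    total = sum(h)
    return [v / total for v in h]
```

Such a histogram, computed separately on the inner and outer margins, gives one texture feature vector per margin for the SVM.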
Citations: 1
Unweighted Bipartite Matching For Robust Vehicle Counting
2021 International Conference on Multimedia Analysis and Pattern Recognition (MAPR) Pub Date : 2021-10-01 DOI: 10.1109/MAPR53640.2021.9585273
Khanh Ho, H. Le, K. Nguyen, Thua Nguyen, Tien Do, T. Ngo, Thanh-Son Nguyen
Intelligent Transportation Systems (ITS) play an essential role in smart cities. Through ITS, local authorities can handle enormous traffic flows with minimal effort and address traffic-related problems such as congestion or traffic-regulation violations. In this work, we designed a system that counts vehicles moving in specific directions on the road. Such automated systems must also cope with diverse weather and instabilities in the captured media, which make current tracking algorithms prone to errors. The problem is even more challenging in Vietnam and other developing countries, where road traffic is far more complex due to the presence of small vehicles such as bicycles and motorbikes, making tracking algorithms more likely to fail. Our proposed track-joining method is built on top of DeepSORT, incorporating Taylor expansion and unweighted bipartite maximum matching to predict missing movements and identify duplicated vehicle tracks, then merge them. In the HCMC AI City Challenge 2020, our whole system outperforms other approaches, achieving the lowest overall RMSE score: an average of 1.39 per video segment on the benchmark dataset.
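Unweighted bipartite maximum matching, as used here for joining track fragments, can be computed with Kuhn's augmenting-path algorithm. A minimal sketch (the adjacency encoding is an assumption: left vertices could be broken track fragments, right vertices candidate continuations):

```python
def max_bipartite_matching(adj, n_right):
    """Kuhn's augmenting-path algorithm for unweighted bipartite maximum
    matching. adj[u] lists the right-side vertices compatible with left
    vertex u. Returns (matching size, right-to-left assignment)."""
    match_right = [-1] * n_right  # which left vertex each right vertex is matched to

    def try_augment(u, seen):
        for v in adj[u]:
            if v in seen:
                continue
            seen.add(v)
            # v is free, or its current partner can be re-routed elsewhere
            if match_right[v] == -1 or try_augment(match_right[v], seen):
                match_right[v] = u
                return True
        return False

    matched = 0
    for u in range(len(adj)):
        if try_augment(u, set()):
            matched += 1
    return matched, match_right
```

The augmenting step is what lets a later fragment displace an earlier tentative pairing, so the final matching is globally maximum rather than greedy.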
Citations: 0
Exploring Zero-shot Cross-lingual Aspect-based Sentiment Analysis using Pre-trained Multilingual Language Models
2021 International Conference on Multimedia Analysis and Pattern Recognition (MAPR) Pub Date : 2021-10-01 DOI: 10.1109/MAPR53640.2021.9585242
Khoa Thi-Kim Phan, D. Hao, D. Thin, N. Nguyen
Aspect-based sentiment analysis (ABSA) has received much attention in the Natural Language Processing research community. Most proposed methods target English and other high-resource languages exclusively. Leveraging resources available in English and transferring them to low-resource languages is an immediate solution. In this paper, we investigate the performance of zero-shot cross-lingual transfer learning based on pre-trained multilingual models (mBERT and XLM-R) for two main sub-tasks of the ABSA problem: Aspect Category Detection and Opinion Target Expression. We experiment on benchmark datasets in six languages: English, Russian, Dutch, Spanish, Turkish, and French. The experimental results demonstrate that the XLM-R model can yield acceptable results in the zero-shot cross-lingual scenario.
Citations: 8
Siamese Attention and Point Adaptive Network for Visual Tracking
2021 International Conference on Multimedia Analysis and Pattern Recognition (MAPR) Pub Date : 2021-10-01 DOI: 10.1109/MAPR53640.2021.9585250
T. Dinh, Long Tran Quoc, Kien Thai Trung
Siamese-based trackers have achieved excellent performance on visual object tracking. Most existing trackers compute the features of the target template and the search image independently and rely on either a multi-scale search scheme or predefined anchor boxes to accurately estimate the scale and aspect ratio of a target. This paper proposes a Siamese attention and point-adaptive head network, referred to as SiamAPN, for visual tracking. The Siamese attention module includes self-attention and cross-attention for feature enhancement and for aggregating rich contextual inter-dependencies between the target template and the search image. The point head network for bounding-box prediction is both proposal-free and anchor-free. The proposed framework is simple and effective. Extensive experiments on visual tracking benchmarks, including OTB100, UAV123, and VOT2018, demonstrate that our tracker achieves state-of-the-art performance and runs at 45 FPS.
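The cross-attention between template and search-image features is standard scaled dot-product attention at its core. A minimal NumPy sketch, single-head and without the learned projections a real module would include:

```python
import numpy as np

def cross_attention(queries, keys_values):
    """Scaled dot-product cross-attention: template queries attend over
    search-image features, used here as both keys and values."""
    d = queries.shape[-1]
    scores = queries @ keys_values.T / np.sqrt(d)   # (n_q, n_kv) similarities
    scores -= scores.max(axis=-1, keepdims=True)    # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over key positions
    return weights @ keys_values                    # (n_q, d) aggregated features
```

Self-attention is the same computation with queries and keys/values drawn from the same feature map.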
Citations: 0
Visual-guided audio source separation: an empirical study
2021 International Conference on Multimedia Analysis and Pattern Recognition (MAPR) Pub Date : 2021-10-01 DOI: 10.1109/MAPR53640.2021.9585244
Thanh Thi Hien Duong, Huu Manh Nguyen, Hai Nghiem Thi, Thi-Lan Le, Phi-Le Nguyen, Q. Nguyen
Real-world video scenes are usually very complicated, being mixtures of many different audio-visual objects. Humans with normal hearing can easily locate, identify, and differentiate sound sources heard simultaneously. For machines, however, this is an extremely difficult task: building machine-listening algorithms that can automatically separate sound sources under difficult mixing conditions remains very challenging. In this paper, we consider a visual-guided audio source separation approach for separating the sounds of different instruments in a video, where detected visual objects assist the sound separation process. In particular, we investigate the use of different object detectors for the task. In addition, as an empirical study, we analyze the effect of the training datasets on separation performance. Finally, experimental results on the benchmark MUSIC dataset confirm the advantages of the new object detector investigated in the paper.
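A common backbone of such separation systems is soft spectrogram masking: each source receives the mixture spectrogram scaled by its estimated share of the energy. A minimal sketch using the generic ratio mask (not necessarily the exact mask formulation used in the paper):

```python
import numpy as np

def ratio_mask_separate(mixture_spec, source_estimates):
    """Split a magnitude mixture spectrogram among sources using soft ratio
    masks derived from per-source magnitude estimates. Each input is a 2-D
    (frequency x time) non-negative array."""
    total = sum(source_estimates) + 1e-8  # avoid division by zero
    return [mixture_spec * (est / total) for est in source_estimates]
```

Because the masks sum to (almost exactly) one at every time-frequency bin, the separated spectrograms add back up to the mixture.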
Citations: 3