{"title":"Helicobacter Pylori Classification based on Deep Neural Network","authors":"Yu-Wen Lin, Guo-Shiang Lin, S. Chai","doi":"10.1109/AVSS.2019.8909848","DOIUrl":"https://doi.org/10.1109/AVSS.2019.8909848","url":null,"abstract":"In this paper, a Helicobacter pylori classification method based on a deep neural network (Inception v3) is proposed. The purpose of the proposed model is to provide physicians with a reference for the diagnosis of Helicobacter pylori infection, thereby increasing diagnostic efficiency. Data augmentation and transfer learning are exploited during model construction to generate a classification system with high prediction accuracy. To evaluate the performance of the proposed method, many endoscopic images were collected for testing. Experimental results show that the proposed method can reliably determine whether or not the input image contains Helicobacter pylori.","PeriodicalId":243194,"journal":{"name":"2019 16th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS)","volume":"85 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123333731","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A New Bayesian Modeling for 3D Human-Object Action Recognition","authors":"Camille Maurice, Francisco Madrigal, A. Monin, F. Lerasle","doi":"10.1109/AVSS.2019.8909873","DOIUrl":"https://doi.org/10.1109/AVSS.2019.8909873","url":null,"abstract":"Intelligent surveillance systems in human-centered environments require monitoring of people's behavior. In this paper, we propose a new Bayesian framework to recognize actions in RGB-D videos from two different observations: the human pose and the objects in its vicinity. We design a model for each action that integrates these observations with a probabilistic sequencing of the actions performed during activities. We validate our approach on two public video datasets, CAD-120 and Watch-n-Patch. We show a performance gain of 4% for on-the-fly action detection on CAD-120 videos. Our approach is competitive with 2D image-feature and skeleton-based methods, with an improvement of 16% on Watch-n-Patch. Action recognition performance is clearly improved by our joint Bayesian human-object perception.","PeriodicalId":243194,"journal":{"name":"2019 16th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS)","volume":"18 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125240352","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"3DMA: A Multi-modality 3D Mask Face Anti-spoofing Database","authors":"Jinchuan Xiao, Yinhang Tang, Jianzhu Guo, Yang Yang, Xiangyu Zhu, Zhen Lei, Stan Z. Li","doi":"10.1109/AVSS.2019.8909845","DOIUrl":"https://doi.org/10.1109/AVSS.2019.8909845","url":null,"abstract":"Benefiting from publicly available databases, face anti-spoofing has recently gained extensive attention in the academic community. However, most of the existing databases focus on 2D attacks, including photo and video attacks. The only two public 3D mask face anti-spoofing databases are very small. In this paper, we release a multi-modality 3D mask face anti-spoofing database named 3DMA, which contains 920 videos of 67 genuine subjects wearing 48 kinds of 3D masks, captured in the visual (VIS) and near-infrared (NIR) modalities. To simulate real-world scenarios, two illumination and four capturing-distance settings were deployed during the collection process. To the best of our knowledge, the proposed database is currently the most extensive public database for 3D mask face anti-spoofing. Furthermore, we build three protocols for performance evaluation under different illumination conditions and distances. Experimental results with Convolutional Neural Network (CNN) and LBP-based methods reveal that our proposed 3DMA is indeed a challenge for face anti-spoofing. This database is available at http://www.cbsr.ia.ac.cn/english/3DMA.html. We hope our public 3DMA database can help to pave the way for further research on 3D mask face anti-spoofing.","PeriodicalId":243194,"journal":{"name":"2019 16th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS)","volume":"49 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127876203","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Multi-Threshold based Ground Detection for Point Cloud Scene","authors":"Chien-Chou Lin, Chih-Wei Lee, L. Yao","doi":"10.1109/AVSS.2019.8909897","DOIUrl":"https://doi.org/10.1109/AVSS.2019.8909897","url":null,"abstract":"Point clouds have recently been widely used in self-driving technology. Usually, the first step of point cloud processing is the segmentation of ground and non-ground points. In this paper, a multi-threshold detector is proposed for point cloud scenes captured by a LiDAR mounted on an autonomous vehicle. The proposed algorithm uses varying thresholds that depend on the distance between two consecutive points. Furthermore, additional rules are introduced for finding the starting ground point of each scanning line and for eliminating backward slopes. Simulation results show that the proposed algorithm works well in different testing environments in terms of miss rate, accuracy, and execution time. For one scene with more than 180,000 points, the proposed algorithm completes the segmentation in 8 ms with a 99.5% accuracy rate.","PeriodicalId":243194,"journal":{"name":"2019 16th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS)","volume":"55 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114152272","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Vehicle Tracking Using Deep SORT with Low Confidence Track Filtering","authors":"Xinyu Hou, Yi Wang, Lap-Pui Chau","doi":"10.1109/AVSS.2019.8909903","DOIUrl":"https://doi.org/10.1109/AVSS.2019.8909903","url":null,"abstract":"Multi-object tracking (MOT) has become an attractive topic due to its wide range of applications in video surveillance and traffic monitoring. Recent improvements in MOT have focused on the tracking-by-detection paradigm. However, as a relatively complicated and integrated computer vision task, state-of-the-art tracking-by-detection techniques still suffer from issues such as a large number of false-positive tracks. To reduce the effect of unreliable detections on vehicle tracking, in this paper, we propose to incorporate low-confidence track filtering into the Simple Online and Realtime Tracking with a Deep association metric (Deep SORT) algorithm. We present a self-generated UA-DETRAC vehicle re-identification dataset that can be used to train the convolutional neural network of Deep SORT for data association. We evaluate our proposed tracker on the UA-DETRAC test dataset. Experimental results show that the proposed method improves on the original Deep SORT algorithm by a significant margin. Our tracker outperforms the state-of-the-art online trackers and is comparable with batch-mode trackers.","PeriodicalId":243194,"journal":{"name":"2019 16th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS)","volume":"34 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124897893","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"RGB-Depth Cross-Modal Person Re-identification","authors":"Frank M. Hafner, Amran Bhuiyan, Julian F. P. Kooij, Eric Granger","doi":"10.1109/AVSS.2019.8909838","DOIUrl":"https://doi.org/10.1109/AVSS.2019.8909838","url":null,"abstract":"Person re-identification is a key challenge for surveillance across multiple sensors. Prompted by the advent of powerful deep learning models for visual recognition, and inexpensive RGBD cameras and sensor-rich mobile robotic platforms, e.g. self-driving vehicles, we investigate the relatively unexplored problem of cross-modal re-identification of persons between RGB (color) and depth images. The considerable divergence in data distributions across different sensor modalities introduces additional challenges beyond the typical difficulties of distinct viewpoints, occlusions, and pose and illumination variations. While some work has investigated re-identification across RGB and infrared, we take inspiration from successes in transfer learning from RGB to depth in object detection tasks. Our main contribution is a novel cross-modal distillation network for robust person re-identification, which learns a shared feature representation space of a person's appearance in both RGB and depth images. The proposed network was compared to conventional and deep learning approaches proposed for other cross-domain re-identification tasks. Results obtained on the public BIWI and RobotPKU datasets indicate that the proposed method can significantly outperform the state-of-the-art approaches by up to 10.5% mAP, demonstrating the benefit of the proposed distillation paradigm.","PeriodicalId":243194,"journal":{"name":"2019 16th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS)","volume":"69 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115511250","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Joint Coarse-and-Fine Semantic Segmentation","authors":"Yi-Cheng Chiu, Chih-Yang Lin, T. Shih","doi":"10.1109/AVSS.2019.8909829","DOIUrl":"https://doi.org/10.1109/AVSS.2019.8909829","url":null,"abstract":"Image semantic segmentation is a well-known problem in computer vision and artificial intelligence. Ground truth for image segmentation is hard to produce and is time- and resource-intensive. Recent research on real-time image semantic segmentation based on deep learning has reduced image resolution through pooling operations, resulting in a loss of detail in the scene. In order to generate high-quality annotated data, in this paper, we propose a joint coarse-and-fine (JCF) architecture that can repair fragment defects with a coarse module and also produce fine details with a fine module. The experiments show promising results compared to state-of-the-art methods.","PeriodicalId":243194,"journal":{"name":"2019 16th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS)","volume":"33 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116986713","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Comprehensive Study on Large-Scale Person Retrieval in Real Surveillance Scenarios","authors":"Da Li, Zhang Zhang, Caifeng Shan, Liang Wang, T. Tan","doi":"10.1109/AVSS.2019.8909851","DOIUrl":"https://doi.org/10.1109/AVSS.2019.8909851","url":null,"abstract":"Person retrieval is a hot research topic due to its important application potential for public security. Though existing algorithms have achieved impressive progress on current public datasets, it remains a challenging task in real surveillance scenarios due to varied viewpoints, pose variations, and occlusions. Moreover, few of the existing works study the problem of person retrieval on a large-scale gallery set, where many distractors may heavily degrade the retrieval results. To gain a deep understanding of the above challenges, we perform a comprehensive study of current state-of-the-art person retrieval algorithms on a large-scale benchmark in real surveillance scenarios. In the study, two kinds of techniques, i.e., attribute recognition and person re-identification, comprising eight algorithms, are evaluated at both the algorithm level and the system level. Here, the system-level evaluations investigate the effects of combining the above algorithms with a person detection module, where many distractors in the person detection results pose a big challenge for person retrieval in real scenes. Extensive evaluations with large gallery sizes (up to 243k) and comprehensive analyses are presented in the study, which will guide researchers in developing more advanced algorithms in the future.","PeriodicalId":243194,"journal":{"name":"2019 16th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS)","volume":"60 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134404392","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Spatial Attention for Pedestrian Detection","authors":"Ujjwal, Aziz Dziri, Bertrand Leroy, F. Brémond","doi":"10.1109/AVSS.2019.8909907","DOIUrl":"https://doi.org/10.1109/AVSS.2019.8909907","url":null,"abstract":"Achieving high detection accuracy and high inference speed is important for a pedestrian detection system in self-driving applications. There exists a trade-off between detection accuracy and inference speed in modern convolutional object detectors. In this paper, we propose a novel pedestrian detection system, which leverages spatial attention and a two-level cascade of classification and bounding box regression to balance the trade-off. Our proposed spatial attention module reduces the search space for pedestrians by selecting a small set of anchor boxes for further processing. Furthermore, we present a two-level cascade of bounding box classification and regression and demonstrate its effectiveness for improved accuracy. We demonstrate the performance of our system on two public datasets, Caltech-Reasonable and CityPersons, with state-of-the-art performance. Our ablation studies confirm the usefulness of our spatial attention and cascade modules.","PeriodicalId":243194,"journal":{"name":"2019 16th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS)","volume":"60 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115799526","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Suggesting Gaze-based Selection for Surveillance Applications","authors":"Jutta Hild, E. Peinsipp-Byma, M. Voit, J. Beyerer","doi":"10.1109/AVSS.2019.8909833","DOIUrl":"https://doi.org/10.1109/AVSS.2019.8909833","url":null,"abstract":"Selection is a basic input operation when interacting with a computer. Traditional manual selection methods like mouse input are challenging when interacting with a dynamic scene containing moving objects, as occurs in surveillance applications. In this contribution, we give an overview of gaze-based selection as a fast and intuitive alternative method, considering typical selection tasks in surveillance applications. In this context, we report the results of an evaluation of the initialization of an object-tracking algorithm performed by eighteen expert video analysts. Besides its benefits, gaze-based selection is difficult if the objects to be selected are small and close together. We provide first results of a pilot study evaluating a distance measure that might have the potential to make gaze-based selection more robust. Finally, for gaze-based selection to become a common technique, low-cost eye-tracking devices have to be available that achieve the same accuracy as high-end devices. As such cheap eye-trackers have only recently become available, it is so far not evident which accuracy can be expected. Hence, we report the results of an accuracy evaluation of the Tobii 4C conducted with twelve students performing a calibration-style task.","PeriodicalId":243194,"journal":{"name":"2019 16th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS)","volume":"4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125556307","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}