{"title":"End-to-end crowd counting via joint learning local and global count","authors":"C. Shang, H. Ai, Bo Bai","doi":"10.1109/ICIP.2016.7532551","DOIUrl":"https://doi.org/10.1109/ICIP.2016.7532551","url":null,"abstract":"Crowd counting is a very challenging task in crowded scenes due to heavy occlusions, appearance variations and perspective distortions. Current crowd counting methods typically operate on an image patch level with overlaps, then sum over the patches to get the final count. In this paper, we propose an end-to-end convolutional neural network (CNN) architecture that takes a whole image as its input and directly outputs the counting result. While making use of sharing computations over overlapping regions, our method takes advantage of contextual information when predicting both local and global count. In particular, we first feed the image to a pre-trained CNN to get a set of high level features. Then the features are mapped to local counting numbers using recurrent network layers with memory cells. We perform the experiments on several challenging crowd counting datasets, which achieve the state-of-the-art results and demonstrate the effectiveness of our method.","PeriodicalId":6521,"journal":{"name":"2016 IEEE International Conference on Image Processing (ICIP)","volume":"14 1","pages":"1215-1219"},"PeriodicalIF":0.0,"publicationDate":"2016-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"79313403","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A multimedia gesture dataset for human robot communication: Acquisition, tools and recognition results","authors":"I. Rodomagoulakis, N. Kardaris, Vassilis Pitsikalis, A. Arvanitakis, P. Maragos","doi":"10.1109/ICIP.2016.7532923","DOIUrl":"https://doi.org/10.1109/ICIP.2016.7532923","url":null,"abstract":"Motivated by the recent advances in human-robot interaction we present a new dataset, a suite of tools to handle it and state-of-the-art work on visual gestures and audio commands recognition. The dataset has been collected with an integrated annotation and acquisition web-interface that facilitates on-the-way temporal ground-truths for fast acquisition. The dataset includes gesture instances in which the subjects are not in strict setup positions, and contains multiple scenarios, not restricted to a single static configuration. We accompany it by a valuable suite of tools as the practical interface to acquire audio-visual data in the robotic operating system, a state-of-the-art learning pipeline to train visual gesture and audio command models, and an online gesture recognition system. Finally, we include a rich evaluation of the dataset providing rich and insightful experimental recognition results.","PeriodicalId":6521,"journal":{"name":"2016 IEEE International Conference on Image Processing (ICIP)","volume":"32 1","pages":"3066-3070"},"PeriodicalIF":0.0,"publicationDate":"2016-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81270829","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Scale-invariant anomaly detection with multiscale group-sparse models","authors":"Diego Carrera, G. Boracchi, A. Foi, B. Wohlberg","doi":"10.1109/ICIP.2016.7533089","DOIUrl":"https://doi.org/10.1109/ICIP.2016.7533089","url":null,"abstract":"The automatic detection of anomalies, defined as patterns that are not encountered in a representative set of normal images, is an important problem in industrial control and biomedical applications. We have shown that this problem can be successfully addressed by the sparse representation of individual image patches using a dictionary learned from a large set of patches extracted from normal images. Anomalous patches are detected as those for which the sparse representation on this dictionary exceeds sparsity or error tolerances. Unfortunately, this solution is not suitable for many real-world visual inspection-systems since it is not scale invariant: since the dictionary is learned at a single scale, patches in normal images acquired at a different magnification level might be detected as anomalous. We present an anomaly-detection algorithm that learns a dictionary that is invariant to a range of scale changes, and overcomes this limitation by use of an appropriate sparse coding stage. The algorithm was successfully tested in an industrial application by analyzing a dataset of Scanning Electron Microscope (SEM) images, which typically exhibit different magnification levels.","PeriodicalId":6521,"journal":{"name":"2016 IEEE International Conference on Image Processing (ICIP)","volume":"4 1","pages":"3892-3896"},"PeriodicalIF":0.0,"publicationDate":"2016-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81893954","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Patch similarity based edge-preserving background estimation for single frame infrared small target detection","authors":"Kun Bai, Yuehuang Wang, Qiong Song","doi":"10.1109/ICIP.2016.7532343","DOIUrl":"https://doi.org/10.1109/ICIP.2016.7532343","url":null,"abstract":"Edges in infrared image usually cause serious false alarms in single frame infrared small target detection. So a novel edge-preserving background estimation method is proposed for small target detection to attenuate this problem. First we will introduce the patch similarity feature of infrared image. Then, patch similarity of infrared image is utilized to formulate edge-preserving infrared background estimation. At last, estimated background will be eliminated from original infrared image to suppress edges. The effective edge-preserving ability of our approach will be shown through experiments and comparisons with state-of-the-art background estimation methods.","PeriodicalId":6521,"journal":{"name":"2016 IEEE International Conference on Image Processing (ICIP)","volume":"29 1","pages":"181-185"},"PeriodicalIF":0.0,"publicationDate":"2016-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85731160","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Automatic vehicle counting method based on principal component pursuit background modeling","authors":"Jorge Quesada, P. Rodríguez","doi":"10.1109/ICIP.2016.7533075","DOIUrl":"https://doi.org/10.1109/ICIP.2016.7533075","url":null,"abstract":"Estimating the number of vehicles present in traffic video sequences is a common task in applications such as active traffic management and automated route planning. There exist several vehicle counting methods such as Particle Filtering or Headlight Detection, among others. Although Principal Component Pursuit (PCP) is considered to be the state-of-the-art for video background modeling, it has not been previously exploited for this task. This is mainly because most of the existing PCP algorithms are batch methods and have a high computational cost that makes them unsuitable for real-time vehicle counting. In this paper, we propose to use a novel incremental PCP-based algorithm to estimate the number of vehicles present in top-view traffic video sequences in real-time. We test our method against several challenging datasets, achieving results that compare favorably with state-of-the-art methods in performance and speed: an average accuracy of 98% when counting vehicles passing through a virtual door, 91% when estimating the total number of vehicles present in the scene, and up to 26 fps in processing time.","PeriodicalId":6521,"journal":{"name":"2016 IEEE International Conference on Image Processing (ICIP)","volume":"12 1","pages":"3822-3826"},"PeriodicalIF":0.0,"publicationDate":"2016-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"77016754","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Automated blood vessel extraction in two-dimensional breast thermography","authors":"S. Kakileti, K. Venkataramani","doi":"10.1109/ICIP.2016.7532383","DOIUrl":"https://doi.org/10.1109/ICIP.2016.7532383","url":null,"abstract":"In this paper, we present an automated algorithm for detection of blood vessels in 2D-thermographic images for breast cancer screening. Vessel extraction from breast thermal images helps in the classification of malignancy as cancer causes increased blood flow at warmer temperatures, additional vessel formation and tortuosity of vessels feeding the cancerous growth. The proposed algorithm uses three enhanced images to detect possible vessel regions based on their intensity and shape. The final vessel detection combines these three outputs. The algorithm does not depend on the variation of pixel intensity in the images but only depends on the relative variation unlike many standard algorithms. On a dataset of over 40 subjects with high-resolution thermographic images, we are able to extract the vessels accurately with elimination of diffused heat regions. Future studies would involve extracting features from the detected vessels and using these features for classification of malignancy.","PeriodicalId":6521,"journal":{"name":"2016 IEEE International Conference on Image Processing (ICIP)","volume":"53 1 1","pages":"380-384"},"PeriodicalIF":0.0,"publicationDate":"2016-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78592141","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Robust calibration of broadcast cameras based on ellipse and line contours","authors":"S. Croci, N. Stefanoski, A. Smolic","doi":"10.1109/ICIP.2016.7532377","DOIUrl":"https://doi.org/10.1109/ICIP.2016.7532377","url":null,"abstract":"Professional TV studio footage often poses specific challenges to camera calibration due to lack of features and complex camera operation. As available algorithms often fail, we propose a novel approach based on robust tracking of ellipse and line features of a predefined logo. We further devise a predictive and iterative estimation algorithm, which incorporates confidence measures and filtering. Our results validate accuracy and reliability of our approach, demonstrated with challenging professional footage.","PeriodicalId":6521,"journal":{"name":"2016 IEEE International Conference on Image Processing (ICIP)","volume":"24 1","pages":"350-354"},"PeriodicalIF":0.0,"publicationDate":"2016-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78557208","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Projective non-negative matrix factorization for unsupervised graph clustering","authors":"C. Bampis, P. Maragos, A. Bovik","doi":"10.1109/ICIP.2016.7532559","DOIUrl":"https://doi.org/10.1109/ICIP.2016.7532559","url":null,"abstract":"We develop an unsupervised graph clustering and image segmentation algorithm based on non-negative matrix factorization. We consider arbitrarily represented visual signals (in 2D or 3D) and use a graph embedding approach for image or point cloud segmentation. We extend a Projective Non-negative Matrix Factorization variant to include local spatial relationships over the image graph. By using properly defined region features, one can apply our method of unsupervised graph clustering for object and image segmentation. To demonstrate this, we apply our ideas on many graph based segmentation tasks such as 2D pixel and super-pixel segmentation and 3D point cloud segmentation. Finally, we show results comparable to those achieved by the only existing work in pixel based texture segmentation using Nonnegative Matrix Factorization, deploying a simple yet effective extension that is parameter free. We provide a detailed convergence proof of our spatially regularized method and various demonstrations as supplementary material. This novel work brings together graph clustering with image segmentation.","PeriodicalId":6521,"journal":{"name":"2016 IEEE International Conference on Image Processing (ICIP)","volume":"36 1","pages":"1255-1258"},"PeriodicalIF":0.0,"publicationDate":"2016-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"79969161","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Context-aware event-driven stereo matching","authors":"Dongqing Zou, Ping Guo, Qiang Wang, Xiaotao Wang, Guangqi Shao, Feng Shi, Jia Li, P. Park","doi":"10.1109/ICIP.2016.7532523","DOIUrl":"https://doi.org/10.1109/ICIP.2016.7532523","url":null,"abstract":"Similarity measuring plays an important role in stereo matching, whether for visual data from standard cameras or for those from novel sensors such as Dynamic Vision Sensors (DVS). Generally speaking, robust feature descriptors contribute to designing a powerful similarity measurement, as demonstrated by classic stereo matching methods. However, the kind and representative ability of feature descriptors for DVS data are so limited that achieving accurate stereo matching on DVS data becomes very challenging. In this paper, a novel feature descriptor is proposed to improve the accuracy for DVS stereo matching. Our feature descriptor can describe the local context or distribution of the DVS data, contributing to constructing an effective similarity measurement for DVS data matching, yielding an accurate stereo matching result. Our method is evaluated by testing our method on groundtruth data and comparing with various standard stereo methods. Experiments demonstrate the efficiency and effectiveness of our method.","PeriodicalId":6521,"journal":{"name":"2016 IEEE International Conference on Image Processing (ICIP)","volume":"23 1","pages":"1076-1080"},"PeriodicalIF":0.0,"publicationDate":"2016-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81467347","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A novel classification system for dysplastic nevus and malignant melanoma","authors":"Mutlu Mete, N. Sirakov, John Griffin, A. Menter","doi":"10.1109/ICIP.2016.7532993","DOIUrl":"https://doi.org/10.1109/ICIP.2016.7532993","url":null,"abstract":"Melanoma is a potentially deadly form of skin cancer; however, if detected early, it is curable. A dysplastic nevus (atypical mole) is not cancerous but may represent a precursor to malignancy as nearly 40% of melanomas arise from a preexisting mole. In this study, we propose a system to classify a skin lesion image as melanoma (M), dysplastic nevus (D), and benign (B). For this purpose we develop a new two layered-system. The first layer consists of three binary Support Vector Machine (SVM) classifiers, one for each pair of classes, M vs B, M vs D, and B vs D. The second layer is a novel decision maker function, which uses probability memberships derived from the first layer. Each lesion is characterized with five features, which mostly overlap with the ABCD rule of dermatology. The dataset we used has 112 lesions with 54 M, 38 D, and 20 B cases. In the experiments of melanoma detection, we obtained 98% specificity, 76% sensitivity, and 85% F-measure accuracy.","PeriodicalId":6521,"journal":{"name":"2016 IEEE International Conference on Image Processing (ICIP)","volume":"98 1","pages":"3414-3418"},"PeriodicalIF":0.0,"publicationDate":"2016-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"76186152","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}