{"title":"Efficient HD video and image salient object detection with hierarchical boolean map approach","authors":"Bo Xiao, Bin Wang","doi":"10.1109/ICIVC.2017.7984448","DOIUrl":"https://doi.org/10.1109/ICIVC.2017.7984448","url":null,"abstract":"We present an efficient technique for high-definition image and video salient object detection using a hierarchical Boolean map approach. We begin by extracting multiple boolean map layers. Within each layer, we then apply flood fill algorithm to each seed pixel in parallel to generate attention maps. The saliency map is calculated by summing up all the attention maps. We further improve video consistency by using a border median filter to avoid flicker. This hierarchical approach accelerates the salient object detection process and consistently achieves convincing results.","PeriodicalId":181522,"journal":{"name":"2017 2nd International Conference on Image, Vision and Computing (ICIVC)","volume":"13 3","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-06-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132733142","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Natural scene text detection based on SWT, MSER and candidate classification","authors":"L. Guan, Jizheng Chu","doi":"10.1109/ICIVC.2017.7984452","DOIUrl":"https://doi.org/10.1109/ICIVC.2017.7984452","url":null,"abstract":"This paper presents a novel scene text detection algorithm based on Stroke Width Transform (SWT), Maximally Extremal Regions (MSER) and candidate classification. Firstly, utilize the SWT and MSER to extract the candidate characters at the same time. Secondly, preliminary filtering the candidate connected components based on heuristic rules. Thirdly, using mutual verification and integration to class all candidate into two categories: strong candidates, weak candidates. If the weak candidate has similar properties with strong candidate, then the weak candidate is changed into strong candidate. Finally, the text area is aggregated into text lines by text line aggregation algorithm. The experiment results on public datasets show that the proposed method can detect text lines effectively.","PeriodicalId":181522,"journal":{"name":"2017 2nd International Conference on Image, Vision and Computing (ICIVC)","volume":"18 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-06-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123746001","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Tracking skeletal fusion feature for one shot learning gesture recognition","authors":"Li Xuejiao, S. Yongqing","doi":"10.1109/ICIVC.2017.7984545","DOIUrl":"https://doi.org/10.1109/ICIVC.2017.7984545","url":null,"abstract":"Accessibility of RGB-D sensors have facilitated the research in gesture recognition. During sundry approaches, it is found that skeleton information is significant especially for one shot learning by virtue of the minimum requirement of data. We made a review on state-of-the-art approaches for gesture recognition in one shot learning. Based on bag of visual model (BOVW), this paper presents a study on skeletal tracking from RGB-D and puts forward a novel skeletal fusion feature extracted from these data, namely skeletal filtered features around key points (SFFK). The proposed SFFK feature is efficient, precise and robust. Efforts were made to optimize the gesture segmentation algorithm based on dynamic time warping (DTW). We propose different ways to gain the motion matrix, during which we find one performs best. That is taking OR operation on two difference images obtained from three adjacent frames. Finally, we evaluated our approach on the ChaLearn gesture dataset (CGD). The results show that our approach is remarkably superior to those existed approaches on CGD.","PeriodicalId":181522,"journal":{"name":"2017 2nd International Conference on Image, Vision and Computing (ICIVC)","volume":"30 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-06-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125993420","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Grid clustering analysis the big data of spectrum by deep learning","authors":"Chen Shuxin, Sun Weimin","doi":"10.1109/ICIVC.2017.7984705","DOIUrl":"https://doi.org/10.1109/ICIVC.2017.7984705","url":null,"abstract":"With the Internet plus big data science information era increasing rapidly. To seek special and unknown objects is the human exploration of the mystery of the universe to pursue the goal in the universe. The spectrum by the big data mining are the fairly complex data, the dimension is high, and the correlation between the dimensions is not strong, but it is easy to introduce noise or the missing data. So it is much more difficult to deal with metering data. This article investigates the LAMOST data release star spectrum based on the high resolution spectral parameters. The RFITSIO software package of R language is used to graphically analyze the big data of the spectrum. The deep learning analysis extracts the information from the large data with finding the new knowledge and the unknown outlier data. Now the FITS format spectral large data information rise to 107 levels of data. Since the big data is imported with a large amount of redundant information, the full spectrum signal of the star spectrum making the full use of Multivariable Statistical Analysis to cluster clustering data characterized by line index. Using the Lick line index as the spectral feature, the spectral data are clustered by the K-means mean algorithm of deep learning. Experiments show that the data with strong physical correlation are valid and fast, the clustering outlier analysis of the big data feature in the spectral survey are completed with the characteristics of the data.","PeriodicalId":181522,"journal":{"name":"2017 2nd International Conference on Image, Vision and Computing (ICIVC)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-06-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129908218","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Algorithm of fingerprint extraction and implementation based on OpenCV","authors":"Yue Yaru, Zhu Jialin","doi":"10.1109/ICIVC.2017.7984539","DOIUrl":"https://doi.org/10.1109/ICIVC.2017.7984539","url":null,"abstract":"At present, OpenCV function library is applied more and more widely, and it is used in digital image processing to solve some problems of image processing, and it can improve the effectiveness of image processing. Commonly it is easy to cause the loss of image detail for the Otsu method, in order to solve these problems the Otsu method is improved. In the case of uneven illumination and blurred image, it can segment the target, and the result is accurate, simple and shorter running time. What's more, it can reduce the amount of computation and storage space. In general, it is a fast and effective and good real-time image threshold segmentation algorithm. This paper uses the OpenCV functions to achieve a fingerprint extraction algorithm. The algorithm uses the Otsu algorithm improved to get the best threshold that it can segment image. Simulation experiment is carried out by using object-oriented Vc++6.0 programming tools, and it proves that the fingerprint extraction algorithm based on OpenCV function library is effective. It can improve the accuracy of fingerprint extraction, and image likes real image. Give some code.","PeriodicalId":181522,"journal":{"name":"2017 2nd International Conference on Image, Vision and Computing (ICIVC)","volume":"2010 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-06-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127351997","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An accelerated matching algorithm for SIFT-like features","authors":"Hongyan Zhang, J. Luo, Zihao Wang, Long Ma, Yi-Fan Niu","doi":"10.1109/ICIVC.2017.7984527","DOIUrl":"https://doi.org/10.1109/ICIVC.2017.7984527","url":null,"abstract":"This paper defines the SIFT-like features by analogy and proposes a novel method to accelerate its matching process. The acceleration strategy is to compute a characteristic value for each key point descriptor and divide a key point set into different subsets making use of this value. The approximate nearest neighbor (ANN) search method is applied to improve the efficiency of matching. The performance and accuracy of the proposed algorithm have been tested on various data and compared with the normal ANN search. The experimental results show the new method is, on average, twice faster than ANN search when it is applied to SIFT features' matching.","PeriodicalId":181522,"journal":{"name":"2017 2nd International Conference on Image, Vision and Computing (ICIVC)","volume":"133 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-06-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123514296","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Integrating saliency and ResNet for airport detection in large-size remote sensing images","authors":"Tinghe Zhu, Yuhui Li, Qiankun Ye, H. Huo, T. Fang","doi":"10.1109/ICIVC.2017.7984451","DOIUrl":"https://doi.org/10.1109/ICIVC.2017.7984451","url":null,"abstract":"Automatic airport detection has received great attention due to the importance of airports in both military and civilian uses. This paper focuses on automatic airport detection in large-size remote sensing images under a two-step object detection framework. In the first step, both geometrical saliency and local entropy saliency are improved to find more accurate ROIs for detecting airports in large-size remote sensing images. The geometrical saliency is based on line features of airports, and line segment detector (LSD) is used to detect line segments. Then, line group weighted saliency map is generated after line connection, and local entropy saliency map is created by further considering the entropy difference between neighbor pixels. Finally, ROIs can be obtained by combining these two saliency maps. The improved saliencies could make airports prominent as a whole instead of many separated parts, and it could find more accurate ROIs than traditional saliency methods. In the second step, deep residual learning network (ResNet) is used to determine whether a ROI should be labeled as an airport. The unobvious features of airports could be further extracted by ResNet, which greatly promotes the robustness of the proposed two-step object detection method. The experiments on large-size remote sensing images have shown that the proposed method could reduce false alarms greatly compared with both the traditional two-way saliency (TWS) airport detection method and the state-of-the-art object detection method of single shot multi-box detector (SSD).","PeriodicalId":181522,"journal":{"name":"2017 2nd International Conference on Image, Vision and Computing (ICIVC)","volume":"46 6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-06-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121975560","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Unconstrained face detection based on cascaded Convolutional Neural Networks in surveillance video","authors":"Junjie Li, Saleem Karmoshi, Ming Zhu","doi":"10.1109/ICIVC.2017.7984516","DOIUrl":"https://doi.org/10.1109/ICIVC.2017.7984516","url":null,"abstract":"With the popularity of surveillance video, face detection in surveillance video has become a popular and important topic. Face detection in surveillance video plays an important role in many popular applications such as: personal identification, crowd analysis, database establishment, and abnormal event detection. This paper proposes an unconstrained face detection method for surveillance video, which is not influenced by factors such as face location, expression, posture, scale, and lighting conditions. First, the detection area is initially extracted from the video frame using the improved foreground extraction and skin color detection. Next, we then use the multi-scale sliding window and the cascaded Convolutional Neural Network (CNN) designed in this paper to detect faces. This cascaded network consists of two CNN networks: the first network filters out most of the background area while ensuring the running speed of the whole system and the recall rate of the face, while the second network guarantees the accuracy of the overall system. Finally, we set up a database for the experiment which contained samples from the actual surveillance video. The results of our experiment suggest that the proposed method can obtain good results on unconstrained face detection in surveillance video and can also achieve satisfactory detection speed.","PeriodicalId":181522,"journal":{"name":"2017 2nd International Conference on Image, Vision and Computing (ICIVC)","volume":"47 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-06-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122181173","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Application of data mining methods in diabetes prediction","authors":"Messan Komi, Jun Li, Yongxin Zhai, Xianguo Zhang","doi":"10.1109/ICIVC.2017.7984706","DOIUrl":"https://doi.org/10.1109/ICIVC.2017.7984706","url":null,"abstract":"Data science methods have the potential to benefit other scientific fields by shedding new light on common questions. One such task is help to make predictions on medical data. Diabetes mellitus or simply diabetes is a disease caused due to the increase level of blood glucose. Various traditional methods, based on physical and chemical tests, are available for diagnosing diabetes. The methods strongly based on the data mining techniques can be effectively applied for high blood pressure risk prediction. In this paper, we explore the early prediction of diabetes via five different data mining methods including: GMM, SVM, Logistic regression, ELM, ANN. The experiment result proves that ANN (Artificial Neural Network) provides the highest accuracy than other techniques.","PeriodicalId":181522,"journal":{"name":"2017 2nd International Conference on Image, Vision and Computing (ICIVC)","volume":"232 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-06-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122463003","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Enhance deep learning performance in face recognition","authors":"Ze Lu, Xudong Jiang, A. Kot","doi":"10.1109/ICIVC.2017.7984554","DOIUrl":"https://doi.org/10.1109/ICIVC.2017.7984554","url":null,"abstract":"Deep convolutional neural networks (CNNs) based face recognition approaches have been dominating the field. The success of CNNs is attributed to their ability to learn rich image representations. But training CNNs relies on estimating millions of parameters and requires a very large number of annotated training images. A widely-used alternative is to fine-tune the CNN that has been pre-trained using a large set of labeled images. However, we show that fine-tuning pre-trained CNNs cannot provide satisfactory face recognition performance when training and testing datasets have large differences. To address this problem, we propose to improve the face recognition performance of CNNs by using non-CNN features. Extensive experiments are conducted on LFW and FRGC databases using the pre-trained CNN model, VGG-Face. Results show that the complementary information contained in non-CNN features greatly improves the face verification rate/accuracy of CNNs on LFW and FRGC databases. Furthermore, we show that non-CNN features are more effective in enhancing the performance of pre-trained CNNs than fine-tuning.","PeriodicalId":181522,"journal":{"name":"2017 2nd International Conference on Image, Vision and Computing (ICIVC)","volume":"14 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-06-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117218341","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}