{"title":"Faster R-CNN for Marine Organism Detection and Recognition Using Data Augmentation","authors":"Hao Zhou, Hai Huang, Xu Yang, Lu Zhang, Lu Qi","doi":"10.1145/3177404.3177433","DOIUrl":"https://doi.org/10.1145/3177404.3177433","url":null,"abstract":"Recently, Faster Region-based CNN(Faster R-CNN) has achieved marvelous accomplishment in object detection and recognition. In this paper, Faster R-CNN is applied to marine organism detection and recognition. However, the training of Faster R-CNN requires a mass of labeled samples which are difficult to obtain for marine organism. Therefore, three data augmentation methods are proposed dedicated to underwater-imaging. Specifically, the inverse process of underwater image restoration is used to simulate different marine turbulence environments. Perspective transformation is proposed to simulate different view of camera shooting. Illumination synthesis is used to simulate different marine illuminating environments. The performance of each data augmentation method, together with Faster R-CNN is evaluated by experiments on real world underwater dataset, which validate the effectiveness of the proposed method for marine organism detection and recognition.","PeriodicalId":133378,"journal":{"name":"Proceedings of the International Conference on Video and Image Processing","volume":"54 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-12-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115368163","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Application Research of the Laser and Digital Image Processing in Bridge Monitoring","authors":"Zhang-li Lan, Wei Chen, Fang Liu, Yang Yang","doi":"10.1145/3177404.3177443","DOIUrl":"https://doi.org/10.1145/3177404.3177443","url":null,"abstract":"Bridge is a key link in the traffic system, and many new methods have been studied to detect and monitorthe health of bridge, among which the method of bridge detection using the laser and laser image is a research hotpot. In this paper, the displacement measurement method of the deflection, cable support tower and anchor structure using the laser and image processing is studied, and the principle of the measurement iselaborated, among which the key algorithms are described. Successfully application in the practical bridge monitoring has testified that the methodcan meet the performanceneeds of the bridge structure above.","PeriodicalId":133378,"journal":{"name":"Proceedings of the International Conference on Video and Image Processing","volume":"15 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-12-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115370008","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Identification of Chicken Eimeria Species from Microscopic Images by Using MLP Deep Learning Algorithm","authors":"Mucahit Buyukyilmaz, Ali Osman Çibikdiken, M. A. Abdalla, H. Seker","doi":"10.1145/3177404.3177445","DOIUrl":"https://doi.org/10.1145/3177404.3177445","url":null,"abstract":"Eimeria has more than one species of every single genus of animals that causes diseases that may spread at fast speed and therefore adversely affects animal productivities and results in animal death. It is therefore essential to detect the disease and prevent its spread at the earliest stage. There have been some attempts to address this problem through the analysis of microscopic images. However, due to the complexity, diversity, and similarity of the types of the species, there need more sophisticated methods to be adapted for the intelligent and automated analysis of their microscopic images by using machine- learning methods. To tackle this problem, a deep-learning-based architecture has been proposed and successfully adapted in this study where Chicken fecal microscopy images have been analyzed to identify nine types of these species. The methodology developed includes two main parts, namely (i) pre-processing steps include the techniques that convert image into gray level, extract cell walls, remove background, rotate cell to vertically aligned position and move to their center and (ii) MLP-based deep learning technique to learn features and classify the images, for which Keras model was utilized. Based on the outcome of a 5-fold cross validation that was repeated for 100 times, the approach taken has yielded an average accuracy of 83.75%±0.60, which is comparable to the existing methods.","PeriodicalId":133378,"journal":{"name":"Proceedings of the International Conference on Video and Image Processing","volume":"14 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-12-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122408372","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Spatio-temporal Analysis for Infrared Facial Expression Recognition from Videos","authors":"Zhilei Liu, Cuicui Zhang","doi":"10.1145/3177404.3177408","DOIUrl":"https://doi.org/10.1145/3177404.3177408","url":null,"abstract":"Facial expression recognition (FER) for emotion inference has become one of the most important research fields in human-computer interaction. Existing study on FER mainly focuses on visible images, whereas varying lighting conditions may influence their performances. Recent studies have demonstrated the advantages of infrared thermal images reflecting the temperature distributions, which are robust to lighting changes. In this paper, a novel infrared image sequence based FER method is proposed using spatiotemporal feature analysis and deep Boltzmann machines (DBM). Firstly, a dense motion field among infrared image sequences is generated using optical flow algorithm. Then, PCA is applied for dimension reduction and a three-layer DBM structure is designed for final expression classification. Finally, the effectiveness of the proposed method is well demonstrated based on several experiments conducted on NVIE database.","PeriodicalId":133378,"journal":{"name":"Proceedings of the International Conference on Video and Image Processing","volume":"5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-12-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128318653","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Single Image Super-resolution Based on Residual Learning","authors":"Chao Xie, Xiaobo Lu","doi":"10.1145/3177404.3177419","DOIUrl":"https://doi.org/10.1145/3177404.3177419","url":null,"abstract":"Patch-based learning methods, such as sparse coding, are by far one of the most dominant ways to handle the single image super-resolution (SISR) issue. However, due to the great success of deep learning, several advanced models based on deep neural networks have been proposed for SISR more recently, gradually revealing its superiority over other counterparts. Therefore, in this paper, we carry on this promising line of work and propose a well-designed network mainly on the basis of residual learning. The key idea of our model is to extract the mean part from the input first in order to lower the impact of background and obtaining two individual components from it. Then, residual learning is applied to mapping the remainder of the input to target high-resolution space, while the mean part is quickly connected to the final output via identity shortcuts. Consequently, our final model conceptually integrates all the above procedures into a completely end-to-end trainable deep network. Thorough experimental results indicate that the proposed method can perform effectively, and is superior to many recently published baselines in terms of both visual fidelity and objective evaluation.","PeriodicalId":133378,"journal":{"name":"Proceedings of the International Conference on Video and Image Processing","volume":"307 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-12-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128337666","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Design of Illegal Occupation Inspection Facilities for Expressway Emergency","authors":"Jie Yuan, Q. Luo, Shubo Wu","doi":"10.1145/3177404.3177416","DOIUrl":"https://doi.org/10.1145/3177404.3177416","url":null,"abstract":"Expressway emergency lane is specially used for vehicles dealing with engineering rescue, medical rescue and other emergency services, any social vehicles are prohibited to enter or stay in the lane. However, the Expressway emergency lanes in China are occupied frequently, which makes it easy to cause delaying in emergency situations and results in serious consequences. Therefore, based on RFID technology, a testing facility for illegal occupation of emergency lane was designed. At the entrance of the toll station, the vehicle information is bound to the electronic tag ID and then distributed to the owner. In the detection area, a communication mechanism is established between the electronic tag, the microcomputer station and the signal lamp induction system by using the RFID technology. By receiving and launching signals, it's easy to detect illegal occupations of vehicles in the emergency lane and to remind the vehicle to leave quickly. At last, the feasibility analysis of the testing facilities shows that the facilities can quickly and accurately detect the vehicles that illegally occupy the emergency lane, and its detection accuracy and power supply capacity meet with the expected requirements. The research of this paper is of great significance to the management of China's expressway emergency lane.","PeriodicalId":133378,"journal":{"name":"Proceedings of the International Conference on Video and Image Processing","volume":"64 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-12-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116602702","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A New Single Camera-based Ball Motion Analysis System for Virtual Sports","authors":"Jongsung Kim, Myung-Gyu Kim","doi":"10.1145/3177404.3177413","DOIUrl":"https://doi.org/10.1145/3177404.3177413","url":null,"abstract":"In this paper, a new single camera-based ball motion analysis system is proposed for virtual sports. In the proposed system, a ball motion imaging process using a single camera combined with a multi-exposure trigger is used to capture a ball motion image without high-cost equipment. Then, a 2D ball motion analysis process using pixel labeling and circle fitting algorithms is used to obtain the 2D ball positions and size from that ball motion image. Finally, a new 3D ball motion analysis process is used to simultaneously estimate the 3D ball positions and velocity by solving new ball motion equations based on some 2D-3D perspective constraints on the ball positions and size. The performance of the proposed system was experimentally verified against a multiple camera-based system and a radar-based system within a virtual soccer platform. Experimental results show that the proposed system is highly effective and efficient for virtual sports.","PeriodicalId":133378,"journal":{"name":"Proceedings of the International Conference on Video and Image Processing","volume":"91 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-12-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123619593","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Analysis of Artistic Language in the Virtual Reality Design","authors":"Chen Peng, Xiaotong Liang","doi":"10.1145/3177404.3177446","DOIUrl":"https://doi.org/10.1145/3177404.3177446","url":null,"abstract":"Virtual Reality (VR) is an emerging and independent art category. After analyzing artistic expression and language in VR, the paper indicates that VR image has changed the artist's creation processes and methods, helping artists produce more excellent and adaptive works. Meanwhile, VR art also poses certain challenges.","PeriodicalId":133378,"journal":{"name":"Proceedings of the International Conference on Video and Image Processing","volume":"53 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-12-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127546162","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Blind Image Quality Assessment Using Center-Surround Mechanism","authors":"Jie Li, Jia Yan, Songfeng Deng, Meiling He","doi":"10.1145/3177404.3177425","DOIUrl":"https://doi.org/10.1145/3177404.3177425","url":null,"abstract":"Blind image quality assessment (BIQA) metrics play an important role in multimedia applications. Neuroscience research indicates that the human visual system (HVS) exhibits clear center-surround mechanisms for visual content extraction. Inspired by this, a center-surround mechanism based feature extraction technique is proposed to solve BIQA problem. The difference-of-Gaussian (DoG) filter, computed in scale-space, has been shown to be able to mimic the center-surround mechanism. In this paper, only DoG maps are employed to characterize the local structure changes in distorted images. The DoG maps are then modeled by generalized Gaussian distribution (GGD) to obtain statistical features. A regression model is learnt to map the features to the subjective quality score. Despite its simplicity, extensive experimental results have demonstrated competitive quality prediction performance and generalization ability of our method.","PeriodicalId":133378,"journal":{"name":"Proceedings of the International Conference on Video and Image Processing","volume":"20 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-12-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125023593","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Proceedings of the International Conference on Video and Image Processing","authors":"","doi":"10.1145/3177404","DOIUrl":"https://doi.org/10.1145/3177404","url":null,"abstract":"","PeriodicalId":133378,"journal":{"name":"Proceedings of the International Conference on Video and Image Processing","volume":"125 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134347208","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}