{"title":"Multi-scale saliency using local gradient and global colour features","authors":"Christopher Cooley, S. Coleman, B. Gardiner, B. Scotney","doi":"10.1145/3357254.3357285","DOIUrl":"https://doi.org/10.1145/3357254.3357285","url":null,"abstract":"In this paper, the issue of scale is addressed in the context of salient object detection. To date, many single-scale models have been proposed for detecting salient objects within a scene. Scale is a fundamental problem within image processing; therefore, multiple-scale techniques are investigated and evaluated, and a novel multi-scale saliency model is presented. The proposed model is compared with two state-of-the-art multi-scale saliency algorithms and qualitatively evaluated with respect to algorithmic accuracy and efficiency on the publicly available MSRA10K salient object dataset.","PeriodicalId":361892,"journal":{"name":"International Conference on Artificial Intelligence and Pattern Recognition","volume":"276 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-08-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114945758","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Face expression recognition based on improved convolutional neural network","authors":"Quanming Liu, Jing Zhang, Y. Xin","doi":"10.1145/3357254.3357275","DOIUrl":"https://doi.org/10.1145/3357254.3357275","url":null,"abstract":"To address the large parameter counts and network degradation caused by simply stacking linear convolution layers or consecutive fully connected layers in traditional expression recognition methods, two convolutional neural network models are designed, using depthwise separable convolution and residual modules respectively, to widen and deepen the network. First, model A adopts depthwise separable convolution instead of regular convolution layers, replaces the final fully connected layer with a global average pooling layer, and uses dropout, batch normalization, the PReLU activation function and image augmentation to avoid over-fitting. Model B adopts a pre-trained ResNet50 model to extract facial features and magnifies the images twice using the SRGAN method. Fusing models A and B with an ensemble method further improves the accuracy. To verify the feasibility of the method, the model was tested on the FER2013 facial expression dataset, and its performance was compared with that of other facial expression recognition algorithms. The final results showed that the improved convolutional neural network (CNN) reached an advanced precision of 73.244% on the FER2013 dataset, and the experimental data and the number of model parameters demonstrated the effectiveness of this method.","PeriodicalId":361892,"journal":{"name":"International Conference on Artificial Intelligence and Pattern Recognition","volume":"63 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-08-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133834591","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Dense attentional network for pancreas segmentation in abdominal CT scans","authors":"Weihao Yu, Huai Chen, Lisheng Wang","doi":"10.1145/3357254.3357259","DOIUrl":"https://doi.org/10.1145/3357254.3357259","url":null,"abstract":"Deep neural networks have been widely used in medical image segmentation and can achieve good results when segmenting some large organs. However, for some small organs, such as the pancreas in 3D CT images, the segmentation results are usually unsatisfactory because the organ occupies a small proportion of the image. In this paper, we present a novel network --- Dense Attentional Network (DA-Net), to improve pancreas segmentation in abdominal CT scans. In DA-Net, dense connections are used to combine low-level encoder features with the corresponding decoder features, which helps improve the utilization of feature maps (FMs). In addition, a new module for recombination and recalibration of feature maps (RRFM) and a new attentional mechanism --- deep attentional features (DAF), are used, which can excite the most discriminating features. The pancreas is segmented from 3D CT images in a coarse-to-fine manner: it is first located by a coarse segmentation network and then finely segmented by the DA-Net. We evaluate the proposed method with 129 CT images from the NIH pancreas dataset and the BTCV segmentation challenge, and compare it with several mainstream segmentation networks. Compared with these networks, our DA-Net achieves a higher mean DSC of 81.39%. This shows the effectiveness and advantage of the proposed method.","PeriodicalId":361892,"journal":{"name":"International Conference on Artificial Intelligence and Pattern Recognition","volume":"95 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-08-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134064663","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Recognizing sound signals through spiking neurons and spike-timing-dependent plasticity","authors":"Yan Liu, Jiawei Chen, Liujun Chen","doi":"10.1145/3357254.3357264","DOIUrl":"https://doi.org/10.1145/3357254.3357264","url":null,"abstract":"Spiking Neural Networks (SNNs) are regarded as brain-inspired neural networks. Most SNNs describe spiking neurons with the leaky integrate-and-fire model, which does not incorporate the biological properties of real neurons. In this paper, a model motivated by the human auditory pathway is proposed to explore a possible sound signal recognition mechanism based on the biological dynamic properties of Hodgkin-Huxley (HH) neurons and the spike-timing-dependent plasticity (STDP) rule of synapses. The first mechanism is that HH neurons have a frequency-selective response. They respond only to their characteristic frequencies in burst spike trains, which makes it possible to recognize sound intensity based on the dynamic neurons. The second mechanism is that, according to the STDP rule, a synaptic connection structure is formed, and the frequency and intensity information of the input signals is stored in the synaptic delay times. Finally, the neural networks recognize sound signals with spatiotemporal firing patterns.","PeriodicalId":361892,"journal":{"name":"International Conference on Artificial Intelligence and Pattern Recognition","volume":"67 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-08-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133298782","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Attention-based bidirectional gated recurrent unit neural networks for sentiment analysis","authors":"Qing Yu, Hui Zhao, Zuohua Wang","doi":"10.1145/3357254.3357262","DOIUrl":"https://doi.org/10.1145/3357254.3357262","url":null,"abstract":"Sentiment analysis is an important research direction of natural language processing. In-depth exploration of online textual emotional information has great social significance and commercial value for market research, online public opinion discovery and early warning. In this paper, the gated recurrent unit neural network and the attention mechanism are combined to propose a text sentiment analysis model---Attention-BGRU. The attention mechanism was added to the gated recurrent unit neural network, and the model was implemented under the Keras deep learning framework. According to the experimental results, the comparison with the existing models shows that the proposed model has obvious advantages over the general deep learning method.","PeriodicalId":361892,"journal":{"name":"International Conference on Artificial Intelligence and Pattern Recognition","volume":"43 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-08-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115774142","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Research on recommender algorithm optimization based on statistics and preference model","authors":"Jia Wang, Xia Song, Q. Jin, Dan Song","doi":"10.1145/3357254.3357291","DOIUrl":"https://doi.org/10.1145/3357254.3357291","url":null,"abstract":"The personalized recommender system has become a research hotspot in the field of artificial intelligence (AI) because it can effectively deal with information overload. Cold start and data sparsity are two major challenges for smart recommender systems. This paper proposes an optimized recommender algorithm based on statistics and a preference model that is able to solve the problems of data sparsity and cold start by means of statistics. Taking the film scoring system as the test object, a Gaussian model is established for the video type preference. The results show that the optimized algorithm can better deal with cold start and data sparsity, and achieve more accurate predicted recommendation scores.","PeriodicalId":361892,"journal":{"name":"International Conference on Artificial Intelligence and Pattern Recognition","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-08-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116231168","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Super-resolution reconstruction algorithms based on fusion of deep learning mechanism and wavelet","authors":"Qi Zhang, Huafeng Wang, Tao Du, Sichen Yang, Yuehai Wang, Zhiqiang Xing, Wenle Bai, Yang Yi","doi":"10.1145/3357254.3358600","DOIUrl":"https://doi.org/10.1145/3357254.3358600","url":null,"abstract":"In this paper, we consider the problem of super-resolution reconstruction. This is a hot topic because super-resolution reconstruction has a wide range of applications in the medical field, remote sensing monitoring, and criminal investigation. Compared with traditional algorithms, current super-resolution reconstruction algorithms based on deep learning greatly improve the clarity of reconstructed pictures. Existing work such as Super-Resolution Using a Generative Adversarial Network (SRGAN) can effectively restore the texture details of the image. However, it has been experimentally verified that the texture details of the image recovered by the SRGAN are not robust. In order to get super-resolution reconstructed images with richer high-frequency details, we improve the network structure and propose a super-resolution reconstruction algorithm combining the wavelet transform and a Generative Adversarial Network. The proposed algorithm can efficiently reconstruct high-resolution images with rich global information and local texture details. We trained our model with the PyTorch framework on the VOC2012 dataset, and tested it on the Set5, Set14, BSD100 and Urban100 test datasets.","PeriodicalId":361892,"journal":{"name":"International Conference on Artificial Intelligence and Pattern Recognition","volume":"113 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-07-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124171073","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"ClickBAIT: Click-based Accelerated Incremental Training of Convolutional Neural Networks","authors":"Ervin Teng, João Diogo Falcão, Bob Iannucci","doi":"10.1109/AIPR.2018.8707375","DOIUrl":"https://doi.org/10.1109/AIPR.2018.8707375","url":null,"abstract":"Today's general-purpose deep convolutional neural networks (CNN) for image classification and object detection are trained offline on large static datasets. Some applications, however, will require training in real-time on live video streams with a human-in-the-loop. We refer to this class of problem as Time-ordered Online Training (ToOT) - these problems will require a consideration of not only the quantity of incoming training data, but the human effort required to tag and use it. In this paper, we define training benefit as a metric to measure the effectiveness of a sequence in using each user interaction. We demonstrate and evaluate a system tailored to performing ToOT in the field, capable of training an image classifier on a live video stream through minimal input from a human operator. We show that by exploiting the time-ordered nature of the video stream through optical flow-based object tracking, we can increase the effectiveness of human actions by about 8 times.","PeriodicalId":361892,"journal":{"name":"International Conference on Artificial Intelligence and Pattern Recognition","volume":"247 ","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-09-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114057499","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Non-invasive, Multi-modality Approach Based on NIRS and MRI Techniques For Monitoring Intracranial Brain Tumor Angiogenesis","authors":"V. Saxena, J. Nielsen, I. Gonzalez-Gomez, G. Karapetyan, V. Khankaldyyan, M. Nelson, W. Laug","doi":"10.1109/AIPR.2005.10","DOIUrl":"https://doi.org/10.1109/AIPR.2005.10","url":null,"abstract":"The understanding of tumor oxygenation at the microvascular level in an orthotopic model may provide useful insight into tumor physiology, therapeutic response and development of protocols to study tumor behavior. In this paper, the vascular status and the patho-physiological changes occurring during angiogenesis are studied in an orthotopic brain tumor model using a noninvasive multimodality approach based on near infrared (NIR) diffuse optical spectroscopy (DOS) along with magnetic resonance imaging (MRI). We report a direct correlation between tumor size and intratumoral microvessel density (MVD), and tumor oxygenation. The relative decrease in the oxygen saturation value with tumor growth indicates that though blood vessels infiltrate and proliferate the tumor region, a hypoxic trend is clearly present.","PeriodicalId":361892,"journal":{"name":"International Conference on Artificial Intelligence and Pattern Recognition","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2005-10-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124683297","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Novel Technique for the Extraction of Depth Information by Gradient Analysis on Grayscale Images","authors":"Li Tao, V. Asari","doi":"10.1109/AIPR.2004.7","DOIUrl":"https://doi.org/10.1109/AIPR.2004.7","url":null,"abstract":"A technique specifically designed for 3D surface reconstruction of a human face from a single grayscale image was developed based on the principle of Shape from Shading (SFS). The Lambertian reflectance model was used to obtain the surface gradient information contained in the intensity image. The surface depth was computed by direct integration of surface gradients. The X and Y components of the surface gradients were determined based on the assumption that the direction of the surface gradient is parallel to the image intensity gradient. In order to determine the signs of the X and Y components of the surface gradients, analysis of image intensity and a face detection technique were used to provide the positions of critical points based on the 3D characteristics of the human face. This algorithm has been applied to synthetic face images with the light source direction along the z axis, and the reconstructed 3D human face was obtained with good accuracy and high speed. The result produced by our algorithm was also compared with those of other SFS algorithms. The performance of the proposed algorithm indicates that the new concept of combining face detection with SFS will be a promising and useful technique for recovering 3D faces from grayscale images.","PeriodicalId":361892,"journal":{"name":"International Conference on Artificial Intelligence and Pattern Recognition","volume":"27 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-10-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123946508","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}