{"title":"Single image super-resolution using deep hierarchical attention network","authors":"Fei Zhao, Rui Chen, Yuan Li","doi":"10.1145/3381271.3381282","DOIUrl":"https://doi.org/10.1145/3381271.3381282","url":null,"abstract":"In this paper, we present a compact and accurate super-resolution algorithm using the attention-augmented convolutional neural network, which can exploit and weight hierarchical features at multiple scales and levels to improve learning capability. The proposed network employs cascading U-net structure to allow the flow of low-frequency information to focus on learning high- and mid-level features. In addition, deep hierarchical channel attention is developed to help in learning from high-level complex features. Moreover, we propose a hierarchical pyramid attention to learn the inter and intra-level dependencies between the feature maps. Furthermore, the comprehensive quantitative and qualitative experiments on low-resolution and real image benchmark datasets illustrate that our algorithm performs favorably against the state-of-the-art methods.","PeriodicalId":124651,"journal":{"name":"Proceedings of the 5th International Conference on Multimedia and Image Processing","volume":"193 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-01-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131793309","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Segmentation-based orbiting satellite tracking","authors":"Yunda Sun, Peizhuo Li, Xue Wan","doi":"10.1145/3381271.3381291","DOIUrl":"https://doi.org/10.1145/3381271.3381291","url":null,"abstract":"Obtaining the trajectory of an orbiting satellite is a fundamental and vital step for space rendezvous and manipulation by space robots. Because on-orbit satellites move freely and rapidly and their appearance changes suddenly, orbiting satellite tracking is difficult for traditional trackers, which usually rely on a single bounding box of the target object. However, visual tracking should provide more information, such as a binary mask. In this paper, we propose SOST (Segmentation-based Orbiting Satellite Tracking), an algorithm that improves tracking performance by generating a mask map obtained from segmentation within the initial bounding box. The final bounding box is then refined using the segmentation result. Experiments using real on-orbit rendezvous and docking video from NASA (National Aeronautics and Space Administration), a simulated satellite animation sequence from ESA (European Space Agency), and image sequences of a 3D-printed satellite model taken in our laboratory demonstrate the robustness, versatility, and speed of our method compared to state-of-the-art tracking methods. Our dataset will be released for academic use in the future.","PeriodicalId":124651,"journal":{"name":"Proceedings of the 5th International Conference on Multimedia and Image Processing","volume":"48 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-01-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122269940","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Robust stereo matching using improved ZNCC combined SAD-LBP","authors":"Jie-Wei Chen, Youshen Xia","doi":"10.1145/3381271.3381295","DOIUrl":"https://doi.org/10.1145/3381271.3381295","url":null,"abstract":"Stereo matching aims to obtain depth information of a scene from captured images and has become an active research topic in the field of computer vision. Most stereo matching cost algorithms are based on a common assumption, namely that the intensity or color values of corresponding pixels are the same. However, in real-world applications, the colors of the objects observed in the recorded image data are affected by radiometric variations. In this paper, using a novel similarity measure, we propose a stereo matching method that is robust to noise, illumination changes, and exposure changes between left and right images. The proposed stereo matching cost combines an improved zero-mean normalized cross-correlation (ZNCC) model and the absolute difference of the local binary patterns (LBP) of windows to capture both the color and texture similarity of the windows to be matched. We verify the effectiveness of the proposed algorithm on the Middlebury data set. The results show that the proposed algorithm is more robust to illumination changes and noise than related stereo matching algorithms.","PeriodicalId":124651,"journal":{"name":"Proceedings of the 5th International Conference on Multimedia and Image Processing","volume":"74 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-01-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114700163","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Preference of 2D animation style in malaysian colonial shophouses multimedia courseware","authors":"K. T. Chau, N. Nasir, Tan Voon Yee Valerie","doi":"10.1145/3381271.3381304","DOIUrl":"https://doi.org/10.1145/3381271.3381304","url":null,"abstract":"This paper attempts to determine which 2D animation style, Malaysian, Japanese, or American, is preferred by Malaysian youths in multimedia courseware that introduces colonial shophouses in Georgetown, Penang. The 17th century colonial shophouses have a unique history that exhibits the influence of Malay, Chinese, Indian, and European culture. It is therefore significant to deliver the untold story of this historical inheritance to Malaysian youths to inculcate their sense of appreciation towards Malaysian heritage. In the multimedia realm, there are concerns about which 2D animation style is suitable for delivering the right sense of Malaysian heritage to Malaysian youths using digital multimedia objects. Therefore, a study was conducted. The study recruited 40 respondents, and the quantitative findings suggest that the majority of Malaysian youths were interested in using a digitally drawn Malaysian 2D animation style, which fulfils the aspects of visuals and family-oriented story, to introduce Penang's historical shophouses in multimedia courseware.","PeriodicalId":124651,"journal":{"name":"Proceedings of the 5th International Conference on Multimedia and Image Processing","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-01-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130242006","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Research of express box defect detection based on machine vision","authors":"Liang Wei, Ningyu Zhang, Muyao Xue, Ju Huo","doi":"10.1145/3381271.3381280","DOIUrl":"https://doi.org/10.1145/3381271.3381280","url":null,"abstract":"Traditional box appearance defect detection relies on manual inspection, which is inaccurate and time-consuming. To overcome the drawbacks of manual inspection and improve transportation automation, an effective and convenient defect detection algorithm for express boxes is proposed. The central idea of the proposed algorithm is to solve defect identification problems during transportation. The technique only needs to observe defects through a binocular camera system, from which the shape and size parameters can then be obtained. Finally, experiments are carried out. The results show that the algorithm can accurately identify defects of different shapes, and the highest correct reconstruction rate reaches 95.83%. The results verify the applicability of the proposed approach.","PeriodicalId":124651,"journal":{"name":"Proceedings of the 5th International Conference on Multimedia and Image Processing","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-01-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128387954","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Analysis of user behavior in a large-scale internet video-on-demand(VoD) system","authors":"Yaohui Yuan, Xingjun Wang, GuangXiang Bin","doi":"10.1145/3381271.3381288","DOIUrl":"https://doi.org/10.1145/3381271.3381288","url":null,"abstract":"With the development and popularity of the Internet, Internet traffic has increased dramatically. Numerous studies have shown that video accounts for a large percentage of Internet traffic, and this percentage is expected to keep rising. A good understanding of user behavior in online video systems can help us design, configure, and manage video content distribution to alleviate network stress. In this paper, we perform a detailed analysis of user behavior data for Internet video. Our research shows that users' daily access and online behavior follow a fixed pattern, and that user access behavior conforms to Zipf's law. In addition, we optimized the fit of the Zipf-like distribution of video popularity. Finally, we built a reliable simulation system that simulates user behavior data. Overall, we believe that the results presented in this paper are very important and valuable to the whole network.","PeriodicalId":124651,"journal":{"name":"Proceedings of the 5th International Conference on Multimedia and Image Processing","volume":"225 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-01-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123876293","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Real-time head pose estimation based on face geometry","authors":"Aditya Hosamani, M. Phirke","doi":"10.1145/3381271.3381296","DOIUrl":"https://doi.org/10.1145/3381271.3381296","url":null,"abstract":"This paper presents a novel head pose estimation technique based on face geometry. The approach involves performing nose detection and threshold-based segmentation over near-infrared (NIR) input. NIR's inherent invariance to illumination changes makes it an ideal fit for automotive, avionics, mining, and many other applications where inconsistent lighting makes visible-light RGB cameras impractical. The pose angles are inferred from the shape of the segmented face and its nose location. Validation of the proposed approach is performed in a two-fold manner since annotated public NIR-based head-pose datasets are scarce. Firstly, the pure yaw and pitch angles are justified using the UPNA RGB head pose dataset. Secondly, an in-house captured and annotated NIR dataset is used for corroborating the pure roll angle. Comparison of the proposed approach with two more commonly used head pose estimation algorithms, viz. DLIB and OpenVINO, over the NIR dataset reveals that the proposed approach outperforms the latter two with better accuracy, maximum range coverage, and lower computation time, thereby making it a suitable choice in applications such as driver monitoring and surveillance systems, especially in night or low-light scenarios.","PeriodicalId":124651,"journal":{"name":"Proceedings of the 5th International Conference on Multimedia and Image Processing","volume":"5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-01-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117070414","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Phase retrieval with outliers via median truncated amplitude flow","authors":"Quanbing Zhang, Feihang Hu, Dequn Liu, Yufan Yuan","doi":"10.1145/3381271.3381287","DOIUrl":"https://doi.org/10.1145/3381271.3381287","url":null,"abstract":"This paper studies an important situation of phase retrieval, which aims to recover the unknown signal from the given quadratic measurements that are corrupted by outliers. A phase retrieval algorithm based on median Truncated Amplitude Flow (mTAF) is proposed, which adopts the median truncation spectral initialization method to generate an ideal initial point, and then applies the median truncated amplitude flow to perform iterative updates to obtain the final recovery result. In the iterative step, the amplitude loss function is used to reduce the number of measurements and accelerate the convergence rate, and the median truncation rule is introduced to build samples and eliminate the outliers. Experimental results demonstrate the feasibility and validity of the algorithm, and show that the mTAF algorithm can achieve a higher success rate and faster convergence speed than others.","PeriodicalId":124651,"journal":{"name":"Proceedings of the 5th International Conference on Multimedia and Image Processing","volume":"22 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-01-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126660679","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Use of virtual reality games in people with depression and anxiety","authors":"Alice J. Lin, Fuhua Cheng, Charles B. Chen","doi":"10.1145/3381271.3381299","DOIUrl":"https://doi.org/10.1145/3381271.3381299","url":null,"abstract":"Major depressive disorder is a common but serious mood disorder. It can cause severe symptoms that affect how you feel, think, and handle daily activities, such as sleeping, eating, or working. Depression is among the major causes of global disease burden. Video games can help to improve a person's mood, therefore improving their depressive symptoms. Here, we design and develop a prototype VR game to help people with depressive disorders improve their mood. We did preliminary tests showing encouraging results in improving people's moods.","PeriodicalId":124651,"journal":{"name":"Proceedings of the 5th International Conference on Multimedia and Image Processing","volume":"11 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-01-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132170567","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Novel pose measurement with optimized principal component analysis for unknown spacecraft based on point cloud","authors":"Guiyang Zhang, Ju Huo, Zhanyu Zhang, Mingxuan He, Jinjie Zhang, Muyao Xue","doi":"10.1145/3381271.3381281","DOIUrl":"https://doi.org/10.1145/3381271.3381281","url":null,"abstract":"This paper investigates the issue of vision orientation for unknown spacecraft in orbit-capture, upon which a fast and highly accurate pose measurement method based on improved coordinate system correction by weighted principal component analysis (PCA) is proposed. This algorithm weights point cloud features before dimensionality reduction, and then three principal component vectors in different frames are calculated. Consequently, the effective reduction of the original point cloud and the reduction of information overlap are achieved. The nearest point in Euclidean distance is employed to correct the direction of the PCA coordinate axes, and thus the initial pose of the two sets of point clouds is obtained. Finally, point clouds in an arbitrary pose relationship in unknown space can be aligned accurately by an improved iterative closest point (ICP) algorithm with a kd-tree search strategy. The presented method overcomes the strong dependence on the initial value and avoids local convergence, which means it achieves a global alignment for unknown targets with point clouds of similar shape and integrity. Experiments show that the maximum relative error of attitude is below 0.15° and the position error is within ±4 mm in a space of 2000 mm × 2000 mm × 3000 mm. Results verify that the accuracy and speed performance of the proposed approach can satisfy the requirements of on-orbit spacecraft to capture unknown objects.","PeriodicalId":124651,"journal":{"name":"Proceedings of the 5th International Conference on Multimedia and Image Processing","volume":"80 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-01-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121032933","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}