Virtual Big Heads in Extended Reality: Estimation of Ideal Head Scales and Perceptual Thresholds for Comfort and Facial Cues
Zubin Choudhary, A. Erickson, Nahal Norouzi, Kangsoo Kim, G. Bruder, Gregory F. Welch
ACM Transactions on Applied Perception. DOI: 10.1145/3571074. Published 2022-11-10.

Abstract: Extended reality (XR) technologies, such as virtual reality (VR) and augmented reality (AR), provide users, their avatars, and embodied agents a shared platform to collaborate in a spatial context. Traditional face-to-face communication is limited by proximity: another human's non-verbal embodied cues become more difficult to perceive the farther one is from that person. Researchers and practitioners have therefore started to look into ways to accentuate or amplify such embodied cues and signals with XR technologies to counteract the effects of distance. In this article, we describe and evaluate the Big Head technique, in which a human's head in VR/AR is scaled up relative to its distance from the observer as a mechanism for enhancing the visibility of non-verbal facial cues, such as facial expressions or eye gaze. To better understand and explore this technique, we present two complementary human-subject experiments. In the first, we conducted a VR study with a head-mounted display to understand the impact of increased or decreased head scales on participants' ability to perceive facial expressions, as well as their sense of comfort and feeling of "uncanniness," over distances of up to 10 m. We explored two different scaling methods and compared perceptual thresholds and user preferences. The second experiment was performed in an outdoor AR environment with an optical see-through head-mounted display. Participants were asked to estimate facial expressions and eye gaze, and to identify a virtual human, over large distances of 30, 60, and 90 m. In both experiments, our results show significant differences in minimum, maximum, and ideal head scales for different distances and tasks related to perceiving faces, facial expressions, and eye gaze; we also found that participants were more comfortable with slightly bigger heads at larger distances. We discuss our findings with respect to the technologies used, and we discuss implications and guidelines for practical applications that aim to leverage XR-enhanced facial cues.
Gap Detection in Pairs of Ultrasound Mid-air Vibrotactile Stimuli
Thomas Howard, K. Driller, W. Frier, C. Pacchierotti, M. Marchal, J. Hartcher-O'Brien
ACM Transactions on Applied Perception. DOI: 10.1145/3570904. Published 2022-11-07.

Abstract: Ultrasound mid-air haptic (UMH) devices are a novel tool for haptic feedback, capable of providing localized vibrotactile stimuli to users at a distance. UMH applications largely rely on generating tactile shape outlines on the users' skin. Here we investigate how to achieve sensations of continuity or gaps within such two-dimensional curves by studying the perception of pairs of amplitude-modulated focused ultrasound stimuli. On the one hand, we aim to investigate perceptual effects that may arise from providing simultaneous UMH stimuli. On the other hand, we wish to provide perception-based rendering guidelines for generating continuous or discontinuous sensations of tactile shapes. Finally, we hope to contribute toward a measure of the perceptually achievable resolution of UMH interfaces. We performed a user study to identify how far apart two focal points need to be to elicit a perceptual experience of two distinct stimuli separated by a gap. Mean gap detection thresholds were found at 32.3-mm spacing between focal points, but high within- and between-subject variability was observed. Pairs spaced below 15 mm were consistently (>95%) perceived as a single stimulus, while pairs spaced 45 mm apart were consistently (84%) perceived as two separate stimuli. To investigate the observed variability, we resort to acoustic simulations of the resulting pressure fields. These show a non-linear evolution of actual peak pressure spacing as a function of nominal focal point spacing. Beyond an initial threshold in spacing (between 15 and 18 mm), which we believe to be related to the perceived size of a focal point, the probability of detecting a gap between focal points appears to increase linearly with spacing. Our work highlights physical interactions and perceptual effects to consider when designing or investigating the perception of UMH shapes.
{"title":"A Content-adaptive Visibility Predictor for Perceptually Optimized Image Blending","authors":"Taiki Fukiage, Takeshi Oishi","doi":"10.1145/3565972","DOIUrl":"https://doi.org/10.1145/3565972","url":null,"abstract":"The visibility of an image semi-transparently overlaid on another image varies significantly, depending on the content of the images. This makes it difficult to maintain the desired visibility level when the image content changes. To tackle this problem, we developed a perceptual model to predict the visibility of the blended results of arbitrarily combined images. Conventional visibility models cannot reflect the dependence of the suprathreshold visibility of the blended images on the appearance of the pre-blended image content. Therefore, we have proposed a visibility model with a content-adaptive feature aggregation mechanism, which integrates the visibility for each image feature (i.e., such as spatial frequency and colors) after applying weights that are adaptively determined according to the appearance of the input image. We conducted a large-scale psychophysical experiment to develop the visibility predictor model. Ablation studies revealed the importance of the adaptive weighting mechanism in accurately predicting the visibility of blended images. We have also proposed a technique for optimizing the image opacity such that users can set the visibility of the target image to an arbitrary level. Our evaluation revealed that the proposed perceptually optimized image blending was effective under practical conditions.","PeriodicalId":231654,"journal":{"name":"ACM Transactions on Applied Perceptions","volume":"30 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-10-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116745572","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Practical Saccade Prediction for Head-Mounted Displays: Towards a Comprehensive Model
Elena Arabadzhiyska, C. Tursun, H. Seidel, P. Didyk
ACM Transactions on Applied Perception. DOI: 10.1145/3568311. Published 2022-05-03.

Abstract: Eye-tracking technology has started to become an integral component of new display devices such as virtual and augmented reality headsets. Applications of gaze information range from new interaction techniques that exploit eye patterns to gaze-contingent digital content creation. However, system latency is still a significant issue in many of these applications because it breaks the synchronization between the current and measured gaze positions. Consequently, it may lead to unwanted visual artifacts and degradation of the user experience. In this work, we focus on foveated rendering applications, where the quality of an image is reduced towards the periphery for computational savings. In foveated rendering, system latency leads to delayed updates of the rendered frame, making the quality degradation visible to the user. To address this issue and to combat system latency, recent work proposes using saccade landing-position prediction to extrapolate gaze information from delayed eye-tracking samples. Although the benefits of such a strategy have already been demonstrated, the solutions range from simple and efficient ones, which make several assumptions about saccadic eye movements, to more complex and costly ones, which use machine learning techniques. However, it is unclear to what extent prediction can benefit from accounting for additional factors and how more complex predictions can be performed efficiently to respect the latency requirements. This paper presents a series of experiments investigating the importance of different factors for saccade prediction in common virtual and augmented reality applications. In particular, we investigate the effects of saccade orientation in 3D space and smooth pursuit eye motion (SPEM) and how their influence compares to the variability across users. We also present a simple yet efficient post-hoc correction method that adapts existing saccade prediction methods to handle these factors without extensive data collection. Furthermore, our investigation and the correction technique may also help future development of machine-learning-based techniques by limiting the required amount of training data.
Efficient Dataflow Modeling of Peripheral Encoding in the Human Visual System
Rachel Brown, Vasha Dutell, B. Walter, R. Rosenholtz, P. Shirley, M. McGuire, D. Luebke
ACM Transactions on Applied Perception. DOI: 10.1145/3564605. Published 2021-07-24.

Abstract: Computer graphics seeks to deliver compelling images, generated within a computing budget, targeted at a specific display device, and ultimately viewed by an individual user. The foveated nature of human vision offers an opportunity to efficiently allocate computation and compression to appropriate areas of the viewer's visual field, which is of particular importance with the rise of high-resolution and wide field-of-view display devices. However, while variations in acuity and contrast sensitivity across the field of view have been well studied and modeled, a more consequential variation concerns peripheral vision's degradation in the face of clutter, known as crowding. Understanding of peripheral crowding has greatly advanced in recent years, in terms of both phenomenology and modeling. Accurately leveraging this knowledge is critical for many applications, as peripheral vision covers a majority of pixels in the image. We advance computational models for peripheral vision aimed toward their eventual use in computer graphics. In particular, researchers have recently developed high-performing models of peripheral crowding, known as "pooling" models, which predict a wide range of phenomena but are computationally inefficient. We reformulate the problem as a dataflow computation, which enables faster processing and operating on larger images. Further, we account for the explicit encoding of "end-stopped" features in the image, which was missing from previous methods. We evaluate our model in the context of perception of textures in the periphery, including a novel texture dataset and updated textural descriptors. Our improved computational framework may simplify the development and testing of more sophisticated, complete models in more robust and realistic settings relevant to computer graphics.