{"title":"Visual Search Target Inference in Natural Interaction Settings with Machine Learning","authors":"Michael Barz, Sven Stauden, Daniel Sonntag","doi":"10.1145/3379155.3391314","DOIUrl":"https://doi.org/10.1145/3379155.3391314","url":null,"abstract":"Visual search is a perceptual task in which humans aim at identifying a search target object such as a traffic sign among other objects. Search target inference subsumes computational methods for predicting this target by tracking and analyzing overt behavioral cues of that person, e.g., the human gaze and fixated visual stimuli. We present a generic approach to inferring search targets in natural scenes by predicting the class of the surrounding image segment. Our method encodes visual search sequences as histograms of fixated segment classes determined by SegNet, a deep learning image segmentation model for natural scenes. We compare our sequence encoding and model training (SVM) to a recent baseline from the literature for predicting the target segment. Also, we use a new search target inference dataset. The results show that, first, our new segmentation-based sequence encoding outperforms the method from the literature, and second, that it enables target inference in natural settings.","PeriodicalId":226088,"journal":{"name":"ACM Symposium on Eye Tracking Research and Applications","volume":"24 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-06-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121653644","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Towards capturing focal/ambient attention during dynamic wayfinding","authors":"J. Krukar, P. Mavros, C. Hoelscher","doi":"10.1145/3379157.3391417","DOIUrl":"https://doi.org/10.1145/3379157.3391417","url":null,"abstract":"This work-in-progress paper reports on an ongoing experiment in which mobile eye-tracking is used to evaluate different wayfinding support systems. Specifically, it tackles the problem of detecting and isolating attentional demands of building layouts and signage systems in wayfinding tasks. The coefficient K has been previously established as a measure of focal/ambient attention for eye-tracking data. Here, we propose a novel method to compute coefficient K using eye-tracking from virtual reality experiments. We detail challenges associated with transforming a two-dimensional coefficient K concept to three-dimensional data, and the debatable theoretical equivalence of the concept after such a transformation. We present a preliminary implementation to experimental data and explore the possibilities of the method for novel insight in architectural analyses.","PeriodicalId":226088,"journal":{"name":"ACM Symposium on Eye Tracking Research and Applications","volume":"12 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-06-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125614182","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Data-Driven Classification of Dyslexia Using Eye-Movement Correlates of Natural Reading","authors":"J. Szalma, B. Weiss","doi":"10.1145/3379156.3391379","DOIUrl":"https://doi.org/10.1145/3379156.3391379","url":null,"abstract":"Developmental dyslexia is a reading disability estimated to affect between 5 to 10 percent of the population. Current screening methods are limited as they tell very little about the oculomotor processes underlying natural reading. Investigation of eye-movement correlates of reading using machine learning could enhance detection of dyslexia. Here we used eye-tracking data collected during natural reading of 48 young adults (24 dyslexic, 24 control). We established a set of 67 features containing saccade-, glissade-, fixation-related measures and the reading speed. To detect participants with dyslexic reading patterns, we used a linear support vector machine with 10-fold stratified cross-validation repeated 10 times. For feature selection we used a recursive feature elimination method, and we also considered hyperparameter optimization, both with nested and regular cross-validation. The overall best model achieved a 90.1% classification accuracy, while the best nested model achieved a 75.75% accuracy.","PeriodicalId":226088,"journal":{"name":"ACM Symposium on Eye Tracking Research and Applications","volume":"164 ","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-06-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"113981776","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Label Likelihood Maximisation: Adapting iris segmentation models using domain adaptation","authors":"Anton Molbjerg Eskildsen, D. Hansen","doi":"10.1145/3379155.3391327","DOIUrl":"https://doi.org/10.1145/3379155.3391327","url":null,"abstract":"We propose to use unlabelled eye image data for domain adaptation of an iris segmentation network. Adaptation allows the model to be less reliant on its initial generality. This is beneficial due to the large variance exhibited by eye image data which makes training of robust models difficult. The method uses a label prior in conjunction with network predictions to produce pseudo-labels. These are used in place of ground-truth data to adapt a base model. A fully connected neural network performs the pixel-wise iris segmentation. The base model is trained on synthetic data and adapted to several existing datasets with real-world eye images. The adapted models improve the average pupil centre detection rates by 24% at a distance of 25 pixels. We argue that the proposed method, and domain adaptation in general, is an interesting direction for increasing robustness of eye feature detectors.","PeriodicalId":226088,"journal":{"name":"ACM Symposium on Eye Tracking Research and Applications","volume":"125 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-06-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115170123","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Visual Analysis of Eye Movements During Game Play","authors":"Michael Burch, K. Kurzhals","doi":"10.1145/3379156.3391839","DOIUrl":"https://doi.org/10.1145/3379156.3391839","url":null,"abstract":"Eye movements indicate visual attention and strategies during game play, regardless of whether in board, sports, or computer games. Additional factors such as individual vs. group play and active playing vs. observing game play further differentiate application scenarios for eye movement analysis. Visual analysis has proven to be an effective means to investigate and interpret such highly dynamic spatio-temporal data. In this paper, we contribute a classification strategy for different scenarios for the visual analysis of gaze data during game play. Based on an initial sample of related work, we derive multiple aspects comprising data sources, game mode, player number, player state, analysis mode, and analysis goal. We apply this classification strategy to describe typical analysis scenarios and research questions as they can be found in related work. We further discuss open challenges and research directions for new application scenarios of eye movements in game play.","PeriodicalId":226088,"journal":{"name":"ACM Symposium on Eye Tracking Research and Applications","volume":"57 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-06-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115269297","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Comparison of three dwell-time-based gaze text entry methods: Extended Abstract","authors":"J. Matulewski, M. Patera","doi":"10.1145/3379157.3388931","DOIUrl":"https://doi.org/10.1145/3379157.3388931","url":null,"abstract":"Gaze text entry (GTE) with use of visual keyboards displayed on computer screen is an important topic for both the scientists dealing with the gaze interaction and potential users i.e. people with physical disabilities and their families. The most commonly used technique for GTE is based on dwell-time regions, at which the user needs to look longer to activate the associated action, in our case - entering the letter. In the article, we present the results of tests of three GTE systems (gaze keyboards) on a sample of 29 participants. We compare the objective measures of usability, namely the text entry rate and the number of errors, as well as subjective ones, obtained using SUS questionnaire. Additionally, two similar keyboards based on the ‘Qwerty’ buttons layout were compared in terms of time to the first fixation and its duration in the areas of interest (AOI) corresponding to the visual buttons. One of these gaze keyboards, the so called ’Molecular’ one, contains dynamic elements that have been designed and implemented in our laboratory, and which aim is to support the search for buttons by increasing the size of buttons with suggested letters, without significant change of their positions.","PeriodicalId":226088,"journal":{"name":"ACM Symposium on Eye Tracking Research and Applications","volume":"157 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-06-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122775028","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Modeling Metacomprehension Monitoring Accuracy with Eye Gaze on Informational Content in a Multimedia Learning Environment","authors":"Megan D. Wiedbusch, R. Azevedo","doi":"10.1145/3379155.3391329","DOIUrl":"https://doi.org/10.1145/3379155.3391329","url":null,"abstract":"Multimedia learning environments support learners in developing self-regulated learning (SRL) strategies. However, capturing these strategies and cognitive processes can be difficult for researchers because cognition is often inferred, not directly measured. This study sought to model self-reported metacognitive judgments using eye-tracking from 60 undergraduate students as they learned about biological systems with MetaTutorIVH, a multimedia learning environment. We found that participants’ gaze behaviors were different between the perceived relevance of the instructional content provided regardless of the actual content relevance. Additionally, we fit a cumulative link mixed effects ordinal regression model to explain reported metacognitive judgments based on content fixations, relevance, and presentation type. Main effects were found for all variables and several interactions between both fixations and content relevance as well as content fixations and presentation type. Surprisingly, accurate metacognitive judgments did not explain performance. Implication for multimedia learning environment design are discussed.","PeriodicalId":226088,"journal":{"name":"ACM Symposium on Eye Tracking Research and Applications","volume":"37 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-06-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124977212","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Gaze Estimation in the Dark with Generative Adversarial Networks","authors":"Jung-Hwa Kim, Jin-Woo Jeong","doi":"10.1145/3379157.3391654","DOIUrl":"https://doi.org/10.1145/3379157.3391654","url":null,"abstract":"In this paper, we propose to utilize generative adversarial networks (GANs) to achieve successful gaze estimation in interactive multimedia environments with low light conditions such as a digital museum or exhibition hall. The proposed approach utilizes a GAN to enhance user images captured under low-light conditions, thereby recovering missing information for gaze estimation. The recovered images are fed into the CNN architecture to estimate the direction of user gaze. The preliminary experimental results on the modified MPIIGaze dataset demonstrated an average performance improvement of 6.6 under various low light conditions, which is a promising step for further research.","PeriodicalId":226088,"journal":{"name":"ACM Symposium on Eye Tracking Research and Applications","volume":"07 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-06-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121659846","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Calibration Framework for Photosensor-based Eye-Tracking System","authors":"Dmytro Katrychuk, Henry K. Griffith, Oleg V. Komogortsev","doi":"10.1145/3379156.3391370","DOIUrl":"https://doi.org/10.1145/3379156.3391370","url":null,"abstract":"The majority of eye-tracking systems require user-specific calibration to achieve suitable accuracy. Traditional calibration is performed by presenting targets at fixed locations that form a certain coverage of the device screen. If simple regression methods are used to learn a gaze map from the recorded data, the risk of overfitting is minimal. This is not the case if a gaze map is formed using neural networks, as is often employed in photosensor oculography (PSOG), which raises the question of careful design of calibration procedure. This paper evaluates different calibration data parsing approaches and the collection time-performance trade-off effect of grid density to build a calibration framework for PSOG with the use of video-based simulation framework.","PeriodicalId":226088,"journal":{"name":"ACM Symposium on Eye Tracking Research and Applications","volume":"33 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-06-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126721241","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Cognitive Processing Stages During Mental Folding Are Reflected in Eye Movements","authors":"Kai Preuss, Christopher Hilton, K. Gramann, Nele Russwinkel","doi":"10.1145/3379157.3391415","DOIUrl":"https://doi.org/10.1145/3379157.3391415","url":null,"abstract":"Distinct cognitive processing stages in mental spatial transformation tasks can be identified in oculomotor behavior. We recorded eye movements whilst participants performed a mental folding task. Gaze behaviour was analyzed to provide insights into the relationship of task difficulty, gaze proportion on each stimulus, gaze switches between stimuli, and reaction times. We found a monotonic decrease in switch frequency and reference object gaze proportions with increasing difficulty level. Further, we found that these measures of gaze behaviour are related to the time taken to perform the mental transformation. We propose that the observed patterns of eye movements are indicative of distinct cognitive stages during mental folding. Lastly, further exploratory analyses are discussed.","PeriodicalId":226088,"journal":{"name":"ACM Symposium on Eye Tracking Research and Applications","volume":"22 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-06-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127831927","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}