Machine learning models to predict desktop activity recognition based on low-point gaze features
Hazem Al-Najjar, Nadia Al-Rousan, Hamzeh F. Assous, Dania AL-Najjar
Array, volume 28, Article 100525. Published 2025-09-29.
DOI: 10.1016/j.array.2025.100525
https://www.sciencedirect.com/science/article/pii/S2590005625001523
Impact factor: 4.5. JCR: Q2 (Computer Science, Theory & Methods).
Citations: 0
Abstract
Eye-tracking-based desktop activity prediction uses eye movements to infer what users are doing, and how they are thinking, during computer use. This study examines how low-point gaze features combined with machine learning (ML) models can predict eight common desktop activities: Debug, Browse, Play, Read, Interpret, Search, Watch and Write. To support scalable, non-intrusive deployment, it relies on simple gaze metrics (fixation counts, saccade direction and gaze point statistics) recorded from 24 users with a Tobii X2-30 eye tracker. More than 200,000 samples were preprocessed to derive statistical and spatial-temporal eye movement features. Eight well-known ML algorithms (Logistic Regression, Decision Tree, Random Forest, Neural Network, Gradient Boosting, AdaBoost, Naive Bayes and K-Nearest Neighbors) were trained and evaluated using an 80/20 train-test split. Each model was tested on every activity using accuracy, precision, recall and F1-score. Neural Network (NN) and Random Forest (RF) models performed best, with average performance metrics above 0.91. NN performed best on sustained-focus activities such as Play, Read, Interpret, Watch and Write, while RF was strongest on task-switching and problem-solving activities, particularly Debug and Search. The study demonstrates that accurate eye-tracking activity prediction is achievable with lightweight gaze features and conventional ML models. This opens the way to real-time dynamic interfaces, user-based systems and diagnostic assessments in both consumer and clinical applications. Future research will investigate combining different models with optimization techniques to improve robustness under user variability and in dynamic operational settings.
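The evaluation protocol summarized in the abstract (an 80/20 train-test split followed by accuracy and per-activity precision, recall and F1-score) can be sketched in plain Python. This is a minimal illustration, not the authors' code: the function names `split_80_20` and `per_activity_metrics`, the fixed random seed, and the convention of reporting 0.0 when a denominator is zero are assumptions made here, and the labels in the usage lines are invented for demonstration only.

```python
import random
from collections import Counter

# The eight activity labels named in the abstract.
ACTIVITIES = ["Debug", "Browse", "Play", "Read",
              "Interpret", "Search", "Watch", "Write"]


def split_80_20(samples, seed=0):
    """Shuffle a copy of `samples` and return (train, test) in an 80/20 ratio.

    The seed is an assumption for reproducibility; the abstract does not
    say how the split was randomized.
    """
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * 0.8)
    return shuffled[:cut], shuffled[cut:]


def per_activity_metrics(y_true, y_pred, labels=ACTIVITIES):
    """Return (accuracy, {label: (precision, recall, f1)}).

    A metric whose denominator is zero is reported as 0.0 (assumed convention).
    """
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and equal length")

    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1          # correct prediction for this activity
        else:
            fp[pred] += 1           # predicted activity was wrong
            fn[truth] += 1          # true activity was missed

    accuracy = sum(tp.values()) / len(y_true)

    report = {}
    for label in labels:
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        report[label] = (precision, recall, f1)
    return accuracy, report


# Usage with made-up labels (illustration only, not data from the study).
y_true = ["Read", "Read", "Debug", "Search"]
y_pred = ["Read", "Write", "Debug", "Debug"]
accuracy, report = per_activity_metrics(y_true, y_pred)
print(accuracy, report["Read"], report["Debug"])
```

Reporting the metrics per activity rather than only as a single average is what allows the kind of comparison the abstract draws, where one model is stronger on some activities and another model on others.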