Cluster Analysis Reveals Subgroups with Different Risk Profiles and Sickness Absence Patterns in an Occupational Health Cohort
Anniina Anttila, Mikko Nuutinen, Riikka-Leena Leskelä, Mark van Gils, Anu Pekki, Riitta Sauni
Journal of Occupational Rehabilitation. Published 2025-07-29. DOI: 10.1007/s10926-025-10319-x (https://doi.org/10.1007/s10926-025-10319-x)
Abstract
Purpose: Using unsupervised and supervised machine learning methods, we aimed to identify clinically relevant groups of employees with similar characteristics and analyze the association of long and short sickness absence periods with these groups.
Methods: The participants were 12,099 employees of various occupations in Finnish companies. The data comprised 104 variables from medical records including data on sickness absences and a questionnaire used between 2011 and 2019 in health examinations. The latent dimensions for the employees were defined by principal component analysis to reduce the number of variables. Clusters were calculated using the K-means algorithm from datapoints expressed by the resulting five principal components. Logistic regression analyses were used to assess the associations of the clusters with long (> 30 days) and repetitive short (1-10 days) sickness absence (SA) episodes.
Results: Employees in cluster one reported positive managerial performance and workplace atmosphere and had the least short and long SA. Cluster two indicated deficiencies related to managerial performance and workplace atmosphere. Cluster three had deficiencies mainly related to mood and depression, and cluster four was characterized by cardiovascular diseases. Employees in cluster five reported many symptoms, especially dizziness and sensory symptoms, and had the highest occurrence of repetitive short SA. Cluster six indicated deficiencies related to work ability and had the highest occurrence of a long SA episode during follow-up.
Conclusion: Unsupervised and supervised machine learning methods identified six clinically coherent employee clusters, providing information on typical combinations of characteristics and risk profiles of sickness absence.
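The pipeline described under Methods (standardize, reduce to five principal components, cluster with K-means into six groups, then relate the clusters to a binary sickness-absence outcome) can be illustrated with a minimal sketch. This is not the authors' code. The data are synthetic, and the sample size, variable count, outcome rate, and all function names are hypothetical. In place of the paper's logistic regression, the sketch computes unadjusted per-cluster odds ratios against a reference cluster. A logistic regression with cluster membership as its only predictor would give essentially the same ratios, apart from the 0.5 continuity correction used here, but the published models may include covariates that this sketch omits.

```python
import numpy as np


def pca_scores(X, n_components):
    """Project standardized data onto its leading principal components."""
    # Assumes no column is constant (otherwise the standard deviation is zero).
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    # Rows of Vt are the principal axes, ordered by explained variance.
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    return Z @ Vt[:n_components].T


def kmeans(points, k, n_iter=100, seed=0):
    """Plain Lloyd's algorithm; returns one cluster label per row."""
    rng = np.random.default_rng(seed)
    # Start from k distinct data points chosen at random.
    centers = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iter):
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Recompute each center; keep the old one if a cluster became empty.
        new_centers = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels


def odds_ratios(labels, outcome, reference):
    """Unadjusted odds ratio of the outcome per cluster vs. a reference cluster."""
    def odds(mask):
        # Haldane-Anscombe correction: add 0.5 to each cell so that a cluster
        # with no events (or only events) does not cause division by zero.
        events = outcome[mask].sum() + 0.5
        non_events = (~outcome[mask]).sum() + 0.5
        return events / non_events

    ref_odds = odds(labels == reference)
    return {int(c): odds(labels == c) / ref_odds for c in np.unique(labels)}


# Synthetic stand-in data (hypothetical): 600 employees, 20 numeric variables.
rng = np.random.default_rng(42)
X = rng.normal(size=(600, 20))
long_sa = rng.random(600) < 0.15  # hypothetical binary "long SA episode" flag

scores = pca_scores(X, n_components=5)  # five components, as in the abstract
labels = kmeans(scores, k=6)            # six clusters, as in the abstract
ors = odds_ratios(labels, long_sa, reference=0)

for cluster, ratio in sorted(ors.items()):
    print(f"cluster {cluster}: odds ratio vs. cluster 0 = {ratio:.2f}")
```

Because the synthetic outcome is independent of the features, the printed odds ratios should scatter around 1. With real data, the reference cluster would typically be the lowest-risk group, which in the abstract is cluster one.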
About the journal:
The Journal of Occupational Rehabilitation is an international forum for the publication of peer-reviewed original papers on the rehabilitation, reintegration, and prevention of disability in workers. The journal offers investigations involving original data collection and research synthesis (i.e., scoping reviews, systematic reviews, and meta-analyses). Papers derive from a broad array of fields including rehabilitation medicine, physical and occupational therapy, health psychology and psychiatry, orthopedics, oncology, occupational and insurance medicine, neurology, social work, ergonomics, biomedical engineering, health economics, rehabilitation engineering, business administration and management, and law. A single interdisciplinary source for information on work disability rehabilitation, the Journal of Occupational Rehabilitation helps to advance the scientific understanding, management, and prevention of work disability.