Yanbing Yang , Ziwei Liu , Yongkun Chen , Binyu Yan , Yimao Sun , Tao Feng
Journal: Information Fusion, Volume 121, Article 103159 (Impact Factor 14.7, JCR Q1, Computer Science, Artificial Intelligence)
DOI: 10.1016/j.inffus.2025.103159
Published: 2025-04-04
URL: https://www.sciencedirect.com/science/article/pii/S1566253525002325
Visible light human activity recognition driven by generative language model
Visible light-based indoor Human Activity Recognition (HAR) has emerged as a promising approach, owing to its ability to provide indoor illumination and privacy protection while serving sensing purposes. However, current visible light HAR methods focus primarily on classifying individual human activities, which falls short of naturally representing activities and their contextual relations. In this paper, we extend the challenge to a cross-modal alignment task between visible light signals and textual descriptions, proposing a framework that leverages generative large language models (LLMs) to decode visible light feature representations into human activity descriptions through sequence-to-sequence modeling. We implement a prototype system of our method and build a custom dataset. Experiments in a real indoor space demonstrate that our method achieves effective natural-language-level HAR from a visible light sensing system. It promotes information fusion between visible light and natural language, moving intelligent physical information systems toward realistic applications through the integration of generative LLMs.
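The front end of such a pipeline turns a raw visible-light intensity trace into a sequence of feature vectors that a generative decoder can attend over. The paper does not disclose its exact feature extractor, so the sketch below is a minimal illustrative assumption: window the 1-D light trace and take normalized FFT magnitudes per window, yielding the kind of feature sequence that sequence-to-sequence modeling consumes.

```python
import numpy as np

def extract_light_features(signal, window=64, hop=32):
    """Window a 1-D visible-light intensity trace and take FFT magnitudes
    as per-window feature vectors (hypothetical featurization; the paper's
    actual feature extractor is not specified here)."""
    frames = []
    for start in range(0, len(signal) - window + 1, hop):
        seg = signal[start:start + window]
        # Hann window suppresses spectral leakage at frame edges.
        spec = np.abs(np.fft.rfft(seg * np.hanning(window)))
        frames.append(spec / (spec.max() + 1e-9))  # normalize each frame
    return np.stack(frames)  # shape: (num_windows, window // 2 + 1)

# Simulated intensity trace: a low-frequency oscillation, as a walking-like
# activity might modulate received light intensity.
t = np.linspace(0, 4, 512)
trace = 1.0 + 0.3 * np.sin(2 * np.pi * 2.0 * t)

feats = extract_light_features(trace)
print(feats.shape)  # (15, 33)
```

In the cross-modal alignment framing, each row of `feats` would be projected into the embedding space of the generative LLM (e.g. as encoder states or prefix embeddings), and the decoder would be trained to emit the textual activity description token by token.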
About the journal:
Information Fusion serves as a central platform for showcasing advancements in multi-sensor, multi-source, multi-process information fusion, fostering collaboration among diverse disciplines driving its progress. It is the leading outlet for sharing research and development in this field, focusing on architectures, algorithms, and applications. Papers dealing with fundamental theoretical analyses as well as those demonstrating their application to real-world problems will be welcome.