An Intelligent Industrial Visual Monitoring and Maintenance Framework Empowered by Large-Scale Visual and Language Models
Authors: Huan Wang; Chenxi Li; Yan-Fu Li; Fugee Tsung
Journal: IEEE Transactions on Industrial Cyber-Physical Systems, vol. 2, pp. 166-175
Published: 2024-06-13
Publication type: Journal Article
DOI: 10.1109/TICPS.2024.3414292
URL: https://ieeexplore.ieee.org/document/10557154/
Industrial visual monitoring (IVM) is crucial for operation and maintenance, and artificial intelligence (AI) has excelled in this domain. As a major breakthrough in AI, large models are set to transform IVM by advancing comprehensive automation and intelligence. This paper proposes an intelligent IVM and maintenance framework (IVMMF) empowered by large-scale visual and language models. First, the proposed large-scale visual model comprehensively understands industrial images, providing accurate defect identification and descriptions. Second, a large language model built on local knowledge bases is proposed to understand technical knowledge in specific fields, provide professional suggestions for engineers, and support intelligent information interaction between the system and engineers. IVMMF thus makes the entire process intelligent, covering industrial image understanding, text dialogue, maintenance suggestions, and information communication. Finally, we construct a large-scale image-text IVM dataset, and the experiments demonstrate IVMMF's exceptional performance, indicating its potential to transform the application paradigm in IVM.