Authors: Yang Zhou, Fan Zhang
Published in: 2023 8th International Conference on Communication and Electronics Systems (ICCES), 2023-06-01
DOI: https://doi.org/10.1109/ICCES57224.2023.10192861
Packaging Virtual Image Auxiliary Generation Algorithm based on Large Language Model (LLM)
When applying a Large Language Model (LLM) to image processing, it is crucial to control both the training size and the accuracy of the algorithm. This study proposes a novel LLM-based algorithm for generating auxiliary virtual images. The approach follows a two-step strategy, consisting of an optimized LLM and a joint pix2pix model, which integrates the neural structure into traditional processing pipelines. For the designed LLM, the study combines the Transformer's global interaction ability with the local characteristics of a CNN to enrich feature diversity; the input feature maps are then divided into multiple groups and subsequently fused with the updated regulation to accomplish the initial generation task. For the joint pix2pix model, the generator produces a new image from the original image, and the new image and the original image are fused together as fake data and sent to the discriminator for training. Experimental results on small and large datasets show that the proposed approach achieves superior performance.
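The abstract does not say how the feature maps are grouped or how the new and original images are fused, so the following is only a minimal sketch under stated assumptions. It assumes contiguous, equally sized channel groups, and it assumes fusion means stacking channels, as is commonly done for the discriminator input in pix2pix. The names `split_into_groups`, `fuse_pair`, `toy_generator`, and `make_fake_sample` are hypothetical, and `toy_generator` is a placeholder rather than a trained network.

```python
from typing import List

Channel = List[List[float]]  # one H x W map
Image = List[Channel]        # channel-first layout: C x H x W


def split_into_groups(feature_maps: Image, num_groups: int) -> List[Image]:
    """Split C feature maps into contiguous groups of equal size (assumed rule)."""
    channels = len(feature_maps)
    if num_groups <= 0 or channels % num_groups != 0:
        raise ValueError("channel count must be divisible by num_groups")
    size = channels // num_groups
    return [feature_maps[i:i + size] for i in range(0, channels, size)]


def fuse_pair(original: Image, other: Image) -> Image:
    """Fuse two images by stacking their channels (assumed fusion rule)."""
    same_height = len(original[0]) == len(other[0])
    same_width = len(original[0][0]) == len(other[0][0])
    if not (same_height and same_width):
        raise ValueError("images must share height and width")
    return original + other


def toy_generator(original: Image) -> Image:
    """Hypothetical stand-in for the generator: inverts intensities in [0, 1]."""
    return [[[1.0 - v for v in row] for row in channel] for channel in original]


def make_fake_sample(original: Image) -> Image:
    """Build the 'fake' discriminator input: original fused with generated image."""
    generated = toy_generator(original)
    return fuse_pair(original, generated)


if __name__ == "__main__":
    image = [[[0.0, 0.25], [0.5, 1.0]] for _ in range(3)]  # 3 x 2 x 2
    fake = make_fake_sample(image)
    print(len(fake))                                       # 6 channels
    print([len(g) for g in split_into_groups(fake, 3)])    # [2, 2, 2]
```

Under the same assumption, a real sample for the discriminator would be built by fusing the original image with its ground-truth target instead of the generator output, so the discriminator learns to tell the two kinds of pairs apart.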