Proceedings of the 2022 5th Artificial Intelligence and Cloud Computing Conference: Latest Articles, Page 2

C-SupConGAN: Using Contrastive Learning and Trained Data Features for Audio-to-Image Generation

Proceedings of the 2022 5th Artificial Intelligence and Cloud Computing Conference Pub Date : 2022-12-17 DOI: 10.1145/3582099.3582121

Haechun Chung, Jong-Kook Kim

Abstract: This paper investigates the audio-to-image generation problem, in which appropriate images are generated from audio input. A previous study, Cross-Modal Contrastive Representation Learning (CMCRL), trained on both audio and images to extract useful audio features for audio-to-image generation. CMCRL upgraded the Generative Adversarial Network (GAN) to achieve high performance in the generation learning phase, but the GAN showed training instability. This paper proposes C-SupConGAN, which uses the conditional supervised contrastive loss (C-SupCon loss). C-SupConGAN enhances the conditional contrastive loss (2C loss) of the Contrastive GAN (ContraGAN), which considers data-to-data relationships and data-to-class relationships in the discriminator. The audio and image embeddings extracted from the encoder pre-trained with CMCRL are used to further extend the C-SupCon loss. The extended C-SupCon loss additionally considers the relation between a data embedding and the corresponding audio embedding (data-to-source relationships) or between a data embedding and the corresponding image embedding (data-to-target relationships). Extensive experiments show that the proposed method improves performance, generates higher-quality images for audio-to-image generation than previous research, and effectively alleviates the training collapse of the GAN.
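To make the data-to-data and data-to-class relationships mentioned in the abstract more concrete, the following is a minimal sketch of a conditional contrastive loss in the spirit of ContraGAN's 2C loss. It is not the authors' C-SupCon loss: the function names, the use of cosine similarity, and the temperature value are illustrative assumptions, and the data-to-source and data-to-target terms described above are not modeled.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def conditional_contrastive_loss(embeddings, labels, class_embeddings, temperature=0.5):
    """Mean conditional contrastive loss over a batch (hypothetical helper).

    embeddings: list of data embedding vectors, one per sample
    labels: list of class labels, one per sample
    class_embeddings: dict mapping each label to a class embedding vector
    """
    total = 0.0
    for i, (x_i, y_i) in enumerate(zip(embeddings, labels)):
        # Data-to-class term: the sample against its own class embedding.
        to_class = math.exp(cosine(x_i, class_embeddings[y_i]) / temperature)
        same_class = 0.0  # data-to-data terms that share the sample's label
        all_others = 0.0  # data-to-data terms over every other sample
        for k, (x_k, y_k) in enumerate(zip(embeddings, labels)):
            if k == i:
                continue
            sim = math.exp(cosine(x_i, x_k) / temperature)
            all_others += sim
            if y_k == y_i:
                same_class += sim
        # The ratio approaches 1 (loss 0) when differently labeled samples are dissimilar.
        total += -math.log((to_class + same_class) / (to_class + all_others))
    return total / len(embeddings)
```

Under these assumptions the loss is zero when every sample in the batch shares one label, and it grows as samples with different labels become more similar to each other.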

Citations: 0

The drone detection based on improved YOLOv5

Proceedings of the 2022 5th Artificial Intelligence and Cloud Computing Conference Pub Date : 2022-12-17 DOI: 10.1145/3582099.3582113

Ziwei Tian, Jie Huang, Yang Yang, Weiying Nie

Citations: 0

The Impact of Image Resolution in the training of Generative Adversarial Networks for Violence Detection

Proceedings of the 2022 5th Artificial Intelligence and Cloud Computing Conference Pub Date : 2022-12-17 DOI: 10.1145/3582099.3582118

Khyle Aaron Goneda Montebon, E. J. G. Emberda

Citations: 0

Car Types and Semantics Classification Using Weka

Proceedings of the 2022 5th Artificial Intelligence and Cloud Computing Conference Pub Date : 2022-12-17 DOI: 10.1145/3582099.3582105

Hung-Hsiang Wang, Yunpeng Shen, Yun-Yun Hung

Citations: 0

Amyotrophic Lateral Sclerosis and Post-Stroke Orofacial Impairment Video-based Multi-class Classification

Proceedings of the 2022 5th Artificial Intelligence and Cloud Computing Conference Pub Date : 2022-12-17 DOI: 10.1145/3582099.3582123

Allan Magno Pecundo, P. Abu, R. Alampay

Abstract: Neurological diseases such as ALS and stroke, which affect the brain and the nerves found throughout the body, including the spinal cord, generally require various forms of testing and clinical diagnosis to detect. These current forms of diagnosis, however, are limited by being either expensive or subjective. Research has been done on automated medical assessment via machine learning, with the goal of offering cheaper and more objective alternatives for aiding diagnosis. For ALS and orofacial impairment in stroke, it has been shown that features derived from facial movement in videos make it possible to detect the presence of each of these neurological diseases, separately, among healthy patients. Research in this area, however, is still relatively scarce and leaves room for improving the overall model, especially with the emergence of newer algorithms for detecting facial landmarks. This research explores how the model can be trained to detect both ALS and post-stroke orofacial impairment (multi-class) among a healthy population. Results show that, with features calculated from facial landmarks in videos, it is possible to develop a single multi-class detection model for ALS and orofacial impairment in stroke among a healthy population, with accuracy as high as 86%.

Citations: 0

Review of Animal Remote Managing and Monitoring System

Proceedings of the 2022 5th Artificial Intelligence and Cloud Computing Conference Pub Date : 2022-12-17 DOI: 10.1145/3582099.3582141

John IO Ojo, Chunling Tu, P. Owolawi, Shengzhi Du, D. D. Plessis

Citations: 0

Research on the big data collection mode of consumers for innovative products and brand value factors

Proceedings of the 2022 5th Artificial Intelligence and Cloud Computing Conference Pub Date : 2022-12-17 DOI: 10.1145/3582099.3582111

Xiaohong Yu, Yu-Che Huang

Abstract: Machine learning and big data collection can help us understand consumers more clearly. However, with product homogeneity becoming increasingly serious, consumers' perception of brands and their product demands cannot be analyzed clearly. The brand has become the basis for the image, popularity and reputation of the enterprise, yet a temporary misjudgment of data can lead a brand to develop the wrong products. Building brand influence has therefore become the goal of reducing the depth of the company's operations, and its evaluation results are expected to help the industry improve itself and provide the main reference for brand strategy. Through structural reconstruction and inductive value chain analysis, this research identifies the core product value that should be considered when designing and positioning future products, and what should be considered when collecting big data, so that both fully comply with the positioning strategy of the entire product brand and avoid damaging brand value and consumer perception. With these data, market researchers can also focus on the balance and value chain between brand value and product positioning, and the research provides a reference for product development positioning to relevant education researchers and market data collectors.

Citations: 0

A multimodal sentiment recognition method based on attention mechanism

Proceedings of the 2022 5th Artificial Intelligence and Cloud Computing Conference Pub Date : 2022-12-17 DOI: 10.1145/3582099.3582131

Bo Liu, Jidong Zhang, Yuxiao Xu, Jianqiang Li, Yan Pei, Guanzhi Qu

Abstract: Effective sentiment analysis of social media data can help to better understand the public's sentiment and opinion tendencies. Combining multimodal content for sentiment classification exploits the correlation between modalities, thereby avoiding the situation in which a single modality does not fully capture the overall sentiment. This paper proposes a multimodal sentiment recognition model based on the attention mechanism. Through transfer learning, the latest pre-trained model is used to extract preliminary features of text and image, and the attention mechanism is deployed to further extract features of prominent image key regions and text keywords, better mining the internal information of each modality and learning the interaction between modalities. Because each modality contributes differently to sentiment classification, a decision-level fusion method is proposed, with fusion rules that integrate the classification results of each modality into the final sentiment recognition result. The model integrates the various unimodal features well and effectively mines the emotional information expressed in Internet social media comments. The method is tested experimentally on the Twitter dataset, and the results show that the classification accuracy of sentiment recognition is significantly improved compared with the single-modal method.
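The decision-level fusion step described in the abstract can be pictured with a minimal sketch like the one below. The abstract does not state the actual fusion rules, so the weighted average used here, the function names `fuse_decisions` and `predict`, and the default weight are all assumptions for illustration only.

```python
def fuse_decisions(text_probs, image_probs, text_weight=0.6):
    """Weighted average of per-class probabilities from two unimodal classifiers.

    text_weight is a hypothetical tuning parameter; the image modality
    receives the remaining weight (1 - text_weight).
    """
    if len(text_probs) != len(image_probs):
        raise ValueError("both modalities must score the same set of classes")
    if not 0.0 <= text_weight <= 1.0:
        raise ValueError("text_weight must lie in [0, 1]")
    image_weight = 1.0 - text_weight
    return [text_weight * t + image_weight * m
            for t, m in zip(text_probs, image_probs)]


def predict(fused_probs):
    """Index of the class with the highest fused probability."""
    return max(range(len(fused_probs)), key=fused_probs.__getitem__)
```

With equal weights, a text classifier that favors class 1 and an image classifier that more strongly favors class 0 fuse to a class 0 decision; shifting the weight toward the text modality can flip the result, which is the sense in which the fusion accounts for the different contribution of each modality.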

Citations: 0

Multi-Modal Depression Detection Based on High-Order Emotional Features

Proceedings of the 2022 5th Artificial Intelligence and Cloud Computing Conference Pub Date : 2022-12-17 DOI: 10.1145/3582099.3582144

Yuran Ru, Ning Geng, Li Li, Hui Wang, Yongxiang Zheng, Zhenhua Tan

Abstract: Diagnosis has always been a difficulty in the treatment of depression. At present, research on automatic depression detection mostly uses low-order features such as video, audio and text directly as input. The lack of guidance from high-order features may be a potential problem. This paper proposes a multi-modal depression detection method based on high-order emotional features. A two-stage network is designed to perform emotion recognition and depression detection at the same time, feeding the emotion results as high-order semantic features into the improved TBJE-E multi-modal network. This process guides the learning of the other modalities with the help of a co-attention module and finally gives the prediction results. Experiments on the DAIC-WOZ dataset show that adding emotional features effectively complements the high-order semantics. Compared with the original TBJE model, the F1 score of the TBJE-E model with emotional features shows a relative improvement of 6.3%. The method reaches the state-of-the-art (SOTA) level in the depression detection task. The experimental data also show that, at present, the risk of an individual's inner psychological privacy being stolen by this technology without their knowledge is very low, and that the technology has some application value in criminal investigation, psychological diagnosis and treatment, and other professional fields.
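For readers unfamiliar with how a relative F1 improvement is computed, the arithmetic can be sketched as follows. The helper names are hypothetical, and the abstract does not report the underlying precision, recall, or absolute F1 values, so none are assumed here.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def relative_improvement(baseline, improved):
    """Relative change of a metric with respect to a non-zero baseline."""
    return (improved - baseline) / baseline
```

A relative improvement of 6.3% means the ratio returned by `relative_improvement` is 0.063; this differs from an absolute gain of 6.3 F1 points.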

Citations: 0

Hybrid of Simplified Small World and Group Counseling Optimization Algorithms with Matured Random Initialization and Variable Insertion Neighborhood Search Technique to Solve Resource Constrained Project Scheduling Problems with Discounted Cash Flows

Proceedings of the 2022 5th Artificial Intelligence and Cloud Computing Conference Pub Date : 2022-12-17 DOI: 10.1145/3582099.3582110

Tshewang Phuntsho, T. Gonsalves

Abstract: For long-run projects, the time and order in which each activity or job is executed matter to contractor firms in terms of profitability. The resource-constrained project scheduling problem with discounted cash flows (RCPSPDC) studies the scheduling of a project with constrained resources to maximize its net present value (NPV). To the rich literature in this field, we add an implementation of RCPSPDC with three more algorithms: simplified small world optimization (SSWO), group counseling optimization (GCO), and a hybrid of these two algorithms with matured random initialization and a variable insertion neighborhood search technique. Hybridizing different algorithms has allowed us to combine the search capabilities of various standalone algorithms and eliminate their demerits. Our algorithms were tested on 17,280 standard project instances. The novel hybrid algorithm has a minimal number of parameters and performs better than or on par with other existing state-of-the-art hybrid algorithms.
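The objective that RCPSPDC maximizes can be illustrated with a minimal sketch of an NPV calculation for a fixed schedule. Discrete per-period discounting is an assumption made here (other discounting conventions exist), and the function name and parameters are hypothetical; resource constraints and the search algorithms themselves are not modeled.

```python
def net_present_value(cash_flows, finish_times, rate):
    """NPV of a schedule: each activity's cash flow is discounted to time 0.

    cash_flows: cash flow of each activity (negative for costs)
    finish_times: period in which each activity's cash flow occurs
    rate: per-period discount rate (assumed constant, greater than -1)
    """
    if len(cash_flows) != len(finish_times):
        raise ValueError("one finish time is required per cash flow")
    if rate <= -1.0:
        raise ValueError("rate must be greater than -1")
    return sum(c / (1.0 + rate) ** t for c, t in zip(cash_flows, finish_times))
```

Under this convention, delaying an activity with a positive cash flow lowers the NPV, while delaying a cost raises it, which is why the order and timing of activities affect profitability.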

Citations: 0