{"title":"C-SupConGAN: Using Contrastive Learning and Trained Data Features for Audio-to-Image Generation","authors":"Haechun Chung, Jong-Kook Kim","doi":"10.1145/3582099.3582121","DOIUrl":"https://doi.org/10.1145/3582099.3582121","url":null,"abstract":"In this paper, the audio-to-image generation problem is investigated, where appropriate images are generated from the audio input. A previous study, Cross-Modal Contrastive Representation Learning (CMCRL), trained on both audio and images to extract useful audio features for audio-to-image generation. CMCRL upgraded the Generative Adversarial Network (GAN) to achieve high performance in the generation learning phase, but the GAN showed training instability. In this paper, C-SupConGAN, which uses the conditional supervised contrastive loss (C-SupCon loss), is proposed. C-SupConGAN enhances the conditional contrastive loss (2C loss) of the Contrastive GAN (ContraGAN) that considers data-to-data relationships and data-to-class relationships in the discriminator. The audio and image embeddings extracted from the encoder pre-trained using CMCRL are used to further extend the C-SupCon loss. The extended C-SupCon loss additionally considers the relationship between a data embedding and the corresponding audio embedding (data-to-source relationships) or between a data embedding and the corresponding image embedding (data-to-target relationships).
Extensive experiments show that the proposed method improves performance, generates higher-quality images for audio-to-image generation than previous research, and effectively alleviates the training collapse of the GAN.","PeriodicalId":222372,"journal":{"name":"Proceedings of the 2022 5th Artificial Intelligence and Cloud Computing Conference","volume":"143 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-12-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121292959","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The drone detection based on improved YOLOv5","authors":"Ziwei Tian, Jie Huang, Yang Yang, Weiying Nie","doi":"10.1145/3582099.3582113","DOIUrl":"https://doi.org/10.1145/3582099.3582113","url":null,"abstract":"The wide application of drones not only brings convenience to production and daily life, but also poses a threat to public safety. Therefore, the detection of drones is crucial. However, tiny drones are difficult to detect with traditional methods such as radar and photoelectric sensing because of their small size. Therefore, this paper proposes a tiny-drone detection method based on the YOLOv5 framework. By optimizing the anchor box sizes, embedding the Convolutional Block Attention Module (CBAM), and adopting an optimized loss function (CIoU), the detection performance of the original algorithm for drones against complex backgrounds is improved. The improved YOLOv5 algorithm is trained and tested on a self-built dataset, and its mean Average Precision, Accuracy and Recall reach 96.9%, 97.8% and 95.6%, respectively. Finally, the improved YOLOv5 is used for drone detection in complex background environments. Compared with the original algorithm, it can correctly identify drone targets in harsh environments.","PeriodicalId":222372,"journal":{"name":"Proceedings of the 2022 5th Artificial Intelligence and Cloud Computing Conference","volume":"14 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-12-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115937132","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The Impact of Image Resolution in the training of Generative Adversarial Networks for Violence Detection","authors":"Khyle Aaron Goneda Montebon, E. J. G. Emberda","doi":"10.1145/3582099.3582118","DOIUrl":"https://doi.org/10.1145/3582099.3582118","url":null,"abstract":"Since time immemorial, violence has been a problem that the world has been facing. The rise of technology has presented an opportunity to help in this matter, and violence detection solutions have been created for this cause. The problem with existing solutions is that they are not appropriate for settings in a developing country. Factors such as the place, the objects seen, and the people involved differ from those in the datasets from developed countries on which existing models are trained, so those models might prove ineffective for developing countries. That is why the researchers aim to create a Generative Adversarial Network model trained with data that are location-specific to the Philippines. In this study, the researchers gauge the effects of image resolution on the training of the GAN model, named V.GAN, to help improve its performance and implementation.","PeriodicalId":222372,"journal":{"name":"Proceedings of the 2022 5th Artificial Intelligence and Cloud Computing Conference","volume":"51 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-12-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126720065","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Car Types and Semantics Classification Using Weka","authors":"Hung-Hsiang Wang, Yunpeng Shen, Yun-Yun Hung","doi":"10.1145/3582099.3582105","DOIUrl":"https://doi.org/10.1145/3582099.3582105","url":null,"abstract":"This paper presents a method of using Weka, a machine learning tool, to identify the differences in types and product semantics between fuel vehicles and electric vehicles. Pictures of 58 fuel vehicles and 42 pictures of electric vehicles from the period 2021 to 2023 are selectively collected from the Consumer Reports website to build the dataset. The fuel vehicle brands include Audi, BMW, Cadillac, and Lexus, with 3 types, namely SUV, Sedan and Luxury. For electric vehicles, Tesla is included as a 5th brand in addition to the above-mentioned brands. The perception of each picture is labelled using questionnaires of Automotive Model Semantics. Results reveal that even a small dataset can be used to train highly accurate models for classifying fuel vehicles and electric vehicles, different types, and product semantics. The method presented is promising for studying car styling and exploring new applications of image classification to branding and product design.","PeriodicalId":222372,"journal":{"name":"Proceedings of the 2022 5th Artificial Intelligence and Cloud Computing Conference","volume":"22 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-12-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130708936","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Amyotrophic Lateral Sclerosis and Post-Stroke Orofacial Impairment Video-based Multi-class Classification","authors":"Allan Magno Pecundo, P. Abu, R. Alampay","doi":"10.1145/3582099.3582123","DOIUrl":"https://doi.org/10.1145/3582099.3582123","url":null,"abstract":"Neurological diseases such as ALS and stroke, which affect the brain as well as the spinal cord and the nerves found throughout the body, generally require various forms of testing and clinical diagnosis to detect. These current forms of diagnosis, however, are limited by being either expensive or subjective. Research has been done in the area of automated medical assessment via machine learning with the goal of offering cheaper and more objective alternatives for aiding diagnosis. For the case of ALS and orofacial impairment in stroke, it has been shown that, using features derived from facial movement in videos, it is possible to detect the presence of these neurological diseases among healthy patients, separately. Research in this area, however, is still relatively scarce and leaves room for improvements in the overall model, especially with the emergence of newer algorithms for detecting facial landmarks. This research explores how the model can be trained to detect both ALS and orofacial impairment in post-stroke (multi-class) among a healthy population.
Results show that, using features calculated from facial landmarks in videos, it is possible to develop a single multi-class model that detects ALS and orofacial impairment in stroke among a healthy population with accuracy as high as 86%.","PeriodicalId":222372,"journal":{"name":"Proceedings of the 2022 5th Artificial Intelligence and Cloud Computing Conference","volume":"95 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-12-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124626357","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Review of Animal Remote Managing and Monitoring System","authors":"John IO Ojo, Chunling Tu, P. Owolawi, Shengzhi Du, D. D. Plessis","doi":"10.1145/3582099.3582141","DOIUrl":"https://doi.org/10.1145/3582099.3582141","url":null,"abstract":"Livestock is an integral part of everyday life, contributing to the social, cultural, and economic sectors. Livestock farming faces challenges in remote animal management and monitoring due to cost constraints and product quality requirements. Recent developments in smart techniques, especially the Internet of Things (IoT) and cloud devices/services, provide significant potential for efficient and secure farm management. This paper presents a review of smart farm management and monitoring systems for livestock based on IoT techniques, including the monitoring of animal behavior, weather conditions, the environment, and more. The review provides a comprehensive understanding of the sensing techniques, data analysis, communication, hardware, and software used in existing smart livestock solutions.","PeriodicalId":222372,"journal":{"name":"Proceedings of the 2022 5th Artificial Intelligence and Cloud Computing Conference","volume":"3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-12-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130398156","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Research on the big data collection mode of consumers for innovative products and brand value factors","authors":"Xiaohong Yu, Yu-Che Huang","doi":"10.1145/3582099.3582111","DOIUrl":"https://doi.org/10.1145/3582099.3582111","url":null,"abstract":"Machine learning combined with big data collection can help us understand consumers more clearly. However, with today's increasingly serious product homogeneity, consumers' perceptions of brands and their product demands cannot be clearly analyzed. The brand has become the basis for the image, popularity and reputation of the enterprise, yet a temporary misjudgment of data can lead a brand to develop the wrong products; therefore, building brand influence has become the goal of reducing the depth of the company's operations, and its evaluation results are expected to help the industry improve itself and provide the main reference for brand strategy. Through structural reconstruction and value chain inductive analysis, this research identifies the core product values that should be paid attention to when designing and positioning products in the future and what should be paid attention to when collecting big data, so that both fully comply with the positioning strategy of the entire product brand and avoid damaging brand value and consumer perception.
Through the construction of these data, market-related researchers can also focus on the balance and value chain between brand value and product positioning, and the results provide a reference for product development positioning to relevant education researchers or market data collectors.","PeriodicalId":222372,"journal":{"name":"Proceedings of the 2022 5th Artificial Intelligence and Cloud Computing Conference","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-12-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134119438","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A multimodal sentiment recognition method based on attention mechanism","authors":"Bo Liu, Jidong Zhang, Yuxiao Xu, Jianqiang Li, Yan Pei, Guanzhi Qu","doi":"10.1145/3582099.3582131","DOIUrl":"https://doi.org/10.1145/3582099.3582131","url":null,"abstract":"Effective sentiment analysis on social media data can help to better understand the public's sentiment and opinion tendencies. Combining multimodal content for sentiment classification uses the correlation information of data between modalities, thereby avoiding the situation in which a single modality does not fully capture the overall sentiment. This paper proposes a multimodal sentiment recognition model based on the attention mechanism. Through transfer learning, the latest pre-trained model is used to extract preliminary features of text and images, and the attention mechanism is deployed to achieve further feature extraction of prominent image key regions and text keywords, better mining the internal information of each modality and learning the interaction between modalities. In view of the different contributions of each modality to sentiment classification, a decision-level fusion method is proposed that uses fusion rules to integrate the classification results of each modality and obtain the final sentiment recognition result. This model integrates various unimodal features well, and effectively mines the emotional information expressed in Internet social media comments.
This method is experimentally tested on the Twitter dataset, and the results show that the classification accuracy of sentiment recognition is significantly improved compared with the single-modal method.","PeriodicalId":222372,"journal":{"name":"Proceedings of the 2022 5th Artificial Intelligence and Cloud Computing Conference","volume":"29 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-12-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134484503","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Multi-Modal Depression Detection Based on High-Order Emotional Features","authors":"Yuran Ru, Ning Geng, Li Li, Hui Wang, Yongxiang Zheng, Zhenhua Tan","doi":"10.1145/3582099.3582144","DOIUrl":"https://doi.org/10.1145/3582099.3582144","url":null,"abstract":"The diagnosis of depression has always been a difficulty in its treatment. At present, research on automatic depression detection mostly uses low-order features such as video, audio and text directly as input. The lack of guidance from high-order features may be a potential problem. This paper proposes a multi-modal depression detection method based on high-order emotional features. A two-stage network is designed to realize emotion recognition and depression detection at the same time, and the emotion recognition results are input as high-order semantic features into the improved TBJE-E multi-modal network. This process guides the learning of the other modalities with the help of a co-attention module, and finally gives the prediction results. The results of experiments on the DAIC-WOZ dataset show that the addition of emotional features effectively complements the high-order semantics. Compared with the original TBJE model, the F1 performance of the TBJE-E model with emotional features is relatively improved by 6.3%. The method in this paper reaches the state-of-the-art (SOTA) level in the depression detection task.
The experimental data also show that, at present, the risk of an individual's internal psychological privacy being stolen by this technology without their knowledge is very low, and that this technology has some application value in criminal investigation, psychological diagnosis and treatment, and other professional fields.","PeriodicalId":222372,"journal":{"name":"Proceedings of the 2022 5th Artificial Intelligence and Cloud Computing Conference","volume":"30 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-12-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123358812","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Hybrid of Simplified Small World and Group Counseling Optimization Algorithms with Matured Random Initialization and Variable Insertion Neighborhood Search Technique to Solve Resource Constrained Project Scheduling Problems with Discounted Cash Flows","authors":"Tshewang Phuntsho, T. Gonsalves","doi":"10.1145/3582099.3582110","DOIUrl":"https://doi.org/10.1145/3582099.3582110","url":null,"abstract":"For long-run projects, the time and order in which each activity or job is executed matter to contractor firms in terms of profitability. The resource-constrained project scheduling problem with discounted cash flows (RCPSPDC) studies the scheduling of a project with constrained resources to maximize its net present value (NPV). To the rich literature in this field, we add an implementation of RCPSPDC with three more algorithms: simplified small world optimization (SSWO), group counseling optimization (GCO), and a hybrid of these two algorithms with matured random initialization and a variable insertion neighborhood search technique. Hybridization has allowed us to combine the different search capabilities of the standalone algorithms and eliminate their demerits. Our algorithms were tested on 17,280 standard project instances.
The novel hybrid algorithm has a minimal number of parameters and performs better than or on par with other existing state-of-the-art hybrid algorithms.","PeriodicalId":222372,"journal":{"name":"Proceedings of the 2022 5th Artificial Intelligence and Cloud Computing Conference","volume":"8 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-12-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121134727","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}