{"title":"DeepUnseen: Unpredicted Event Recognition Through Integrated Vision-Language Models","authors":"Hidetomo Sakaino, Natnapat Gaviphat, Louie Zamora, Alivanh Insisiengmay, Dwi Fetiria Ningrum","doi":"10.1109/CAI54212.2023.00029","DOIUrl":"https://doi.org/10.1109/CAI54212.2023.00029","url":null,"abstract":"Deep Learning-based segmentation models provide many benefits for scene understanding. However, such models have not been used and tested for unpredicted events such as natural disasters caused by hurricanes, tornadoes, and typhoons. Since low illumination, heavy rainfall, and storms can degrade image quality, relying on a single state-of-the-art (SOTA) model may fail to recognize objects correctly. Moreover, several enhancements to segmentation remain unsolved. Thus, this paper proposes a vision-language-based DL model, namely DeepUnseen, which integrates different Deep Learning models with the benefits of classification and segmentation. Experimental results on disaster and traffic accident scenes showed superiority over a single SOTA Deep Learning model. Moreover, better semantically refined classes are obtained.","PeriodicalId":129324,"journal":{"name":"2023 IEEE Conference on Artificial Intelligence (CAI)","volume":"65 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126013424","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Transfer learning with BERT and a-priori Knowledge-Based Sentence of Interest Selection in Radiology Impressions for Phenotyping Venous Thromboembolism","authors":"Arash Maghsoudi, J. Razjouyan, Sara Nowakowski, Ang Li","doi":"10.1109/cai54212.2023.00122","DOIUrl":"https://doi.org/10.1109/cai54212.2023.00122","url":null,"abstract":"Phenotyping venous thromboembolism (VTE) is a challenging task that requires accurate identification of clinical features from unstructured electronic health records (EHRs). In this study, we propose the use of Bidirectional Encoder Representations from Transformers (BERT), a pre-trained natural language processing (NLP) model, for VTE phenotyping. We fine-tuned BERT on a corpus consisting of radiology impressions of 13702 cancer patients from Harris Health System (HHS) in Houston, Texas. Our evaluation shows that BERT can achieve a sensitivity of 96.1% and precision of 95.1%. Our findings indicate that BERT can be an effective tool for VTE phenotyping using radiology impressions. The proposed approach has potential applications in clinical decision support and population health management.","PeriodicalId":129324,"journal":{"name":"2023 IEEE Conference on Artificial Intelligence (CAI)","volume":"13 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125127132","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Explaining Cyberbullying Trait Detection Through High Accuracy Transformer Ensemble","authors":"B. Goldfeder, Igor Griva","doi":"10.1109/CAI54212.2023.00116","DOIUrl":"https://doi.org/10.1109/CAI54212.2023.00116","url":null,"abstract":"Cyberbullying remains a pernicious concern within social media that can have an outsized impact on teens and children, who are spending more and more of their social engagements online. The recent COVID-19 pandemic physically isolated young students during school and other public closures, which served to increase usage of social media. These events heighten the need for careful, explainable, and ethical tools for monitoring and identifying cyberbullying while explaining classifications with respect to specific traits. Classical machine learning methods, which form a large part of the existing domain, provide better introspection into how the ML system arrives at its inference. However, these traditional stochastic methodologies are hampered by the complex task of feature engineering and generally lag behind newer deep neural network (DNN) architectures in accuracy. The tradeoff is that DNNs are often opaque with respect to the explainability of the model, the inference, and the impact of vast training sets on ethics and bias. In this work, we propose an advanced architecture that incorporates second- and third-generation transformer-based models to provide focused and highly accurate cyberbullying trait detection and identification, allowing for human verification and validation based on these inferences. This architecture uses the IEEE Fine-Grained Cyberbullying Dataset (FGCD) and exceeds the current state of the art (SOA).","PeriodicalId":129324,"journal":{"name":"2023 IEEE Conference on Artificial Intelligence (CAI)","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121972565","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Novel Efficient Deep Learning Framework for Facial Inpainting","authors":"Akshay Ravi, N. Saxena, A. Roy, Srajan Gupta","doi":"10.1109/CAI54212.2023.00096","DOIUrl":"https://doi.org/10.1109/CAI54212.2023.00096","url":null,"abstract":"The usage of masks during the pandemic has made identifying criminals using surveillance cameras very difficult. Generating the facial features behind a mask is a type of image inpainting. Current research on image inpainting shows promising results on manually pixelated regular holes/patches but has not been designed to handle the specific case of “unmasking” faces. In this paper, we propose a novel, custom U-Net based Convolutional Neural Network to regenerate the face under a mask. Simulation results demonstrate that our proposed framework can achieve more than 97% Structural Similarity Index Measure for different types of facial masks across different faces, irrespective of gender, race, or color.","PeriodicalId":129324,"journal":{"name":"2023 IEEE Conference on Artificial Intelligence (CAI)","volume":"96 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122529457","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Deforestation Detection in the Brazilian Amazon Using Transformer-based Networks","authors":"Mariam Alshehri, Anes Ouadou, Grant J Scott","doi":"10.1109/CAI54212.2023.00130","DOIUrl":"https://doi.org/10.1109/CAI54212.2023.00130","url":null,"abstract":"Deforestation is a critical environmental issue that has far-reaching impacts on climate change, biodiversity, and the livelihoods of local communities. Conventional methods such as field surveys and map interpretation are not feasible, especially in vast regions like the Brazilian Amazon. In this paper, we adapt ChangeFormer, a transformer-based change detection model, to detect deforestation in the Brazilian Amazon, leveraging the attention mechanism to capture spatial and temporal dependencies in bi-temporal satellite images. To evaluate the model’s performance, we implemented a rigorous methodology to create a deforestation detection dataset using Sentinel-2 images of selected conservation units in the Brazilian Amazon during 2020 and 2021. The model achieved a high accuracy of 94%, demonstrating the potential of transformer-based networks for accurate and efficient deforestation detection.","PeriodicalId":129324,"journal":{"name":"2023 IEEE Conference on Artificial Intelligence (CAI)","volume":"58 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121531579","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Active Learning to Minimize the Risk from Future Epidemics","authors":"Suprim Nakarmi, K. Santosh","doi":"10.1109/CAI54212.2023.00145","DOIUrl":"https://doi.org/10.1109/CAI54212.2023.00145","url":null,"abstract":"For any future epidemics (e.g., Covid-19), typical deep learning (DL) models are of no use, as they require a large amount of data for training. Moreover, data collection (with annotations) typically takes months (and even years) in public healthcare. In such a context, active learning (or human/expert-in-the-loop) is a must, where a machine can learn from the first day with the minimum possible labeled data. Using unsupervised learning, we propose to build pre-trained DL models that iteratively learn independently over time, where a human/expert intervenes/mentors only when the machine makes mistakes, and only for limited data. To validate this concept, deep features are used to classify data into two clusters (0/1: Covid-19/non-Covid-19) on two different image datasets: Chest X-ray (CXR) and Computed Tomography (CT) scans, of sizes 4,714 and 10,000 images, respectively. Using pre-trained DL models and unsupervised learning in our active learning framework, we achieved the highest AUCs of 0.99 and 0.94 on the CXR and CT scan datasets, respectively. Our results are comparable with fully trained (on large data) state-of-the-art DL models.","PeriodicalId":129324,"journal":{"name":"2023 IEEE Conference on Artificial Intelligence (CAI)","volume":"67 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130322877","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Explainable AI via Linguistic Summarization of Black Box Computer Vision Models","authors":"Brendan Alvey, D. Anderson, James M. Keller","doi":"10.1109/CAI54212.2023.00156","DOIUrl":"https://doi.org/10.1109/CAI54212.2023.00156","url":null,"abstract":"There is an ever-growing demand to characterize and understand AI as it is integrated into everyday life. Linguistic summaries have been previously used to provide natural language descriptions of data and models. However, the number of possible summaries increases rapidly with the number of data attributes. To make sense of the vast number of possible linguistic statements for a system, we introduce a hierarchical approach for generating and ranking linguistic statements. Each description of the model is assigned a value based on user criteria, allowing summaries to be tailored to specific users. We provide visualizations of the generation of summaries for a popular computer vision detector on a synthetically generated dataset.","PeriodicalId":129324,"journal":{"name":"2023 IEEE Conference on Artificial Intelligence (CAI)","volume":"65 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133863134","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Skeleton-ID: AI-driven Human Identification","authors":"A. Valsecchi, Ó. Gómez, A. González, M. Macias, M. de Dios, M. Panizo, K. Prada, M. Flores, S. Kaiser, N. Lurromi, E. Bermejo, P. Mesejo, S. Damas, Ó. Cordón, Ó. Ibáñez","doi":"10.1109/cai54212.2023.00124","DOIUrl":"https://doi.org/10.1109/cai54212.2023.00124","url":null,"abstract":"Victims of crime, migration, natural disasters, and armed conflicts often remain unidentified due to the absence of DNA samples and fingerprints. Skeleton-based identification (ID) methods could help to identify these victims because they are suitable for poorly preserved bodies and require comparison data that is relatively easy to obtain —e.g., a simple photo of the victim— instead of a DNA sample of a close relative or access to a database in another country. Skeleton-ID™ is game-changing software designed for skeleton-based human identification using artificial intelligence (AI). Combining forensic anthropology and odontology methods with cutting-edge image processing and AI technologies allows us to apply these methods on a large scale. This leads to very high reliability (up to 99%), explainable results, and a drastic reduction in identification time (from days to minutes). The software is accessible through a web browser and can process ante-mortem (AM) information on missing persons as well as post-mortem (PM) information. Usable data includes ordinary photos, 3D scans of bones, x-rays, and dental records.","PeriodicalId":129324,"journal":{"name":"2023 IEEE Conference on Artificial Intelligence (CAI)","volume":"57 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132291263","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Attention-Based Underwater Oil Leakage Detection","authors":"Muhammad Zia Ur Rehman, Manimurugan Shanmuganathan, Anand Paul","doi":"10.1109/CAI54212.2023.00100","DOIUrl":"https://doi.org/10.1109/CAI54212.2023.00100","url":null,"abstract":"This study addresses the pressing issue of oil and water leakage detection in underwater pipes, which has become a major concern due to the growing global demand for pristine water and natural oil. While extensive datasets exist for image and voice recognition, few datasets are available for the engineering detection of oil and water pipe leakage using acoustic signals. Consequently, many existing leak detection systems are ineffective at identifying breaches, resulting in major spills that cost pipeline companies millions of dollars. To address this problem, we propose a novel approach that employs an attention-based neural network to predict underwater pipe leakage, and we evaluate the effectiveness of deep learning models. Our study employs sensor signal datasets from an actual industrial scenario, and our results indicate that the attention model outperforms other models in this domain. This study presents a promising avenue for addressing the issue of water leakage detection and management, which has significant implications for the water industry and the global population.","PeriodicalId":129324,"journal":{"name":"2023 IEEE Conference on Artificial Intelligence (CAI)","volume":"195 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133056462","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Vision Transformer for Beamforming on Phased Array Antennas","authors":"Dominik Starzmann, Almut Kuemmerle, Fabian Stolle, Jens Haala, S. Klinkner","doi":"10.1109/CAI54212.2023.00160","DOIUrl":"https://doi.org/10.1109/CAI54212.2023.00160","url":null,"abstract":"Beamforming is not only a major feature for increasing the capabilities of 5G and its successors but also a key feature of current and future satellite communication systems. While convolutional neural networks (CNNs) have already been applied to beamforming, there is no implementation using vision transformers despite their massive advances in image recognition. To the best of our knowledge, we are the first to deploy a vision transformer for beamforming. It successfully predicts the phases of a given antenna pattern for an 8x6 patch antenna array. The final architecture uses multi-head attention as well as convolutional layers for feature extraction, resulting in a convolutional vision transformer. The developed model outperforms comparable CNNs while requiring fewer resources, making it more suitable for non-terrestrial applications.","PeriodicalId":129324,"journal":{"name":"2023 IEEE Conference on Artificial Intelligence (CAI)","volume":"50 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133260224","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}