{"title":"Generative Adversarial Networks (GANs) for Audio-Visual Speech Recognition in Artificial Intelligence IoT","authors":"Yibo He, Kah Phooi Seng, Li Minn Ang","doi":"10.3390/info14100575","DOIUrl":"https://doi.org/10.3390/info14100575","url":null,"abstract":"This paper proposes a novel multimodal generative adversarial network AVSR (multimodal AVSR GAN) architecture, to improve both the energy efficiency and the AVSR classification accuracy of artificial intelligence Internet of things (IoT) applications. The audio-visual speech recognition (AVSR) modality is a classical multimodal modality, which is commonly used in IoT and embedded systems. Examples of suitable IoT applications include in-cabin speech recognition systems for driving systems, AVSR in augmented reality environments, and interactive applications such as virtual aquariums. The application of multimodal sensor data for IoT applications requires efficient information processing, to meet the hardware constraints of IoT devices. The proposed multimodal AVSR GAN architecture is composed of a discriminator and a generator, each of which is a two-stream network, corresponding to the audio stream information and the visual stream information, respectively. To validate this approach, we used augmented data from well-known datasets (LRS2-Lip Reading Sentences 2 and LRS3) in the training process, and testing was performed using the original data. The research and experimental results showed that the proposed multimodal AVSR GAN architecture improved the AVSR classification accuracy. Furthermore, in this study, we discuss the domain of GANs and provide a concise summary of the proposed GANs.","PeriodicalId":38479,"journal":{"name":"Information (Switzerland)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2023-10-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135729614","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"On the Use of Kullback–Leibler Divergence for Kernel Selection and Interpretation in Variational Autoencoders for Feature Creation","authors":"Fábio Mendonça, Sheikh Shanawaz Mostafa, Fernando Morgado-Dias, Antonio G. Ravelo-García","doi":"10.3390/info14100571","DOIUrl":"https://doi.org/10.3390/info14100571","url":null,"abstract":"This study presents a novel approach for kernel selection based on Kullback–Leibler divergence in variational autoencoders using features generated by the convolutional encoder. The proposed methodology focuses on identifying the most relevant subset of latent variables to reduce the model’s parameters. Each latent variable is sampled from the distribution associated with a single kernel of the last encoder’s convolutional layer, resulting in an individual distribution for each kernel. Relevant features are selected from the sampled latent variables to perform kernel selection, which filters out uninformative features and, consequently, unnecessary kernels. Both the proposed filter method and the sequential feature selection (standard wrapper method) were examined for feature selection. Particularly, the filter method evaluates the Kullback–Leibler divergence between all kernels’ distributions and hypothesizes that similar kernels can be discarded as they do not convey relevant information. This hypothesis was confirmed through the experiments performed on four standard datasets, where it was observed that the number of kernels can be reduced without meaningfully affecting the performance. This analysis was based on the accuracy of the model when the selected kernels fed a probabilistic classifier and the feature-based similarity index to appraise the quality of the reconstructed images when the variational autoencoder only uses the selected kernels. Therefore, the proposed methodology guides the reduction of the number of parameters of the model, making it suitable for developing applications for resource-constrained devices.","PeriodicalId":38479,"journal":{"name":"Information (Switzerland)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2023-10-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135888870","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"DAEM: A Data- and Application-Aware Error Analysis Methodology for Approximate Adders","authors":"Muhammad Abdullah Hanif, Rehan Hafiz, Muhammad Shafique","doi":"10.3390/info14100570","DOIUrl":"https://doi.org/10.3390/info14100570","url":null,"abstract":"Approximate adders are some of the fundamental arithmetic operators that are being employed in error-resilient applications, to achieve performance/energy/area gains. This improvement usually comes at the cost of some accuracy and, therefore, requires prior error analysis, to select an approximate adder variant that provides acceptable accuracy. Most of the state-of-the-art error analysis techniques for approximate adders assume input bits and operands to be independent of one another, while some also assume the operands to be uniformly distributed. In this paper, we analyze the impact of these assumptions on the accuracy of error estimation techniques, and we highlight the need to address these assumptions, to achieve better and more realistic quality estimates. Based on our analysis, we propose DAEM, a data- and application-aware error analysis methodology for approximate adders. Unlike existing error analysis models, we neither assume the adder operands to be uniformly distributed nor assume them to be independent. Specifically, we use 2D joint input probability mass functions (PMFs), populated using sample data, in order to incorporate the data and application knowledge in the analysis. These 2D joint input PMFs, along with 2D error maps of approximate adders, are used to estimate the error PMF of an adder network. The error PMF is then utilized to compute different error measures, such as the mean squared error (MSE) and mean error distance (MED). We evaluate the proposed error analysis methodology on audio and video processing applications, and we demonstrate that our methodology provides error estimates having a better correlation with the simulation results, as compared to the state-of-the-art techniques.","PeriodicalId":38479,"journal":{"name":"Information (Switzerland)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2023-10-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135994561","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An AI-Based Framework for Translating American Sign Language to English and Vice Versa","authors":"Vijayendra D. Avina, Md Amiruzzaman, Stefanie Amiruzzaman, Linh B. Ngo, M. Ali Akber Dewan","doi":"10.3390/info14100569","DOIUrl":"https://doi.org/10.3390/info14100569","url":null,"abstract":"In this paper, we propose a framework to convert American Sign Language (ASL) to English and English to ASL. Within this framework, we use a deep learning model along with the rolling average prediction that captures image frames from videos and classifies the signs from the image frames. The classified frames are then used to construct ASL words and sentences to support people with hearing impairments. We also use the same deep learning model to capture signs from the people with deaf symptoms and convert them into ASL words and English sentences. Based on this framework, we developed a web-based tool to use in real-life application and we also present the tool as a proof of concept. With the evaluation, we found that the deep learning model converts the image signs into ASL words and sentences with high accuracy. The tool was also found to be very useful for people with hearing impairment and deaf symptoms. The main contribution of this work is the design of a system to convert ASL to English and vice versa.","PeriodicalId":38479,"journal":{"name":"Information (Switzerland)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2023-10-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"136184855","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Investigating the Relationship of User Acceptance to the Characteristics and Performance of an Educational Software in Byzantine Music","authors":"Konstantinos-Hercules Kokkinidis, Georgios Patronas, Sotirios K. Goudos, Theodoros Maikantis, Nikolaos Nikolaidis","doi":"10.3390/info14100568","DOIUrl":"https://doi.org/10.3390/info14100568","url":null,"abstract":"The purpose of this study is to examine the impact of educational software characteristics on software performance through the mediating role of user acceptance. Our approach allows for a deeper understanding of the factors that contribute to the effectiveness of educational software by bridging the fields of educational technology, psychology, and human–computer interaction, offering a holistic perspective on software adoption and performance. This study is based on a sample collected from public and private education institutes in Northern Greece and on data obtained from 236 users. The statistical method employed is structural equation models (SEMs), via SPSS—AMOS estimation. The findings of this study suggest that user acceptance and performance appraisal are exceptionally interrelated in regard to educational applications. The study argues that user acceptance is positively related to the performance of educational software and constitutes the nested epicenter mediating construct in the educational software characteristics. Additional findings, such as computer-familiar users and users from the field of choral music, are positively related to the performance of the educational software. Our conclusions help in understanding the psychological and behavioral aspects of technology adoption in the educational setting. Findings are discussed in terms of their practical usefulness in education and further research.","PeriodicalId":38479,"journal":{"name":"Information (Switzerland)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2023-10-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"136184506","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Automated Assessment of Comprehension Strategies from Self-Explanations Using LLMs","authors":"Bogdan Nicula, Mihai Dascalu, Tracy Arner, Renu Balyan, Danielle S. McNamara","doi":"10.3390/info14100567","DOIUrl":"https://doi.org/10.3390/info14100567","url":null,"abstract":"Text comprehension is an essential skill in today’s information-rich world, and self-explanation practice helps students improve their understanding of complex texts. This study was centered on leveraging open-source Large Language Models (LLMs), specifically FLAN-T5, to automatically assess the comprehension strategies employed by readers while understanding Science, Technology, Engineering, and Mathematics (STEM) texts. The experiments relied on a corpus of three datasets (N = 11,833) with self-explanations annotated on 4 dimensions: 3 comprehension strategies (i.e., bridging, elaboration, and paraphrasing) and overall quality. Besides FLAN-T5, we also considered GPT3.5-turbo to establish a stronger baseline. Our experiments indicated that the performance improved with fine-tuning, having a larger LLM model, and providing examples via the prompt. Our best model considered a pretrained FLAN-T5 XXL model and obtained a weighted F1-score of 0.721, surpassing the 0.699 F1-score previously obtained using smaller models (i.e., RoBERTa).","PeriodicalId":38479,"journal":{"name":"Information (Switzerland)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2023-10-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135766283","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Multi-Beam Radar Communication Integrated System Design","authors":"Hao Ma, Jun Wang, Xin Sun, Wenxin Jin","doi":"10.3390/info14100566","DOIUrl":"https://doi.org/10.3390/info14100566","url":null,"abstract":"In this paper, we propose a multi-beam integrated radar and communication scheme using phased-array antenna, in which the same LFM-BPSK integrated waveform is used for both the radar and the communication beams. In the integrated beam design, the radar beam is periodically scanned in different directions for detection, and the communication beam is periodically manipulated in one direction for communication. The system’s beamforming uses adaptive beamforming technology to achieve radar echoes and communication reception. For the LFM-BPSK integrated waveform used by the system, we propose a method for estimating parameters during communication reception. Through simulation, the proposed beam-pattern design, adaptive beamforming, and parameter estimation scheme can achieve radar and communication functions using phased-array antennas.","PeriodicalId":38479,"journal":{"name":"Information (Switzerland)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2023-10-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135767021","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"New Suptech Tool of the Predictive Generation for Insurance Companies—The Case of the European Market","authors":"Timotej Jagrič, Daniel Zdolšek, Robert Horvat, Iztok Kolar, Niko Erker, Jernej Merhar, Vita Jagrič","doi":"10.3390/info14100565","DOIUrl":"https://doi.org/10.3390/info14100565","url":null,"abstract":"Financial innovation, green investments, or climate change are changing insurers’ business ecosystems, impacting their business behaviour and financial vulnerability. Supervisors and other stakeholders are interested in identifying the path toward deterioration in the insurance company’s financial health as early as possible. Suptech tools enable them to discover more and to intervene in a timely manner. We propose an artificial intelligence approach using Kohonen’s self-organizing maps. The dataset used for development and testing included yearly financial statements with 4058 observations for European composite insurance companies from 2012 to 2021. In a novel manner, the model investigates the behaviour of insurers, looking for similarities. The model forms a map. For the obtained groupings of companies from different geographical origins, a common characteristic was discovered regarding their future financial deterioration. A threshold defined using the solvency capital requirement (SCR) ratio being below 130% for the next year is applied to the map. On the test sample, the model correctly identified on average 86% of problematic companies and 79% of unproblematic companies. Changing the SCR ratio level enables differentiation into multiple map sections. The model does not rely on traditional methods, or the use of the SCR ratio as a dependent variable but looks for similarities in the actual insurer’s financial behaviour. The proposed approach offers grounds for a Suptech tool of predictive generation to support early detection of the possible future financial distress of an insurance company.","PeriodicalId":38479,"journal":{"name":"Information (Switzerland)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2023-10-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135766153","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Neural Network Applications in Polygraph Scoring—A Scoping Review","authors":"Dana Rad, Nicolae Paraschiv, Csaba Kiss","doi":"10.3390/info14100564","DOIUrl":"https://doi.org/10.3390/info14100564","url":null,"abstract":"Polygraph tests have been used for many years as a means of detecting deception, but their accuracy has been the subject of much debate. In recent years, researchers have explored the use of neural networks in polygraph scoring to improve the accuracy of deception detection. The purpose of this scoping review is to offer a comprehensive overview of the existing research on the subject of neural network applications in scoring polygraph tests. A total of 57 relevant papers were identified and analyzed for this review. The papers were examined for their research focus, methodology, results, and conclusions. The scoping review found that neural networks have shown promise in improving the accuracy of polygraph tests, with some studies reporting significant improvements over traditional methods. However, further research is needed to validate these findings and to determine the most effective ways of integrating neural networks into polygraph testing. The scoping review concludes with a discussion of the current state of the field and suggestions for future research directions.","PeriodicalId":38479,"journal":{"name":"Information (Switzerland)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2023-10-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135854234","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Practice Projects for an FPGA-Based Remote Laboratory to Teach and Learn Digital Electronics","authors":"Rafael Navas-González, Óscar Oballe-Peinado, Julián Castellanos-Ramos, Daniel Rosas-Cervantes, José A. Sánchez-Durán","doi":"10.3390/info14100558","DOIUrl":"https://doi.org/10.3390/info14100558","url":null,"abstract":"This work presents examples of practice sessions to teach and learn digital electronics using an FPGA-based development platform, accessible either through the on-campus laboratory or online using a remote laboratory developed by the authors. The main tasks proposed in the practice sessions are to design specific modules that will be included as a main block in more complex projects. Each project is adapted and ready once the student modules to be implemented, debugged, and/or tested in the FPGA-based platform are added using the aforementioned accessibility methods. The proposal suggests the use of a web-based remote laboratory to complement (rather than replace) on-campus teaching in response to the growing need for access to laboratory resources beyond regular teaching hours. The paper introduces the main topics on implementing and using the tool, sets out how to adapt regular projects to be executed in the remote lab, and describes several practice projects proposed to students in the final three academic years. The paper concludes with an analysis and evaluation of the user experience taken from surveys conducted with students at the end of the semester.","PeriodicalId":38479,"journal":{"name":"Information (Switzerland)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2023-10-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135969643","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}