{"title":"MusicTalk: A Microservice Approach for Musical Instrument Recognition","authors":"Yi-Bing Lin;Chang-Chieh Cheng;Shih-Chuan Chiu","doi":"10.1109/OJCS.2024.3476416","DOIUrl":"https://doi.org/10.1109/OJCS.2024.3476416","url":null,"abstract":"Musical instrument recognition is the process of using machine learning or audio signal processing to identify and classify different musical instruments from an audio recording. This capability enables more precise analysis of musical pieces, aiding in tasks like transcription, music recommendation, and automated composition. The challenges include (1) recognition models not being accurate enough, (2) the need to retrain the entire model when a new instrument is added, and (3) differences in audio formats that prevent direct usage. To address these challenges, this article introduces MusicTalk, a microservice based musical instrument (MI) detection system, with several key contributions. Firstly, MusicTalk introduces a novel patchout mechanism named Brightness Characteristic Based Patchout for the ViT algorithm, which enhances MI detection accuracy compared to existing solutions. Secondly, MusicTalk integrates individual MI detectors as microservices, facilitating efficient interaction with other microservices. Thirdly, MusicTalk incorporates an audio shaper that unifies diverse music open datasets such as Audioset, Openmic-2018, MedleyDB, URMP, and INSTDB. By employing Grad-CAM analysis on Mel-Spectrograms, we elucidate the characteristics of the MI detection model. This analysis allows us to optimize ensemble combinations of ViT with patchout and CNNs within MusicTalk, resulting in high accuracy rates. For instance, the system achieves precision and recall rates of 96.17% and 95.77% respectively for violin detection, which are the highest among previous approaches. An additional advantage of MusicTalk lies in its microservice-driven visualization capabilities. 
By integrating MI detectors as microservices, MusicTalk enables seamless visualization of songs using animated avatars. In a case study featuring “Peter and the Wolf,” we demonstrate that improved MI detection accuracy enhances the visual storytelling impact of music. The overall F1-score improvement of MusicTalk over previous approaches for this song is up to 12%.","PeriodicalId":13205,"journal":{"name":"IEEE Open Journal of the Computer Society","volume":"5 ","pages":"612-623"},"PeriodicalIF":0.0,"publicationDate":"2024-10-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10709650","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142518112","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Polarity Classification of Low Resource Roman Urdu and Movie Reviews Sentiments Using Machine Learning-Based Ensemble Approaches","authors":"Muhammad Ehtisham Hassan;Iffat Maab;Masroor Hussain;Usman Habib;Yutaka Matsuo","doi":"10.1109/OJCS.2024.3476378","DOIUrl":"https://doi.org/10.1109/OJCS.2024.3476378","url":null,"abstract":"The complex linguistic characteristics and limited resources present sentiment analysis in Roman Urdu as a unique challenge, necessitating the development of accurate NLP models. In this study, we investigate the performance of prominent ensemble methods on two diverse datasets of UCL and IMDB movie reviews with Roman Urdu and English dialects, respectively. We perform a comparative examination to assess the effectiveness of ensemble techniques including stacking, bagging, random subspace, and boosting, optimized through grid search. The ensemble techniques employ four base learners (Support Vector Machine, Random Forest, Logistic Regression, and Naive Bayes) for sentiment classification. The experiment analysis focuses on different N-gram feature sets (unigrams, bigrams, and trigrams), Chi-square feature selection, and text representation schemes (Bag of Words and TF-IDF). Our empirical findings underscore the superiority of stacking across both datasets, achieving high accuracies and F1-scores: 80.30% and 81.76% on the UCL dataset, and 90.92% and 91.12% on the IMDB datasets, respectively. 
The proposed approach has significant performance compared to baseline approaches on the relevant tasks and improves the accuracy up to 7% on the UCL dataset.","PeriodicalId":13205,"journal":{"name":"IEEE Open Journal of the Computer Society","volume":"5 ","pages":"599-611"},"PeriodicalIF":0.0,"publicationDate":"2024-10-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10707202","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142517837","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Verifiable Random Function Schemes Based on SM2 Digital Signature Algorithm and its Applications for Committee Elections","authors":"Yongxin Zhang;Jiacheng Yang;Hong Lei;Zijian Bao;Ning Lu;Wenbo Shi;Bangdao Chen","doi":"10.1109/OJCS.2024.3463649","DOIUrl":"https://doi.org/10.1109/OJCS.2024.3463649","url":null,"abstract":"A verifiable random function (VRF) is a pseudorandom function that enables source verification. By providing a public verification key and accompanying proof with the output, all parties can verify the correctness of the output without interaction. VRF has gained widespread adoption in blockchain applications, including Algorand, Ouroboros, and ChainLink. This article introduces SM2VRF, the first VRF based on the Chinese standard SM2 cryptographic algorithm, and extends it to a batch construction called SM2VRF-B for efficient verification of multiple sources. We showcase the applicability of SM2VRF in an electronic random committee election scenario, where the blockchain is utilized for storing candidate parameters and votes. By employing the Hamming distance, our scheme eliminates the risk of election failure. We provide a security proof for the proposed scheme, followed by an evaluation of the performance of both SM2VRF and SM2VRF-B. 
We implement our committee election scheme with Ethereum to assess the feasibility and efficiency.","PeriodicalId":13205,"journal":{"name":"IEEE Open Journal of the Computer Society","volume":"5 ","pages":"480-490"},"PeriodicalIF":0.0,"publicationDate":"2024-09-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10699362","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142377066","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Scalability and Security of Blockchain-Empowered Metaverse: A Survey","authors":"Huawei Huang;Zhaokang Yin;Qinglin Yang;Taotao Li;Xiaofei Luo;Lu Zhou;Zibin Zheng","doi":"10.1109/OJCS.2024.3468445","DOIUrl":"https://doi.org/10.1109/OJCS.2024.3468445","url":null,"abstract":"Metaverse brings unlimited space and tremendous potential since it is an integrated application of multiple fundamental technologies such as artificial intelligence, blockchain, networking, Internet of Things, and interactivity. During those building blocks of metaverse, blockchain is a type of technology operated by a group of individual participants and known for its immutability feature. The massive adoption of blockchain has been severely prevented by various security and scalability issues in blockchain-based applications due to the inherent characteristics of this technology. To accelerate the massive adoption of blockchain, many previous studies have been carried out to address the security and scalability issues. This article reviews blockchain-related publications collected from four major security conferences (i.e., NDSS, CCS, S&P, and USENIX Security) published in the past three years. Through this overview, we disclose the security and scalability issues of mainstream blockchains such as Bitcoin and Ethereum. 
Our study aims to help researchers better understand the bottleneck of blockchain-empowered metaverse, and how to address user requirements for security and scalability from the perspective of blockchains.","PeriodicalId":13205,"journal":{"name":"IEEE Open Journal of the Computer Society","volume":"5 ","pages":"648-659"},"PeriodicalIF":0.0,"publicationDate":"2024-09-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10695094","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142565541","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Photogenic Guided Image-to-Image Translation With Single Encoder","authors":"Rina Oh;T. Gonsalves","doi":"10.1109/OJCS.2024.3462477","DOIUrl":"https://doi.org/10.1109/OJCS.2024.3462477","url":null,"abstract":"Image-to-image translation involves combining content and style from different images to generate new images. This technology is particularly valuable for exploring artistic aspects, such as how artists from different eras would depict scenes. Deep learning models are ideal for achieving these artistic styles. This study introduces an unpaired image-to-image translation architecture that extracts style features directly from input style images, without requiring a special encoder. Instead, the model uses a single encoder for the content image. To process the spatial features of the content image and the artistic features of the style image, a new normalization function called Direct Adaptive Instance Normalization with Pooling is developed. This function extracts style images more effectively, reducing the computational costs compared to existing guided image-to-image translation models. Additionally, we employed a Vision Transformer (ViT) in the Discriminator to analyze entire spatial features. The new architecture, named Single-Stream Image-to-Image Translation (SSIT), was tested on various tasks, including seasonal translation, weather-based environment transformation, and photo-to-art conversion. The proposed model successfully reflected the design information of the style images, particularly in translating photos to artworks, where it faithfully reproduced color characteristics. 
Moreover, the model consistently outperformed state-of-the-art translation models in each experiment, as confirmed by Fréchet Inception Distance (FID) and Kernel Inception Distance (KID) scores.","PeriodicalId":13205,"journal":{"name":"IEEE Open Journal of the Computer Society","volume":"5 ","pages":"624-635"},"PeriodicalIF":0.0,"publicationDate":"2024-09-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10694773","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142517855","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Enhancing Fault Tolerance in High-Performance Computing: A Real Hardware Case Study on a RISC-V Vector Processing Unit","authors":"Marcello Barbirotta;Francesco Minervini;Carlos Rojas Morales;Adrian Cristal;Osman Unsal;Mauro Olivieri","doi":"10.1109/OJCS.2024.3468895","DOIUrl":"https://doi.org/10.1109/OJCS.2024.3468895","url":null,"abstract":"High-Performance Computing (HPC) systems are designed for large-scale processing and complex dataset analysis leveraging scalability, efficiency, and parallelism, often integrating specialized hardware structures such as Vector Processing Units (VPUs). As these systems have grown in complexity and scale, their vulnerability to errors and failures has become an important and complex issue in the HPC world. Our research addresses this challenge by exploring and implementing advanced fault tolerance techniques inside the Vitruvius+ architecture, a partial out-of-order Vector Processing Unit. To the best of our knowledge, this is the first full RTL-level implementation of instruction replication in an HPC-class vector processor for reliability. 
Specifically, we investigate the integration and interaction of redundancy mechanisms inside the most sensitive architectural units, obtaining a reduction of 75% in non-silent faults causing system failure, proven by an extensive fault injection simulation campaign, with a hardware overhead of only 7.5% and a negligible variation in clock frequency.","PeriodicalId":13205,"journal":{"name":"IEEE Open Journal of the Computer Society","volume":"5 ","pages":"553-565"},"PeriodicalIF":0.0,"publicationDate":"2024-09-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10694791","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142397274","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Removing Neurons From Deep Neural Networks Trained With Tabular Data","authors":"Antti Klemetti;Mikko Raatikainen;Juhani Kivimäki;Lalli Myllyaho;Jukka K. Nurminen","doi":"10.1109/OJCS.2024.3467182","DOIUrl":"https://doi.org/10.1109/OJCS.2024.3467182","url":null,"abstract":"Deep neural networks bear substantial cloud computational loads and often surpass client devices' capabilities. Research has concentrated on reducing the inference burden of convolutional neural networks processing images. Unstructured pruning, which leads to sparse matrices requiring specialized hardware, has been extensively studied. However, neural networks trained with tabular data and structured pruning, which produces dense matrices handled by standard hardware, are less explored. We compare two approaches: 1) Removing neurons followed by training from scratch, and 2) Structured pruning followed by fine-tuning through additional training over a limited number of epochs. We evaluate these approaches using three models of varying sizes (1.5, 9.2, and 118.7 million parameters) from Kaggle-winning neural networks trained with tabular data. Approach 1 consistently outperformed Approach 2 in predictive performance. The models from Approach 1 had 52%, 8%, and 12% fewer parameters than the original models, with latency reductions of 18%, 5%, and 5%, respectively. Approach 2 required at least one epoch of fine-tuning for recovering predictive performance, with further fine-tuning offering diminishing returns. Approach 1 yields lighter models for retraining in the presence of concept drift and avoids shifting computational load from inference to training, which is inherent in Approach 2. However, Approach 2 can be used to pinpoint the layers that have the least impact on the model's predictive performance when neurons are removed. 
We found that the feed-forward component of the transformer architecture used in large language models is a promising target for neuron removal.","PeriodicalId":13205,"journal":{"name":"IEEE Open Journal of the Computer Society","volume":"5 ","pages":"542-552"},"PeriodicalIF":0.0,"publicationDate":"2024-09-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10693557","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142397170","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Privacy Preserving Machine Learning With Federated Personalized Learning in Artificially Generated Environment","authors":"Md. Tanzib Hosain;Mushfiqur Rahman Abir;Md. Yeasin Rahat;M. F. Mridha;Saddam Hossain Mukta","doi":"10.1109/OJCS.2024.3466859","DOIUrl":"https://doi.org/10.1109/OJCS.2024.3466859","url":null,"abstract":"The widespread adoption of Privacy Preserving Machine Learning (PPML) with Federated Personalized Learning (FPL) has been driven by significant advances in intelligent systems research. This progress has raised concerns about data privacy in the artificially generated environment, leading to growing awareness of the need for privacy-preserving solutions. There has been a seismic shift in interest towards Federated Personalized Learning (FPL), which is the leading paradigm for training Machine Learning (ML) models on decentralized data silos while maintaining data privacy. This research article presents a comprehensive analysis of a cutting-edge approach to personalize ML models while preserving privacy, achieved through the innovative framework of Privacy Preserving Machine Learning with Federated Personalized Learning (PPMLFPL). Regarding the increasing concerns about data privacy in virtual environments, this study evaluated the effectiveness of PPMLFPL in addressing the critical balance between personalized model refinement and maintaining the confidentiality of individual user data. 
According to our results based on various effectiveness metrics, the use of the Adaptive Personalized Cross-Silo Federated Learning with Homomorphic Encryption (APPLE+HE) algorithm for privacy-preserving machine learning tasks in federated personalized learning settings within the artificially generated environment is strongly recommended, obtaining an accuracy of 99.34%.","PeriodicalId":13205,"journal":{"name":"IEEE Open Journal of the Computer Society","volume":"5 ","pages":"694-704"},"PeriodicalIF":0.0,"publicationDate":"2024-09-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10691662","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142672017","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Human-Centered Kinematics Design Optimization of Upper Limb Rehabilitation Exoskeleton Based on Configuration Manifold","authors":"Shuo Pei;Jiajia Wang;Yunlong Yang;Anyang Dong;Bingqi Guo;Junlong Guo;Yufeng Yao","doi":"10.1109/OJCS.2024.3465661","DOIUrl":"https://doi.org/10.1109/OJCS.2024.3465661","url":null,"abstract":"Shoulder repetitive training is of paramount importance for rehabilitation of stroke patients with hemiplegia. This article investigated kinematic structural optimization of an upper limb rehabilitation exoskeleton's shoulder structure, aiming to cover the range of motion (ROM) of human shoulder, achieve sufficient dexterity, obtain a compact structure, and avoid collisions with the user within the workspace. Based on the concept of configuration manifold, configuration parameters and joint angle parameters were fused, and parameter optimization was transformed into a searching problem in high-dimensional configuration space. Geometric constraints between human and exoskeleton were described parametrically. Upper limb movements were mapped to the exoskeleton's configuration space to calculate spatial vectors of joints, and determine whether vectors satisfy constraints. The formulated multi-objective optimisation problem was computed by multi-objective particle swarm optimization (MOPSO) algorithm to determine the shoulder configuration parameters. Experimental results demonstrate that functional rehabilitation exoskeleton (FREE) exhibits a wide ROM, excellent dexterity, and can assist users in completing most activities of daily life (ADLs). 
The design framework proposed in this article can help designers determining optimal exoskeleton configurations through formulated constraints.","PeriodicalId":13205,"journal":{"name":"IEEE Open Journal of the Computer Society","volume":"6 ","pages":"282-293"},"PeriodicalIF":0.0,"publicationDate":"2024-09-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10685137","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143422836","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"AE-YOLOv5 for Detection of Power Line Insulator Defects","authors":"Wei Shen;Ming Fang;Yuxia Wang;Jiafeng Xiao;Huangqun Chen;Weifeng Zhang;Xi Li","doi":"10.1109/OJCS.2024.3465430","DOIUrl":"https://doi.org/10.1109/OJCS.2024.3465430","url":null,"abstract":"The power transmission network, which delivers power energy from generator to customers, plays an important role in the power grid. Insulator is a basic component in the power transmission network. Its defects may lead to the paralysis of the entire transmission network, resulting in serious electricity accidents. Therefore, how to use artificial intelligence and other emerging technologies to realize automatic detection of power line insulator defects has become an urgent problem to be solved. To accurately detect insulator defects in complex environment, this article proposes Attention Enhanced YOLOv5 (AE-YOLOv5) by inserting visual attention modules into original YOLOv5 model. In particular, we design a Channel-Spatial Attention module and plug it into the backbone of YOLOv5 to enhance its representation learning ability. Furthermore, a Multi-scale Attention module is also proposed to enhance the Feature Pyramid Network (FPN). To validate the efficacy of our proposed model, we conducted training and testing on a dataset collected from real-world scenarios. 
The experimental results demonstrate that our model can effectively and accurately detect defects of power line insulators in real-time.","PeriodicalId":13205,"journal":{"name":"IEEE Open Journal of the Computer Society","volume":"5 ","pages":"468-479"},"PeriodicalIF":0.0,"publicationDate":"2024-09-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10684881","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142368634","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}