{"title":"A cooperative opportunistic spectrum access scheme in cognitive cellular networks","authors":"Lei Zhang","doi":"10.1145/3548636.3548653","DOIUrl":"https://doi.org/10.1145/3548636.3548653","url":null,"abstract":"In this paper, to decrease the energy consumption, we propose a cooperative opportunistic spectrum access (OSA) scheme in cognitive cellular networks to reduce the number of the spectrum handoff. A three dimension continuous-time Markov chain (CTMC) is employed to model the OSA schemes in the cognitive cellular networks, and the grade of service (GoS) performance metrics is derived. Simulations are conducted to verify the theoretical analysis, and the results show the GoS improvements of the proposed cooperative OSA scheme.","PeriodicalId":384376,"journal":{"name":"Proceedings of the 4th International Conference on Information Technology and Computer Communications","volume":"32 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-06-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114632733","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Wav2sv: End-to-end Speaker Embeddings Learning from Raw Waveforms based on Metric Learning for Speaker Verification","authors":"Zhiqing Chen, Yifan Pan, Haoran Zhang, Yuesheng Zhu","doi":"10.1145/3548636.3548644","DOIUrl":"https://doi.org/10.1145/3548636.3548644","url":null,"abstract":"With the application of deep learning in the field of speaker recognition, the performance of speaker recognition systems has been greatly improved. However, most current work still relies on handcrafted features, existing raw waveform-based systems fail to utilize the multi-scale feature and multi-level information efficiently. Besides, the speaker embedding generated by speaker identification is used to complete speaker verification through similarity discrimination, resulting in a domain mismatch problem. To address these problems, we propose an end-to-end system called Wav2sv, which uses a stack of strided convolution layers as a feature encoder, SE-Res2Blocks and dense connection between each frame layer as the frame aggregator; and obtain the speaker embedding with a metric learning objective. This new end-to-end system can automatically learn the most suitable speaker embedding from raw waveform based on metric learning for speaker verification. Our simulation results on VoxCeleb1 indicate that the proposed approach achieves an EER of 4.75%, which is 18% superior to the Wav2spk baseline. 
Our work demonstrates the great potential of extracting speaker embeddings from raw waveforms.","PeriodicalId":384376,"journal":{"name":"Proceedings of the 4th International Conference on Information Technology and Computer Communications","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-06-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134316856","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"EP-BERTGCN: A Simple but Effective Power Equipment Fault Recognition Method","authors":"Mingcong Lu, Yusong Zhang, Quan Zheng, Zhenyuan Ma, Liqing Liu, Yongping Xiong, Ruifan Li","doi":"10.1145/3548636.3548646","DOIUrl":"https://doi.org/10.1145/3548636.3548646","url":null,"abstract":"With the advancement of China’s State Grid in recent years, text-based power equipment fault recognition has become an essential tool for power equipment maintenance. The task suffers from the domain gap that exists between the electric power domain and the general natural language processing domain. To improve the recognition performance, we proposed a method that combines pre-trained Bidirectional Encoder Representations from Transformers (BERT) and Graph Convolutional Network (GCN), i.e., Electric Power -BERTGCN. Our EP-BERTGCN first builds the graph among documents and words within documents based on pre-trained BERT. Then, the two softmax outputs with pre-trained BERT and GCNs are combined for final classification results. Extensive experiments show that our method outperforms previous baselines.","PeriodicalId":384376,"journal":{"name":"Proceedings of the 4th International Conference on Information Technology and Computer Communications","volume":"25 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-06-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132884740","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Removing or Avoiding the Camera Focus Light Appearing on Glossy or Bouncing Surfaces by Light Guiding and Image Warping","authors":"Lianly Rompis","doi":"10.1145/3548636.3548640","DOIUrl":"https://doi.org/10.1145/3548636.3548640","url":null,"abstract":"Digital picture or image becomes more and more important in people's daily life, such as for administration, documentation, active communications, and learning processes. People also demand the exchange of information through digital applications and social media. Taking picture or photo of a document, book, certificate, or thing is usually done by the help of digital scanner, but not all these stuffs can be converted into digital with a scanner. In specific condition people need real image transfer and will take photo of the real view directly using their own digital camera or smartphone camera. From author experience and observations, when taking photos of a document or book or certificate or thing normally with plastic cover or glass surface or bouncing surface, we clearly got an annoying light came from the camera focus light or flash that usually creates noise and messy image. This research was conducted based on the idea to remove or avoid getting the unwanted or disturbing light while taking photos of glossy or bouncing surfaces. A procedure to get a clear and smooth image instead of noise and broken image was derived and implemented by author using light guiding and image warping. These two methods bring accurate images that are necessary for communication needs and facilitate better future works without compromising security policies and people's rights. As the research methodology, author performed literature study, observation, analysis, and implementation. 
The result of this research offers a good procedure and method that can help people getting clear photos or pictures of a glossy or bouncing surface, advance research to be implemented related to camera technology, and best comprehension to improve photography techniques for future developments.","PeriodicalId":384376,"journal":{"name":"Proceedings of the 4th International Conference on Information Technology and Computer Communications","volume":"18 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-06-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132327735","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Improvement in Land Cover Classification Using Multitemporal Sentinel-1 and Sentinel-2 Satellite Imagery","authors":"Limei Wang, Guowang Jin, X. Xiong","doi":"10.1145/3548636.3548639","DOIUrl":"https://doi.org/10.1145/3548636.3548639","url":null,"abstract":"For improving the performance of multitemporal Sentinel-1 (S1) SAR and Sentinel-2 (S2) optical imagery for land cover classification, a framework based on Rotating Kernel Transformation (RKT) denoising algorithm and the stratified sampling method based on the Crop Data Layer (CDL) is proposed. Random Forest classifications based on different denoising algorithms and sampling methods are carried out to compare their accuracy and applicability for land cover classification. The results show that the RKT algorithm and the stratified sampling method can significantly improve the classification accuracy. The classification accuracy by S1 data alone without denoising (overall accuracy: 0.873, Kappa: 0.796) is significantly lower than that of S2 (overall accuracy: 0.979, Kappa: 0.970) resulting from effects of serious salt-and-pepper noise. After RKT filtering, the speckle noise of the S1 classification result is significantly reduced and the accuracy is significantly improved (overall accuracy: 0.944, Kappa: 0.912). RKT filter outperforms the Lee and Median filters in improving the classification accuracy of SAR imagery. Feature-level fusion of S1 and S2 achieves the highest classification accuracy (overall accuracy: 0.983, Kappa: 0.972) which is significantly higher than that of S1 and slightly higher than that of S2 data alone. It proves that the fusion of the optical and SAR data can weaken the speckle noises on classification maps and improve the classification accuracy. 
The stratified sampling method applied in this study significantly improves the classification accuracy of each experimental group, with the overall accuracy increasing by about 10% and the Kappa coefficient increasing by more than 15% on average.","PeriodicalId":384376,"journal":{"name":"Proceedings of the 4th International Conference on Information Technology and Computer Communications","volume":"33 5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-06-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127884484","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Credit card fraud detection based on self-paced ensemble neural network","authors":"Wei Zhou, Xiaorui Xue, Yi-zhao Xu","doi":"10.1145/3548636.3548650","DOIUrl":"https://doi.org/10.1145/3548636.3548650","url":null,"abstract":"Along with the significant increase in the number of credit cards, the number of credit card frauds worldwide is increasing day by day. At the same time, the development of Internet technology has led to the emergence of new fraud methods. The traditional credit card fraud detection methods can no longer meet the needs of the current credit card financial industry development. Identifying fraudulent credit card transactions effectively, quickly and accurately has become a major concern for banks. Methods combining expert rules and statistical analysis, decision tree methods, anomaly detection methods, and feature engineering methods are used in credit card fraud detection research. Among the many methods, deep learning is a new artificial intelligence method that has developed rapidly in recent years and is widely used in credit card fraud detection research. This paper uses a self-paced ensemble neural network (SP-ENN) model to learn credit card fraud transactions by dividing the datasets with different hardness, then identifying these transactions by neural networks, and finally performing a comprehensive evaluation. 
It was found that this model significantly outperforms other up-sampling or integration models in detecting credit card fraud data.","PeriodicalId":384376,"journal":{"name":"Proceedings of the 4th International Conference on Information Technology and Computer Communications","volume":"49 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-06-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134234231","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Improve the automatic transliteration from Nôm scripts into Vietnamese National scripts by integrating Sino – Vietnamese knowledge into Statistical Machine Translation","authors":"Lam H. Thai, Long H. B. Nguyen, Dinh Dien","doi":"10.1145/3548636.3548647","DOIUrl":"https://doi.org/10.1145/3548636.3548647","url":null,"abstract":"Nôm scripts (chữ Nôm) are Vietnamese ancient scripts that were popularly used in Vietnam from the 10th century to the early 20th century. Nowadays, some automatic transliteration from Nôm scripts (NS) into Vietnamese National scripts (chữ Quốc ngữ - QN) systems were developed to help modern Vietnamese people acquire many valuable lessons and knowledge from previous generations through preserving the Sino-Nom heritage. However, these systems have still not performed well in many domains, except for Literature. Our research continues to employ Statistical Machine Translation (SMT) but expands the dataset up to 10 domains. Furthermore, we also focus on analyzing the impact of Chinese scripts with Sino-Vietnamese readings on Nôm script – National script and then integrating this knowledge into our transliteration model. 
Our experimental results show that our approach helps the model reach 94.04 BLEU score, dramatically increasing by 8.63 BLEU score in the genealogical domain and 0.31 BLEU score in the general model.","PeriodicalId":384376,"journal":{"name":"Proceedings of the 4th International Conference on Information Technology and Computer Communications","volume":"52 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-06-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117086279","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Surface Defect Detection method for vacuum gauges based on VAG-YOLO","authors":"Qikai Cai, C. Gao, Ping Zhang, Yuanguo Ren","doi":"10.1145/3548636.3548638","DOIUrl":"https://doi.org/10.1145/3548636.3548638","url":null,"abstract":"Vacuum gauges are the key equipment in vacuum inspection equipment, and the surface defects of vacuum gauges will directly affect the inspection performance and service life of vacuum inspection equipment. At present, the surface defect detection of vacuum gauges mainly relies on the visual inspection of workers, which is less efficient and accurate, and the workers are prone to misjudge the products due to subjective factors. To solve the problems of traditional manual inspection, this paper proposes an improved vacuum gauge surface defect detection method based on the YOLOv5s model called VAG-YOLO. we add a multi-scale adaptive fusion structure (MAF) to the YOLOv5s model to make full use of adaptive fusion of features at different scales to improve the detection performance of the network and increase the defect detection accuracy; Meanwhile, the transformer bottleneck structure (BoT) is introduced to combine multi head Self- Attention (MHSA) with convolutional neural network (CNN) to achieve the effect of reducing the number of network parameters and improving the detection speed. 
The experimental results show that the average detection accuracy of the VGA-YOLO model is 83.4%, which is higher and faster than the detection accuracy of various other algorithms, and can detect vacuum gauge surface defects in real time.","PeriodicalId":384376,"journal":{"name":"Proceedings of the 4th International Conference on Information Technology and Computer Communications","volume":"206 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-06-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132102376","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Multi-material Reconstruction Method Based On Deep Prior of Spectral Computed Tomography","authors":"Xiao-Kun Yu, Ailong Cai, Lei Li, Bin Yan","doi":"10.1145/3548636.3548642","DOIUrl":"https://doi.org/10.1145/3548636.3548642","url":null,"abstract":"Spectral computed tomography (Spectral CT) has attracted more and more attention because of its ability of material discrimination. However, as the number of materials increases, it becomes more difficult to decompose the material according to the polychromatic projection. This paper presents a direct multi-material reconstruction method, in which a deep convolutional neural network (CNN)-based prior is incorporated into the optimization model. The efficient iterative algorithm is designed under the framework of the alternating direction method of multipliers (ADMM). The numerical experiments further validate the superiority of the proposed method in multi-material reconstruction and noise suppression.","PeriodicalId":384376,"journal":{"name":"Proceedings of the 4th International Conference on Information Technology and Computer Communications","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-06-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129554152","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Lip to Speech Synthesis Based on Speaker Characteristics Feature Fusion","authors":"Rui Zeng, Shengwu Xiong","doi":"10.1145/3548636.3548648","DOIUrl":"https://doi.org/10.1145/3548636.3548648","url":null,"abstract":"Lip to speech synthesis (Lip2Speech) is a technology that reconstructs speech from the silent talking face video. With the development of deep learning, achievements have been made in this field. Due to the silent talking face video does not contain the speaker characteristics information, reconstructing speech directly from the silent talking video will lose the characteristic information of the speaker, thus reducing the quality of the reconstructed speech. In this paper we proposed a new framework using the pre-trained speaker encoder network which extract the speaker characteristics information. More specially: (1) The pretrained speaker encoder network generates a fixed-dimensional embedding vector from a few seconds of given speaker's speech, which contains the speaker characteristics information, (2) The content encoder uses a stack of 3D convolutions to extracts the content information of the video, (3) a sequence-to-sequence synthesis network based on Tacotron2 that generates Mel-spectrogram from silent video, conditioned on the speaker's identity embedding. 
Experimental results show that, using the pretrained speaker encoder can improved the speech reconstruction quality.","PeriodicalId":384376,"journal":{"name":"Proceedings of the 4th International Conference on Information Technology and Computer Communications","volume":"71 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-06-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121714948","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}