{"title":"A cooperative opportunistic spectrum access scheme in cognitive cellular networks","authors":"Lei Zhang","doi":"10.1145/3548636.3548653","DOIUrl":"https://doi.org/10.1145/3548636.3548653","url":null,"abstract":"In this paper, to decrease the energy consumption, we propose a cooperative opportunistic spectrum access (OSA) scheme in cognitive cellular networks to reduce the number of the spectrum handoff. A three dimension continuous-time Markov chain (CTMC) is employed to model the OSA schemes in the cognitive cellular networks, and the grade of service (GoS) performance metrics is derived. Simulations are conducted to verify the theoretical analysis, and the results show the GoS improvements of the proposed cooperative OSA scheme.","PeriodicalId":384376,"journal":{"name":"Proceedings of the 4th International Conference on Information Technology and Computer Communications","volume":"32 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-06-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114632733","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Wav2sv: End-to-end Speaker Embeddings Learning from Raw Waveforms based on Metric Learning for Speaker Verification","authors":"Zhiqing Chen, Yifan Pan, Haoran Zhang, Yuesheng Zhu","doi":"10.1145/3548636.3548644","DOIUrl":"https://doi.org/10.1145/3548636.3548644","url":null,"abstract":"With the application of deep learning in the field of speaker recognition, the performance of speaker recognition systems has been greatly improved. However, most current work still relies on handcrafted features, existing raw waveform-based systems fail to utilize the multi-scale feature and multi-level information efficiently. Besides, the speaker embedding generated by speaker identification is used to complete speaker verification through similarity discrimination, resulting in a domain mismatch problem. To address these problems, we propose an end-to-end system called Wav2sv, which uses a stack of strided convolution layers as a feature encoder, SE-Res2Blocks and dense connection between each frame layer as the frame aggregator; and obtain the speaker embedding with a metric learning objective. This new end-to-end system can automatically learn the most suitable speaker embedding from raw waveform based on metric learning for speaker verification. Our simulation results on VoxCeleb1 indicate that the proposed approach achieves an EER of 4.75%, which is 18% superior to the Wav2spk baseline. 
Our work demonstrates the great potential of extracting speaker embeddings from raw waveforms.","PeriodicalId":384376,"journal":{"name":"Proceedings of the 4th International Conference on Information Technology and Computer Communications","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-06-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134316856","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"EP-BERTGCN: A Simple but Effective Power Equipment Fault Recognition Method","authors":"Mingcong Lu, Yusong Zhang, Quan Zheng, Zhenyuan Ma, Liqing Liu, Yongping Xiong, Ruifan Li","doi":"10.1145/3548636.3548646","DOIUrl":"https://doi.org/10.1145/3548636.3548646","url":null,"abstract":"With the advancement of China’s State Grid in recent years, text-based power equipment fault recognition has become an essential tool for power equipment maintenance. The task suffers from the domain gap that exists between the electric power domain and the general natural language processing domain. To improve the recognition performance, we proposed a method that combines pre-trained Bidirectional Encoder Representations from Transformers (BERT) and Graph Convolutional Network (GCN), i.e., Electric Power -BERTGCN. Our EP-BERTGCN first builds the graph among documents and words within documents based on pre-trained BERT. Then, the two softmax outputs with pre-trained BERT and GCNs are combined for final classification results. Extensive experiments show that our method outperforms previous baselines.","PeriodicalId":384376,"journal":{"name":"Proceedings of the 4th International Conference on Information Technology and Computer Communications","volume":"25 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-06-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132884740","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Removing or Avoiding the Camera Focus Light Appearing on Glossy or Bouncing Surfaces by Light Guiding and Image Warping","authors":"Lianly Rompis","doi":"10.1145/3548636.3548640","DOIUrl":"https://doi.org/10.1145/3548636.3548640","url":null,"abstract":"Digital picture or image becomes more and more important in people's daily life, such as for administration, documentation, active communications, and learning processes. People also demand the exchange of information through digital applications and social media. Taking picture or photo of a document, book, certificate, or thing is usually done by the help of digital scanner, but not all these stuffs can be converted into digital with a scanner. In specific condition people need real image transfer and will take photo of the real view directly using their own digital camera or smartphone camera. From author experience and observations, when taking photos of a document or book or certificate or thing normally with plastic cover or glass surface or bouncing surface, we clearly got an annoying light came from the camera focus light or flash that usually creates noise and messy image. This research was conducted based on the idea to remove or avoid getting the unwanted or disturbing light while taking photos of glossy or bouncing surfaces. A procedure to get a clear and smooth image instead of noise and broken image was derived and implemented by author using light guiding and image warping. These two methods bring accurate images that are necessary for communication needs and facilitate better future works without compromising security policies and people's rights. As the research methodology, author performed literature study, observation, analysis, and implementation. 
The result of this research offers a good procedure and method that can help people getting clear photos or pictures of a glossy or bouncing surface, advance research to be implemented related to camera technology, and best comprehension to improve photography techniques for future developments.","PeriodicalId":384376,"journal":{"name":"Proceedings of the 4th International Conference on Information Technology and Computer Communications","volume":"18 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-06-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132327735","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Improvement in Land Cover Classification Using Multitemporal Sentinel-1 and Sentinel-2 Satellite Imagery","authors":"Limei Wang, Guowang Jin, X. Xiong","doi":"10.1145/3548636.3548639","DOIUrl":"https://doi.org/10.1145/3548636.3548639","url":null,"abstract":"For improving the performance of multitemporal Sentinel-1 (S1) SAR and Sentinel-2 (S2) optical imagery for land cover classification, a framework based on Rotating Kernel Transformation (RKT) denoising algorithm and the stratified sampling method based on the Crop Data Layer (CDL) is proposed. Random Forest classifications based on different denoising algorithms and sampling methods are carried out to compare their accuracy and applicability for land cover classification. The results show that the RKT algorithm and the stratified sampling method can significantly improve the classification accuracy. The classification accuracy by S1 data alone without denoising (overall accuracy: 0.873, Kappa: 0.796) is significantly lower than that of S2 (overall accuracy: 0.979, Kappa: 0.970) resulting from effects of serious salt-and-pepper noise. After RKT filtering, the speckle noise of the S1 classification result is significantly reduced and the accuracy is significantly improved (overall accuracy: 0.944, Kappa: 0.912). RKT filter outperforms the Lee and Median filters in improving the classification accuracy of SAR imagery. Feature-level fusion of S1 and S2 achieves the highest classification accuracy (overall accuracy: 0.983, Kappa: 0.972) which is significantly higher than that of S1 and slightly higher than that of S2 data alone. It proves that the fusion of the optical and SAR data can weaken the speckle noises on classification maps and improve the classification accuracy. 
The stratified sampling method applied in this study significantly improves the classification accuracy of each experimental group, with the overall accuracy increasing by about 10% and the Kappa coefficient increasing by more than 15% on average.","PeriodicalId":384376,"journal":{"name":"Proceedings of the 4th International Conference on Information Technology and Computer Communications","volume":"33 5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-06-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127884484","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Credit card fraud detection based on self-paced ensemble neural network","authors":"Wei Zhou, Xiaorui Xue, Yi-zhao Xu","doi":"10.1145/3548636.3548650","DOIUrl":"https://doi.org/10.1145/3548636.3548650","url":null,"abstract":"Along with the significant increase in the number of credit cards, the number of credit card frauds worldwide is increasing day by day. At the same time, the development of Internet technology has led to the emergence of new fraud methods. The traditional credit card fraud detection methods can no longer meet the needs of the current credit card financial industry development. Identifying fraudulent credit card transactions effectively, quickly and accurately has become a major concern for banks. Methods combining expert rules and statistical analysis, decision tree methods, anomaly detection methods, and feature engineering methods are used in credit card fraud detection research. Among the many methods, deep learning is a new artificial intelligence method that has developed rapidly in recent years and is widely used in credit card fraud detection research. This paper uses a self-paced ensemble neural network (SP-ENN) model to learn credit card fraud transactions by dividing the datasets with different hardness, then identifying these transactions by neural networks, and finally performing a comprehensive evaluation. 
It was found that this model significantly outperforms other up-sampling or integration models in detecting credit card fraud data.","PeriodicalId":384376,"journal":{"name":"Proceedings of the 4th International Conference on Information Technology and Computer Communications","volume":"49 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-06-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134234231","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Improve the automatic transliteration from Nôm scripts into Vietnamese National scripts by integrating Sino – Vietnamese knowledge into Statistical Machine Translation","authors":"Lam H. Thai, Long H. B. Nguyen, Dinh Dien","doi":"10.1145/3548636.3548647","DOIUrl":"https://doi.org/10.1145/3548636.3548647","url":null,"abstract":"Nôm scripts (chữ Nôm) are Vietnamese ancient scripts that were popularly used in Vietnam from the 10th century to the early 20th century. Nowadays, some automatic transliteration from Nôm scripts (NS) into Vietnamese National scripts (chữ Quốc ngữ - QN) systems were developed to help modern Vietnamese people acquire many valuable lessons and knowledge from previous generations through preserving the Sino-Nom heritage. However, these systems have still not performed well in many domains, except for Literature. Our research continues to employ Statistical Machine Translation (SMT) but expands the dataset up to 10 domains. Furthermore, we also focus on analyzing the impact of Chinese scripts with Sino-Vietnamese readings on Nôm script – National script and then integrating this knowledge into our transliteration model. 
Our experimental results show that our approach helps the model reach 94.04 BLEU score, dramatically increasing by 8.63 BLEU score in the genealogical domain and 0.31 BLEU score in the general model.","PeriodicalId":384376,"journal":{"name":"Proceedings of the 4th International Conference on Information Technology and Computer Communications","volume":"52 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-06-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117086279","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Surface Defect Detection method for vacuum gauges based on VAG-YOLO","authors":"Qikai Cai, C. Gao, Ping Zhang, Yuanguo Ren","doi":"10.1145/3548636.3548638","DOIUrl":"https://doi.org/10.1145/3548636.3548638","url":null,"abstract":"Vacuum gauges are the key equipment in vacuum inspection equipment, and the surface defects of vacuum gauges will directly affect the inspection performance and service life of vacuum inspection equipment. At present, the surface defect detection of vacuum gauges mainly relies on the visual inspection of workers, which is less efficient and accurate, and the workers are prone to misjudge the products due to subjective factors. To solve the problems of traditional manual inspection, this paper proposes an improved vacuum gauge surface defect detection method based on the YOLOv5s model called VAG-YOLO. we add a multi-scale adaptive fusion structure (MAF) to the YOLOv5s model to make full use of adaptive fusion of features at different scales to improve the detection performance of the network and increase the defect detection accuracy; Meanwhile, the transformer bottleneck structure (BoT) is introduced to combine multi head Self- Attention (MHSA) with convolutional neural network (CNN) to achieve the effect of reducing the number of network parameters and improving the detection speed. 
The experimental results show that the average detection accuracy of the VGA-YOLO model is 83.4%, which is higher and faster than the detection accuracy of various other algorithms, and can detect vacuum gauge surface defects in real time.","PeriodicalId":384376,"journal":{"name":"Proceedings of the 4th International Conference on Information Technology and Computer Communications","volume":"206 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-06-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132102376","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Multi-material Reconstruction Method Based On Deep Prior of Spectral Computed Tomography","authors":"Xiao-Kun Yu, Ailong Cai, Lei Li, Bin Yan","doi":"10.1145/3548636.3548642","DOIUrl":"https://doi.org/10.1145/3548636.3548642","url":null,"abstract":"Spectral computed tomography (Spectral CT) has attracted more and more attention because of its ability of material discrimination. However, as the number of materials increases, it becomes more difficult to decompose the material according to the polychromatic projection. This paper presents a direct multi-material reconstruction method, in which a deep convolutional neural network (CNN)-based prior is incorporated into the optimization model. The efficient iterative algorithm is designed under the framework of the alternating direction method of multipliers (ADMM). The numerical experiments further validate the superiority of the proposed method in multi-material reconstruction and noise suppression.","PeriodicalId":384376,"journal":{"name":"Proceedings of the 4th International Conference on Information Technology and Computer Communications","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-06-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129554152","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Lip to Speech Synthesis Based on Speaker Characteristics Feature Fusion","authors":"Rui Zeng, Shengwu Xiong","doi":"10.1145/3548636.3548648","DOIUrl":"https://doi.org/10.1145/3548636.3548648","url":null,"abstract":"Lip to speech synthesis (Lip2Speech) is a technology that reconstructs speech from the silent talking face video. With the development of deep learning, achievements have been made in this field. Due to the silent talking face video does not contain the speaker characteristics information, reconstructing speech directly from the silent talking video will lose the characteristic information of the speaker, thus reducing the quality of the reconstructed speech. In this paper we proposed a new framework using the pre-trained speaker encoder network which extract the speaker characteristics information. More specially: (1) The pretrained speaker encoder network generates a fixed-dimensional embedding vector from a few seconds of given speaker's speech, which contains the speaker characteristics information, (2) The content encoder uses a stack of 3D convolutions to extracts the content information of the video, (3) a sequence-to-sequence synthesis network based on Tacotron2 that generates Mel-spectrogram from silent video, conditioned on the speaker's identity embedding. 
Experimental results show that, using the pretrained speaker encoder can improved the speech reconstruction quality.","PeriodicalId":384376,"journal":{"name":"Proceedings of the 4th International Conference on Information Technology and Computer Communications","volume":"71 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-06-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121714948","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}