International Conference on Algorithm, Imaging Processing and Machine Vision (AIPMV 2023): Latest Publications

Enhancing audio perception in augmented reality: a dynamic vocal information processing framework
Danqing Zhao, Shuyi Xin, Lechen Liu, Yihan Sun, Anqi Du
Abstract: The development of the Metaverse has sparked widespread interest among researchers, and many technologies have correspondingly been developed to improve the sense of reality in the Metaverse. In particular, Extended Reality (XR), an indispensable technology and research direction in Metaverse studies, aims to provide a seamless, immersive transition between the virtual world and the real world. However, a capability we currently lack is the ability to simultaneously separate, classify, and locate dynamic human vocal information so as to enhance sound perception in complex noise environments. This article proposes a framework that uses an FCNN for separation, an algebraic model for localization to obtain estimated distances, and an SVM for classification. A dataset is built that simulates distance-related changes with accurate ground-truth labels. The results show that the method can effectively separate, classify, and locate mixed sound data, providing users with comprehensive information about the content, gender, and distance of the speaker in complex sound environments and enhancing their immersive experience and perception. The innovation lies in the combination of three audio processing technologies, and the proposed framework may well inspire future work on related topics.
DOI: 10.1117/12.3014440 | Pages: 129691Z - 129691Z-9 | Published: 2024-01-09 | Citations: 0
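The abstract does not detail the algebraic model used to estimate speaker distance. As a purely illustrative sketch (not the paper's method), a common physically motivated baseline relates received sound level to source distance via the free-field inverse-distance law, under which level drops 6 dB per doubling of distance; all parameter names below are assumptions:

```python
def estimate_distance(level_db, ref_level_db, ref_dist=1.0):
    """Free-field inverse-distance law: a source measured at ref_level_db
    at distance ref_dist is estimated at r = ref_dist * 10**((L_ref - L) / 20)
    when received at level_db. Assumes no reverberation or obstacles."""
    return ref_dist * 10 ** ((ref_level_db - level_db) / 20.0)
```

In practice such a model would be fitted against the simulated distance-labeled dataset the paper describes.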
Research on automatic scoring algorithm for English composition based on machine learning
Hui Li
Abstract: English composition scoring methods based on hand-crafted features struggle to extract deep semantic features, while neural-network-based methods struggle to extract shallow features such as word counts, so each class of method is limited. Building on existing results, this paper proposes an English composition scoring method that combines hand-crafted feature extraction with deep learning. The method uses manually designed features to extract shallow word- and sentence-level features from the composition, draws on existing methods to extract its semantic features, and performs regression over the deep and shallow features to obtain the total score. Experiments use the Pearson index to measure the correlation between the predicted and true total scores under the combined method. Compared with the average results of 0.747 and 0.645 for baseline models such as BiLSTM and RNN, the proposed algorithm improves by 0.068 and 0.17 respectively, demonstrating the effectiveness of the method.
DOI: 10.1117/12.3014482 | Pages: 129690T - 129690T-6 | Published: 2024-01-09 | Citations: 0
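The Pearson index used as the evaluation metric above is straightforward to compute; a minimal sketch (the feature extraction and regression model themselves are not reproduced here):

```python
import numpy as np

def pearson(pred, true):
    """Pearson correlation coefficient between predicted and gold scores."""
    pred = np.asarray(pred, dtype=float)
    true = np.asarray(true, dtype=float)
    pc = pred - pred.mean()          # center both score vectors
    tc = true - true.mean()
    return float((pc @ tc) / (np.linalg.norm(pc) * np.linalg.norm(tc)))
```

A value of 1.0 means the predicted scores are a perfectly increasing linear function of the true scores.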
Research on the simplification of building complex model under multi-factor constraints
Haoyuan Bai, Kelong Yang, Shunhua Liao
Abstract: With the wide application of 3D building cluster models in urban planning, visualization, and other fields, improving rendering efficiency and reducing the computational cost of such models has become an important issue. To address this, this paper proposes a visual perception evaluation model that assesses building weights based on multiple factors to determine the order of building simplification, and weights vertex importance in the classical QEM algorithm to redefine the edge collapse cost, reducing model complexity while maintaining visual quality. Experimental results show that the algorithm significantly reduces rendering time and computational cost while preserving visual quality.
DOI: 10.1117/12.3014388 | Pages: 129691G - 129691G-6 | Published: 2024-01-09 | Citations: 0
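The classical QEM cost the paper modifies can be sketched as follows: each plane contributes a fundamental quadric, a vertex accumulates the quadrics of its adjacent faces, and the collapse cost is the quadric form evaluated at the target position. The importance weight `w` below stands in for the paper's visual-perception weighting, whose exact form is not given in the abstract:

```python
import numpy as np

def plane_quadric(a, b, c, d):
    """Fundamental quadric K_p = p p^T for the plane ax + by + cz + d = 0
    (with (a, b, c) a unit normal), as in Garland & Heckbert's QEM."""
    p = np.array([a, b, c, d], dtype=float)
    return np.outer(p, p)

def collapse_cost(Q, v, w=1.0):
    """Weighted QEM edge-collapse cost: w * vbar^T Q vbar, where vbar is the
    homogeneous target position and Q the accumulated vertex quadric."""
    vbar = np.append(np.asarray(v, dtype=float), 1.0)
    return float(w * vbar @ Q @ vbar)
```

For a vertex lying on all of its contributing planes the cost is zero, so such vertices are collapsed first.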
Three-dimensional target detection algorithm for dangerous goods in CT security inspection
Jingze He, Yao Guo, Qing Song
Abstract: This paper proposes a 3D dangerous goods detection method based on RetinaNet. The method uses RetinaNet's bidirectional feature pyramid network structure to extract multi-scale features from point cloud data and trains with the focal loss function to achieve fast and accurate detection of dangerous goods. To further improve detection accuracy, a 3D region proposal network (3D RPN) and non-maximum suppression (NMS) are introduced. Experimental results show that the proposed method performs well on a self-built CT dataset, with high accuracy and a low false positive rate, and is suitable for dangerous goods detection tasks in practical scenarios.
DOI: 10.1117/12.3014353 | Pages: 1296902 - 1296902-6 | Published: 2024-01-09 | Citations: 0
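The NMS step named above is the standard greedy algorithm: keep the highest-scoring box, suppress overlapping boxes above an IoU threshold, repeat. A 2D sketch (the paper applies the same idea to 3D boxes):

```python
import numpy as np

def nms(boxes, scores, iou_thr=0.5):
    """Greedy non-maximum suppression on axis-aligned (x1, y1, x2, y2) boxes.
    Returns indices of kept boxes in descending score order."""
    boxes = np.asarray(boxes, dtype=float)
    order = np.asarray(scores, dtype=float).argsort()[::-1]
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(int(i))
        if order.size == 1:
            break
        rest = order[1:]
        # intersection of the kept box with every remaining candidate
        x1 = np.maximum(boxes[i, 0], boxes[rest, 0])
        y1 = np.maximum(boxes[i, 1], boxes[rest, 1])
        x2 = np.minimum(boxes[i, 2], boxes[rest, 2])
        y2 = np.minimum(boxes[i, 3], boxes[rest, 3])
        inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
        area_i = (boxes[i, 2] - boxes[i, 0]) * (boxes[i, 3] - boxes[i, 1])
        area_r = (boxes[rest, 2] - boxes[rest, 0]) * (boxes[rest, 3] - boxes[rest, 1])
        iou = inter / (area_i + area_r - inter)
        order = rest[iou <= iou_thr]   # drop candidates that overlap too much
    return keep
```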
Research on collaborative positioning of intelligent vehicle aided navigation based on computer vision technology
Shun Zhang
Abstract: Because vehicle position information is collected with low accuracy, the error in the positioning stage is relatively large. Collaborative positioning for intelligent vehicle aided navigation based on computer vision technology is therefore proposed. Taking VOF/VOF-S smart cameras as the data acquisition device and accounting for the vehicle's specific running state, acquisition parameters are set accordingly to collect vehicle position information accurately. In the positioning stage, the plane of the wheels is taken as the road plane, and the coordinates of several road ground points collected by the VOF/VOF-S device are integrated to transform the vehicle's position into real-world space. In tests, the positioning error under different driving conditions remains stable within 1.50 m, demonstrating high accuracy.
DOI: 10.1117/12.3014415 | Pages: 129692P - 129692P-5 | Published: 2024-01-09 | Citations: 0
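The transformation from image coordinates to positions on the road plane can be illustrated with a simple pinhole back-projection: intersect the pixel's viewing ray with the ground plane. The camera model and every parameter below are assumptions for illustration, not the paper's calibration:

```python
import numpy as np

def pixel_to_ground(u, v, K, cam_height):
    """Intersect the viewing ray of pixel (u, v) with the road plane.
    Camera frame: x right, y down, z forward; camera at height cam_height
    above a flat road, so the road plane is y = cam_height."""
    d = np.linalg.inv(K) @ np.array([u, v, 1.0])   # ray direction in camera frame
    if d[1] <= 1e-9:        # at or above the horizon: ray never hits the road
        return None
    t = cam_height / d[1]
    p = t * d
    return p[0], p[2]       # lateral offset and forward distance on the road
```

With several such ground points, the vehicle's pose on the road can be estimated by averaging or least-squares fitting, which is one plausible reading of the "integration" step in the abstract.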
Image segmentation of rail surface defects based on fractional order particle swarm optimization 2D-Otsu algorithm
Na Geng, Hu Sheng, Weizhi Sun, Yifeng Wang, Tan Yu, Zihan Liu
Abstract: Under high-density operation and natural environmental influences, abrasion damage appears on the rail surface, affecting train safety and comfort. Rail surface defect detection is therefore an important part of ensuring the safe and efficient operation of the railway system. To determine whether defects exist on the rail surface, a rail surface defect image segmentation method based on the FPSO 2D-Otsu algorithm is proposed. The rail image is denoised and enhanced by adaptive fractional calculus, and then segmented by the FPSO 2D-Otsu algorithm. To verify its accuracy, the proposed algorithm is compared with the PSO 2D-Otsu image segmentation algorithm. Experimental results show that segmentation accuracy on rail images improves from 48.76% with PSO 2D-Otsu to 83.59% with FPSO 2D-Otsu.
DOI: 10.1117/12.3014444 | Pages: 129690A - 129690A-4 | Published: 2024-01-09 | Citations: 0
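The 2D-Otsu criterion extends the classic 1D Otsu threshold (which the particle swarm then searches more efficiently). For illustration, the 1D variant picks the gray level maximizing between-class variance:

```python
import numpy as np

def otsu_threshold(gray):
    """Classic 1-D Otsu: choose the threshold t maximizing the between-class
    variance w0 * w1 * (mu0 - mu1)^2 over the gray-level histogram."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(float)
    p = hist / hist.sum()
    best_t, best_var = 0, -1.0
    for t in range(1, 256):
        w0, w1 = p[:t].sum(), p[t:].sum()      # class probabilities
        if w0 == 0 or w1 == 0:
            continue
        mu0 = (np.arange(t) * p[:t]).sum() / w0        # class means
        mu1 = (np.arange(t, 256) * p[t:]).sum() / w1
        var = w0 * w1 * (mu0 - mu1) ** 2
        if var > best_var:
            best_t, best_var = t, var
    return best_t
```

The 2D version builds a joint histogram of pixel intensity and neighborhood mean, making the exhaustive search expensive, which is why the paper replaces it with (fractional-order) particle swarm optimization.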
Microexpression recognition algorithm based on multi feature fusion
BaiYang Xiang, BoKai Li, Huaijuan Zang, Zeliang Zhao, Shu Zhan
Abstract: Facial microexpressions in video are difficult to extract features from because of their short duration and small motion amplitude. To better combine the temporal and spatial information of video, the model is divided into a local attention module, a global attention module, and a temporal module. First, the local attention module crops key regions and feeds them to a network with channel attention; next, the global attention module applies random erasure that avoids key regions and feeds the data to a network with spatial attention; then, the temporal module feeds the microexpression onset frames to a network with a temporal shift module and spatial attention. Finally, classification results are obtained through three fully connected layers after feature fusion. Evaluated on the CASME II dataset with five-fold cross-validation, the average accuracy is 76.15% and the unweighted F1 score is 0.691, an improvement over mainstream algorithms.
DOI: 10.1117/12.3014469 | Pages: 1296908 - 1296908-10 | Published: 2024-01-09 | Citations: 0
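The temporal shift module mentioned above (from TSM, Lin et al.) mixes information across frames at zero FLOP cost by shifting a fraction of the channels one step along the time axis; a NumPy sketch:

```python
import numpy as np

def temporal_shift(x, fold_div=8):
    """Temporal Shift Module: shift 1/fold_div of the channels one step
    backward in time, another 1/fold_div forward, leave the rest in place.
    x has shape (T, C, ...); vacated positions are zero-padded."""
    t, c = x.shape[:2]
    fold = c // fold_div
    out = np.zeros_like(x)
    out[:-1, :fold] = x[1:, :fold]                   # pull from the next frame
    out[1:, fold:2 * fold] = x[:-1, fold:2 * fold]   # pull from the previous frame
    out[:, 2 * fold:] = x[:, 2 * fold:]              # unshifted channels
    return out
```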
Rapid identification of adulterated rice using fusion of near-infrared spectroscopy and machine vision data: the combination of feature optimization and nonlinear modeling
Chenxuan Song, Jinming Liu, Chunqi Wang, Zhijiang Li
Abstract: Rice is susceptible to mold during storage, and metabolites such as aflatoxin produced by mildew are very harmful to consumers. To meet the need for rapid detection of normal rice adulterated with moldy rice, a rapid identification method was established based on the data fusion of near-infrared spectroscopy and machine vision. Competitive adaptive reweighted sampling (CARS), a genetic algorithm (GA), and least angle regression (LARS) were used for spectral and image feature extraction, combined with support vector classification (SVC), random forest (RF), and gradient boosting tree (GBT) nonlinear discriminant models, with Bayesian search used to optimize the modeling parameters. The results show that the GBT fusion-data model built from LARS-optimized spectral and image feature variables has the highest discrimination accuracy, with recognition rates of 100.00% and 98.11% on the training and test sets, respectively; discrimination performance is significantly improved over single near-infrared spectroscopy or machine vision. The results indicate that rapid identification of adulterated rice based on near-infrared spectroscopy and machine vision data fusion is feasible, providing theoretical support for the development of online identification equipment for adulterated rice.
DOI: 10.1117/12.3014380 | Pages: 129692J - 129692J-16 | Published: 2024-01-09 | Citations: 0
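A minimal sketch of the feature-level fusion step: standardize each modality and concatenate, so that neither the spectral nor the image features dominate the downstream classifier through scale alone. The standardization choice is an assumption for illustration; the abstract does not state how the modalities are balanced:

```python
import numpy as np

def fuse_features(spectra, image_feats):
    """Feature-level fusion of NIR spectra and machine-vision features:
    z-score each modality column-wise, then concatenate along features.
    Both inputs have shape (n_samples, n_features_modality)."""
    def z(a):
        a = np.asarray(a, dtype=float)
        s = a.std(axis=0)
        return (a - a.mean(axis=0)) / np.where(s == 0, 1.0, s)  # guard constant cols
    return np.hstack([z(spectra), z(image_feats)])
```

The fused matrix would then feed a nonlinear classifier such as the GBT model the paper selects.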
Fast and high quality neural radiance fields reconstruction based on depth regularization
Bin Zhu, Gaoxiang He, Bo Xie, Yi Chen, Yaoxuan Zhu, Liuying Chen
Abstract: Although Neural Radiance Fields (NeRF) have been shown to achieve high-quality novel view synthesis, existing models still perform poorly in some scenarios, particularly unbounded scenes: they either require excessively long training times or produce suboptimal synthesis results. We therefore propose SD-NeRF, which consists of a compact neural radiance field model and self-supervised depth regularization. Experimental results demonstrate that SD-NeRF can shorten training time by over 20 times compared with Mip-NeRF 360 without compromising reconstruction accuracy.
DOI: 10.1117/12.3014528 | Pages: 129692F - 129692F-9 | Published: 2024-01-09 | Citations: 0
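The abstract does not spell out the depth-regularization loss. A generic masked depth-supervision term of the kind commonly added to a NeRF photometric loss looks like the following; the exact form in SD-NeRF is not given, so this is only a hedged illustration:

```python
import numpy as np

def depth_reg_loss(rendered, prior, valid):
    """Masked L2 penalty pulling NeRF-rendered per-ray depth toward a depth
    prior (e.g. from self-supervised monocular depth); invalid rays are skipped."""
    valid = np.asarray(valid, dtype=bool)
    diff = (np.asarray(rendered, dtype=float) - np.asarray(prior, dtype=float))[valid]
    return float(np.mean(diff ** 2)) if diff.size else 0.0
```

Such a term constrains where density accumulates along each ray, which is one way a depth prior can speed up convergence.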
Combinatorial action recognition based on causal segment intervention
Xiaozhou Sun
Abstract: Combinatorial action recognition has recently attracted the attention of researchers in computer vision. It focuses on effectively representing and discriminating the spatio-temporal interactions between different actions and objects in video data. Existing work tends to strengthen a framework's object recognition and relationship modeling capabilities, e.g., with attention mechanisms and graph structures. We find that existing algorithms can be influenced by interaction-independent segments within a video, misleading them to focus on irrelevant visual information. So that the algorithm analyzes the spatio-temporal interactions of causally related video segments, a Causal Slice Recognition Network (CSRN) is proposed. This method effectively removes the interference of background segments by explicitly recognizing and extracting the causally related segments in the video. We validate the method on the Something-Else dataset and obtain the best results.
DOI: 10.1117/12.3014465 | Pages: 129692W - 129692W-6 | Published: 2024-01-09 | Citations: 0