CAAI Transactions on Intelligence Technology: Latest Articles

Artificial intelligence assisted prediction of optimum operating conditions of shell and tube heat exchangers: A grey-box approach
IF 8.4, Zone 2, Computer Science
CAAI Transactions on Intelligence Technology Pub Date : 2024-10-24 DOI: 10.1049/cit2.12393
Zahid Ullah, Iftikhar Ahmad, Abdul Samad, Husnain Saghir, Farooq Ahmad, Manabu Kano, Hakan Caliskan, Nesrin Caliskan, Hiki Hong
Abstract: In this study, a grey-box (GB) model was developed to predict the optimum mass flow rates of the inlet streams of a shell and tube heat exchanger (STHE) under varying process conditions. Aspen Exchanger Design and Rating (Aspen-EDR) was first used to construct a first-principle (FP) model of the STHE from industrial data. A genetic algorithm (GA) was then coupled with the FP model to attain the minimum exit temperature of the hot kerosene process stream under varying process conditions. A dataset of optimum process conditions was generated through this FP-GA integration and was used to develop an artificial neural network (ANN) model. Subsequently, the ANN model was merged with the FP model, replacing the GA, to form the GB model. The GB model, that is, the ANN and FP integration, achieved higher effectiveness and a lower outlet temperature than the standalone FP model. Its performance was also comparable to that of the FP-GA approach, but it significantly reduced the computation time required to estimate the optimum process conditions. The proposed GB-based method improved the STHE's ability to extract energy from the process stream and strengthened its resilience to diverse process conditions.
Volume 10, Issue 2, pp. 349-358. PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1049/cit2.12393
Citations: 0
Improving long-tail classification via decoupling and regularisation
IF 8.4, Zone 2, Computer Science
CAAI Transactions on Intelligence Technology Pub Date : 2024-10-24 DOI: 10.1049/cit2.12374
Shuzheng Gao, Chaozheng Wang, Cuiyun Gao, Wenjian Luo, Peiyi Han, Qing Liao, Guandong Xu
Abstract: Real-world data often exhibit an imbalanced, long-tailed distribution, which leads to poor performance for neural network-based classification. Existing methods mainly tackle this problem by reweighting the loss function or rebalancing the classifier. However, one crucial aspect overlooked by previous studies is the imbalanced feature space caused by an imbalanced angle distribution. In this paper, the authors shed light on the significance of the angle distribution in achieving a balanced feature space, which is essential for improving model performance under long-tailed distributions. Nevertheless, it is challenging to balance both the classifier norms and the angle distribution because of problems such as the low feature norm. To tackle these challenges, the authors first analyse the classifier and feature space by decoupling the classification logits into three components: the classifier norm (i.e. the magnitude of the classifier vector), the feature norm (i.e. the magnitude of the feature vector), and the cosine similarity between the classifier vector and the feature vector. They then track how each component changes during training and identify three critical problems: the imbalanced angle distribution, the lack of feature discrimination, and the low feature norm. Drawing on this analysis, the authors propose a novel loss function that incorporates hyperspherical uniformity, an additive angular margin, and feature norm regularisation. Each component of the loss function addresses one specific problem, and together they contribute to a balanced classifier and feature space. The authors conduct extensive experiments on popular benchmark datasets including CIFAR-10/100-LT, ImageNet-LT, and iNaturalist 2018. The experimental results demonstrate that the authors' loss function outperforms several previous state-of-the-art methods on imbalanced and long-tailed datasets, improving upon the best-performing baselines on CIFAR-100-LT by 1.34, 1.41, 1.41 and 1.33, respectively.
Volume 10, Issue 1, pp. 62-71. PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1049/cit2.12374
Citations: 0
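The logit decoupling described in the abstract above can be illustrated with a small sketch. It only shows the identity logit = classifier norm × feature norm × cosine similarity; the function name `decompose_logit` and the example vectors are hypothetical, and the paper's loss function itself is not reproduced here.

```python
import math

def decompose_logit(w, f):
    """Split the logit w·f into (classifier norm, feature norm, cosine similarity)."""
    w_norm = math.sqrt(sum(x * x for x in w))
    f_norm = math.sqrt(sum(x * x for x in f))
    dot = sum(a * b for a, b in zip(w, f))
    cos = dot / (w_norm * f_norm)
    return w_norm, f_norm, cos

w = [3.0, 4.0]  # hypothetical classifier vector for one class
f = [1.0, 0.0]  # hypothetical feature vector of one sample
w_norm, f_norm, cos = decompose_logit(w, f)
logit = w_norm * f_norm * cos  # equals the plain dot product w·f
```

Inspecting the three factors separately makes it possible to see whether a class is favoured because of a large classifier norm or because of a small angle to the feature.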
Learning-based tracking control of AUV: Mixed policy improvement and game-based disturbance rejection
IF 8.4, Zone 2, Computer Science
CAAI Transactions on Intelligence Technology Pub Date : 2024-10-21 DOI: 10.1049/cit2.12372
Jun Ye, Hongbo Gao, Manjiang Hu, Yougang Bian, Qingjia Cui, Xiaohui Qin, Rongjun Ding
Abstract: A mixed adaptive dynamic programming (ADP) scheme based on zero-sum game theory is developed to address optimal control problems of autonomous underwater vehicle (AUV) systems subject to disturbances and safety constraints. By combining prior dynamic knowledge with actual sampled data, the proposed approach mitigates the defects caused by an inaccurate dynamic model and significantly improves the training speed of the ADP algorithm. Initially, the dataset is enriched with reference data collected from a nominal model that ignores modelling bias. In addition, the controlled vehicle interacts with the real environment and continuously adds sampled data to the dataset. To leverage the advantages of both model-based and model-free methods during training, an adaptive tuning factor is introduced. It is computed from the dataset, which both contains model-referenced information and conforms to the distribution of the real-world environment, and it balances the influence of the model-based control law and the data-driven policy gradient on the direction of policy improvement. As a result, the proposed approach learns faster than data-driven methods while also achieving better tracking performance than model-based control methods. Moreover, the optimal control problem under disturbances is formulated as a zero-sum game, and an actor-critic-disturbance framework is introduced to approximate the optimal control input, the cost function, and the disturbance policy, respectively. Furthermore, the convergence of the proposed algorithm, which is based on value iteration, is analysed. Finally, an example of AUV path following based on improved line-of-sight guidance is presented to demonstrate the effectiveness of the proposed method.
Volume 10, Issue 2, pp. 510-528. PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1049/cit2.12372
Citations: 0
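The abstract above mentions line-of-sight (LOS) guidance for path following. As a rough illustration, here is the classic lookahead-based LOS law for a straight path segment. This is the textbook form, not the improved variant used in the paper, and the function name, waypoints, and lookahead distance are assumptions chosen for the example.

```python
import math

def los_heading(pos, wp_a, wp_b, lookahead):
    """Desired heading (rad, counter-clockwise from the x-axis) to follow wp_a -> wp_b."""
    path_angle = math.atan2(wp_b[1] - wp_a[1], wp_b[0] - wp_a[0])
    dx, dy = pos[0] - wp_a[0], pos[1] - wp_a[1]
    # Signed lateral offset of the vehicle from the path (cross-track error).
    cross_track = -dx * math.sin(path_angle) + dy * math.cos(path_angle)
    # Steer towards a point `lookahead` metres ahead on the path.
    return path_angle + math.atan2(-cross_track, lookahead)

# Vehicle 2 m to the side of a path along the x-axis, lookahead of 2 m.
heading = los_heading(pos=(0.0, 2.0), wp_a=(0.0, 0.0), wp_b=(10.0, 0.0), lookahead=2.0)
on_path = los_heading(pos=(5.0, 0.0), wp_a=(0.0, 0.0), wp_b=(10.0, 0.0), lookahead=2.0)
```

A smaller lookahead distance makes the vehicle turn back to the path more aggressively; a larger one gives a smoother but slower convergence.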
Multi-sensor missile-borne LiDAR point cloud data augmentation based on Monte Carlo distortion simulation
IF 8.4, Zone 2, Computer Science
CAAI Transactions on Intelligence Technology Pub Date : 2024-10-17 DOI: 10.1049/cit2.12389
Luda Zhao, Yihua Hu, Fei Han, Zhenglei Dou, Shanshan Li, Yan Zhang, Qilong Wu
Abstract: Large-scale point cloud datasets form the basis for training deep learning networks and achieving high-quality results on network processing tasks. Because datasets are subject to diversity and robustness constraints, data augmentation (DA) methods are used to expand their diversity and scale. However, LiDAR point cloud data from different platforms (such as missile-borne and vehicular LiDAR) have complex and distinct characteristics, so directly applying traditional DA methods from the 2D visual domain to 3D data can yield networks that do not perform the corresponding tasks robustly. To address this issue, the present study explores DA for missile-borne LiDAR point clouds using a Monte Carlo (MC) simulation method that closely resembles practical application. First, a model of the multi-sensor imaging system is established, taking into account the joint errors arising from the platform itself and from relative motion during imaging. A distortion simulation method based on MC simulation is then proposed for augmenting missile-borne LiDAR point cloud data. It is underpinned by an analysis of the combined errors between sensors of different modalities and achieves high-quality augmentation of point cloud data. The effectiveness of the proposed method in addressing imaging system errors and distortion simulation is validated on the imaging scene dataset constructed in this paper. Comparative experiments between the proposed point cloud DA algorithm and current state-of-the-art algorithms on point cloud detection and single object tracking tasks demonstrate that the proposed method improves on the network performance obtained from unaugmented datasets by over 17.3% and 17.9%, surpassing the performance of current state-of-the-art point cloud DA algorithms.
Volume 10, Issue 1, pp. 300-316. PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1049/cit2.12389
Citations: 0
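The idea of Monte Carlo distortion simulation in the abstract above can be sketched as follows. This is a minimal illustration under a strong assumption: the imaging error is modelled as a Gaussian yaw and translation jitter applied to the whole cloud, which is much simpler than the multi-sensor error model of the paper. All names and noise levels are hypothetical.

```python
import math
import random

def distort(points, sigma_pos, sigma_yaw, rng):
    """Apply one randomly sampled rigid perturbation (yaw + translation) to a point cloud."""
    yaw = rng.gauss(0.0, sigma_yaw)
    tx, ty, tz = (rng.gauss(0.0, sigma_pos) for _ in range(3))
    c, s = math.cos(yaw), math.sin(yaw)
    return [(c * x - s * y + tx, s * x + c * y + ty, z + tz) for x, y, z in points]

def augment(points, n_samples, sigma_pos=0.05, sigma_yaw=0.01, seed=0):
    """Draw n_samples distorted copies of the cloud (Monte Carlo sampling of the error model)."""
    rng = random.Random(seed)
    return [distort(points, sigma_pos, sigma_yaw, rng) for _ in range(n_samples)]

cloud = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
augmented = augment(cloud, n_samples=5)
```

Because each sampled perturbation is rigid, distances between points inside one cloud are preserved; a more realistic error model would also perturb points individually.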
Resource-adaptive and OOD-robust inference of deep neural networks on IoT devices
IF 8.4, Zone 2, Computer Science
CAAI Transactions on Intelligence Technology Pub Date : 2024-10-09 DOI: 10.1049/cit2.12384
Cailen Robertson, Ngoc Anh Tong, Thanh Toan Nguyen, Quoc Viet Hung Nguyen, Jun Jo
Abstract: Efficiently executing inference tasks of deep neural networks on devices with limited resources poses a significant load in IoT systems. One method to alleviate this load is branching, which adds extra layers with classification exits to a pre-trained model so that inputs with high-confidence predictions can exit early, reducing inference cost. However, branching networks were not originally tailored for IoT environments: they are susceptible to noisy and out-of-distribution (OOD) data, and they demand additional training for optimal performance. The authors introduce BrevisNet, a novel branching methodology for creating on-device branching models that are both resource-adaptive and noise-robust for IoT applications. The method leverages the refined uncertainty estimation of Dirichlet distributions for classification predictions, combined with the superior OOD detection of energy-based models. The authors propose a training approach and thresholding technique that enhance the precision of branch predictions and offer robustness against noise and OOD inputs. The findings demonstrate that BrevisNet surpasses existing branching techniques in training efficiency, accuracy, overall performance, and robustness.
Volume 10, Issue 1, pp. 115-133. PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1049/cit2.12384
Citations: 0
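To make the early-exit idea in the abstract above concrete, the sketch below lets an input leave at the first branch whose energy score falls below a threshold. It assumes the standard energy score, -T · logsumexp(logits / T), and a plain per-branch threshold; how BrevisNet actually combines Dirichlet uncertainty with energy scores is not stated in the abstract, so this is not the paper's method. The logits and thresholds are made up.

```python
import math

def energy(logits, temperature=1.0):
    """Energy score -T * logsumexp(logits / T); lower values suggest in-distribution inputs."""
    m = max(logits)
    lse = m / temperature + math.log(sum(math.exp((z - m) / temperature) for z in logits))
    return -temperature * lse

def early_exit(branch_logits, thresholds):
    """Return (branch index, predicted class) for the first sufficiently confident branch."""
    for i, (logits, thr) in enumerate(zip(branch_logits, thresholds)):
        if energy(logits) <= thr:
            return i, logits.index(max(logits))
    last = branch_logits[-1]  # no branch was confident: fall back to the final exit
    return len(branch_logits) - 1, last.index(max(last))

# Hypothetical logits from an early branch and the final exit of a 3-class model.
result = early_exit([[0.1, 0.2, 0.0], [5.0, 0.0, 0.0]], thresholds=[-3.0, -3.0])
```

Here the first branch is too uncertain (energy of about -1.2), so the input continues to the second exit, where the energy of about -5.0 passes the threshold.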
A criterion for selecting the appropriate one from the trained models for model-based offline policy evaluation
IF 8.4, Zone 2, Computer Science
CAAI Transactions on Intelligence Technology Pub Date : 2024-10-09 DOI: 10.1049/cit2.12376
Chongchong Li, Yue Wang, Zhi-Ming Ma, Yuting Liu
Abstract: Offline policy evaluation, that is, evaluating and selecting complex policies for decision-making using only offline datasets, is important in reinforcement learning. At present, model-based offline policy evaluation (MBOPE) is widely used because it is easy to implement and performs well. Given estimated transition and reward functions of the environment, MBOPE directly approximates the unknown value of a given policy using the Monte Carlo method. Usually, multiple models are trained and one of them is selected for use. However, selecting an appropriate model from those trained remains a challenge. The authors first analyse the upper bound of the difference between the approximated value and the unknown true value. The theoretical results show that this difference is related to the trajectories generated by the given policy on the learnt model and to the prediction error of the transition and reward functions at these generated data points. Based on these results, a new criterion is proposed to tell which trained model is better suited for evaluating the given policy. Finally, the effectiveness of the proposed criterion is demonstrated on both benchmark and synthetic offline datasets.
Volume 10, Issue 1, pp. 223-234. PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1049/cit2.12376
Citations: 0
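The Monte Carlo step of MBOPE described in the abstract above can be sketched as follows: roll the policy out on a learnt model and average the discounted returns. The function `mc_policy_value`, its parameters, and the toy model are hypothetical illustrations, and the paper's model-selection criterion is not shown.

```python
import random

def mc_policy_value(policy, model_step, init_state, gamma=0.99,
                    horizon=50, n_rollouts=100, seed=0):
    """Estimate a policy's value by averaging discounted returns of rollouts on a learnt model."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_rollouts):
        state, ret, discount = init_state, 0.0, 1.0
        for _ in range(horizon):
            action = policy(state)
            # model_step stands in for the learnt transition and reward functions.
            state, reward = model_step(state, action, rng)
            ret += discount * reward
            discount *= gamma
        total += ret
    return total / n_rollouts

# Toy learnt model (hypothetical): the state never changes and every step pays reward 1.
value = mc_policy_value(policy=lambda s: 0,
                        model_step=lambda s, a, rng: (s, 1.0),
                        init_state=0, gamma=0.5, horizon=3, n_rollouts=10)
```

With several trained models, the same routine would be run once per model, which is why a criterion for choosing among the resulting estimates matters.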
Pre-trained SAM as data augmentation for image segmentation
IF 8.4, Zone 2, Computer Science
CAAI Transactions on Intelligence Technology Pub Date : 2024-10-08 DOI: 10.1049/cit2.12381
Junjun Wu, Yunbo Rao, Shaoning Zeng, Bob Zhang
Abstract: Data augmentation plays an important role in training deep neural models by expanding the size and diversity of the dataset. Initially, data augmentation mainly involved simple transformations of images. Later, in order to increase the diversity and complexity of the data, more advanced methods appeared and evolved into sophisticated generative models. However, these methods require a large amount of computation for training or searching. In this paper, a novel training-free method that uses the pre-trained Segment Anything Model (SAM) as a data augmentation tool (PTSAM-DA) is proposed to generate augmented annotations for images. Without any training, it obtains prompt boxes from the original annotations and feeds them to the pre-trained SAM to generate diverse and improved annotations. In this way, annotations are augmented more ingeniously than by simple manipulations, without the heavy computation needed to train a data augmentation model. Multiple comparative experiments are conducted on three datasets: an in-house dataset, ADE20K, and COCO2017. On the in-house dataset, namely the Agricultural Plot Segmentation Dataset, maximum improvements of 3.77% and 8.92% are gained in two mainstream metrics, mIoU and mAcc, respectively. Consequently, large vision models like SAM prove promising not only for image segmentation but also for data augmentation.
Volume 10, Issue 1, pp. 268-282. PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1049/cit2.12381
Citations: 0
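The first step described in the abstract above, obtaining a prompt box from an existing annotation, might look like the sketch below. It assumes the annotation is a binary mask stored as a list of rows and returns a tight box in (x_min, y_min, x_max, y_max) order; the helper `mask_to_box` is hypothetical, and the call that feeds the box to SAM is deliberately left out.

```python
def mask_to_box(mask):
    """Tight bounding box (x_min, y_min, x_max, y_max) of the nonzero pixels in a 2-D mask."""
    rows = [r for r, row in enumerate(mask) if any(row)]
    if not rows:
        return None  # empty annotation: no box to prompt with
    cols = [c for c in range(len(mask[0])) if any(row[c] for row in mask)]
    return cols[0], rows[0], cols[-1], rows[-1]

# Hypothetical 4x4 annotation mask with a small L-shaped object.
mask = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
]
box = mask_to_box(mask)
```

The resulting box would then be passed to the pre-trained model as a prompt, and the returned mask used as an additional annotation for the same image.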
Grey-box modelling for estimation of optimum cut point temperature of crude distillation column
IF 8.4, Zone 2, Computer Science
CAAI Transactions on Intelligence Technology Pub Date : 2024-10-07 DOI: 10.1049/cit2.12386
Junaid Shahzad, Iftikhar Ahmad, Muhammad Ahsan, Farooq Ahmad, Husnain Saghir, Manabu Kano, Hakan Caliskan, Hiki Hong
Abstract: A grey-box modelling framework was developed for estimating the cut point temperature of a crude distillation unit (CDU) under uncertainty in crude composition and process conditions. A first-principle (FP) model of the CDU was developed for Pakistani crudes from the Zamzama and Kunnar fields. A hybrid methodology integrating the Taguchi method and a genetic algorithm (GA) was employed to estimate the optimal cut point temperature for various sets of process variables. The optimised datasets were used to develop an artificial neural network (ANN) model for predicting the optimum cut point values. The ANN model then replaced the hybrid Taguchi-GA framework, and the integration of the ANN and FP models forms the grey-box (GB) model. For Zamzama crude, the GB model yielded a decrease of up to 38.93% in the energy required per kilo barrel of diesel and an 8.2% increase in diesel production compared to the stand-alone FP model under uncertainty. Similarly, for Kunnar crude, a decrease of up to 18.87% in the energy required per kilo barrel of diesel and a 33.96% increase in diesel production were observed in comparison to the stand-alone FP model.
Volume 10, Issue 1, pp. 160-174. PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1049/cit2.12386
Citations: 0
Terahertz image denoising via multiscale hybrid-convolution residual network
IF 8.4, Zone 2, Computer Science
CAAI Transactions on Intelligence Technology Pub Date : 2024-10-02 DOI: 10.1049/cit2.12380
Heng Wu, Zijie Guo, Chunhua He, Shaojuan Luo, Bofang Song
Abstract: Terahertz imaging technology has great potential in areas such as remote sensing, navigation, and security checks. However, terahertz images usually suffer from heavy noise and low resolution. Previous terahertz image denoising methods are mainly based on traditional image processing, which has a limited effect on terahertz noise. Existing deep learning-based image denoising methods are mostly designed for natural images and tend to lose a large amount of detail when applied to terahertz images. Here, a residual-learning-based multiscale hybrid-convolution residual network (MHRNet) is proposed for terahertz image denoising, which removes noise while preserving detail features in terahertz images. Specifically, a multiscale hybrid-convolution residual block (MHRB) is designed to extract rich detail features and local prediction residual noise from terahertz images. The MHRB is a residual structure composed of a multiscale dilated convolution block, a bottleneck layer, and a multiscale convolution block. MHRNet uses the MHRB together with global residual learning to denoise terahertz images. Ablation studies are performed to validate the effectiveness of the MHRB, and a series of experiments is conducted on public terahertz image datasets. The experimental results demonstrate that MHRNet has an excellent denoising effect on both synthetic and real noisy terahertz images. Compared with existing methods, MHRNet achieves competitive results overall.
Volume 10, Issue 1, pp. 235-252. PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1049/cit2.12380
Citations: 0
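The multiscale dilated convolution and residual connection mentioned in the abstract above can be illustrated in one dimension. This is only a toy sketch: the real MHRB operates on 2-D feature maps with learnt kernels, a bottleneck layer, and a second multiscale block, none of which are modelled here. The kernel, the dilation rates, and the averaging used to fuse the branches are assumptions.

```python
def dilated_conv1d(signal, kernel, dilation):
    """Same-size 1-D dilated convolution (cross-correlation form) with zero padding."""
    k = len(kernel)
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for j, w in enumerate(kernel):
            idx = i + (j - k // 2) * dilation  # taps are spaced `dilation` samples apart
            if 0 <= idx < len(signal):
                acc += w * signal[idx]
        out.append(acc)
    return out

def multiscale_residual(signal, kernel, dilations=(1, 2, 3)):
    """Average several dilated branches and add the input back (residual connection)."""
    branches = [dilated_conv1d(signal, kernel, d) for d in dilations]
    fused = [sum(vals) / len(vals) for vals in zip(*branches)]
    return [x + y for x, y in zip(signal, fused)]

impulse = [0, 0, 1, 0, 0]
spread = dilated_conv1d(impulse, kernel=[1, 0, 1], dilation=2)
block_out = multiscale_residual(impulse, kernel=[1, 0, 1])
```

Larger dilation rates widen the receptive field without adding kernel weights, which is the usual motivation for combining several rates in one block.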
Bilingual phrase induction with local hard negative sampling
IF 8.4, Zone 2, Computer Science
CAAI Transactions on Intelligence Technology Pub Date : 2024-10-01 DOI: 10.1049/cit2.12383
Hailong Cao, Hualin Miao, Weixuan Wang, Liangyou Li, Wei Peng, Tiejun Zhao
Abstract: Bilingual lexicon induction focuses on learning word translation pairs, also known as bitexts, from monolingual corpora by establishing a mapping between the source and target embedding spaces. Despite recent advancements, bilingual lexicon induction is limited to inducing bitexts consisting of individual words and cannot handle semantics-rich phrases. To bridge this gap and support downstream cross-lingual tasks, it is practical to develop a method for bilingual phrase induction that extracts bilingual phrase pairs from monolingual corpora without relying on cross-lingual knowledge. In this paper, the authors propose a novel phrase embedding training method based on the skip-gram structure. Specifically, a local hard negative sampling strategy is introduced that uses negative samples of central tokens in sliding windows to enhance phrase embedding learning. The proposed method achieves competitive or superior performance compared to baseline approaches, with exceptional results for distant languages. Additionally, the authors develop a phrase representation learning method that leverages multilingual pre-trained language models (mPLMs). These mPLM-based representations can be combined with the static phrase embeddings above to further improve the accuracy of bilingual phrase induction. The authors also manually construct a dataset of bilingual phrase pairs and integrate it with MUSE to facilitate the bilingual phrase induction task.
Volume 10, Issue 1, pp. 147-159. PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1049/cit2.12383
Citations: 0