{"title":"Practice challenge recommendations in online judge using implicit rating extraction and utility sequence patterns","authors":"Ramesh P Natarajan, Kannimuthu S, Bhanu D","doi":"10.1108/dta-10-2023-0688","DOIUrl":"https://doi.org/10.1108/dta-10-2023-0688","url":null,"abstract":"<h3>Purpose</h3>\u0000<p>Traditional recommendation approaches based on content-based filtering (CBF), collaborative filtering (CF) and hybrid methods are inadequate for recommending practice challenges in a programming online judge (POJ). These systems only consider the preferences of the target users or similar users to recommend items. In the learning environment, recommender systems should consider the learning path, knowledge level and ability of the learner. Another major problem in POJ is that learners do not rate practice challenges the way users of e-commerce and video streaming portals rate items. The purpose of the proposed approach is to overcome these shortcomings.</p><!--/ Abstract__block -->\u0000<h3>Design/methodology/approach</h3>\u0000<p>To achieve context-aware practice challenge recommendation, the proposed approach uses data preparation techniques including implicit rating extraction, data preprocessing to remove outliers, sequence-based learner clustering and utility sequence pattern mining. The approach ensures that the recommender system considers the knowledge level, learning path and learning goals of the learner to recommend practice challenges.</p><!--/ Abstract__block -->\u0000<h3>Findings</h3>\u0000<p>Experiments on practice challenge recommendations conducted using a real-world POJ dataset show that the proposed system outperforms other traditional approaches. The experiment also demonstrates that the proposed system recommends challenges based on the learner's current context. 
The implicit rating extracted using the proposed approach works accurately in the recommender system.</p><!--/ Abstract__block -->\u0000<h3>Originality/value</h3>\u0000<p>The proposed system contains the following novel approaches to address the lack of rating and context-aware recommendations. The mathematical model was used to extract ratings from learner submissions. The statistical approach was used in data preprocessing. The sequence similarity-based learner clustering was used in transition matrix. Utilizing the rating as a utility in the USPAN algorithm provides useful insights into learner–challenge relationships.</p><!--/ Abstract__block -->","PeriodicalId":56156,"journal":{"name":"Data Technologies and Applications","volume":"2 1","pages":""},"PeriodicalIF":1.6,"publicationDate":"2024-05-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141170360","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A novel neural network architecture and cross-model transfer learning for multi-task autonomous driving","authors":"Youwei Li, Jian Qu","doi":"10.1108/dta-08-2022-0307","DOIUrl":"https://doi.org/10.1108/dta-08-2022-0307","url":null,"abstract":"<h3>Purpose</h3>\u0000<p>The purpose of this research is to achieve multi-task autonomous driving by adjusting the network architecture of the model. Meanwhile, after achieving multi-task autonomous driving, the authors found that the trained neural network model performs poorly in untrained scenarios. Therefore, the authors proposed to improve the transfer efficiency of the model for new scenarios through transfer learning.</p><!--/ Abstract__block -->\u0000<h3>Design/methodology/approach</h3>\u0000<p>First, the authors achieved multi-task autonomous driving by training a model combining convolutional neural network and different structured long short-term memory (LSTM) layers. Second, the authors achieved fast transfer of neural network models in new scenarios by cross-model transfer learning. Finally, the authors combined data collection and data labeling to improve the efficiency of deep learning. Furthermore, the authors verified that the model has good robustness through light and shadow test.</p><!--/ Abstract__block -->\u0000<h3>Findings</h3>\u0000<p>This research achieved road tracking, real-time acceleration–deceleration, obstacle avoidance and left/right sign recognition. The model proposed by the authors (UniBiCLSTM) outperforms the existing models tested with model cars in terms of autonomous driving performance. Furthermore, the CMTL-UniBiCL-RL model trained by the authors through cross-model transfer learning improves the efficiency of model adaptation to new scenarios. 
Meanwhile, this research proposed an automatic data annotation method, which can save 1/4 of the time for deep learning.</p><!--/ Abstract__block -->\u0000<h3>Originality/value</h3>\u0000<p>This research provided novel solutions in the achievement of multi-task autonomous driving and neural network model scenario for transfer learning. The experiment was achieved on a single camera with an embedded chip and a scale model car, which is expected to simplify the hardware for autonomous driving.</p><!--/ Abstract__block -->","PeriodicalId":56156,"journal":{"name":"Data Technologies and Applications","volume":"6 1","pages":""},"PeriodicalIF":1.6,"publicationDate":"2024-04-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140579652","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Application of deep learning model incorporating domain knowledge in international migration forecasting","authors":"Tongzheng Pu, Chongxing Huang, Haimo Zhang, Jingjing Yang, Ming Huang","doi":"10.1108/dta-08-2023-0523","DOIUrl":"https://doi.org/10.1108/dta-08-2023-0523","url":null,"abstract":"<h3>Purpose</h3>\u0000<p>Forecasting population movement trends is crucial for implementing effective policies to regulate labor force growth and understand demographic changes. Combining migration theory expertise and neural network technology can bring a fresh perspective to international migration forecasting research.</p><!--/ Abstract__block -->\u0000<h3>Design/methodology/approach</h3>\u0000<p>This study proposes a conditional generative adversarial neural network model incorporating the migration knowledge – conditional generative adversarial network (MK-CGAN). By using the migration knowledge to design the parameters, MK-CGAN can effectively address the limited data problem, thereby enhancing the accuracy of migration forecasts.</p><!--/ Abstract__block -->\u0000<h3>Findings</h3>\u0000<p>The model was tested by forecasting migration flows between different countries and had good generalizability and validity. The results are robust as the proposed solutions can achieve lesser mean absolute error, mean squared error, root mean square error, mean absolute percentage error and <em>R</em><sup>2</sup> values, reaching 0.9855 compared to long short-term memory (LSTM), gated recurrent unit, generative adversarial network (GAN) and the traditional gravity model.</p><!--/ Abstract__block -->\u0000<h3>Originality/value</h3>\u0000<p>This study is significant because it demonstrates a highly effective technique for predicting international migration using conditional GANs. By incorporating migration knowledge into our models, we can achieve prediction accuracy, gaining valuable insights into the differences between various model characteristics. 
We used SHapley Additive exPlanations to enhance our understanding of these differences and provide clear and concise explanations for our model predictions. The results demonstrated the theoretical significance and practical value of the MK-CGAN model in predicting international migration.</p><!--/ Abstract__block -->","PeriodicalId":56156,"journal":{"name":"Data Technologies and Applications","volume":"10 1","pages":""},"PeriodicalIF":1.6,"publicationDate":"2024-04-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140579647","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Exploring cross-cultural disparities in tourists' perceived images: a text mining and sentiment analysis study using LDA and BERT-BILSTM models","authors":"Qiuying Chen, Ronghui Liu, Qingquan Jiang, Shangyue Xu","doi":"10.1108/dta-10-2023-0645","DOIUrl":"https://doi.org/10.1108/dta-10-2023-0645","url":null,"abstract":"<h3>Purpose</h3>\u0000<p>Tourists with different cultural backgrounds think and behave differently. Accurately capturing and correctly understanding cultural differences will help tourist destinations in product/service planning, marketing communication and attracting and retaining tourists. This research employs Hofstede's cultural dimensions theory to analyse the variations in destination image perceptions of Chinese-speaking and English-speaking tourists to Xiamen, a prominent tourist attraction in China.</p><!--/ Abstract__block -->\u0000<h3>Design/methodology/approach</h3>\u0000<p>The evaluation utilizes a two-stage approach, incorporating LDA and BERT-BILSTM models. By leveraging text mining, sentiment analysis and <em>t</em>-tests, this research investigates the variations in tourists' perceptions of Xiamen across different cultures.</p><!--/ Abstract__block -->\u0000<h3>Findings</h3>\u0000<p>The results reveal that cultural disparities significantly impact tourists' perceived image of Xiamen, particularly regarding their preferences for renowned tourist destinations and the factors influencing their travel experience.</p><!--/ Abstract__block -->\u0000<h3>Originality/value</h3>\u0000<p>This research pioneers applying natural language processing methods and machine learning techniques to affirm the substantial differences in the perceptions of tourist destinations among Chinese-speaking and English-speaking tourists based on Hofstede's cultural theory. 
The findings furnish theoretical insights for destination marketing organizations to target diverse cultural tourists through precise marketing strategies and illuminate the practical application of Hofstede's cultural theory in tourism and hospitality.</p><!--/ Abstract__block -->","PeriodicalId":56156,"journal":{"name":"Data Technologies and Applications","volume":"273 1","pages":""},"PeriodicalIF":1.6,"publicationDate":"2024-03-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140170304","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Light field image coding using a residual channel attention network–based view synthesis","authors":"Faguo Liu, Qian Zhang, Tao Yan, Bin Wang, Ying Gao, Jiaqi Hou, Feiniu Yuan","doi":"10.1108/dta-03-2023-0071","DOIUrl":"https://doi.org/10.1108/dta-03-2023-0071","url":null,"abstract":"<h3>Purpose</h3>\u0000<p>Light field images (LFIs) have gained popularity as a technology to increase the field of view (FoV) of plenoptic cameras since they can capture information about light rays with a large FoV. Wide FoV causes light field (LF) data to increase rapidly, which restricts the use of LF imaging in image processing, visual analysis and user interface. Effective LFI coding methods become of paramount importance. This paper aims to eliminate more redundancy by exploring sparsity and correlation in the angular domain of LFIs, as well as mitigate the loss of perceptual quality of LFIs caused by encoding.</p><!--/ Abstract__block -->\u0000<h3>Design/methodology/approach</h3>\u0000<p>This work proposes a new efficient LF coding framework. On the coding side, a new sampling scheme and a hierarchical prediction structure are used to eliminate redundancy in the LFI's angular and spatial domains. At the decoding side, high-quality dense LF is reconstructed using a view synthesis method based on the residual channel attention network (RCAN).</p><!--/ Abstract__block -->\u0000<h3>Findings</h3>\u0000<p>In three different LF datasets, our proposed coding framework not only reduces the transmitted bit rate but also maintains a higher view quality than the current more advanced methods.</p><!--/ Abstract__block -->\u0000<h3>Originality/value</h3>\u0000<p>(1) A new sampling scheme is designed to synthesize high-quality LFIs while better ensuring LF angular domain sparsity. (2) To further eliminate redundancy in the spatial domain, new ranking schemes and hierarchical prediction structures are designed. 
(3) A synthetic network based on RCAN and a novel loss function is designed to mitigate the perceptual quality loss due to the coding process.</p><!--/ Abstract__block -->","PeriodicalId":56156,"journal":{"name":"Data Technologies and Applications","volume":"33 1","pages":""},"PeriodicalIF":1.6,"publicationDate":"2024-02-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139920787","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"False alarm detection in intensive care unit for monitoring arrhythmia condition using bio-signals","authors":"Aleena Swetapadma, Tishya Manna, Maryam Samami","doi":"10.1108/dta-08-2023-0437","DOIUrl":"https://doi.org/10.1108/dta-08-2023-0437","url":null,"abstract":"<h3>Purpose</h3>\u0000<p>A novel method has been proposed to reduce the false alarm rate of arrhythmia patients regarding life-threatening conditions in the intensive care unit. For this purpose, the atrial blood pressure, photoplethysmogram (PLETH), electrocardiogram (ECG) and respiratory (RESP) signals are considered as input signals.</p><!--/ Abstract__block -->\u0000<h3>Design/methodology/approach</h3>\u0000<p>Three machine learning approaches, a feed-forward artificial neural network (ANN), an ensemble learning method and a <em>k</em>-nearest neighbors search method, are used to detect false alarms. The proposed method has been implemented using Arduino and MATLAB/SIMULINK for real-time ICU-arrhythmia patients' monitoring data.</p><!--/ Abstract__block -->\u0000<h3>Findings</h3>\u0000<p>The proposed method detects false alarms with an accuracy of 99.4 per cent during asystole, 100 per cent during ventricular flutter, 98.5 per cent during ventricular tachycardia, 99.6 per cent during bradycardia and 100 per cent during tachycardia. The proposed framework is adaptive in many scenarios, easy to implement, computationally friendly, highly accurate and robust to overfitting.</p><!--/ Abstract__block -->\u0000<h3>Originality/value</h3>\u0000<p>As ECG signals consist of PQRST waves, any deviation from the normal pattern may signify some alarming conditions. These deviations can be utilized as input to classifiers for the detection of false alarms; hence, there is no need for other feature extraction techniques. 
Feed-forward ANN with the Levenberg–Marquardt algorithm has shown a higher rate of convergence than other neural network algorithms, which helps provide better accuracy with no overfitting.</p><!--/ Abstract__block -->","PeriodicalId":56156,"journal":{"name":"Data Technologies and Applications","volume":"88 1","pages":""},"PeriodicalIF":1.6,"publicationDate":"2024-02-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139758104","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Community relations discovery methods for users in Fancircle based on sentiment analysis in China","authors":"Kai Wang","doi":"10.1108/dta-09-2023-0570","DOIUrl":"https://doi.org/10.1108/dta-09-2023-0570","url":null,"abstract":"<h3>Purpose</h3>\u0000<p>The identification of network user relationship in Fancircle contributes to quantifying the violence index of user text, mining the internal correlation of network behaviors among users, which provides necessary data support for the construction of knowledge graph.</p><!--/ Abstract__block -->\u0000<h3>Design/methodology/approach</h3>\u0000<p>A correlation identification method based on sentiment analysis (CRDM-SA) is put forward by extracting user semantic information, as well as introducing violent sentiment membership. To be specific, the topic of the implementation of topology mapping in the community can be obtained based on self-built field of violent sentiment dictionary (VSD) by extracting user text information. Afterward, the violence index of the user text is calculated to quantify the fuzzy sentiment representation between the user and the topic. Finally, the multi-granularity violence association rules mining of user text is realized by constructing violence fuzzy concept lattice.</p><!--/ Abstract__block -->\u0000<h3>Findings</h3>\u0000<p>It is helpful to reveal the internal relationship of online violence under complex network environment. In that case, the sentiment dependence of users can be characterized from a granular perspective.</p><!--/ Abstract__block -->\u0000<h3>Originality/value</h3>\u0000<p>The membership degree of violent sentiment into user relationship recognition in Fancircle community is introduced, and a text sentiment association recognition method based on VSD is proposed. 
By calculating the value of violent sentiment in the user text, the annotation of violent sentiment in the topic dimension of the text is achieved, and the partial order relation between fuzzy concepts of violence under the effective confidence threshold is utilized to obtain the association relation.</p><!--/ Abstract__block -->","PeriodicalId":56156,"journal":{"name":"Data Technologies and Applications","volume":"85 1","pages":""},"PeriodicalIF":1.6,"publicationDate":"2024-01-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139578943","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Bayesian Inference-based approach for extracting driving data with implicit intention","authors":"Ping Huang, Haitao Ding, Hong Chen, Jianwei Zhang, Zhenjia Sun","doi":"10.1108/dta-03-2023-0074","DOIUrl":"https://doi.org/10.1108/dta-03-2023-0074","url":null,"abstract":"<h3>Purpose</h3>\u0000<p>The growing availability of naturalistic driving datasets (NDDs) presents a valuable opportunity to develop various models for autonomous driving. However, while current NDDs include data on vehicles with and without intended driving behavior changes, they do not explicitly demonstrate a type of data on vehicles that intend to change their driving behavior but do not execute the behaviors because of safety, efficiency, or other factors. This missing data is essential for autonomous driving decisions. This study aims to extract the driving data with implicit intentions to support the development of decision-making models.</p><!--/ Abstract__block -->\u0000<h3>Design/methodology/approach</h3>\u0000<p>According to Bayesian inference, drivers who have the same intended changes likely share similar influencing factors and states. Building on this principle, this study proposes an approach to extract data on vehicles that intended to execute specific behaviors but failed to do so. This is achieved by computing driving similarities between the candidate vehicles and benchmark vehicles with incorporation of the standard similarity metrics, which takes into account information on the surrounding vehicles' location topology and individual vehicle motion states. By doing so, the method enables a more comprehensive analysis of driving behavior and intention.</p><!--/ Abstract__block -->\u0000<h3>Findings</h3>\u0000<p>The proposed method is verified on the Next Generation SIMulation dataset (NGSim), which confirms its ability to reveal similarities between vehicles executing similar behaviors during the decision-making process in nature. 
The approach is also validated using simulated data, achieving an accuracy of 96.3 per cent in recognizing vehicles with specific driving behavior intentions that are not executed.</p><!--/ Abstract__block -->\u0000<h3>Originality/value</h3>\u0000<p>This study provides an innovative approach to extract driving data with implicit intentions and offers strong support to develop data-driven decision-making models for autonomous driving. With the support of this approach, the development of autonomous vehicles can capture more real driving experience from human drivers moving towards a safer and more efficient future.</p><!--/ Abstract__block -->","PeriodicalId":56156,"journal":{"name":"Data Technologies and Applications","volume":"4 1","pages":""},"PeriodicalIF":1.6,"publicationDate":"2024-01-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139496379","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"ID-SF-Fusion: a cooperative model of intent detection and slot filling for natural language understanding","authors":"Meng Zhu, Xiaolong Xu","doi":"10.1108/dta-03-2023-0088","DOIUrl":"https://doi.org/10.1108/dta-03-2023-0088","url":null,"abstract":"<h3>Purpose</h3>\u0000<p>Intent detection (ID) and slot filling (SF) are two important tasks in natural language understanding. ID is to identify the main intent of a paragraph of text. The goal of SF is to extract the information that is important to the intent from the input sentence. However, most of the existing methods use sentence-level intention recognition, which has the risk of error propagation, and the relationship between intention recognition and SF is not explicitly modeled. Aiming at this problem, this paper proposes a collaborative model of ID and SF for intelligent spoken language understanding called ID-SF-Fusion.</p><!--/ Abstract__block -->\u0000<h3>Design/methodology/approach</h3>\u0000<p>ID-SF-Fusion uses Bidirectional Encoder Representation from Transformers (BERT) and Bidirectional Long Short-Term Memory (BiLSTM) to extract effective word embedding and context vectors containing the whole sentence information respectively. Fusion layer is used to provide intent–slot fusion information for SF task. In this way, the relationship between ID and SF task is fully explicitly modeled. This layer takes the result of ID and slot context vectors as input to obtain the fusion information which contains both ID result and slot information. Meanwhile, to further reduce error propagation, we use word-level ID for the ID-SF-Fusion model. Finally, two tasks of ID and SF are realized by joint optimization training.</p><!--/ Abstract__block -->\u0000<h3>Findings</h3>\u0000<p>We conducted experiments on two public datasets, Airline Travel Information Systems (ATIS) and Snips. 
The results show that the Intent ACC score and Slot F1 score of ID-SF-Fusion on ATIS are 98.0 per cent and 95.8 per cent, respectively, and the two indicators on the Snips dataset are 98.6 per cent and 96.7 per cent, respectively. These results are superior to those of slot-gated, SF-ID NetWork, stack-Prop and other models. In addition, ablation experiments were performed to further analyze and discuss the proposed model.</p><!--/ Abstract__block -->\u0000<h3>Originality/value</h3>\u0000<p>This paper uses word-level intent recognition and introduces intent information into the SF process, which yields a significant improvement on both datasets.</p><!--/ Abstract__block -->","PeriodicalId":56156,"journal":{"name":"Data Technologies and Applications","volume":"18 1","pages":""},"PeriodicalIF":1.6,"publicationDate":"2024-01-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139501012","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A hybrid method for forecasting coal price based on ensemble learning and deep learning with data decomposition and data enhancement","authors":"Jing Tang, Yida Guo, Yilin Han","doi":"10.1108/dta-07-2023-0377","DOIUrl":"https://doi.org/10.1108/dta-07-2023-0377","url":null,"abstract":"<h3>Purpose</h3>\u0000<p>Coal is a critical global energy source, and fluctuations in its price significantly impact related enterprises' profitability. This study aims to develop a robust model for predicting the coal price index to enhance coal purchase strategies for coal-consuming enterprises and provide crucial information for global carbon emission reduction.</p><!--/ Abstract__block -->\u0000<h3>Design/methodology/approach</h3>\u0000<p>The proposed coal price forecasting system combines data decomposition, semi-supervised feature engineering, ensemble learning and deep learning. It addresses the challenge of merging low-resolution and high-resolution data by adaptively combining both types of data and filling in missing gaps through interpolation for internal missing data and self-supervision for initiate/terminal missing data. The system employs self-supervised learning to complete the filling of complex missing data.</p><!--/ Abstract__block -->\u0000<h3>Findings</h3>\u0000<p>The ensemble model, which combines long short-term memory, XGBoost and support vector regression, demonstrated the best prediction performance among the tested models. It exhibited superior accuracy and stability across multiple indices in two datasets, namely the Bohai-Rim steam-coal price index and coal daily settlement price.</p><!--/ Abstract__block -->\u0000<h3>Originality/value</h3>\u0000<p>The proposed coal price forecasting system stands out as it integrates data decomposition, semi-supervised feature engineering, ensemble learning and deep learning. 
Moreover, the system pioneers the use of self-supervised learning for filling in complex missing data, contributing to its originality and effectiveness.</p><!--/ Abstract__block -->","PeriodicalId":56156,"journal":{"name":"Data Technologies and Applications","volume":"41 1","pages":""},"PeriodicalIF":1.6,"publicationDate":"2024-01-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139501078","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}