{"title":"SPARQL Optimization Using Re-ordering Joining Patterns with Surrogate Key Concept and Subset Patterns","authors":"Rupal Gupta;Sanjay Kumar Malik","doi":"10.13052/jwe1540-9589.2334","DOIUrl":"https://doi.org/10.13052/jwe1540-9589.2334","url":null,"abstract":"Semantic web data resides on the web in the form of knowledge graphs known as RDF graphs, and searching the web has always been a crucial task. To retrieve semantic web RDF data, the SPARQL query language is used, which in turn is based on triple patterns and joins. Optimization of SPARQL queries has been a problematic concern for decades due to the large number of triple patterns associated with RDF data. Although several researchers have put a lot of effort into the optimization of SPARQL queries, it is difficult to understand the concept from scratch due to its diversified nature. This paper analyses various optimization techniques for SPARQL queries used with the semantic web to process knowledge graphs. These techniques include join-based, heuristic-based, rule-based, and indexing-based approaches for optimization. This paper will help researchers in this domain to easily get into the core concept of SPARQL execution along with various optimization approaches used for query processing, which can help in various other domains like linked open data and information retrieval. In this paper, an optimization algorithm HSOA (hybrid SPARQL optimization algorithm) has been proposed, which comprises the features of index-based, cost-based, and triple reordering-based optimization approaches. The proposed hybrid algorithm has been designed specifically for n-triple RDF data, which comprises subset patterns and surrogate key concepts. 
The results produced by the proposed algorithm are encouraging and have also been tested and compared with the benchmark dataset and SPARQL queries like LUBM, BSBM, and SP2Bench.","PeriodicalId":49952,"journal":{"name":"Journal of Web Engineering","volume":"23 3","pages":"393-430"},"PeriodicalIF":0.8,"publicationDate":"2024-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10547280","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141251187","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Data Lake Conceptualized Web Platform for Food Research Data Collection","authors":"Gi-taek An;Seyoung Oh;Eunhye Kim;Jung-min Park","doi":"10.13052/jwe1540-9589.2333","DOIUrl":"https://doi.org/10.13052/jwe1540-9589.2333","url":null,"abstract":"Food research is uniquely intertwined with everyday life and necessitates the utilization of big data. Within this domain, the research data consist of various forms and formats, encompassing biological experiment results, chemical analysis data, nutritional information, microbiological data, sensor data, images, and videos. This diversity stems from the integration of data from various subdomains within the larger field. With recent advancements in deep learning technology, the importance of data has grown significantly, resulting in increased reliance on data-driven research. Although specialized platforms for sharing and utilizing data have been established at the national level, particularly in the bioscience field, food research lacks a dedicated infrastructure and specialized data-sharing platforms. In this study, we develop a platform that leverages Hadoop-based distributed file systems to create a data lake. This platform enables data storage and sharing through a web-based interface. The distributed file system supports scalability by adding data nodes, making it an effective solution for capacity expansion. In addition, the web-based platform ensures high accessibility, allowing users access from anywhere, at any time, using any device. 
Finally, we introduce the establishment of a 1.8 PB Hadoop-based physical storage system and present an approach for building a highly accessible web platform with substantial utility.","PeriodicalId":49952,"journal":{"name":"Journal of Web Engineering","volume":"23 3","pages":"377-392"},"PeriodicalIF":0.8,"publicationDate":"2024-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10547279","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141251183","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Music Curriculum Research Using a Large Language Model, Cloud Computing and Data Mining Technologies","authors":"Yuting Shang","doi":"10.13052/jwe1540-9589.2323","DOIUrl":"https://doi.org/10.13052/jwe1540-9589.2323","url":null,"abstract":"This paper presents a method to enhance the scientific nature of the music curriculum model by integrating a large language model, cloud computing and data mining technology for the analysis of the music teaching curriculum model. To maintain the integrity of the mixing matrix while employing the frequency hopping frequency, the paper suggests dividing the mixing matrix into a series of sub-matrices along the vertical time axis. This approach transforms wideband music signal processing into a narrowband processing problem. Additionally, two hybrid matrix estimation algorithms are proposed in this paper using underdetermined conditions. Furthermore, utilizing the estimated mixing matrix and the detected time-frequency support domain, the paper employs the subspace projection algorithm for underdetermined blind separation of music signals in the time-frequency domain. This procedure, along with the integration of the estimated direction of arrival (DoA), enables the completion of frequency-hopping network station music signal sorting. 
Extensive simulation teaching demonstrates that the music curriculum model proposed in this paper, based on a large language model, cloud computing and data mining technologies, significantly enhances the quality of modern music teaching.","PeriodicalId":49952,"journal":{"name":"Journal of Web Engineering","volume":"23 2","pages":"251-273"},"PeriodicalIF":0.8,"publicationDate":"2024-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10504109","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140606116","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Application of an Improved Convolutional Neural Network Algorithm in Text Classification","authors":"Jing Peng;Shuquan Huo","doi":"10.13052/jwe1540-9589.2331","DOIUrl":"https://doi.org/10.13052/jwe1540-9589.2331","url":null,"abstract":"This paper proposes a text classification model based on a combination of a convolutional neural network (CNN) and a support vector machine (SVM) using Amazon review polarity, TREC, and Kaggle as experimental data. By adding an attention mechanism to simplify the parameters and using the classifier based on SVM to replace the Softmax layer, the extraction effect of feature words is improved and the problem of weak generalization ability of the CNN model is solved. Simulation experiments show that the proposed algorithm performs better in precision rate, recall rate, F1 value, and training time compared with CNN, RNN, BERT and term frequency-inverse document frequency (TF-IDF).","PeriodicalId":49952,"journal":{"name":"Journal of Web Engineering","volume":"23 3","pages":"315-339"},"PeriodicalIF":0.8,"publicationDate":"2024-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10547278","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141251186","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Enhancing English Language Education Through Big Data Analytics and Generative AI","authors":"Jianhua Liu","doi":"10.13052/jwe1540-9589.2322","DOIUrl":"https://doi.org/10.13052/jwe1540-9589.2322","url":null,"abstract":"This research paper provides a comprehensive examination of the significant impact of big data analytics and generative artificial intelligence (GAI) on the field of English language education. Utilizing a meticulous framework rooted in the evolutionary network influence of big data, our study critically analyzes several aspects of student engagement, learning motivation, self-efficacy, and the existing disparities among learners. Our primary objective is to enhance students' active participation, intrinsic interest, and self-confidence in the context of English language learning, thus advancing their overall linguistic competence. To achieve these objectives, our study systematically integrates the concept of practice education with a multidisciplinary approach, leveraging the power of big data analysis and GAI, and reveals profound insights into student learning behaviors, preferences, and personalized educational needs. We employ advanced techniques for meticulous data processing and interpretation, empowering educators to make data-informed decisions and tailor pedagogical strategies to meet the unique requirements of each student. 
This data-driven pedagogical approach not only facilitates the implementation of effective teaching methodologies but also effectively addresses the disparities stemming from diverse student backgrounds, thereby fostering a more inclusive and personalized learning environment.","PeriodicalId":49952,"journal":{"name":"Journal of Web Engineering","volume":"23 2","pages":"227-249"},"PeriodicalIF":0.8,"publicationDate":"2024-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10504108","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140606030","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Enhancing Suggestion Detection in Online User Reviews through Integrated Information Retrieval and Deep Learning Approaches","authors":"Zahra Hadizadeh;Amin Nazari;Muharram Mansoorizadeh","doi":"10.13052/jwe1540-9589.2335","DOIUrl":"https://doi.org/10.13052/jwe1540-9589.2335","url":null,"abstract":"In the aftermath of the COVID-19 pandemic, using web platforms as a communication medium and decision-making tool in online commerce has become widely acknowledged. User-generated comments, reflecting positive and negative sentiments towards specific items, serve as invaluable indicators, offering recommendations for product and organizational improvements. Consequently, the extraction of suggestions from mined opinions can enhance the efficacy of companies and organizations in this domain. Prevailing research in suggestion mining predominantly employs rule-based methodologies and statistical classifiers, relying on manually identified features. However, a recent trend has emerged wherein researchers explore solutions grounded in deep learning tools and techniques. This study aims to employ information retrieval techniques for the automated identification of suggestions. To this end, various methodologies, including distance measurement approaches, multilayer perceptron neural networks, support vector machines, logistic regression, convolutional neural networks utilizing TF-IDF, Bag of Words (BOW), and Word2Vec vectors, along with keyword extraction, have been integrated. The proposed approach is assessed using the SemEval2019 dataset to extract suggestions from the textual content of online user reviews. The obtained results demonstrate a notable improvement in the F1 score over prior research, reaching 0.76. 
The experiments further suggest that information retrieval-based approaches exhibit promising potential for this specific task.","PeriodicalId":49952,"journal":{"name":"Journal of Web Engineering","volume":"23 3","pages":"431-463"},"PeriodicalIF":0.8,"publicationDate":"2024-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10547281","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141251179","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Semantically Enriched Keyword Prefetching Based on Usage and Domain Knowledge","authors":"Sonia Setia;Jyoti;Neelam Duhan;Aman Anand;Nikita Verma","doi":"10.13052/jwe1540-9589.2332","DOIUrl":"https://doi.org/10.13052/jwe1540-9589.2332","url":null,"abstract":"In intelligent web systems [2], web prefetching [27] plays a crucial role. In order to make accurate predictions for web prefetching, it is important but challenging to uncover valuable information from web usage statistics [16]. Using usage statistics and domain knowledge, this study presents a new approach, dubbed SPUDK, for efficient prefetching. In this paper, it is shown how web access logs can be used efficiently for browsing prediction. Our main focus is on the technique needed to manage the queries found in web access logs so that valuable information can be obtained. We further process these access logs using a taxonomy and a thesaurus, WordNet, to find the semantics of queries. SPUDK, a system that organises usage data into semantic clusters, is one example of this approach. Our contributions in this paper are as follows: (1) A technique to exploit query keywords from access logs. (2) An approach to enrich queries with semantic information. (3) A new similarity measure for finding similarity among URLs present in access logs. (4) A novel clustering technique to find semantic clusters of URLs. (5) Experimental evaluation of the proposed system. 
The proposed SPUDK system is evaluated using America Online (AOL) logs and yields, on average, a 39% improvement in prediction precision, a 35% improvement in hit ratio, and a 50.6% reduction in latency compared with other prediction techniques in the literature.","PeriodicalId":49952,"journal":{"name":"Journal of Web Engineering","volume":"23 3","pages":"341-375"},"PeriodicalIF":0.8,"publicationDate":"2024-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10547277","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141251081","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Leveraging the Synergy of IPv6, Generative AI, and Web Engineering to Create a Big Data-Driven Education Platform","authors":"Gao Yongli;Dong Qi;Chen Zhipeng","doi":"10.13052/jwe1540-9589.2321","DOIUrl":"https://doi.org/10.13052/jwe1540-9589.2321","url":null,"abstract":"The rapid advancement of network technology in China has significantly accelerated the implementation of information technology in higher education. Through the utilization of computer technology, multimedia technology, big data technology, artificial intelligence technology, and network communication technology, the integration of these technologies in university teaching has become widespread. This paper presents an analysis and discussion on the utilization of the latest IPv6 network transmission protocol technology to enhance the application of data collection in university education, with a specific focus on gathering information related to university faculties. By leveraging web engineering and multimedia technology as fundamental components, the network facilitates the sharing of educational resources among students, thereby enabling the reform of management approaches, fostering educational progress in China, and establishing a comprehensive big data-driven education platform specifically tailored to colleges and universities. Additionally, the incorporation of big data visualization and analysis tools allows for easy retrieval of existing university educational information, facilitates the creation of data charts, and expedites the utilization of data for its inherent value. 
Finally, the proposed approach employs generative AI to collect and analyze feedback from students and educators, followed by the application of web engineering techniques to continuously enhance the online education platform based on this feedback.","PeriodicalId":49952,"journal":{"name":"Journal of Web Engineering","volume":"23 2","pages":"197-226"},"PeriodicalIF":0.8,"publicationDate":"2024-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10504107","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140606069","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Flight Price Prediction Web-Based Platform: Leveraging Generative AI for Real-Time Airfare Forecasting","authors":"Yuanyuan Guan","doi":"10.13052/jwe1540-9589.2325","DOIUrl":"https://doi.org/10.13052/jwe1540-9589.2325","url":null,"abstract":"The aviation business encounters difficulties in correctly and swiftly predicting flight fares due to the dynamic nature of the sector. Factors such as variations in demand, fuel costs, and the intricacies of various routes have an impact on this. This work presents a new method to tackle this issue by utilizing generative artificial intelligence (GAI) approaches to accurately forecast airfares in real-time. This paper presents a novel framework that integrates generative models, deep learning architectures, and historical pricing data to improve the precision of future flight price predictions. The study employs GAI within a cutting-edge web engineering framework. This approach is designed primarily to gather knowledge about complex patterns and relationships present in historical airline data. Through the utilization of this methodology, the model is able to accurately perceive complex connections and adjust to ever-changing market conditions. Our model utilizes deep neural networks to effectively handle various circumstances and extract vital information, thereby facilitating a comprehensive understanding of the intricate factors that affect flight cost. Moreover, the suggested approach places significant emphasis on precisely predicting upcoming occurrences in real-time, facilitating prompt reactions to market volatility and offering a valuable resource for airlines, travel agents, and customers alike. In order to enhance the accuracy of real-time forecasts, we utilize a web-based platform that allows for smooth interaction with live data streams and guarantees swift updates. 
The results demonstrate the model's capacity to adjust to dynamic market conditions, rendering it an attractive option for stakeholders in search of precise and current forecasts of flight prices.","PeriodicalId":49952,"journal":{"name":"Journal of Web Engineering","volume":"23 2","pages":"299-314"},"PeriodicalIF":0.8,"publicationDate":"2024-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10504110","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140606117","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Contextualized Satire Detection in Short Texts Using Deep Learning Techniques","authors":"Ashraf Kamal;Muhammad Abulaish;Jahiruddin","doi":"10.13052/jwe1540-9589.2312","DOIUrl":"https://doi.org/10.13052/jwe1540-9589.2312","url":null,"abstract":"Satire is prominent in user-generated content on various online platforms in the form of satirical news, customer reviews, blogs, articles, and short messages that are typically of an informal nature. As satire is also used to disseminate false information on the Internet, its computational detection has become a well-known issue. Existing work focuses primarily on formal document- or sentence-level textual data, whereas informal short texts have received less attention for satire detection. This paper presents a new model called BiLSTM self-attention (BiSAT) for detecting satire in informal short texts. It consists of various components such as input, embedding, self-attention, and two bi-directional long short-term memory (BiLSTM) layers for learning crucial contextual information pertaining to the satire present in the texts. The input layer uses the text as input to create an input vector, which is then given to the embedding layer to create the appropriate numeric vector. The output of the embedding layer is passed on to the first BiLSTM layer, which extracts contextual information-based sequences in the opposite direction. Between the first and second BiLSTM layers, a self-attention layer is employed to draw attention to the important satirical information that is acquired by the hidden layer of the first BiLSTM. The BiSAT model also takes a classic feature engineering approach, employing a 13-dimensional auxiliary feature vector composed of features from four separate feature categories: sentiment, punctuation, hyperbole, and affective. The proposed BiSAT model is empirically evaluated on two benchmark datasets and a newly created dataset called Satire-280. 
It outperforms existing research and baseline methods by a significant margin. The Satire-280 dataset along with code can be downloaded from GitHub repository: https://github.com/Ashraf-Kamal/Satire-Detection.","PeriodicalId":49952,"journal":{"name":"Journal of Web Engineering","volume":"23 1","pages":"27-52"},"PeriodicalIF":0.8,"publicationDate":"2024-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10488438","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140342678","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}