Mert Sehri , Zehui Hua , Francisco de Assis Boldt , Patrick Dumond
{"title":"Selective embedding for deep learning","authors":"Mert Sehri , Zehui Hua , Francisco de Assis Boldt , Patrick Dumond","doi":"10.1016/j.knosys.2025.114535","DOIUrl":null,"url":null,"abstract":"<div><div>Deep learning has revolutionized many industries by enabling models to automatically learn complex patterns from raw data, reducing dependence on manual feature engineering. However, deep learning algorithms are sensitive to input data, and performance often deteriorates under nonstationary conditions and across dissimilar domains, especially when using time-domain data. Conventional single-channel or parallel multi-source data loading strategies either limit generalization or increase computational costs. This study introduces selective embedding, a novel data loading strategy, which alternates short segments of data from multiple sources within a single input channel. Drawing inspiration from cognitive psychology, selective embedding mimics human-like information processing to reduce model overfitting, enhance generalization, and improve computational efficiency. Validation is conducted using six time-domain datasets, demonstrating that the proposed method consistently achieves high classification accuracy for many deep learning architectures while significantly reducing training times. Across multiple datasets, selective embedding consistently improves test accuracy by 20 to 30 percent compared to traditional single-channel loading strategies, while also matching or exceeding the performance of parallel multi-source loading methods. Importantly, these gains are achieved while significantly reducing training times, demonstrating both efficiency and scalability across simple and complex architectures. The approach proves particularly effective for complex systems with multiple data sources, offering a scalable and resource-efficient solution for real-world applications in healthcare, heavy machinery, marine, railway, and agriculture, where robustness and adaptability are critical.</div></div>","PeriodicalId":49939,"journal":{"name":"Knowledge-Based Systems","volume":"330 ","pages":"Article 114535"},"PeriodicalIF":7.6000,"publicationDate":"2025-09-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Knowledge-Based Systems","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0950705125015746","RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
引用次数: 0
Abstract
Deep learning has revolutionized many industries by enabling models to automatically learn complex patterns from raw data, reducing dependence on manual feature engineering. However, deep learning algorithms are sensitive to input data, and performance often deteriorates under nonstationary conditions and across dissimilar domains, especially when using time-domain data. Conventional single-channel or parallel multi-source data loading strategies either limit generalization or increase computational costs. This study introduces selective embedding, a novel data loading strategy, which alternates short segments of data from multiple sources within a single input channel. Drawing inspiration from cognitive psychology, selective embedding mimics human-like information processing to reduce model overfitting, enhance generalization, and improve computational efficiency. Validation is conducted using six time-domain datasets, demonstrating that the proposed method consistently achieves high classification accuracy for many deep learning architectures while significantly reducing training times. Across multiple datasets, selective embedding consistently improves test accuracy by 20 to 30 percent compared to traditional single-channel loading strategies, while also matching or exceeding the performance of parallel multi-source loading methods. Importantly, these gains are achieved while significantly reducing training times, demonstrating both efficiency and scalability across simple and complex architectures. The approach proves particularly effective for complex systems with multiple data sources, offering a scalable and resource-efficient solution for real-world applications in healthcare, heavy machinery, marine, railway, and agriculture, where robustness and adaptability are critical.
期刊介绍:
Knowledge-Based Systems, an international and interdisciplinary journal in artificial intelligence, publishes original, innovative, and creative research results in the field. It focuses on knowledge-based and other artificial intelligence techniques-based systems. The journal aims to support human prediction and decision-making through data science and computation techniques, provide a balanced coverage of theory and practical study, and encourage the development and implementation of knowledge-based intelligence models, methods, systems, and software tools. Applications in business, government, education, engineering, and healthcare are emphasized.