In smart surveillance systems, cameras often have limited computational capacity, so captured images or videos must be offloaded to cloud servers for analysis, which raises significant privacy concerns. To address both problems, we propose a lightweight privacy-preserving split learning framework tailored for smart surveillance systems. In this framework, an upper model deployed on the resource-constrained camera extracts intermediate features from image segments and transmits them to a lower model in the cloud for further analysis and training. Because raw images and videos are never transmitted, the likelihood of exposing sensitive data is reduced. The framework also incorporates adversarial training to defend against reconstruction attacks, preventing adversaries from inferring private information from the intermediate features. Compared with traditional split learning methods, the proposed solution significantly reduces client-side memory usage and computation time, making it well suited to deployment on low-resource devices. Experiments on the CIFAR10, CIFAR100, and SVHN datasets show that the framework lowers the classification accuracy on images reconstructed by a server-side decoder to 12.18%, 2.18%, and 13.09%, respectively. These results indicate that the framework enhances privacy while maintaining computational efficiency.
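
The data flow described above can be illustrated with a minimal sketch. It is not the implementation evaluated in this work: the layer sizes, the single dense layers, the simulated decoder, the weighting factor `lam`, and the form of the combined objective (task loss minus a weighted reconstruction loss) are all assumptions chosen for illustration, and the splitting of images into segments is omitted.

```python
import math
import random

random.seed(0)

# Hypothetical sizes, chosen only for illustration.
IN_DIM, FEAT_DIM, N_CLASSES = 8, 3, 4


def rand_matrix(rows, cols):
    """Small random weight matrix stored as a list of rows."""
    return [[random.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def linear(weights, x):
    """Dense layer without bias: one output value per weight row."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def softmax(z):
    """Numerically stable softmax over a list of logits."""
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]


client_w = rand_matrix(FEAT_DIM, IN_DIM)     # upper model, stays on the camera
server_w = rand_matrix(N_CLASSES, FEAT_DIM)  # lower model, runs in the cloud
decoder_w = rand_matrix(IN_DIM, FEAT_DIM)    # simulated attacker (assumption)


def client_forward(image):
    """Camera side: compress the raw input into intermediate features."""
    return [math.tanh(v) for v in linear(client_w, image)]


def server_forward(features):
    """Cloud side: classify from the features only, never from the raw input."""
    return softmax(linear(server_w, features))


def task_loss(probs, label):
    """Cross-entropy of the predicted distribution against the true label."""
    return -math.log(probs[label])


def reconstruction_loss(features, image):
    """Mean squared error of the simulated decoder's attempt to rebuild the input."""
    recon = linear(decoder_w, features)
    return sum((r - p) ** 2 for r, p in zip(recon, image)) / len(image)


def client_objective(image, label, lam=0.5):
    """Assumed adversarial objective: low task loss, high reconstruction error."""
    features = client_forward(image)
    return task_loss(server_forward(features), label) - lam * reconstruction_loss(features, image)


image = [random.random() for _ in range(IN_DIM)]
features = client_forward(image)
print("values sent to the cloud:", len(features), "instead of", len(image))
print("client objective:", client_objective(image, label=1))
```

In this sketch only `features` crosses the network boundary. Minimizing `client_objective` with respect to the client weights would push the features toward being useful for classification while being hard for the simulated decoder to invert, which is the general idea behind adversarial training against reconstruction attacks.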