Accelerating Private Large Transformers Inference Through Fine-Grained Collaborative Computation
Yuntian Chen; Zhanyong Tang; Tianpei Lu; Bingsheng Zhang; Zhiying Shi; Zheng Wang
IEEE Transactions on Information Forensics and Security, vol. 20, pp. 7482-7497. DOI: 10.1109/TIFS.2025.3584639. Published 2025-06-30 (journal impact factor 8.0, JCR Q1, Computer Science, Theory & Methods). https://ieeexplore.ieee.org/document/11059953/
Homomorphic encryption (HE) and secret sharing (SS) enable computations on encrypted data, providing significant privacy benefits for large transformer-based models (TBMs) in sensitive sectors such as medicine and finance. However, private TBM inference incurs substantial costs due to the coarse-grained application of HE and SS. We present FASTLMPI, a new approach that accelerates private TBM inference through fine-grained computation optimization. Specifically, through the fine-grained co-design of homomorphic encryption and secret sharing, FASTLMPI achieves efficient protocols for matrix multiplication, SoftMax, LayerNorm, and GeLU. In addition, FASTLMPI introduces a precise segmented approximation technique for differentiable non-linear functions, improving fitting accuracy while keeping the polynomial degree low. Compared with BOLT (S&P '24), FASTLMPI reduces runtime by 25.1% to 55.3% and communication costs by 39.0%.
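The abstract's central numerical idea is to approximate differentiable non-linearities such as GeLU with segmented, low-degree polynomials. The snippet below is a minimal NumPy sketch of that general idea only: the segment boundaries, the degree-3 least-squares fits, and the clamped tails are illustrative assumptions, not FASTLMPI's actual breakpoints, coefficients, or protocol, which evaluates such approximations under HE/SS rather than in the clear.

```python
# Illustrative sketch (assumed parameters, not the paper's method): approximate GeLU with
# a few low-degree polynomial segments fitted by least squares on each interval.
import numpy as np

def gelu(x: np.ndarray) -> np.ndarray:
    """Reference (tanh-based) GeLU used as the fitting target."""
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x**3)))

# Hypothetical segment boundaries and degree; the paper selects its own precisely.
breakpoints = [-4.0, -1.5, 0.0, 1.5, 4.0]
degree = 3

# Fit one degree-3 polynomial per segment.
segment_polys = []
for lo, hi in zip(breakpoints[:-1], breakpoints[1:]):
    xs = np.linspace(lo, hi, 256)
    coeffs = np.polyfit(xs, gelu(xs), degree)
    segment_polys.append((lo, hi, np.poly1d(coeffs)))

def gelu_piecewise(x: np.ndarray) -> np.ndarray:
    """Evaluate the segmented fit; outside the fitted range fall back to the
    asymptotic tails (GeLU(x) ~ 0 for very negative x, ~ x for large positive x)."""
    y = np.where(x < breakpoints[0], 0.0, np.where(x > breakpoints[-1], x, 0.0))
    for lo, hi, poly in segment_polys:
        mask = (x >= lo) & (x <= hi)
        y = np.where(mask, poly(x), y)
    return y

xs = np.linspace(-6.0, 6.0, 2001)
max_err = np.max(np.abs(gelu_piecewise(xs) - gelu(xs)))
print(f"max abs error of the degree-{degree} segmented fit: {max_err:.4e}")
```

Even in this toy version the trade-off the paper targets is visible: adding segments improves accuracy without raising the per-segment polynomial degree, which is what keeps the cost of evaluating the approximation on protected data low.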
Journal introduction:
The IEEE Transactions on Information Forensics and Security covers the sciences, technologies, and applications relating to information forensics, information security, biometrics, surveillance, and systems applications that incorporate these features.