Accelerating Private Large Transformers Inference Through Fine-Grained Collaborative Computation
Yuntian Chen; Zhanyong Tang; Tianpei Lu; Bingsheng Zhang; Zhiying Shi; Zheng Wang
IEEE Transactions on Information Forensics and Security, vol. 20, pp. 7482-7497. DOI: 10.1109/TIFS.2025.3584639. Published 2025-06-30 (journal impact factor 8.0, JCR Q1, Computer Science, Theory & Methods). https://ieeexplore.ieee.org/document/11059953/
Homomorphic encryption (HE) and secret sharing (SS) enable computations on encrypted data, providing significant privacy benefits for large transformer-based models (TBMs) in sensitive sectors such as medicine and finance. However, private TBM inference incurs substantial costs due to the coarse-grained application of HE and SS. We present FASTLMPI, a new approach that accelerates private TBM inference through fine-grained computation optimization. Specifically, through the fine-grained co-design of homomorphic encryption and secret sharing, FASTLMPI achieves efficient protocols for matrix multiplication, SoftMax, LayerNorm, and GeLU. In addition, FASTLMPI introduces a precise segmented approximation technique for differentiable non-linear functions, improving fitting accuracy while keeping the polynomial degree low. Compared with BOLT (S&P '24), FASTLMPI reduces runtime by 25.1% to 55.3% and communication costs by 39.0%.
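The abstract's central numerical idea is to approximate differentiable non-linearities such as GeLU with segmented, low-degree polynomials. The snippet below is a minimal NumPy sketch of that general idea only: the segment boundaries, the degree-3 least-squares fits, and the clamped tails are illustrative assumptions, not FASTLMPI's actual breakpoints, coefficients, or protocol, which evaluates such approximations under HE/SS rather than in the clear.

```python
# Illustrative sketch (assumed parameters, not the paper's method): approximate GeLU with
# a few low-degree polynomial segments fitted by least squares on each interval.
import numpy as np

def gelu(x: np.ndarray) -> np.ndarray:
    """Reference (tanh-based) GeLU used as the fitting target."""
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x**3)))

# Hypothetical segment boundaries and degree; the paper selects its own precisely.
breakpoints = [-4.0, -1.5, 0.0, 1.5, 4.0]
degree = 3

# Fit one degree-3 polynomial per segment.
segment_polys = []
for lo, hi in zip(breakpoints[:-1], breakpoints[1:]):
    xs = np.linspace(lo, hi, 256)
    coeffs = np.polyfit(xs, gelu(xs), degree)
    segment_polys.append((lo, hi, np.poly1d(coeffs)))

def gelu_piecewise(x: np.ndarray) -> np.ndarray:
    """Evaluate the segmented fit; outside the fitted range fall back to the
    asymptotic tails (GeLU(x) ~ 0 for very negative x, ~ x for large positive x)."""
    y = np.where(x < breakpoints[0], 0.0, np.where(x > breakpoints[-1], x, 0.0))
    for lo, hi, poly in segment_polys:
        mask = (x >= lo) & (x <= hi)
        y = np.where(mask, poly(x), y)
    return y

xs = np.linspace(-6.0, 6.0, 2001)
max_err = np.max(np.abs(gelu_piecewise(xs) - gelu(xs)))
print(f"max abs error of the degree-{degree} segmented fit: {max_err:.4e}")
```

Even in this toy version the trade-off the paper targets is visible: adding segments improves accuracy without raising the per-segment polynomial degree, which is what keeps the cost of evaluating the approximation on protected data low.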
Journal introduction:
The IEEE Transactions on Information Forensics and Security covers the sciences, technologies, and applications relating to information forensics, information security, biometrics, surveillance, and systems applications that incorporate these features.