Title: When cryptography stops data science: Strategies for resolving the conflicts between data scientists and cryptographers
Journal: Data Science and Management
Publication type: Journal Article
Publication date: 2024-03-12
DOI: 10.1016/j.dsm.2024.03.001
URL: https://www.sciencedirect.com/science/article/pii/S2666764924000134
Citations: 0
Abstract
The advent of the digital era and of computer-based remote communication has significantly broadened the applicability of several sciences over the past two decades, notably data science (DS) and cryptography (CG). Data science involves clustering and categorizing unstructured data, whereas cryptography provides security and privacy. Because certain CG laws and requirements mandate fully randomized or pseudonoise outputs from CG primitives and schemes, CG policies might appear to impede data scientists from working on ciphers or analyzing information systems that support security and privacy services. However, this study posits that CG does not entirely preclude data scientists from operating in the presence of ciphers: there are several examples of successful collaboration, including homomorphic encryption schemes, searchable encryption algorithms, secret-sharing protocols, and protocols offering conditional privacy. These instances, along with others, indicate numerous potential solutions for fostering collaboration between DS and CG. This study therefore classifies the challenges faced by DS and CG into three distinct groups: challenging problems (which can be conditionally solved with tools that are currently available, e.g., secret-sharing protocols, zero-knowledge proofs, and partially homomorphic encryption algorithms), open problems (for which proofs of solvability exist but which remain unsolved in practice, e.g., designing an efficient functional encryption algorithm or fully homomorphic encryption scheme), and hard problems (infeasible to solve with current knowledge and tools).
Finally, the paper presents specific solutions and outlines future directions for tackling the challenges that arise at the intersection of DS and CG, such as providing specific access for DS experts in secret-sharing algorithms, assigning data index dimensions to DS experts in ultra-dimension encryption algorithms, defining certain functional keys for DS experts in functional encryption schemes, and giving DS experts limited shares of data for analytics.
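
As an illustration of how secret sharing can leave room for analytics, the following is a minimal sketch of additive secret sharing, assuming non-negative integer records and a public modulus larger than any total of interest. It is not taken from the paper: the names `MODULUS`, `split_into_shares`, and `secure_sum` are hypothetical, and the scheme shown is the textbook additive construction rather than the specific protocols the authors discuss. Each party holds one random-looking share per record, yet the combined partial sums reveal the aggregate.

```python
import secrets

# Hypothetical public modulus chosen for illustration; it must exceed the true total.
MODULUS = 2**61 - 1


def split_into_shares(value, n_shares):
    """Split an integer into n additive shares that sum to value modulo MODULUS."""
    # The first n-1 shares are uniformly random; the last one fixes the sum.
    shares = [secrets.randbelow(MODULUS) for _ in range(n_shares - 1)]
    last = (value - sum(shares)) % MODULUS
    return shares + [last]


def secure_sum(values, n_parties=3):
    """Compute the sum of values from per-party partial sums of shares."""
    per_record = [split_into_shares(v, n_parties) for v in values]
    # Party i receives the i-th share of every record and never sees a raw value.
    per_party = list(zip(*per_record))
    partial_sums = [sum(held) % MODULUS for held in per_party]
    # Only the partial sums are combined, so only the aggregate is revealed.
    return sum(partial_sums) % MODULUS


if __name__ == "__main__":
    records = [10, 20, 30]
    print(secure_sum(records))
```

In this sketch an analyst who sees a single party's shares learns nothing about individual records, which loosely mirrors the idea of giving DS experts limited shares of data for analytics.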