Building Secure Platforms for Research on Human Subjects: The Importance of Computer Scientists
J. Lane
Proceedings of the 26th International Symposium on High-Performance Parallel and Distributed Computing, 2017-06-26
DOI: https://doi.org/10.1145/3078597.3078618
Businesses and government are using new approaches to decision-making. They are exploiting new streams of (mostly) digital personal data, such as daily transaction records, web-browsing data, cell phone location data, and social media activity, and they are applying new analytical models and tools. Social science researchers, who are not trained in the stewardship of these new kinds of data, must now collect, manage, and use them appropriately. There are many technical challenges: disparate datasets must be ingested, their provenance determined, and their metadata documented. Researchers must be able to query datasets to learn what data are available and how they can be used. Datasets must be joined in a scientific manner, which means that workflows need to be traced and managed so that the research can be replicated (Lane, 2017). Computer scientists' expertise is of critical value in many of these areas, but of greatest interest to this group are the facilities in which data on human subjects are stored. The data must be securely housed, and privacy and confidentiality must be protected using the best approaches available. Access and use must be documented to meet the needs of data providers. Yet the technology currently used to provide access to sensitive data is largely artisanal and manual. The stewardship restrictions placed on the use of confidential administrative data prevent the use of best practices for research data management. As a result, links between data sources are rarely validated, results often are not replicated, and connected datasets, results, and methods are not accessible to subsequent researchers in the same field. This is where computer scientists' expertise can come into play: in building approaches that enable sensitive data from different sources to be discovered, integrated, and analyzed in a carefully controlled manner, and that allow researchers to share analysis methods, results, and expertise in ways not easily possible today.