Proceedings of International Symposium on Grids and Clouds 2018 in conjunction with Frontiers in Computational Drug Discovery — PoS(ISGC 2018 & FCDD): Latest Publications

What Goes Up, Must Go Down: A Case Study From RAL on Shrinking an Existing Storage Service
R. Appleyard, G. Patargias — DOI: https://doi.org/10.22323/1.327.0026 — published 2018-12-12

Abstract: Much attention is paid to the process of deploying new storage services into production and to the challenges therein. Far less is paid to what happens when a storage service approaches the end of its useful life. The challenges in rationalising and de-scoping a service that, while relatively old, is still critical to production work for both the UK WLCG Tier-1 and local facilities are not to be underestimated.

RAL has been running a disk and tape storage service based on CASTOR (CERN Advanced STORage) for over 10 years. CASTOR must cope with both the throughput requirements of supplying data to a large batch farm and the data integrity requirements of a long-term tape archive. A new storage service, called 'Echo', is now being deployed to replace the disk-only element of CASTOR, but we intend to continue supporting the CASTOR system for tape into the medium term. This, in turn, implies a downsizing and redesign of the CASTOR service in order to improve manageability and cost effectiveness. We give an outline of both Echo and CASTOR as background.

This paper discusses the project to downsize CASTOR and improve its manageability when running both at a considerably smaller scale (we intend to go from around 140 storage nodes to around 20) and with considerably less available staff effort. This transformation must be achieved while running the service in 24/7 production and supporting the transition to the newer storage element. To achieve this goal, we intend to move to a virtualised infrastructure to underpin the remaining management nodes, to improve resilience by allowing management functions to be performed by many different nodes concurrently ('cattle' as opposed to 'pets'), and to streamline the system by condensing the existing four CASTOR 'stagers' (databases that record the state of the disk pools) into a single one that supports all users.

Unified Account Management for High Performance Computing as a Service with Microservice Architecture
Rongqiang Cao, Shasha Lu, Xiaoning Wang, Haili Xiao, Xue-bin Chi — DOI: https://doi.org/10.22323/1.327.0020 — published 2018-12-12

Abstract: In recent years, High Performance Computing (HPC) has developed rapidly in China. At the Chinese Academy of Sciences (CAS) level, the Scientific Computing Grid (ScGrid) is a general-purpose computing platform started in 2006, which has provided a problem-solving environment for computing users through grid computing and cloud computing technologies. In 2011 ScGrid became the Supercomputing Cloud, an important part of the Chinese Science Cloud. At the national level, the China National Grid (CNGrid) has integrated massive HPC resources from several national supercomputing centers and other large, geographically distributed centers, and has been providing efficient computing services for users in diverse disciplines and research areas. Over more than 10 years, CNGrid and ScGrid have integrated tens of geographically distributed HPC resources across China, comprising the six national supercomputing centers of Tianjin, Jinan, Changsha, Shenzhen, Guangzhou and Wuxi, as well as dozens of teraflops-scale HPC resources belonging to universities and institutes. In total, the computing capability in CNGrid is more than 200 PF and the storage capacity is more than 160 PB.

Having worked in the operation and management center of CNGrid and ScGrid for many years, we have noticed that users prefer to manage their jobs on different supercomputers and clusters via a global account, from different remote clients such as science gateways, desktop applications and even scripts. They do not want to apply for a separate account on each supercomputer and log in to each one in its own specific way.

We therefore describe Unified Account Management as a Service (UAMS), which lets each user access and use all HPC resources via a single global account. We address and solve the challenges of mapping a global account to many local accounts, and provide unified account registration, management and authentication for different collaborative web gateways, command-line toolkits and other desktop applications. UAMS was designed in accordance with the core principles of simplicity, compatibility and reusability. In the architecture design, we focused on a loosely coupled style to obtain good scalability and to allow internal modules to be updated transparently. In the implementation, we applied widely accepted practice in defining the RESTful API and divided it into several isolated microservices according to their usages and scenarios. For security, all sensitive data transferred over the wide-area network is protected by HTTPS with transport layer security outside of CNGrid, and by secure communication channels provided by OpenSSH inside of CNGrid. In addition, all parameters submitted to the RESTful web services are strictly checked for format and variable type.

By providing these frequently important but always challenging capabilities as a service, UAMS allows users to use tens of HPC resources and clients via only one account, and makes it easy for developers to implement HPC-related clients and services with advantages of numerous users and s
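
To illustrate the global-to-local account mapping that UAMS exposes as a RESTful microservice, the sketch below queries a hypothetical endpoint over HTTPS with a bearer token. The base URL, endpoint path, token scheme and response fields are assumptions for illustration, not the published UAMS API.

```python
# Illustrative sketch only: the base URL, endpoint path, token scheme and
# response fields below are assumptions, not the published UAMS API.
import requests

UAMS_BASE = "https://uams.example.cngrid.cn/api/v1"  # hypothetical base URL

def get_local_accounts(global_user: str, token: str) -> dict:
    """Resolve a global CNGrid account to its per-cluster local accounts."""
    resp = requests.get(
        f"{UAMS_BASE}/users/{global_user}/mappings",
        headers={"Authorization": f"Bearer {token}"},  # all traffic over HTTPS
        timeout=10,
    )
    resp.raise_for_status()
    # Assumed response shape: {"tianjin": "local_user_a", "guangzhou": "local_user_b", ...}
    return resp.json()

if __name__ == "__main__":
    for cluster, local_account in get_local_accounts("global_user", "api_token").items():
        print(f"{cluster}: submit jobs as {local_account}")
```

A client only ever holds the global account; the per-cluster identities stay behind the service, which is the point of the mapping layer described in the abstract.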

Extending WLCG Tier-2 Resources using HPC and Cloud Solutions
J. Chudoba, M. Svatos — DOI: https://doi.org/10.22323/1.327.0025 — published 2018-12-12

Abstract: Available computing resources limit data simulation and processing for the LHC experiments. WLCG Tier centers connected via the Grid provide the majority of computing and storage capacity, which allows relatively fast and precise analyses of data. Requirements on the number of simulated events must often be reduced to fit the installed capacities. Projections of requirements for future LHC runs show a significant shortage of standard Grid resources if a flat budget is assumed. Several activities are exploring other sources of computing power for LHC projects; the most significant are big HPC centers (supercomputers) and Cloud resources provided by both commercial and academic institutions. The Tier-2 center hosted by the Institute of Physics (FZU) in Prague provides resources for the ALICE and ATLAS collaborations on behalf of all involved Czech institutions. The financial resources provided by funding agencies and the resources provided by the IoP do not allow us to buy enough servers to meet the demands of the experiments. We extend storage resources with two distant sites using additional sources of funding: Xrootd servers at the Institute of Nuclear Physics in Rez near Prague store files for the ALICE experiment, and the CESNET data storage group operates a dCache instance with a tape backend for the ATLAS (and Pierre Auger Observatory) collaborations. Relatively large computing capacities could be used at the national supercomputing center IT4I in Ostrava. Within the ATLAS collaboration, we explore two different solutions to overcome the technical problems arising from the different computing environment on the supercomputer; the main difference is that individual worker nodes do not have an external network connection and cannot directly download input or upload output data. One solution is already used for HPC centers in the USA, but so far requires significant adjustments of the procedures used for standard ATLAS production. The other solution is based on an ARC CE hosted by the Tier-2 center at the IoP and remote resubmission of jobs via ssh.
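
The ssh-based route mentioned above can be pictured with a short sketch: inputs are first copied to a filesystem the worker nodes can see (since they have no external connectivity), and the payload is then submitted to the site batch system over ssh. This is a minimal illustration under assumed host names, paths and a qsub-style batch command; it is not the actual ARC CE setup.

```python
# Minimal sketch of ssh-based job resubmission: stage input to the HPC shared
# filesystem, then submit the payload to the local batch system over ssh.
# Host name, paths and the qsub command are illustrative assumptions.
import subprocess

HPC_LOGIN = "login.hpc.example.cz"       # hypothetical login node
REMOTE_DIR = "/scratch/atlas/job_12345"  # shared filesystem visible to worker nodes

def stage_and_submit(local_input: str, payload_script: str) -> str:
    # Worker nodes have no external connectivity, so inputs must be copied
    # to the shared filesystem before the job starts.
    subprocess.run(["ssh", HPC_LOGIN, "mkdir", "-p", REMOTE_DIR], check=True)
    subprocess.run(["scp", local_input, payload_script,
                    f"{HPC_LOGIN}:{REMOTE_DIR}/"], check=True)
    # Submit to the site batch system; a qsub-style command is assumed here.
    result = subprocess.run(
        ["ssh", HPC_LOGIN, f"cd {REMOTE_DIR} && qsub {payload_script}"],
        check=True, capture_output=True, text=True,
    )
    return result.stdout.strip()  # batch job id, to be polled later

print(stage_and_submit("input.root", "run_simulation.sh"))
```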

Explore New Computing Environment for LHAASO Offline Data Analysis
Qiulan Huang, Gongxing Sun, Qiao Yin, Zhanchen Wei, Qiang Li — DOI: https://doi.org/10.22323/1.327.0021 — published 2018-12-12

Abstract: This paper explores a way to build a new computing environment based on Hadoop that lets Large High Altitude Air Shower Observatory (LHAASO) jobs run on it transparently. In particular, we discuss a new mechanism that allows LHAASO software to access data in HDFS randomly. This feature lets Map/Reduce tasks read and write data randomly on the local file system instead of using the Hadoop data streaming interface, which makes it possible to run HEP jobs on Hadoop. We also develop MapReduce patterns for LHAASO jobs such as Corsika simulation, ARGO detector simulation (Geant4), KM2A simulation and Medea++ reconstruction, and provide a user-friendly interface. In addition, we provide real-time cluster monitoring in terms of cluster health and the numbers of running, finished and killed jobs; an accounting system is also included. This work has been in production for LHAASO offline data analysis since September 2016, delivering about 20,000 CPU hours per month. The results show that the efficiency of I/O-intensive jobs can be improved by about 46%. Finally, we describe our ongoing work on a data migration tool for moving data between HDFS and other storage systems.
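
To illustrate the kind of mechanism the abstract describes, the sketch below shows a Hadoop-streaming-style mapper that copies an HDFS file to local disk so a reconstruction binary can access it with ordinary POSIX reads and writes, then pushes the result back to HDFS. The scratch directory, output area and the medea_reco executable name are assumptions, not the LHAASO production setup.

```python
#!/usr/bin/env python3
# Illustrative mapper sketch (not the LHAASO production code): each map task
# receives an HDFS path on stdin, copies the file to node-local disk so the
# HEP executable can seek within it, runs the executable, and uploads the
# result back to HDFS. Paths and the executable name are assumptions.
import os
import subprocess
import sys

LOCAL_SCRATCH = "/tmp/lhaaso"          # node-local scratch for random-access I/O
HDFS_OUTPUT = "/lhaaso/reco/output"    # hypothetical HDFS output area

os.makedirs(LOCAL_SCRATCH, exist_ok=True)

for line in sys.stdin:
    hdfs_path = line.strip()
    if not hdfs_path:
        continue
    local_path = os.path.join(LOCAL_SCRATCH, os.path.basename(hdfs_path))
    # Copy the input out of HDFS so the reconstruction binary can read/write it
    # with ordinary POSIX seeks instead of the streaming interface.
    subprocess.run(["hdfs", "dfs", "-get", hdfs_path, local_path], check=True)
    result_path = local_path + ".reco"
    subprocess.run(["medea_reco", local_path, result_path], check=True)  # hypothetical binary
    subprocess.run(["hdfs", "dfs", "-put", "-f", result_path, HDFS_OUTPUT], check=True)
    print(f"{hdfs_path}\tdone")  # mapper output: key = input path, value = status
```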

Smart Policy Driven Data Management and Data Federations
P. Fuhrmann, M. Antonacci, G. Donvito, O. Keeble, P. Millar — DOI: https://doi.org/10.22323/1.327.0001 — published 2018-12-12

Abstract: The core activity within the newly created H2020 eXtreme DataCloud project will be the policy-driven orchestration of federated data management for data-intensive sciences such as High Energy Physics, Astronomy, Photon and Life Science. Well-known experts in this field will work on combining already established data management and orchestration tools to provide a highly scalable solution supporting the entire European scientific landscape. The work will cover "Data Life Cycle Management" as well as smart data placement based on metadata, including storage availability, network bandwidth and data access patterns. Mechanisms will be put in place to trigger computational resources based on data ingestion and data movements. This paper presents the first architecture of this endeavor.

WLCG Tier-2 site at NCP, Status Update and Future Direction
Saqib Haleem, Fawad Saeed, A. Rehman, Muhammad Imran — DOI: https://doi.org/10.22323/1.327.0015 — published 2018-12-12

Abstract: The National Centre for Physics (NCP) in Pakistan maintains a computing infrastructure for the scientific community. A major portion of the computing and storage resources is reserved for the CMS experiment through the WLCG infrastructure, and a small portion of the computing resources is reserved for scientific experiments other than experimental high-energy physics (EHEP). For efficient utilization of resources, many scientific organizations have migrated their resources into infrastructure-as-a-service (IaaS) facilities. NCP took such an initiative last year and has migrated most of its resources into an IaaS facility. An HTCondor-based batch system has been deployed to allow the local experimental high-energy physics community to perform their analysis tasks. Recently we deployed an HTCondor compute element (CE) as a gateway for CMS jobs. On the network side, our Tier-2 site is fully accessible and operational over IPv6. Moreover, we recently deployed a perfSONAR node to actively monitor throughput and latency issues between NCP and other WLCG sites. This paper discusses the status of the NCP Tier-2 site, its current challenges and future directions.

Authorship recognition and disambiguation of scientific papers using a neural networks approach
S. Schifano, Tommaso Sgarbanti, L. Tomassetti — DOI: https://doi.org/10.22323/1.327.0007 — published 2018-03-01

Abstract: Authorship recognition and author-name disambiguation are major issues affecting the quality and reliability of bibliographic records retrieved from digital libraries such as Web of Science, Scopus, Google Scholar and many others. So far, these problems have been addressed using methods mainly based on text pattern recognition for specific datasets, with a high degree of error.

In this paper, we propose a different approach that uses neural networks to learn features automatically for authorship recognition and the disambiguation of author names. The network learns, for each author, the set of co-writers, and from this information recovers the authorship of papers. In addition, the network can be trained taking into account other features, such as author affiliations, keywords, projects and research areas.

The network has been developed using the TensorFlow framework and run on recent Nvidia GPUs and multi-core Intel CPUs. Test datasets have been selected from records of the Scopus digital library for several groups of authors working in the fields of computer science, environmental science and physics. The proposed method achieves accuracies above 99% in authorship recognition and is able to effectively disambiguate homonyms.

We have examined several network parameters, such as training-set and batch size, number of layers and hidden units, weight initialization, and back-propagation algorithms, and have also analyzed their impact on the accuracy of the results. This approach can easily be extended to any dataset and any provider of bibliographic records.
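
As a rough illustration of the co-writer idea, the sketch below trains a small feed-forward classifier on multi-hot co-author vectors to predict which candidate author wrote a paper. The layer sizes, vocabulary sizes and toy data are assumptions made for the example; the authors' actual network and feature set are only described at a high level in the abstract.

```python
# A minimal sketch of the co-writer approach, not the authors' network:
# a paper is represented by a multi-hot vector over the co-author vocabulary,
# and a small feed-forward network predicts which candidate author wrote it.
# Sizes and the random training data are assumptions for illustration only.
import numpy as np
import tensorflow as tf

num_coauthors = 500    # size of the co-writer vocabulary (assumed)
num_authors = 20       # candidate authors to disambiguate (assumed)

model = tf.keras.Sequential([
    tf.keras.Input(shape=(num_coauthors,)),
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(num_authors, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# Toy training data: each row flags which co-writers appear on a paper,
# and the label is the index of the true author.
x_train = np.random.randint(0, 2, size=(1000, num_coauthors)).astype("float32")
y_train = np.random.randint(0, num_authors, size=(1000,))
model.fit(x_train, y_train, epochs=5, batch_size=32, verbose=0)
```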

A Study of Credential Integration Model in Academic Research Federation Supporting a Wide Variety of Services
E. Sakane, Takeshi Nishimura, K. Aida, Motonori Nakamura — DOI: https://doi.org/10.22323/1.327.0016 — published 2018-03-01

Abstract: This paper investigates the situation in which users must use a separate credential for each desired service, clarifies the problems that arise in this situation, and identifies the issues addressed by the concept of "identity federation". Japan has GakuNin, an academic access management federation, and the HPCI, a distributed high-performance computing infrastructure. For the provision of HPCI resources, the HPCI cannot simply behave as a service provider in GakuNin. Consequently, when performing academic research, HPCI users belonging to academic institutions in particular are compelled to manage both GakuNin and HPCI credentials. In this paper, based on the situation in Japan described above, we discuss a credential integration model that allows a wide variety of services to be used more efficiently. We first characterize services in an academic federation from the point of view of authorization and examine the problem that users must handle separate credentials issued by different identity providers. We then discuss the issues involved in integrating a user's credentials and propose a model that solves them.