XSEDE personnel help University of Cincinnati create Advanced Research Computing Cluster
New system enables a variety of research and education activities, fosters increased collaboration between scientific, engineering, and humanities departments
Original content via Indiana University
Extreme Science and Engineering Discovery Environment (XSEDE) Capabilities and Resource Integration (XCRI) staff recently supported the University of Cincinnati (UC) in implementing its first centralized high-performance computing (HPC) system.
Conversations began between XCRI and UC in March 2018, when George Turner, chief systems architect at Indiana University, delivered a presentation about Jetstream for UC Library's Data Day. Soon after, Jane Combs, associate director of research and development with UC's IT@UC, engaged Turner for advice in developing a central Advanced Research Computing (ARC) program.
Their conversations led to the creation of the pilot Advanced Research Computing Cluster (ARCC), made possible through funding from UC's Office of Research and the integration of departmental HPC clusters. In the past, computational research took place entirely within certain labs; UC's current goal is to offer all researchers access to the ARCC.
The project plan called for multiple builds. The first, in December 2018, involved creating an initial cluster from the hardware on hand at UC, in a spare engineering lab on campus. Eric Coulter, Senior XCRI Engineer, directed initial build activities remotely, while Stephen Bird (IU/XSEDE) and George Turner worked on site with local UC staff.
Over the 2018 winter break, the team ran a productive simulation. Once all the components arrived, the whole team assembled at UC in January 2019 to complete the final build-out and systems integration in UC's data center. The ARCC has 36 compute nodes with 40 cores each, a GPU node with two NVIDIA V100s, and a modest workbench/scratch filesystem, all connected via Intel's low-latency, high-bandwidth Omni-Path switch fabric. Excluding the GPUs, the ARCC offers a theoretical peak of 101 TFLOPS of computing power.
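For readers curious how a theoretical peak figure like this is typically derived, the short sketch below multiplies node count, cores per node, clock rate, and floating-point operations per cycle. The node and core counts come from the article. The clock rate (2.2 GHz) and the 32 double-precision FLOPs per cycle (what a core with two AVX-512 fused multiply-add units can issue) are assumptions chosen for illustration, since the article does not name the processor model; they happen to land close to the quoted 101 TFLOPS.

```python
def peak_tflops(nodes, cores_per_node, clock_ghz, flops_per_cycle):
    """Return theoretical peak throughput in TFLOPS.

    peak = nodes * cores per node * clock (GHz) * FLOPs per cycle per core,
    which yields GFLOPS; dividing by 1000 converts to TFLOPS.
    """
    gflops = nodes * cores_per_node * clock_ghz * flops_per_cycle
    return gflops / 1000.0


# Figures stated in the article.
NODES = 36
CORES_PER_NODE = 40

# Assumptions (NOT from the article): an illustrative clock rate and a
# core that can issue 32 double-precision FLOPs per cycle.
ASSUMED_CLOCK_GHZ = 2.2
ASSUMED_FLOPS_PER_CYCLE = 32

arcc_peak = peak_tflops(NODES, CORES_PER_NODE,
                        ASSUMED_CLOCK_GHZ, ASSUMED_FLOPS_PER_CYCLE)
print(f"{NODES * CORES_PER_NODE} cores, about {arcc_peak:.1f} TFLOPS peak")
```

A theoretical peak is an upper bound; sustained performance on real workloads is normally lower, and the GPU node is excluded here just as it is in the article's figure.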
The ARCC was designed to enable a variety of research and educational activities, and to encourage increased collaboration between scientific, engineering, and humanities departments. The "condo" model, in which departments share their HPC resources, gives a broad swath of researchers access to the cluster. It also makes a much larger system available to labs in return for a small contribution toward the whole.
Specific research projects that will benefit immediately from the ARCC include the following:
Data-intensive Neutrino Physics
Computational Intelligence for Large Adaptive UAS Swarms for Disaster Management
Quantum Models of Ion Solvation Thermodynamics
Computational Data Mining Techniques for Complex, Multimodal Datasets
Modeling of Multiphase Flows with Machine Learning Techniques
This resource is the first of many for UC's Advanced Research Computing project, and XCRI will continue to help as needed with the growing pains of a new program.
As part of its Capabilities and Resource Integration effort, the Extreme Science and Engineering Discovery Environment (XSEDE) distributes an Ansible-based toolkit to minimize the complexity of building XSEDE-compatible Linux clusters for use by the US open science community. The XSEDE Compatible Basic Cluster (XCBC) toolkit provides a set of scripts to quickly and easily build a local HPC system with a similar user environment to larger XSEDE systems, based on the OpenHPC project.