COVID-19 HPC Consortium

HPC Resources available to fight COVID-19

The COVID-19 HPC Consortium encompasses computing capabilities from some of the most powerful and advanced computers in the world. We hope to empower researchers around the world to accelerate understanding of the COVID-19 virus and the development of treatments and vaccines to help address infections. Consortium members manage a range of computing capabilities that span from small clusters to some of the very largest supercomputers in the world.

Preparing your COVID-19 HPC Consortium Request

To request access to resources of the COVID-19 HPC Consortium, you must prepare a description, no longer than three pages, of your proposed work. To ensure your request is directed to the appropriate resource(s), your description should include the sections outlined below. Do not include any proprietary information in proposals, since your request will be reviewed by staff from a number of consortium sites. 

The proposals will be evaluated on the following criteria:

  • Potential benefits for COVID-19 response
  • Feasibility of the technical approach
  • Need for high-performance computing
  • High-performance computing knowledge and experience of the proposing team
  • Estimated computing resource requirements 

Please note the following parameters and expectations:

  • Allocations of resources are expected to be for a maximum of 6 months; proposers may submit subsequent proposals for additional resources
  • All supported projects will have the name of the principal investigator, affiliation, project title, and project abstract posted to the COVID-19 HPC Consortium web site.
  • Project PIs are expected to provide brief (~2 paragraphs) updates on a weekly basis.
  • It is expected that teams who receive Consortium access will publish their results in the open scientific literature. 

A. Scientific/Technical Goal

Describe how your proposed work contributes to our understanding of COVID-19 and/or improves the nation's ability to respond to the pandemic.

  • What is the scientific/technical goal?
  • What is the plan and timetable for getting to the goal?
  • What is the expected period of performance (one week to three months)?
  • Where do you plan to publish your results and in what timeline? 

B. Estimate of Compute, Storage and Other Resources

To the extent possible, provide an estimate of the scale and type of the resources needed to complete the work, making sure to address the points below (a simple worked example follows the list). The information in the Resources section below is available to help you answer this question. Please be as specific as possible in your resource request. If you have more than one phase of computational work, please address the points below for each phase (including subtotals for each phase):

  • Are there computing architectures or systems that are most appropriate (e.g. GPUs, large memory, large core counts on shared memory nodes, etc.)?
  • What is the scale of total computing and data storage resources needed for the work?
    • For example, how long does a single analysis take on what number/kind of CPU cores or GPUs, requiring how much memory (RAM and/or GPU memory) and what sizes of input and output data? How many analyses are proposed?
  • How distributed can the computation be, and can it be executed across multiple computing systems?
  • Can this workload be executed in a cloud environment?
  • Does your project require access to any public datasets? If so, please describe these datasets and how you intend to use them.
  • Do you prefer specific resource provider(s)/system(s), or can your analyses be run on a range of systems?
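
To make this arithmetic concrete, the short Python sketch below shows one way to tally a request; every number is a hypothetical placeholder for illustration, not guidance on what to ask for:

    # Hypothetical resource estimate for a single phase of work (all values are placeholders).
    analyses         = 500    # independent analyses (e.g., simulations or docking batches)
    node_hours_each  = 12     # wall-clock hours per analysis on one node
    gpus_per_node    = 4      # only relevant when requesting a GPU system
    input_gb_each    = 2      # input data per analysis (GB)
    output_gb_each   = 20     # output data per analysis (GB)

    total_node_hours = analyses * node_hours_each
    total_gpu_hours  = total_node_hours * gpus_per_node
    total_storage_tb = analyses * (input_gb_each + output_gb_each) / 1000

    print(f"Node-hours requested: {total_node_hours:,}")
    print(f"GPU-hours requested:  {total_gpu_hours:,}")
    print(f"Storage requested:    {total_storage_tb:.1f} TB")

Subtotals of this kind, stated separately for each phase of the work, make it much easier for reviewers to match your request to an appropriate system.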

C. Support Needs

Describe whether collaboration or support from staff at the National labs, Commercial Cloud providers, or other HPC facilities will be essential, helpful, or unnecessary. Estimates of necessary application support are very helpful. Teams should also identify any restrictions that might apply to the project, such as export-controlled code, ITAR restrictions, proprietary data sets, regional location of compute resources, or personal health information (PHI) or HIPAA restrictions. In such cases, please provide information on security, privacy and access issues.

D. Team and Team Preparedness

Summarize your team's qualifications and readiness to execute the project.

  • What is the expected lead time before you can begin the simulation runs?
  • What systems have you recently used and how big were the simulation runs?
  • Given that some resources are at federal facilities with restrictions, please provide a list of team members that will require accounts on resources along with their citizenship. 

Document Formatting

While readability is of greatest importance, documents must satisfy the following minimum requirements. Documents that conform to NSF proposal format guidelines will satisfy these guidelines.

  • Margins: Documents must have 2.5-cm (1-inch) margins at the top, bottom, and sides.
  • Fonts and Spacing: The type size used throughout the documents must conform to the following three requirements:
  • Use one of the following typefaces:
    • Arial, Courier New, or Palatino Linotype at a font size of 10 points or larger;
    • Times New Roman at a font size of 11 points or larger; or
    • Computer Modern family of fonts at a font size of 11 points or larger.
  • A font size of less than 10 points may be used for mathematical formulas or equations, figures, table or diagram captions and when using a Symbol font to insert Greek letters or special characters. PIs are cautioned, however, that the text must still be readable.
  • Type density must be no more than 15 characters per 2.5 cm (1 inch).
  • No more than 6 lines must be within a vertical space of 2.5 cm (1 inch).

  • Page Numbering: Page numbers should be included in each file by the submitter. Page numbering is not provided by XRAS.
  • File Format: XRAS accepts only PDF file formats.

Submitting your COVID-19 HPC Consortium request 

  1. Create an XSEDE portal account
    • Go to https://portal.xsede.org/
    • Click "Sign In" at the upper right if you already have an XSEDE account, or click "Create Account" to create one.
    • To create an account, basic information will be required (name, organization, degree, address, phone, email). 
    • Email verification will be necessary to complete the account creation.
    • Set your username and password.
    • After your account is created, be sure you're logged into https://portal.xsede.org/
    • IMPORTANT: Each individual should have their own XSEDE account; it is against policy to share user accounts.
  2. Go to the allocation request form
    • Follow this link to go directly to the submission form.
    • Or to navigate to the request form:
      • Click the "Allocations" tab in the XSEDE User Portal,
      • Then select "Submit/Review Request."
      • Select the "COVID-19 HPC Consortium" opportunity.
    • Select "Start a New Submission."
  3. Complete your submission
    • Provide the data required by the form. Fields marked with a red asterisk are required to complete a submission.
    • The most critical screens are the Personnel, Title/Abstract, and Resources screens.
      • On the Personnel screen, one person must be designated as the Principal Investigator (PI) for the request. Other individuals can be added as co-PIs or Users (but they must have XSEDE accounts).
      • On the Title/Abstract screen, all fields are required.
      • On the Resources screen…
        • Enter "n/a" in the "Disclose Access to Other Compute Resources" field (to allow the form to be submitted).
        • Then, select "COVID-19 HPC Consortium" and enter 1 in the Amount Requested field. 
    • On the Documents screen, select "Add Document" to upload your 3-page document. Select "Main Document" or "Other" as the document Type.
      • Only PDF files can be accepted.
    • You can ignore the Grants and Publications sections. However, you are welcome to enter any supporting agency awards, if applicable.
    • On the Submit screen, select "Submit Request." If necessary, correct any errors and submit the request again.

Resources available for COVID-19 HPC Consortium request 

U.S. Department of Energy (DOE) Advanced Scientific Computing Research (ASCR)
Supercomputing facilities at DOE offer some of the most powerful resources for scientific computing in the world. The Argonne Leadership Computing Facility (ALCF), the Oak Ridge Leadership Computing Facility (OLCF), and Lawrence Berkeley National Laboratory (LBNL) may be used for modeling and simulation coupled with machine learning and deep learning techniques to study a range of areas, including examining underlying protein structure, classifying the evolution of the virus, understanding mutation, uncovering important differences and similarities with the 2002-2003 SARS virus, searching for potential vaccine and antiviral compounds, and simulating the spread of COVID-19 and the effectiveness of countermeasure options.

 

Oak Ridge Summit | 200 PF, 4608 nodes, IBM POWER9/NVIDIA Volta

Summit System

2 x IBM POWER9 CPUs per node
42 TF per node
6 x NVIDIA Volta GPUs per node
512 GB DDR4 + 96 GB HBM2 (GPU memory) per node
1600 GB NVMe local storage per node
2 x Mellanox EDR IB adapters (100 Gbps per adapter)
250 PB, 2.5 TB/s, IBM Spectrum Scale storage

 

Argonne Theta | 11.69 PF, 4292 nodes, Intel Knights Landing

1 x Intel KNL 7230 per node, 64 cores per CPU
192 GB DDR4, 16 GB MCDRAM memory per node
128 GB local storage per node
Aries dragonfly network
10 PB Lustre + 1 PB IBM Spectrum Scale storage
Full details available at: https://www.alcf.anl.gov/alcf-resources

 

Lawrence Berkeley National Laboratory 

LBNL Cori | 32 PF, 12,056 Intel Xeon Phi and Xeon nodes
9,668 nodes, each with one 68-core Intel Xeon Phi Processor 7250 (KNL)
96 GB DDR4 and 16 GB MCDRAM memory per KNL node
2,388 nodes, each with two 16-core Intel Xeon Processor E5-2698 v3 (Haswell)
128 GB DDR4 memory per Haswell node
Cray Aries dragonfly high speed network
30 PB Lustre file system and 1.8 PB Cray DataWarp flash storage
Full details at: https://www.nersc.gov/systems/cori/

U.S. DOE National Nuclear Security Administration (NNSA)

Established by Congress in 2000, NNSA is a semi-autonomous agency within the U.S. Department of Energy responsible for enhancing national security through the military application of nuclear science. NNSA resources at Lawrence Livermore National Laboratory (LLNL), Los Alamos National Laboratory (LANL), and Sandia National Laboratories (SNL) are being made available to the COVID-19 HPC Consortium.

Lawrence Livermore + Los Alamos + Sandia | 32.2 PF, 7375 nodes, IBM POWER8/9, Intel Xeon
  • LLNL Lassen
    • 23 PFLOPS, 788 compute nodes, IBM Power9/NVIDIA Volta GV100
    • 28 TF per node
    • 2 x IBM POWER9 CPUs (44 cores) per node
    • 4 x NVIDIA Volta GPUs per node
    • 256 GB DDR4 + 64 GB HBM2 (GPU memory) per node
    • 1600 GB NVMe local storage per node
    • 2 x Mellanox EDR IB (100Gb/s per adapter)
    • 24 PB storage
  • LLNL Quartz
    • 3.2 PF, 3004 compute nodes, Intel Broadwell
    • 1.2 TF per node
    • 2 x Intel Xeon E5-2695 CPUs (36 cores) per node
    • 128 GB memory per node
    • 1 x Intel Omni-Path IB (100Gb/s)
    • 30 PB storage (shared with other clusters)
  • LLNL Pascal
    • 0.9 PF, 163 compute nodes, Intel Broadwell CPUs/NVIDIA Pascal P100
    • 11.6 TF per node
    • 2 x Intel Xeon E5-2695 CPUs (36 cores) per node
    • 2 x NVIDIA Pascal P100 GPUs per node
    • 256 GB memory + 32 HBM2 (GPU memory) per node
    • 1 x Mellanox EDR IB (100Gb/s)
    • 30 PB storage (shared with other clusters) 
  • LLNL Ray
    • 1.0 PF, 54 compute nodes, IBM Power8/NVIDIA Pascal P100
    • 19 TF per node
    • 2 x IBM Power8 CPUs (20 cores) per node
    • 4 x NVIDIA Pascal P100 GPUs per node
    • 256 GB + 64 GB HBM2 (GPU memory) per node
    • 1600 GB NVMe local storage per node
    • 2 x Mellanox EDR IB (100Gb/s per adapter)
    • 1.5 PB storage
  • LLNL Surface
    • 506 TF, 158 compute nodes, Intel Sandy Bridge/NVIDIA Kepler K40m
    • 3.2 TF per node
    • 2 x Intel Xeon E5-2670 CPUs (16 cores) per node
    • 3 x NVIDIA Kepler K40m GPUs
    • 256 GB memory + 36 GB GDDR5 (GPU memory) per node
    • 1 x Mellanox FDR IB (56Gb/s)
    • 30 PB storage (shared with other clusters)
  • LLNL Syrah
    • 108 TF, 316 compute nodes, Intel Sandy Bridge
    • 0.3 TF per node
    • 2 x Intel Xeon E5-2670 CPUs (16 cores) per node
    • 64 GB memory per node
    • 1 x QLogic IB (40Gb/s)
    • 30 PB storage (shared with other clusters)
  • LANL Snow
    • 445 TF, 368 compute nodes, Intel Broadwell
    • 1.2 TF per node
    • 2 x Intel Xeon E5-2695 CPUs (36 cores) per node
    • 128 GB memory per node
    • 1 x Intel Omni-Path IB (100Gb/s)
    • 15.2 PB storage
  • LANL Badger
    • 790 TF, 660 compute nodes, Intel Broadwell
    • 1.2 TF per node
    • 2 x Intel Xeon E5-2695 CPUs (36 cores) per node
    • 128 GB memory per node
    • 1 x Intel Omni-Path IB (100Gb/s)
    • 15.2 PB storage
U.S. DOE Office of Nuclear Energy

Idaho National Laboratory | 6 PF, 2079 nodes, Intel Xeon

  • Sawtooth | 6 PF; 2079 compute nodes; 99,792 cores; 108 NVIDIA Tesla V100 GPUs
    • Mellanox Infiniband EDR, hypercube
    • CPU-only nodes:
      • 2052 nodes, 2 x Intel Xeon 8268 CPUs
      • 192 GB RAM/node
    • CPU/GPU nodes:
      • 27 nodes, 2 x Intel Xeon 8268 CPUs
      • 384 GB RAM/node
      • 4 NVIDIA Tesla V100 GPUs
Rensselaer Polytechnic Institute
The Rensselaer Polytechnic Institute (RPI) Center for Computational Innovations is solving problems for next-generation research through the use of massively parallel computation and data analytics. The center supports researchers, faculty, and students across a diverse spectrum of disciplines. RPI is making its Artificial Intelligence Multiprocessing Optimized System (AiMOS) available to the COVID-19 HPC Consortium. AiMOS is an 8-petaflop IBM Power9/Volta supercomputer configured to enable users to explore new AI applications.

 

RPI AiMOS | 11.1 PF, 252 nodes POWER9/Volta

2 x IBM POWER9 CPU per node, 20 cores per CPU
6 x NVIDIA Tesla GV100 per node
32 GB HBM per GPU
512 GB DRAM per node
1.6 TB NVMe per node
Mellanox EDR InfiniBand
11 PB IBM Spectrum Scale storage

MIT/Massachusetts Green HPC Center (MGHPCC)
MIT is contributing two HPC systems to the COVID-19 HPC Consortium. The MIT Supercloud, a 7-petaflops Intel x86/NVIDIA Volta HPC cluster, is designed to support research projects that require significant compute, memory, or big-data resources. Satori is a 2-petaflops scalable AI-oriented hardware resource for research computing at MIT, composed of 64 IBM Power9/Volta nodes. The MIT resources are installed at the Massachusetts Green HPC Center (MGHPCC), which operates as a joint venture between Boston University, Harvard University, MIT, Northeastern University, and the University of Massachusetts.

 

MIT/MGHPCC Supercloud | 6.9 PF, 440 nodes Intel Xeon/Volta

2 x Intel Xeon (18 CPU cores per node)
2 x NVIDIA V100 GPUs per node
32 GB HBM per GPU
Mellanox EDR InfiniBand
3 PB scratch storage

MIT/MGHPCC Satori | 2.0 PF, 64 nodes IBM POWER9/NVIDIA Volta

2 x IBM POWER9 CPUs, 40 cores per node
4 x NVIDIA Volta GPUs per node (256 total)
32 GB HBM per GPU
1.6 TB NVMe per node
Mellanox EDR InfiniBand
2 PB scratch storage

IBM Research WSC

The IBM Research WSC cluster consists of 56 compute nodes, each with two 22-core CPUs and 6 GPUs, plus seven additional nodes dedicated to management functions. The cluster is intended to be used for the following purposes: client collaboration, advanced research for government-funded projects, advanced research on Converged Cognitive Systems, and advanced research on Deep Learning.

IBM Research WSC | 2.8 PF, 54 nodes IBM POWER9/NVIDIA Volta

  • 54 IBM POWER9 nodes

  • 2 x POWER9 CPU per node, 22 cores per CPU

  • 6 x NVIDIA V100 GPUs per node (336 total)

  • 512 GB DRAM per node

  • 1.4 TB NVMe per node

  • 2 x Mellanox EDR InfiniBand per node

  • 2 PB IBM Spectrum Scale distributed storage

  • RHEL 7.6

  • CUDA 10.1
  • IBM PowerAI 1.6

Tools to Accelerate Discovery:

Deep Search: The traditional drug discovery pipeline is time and cost intensive. To deal with new viral outbreaks and epidemics, such as COVID-19, we need more rapid drug discovery processes. Generative AI models have shown promise for automating the discovery of molecules. However, many challenges still exist: Current generative frameworks are not efficient in handling design tasks with multiple discovery constraints, have limited exploratory and expansion capabilities, and require expensive model retraining to learn beyond limited training data.

We have developed advanced and robust generative frameworks that can overcome these challenges to create novel peptides, proteins, drug candidates, and materials. We have applied our methodology to generate drug-like molecule candidates for COVID-19 targets. Our hope is that by releasing these novel molecules, the research and drug design communities can accelerate the process of identifying promising new drug candidates for coronavirus and potential similar, new outbreaks. This work demonstrates our vision for the future of accelerated discovery, where AI researchers and pharmaceutical scientists work together to rapidly create next-generation therapeutics, aided by novel AI-powered tools.

Drug Candidate Exploration: Using the generative frameworks described above, we have produced drug-like molecule candidates for COVID-19 targets and are releasing these novel molecules so that the research and drug design communities can accelerate the identification of promising new drug candidates for coronavirus and similar future outbreaks.

Functional Genomics Platform: The IBM Functional Genomics Platform is a database and a cloud platform designed to study microbial life at scale. It contains over 300 million sequences for both bacteria and viruses— seamlessly connecting their genomes, genes, proteins, and functional domains. Together, these sequences describe the collective biological activity that a microbe can have and are therefore used to develop health interventions such as antivirals, vaccines, and diagnostic tests. In response to the global COVID-19 pandemic, we processed all newly sequenced public SARS-CoV-2 genomes and are offering access for free to the IBM Functional Genomics Platform to support important research for identifying molecular targets to aid discovery during this public health crisis.

U.S. National Science Foundation (NSF)

The NSF Office of Advanced Cyberinfrastructure (OAC) supports and coordinates the development, acquisition, and provision of state-of-the-art cyberinfrastructure resources, tools, and services essential to the advancement and transformation of science and engineering. By fostering a vibrant ecosystem of technologies and a skilled workforce of developers, researchers, staff, and users, OAC serves the growing community of scientists and engineers across all disciplines. The most capable resources supported by NSF OAC are being made available to support the COVID-19 HPC Consortium.

Frontera | 38.7 PF, 8114 nodes, Intel Xeon, NVIDIA RTX GPU

Funded by the National Science Foundation and operated by the Texas Advanced Computing Center (TACC), Frontera provides a balanced set of capabilities that supports both capability and capacity simulation, data-intensive science, visualization, and data analysis, as well as emerging applications in AI and deep learning. Frontera has two computing subsystems: a primary computing system focused on double-precision performance and a second subsystem focused on single-precision, streaming-memory computing. Frontera was built by Dell, Intel, DataDirect Networks, Mellanox, NVIDIA, and GRC.

Comet | 2.75 PF, total 2020 nodes, Intel Xeon, NVIDIA Pascal GPU

Operated by the San Diego Supercomputer Center (SDSC), Comet is a nearly 3-petaflop cluster designed by Dell and SDSC. It features Intel next-generation processors with AVX2, Mellanox FDR InfiniBand interconnects, and Aeon storage. 

Stampede2 | 19.3 PF, 4,200 Intel KNL nodes, 1,736 Intel Xeon nodes

Operated by TACC, Stampede2 is a nearly 20-petaflop national HPC resource accessible to thousands of researchers across the country, enabling new computational and data-driven discoveries and advances in science, engineering, research, and education.

Longhorn | 2.8 PF, 112 nodes, IBM POWER9, NVIDIA Volta

Longhorn is a TACC resource built in partnership with IBM to support GPU-accelerated workloads. The power of this system is in its multiple GPUs per node, and it is intended to support sophisticated workloads that require high GPU density and little CPU compute. Longhorn will support double-precision machine learning and deep learning workloads that can be accelerated by GPU-powered frameworks, as well as general purpose GPU calculations.

Bridges | 2 PF, 874 nodes, Intel Xeon, NVIDIA K80/V100/P100 GPUs, DGX-2

Operated by the Pittsburgh Supercomputing Center (PSC), Bridges and Bridges-AI provide an innovative HPC and data-analytics system, integrating advanced memory technologies to empower new modalities of artificial intelligence-based computation, bring desktop convenience to HPC, connect to campuses, and express data-intensive scientific and engineering workflows.

Jetstream | 320 nodes, Cloud accessible

Operated by a team led by the Indiana University Pervasive Technology Institute, Jetstream is a configurable, large-scale computing resource that uses both on-demand and persistent virtual machines to support a wide array of software environments and services, combining elements of commercial cloud computing with software well suited to solving important scientific problems.

Open Science Grid | Distributed High Throughput Computing, 10,000+ nodes, Intel x86-compatible CPUs, various NVIDIA GPUs

The Open Science Grid (OSG) is a large virtual cluster of distributed high-throughput computing (dHTC) capacity shared by numerous national labs, universities, and non-profits, with the ability to seamlessly integrate cloud resources, too. The OSG Connect service makes this large distributed system available to researchers, who can individually use up to tens of thousands of CPU cores and up to hundreds of GPUs, along with significant support from the OSG team. Ideal work includes parameter optimization/sweeps, molecular docking, image processing, many bioinformatics tasks, and other work that can run as numerous independent tasks each needing 1-8 CPU cores, <8 GB RAM, and <10GB input or output data, though these can be exceeded significantly by integrating cloud resources and other clusters, including many of those contributing to the COVID-19 HPC Consortium.
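
As a rough illustration of the decomposition OSG favors, the short Python sketch below (hypothetical file names and task sizes, not an OSG-specific interface) splits a docking-style parameter sweep into many small, independent tasks ahead of submission:

    # Hypothetical sketch: split a ligand library into independent high-throughput tasks,
    # each sized to stay within typical per-task limits (1-8 CPU cores, <8 GB RAM, <10 GB data).
    ligands = [f"ligand_{i:05d}.pdbqt" for i in range(100_000)]   # placeholder input names
    ligands_per_task = 250                                        # keeps per-task I/O small

    tasks = [ligands[i:i + ligands_per_task]
             for i in range(0, len(ligands), ligands_per_task)]

    # One argument file per task; a workload manager (OSG uses HTCondor) can then queue
    # one job per file, all running the same docking executable independently.
    for n, chunk in enumerate(tasks):
        with open(f"task_{n:04d}.txt", "w") as f:
            f.write("\n".join(chunk) + "\n")

    print(f"{len(tasks)} independent tasks of up to {ligands_per_task} ligands each")

Each task then needs only its own small slice of the input data, which is what lets the work spread across thousands of cores at once.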

Cheyenne | 5.34 PF, 4032 nodes, Intel Xeon

Operated by the National Center for Atmospheric Research (NCAR), Cheyenne is a critical tool for researchers across the country studying climate change, severe weather, geomagnetic storms, seismic activity, air quality, wildfires, and other important geoscience topics. The Cheyenne environment also encompasses tens of petabytes of storage capacity and an analysis cluster to support efficient workflows. Built by SGI (now HPE), Cheyenne is funded by the Geosciences directorate of the National Science Foundation.

Blue Waters | 13.34 PF, 26,864 nodes, AMD Interlagos, NVIDIA Kepler K20X GPU

The Blue Waters sustained-petascale computing project is supported by the National Science Foundation, the State of Illinois, the University of Illinois, and the National Geospatial-Intelligence Agency. Blue Waters is a joint effort of the University of Illinois at Urbana-Champaign and its National Center for Supercomputing Applications, and the system was provided by Cray. Blue Waters is a well-balanced architecture with 22,636 XE6 nodes, each with two x86-compatible 16-core AMD Interlagos CPUs, and 4,228 XK7 nodes, each with an NVIDIA Kepler K20X GPU and a 16-core AMD Interlagos CPU. The system is integrated with a single high-speed Gemini 24x24x24 torus with an aggregate bandwidth of more than 265 TB/s to simultaneously support very large-scale parallel applications and high-throughput, many-job workloads. Blue Waters has a 36 PB (26 PB usable) shared Lustre file system that supports 1.1 TB/s of I/O bandwidth and is connected to wide-area networks at a total of 450 Gbps. The rich system software includes multiple compilers, communication libraries, visualization tools, Docker containers, Python, and machine learning and data management tools that support capability and capacity simulation, data-intensive science, visualization, data analysis, and machine learning/AI. All projects are provided with expert points of contact and advanced application support.

 

NASA High-End Computing Capability


NASA's High-End Computing Capability (HECC) Portfolio provides world-class high-end computing, storage, and associated services to enable NASA-sponsored scientists and engineers supporting NASA programs to broadly and productively employ large-scale modeling, simulation, and analysis to achieve successful mission outcomes.

NASA's Ames Research Center in Silicon Valley hosts the agency's most powerful supercomputing facilities. To help meet the COVID-19 challenge facing the nation and the world, HECC is offering access to NASA's high-performance computing (HPC) resources for researchers requiring HPC to support their efforts to combat this virus. 

 

NASA Supercomputing Systems | 19.39 PF, 17609 nodes Intel Xeon

AITKEN | 3.69 PF, 1,152 nodes, Intel Xeon
ELECTRA | 8.32 PF, 3,456 nodes, Intel Xeon
PLEIADES | 7.09 PF, 11,207 nodes, Intel Xeon, NVIDIA K40, Volta GPUs
ENDEAVOUR | 32 TF, 2 nodes, Intel Xeon
MEROPE | 253 TF, 1792 nodes, Intel Xeon

Amazon Web Services
As part of the COVID-19 HPC Consortium, AWS is offering research institutions and companies technical support and promotional credits for the use of AWS services to advance research on diagnosis, treatment, and vaccine studies to accelerate our collective understanding of the novel coronavirus (COVID-19). Researchers and scientists working on time-critical projects can use AWS to instantly access virtually unlimited infrastructure capacity, and the latest technologies in compute, storage and networking to accelerate time to results. Learn more here.
Microsoft Azure High Performance Computing (HPC)

Microsoft Azure offers purpose-built compute and storage specifically designed to handle the most demanding computationally and data intensive scientific workflows. Azure is optimized for applications such as genomics, precision medicine and clinical trials in life sciences.  

Our team of HPC experts and AI for Health data science experts, whose mission is to improve the health of people and communities worldwide, are available to collaborate with COVID-19 researchers as they tackle this critical challenge. More broadly, Microsoft's research scientists across the world, spanning computer science, biology, medicine, and public health, will be available to provide advice and collaborate per mutual interest.

Azure HPC helps improve the efficiency of the drug development process with power and scale for computationally intensive stochastic modeling and simulation workloads, such as population pharmacokinetic and pharmacokinetic-pharmacodynamic modeling.

Microsoft will give access to our Azure cloud and HPC capabilities.

HPC-optimized and AI-optimized virtual machines (VMs):

  • Memory-bandwidth-intensive CPUs: Azure HBv2 instances (AMD EPYC™ 7002-series | 4 GB RAM per core | 200 Gb/s HDR InfiniBand)
  • Compute-intensive CPUs: Azure HC instances (Intel Xeon Platinum 8168 | 8 GB RAM per core | 100 Gb/s EDR InfiniBand)
  • GPU-intensive, RDMA-connected: Azure NDv2 instances (8 NVIDIA V100 Tensor Core GPUs interconnected with NVIDIA NVLink | 32 GB RAM each | 40 non-hyperthreaded Intel Xeon Platinum 8168 processor cores | 100 Gb/s EDR InfiniBand)
  • See the full list of HPC-optimized VMs (H-Series, NC-Series, and ND-Series)

 

Storage Options:

  • Azure HPC Cache | Azure NetApp Files | Azure Blob Storage | Cray ClusterStor

Management:

  • Azure CycleCloud

Batch scheduler

Azure HPC life sciences: https://azure.microsoft.com/en-us/solutions/high-performance-computing/health-and-life-sciences/#features

Azure HPC web site: https://azure.microsoft.com/en-us/
AI for Health web site: https://www.microsoft.com/en-us/ai/ai-for-health

Hewlett Packard Enterprise
As part of this new effort to attack the novel coronavirus (COVID-19) pandemic, Hewlett Packard Enterprise is committing to providing supercomputing software and applications expertise free of charge to help researchers port, run, and optimize essential applications. Our HPE Artificial Intelligence (AI) experts are collaborating to support the COVID-19 Open Research Dataset and several other COVID-19 initiatives for which AI can drive critical breakthroughs. They will develop AI tools to mine data across thousands of scholarly articles related to COVID-19 and related coronaviruses to help the medical community develop answers to high-priority scientific questions. We encourage researchers to submit any COVID-19 related proposals to the consortium's online portal. More information can be found here: www.hpe.com/us/en/about/covid19/hpc-consortium.html.
Google

Transform research data into valuable insights and conduct large-scale analyses with the power of Google Cloud. As part of the COVID-19 HPC Consortium, Google is providing access to Google Cloud HPC resources for academic researchers.

 

BP

BP's Center for High Performance Computing (CHPC), located at their US headquarters in Houston, serves as a worldwide hub for processing and managing huge amounts of geophysical data from across BP's portfolio and is a key tool in helping scientists to 'see' more clearly what lies beneath the earth's surface. The high performance computing team is made up of people with deep skills in computational science, applied math, software engineering and systems administration. BP's biosciences research team includes computational and molecular biologists, with expertise in software tools for bioinformatics, microbial genomics, computational enzyme design and metabolic modeling.

To help meet the COVID-19 challenge facing the nation, BP is offering access to the CHPC, the high performance computing team and the biosciences research team to support researchers in their efforts to combat the virus. BP's computing capabilities include:

  • Almost 7,000 HPE compute servers with Intel Cascade Lake AP, Skylake, Knights Landing and Haswell processors
  • Over 300,000 cores
  • 16.3 Petaflops (floating-point operations per second)
  • Over 40 Petabytes of storage capacity
  • Mellanox IB and Intel OPA high speed interconnects
NVIDIA

A task force of NVIDIA researchers and data scientists with expertise in AI and HPC will help optimize research projects on the Consortium's supercomputers. The NVIDIA team has expertise across a variety of domains, including AI, supercomputing, drug discovery, molecular dynamics, genomics, medical imaging and data analytics. NVIDIA will also contribute the packaging of software for relevant AI and life-sciences software applications through NVIDIA NGC, a hub for GPU-accelerated software. The company is also providing compute time on an AI supercomputer, SaturnV.

 

D. E. Shaw Research Anton 2 at PSC

Operated by the Pittsburgh Supercomputing Center (PSC) with support from National Institutes of Health award R01GM116961, Anton 2 is a special-purpose supercomputer for molecular dynamics (MD) simulations developed and provided without cost by D. E. Shaw Research. For more information, see https://psc.edu/anton2-for-covid-19-research.

Intel

Intel will provide HPC /AI and HLS subject matter experts and engineers to collaborate on COVID-19 code enhancements to benefit the community. Intel will also provide licenses for High Performance Computing software development tools for the research programs selected by the COVID-19 HPC Consortium. The integrated tool suites include Intel's C++ and Fortran Compilers, performance libraries, and performance-analysis tools.

Ohio Supercomputer Center

The Ohio Supercomputer Center (OSC), a member of the Ohio Technology Consortium of the Ohio Department of Higher Education, addresses the rising computational demands of academic and industrial research communities by providing a robust shared infrastructure and proven expertise in advanced modeling, simulation and analysis. OSC empowers scientists with the vital resources essential to make extraordinary discoveries and innovations, partners with businesses and industry to leverage computational science as a competitive force in the global knowledge economy, and leads efforts to equip the workforce with the key technology skills required to secure 21st century jobs. For more, visit www.osc.edu

  • OSC Owens | 1.6 PF, 824 nodes Intel Xeon/Pascal
    • 2 x Intel Xeon (28 cores per node, 48 cores per big-mem node)
    • 160 NVIDIA P100 GPUs (1 per node)
    • 128 GB per node (1.5TB per big-mem node)
    • Mellanox EDR Infiniband
    • 12.5 PB Project and Scratch storage
  • OSC Pitzer | 1.3 PF, 260 nodes Intel Xeon/Volta
    • 2 x Intel Xeon (40 cores per node, 80 cores per big-mem node)
    • 64 NVIDIA V100 GPUs (2 per node)
    • 192 GB per node (384 GB per GPU node, 3TB per big-mem node)
    • Mellanox EDR Infiniband
    • 12.5 PB Project and Scratch storage
Dell

Zenith

The Zenith cluster is the result of a partnership between Dell and Intel®. Listed on the TOP500 ranking of the world's fastest supercomputers, Zenith includes Intel Xeon® Scalable processors, Omni-Path fabric architecture, data center storage solutions, FPGAs, adapters, software, and tools. Projects underway include image classification to identify disease in X-rays, matching MRI scans to thoughts and actions, and building faster neural networks to drive recommendation engines. Zenith is available to researchers via the COVID-19 HPC Consortium through the standard application process, subject to availability.

Zenith Configuration:

  • Servers:
    • 422 PowerEdge C6420 servers
    • 160 PowerEdge C6320p servers
    • 4 PowerEdge R740 servers with Intel FPGAs
  • Processors
    • 2nd generation Intel Xeon Scalable processors
    • Intel Xeon Phi™
  • Memory
    • 192GB at 2,933MHz per node (Xeon Gold)
    • 96GB at 2,400MHz per node (Xeon Phi)
  • Operating System: Red Hat® Enterprise Linux® 7
  • Host channel adapter (HCA) card: Intel Omni-Path Host Fabric Interface
  • Storage
    • 2.68PB Ready Architecture for HPC Lustre Storage
    • 480TB Ready Solutions for HPC NFS Storage
    • 174TB Isilon F800 all-flash NAS storage
UK Digital Research Infrastructure
The UK Digital Research Infrastructure consists of advanced computing systems from academic institutions and UK government agencies, with a wide range of capabilities and capacities. Expertise in porting, developing, and testing software is also available from the research software engineers (RSEs) supporting the systems.

Specific technical details on the systems available:

  • ARCHER | 4920 nodes (118,080 cores), two 2.7 GHz, 12-core Intel Xeon E5-2697 v2 per node. 4544 nodes with 64 GB memory nodes and 376 with 128 GB. Cray Aries interconnect. 4.4 PB high-performance storage.
  • Cirrus | 280 nodes (10080 cores), two 2.1GHz 18 core Intel Xeon E5-2695 per node. 256 GB memory per node; 2 GPU nodes each containing two 2.4 Ghz, 20 core Intel Xeon 6148 processors and four NVIDIA Tesla V100-PCIE-16GB GPU accelerators. Mellanox FDR interconnect. 144 NVIDIA V100 GPUs in 36 Plainfield blades (2 Intel Cascade Lake processors and 4 GPUs per node).
  • DiRAC Data Intensive Service (Cambridge) | 484 nodes (15488 cores), two Intel Xeon 6142 per node, 192 GB or 384 GB memory per node; 11 nodes with 4x Nvidia P100 GPUs and 96 GB memory per node; 342 nodes of Intel Xeon Phi with 96 GB memory per node.
  • DiRAC Data Intensive Service (Leicester) | 408 nodes (14688 cores), two Intel Xeon 6140 per node, 192 GB memory per node; 1x 6 TB RAM server with 144 Intel Xeon 6154 cores; 3x 1.5TB RAM servers with 36 Intel Xeon 6140 cores;  64 nodes (4096 cores) Arm ThunderX2 with 128 GB RAM/node.
  • DiRAC Extreme Scaling Service (Edinburgh) | 1468 nodes (35,232 cores), two Intel Xeon 4116 per node, 96 GB RAM/node. Dual rail Intel OPA interconnect.
  • DiRAC Memory Intensive Service (Durham) | 452 nodes (12,656 cores), two Intel Xeon 5120 per node, 512 GB RAM/node, 440TB flash volume for checkpointing.
  • Isambard | 332 nodes (21,248 cores), two Arm-based Marvell ThunderX2 32 core 2.1 GHz per node. 256 GB memory per node. Cray Aries interconnect. 75 TB high-performance storage.
  • JADE | 22x Nvidia DGX-1V nodes with 8x Nvidia V100 16GB and 2x 20 core Intel Xeon E5-2698 per node.
  • MMM Hub (Thomas) | 700 nodes (17000 cores), 2x 12 core Intel Xeon E5-2650v4 2.1 GHz per node, 128 GB RAM/node.
  • NI-HPC | 60x Dell PowerEdge R6525, two AMD Rome 64-core 7702 per node. 768GB RAM/node; 4x Dell PowerEdge R6525 with 2TB RAM; 8 x Dell DSS8440 (each with 2x Intel Xeon 8168 24-core). Provides 32x Nvidia Tesla V100 32GB.
  • XCK | 96 nodes (6144 cores), one 1.3 GHz, 64-core Intel Xeon Phi 7320 per node + 20 nodes (640 cores), two 2.3 Ghz, 16 core Intel Xeon E5-2698 v3 per node. 16 GB fast memory + 96 GB DDR per Xeon Phi node, 128 GB per Xeon node. Cray Aries interconnect. 9TB of DataWarp storage and 650 TB of high-performance storage.
  • XCS | 6720 nodes (241,920 cores), two Intel Xeon 2.1 GHz, 18-core E5-2695 v4 series per node. All with 128 GB RAM/node. Cray Aries interconnect. 11 PB of high-performance storage.
CSCS – Swiss National Supercomputing Centre
CSCS Piz Daint | 27 PF, 5704 nodes, Cray XC50/NVIDIA Pascal

Intel Xeon E5-2690 v3 (12 cores, 2.6 GHz) and 64 GB RAM per node
NVIDIA Tesla P100 (16 GB) per node
Aries interconnect
Swedish National Infrastructure for Computing (SNIC)

The Swedish National Infrastructure for Computing is a national research infrastructure that makes resources for large scale computation and data storage available, as well as provides advanced user support to make efficient use of the SNIC resources.

Beskow | 2.5 PF, 2060 nodes, Intel Haswell & Broadwell.

Funded by the Swedish National Infrastructure for Computing and operated by the PDC Center for High-Performance Computing at the KTH Royal Institute of Technology in Stockholm, Sweden, Beskow supports capability computing and simulations in the form of wide jobs. Attached to Beskow is a 5 PB Lustre file system from DataDirect Networks. Beskow is also a Tier-1 resource in the PRACE European project.

Beskow is built by Cray, Intel and DataDirect Networks.

Consortium Affiliates provide a range of computing services and expertise that can enhance and accelerate research to fight COVID-19. Matched proposals will have access to resources and help from Consortium Affiliates, provided free of charge, enabling rapid and efficient execution of complex computational research programs.

Atrio | [Affiliate]

Atrio will assist researchers studying COVID-19 in creating and optimizing the performance of application containers (e.g., the CryoEM processing application suite), as well as in the performance-optimized deployment of those containers onto any of the HPC Consortium members' computational platforms, and specifically onto high-performing GPU and CPU resources. Our proposal is twofold: one part is additional computational resources, and another, equally important, is support for COVID-19 researchers with an easy way to access and use HPC Consortium computational resources. That support consists of creating application containers for researchers, optimizing their performance, and an optional multi-site container and cluster management software toolset.

Data Expedition Inc | [Affiliate]

Data Expedition, Inc. (DEI) is offering free licenses of its easy-to-use ExpeDat and CloudDat accelerated data transport software to researchers studying COVID-19. This software transfers files ranging from megabytes to terabytes from storage to storage, across wide area networks, among research institutions, cloud providers, and personal computers at speeds many times faster than traditional software. It is available immediately with an initial 90-day license; requests to extend licenses will be evaluated on a case-by-case basis to facilitate continuing research.

Flatiron | [Affiliate]

The Flatiron Institute is a multi-disciplinary science lab with 50 scientists in computational biology. Flatiron is pleased to offer 3.5M core-hours per month on our modern HPC system and 5M core-hours per month on Gordon, our older HPC system at SDSC.

Fluid Numerics | [Affiliate]

Fluid Numerics' Slurm-GCP deployment leverages Google Compute Engine resources and the Slurm job scheduler to execute high-performance computing (HPC) and high-throughput computing (HTC) workloads. Our system is currently capable of approximately 6 petaflops, but please keep in mind this is a quota-bound figure that can be adjusted if needed. We intend to provide onboarding and remote system administration resources for the fluid-slurm-gcp HPC cluster solution on Google Cloud Platform. We will help researchers leverage GCP for COVID-19 research by assisting with software installation and porting, user training, consulting and coaching, and general GCP administration, including quota requests, identity and access management, and security compliance.

SAS | [Affiliate]

SAS is offering to provide licensed access to the SAS Viya platform and project-based data science resources. SAS-provided resources will be specific to the requirements of the selected COVID-19 project use case. SAS expects a typical project engagement would require 1-2 data science resources, a project manager, a data preparation specialist, and potentially a visualization expert.

Raptor Computing Systems, LLC | [Affiliate]

Our main focus for this membership is developer systems: we offer a wide variety of desktop and workstation systems built on the POWER architecture. These run Linux, support NVIDIA GPUs, and provide an application development environment for targeting the larger supercomputers. This is the main focus of our support effort. We can provide these machines free of charge (up to a reasonable limit) to the COVID-19 effort in order to free up supercomputer and high-end HPC server time that would otherwise be allocated to development and testing of the algorithms and software in use.

The HDF Group | [Affiliate]

The HDF Group helps scientists use open source HDF5 effectively, including offering general usage and performance tuning advice, and helping to troubleshoot any issues that arise. Our engineers will be available to assist you in applying HPC and HDF® technologies together for your COVID-19 research.

Acknowledging Support

Papers, presentations, and other publications featuring work that was supported, at least in part, by the resources, services, and support provided via the COVID-19 HPC Consortium are expected to acknowledge that support. Please include the following acknowledgement:

This work used resources, services, and support provided via the COVID-19 HPC Consortium (https://covid19-hpc-consortium.org/), which is a unique private-public effort to bring together government, industry, and academic leaders who are volunteering free compute time and resources in support of COVID-19 research.

 

(Revised 10 June 2020)

Key Points

  • Computing resources utilized in research against COVID-19
  • National scientists encouraged to use computing resources
  • How and where to find computing resources