
COVID-19 Response: To our valued stakeholders and XSEDE collaborators,
By now you have received a flurry of communication surrounding the ongoing COVID-19 pandemic and how various organizations are responding, and XSEDE is no exception. As XSEDE staff have transitioned out of their usual offices and into telecommuting arrangements with their home institutions, they have worked both to support research around the pandemic and to ensure we operate without interruption.


SDSC Webinar: Running Jupyter Notebooks on Comet / April 16
Jupyter Notebooks are popular interactive web tools that can be launched to access local and remote file systems. This webinar, led by SDSC Computational Data Scientist/HPC Trainer Mary Thomas, covers SDSC's multi-tiered approach to running notebooks on Comet and how to run the different modes, including default HTTP connections or those using JupyterHub.


XSEDE Webinar: Running Jupyter Notebooks on Comet / May 21 
Jupyter Notebooks are interactive web tools known as computational notebooks, in which researchers can combine software, text, multimedia resources, and computational output. Registration closes 05/20/2020 at 23:59 PDT.


COVID-19 HPC Consortium

HPC Resources available to fight COVID-19

The COVID-19 HPC Consortium encompasses computing capabilities from some of the most powerful and advanced computers in the world. We hope to empower researchers around the world to accelerate understanding of the COVID-19 virus and the development of treatments and vaccines to help address infections. Consortium members manage a range of computing capabilities that span from small clusters to some of the very largest supercomputers in the world.

Preparing your COVID-19 HPC Consortium Request

To request access to resources of the COVID-19 HPC Consortium, you must prepare a description, no longer than three pages, of your proposed work. To ensure your request is directed to the appropriate resource(s), your description should include the following sections. Do not include any proprietary information in proposals, since your request will be reviewed by staff from a number of consortium sites. It is expected that teams who receive Consortium access will publish their results in the open scientific literature. All supported projects will have the name of the principal investigator, project title and project abstract posted to the COVID-19 HPC Consortium web site.

The proposals will be evaluated on the following criteria:

  • Potential benefits for COVID-19 response
  • Feasibility of the technical approach
  • Need for high-performance computing
  • High-performance computing knowledge and experience of the proposing team
  • Estimated computing resource requirements 

A. Scientific/Technical Goal

Describe how your proposed work contributes to our understanding of COVID-19 and/or improves the nation's ability to respond to the pandemic.

  • What is the scientific/technical goal?
  • What is the plan and timetable for getting to the goal?
  • What is the expected period of performance (one week to three months)?
  • Where do you plan to publish your results and in what timeline? 

B. Estimate of Compute, Storage and Other Resources

To the extent possible, provide an estimate of the scale and type of the resources needed to complete the work. The links in the Resources section are available to help you answer this question.

  • Are there specific computing architectures or systems that are most appropriate (e.g., GPUs, large memory, large core counts on shared-memory nodes)?
  • Approximately how much computing will this effort require, in terms of core, node, or GPU hours?
  • How distributed can the computation be, and can it be split across multiple HPC systems?
  • Can this workload execute in a cloud environment? 
  • Describe the storage needs of the project.
  • Does your project require access to any public datasets? If so, please describe these datasets and how you intend to use them.
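To illustrate the kind of estimate requested above, the arithmetic can be sketched in a few lines of Python; all workload numbers below are hypothetical placeholders, not guidance from the Consortium:

```python
# Back-of-envelope resource estimate for a Consortium request.
# Every number here is a hypothetical placeholder.

runs = 200                 # e.g., docking runs or simulation replicas
node_hours_per_run = 6.0   # ideally measured on a comparable system
gpus_per_node = 4          # depends on the target architecture

total_node_hours = runs * node_hours_per_run
total_gpu_hours = total_node_hours * gpus_per_node

print(f"Estimated node-hours: {total_node_hours:,.0f}")
print(f"Estimated GPU-hours:  {total_gpu_hours:,.0f}")
```

Reviewers mainly need the order of magnitude and the unit (core, node, or GPU hours), so even a rough calculation like this is more useful than no estimate at all.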

C. Support Needs

Describe whether collaboration or support from staff at the National labs, Commercial Cloud providers, or other HPC facilities will be essential, helpful, or unnecessary. Estimates of necessary application support are very helpful. Teams should also identify any restrictions that might apply to the project, such as export-controlled code, ITAR restrictions, proprietary data sets, regional location of compute resources, or HIPAA restrictions. 

D. Team and Team Preparedness

Summarize your team's qualifications and readiness to execute the project.

  • What is the expected lead time before you can begin the simulation runs?
  • What systems have you recently used and how big were the simulation runs?
  • Given that some resources are at federal facilities with restrictions, please provide a list of team members that will require accounts on resources along with their citizenship. 

Document Formatting

While readability is of greatest importance, documents must satisfy the following minimum requirements. Documents that conform to NSF proposal format guidelines will satisfy these guidelines.

  • Margins: Documents must have 2.5-cm (1-inch) margins at the top, bottom, and sides.
  • Fonts and Spacing: The type size used throughout the documents must conform to the following three requirements:
  • Use one of the following typefaces identified below:
    • Arial 11, Courier New, or Palatino Linotype at a font size of 10 points or larger;
    • Times New Roman at a font size of 11 points or larger; or
    • Computer Modern family of fonts at a font size of 11 points or larger.
  • A font size of less than 10 points may be used for mathematical formulas or equations, figures, table or diagram captions and when using a Symbol font to insert Greek letters or special characters. PIs are cautioned, however, that the text must still be readable.
  • Type density must be no more than 15 characters per 2.5 cm (1 inch).
  • No more than 6 lines must be within a vertical space of 2.5 cm (1 inch).
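These density limits can be sanity-checked mechanically; a minimal Python sketch, assuming you can measure your document's characters per inch and lines per inch, might look like:

```python
# Check a document's type density against the stated limits:
# no more than 15 characters per inch, no more than 6 lines per inch.

MAX_CHARS_PER_INCH = 15.0
MAX_LINES_PER_INCH = 6.0

def meets_density_limits(chars_per_inch: float, lines_per_inch: float) -> bool:
    """Return True if the measured densities satisfy both limits."""
    return (chars_per_inch <= MAX_CHARS_PER_INCH
            and lines_per_inch <= MAX_LINES_PER_INCH)

print(meets_density_limits(12.0, 6.0))   # within both limits
print(meets_density_limits(16.0, 6.0))   # too dense horizontally
```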

  • Page Numbering: Page numbers should be included in each file by the submitter. Page numbering is not provided by XRAS.
  • File Format: XRAS accepts only PDF file formats.

Submitting your COVID-19 HPC Consortium request 

  1. Create an XSEDE portal account
    • Go to https://portal.xsede.org/
    • Click on "Sign In" at the upper right, if you have an XSEDE account … 
    • … or click "Create Account" to create one. 
    • To create an account, basic information will be required (name, organization, degree, address, phone, email). 
    • Email verification will be necessary to complete the account creation.
    • Set your username and password.
    • After your account is created, be sure you're logged into https://portal.xsede.org/
    • IMPORTANT: Each individual should have their own XSEDE account; it is against policy to share user accounts.
  2. Go to the allocation request form
    • Follow this link to go directly to the submission form.
    • Or to navigate to the request form:
      • Click the "Allocations" tab in the XSEDE User Portal,
      • Then select "Submit/Review Request."
      • Select the "COVID-19 HPC Consortium" opportunity.
    • Select "Start a New Submission."
  3. Complete your submission
    • Provide the data required by the form. Fields marked with a red asterisk are required to complete a submission.
    • The most critical screens are the Personnel, Title/Abstract, and Resources screens.
      • On the Personnel screen, one person must be designated as the Principal Investigator (PI) for the request. Other individuals can be added as co-PIs or Users (but they must have XSEDE accounts).
      • On the Title/Abstract screen, all fields are required.
      • On the Resources screen…
        • Enter "n/a" in the "Disclose Access to Other Compute Resources" field (to allow the form to be submitted).
        • Then, select "COVID-19 HPC Consortium" and enter 1 in the Amount Requested field. 
    • On the Documents screen, select "Add Document" to upload your 3-page document. Select "Main Document" or "Other" as the document Type.
      • Only PDF files can be accepted.
    • You can ignore the Grants and Publications sections. However, you are welcome to enter any supporting agency awards, if applicable.
    • On the Submit screen, select "Submit Request." If necessary, correct any errors and submit the request again.

Resources available for COVID-19 HPC Consortium request 

U.S. Department of Energy (DOE) Advanced Scientific Computing Research (ASCR)
Supercomputing facilities at DOE offer some of the most powerful resources for scientific computing in the world. The Argonne Leadership Computing Facility (ALCF), the Oak Ridge Leadership Computing Facility (OLCF), and the National Energy Research Scientific Computing Center (NERSC) may be used for modeling and simulation coupled with machine and deep learning techniques to study a range of areas, including examining underlying protein structure, classifying the evolution of the virus, understanding mutation, uncovering important differences and similarities with the 2002-2003 SARS virus, searching for potential vaccine and antiviral compounds, and simulating the spread of COVID-19 and the effectiveness of countermeasure options.

 

Oak Ridge Summit | 200 PF, 4608 nodes, IBM POWER9/NVIDIA Volta

Summit System

2 x IBM POWER9 CPUs per node
42 TF per node
6 x NVIDIA Volta GPUs per node
512 GB DDR4 + 96 GB HBM2 (GPU memory) per node
1600 GB NVMe local storage per node
2 x Mellanox EDR IB adapters (100 Gbps per adapter)
250 PB, 2.5 TB/s, IBM Spectrum Scale storage

 

Argonne Theta | 11.69 PF, 4292 nodes, Intel Knights Landing

1 x Intel KNL 7230 per node, 64 cores per CPU
192 GB DDR4, 16 GB MCDRAM memory per node
128 GB local storage per node
Aries dragonfly network
10 PB Lustre + 1 PB IBM Spectrum Scale storage
Full details available at: https://www.alcf.anl.gov/alcf-resources

 

National Energy Research Scientific Computing Center (NERSC)

NERSC Cori | 32 PF, 12,056 Intel Xeon Phi and Xeon nodes
9,668 nodes, each with one 68-core Intel Xeon Phi Processor 7250 (KNL)
96 GB DDR4 and 16 GB MCDRAM memory per KNL node
2,388 nodes, each with two 16-core Intel Xeon Processor E5-2698 v3 (Haswell)
128 GB DDR4 memory per Haswell node
Cray Aries dragonfly high speed network
30 PB Lustre file system and 1.8 PB Cray DataWarp flash storage
Full details at: https://www.nersc.gov/systems/cori/

U.S. DOE National Nuclear Security Administration (NNSA)

Established by Congress in 2000, NNSA is a semi-autonomous agency within the U.S. Department of Energy responsible for enhancing national security through the military application of nuclear science. NNSA resources at Lawrence Livermore National Laboratory (LLNL), Los Alamos National Laboratory (LANL), and Sandia National Laboratories (SNL) are being made available to the COVID-19 HPC Consortium.

Lawrence Livermore + Los Alamos + Sandia | 32.2 PF, 7375 nodes, IBM POWER8/9, Intel Xeon
  • LLNL Lassen
    • 23 PFLOPS, 788 compute nodes, IBM Power9/NVIDIA Volta GV100
    • 28 TF per node
    • 2 x IBM POWER9 CPUs (44 cores) per node
    • 4 x NVIDIA Volta GPUs per node
    • 256 GB DDR4 + 64 GB HBM2 (GPU memory) per node
    • 1600 GB NVMe local storage per node
    • 2 x Mellanox EDR IB (100Gb/s per adapter)
    • 24 PB storage
  • LLNL Quartz
    • 3.2 PF, 3004 compute nodes, Intel Broadwell
    • 1.2 TF per node
    • 2 x Intel Xeon E5-2695 CPUs (36 cores) per node
    • 128 GB memory per node
    • 1 x Intel Omni-Path IB (100Gb/s)
    • 30 PB storage (shared with other clusters)
  • LLNL Pascal
    • 0.9 PF, 163 compute nodes, Intel Broadwell CPUs/NVIDIA Pascal P100
    • 11.6 TF per node
    • 2 x Intel Xeon E5-2695 CPUs (36 cores) per node
    • 2 x NVIDIA Pascal P100 GPUs per node
    • 256 GB memory + 32 GB HBM2 (GPU memory) per node
    • 1 x Mellanox EDR IB (100Gb/s)
    • 30 PB storage (shared with other clusters) 
  • LLNL Ray
    • 1.0 PF, 54 compute nodes, IBM Power8/NVIDIA Pascal P100
    • 19 TF per node
    • 2 x IBM Power8 CPUs (20 cores) per node
    • 4 x NVIDIA Pascal P100 GPUs per node
    • 256 GB + 64 GB HBM2 (GPU memory) per node
    • 1600 GB NVMe local storage per node
    • 2 x Mellanox EDR IB (100Gb/s per adapter)
    • 1.5 PB storage
  • LLNL Surface
    • 506 TF, 158 compute nodes, Intel Sandy Bridge/NVIDIA Kepler K40m
    • 3.2 TF per node
    • 2 x Intel Xeon E5-2670 CPUs (16 cores) per node
    • 3 x NVIDIA Kepler K40m GPUs per node
    • 256 GB memory + 36 GB GDDR5 (GPU memory) per node
    • 1 x Mellanox FDR IB (56Gb/s)
    • 30 PB storage (shared with other clusters)
  • LLNL Syrah
    • 108 TF, 316 compute nodes, Intel Sandy Bridge
    • 0.3 TF per node
    • 2 x Intel Xeon E5-2670 CPUs (16 cores) per node
    • 64 GB memory per node
    • 1 x QLogic IB (40Gb/s)
    • 30 PB storage (shared with other clusters)
  • LANL Snow
    • 445 TF, 368 compute nodes, Intel Broadwell
    • 1.2 TF per node
    • 2 x Intel Xeon E5-2695 CPUs (36 cores) per node
    • 128 GB memory per node
    • 1 x Intel Omni-Path IB (100Gb/s)
    • 15.2 PB storage
  • LANL Badger
    • 790 TF, 660 compute nodes, Intel Broadwell
    • 1.2 TF per node
    • 2 x Intel Xeon E5-2695 CPUs (36 cores) per node
    • 128 GB memory per node
    • 1 x Intel Omni-Path IB (100Gb/s)
    • 15.2 PB storage
Rensselaer Polytechnic Institute
The Rensselaer Polytechnic Institute (RPI) Center for Computational Innovations is solving problems for next-generation research through the use of massively parallel computation and data analytics. The center supports researchers, faculty, and students across a diverse spectrum of disciplines. RPI is making its Artificial Intelligence Multiprocessing Optimized System (AiMOS) available to the COVID-19 HPC Consortium. AiMOS is an 8-petaflop IBM POWER9/Volta supercomputer configured to enable users to explore new AI applications.

 

RPI AiMOS | 11.1 PF, 252 nodes, IBM POWER9/NVIDIA Volta

2 x IBM POWER9 CPU per node, 20 cores per CPU
6 x NVIDIA Tesla GV100 per node
32 GB HBM per GPU
512 GB DRAM per node
1.6 TB NVMe per node
Mellanox EDR InfiniBand
11 PB IBM Spectrum Scale storage

MIT/Massachusetts Green HPC Center (MGHPCC)
MIT is contributing two HPC systems to the COVID-19 HPC Consortium. The MIT Supercloud, a 7-petaflop Intel x86/NVIDIA Volta HPC cluster, is designed to support research projects that require significant compute, memory, or big-data resources. Satori is a 2-petaflop scalable AI-oriented hardware resource for research computing at MIT, composed of 64 IBM POWER9/Volta nodes. The MIT resources are installed at the Massachusetts Green HPC Center (MGHPCC), which operates as a joint venture between Boston University, Harvard University, MIT, Northeastern University, and the University of Massachusetts.

 

MIT/MGHPCC Supercloud | 6.9 PF, 440 nodes Intel Xeon/Volta

2 x Intel Xeon (18 CPU cores per node)
2 x NVIDIA V100 GPUs per node
32 GB HBM per GPU
Mellanox EDR InfiniBand
3 PB scratch storage

MIT/MGHPCC Satori | 2.0 PF, 64 nodes IBM POWER9/NVIDIA Volta

2 x POWER9 CPUs, 40 cores per node
4 x NVIDIA Volta GPUs per node (256 total)
32 GB HBM per GPU
1.6 TB NVMe per node
Mellanox EDR InfiniBand
2 PB scratch storage

IBM Research WSC

The IBM Research WSC cluster consists of 56 compute nodes, each with two 22-core CPUs and 6 GPUs, plus seven additional nodes dedicated to management functions. The cluster is intended for client collaboration, advanced research for government-funded projects, advanced research on Converged Cognitive Systems, and advanced research on Deep Learning.

IBM Research WSC | 2.8 PF, 54 nodes IBM POWER9/NVIDIA Volta

  • 54 IBM POWER9 nodes
  • 2 x POWER9 CPUs per node, 22 cores per CPU
  • 6 x NVIDIA V100 GPUs per node (336 total)
  • 512 GB DRAM per node
  • 1.4 TB NVMe per node
  • 2 x Mellanox EDR InfiniBand adapters per node
  • 2 PB IBM Spectrum Scale distributed storage
  • RHEL 7.6
  • CUDA 10.1
  • IBM PowerAI 1.6

U.S. National Science Foundation (NSF)

The NSF Office of Advanced Cyberinfrastructure supports and coordinates the development, acquisition, and provision of state-of-the-art cyberinfrastructure resources, tools and services essential to the advancement and transformation of science and engineering. By fostering a vibrant ecosystem of technologies and a skilled workforce of developers, researchers, staff and users, OAC serves the growing community of scientists and engineers, across all disciplines. The most capable resources supported by NSF OAC are being made available to support the COVID-19 HPC Consortium.

Frontera | 38.7 PF, 8114 nodes, Intel Xeon, NVIDIA RTX GPU

Funded by the National Science Foundation and operated by the Texas Advanced Computing Center (TACC), Frontera provides a balanced set of capabilities that supports capability and capacity simulation, data-intensive science, visualization, and data analysis, as well as emerging applications in AI and deep learning. Frontera has two computing subsystems: a primary system focused on double-precision performance and a second subsystem focused on single-precision, streaming-memory computing. Frontera was built by Dell, Intel, DataDirect Networks, Mellanox, NVIDIA, and GRC.

Comet | 2.75 PF, total 2020 nodes, Intel Xeon, NVIDIA Pascal GPU

Operated by the San Diego Supercomputer Center (SDSC), Comet is a nearly 3-petaflop cluster designed by Dell and SDSC. It features Intel next-generation processors with AVX2, Mellanox FDR InfiniBand interconnects, and Aeon storage. 

Stampede2 | 19.3 PF, 4200 Intel KNL, 1,736 Intel Xeon

Operated by TACC, Stampede2 is a nearly 20-petaflop national HPC resource accessible to thousands of researchers across the country, enabling new computational and data-driven discoveries and advances in science, engineering, research, and education.

Longhorn | 2.8 PF, 112 nodes, IBM POWER9, NVIDIA Volta

Longhorn is a TACC resource built in partnership with IBM to support GPU-accelerated workloads. The power of this system is in its multiple GPUs per node, and it is intended to support sophisticated workloads that require high GPU density and little CPU compute. Longhorn will support double-precision machine learning and deep learning workloads that can be accelerated by GPU-powered frameworks, as well as general purpose GPU calculations.

Bridges | 2 PF, 874 nodes, Intel Xeon, NVIDIA K80, Volta GPUs, DGX-2

Operated by the Pittsburgh Supercomputing Center (PSC), Bridges and Bridges-AI provide an innovative HPC and data-analytics system, integrating advanced memory technologies to empower new modalities of artificial intelligence-based computation, bring desktop convenience to HPC, connect to campuses, and support data-intensive scientific and engineering workflows.

Jetstream | 320 nodes, Cloud accessible

Operated by a team led by the Indiana University Pervasive Technology Institute, Jetstream is a configurable large-scale computing resource that leverages both on-demand and persistent virtual machine technology to support a wide array of software environments and services, combining elements of commercial cloud computing with some of the best software available for solving important scientific problems.

Open Science Grid | Distributed High Throughput Computing, 10,000+ nodes, Intel x86-compatible CPUs, various NVIDIA GPUs

The Open Science Grid (OSG) is a large virtual cluster of distributed high-throughput computing (dHTC) capacity shared by numerous national labs, universities, and non-profits, with the ability to seamlessly integrate cloud resources, too. The OSG Connect service makes this large distributed system available to researchers, who can individually use up to tens of thousands of CPU cores and up to hundreds of GPUs, along with significant support from the OSG team. Ideal work includes parameter optimization/sweeps, molecular docking, image processing, many bioinformatics tasks, and other work that can run as numerous independent tasks each needing 1-8 CPU cores, <8 GB RAM, and <10GB input or output data, though these can be exceeded significantly by integrating cloud resources and other clusters, including many of those contributing to the COVID-19 HPC Consortium.
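The dHTC model described above amounts to slicing work into many small, independent tasks. A minimal Python sketch of generating such a task list for a hypothetical parameter sweep, sized to the per-task limits mentioned above, could be:

```python
# Split a hypothetical parameter sweep into independent tasks sized for
# distributed high-throughput computing: each task requests only a few
# cores and modest memory. Parameter names and values are illustrative.

from itertools import product

temperatures = [290, 300, 310]   # hypothetical sweep parameter
ligand_ids = range(1, 6)         # hypothetical ligand identifiers

tasks = [
    {"temp": t, "ligand": lig, "cores": 1, "mem_gb": 4}  # fits 1-8 cores, <8 GB
    for t, lig in product(temperatures, ligand_ids)
]

print(len(tasks))   # 3 temperatures x 5 ligands = 15 independent tasks
print(tasks[0])
```

Work that decomposes this cleanly, with each task independent of the others, is exactly what the OSG's shared capacity handles well.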

Cheyenne | 5.34 PF, 4032 nodes, Intel Xeon

Operated by the National Center for Atmospheric Research (NCAR), Cheyenne is a critical tool for researchers across the country studying climate change, severe weather, geomagnetic storms, seismic activity, air quality, wildfires, and other important geoscience topics. The Cheyenne environment also encompasses tens of petabytes of storage capacity and an analysis cluster to support efficient workflows. Built by SGI (now HPE), Cheyenne is funded by the Geosciences directorate of the National Science Foundation.

Blue Waters | 13.34 PF, 26,864 nodes, AMD Interlagos, NVIDIA Kepler K20X GPU

The Blue Waters sustained-petascale computing project is supported by the National Science Foundation, the State of Illinois, the University of Illinois, and the National Geospatial-Intelligence Agency. Blue Waters is a joint effort of the University of Illinois at Urbana-Champaign and its National Center for Supercomputing Applications, and is provided by Cray. Blue Waters is a well-balanced architecture comprising 22,636 XE6 nodes, each with two x86-compatible 16-core AMD Interlagos CPUs, and 4,228 XK7 nodes, each with one NVIDIA Kepler K20X GPU and one 16-core AMD Interlagos CPU. The system is integrated with a single high-speed Gemini 24x24x24 torus with an aggregate bandwidth of 265+ TB/s to simultaneously support very large-scale parallel applications and high-throughput, many-job workloads. Blue Waters has a 36 PB (26 PB usable) shared Lustre file system that supports 1.1 TB/s of I/O bandwidth, and is connected to wide-area networks at a total of 450 Gbps. The rich system software includes multiple compilers, communication libraries, visualization tools, Docker containers, Python, and machine learning and data management tools that support capability and capacity simulation, data-intensive science, visualization, data analysis, and machine learning/AI. All projects are provided with expert points of contact and advanced application support.

 

NASA High-End Computing Capability

NASA's High-End Computing Capability (HECC) Portfolio provides world-class high-end computing, storage, and associated services to enable NASA-sponsored scientists and engineers supporting NASA programs to broadly and productively employ large-scale modeling, simulation, and analysis to achieve successful mission outcomes.

NASA's Ames Research Center in Silicon Valley hosts the agency's most powerful supercomputing facilities. To help meet the COVID-19 challenge facing the nation and the world, HECC is offering access to NASA's high-performance computing (HPC) resources for researchers requiring HPC to support their efforts to combat this virus. 

 

NASA Supercomputing Systems | 19.39 PF, 17609 nodes Intel Xeon

AITKEN | 3.69 PF, 1,152 nodes, Intel Xeon
ELECTRA | 8.32 PF, 3,456 nodes, Intel Xeon
PLEIADES | 7.09 PF, 11,207 nodes, Intel Xeon, NVIDIA K40, Volta GPUs
ENDEAVOR | 32 TF, 2 nodes, Intel Xeon
MEROPE | 253 TF, 1792 nodes, Intel Xeon

Amazon Web Services
As part of the COVID-19 HPC Consortium, AWS is offering research institutions and companies technical support and promotional credits for the use of AWS services to advance research on diagnosis, treatment, and vaccine studies to accelerate our collective understanding of the novel coronavirus (COVID-19). Researchers and scientists working on time-critical projects can use AWS to instantly access virtually unlimited infrastructure capacity, and the latest technologies in compute, storage and networking to accelerate time to results. Learn more here.
Microsoft Azure High Performance Computing (HPC)

Microsoft Azure offers purpose-built compute and storage specifically designed to handle the most demanding computationally and data intensive scientific workflows. Azure is optimized for applications such as genomics, precision medicine and clinical trials in life sciences.  

Our team of HPC experts and AI for Health data science experts, whose mission is to improve the health of people and communities worldwide, are available to collaborate with COVID-19 researchers as they tackle this critical challenge. More broadly, Microsoft's research scientists across the world, spanning computer science, biology, medicine, and public health, will be available to provide advice and collaborate per mutual interest.

Azure HPC helps improve the efficiency of the drug development process, with the power and scale needed for computationally intensive stochastic modeling and simulation workloads, such as population pharmacokinetic and pharmacokinetic-pharmacodynamic modeling.

Microsoft will provide access to its Azure cloud and HPC capabilities.

HPC-optimized and AI-optimized virtual machines (VMs)

  • Memory-bandwidth-intensive CPUs: Azure HBv2 Instances (AMD EPYC™ 7002-series | 4 GB RAM per core | 200 Gb/s HDR InfiniBand)
  • Compute-intensive CPUs: Azure HC Instances (Intel Xeon Platinum 8168 | 8 GB RAM per core | 100 Gb/s EDR InfiniBand)
  • GPU-intensive, RDMA-connected: Azure NDv2 Instances (8 x NVIDIA V100 Tensor Core GPUs with NVLink | 32 GB RAM each | 40 non-hyperthreaded Intel Xeon Platinum 8168 cores | 100 Gb/s EDR InfiniBand)
  • See the full list of HPC-optimized VMs (H-Series, NC-Series, and ND-Series)
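As a rough illustration of matching a workload to the instance types above, a hypothetical Python lookup (the profile labels are ours, not Azure terminology) might be:

```python
# Illustrative mapping from workload profile to the Azure VM families
# listed above; the profile labels are hypothetical, not Azure terms.

VM_FOR_WORKLOAD = {
    "memory-bandwidth-bound": "HBv2",  # AMD EPYC 7002, 200 Gb/s HDR InfiniBand
    "compute-bound": "HC",             # Intel Xeon Platinum 8168, 100 Gb/s EDR
    "gpu-accelerated": "NDv2",         # 8 x NVIDIA V100 with NVLink
}

def suggest_vm(profile: str) -> str:
    """Return the VM family suggested for a workload profile."""
    return VM_FOR_WORKLOAD.get(profile, "see H-, NC-, and ND-Series list")

print(suggest_vm("gpu-accelerated"))
```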

 

Storage options:

  • Azure HPC Cache | Azure NetApp Files | Azure Blob Storage | Cray ClusterStor

Management:

  • Azure CycleCloud

Batch scheduler

 

Azure HPC life sciences: https://azure.microsoft.com/en-us/solutions/high-performance-computing/health-and-life-sciences/#features

Azure HPC web site: https://azure.microsoft.com/en-us/
AI for Health web site: https://www.microsoft.com/en-us/ai/ai-for-health

Hewlett Packard Enterprise
As part of this new effort to attack the novel coronavirus (COVID-19) pandemic, Hewlett Packard Enterprise is committing to providing supercomputing software and applications expertise free of charge to help researchers port, run, and optimize essential applications. Our HPE Artificial Intelligence (AI) experts are collaborating to support the COVID-19 Open Research Dataset and several other COVID-19 initiatives for which AI can drive critical breakthroughs. They will develop AI tools to mine data across thousands of scholarly articles related to COVID-19 and related coronaviruses to help the medical community develop answers to high-priority scientific questions. We encourage researchers to submit any COVID-19 related proposals to the consortium's online portal. More information can be found here: www.hpe.com/us/en/about/covid19/hpc-consortium.html.
Google

Transform research data into valuable insights and conduct large-scale analyses with the power of Google Cloud. As part of the COVID-19 HPC Consortium, Google is providing access to Google Cloud HPC resources for academic researchers.

 


(Revised 3 April 2020)

Key Points
Computing Resources utilized in research against COVID-19
National scientists encouraged to use computing resources
How and where to find computing resources
Contact Information

BOINC@TACC / March 20
The BOINC@TACC project is based on the volunteer computing model. It helps researchers run applications from a wide range of scientific domains on laptops, desktops, tablets, or cloud-based virtual machines owned by volunteers. With BOINC@TACC, students and researchers can run small high-throughput computing jobs without spending their active project allocations. Participants may attend in person at TACC or remotely by webcast.


 
April 2020 | Science Highlights, Announcements & Upcoming Events
 
XSEDE helps the nation's most creative minds discover breakthroughs and solutions for some of the world's greatest scientific challenges. Through free, customized access to the National Science Foundation's advanced digital resources, consulting, training, and mentorship opportunities, XSEDE enables you to Discover More. Get started here.
 
Science Highlights
 
Supercomputers Unlock Reproductive Mysteries of Viruses and Life
 
XSEDE-allocated resources complete simulations pertinent to coronavirus, DNA replication
 
 
Viruses such as the novel coronavirus rely on the host cell membrane to drastically bend and eventually let loose the replicated viruses trapped inside the cell. Scientists have used supercomputer simulations to help propose a mechanism for this budding off of viruses. A related study also used simulations to find a mechanism for how the DNA of all life adds a base to its growing strand during replication. This fundamental research could help lead to new strategies and better technology to combat infectious and genetic diseases.
 
 
Supercomputer simulations led scientists to a mechanism for the budding off of viruses such as the coronavirus. [Credit: Mandal et al.]
 
Freedom, They Printed
 
AI on XSEDE-allocated system solves mystery of who printed seminal works on liberty
 
 
In the 17th century, you could get jailed or even executed for criticizing the government of England. But a flood of books on civil liberties, produced at great risk by anonymous printers, helped change that. An artificial intelligence (AI) analysis of irregular letters using PSC's Bridges supercomputer, with help from XSEDE's Extended Collaborative Support Services (ECSS) team, has helped a Carnegie Mellon team solve the mystery of who printed nine of these seminal works.
 
 
Movable metal type for a printing press.
 
XSEDE Resources Help Benchmark Cancer Immunotherapy Tool
 
New study advances research on individualized patient treatment
 
 
With the American Cancer Society estimating 1.76 million new cases and more than 600,000 deaths during 2019 in the U.S. alone, cancer remains a critical healthcare challenge. In efforts to help mitigate these numbers, researchers at Rice University are using XSEDE resources to evaluate their new molecular docking tool, called Docking INCrementally or DINC, which aims to improve immunotherapy outcomes by identifying more effective personalized treatments. This molecular docking approach can make predictions of molecular interactions that other docking tools would miss, which has strong implications in cases where these predictions are notoriously difficult to make, especially in the context of immunotherapy, which leverages the immune system to combat cancer.
 
 
This image depicts a Human Leucocyte Antigen (HLA) receptor (in grey) that displays a small protein fragment or peptide (in red) at the surface of a cell. If this peptide is recognized as "suspicious" by the immune system, the cell will be destroyed. [Credit: D. Devaurs]
 
A Galactic Choice
 
AI running on XSEDE systems surpasses humans at classifying galaxies
 
 
New telescope surveys are discovering hundreds of millions of new galaxies—far more than humans can classify. A National Center for Supercomputing Applications (NCSA)-led team has employed deep learning artificial intelligence (AI) on XSEDE-allocated systems to produce a galaxy-classifying artificial intelligence with better-than-human accuracy and capacity.
 
 
Images from the Dark Energy Survey that the AI identified as spiral.
 
Program Announcements
 
XSEDE in the Time of COVID-19
 
Throughout these uncertain times, XSEDE is continuing to operate as normal. XSEDE leadership has compiled status updates from all XSEDE subaward partner institutions and allocated resources/services. This information, as well as other announcements related to impacts on XSEDE services resulting from the COVID-19 outbreak, is being gathered and reported at https://confluence.xsede.org/display/XT/COVID-19+Information.
 
XSEDE Joins HPC Consortium for COVID-19 Response
 
 
XSEDE and the XSEDE Resource Allocations System (XRAS) are proud to be contributing to the COVID-19 HPC Consortium from The White House's Office of Science and Technology Policy, a private/government/academic partnership that seeks to expedite applications for advanced computing research to combat the COVID-19 pandemic.
 
Researchers who are interested in conducting this timely work are asked to submit research proposals to the COVID-19 Online Portal here, which is handled by the XRAS team.
 
 
Distance Learning Resources Offered by XSEDE
 
For helpful learning materials and assistance in bringing courses online, please check out XSEDE's distance learning resources.
 
Online training materials available on the XSEDE User Portal
 
XSEDE Training YouTube Channel
  • If you are not able to attend the HPC Monthly Workshop series in person, the lectures, organized by workshop, are available on the XSEDE Training YouTube channel.
 
Consulting Support
  • XSEDE Education staff can provide consulting for transitioning courses to remote and online delivery. Email kcahill@osc.edu if you have questions about resources or would like to schedule a consultation.
 
The HPC University
  • The HPC University (HPCU) is a virtual organization that provides a cohesive, persistent, and sustainable online environment to share educational and training materials for a continuum of high performance computing environments.
  • Resources 
  • Workshop/Training Materials
 
Computing4Change Now Accepting Applications
Do you know an undergraduate student who seeks not only to enhance their skillset, but also to create positive change in the community?
 
Computing4Change is a competition for students from diverse disciplines and backgrounds who want to work collaboratively to:
  • Learn to apply data analysis and computational thinking to a social challenge
  • Experience the latest tools and techniques for exploring data through visualization
  • Expand skills in team-based problem solving
  • Learn how to communicate ideas more effectively to the general public
 
The Computing4Change competition will be held at SC20 in Atlanta, GA, Nov. 14-20, 2020. Applications will be accepted through May 18, 2020.
 
 
Community Announcements
 
Priority Help to COVID-19 Projects Offered by Trusted CI, NSF CI CoE Pilot, and SGCI
 
The NSF Cyberinfrastructure Center of Excellence Pilot, Trusted CI, and the Science Gateways Community Institute are all available to help the science community tackle research to address the coronavirus disease 2019 (COVID-19) outbreak. If your project could benefit from expert cyberinfrastructure consulting in:
  • data management and visualization
  • workflow management
  • use of cloud resources, high-performance clusters, or distributed resources
  • science gateway technology
  • cybersecurity
  • compliance
please contact covid19@trustedci.org for priority assistance.
 
 
Gateways 2020 Call for Participation
 
 
Gateways 2020 (October 19–21, Bethesda, Maryland) is now accepting submissions of short papers, demos, panels, tutorials, and workshops on the topic of gateways for science, engineering, or other disciplines. The deadline for Tutorials and Workshops submissions has been extended until April 28. Short Papers, Demos, and Panels are due May 11, 2020. The poster deadline (open to all attendees) is September 11.
 
 
GlobusWorld 2020 Goes Virtual
 
 
As with most other events in April, GlobusWorld 2020 will now be held as a virtual conference on April 29. The agenda focuses on product and user updates, celebrating 10 years of delivering data management services to researchers. Globus will also hold a Customer Forum in an online format, and provide ample opportunity for subscribers to engage with Globus leadership and the product team. Register for the Customer Forum here.
 
 
Upcoming Dates and Deadlines
 

 


Advanced Computing for Social Change Institute

Providing transformative student experiences through the application of XSEDE resources and services.


Learning through ACSCI

The Advanced Computing for Social Change Institute offers unique opportunities, co-located with professional conferences, for undergraduate students who want to enhance their skillset and create positive change in their community.

The programs recruit students from diverse disciplines and backgrounds who want to work collaboratively to:

  • Learn to apply data analysis and computational thinking to a social challenge
  • Experience the latest tools and techniques for exploring data through visualization
  • Expand skills in team-based problem solving
  • Learn how to communicate ideas more effectively to the general public

Eligibility:

  • Be currently enrolled as a full time undergraduate student at an accredited college/university
  • Be a U.S. citizen or permanent resident of the United States (for ACSC only)
  • Not plan to graduate the semester before or two months after the program
  • Have a minimum overall GPA of at least 2.5/4.0 (or equivalent)
  • Be able to attend a full challenge or competition during program dates
  • Complete the online application form before the deadline

Students from any undergraduate background are eligible, although some preference will be given to women, minorities, students from majors outside computer science, and students at the sophomore or junior level.

Students will be assigned to teams to ensure a balance of backgrounds, and an advisor will be assigned to each team. The costs of airfare, lodging, meals, and conference registration will be provided.

Application Details

The next Advanced Computing for Social Change (ACSC) event will be in Portland, OR, co-located with the PEARC20 conference, July 26-30, 2020. See current guidance regarding PEARC20 and COVID-19.

APPLICATION DEADLINE: The application period for the PEARC20 event runs through May 15, 2020. Notification of acceptance will be sent in June 2020.

The upcoming Computing4Change (C4C) event will be held in Atlanta, GA, co-located with the SC20 conference, Nov. 14-20, 2020.

APPLICATION DEADLINE: The application deadline for the SC20 event is May 18, 2020. Notification of acceptance to be sent in June 2020.

Visit the ACSC FAQ for details.

Key Points
Developing a Diverse Workforce
Infusing Computational Science
Expanding Instructional Resources
Contact Information

XSEDE's Stampede2, Comet supercomputers complete simulations pertinent to coronavirus, DNA replication

By Jorge Salazar, Texas Advanced Computing Center

Supercomputer simulations led scientists to a mechanism for the budding off of viruses such as the coronavirus. A related study also used simulations to find a mechanism for how the DNA of all life adds a base to its growing strand during replication. This fundamental research could help lead to new strategies and better technology that combats infectious and genetic diseases. [Credit: Mandal et al.]

 

Fundamental research could help lead to new strategies and better technology that combats infectious and genetic diseases.

Viruses such as the novel coronavirus rely on the host cell membrane to drastically bend and eventually let loose the replicated viruses trapped inside the cell. Scientists have used supercomputer simulations to help propose a mechanism for this budding off of viruses. A related study also used simulations to find a mechanism for how the DNA of all life adds a base to its growing strand during replication.

Researchers used supercomputer time awarded through XSEDE funded by the National Science Foundation for this research on both the Stampede2 supercomputer at TACC, and the Comet supercomputer at SDSC.

Atomistic simulation of membrane deformation with two Vps32 trimers separated by (a) ~25 nm and (b) ~50 nm on a membrane ribbon. (c and d) Results of continuum mechanics analysis that illustrate the effect of two filaments separated by different distances on the shape of the membrane are shown; the spontaneous curvature is zero outside of the region of the filament, shown here in red. [Credit: Mandal et al.]
Structure of DNA polymerase, highlighting the active site groups that were treated at the quantum mechanical level during simulations. Reactant state shown here. Credit: Roston et al.

The study on cell membrane remodeling, important for viral reproduction, cell growth and communication, and other biological processes, was published online in the Biophysical Journal in February 2020. Study co-author Qiang Cui was also part of a study on DNA base addition published in the Proceedings of the National Academy of Sciences in December 2019. Cui is a professor in the Departments of Chemistry, Physics, and Biomedical Engineering at Boston University.

"Supercomputers with massive parallelization are very much required to push the boundary of biomolecular simulations," Cui said.

Cui's science team developed supercomputer simulations of the cell membrane, in particular filaments of the Vps32 protein, a major component of the endosomal sorting complex required for transport III (ESCRT-III), which was the prime suspect for the driving force that causes the cell membrane to form buds in a process called membrane invagination. ESCRT proteins function in the cytosol, the liquid inside cells surrounding organelles, the cell subunits. They perform various jobs, such as making organelles, sorting recyclable material in the cell, ejecting waste, and more.

Left: Qiang Cui, Professor in the Departments of Chemistry, Physics, and Biomedical Engineering, Boston University. Right: Daniel Roston, Assistant Project Scientist, Department of Chemistry and Biochemistry at UC San Diego.

"The most interesting observation is that the ESCRT-III polymer that we studied features a clear intrinsic twist," Cui said. "This suggests that twisting stress that accumulates as the polymer grows on the surface might play a major role in creating the three-dimensional buckling of the membrane. People focused more on the bending of the filament in the past."

The proposed mechanism supported by simulations basically involves initially dimpling and then pushing out of the membrane as the corkscrew Vps32 protein filament grows, eventually causing the neck of the membrane invagination.

Simulations of systems containing up to two million atoms posed a large hurdle for Cui and colleagues. "Stampede2 has been crucial for us to set up these relatively large-scale membrane simulations," Cui said.

While this study is pure research, the knowledge gained could help benefit society.

"Membrane remodeling is an important process that underlies many crucial cellular functions and events, such as synaptic transmission and virus infection. Understanding the mechanism of membrane remodeling will ultimately help propose new strategies for battling human diseases due to impaired membrane fusion activities — or preventing viral infection — a timely topic these days given the quick spread of the new coronavirus," Cui said.

Cui also co-authored a computational study that used supercomputer simulations to determine a chemical mechanism for the reaction of nucleotide addition, used in the cell to add nucleotide bases to a growing strand of DNA.

Diagram of plausible mechanism of membrane curvature development catalyzed by Vps32 filaments. (a) Initial adsorption of Vps32 induces local positive curvature because of insertion of the N-terminal helix; (b) adsorption of a ring of Vps32 polymers induces a negative curvature at the center of the circular ring; (c) as the Vsp32 filament continues to elongate, bending and twisting deformations of the filament lead to the formation of a three-dimensional (3D) helical spiral that creates the neck of the membrane invagination. [Credit: Mandal et al.]

"By doing that, computationally, we are also able to determine the role of a catalytic metal ion of magnesium that's in the active site of the enzyme DNA polymerase," said study co-author Daniel Roston, assistant project scientist in the Department of Chemistry and Biochemistry at UC San Diego. "This metal has been a bit controversial in the literature. Nobody was really sure exactly what it was doing there. We think it's playing an important catalytic role."

DNA polymerase adds the nucleotides guanine, adenine, thymine, cytosine (G-A-T-C) to DNA by removing a proton from the end of the growing strand through reaction with a water molecule.

"When we say in the study that a water molecule serves as the base, it serves as a base to remove a proton, an acid base chemistry. What's left there after you remove the proton is much more chemically active to react with a new nucleotide that needs to be added to the DNA," Roston said.

The chemistry needs multiple proton transfers in a complex active site. Experimental probes using X-ray crystallography have been unable to distinguish among the many possible reaction pathways.

"Simulations offer a complement to crystallography because you can model in all the hydrogens and run molecular dynamics simulations, where you allow all the atoms to move around in the simulation and see where they want to go, and what interactions are helping them get where they need to go," Roston said.

"Our role was to do these molecular dynamics simulations and test different models for how the atoms are moving around during the reaction and test different interactions that are helping that along."

Mechanism (A), transition state structure (B), and free-energy surfaces for a mechanism with a Mg2+-coordinated hydroxide as the base under 4 different conditions (C–F). The 4 conditions are: 2 Mg2+ and deprotonated leaving group (C), 3 Mg2+ and deprotonated leaving group (D), 2 Mg2+ and protonated leaving group (E), 3 Mg2+ and protonated leaving group (F). The proposed mechanism achieves all the characteristics of the mechanism suggested by experiments, including rate acceleration by a third Mg2+ and by protonation of the leaving group. The structure in the Upper Right is representative of the transition state region for that reaction with the breaking and forming bonds shown as transparent. The proton shown in red is only present in the Bottom simulations. The dotted lines guide the eye along the minimum free-energy path from reactant to product; the transition state corresponds to the location of the maximum free energy along this minimum path. Roston et al

The number of energy calculations needed to complete the molecular dynamics simulations was huge, on the order of 10^8 to 10^9 for a system with thousands of atoms and many complex interactions. That's because timesteps at the right resolution are on the order of femtoseconds, or 10^-15 seconds.

"Chemical reactions, life, doesn't happen that quickly," Roston said. "It happens on a timescale of people talking to each other. Bridging this gap in timescale of many, many orders of magnitude requires many steps in your simulations. It very quickly becomes computationally intractable."
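The timescale gap Roston describes can be sketched with simple back-of-envelope arithmetic; the durations below are illustrative values, not figures from the study:

```python
# Illustrative arithmetic for the MD timescale gap.
# These durations are example values, not figures from the study.
dt = 1e-15          # a femtosecond: typical MD timestep, in seconds
simulated = 1e-6    # one microsecond of simulated time, in seconds

steps = simulated / dt  # ~1e9 timesteps
print(f"about {steps:.0e} timesteps needed")
```

Every one of those timesteps requires a full energy evaluation over all atom pairs, which is why the total computation so quickly becomes intractable without massive parallelization.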

"One of the great things about XSEDE is that we can take advantage of a ton of computational power," Roston added. Through XSEDE, Roston and colleagues used about 500,000 CPU hours on Comet system at SDSC. Comet allowed them to simultaneously run many different simulations that all feed off one another.

Said Roston: "DNA replication is what life is about. We're getting at the heart of how that happens, the really fundamental process to life as we know it on Earth. This is so important, we should really understand how it works at a deep level. But then, there are also important aspects of technology such as CRISPR that take advantage of this kind of work to develop systems to manipulate DNA. Understanding the details of how life has evolved to manipulate DNA will surely play a role in feeding our understanding and our ability to harness technologies in the future."

‘Molecular simulation of mechanical properties and membrane activities of the ESCRT-III complexes' was published online February 2020 in the journal Biophysical Journal. The study co-authors are Taraknath Mandal and Qiang Cui of Boston University; Wilson Lough, Saverio E. Spagnolie, and Anjon Audhya of the University of Wisconsin-Madison. Study funding came from the National Science Foundation. Computations are also supported in part by the Shared Computing Cluster, which is administered by Boston University's Research Computing Services.

‘Extensive free-energy simulations identify water as the base in nucleotide addition by DNA polymerase' was published December 2019 in the Proceedings of the National Academy of Sciences. The study co-authors are Daniel Roston of the University of California San Diego; Darren Demapan of the University of Wisconsin-Madison; and Qiang Cui of Boston University. Study funding came from the National Institutes of Health.

The Stampede2 supercomputer at the Texas Advanced Computing Center (left) and the Comet supercomputer at the San Diego Supercomputer Center (right) are allocated resources of the Extreme Science and Engineering Discovery Environment (XSEDE) funded by the National Science Foundation (NSF). Credit: TACC, SDSC.

 

At a Glance:

  •  XSEDE supercomputer simulations support a new mechanism for the budding off of viruses.
  • The ESCRT-III polymer features a clear intrinsic twist in molecular dynamics simulations, which might play a major role in creating the 3D buckling of the cell membrane. A related study used simulations to find the mechanism for DNA base addition during replication.
  • The computational study determined the role of a catalytic magnesium ion in the active site of DNA polymerase.
  • The XSEDE-allocated supercomputers Stampede2 of TACC and Comet of SDSC supported the studies.
  • This fundamental research could help lead to new strategies and better technology that combats infectious and genetic diseases.


AI Running on XSEDE Systems Surpasses Humans at Classifying Galaxies

By Ken Chiacchia, Pittsburgh Supercomputing Center

Images from the Dark Energy Survey that the AI identified as spiral (top) or elliptical (bottom).

New telescope surveys are discovering hundreds of millions of new galaxies—far more than humans can classify. A National Center for Supercomputing Applications (NCSA)-led team has employed deep learning artificial intelligence (AI) on XSEDE-allocated systems to produce a galaxy-classifying artificial intelligence with better-than-human accuracy and capacity.

Why It's Important

Astronomers estimate there are at least 100 billion galaxies in the observable universe.

Scientists would like to get a better handle on these huge collections of stars for a number of reasons. For one, most of the mass of the universe seems to be invisible. One way we "see" the presence of this dark matter is through its effects on galaxies. Also, the motions of galaxies tell us that the expansion of the universe is accelerating. The reason for this may be that most of the energy of the universe is in an unknown form called dark energy. Astrophysical surveys, such as the recent Dark Energy Survey (DES) and the upcoming Legacy Survey of Space and Time (LSST), are collecting data to study these fundamental questions.

"Cataloging all the galaxies in the universe is of fundamental interest in science for a number of reasons. For instance, combining gravitational wave observations with large scale galaxy catalogs has enabled the first gravitational wave standard-siren measurement of the Hubble constant which tells us how fast the universe is expanding…Astronomers have been trying to use AI to automate these tasks for quite some time, but traditional machine-learning algorithms, while promising, couldn't achieve human-level accuracy." — Asad Khan, NCSA

As a first step, scientists are studying the shapes of galaxies. The shape of a galaxy tends to be strongly intertwined with the history of its evolution. Shape also sheds light on a galaxy's star-formation rate, past mergers and interactions with other galaxies as well as other properties.

The logical starting point for astronomers in modern surveys is to classify and sort the vast number of galaxies observed. The main classification is whether a galaxy has a spiral shape, with curving arms like the Milky Way, or elliptical, which looks like a uniform ball of stars.

A method of visualizing how the AI classified galaxies helped give astronomers confidence. Classification of the labeled Dark Energy Survey test set (left), the Sloan Digital Sky Survey test set (center) and the predictions made by the AI for unlabelled galaxies (right).

This simple task is enormous owing to the tremendous number of galaxies. Astronomers initially turned to crowdsourcing to solve it. One highly successful effort was Galaxy Zoo. It used thousands of volunteers to classify galaxies. They classified 900,000 in the project's first phase. Volunteers will continue to have a role. But newer surveys of farther-away galaxies will dwarf that effort. The earlier Sloan Digital Sky Survey (SDSS) identified 50 million galaxies. The DES has identified more than 300 million. Even with thousands of volunteers, astronomers could never classify that many.

Graduate student Asad Khan, his advisor Eliu Huerta, and colleagues at NCSA at the University of Illinois Urbana-Champaign, as well as at Argonne National Laboratory, decided to solve this problem using deep learning on the XSEDE-allocated systems Bridges at the Pittsburgh Supercomputing Center and Comet at the San Diego Supercomputer Center.

How XSEDE Helped

Previous attempts to apply AI to galaxy classifications couldn't achieve human-level accuracy. To improve on that, the NCSA-led team turned to a type of machine learning called deep learning (DL). In DL, the computer learns a representation of the data, using a multi-level artificial neural network. They employed Comet in the early phases of the work, transitioning to Bridges to take advantage of the most advanced processors available for deep learning at the time—NVIDIA Tesla P100 GPUs. Today, both Bridges and Comet contain P100 nodes.
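As a rough picture of what "learning a representation" with a multi-level network means, here is a toy forward pass. The layer sizes, random weights, and flattened grayscale input are all invented for illustration; the team's actual model was a far larger deep convolutional network:

```python
import numpy as np

# Toy multi-level network: each layer transforms the previous layer's
# output into a more abstract representation, and the final layer
# scores the two classes (spiral vs. elliptical). All sizes and
# weights here are invented for illustration.
rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(x, 0.0)

image = rng.random((1, 32 * 32))  # a fake 32x32 "galaxy image", flattened

W1, b1 = rng.normal(size=(32 * 32, 64)) * 0.01, np.zeros(64)
W2, b2 = rng.normal(size=(64, 16)) * 0.01, np.zeros(16)
W3, b3 = rng.normal(size=(16, 2)) * 0.01, np.zeros(2)

h1 = relu(image @ W1 + b1)        # first learned representation
h2 = relu(h1 @ W2 + b2)           # more abstract representation
scores = h2 @ W3 + b3             # raw class scores
probs = np.exp(scores) / np.exp(scores).sum()  # softmax over 2 classes
print(probs.shape)  # (1, 2)
```

Training consists of adjusting the weight matrices so that the output probabilities match the human-provided labels; the "deep" part is simply stacking many such layers.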

"XSEDE was pretty helpful for quickly testing out initial ideas for our project and hence played an important role in shaping our research that eventually resulted in a peer-reviewed publication that has been cited several times, and which has been extensively followed up by specialized magazines in Europe and the U.S. It is useful to have a shared resource for computation at the national level that can quickly respond to the demands of scientists from different and varied disciplines. We were able to submit several jobs to do a hyperparameter search for the best architecture for our problem. The ability to submit several jobs in parallel—and access to several GPUs—was pretty useful to cut back on [computational] time by at least four-fold. We saved about $1,000 that we would have required to do the same computing and data storage on the cloud." — Asad Khan, NCSA

For the data set, the scientists used a subset of the SDSS classified by the volunteers of Galaxy Zoo and verified as being above 90 percent accurate. They divided the data into three subsets: a roughly 36,000-galaxy training data set; a 1,000-galaxy validation data set; and a 12,500-galaxy testing data set. They chose the latter two data sets so that the galaxies in them lie in parts of the sky that both the SDSS and the DES had surveyed, taking advantage of the lessons learned by the earlier study. To generate and process all of the data sets that they used for training and testing, they used the Blue Waters supercomputer at NCSA, an XSEDE SP-2 resource.
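The three-way split described above is simple bookkeeping, sketched below. The galaxy IDs and the helper function are hypothetical; in the actual study the validation and test sets were additionally restricted to sky regions covered by both SDSS and DES:

```python
import random

def split_dataset(items, n_train, n_val, n_test, seed=0):
    """Partition a list of galaxy IDs into train/validation/test subsets.

    Hypothetical helper for illustration; the real selection also
    constrained which sky regions the validation/test galaxies lie in.
    """
    rng = random.Random(seed)
    items = items[:]          # copy so the caller's list is untouched
    rng.shuffle(items)
    assert n_train + n_val + n_test <= len(items)
    return (items[:n_train],
            items[n_train:n_train + n_val],
            items[n_train + n_val:n_train + n_val + n_test])

galaxies = list(range(49500))  # placeholder IDs, sized to match the paper's split
train, val, test = split_dataset(galaxies, 36000, 1000, 12500)
print(len(train), len(val), len(test))  # 36000 1000 12500
```

Holding out the validation set to tune hyperparameters and the test set for a single final evaluation is what makes the reported 85 percent agreement (and the over-99-percent adjusted accuracy) a fair estimate of performance on unseen galaxies.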

In the testing phase, the AI matched the Galaxy Zoo classifications 85 percent of the time. But when they adjusted for the known error rate in Galaxy Zoo, they found their AI was over 99 percent accurate—better than the humans. As a last step, the scientists applied their AI to predict galaxy types in a set of about 10,000 not-yet-labelled galaxies. In addition, they had built their AI so that its processes for classifying the galaxies could be examined by humans. This step, which explained how the AI works, was important for convincing astronomers that the AI's methods can be trusted.

The team reported their results in the journal Physics Letters B in August 2019. They presented their visualization the following November at the annual SC19 supercomputing conference.

"In order to accelerate the adoption of AI tools for big-data analytics, it is essential to understand how these algorithms process data and extract information to make trustworthy predictions. In this article, we first designed AI algorithms that significantly outperform humans at classification and data labelling tasks, and then produced scientific visualizations that shed new and detailed information about how neural networks perform these tasks." — Eliu Huerta, NCSA

Future work will be to apply the method to larger groups of unidentified galaxies, automating galaxy identification to keep pace with the hundreds of millions expected to be discovered in the near future. The team has also begun using XSEDE-allocated Bridges-AI, whose NVIDIA Tesla V100 GPUs are currently the most advanced GPUs for deep learning. The platform's NVIDIA DGX-2 enterprise AI research system enables high-performance deep learning across 16 V100s.

This research is part of the Blue Waters sustained-petascale computing project, which is supported by the National Science Foundation (NSF) (awards OCI-0725070 and ACI-1238993) and the State of Illinois. Blue Waters is a joint effort of the University of Illinois at Urbana-Champaign and its National Center for Supercomputing Applications. NVIDIA donated several Tesla P100 and V100 GPUs used for the analysis. This work used the Extreme Science and Engineering Discovery Environment (XSEDE), which is supported by National Science Foundation (NSF) grant number ACI-1548562. Specifically, it used the Bridges system, which is supported by NSF award number ACI-1445606, at the Pittsburgh Supercomputing Center (PSC) and Comet, which is supported by NSF award number ACI-1341698, at the San Diego Supercomputer Center (SDSC). Additional support was through grant TG-PHY160053. This research used resources of the Argonne Leadership Computing Facility, which is a DOE Office of Science User Facility supported under Contract DE-AC02-06CH11357.

 

Read the paper.

At a Glance

  • New telescope surveys are discovering hundreds of millions of new galaxies — far more than humans can classify.
  • A National Center for Supercomputing Applications (NCSA)-led team has employed "deep learning" artificial intelligence (AI) on XSEDE-allocated systems to produce a galaxy-classifying AI.
  • The system demonstrates better-than-human accuracy and capacity.

Organization

This section describes the breakdown of the XSEDE organizational structure. Questions? Contact us at info@xsede.org.

XSEDE is divided into four organizational levels:

  • Level 1 (Project Level):
    • encompasses all functional areas as well as external components (e.g. XAB, SPF, UAC)
    • led by the Principal Investigator, John Towns
  • Level 2 (L2 Functional Areas):
    • represents the six functional areas of the project: Community Engagement & Enrichment (CEE), Extended Collaborative Support Services (ECSS), XSEDE Cyberinfrastructure Integration (XCI), Operations (Ops), Resource Allocations Service (RAS), and the Program Office
    • led by the L2 Directors
  • Level 3 (L3 Focus Areas):
    • Each L2 Functional Area is separated into focus areas, called L3 teams, that, collectively, represent the functional responsibilities of that L2 area
    • led by L3 Managers
  • Level 4 (Working Level):
    • Individual team members complete assigned individual and group activities for the L3 team(s) to which they report

A Work Breakdown Structure (WBS) approach is used to designate Levels 1 through 3 of the organizational structure.

Organizational Level 2 teams are specifically set up to align with the project's strategic goals.

Work Breakdown Structure diagram

The breakdown of the XSEDE project is shown in the figure below.

XSEDE Organization Chart

Key Points
Four organizational levels
Organizational structure aligned with strategic goals
Contact Information

Rice University Study Advances Research on Individualized Patient Treatment

By Kim Bruch, San Diego Supercomputer Center (SDSC)

The Comet supercomputer at the San Diego Supercomputer Center was used to test interactions between thousands of pairs of molecules to advance immunotherapy research aimed at combating cancer. This image depicts a Human Leucocyte Antigen (HLA) receptor (in grey) that displays a small protein fragment or peptide (in red) at the surface of a cell. If this peptide is recognized as "suspicious" by the immune system, the cell will be destroyed. Credit: D. Devaurs, Rice University

 

With the American Cancer Society estimating 1.76 million new cases and more than 600,000 deaths during 2019 in the U.S. alone, cancer remains a critical healthcare challenge. In efforts to help mitigate these numbers, researchers at Rice University requested an allocation from the Extreme Science and Engineering Discovery Environment (XSEDE) for time on the Comet supercomputer at the San Diego Supercomputer Center (SDSC). They used Comet to evaluate their new molecular docking tool, called Docking INCrementally or DINC, which aims to improve immunotherapy outcomes by identifying more effective personalized treatments.

Led by postdoctoral researcher Didier Devaurs, the Rice researchers recently published their evaluation of DINC in the BMC Molecular and Cell Biology journal. The most significant result is that their molecular docking approach can make predictions of molecular interactions that other docking tools would miss. This has strong implications in cases where these predictions are notoriously difficult to make, and especially in the context of immunotherapy, which leverages the immune system to combat cancer.

"Immunotherapy is an innovative cancer treatment that has shown promising results," said Devaurs. "It consists of ‘training' a patient's cells by recognizing specific tumor-derived peptides, which are fragments of proteins within a cell." He further explained that each cancer patient displays a unique set of tumor-derived proteins and requires a fully personalized treatment.

The goal of their DINC tool is to assist with identifying these peptides for cancer immunotherapy, which required significant high-performance computing resources.

"The computational challenge here is that thousands of tumor-derived peptides were tested, and each test required exhaustive computing on Comet," said Devaurs. "Thanks to XSEDE, we were able to complete this evaluation of DINC, and our study showed that Comet has the computational power to make predictions that could be useful to immunotherapy. We will next assess how to rank the numerous predictions it makes to provide only the most realistic ones to our clinician colleagues working on novel cancer treatments."

Access to Comet was provided via the National Science Foundation's Extreme Science and Engineering Discovery Environment (XSEDE) program, under NSF grant ACI-1548562. Additional funding came from the National Institutes of Health (1R21CA209941-01), the Informatics Technology for Cancer Research (ITCR) initiative of the National Cancer Institute (NCI), and the Cancer Prevention & Research Institute of Texas (RP170508). This work was also supported by a training fellowship from the Gulf Coast Consortia through the Computational Cancer Biology Training Program (RP170593).

Molecular structures are depicted by images produced with the PyMOL Molecular Graphics System, Version 1.8 Schrödinger, LLC. The funding bodies had no role in the design of the study and collection, analysis, and interpretation of data, or in writing the manuscript.

At a Glance

  • The American Cancer Society estimates 1.76 million new cases and more than 600,000 deaths during 2019 in the U.S.
  • In efforts to help mitigate these numbers, researchers used the Comet supercomputer at SDSC to evaluate a new molecular docking tool, which aims to improve immunotherapy outcomes by identifying more effective personalized treatments.