ECSS staff share technical solutions to scientific computing challenges monthly in this open forum.
The ECSS Symposium allows the more than 70 ECSS staff members to exchange information each month about successful techniques used to address challenging science problems. Tutorials on new technologies may be featured. Two 30-minute, technically focused talks are presented each month and include a brief question and answer period. This series is open to everyone.
Day and Time: Third Tuesdays @ 1 pm Eastern / 12 pm Central / 10 am Pacific
Add this event to your calendar.
Webinar (PC, Mac, Linux, iOS, Android): Launch Zoom webinar
iPhone one-tap (US Toll): +16468769923,,114343187# (or) +16699006833,,114343187#
Telephone (US Toll): Dial (for higher quality, dial a number based on your current location):
US: +1 646 876 9923 (or) +1 669 900 6833 (or) +1 408 638 0968
Meeting ID: 114 343 187
Upcoming events are also posted to the Training category of XSEDE News.
Due to the large number of attendees, only the presenters and host broadcast audio. Attendees may submit chat questions to the presenters through a moderator.
June 19, 2018
An Innovative Tool for IO Workload Management on Supercomputers
Presenter(s): Si Liu (TACC)
Modern supercomputer applications have been driving a high demand for capable storage resources in addition to fast computing resources. However, these storage systems, especially parallel shared filesystems, have become the Achilles' heel of powerful supercomputers. A single user's improper IO workload can easily result in global filesystem performance degradation and even unresponsiveness. In this project, we developed an innovative IO workload managing system that optimally controls the IO workload from the users' side. This system will automatically detect and restrict improper IO workload from supercomputer users to protect parallel shared filesystems.
The Brain Image Library
Presenter(s): Derek Simmel (PSC)
The Brain Image Library (BIL) is a national public resource enabling researchers to deposit, analyze, mine, share and interact with large brain image datasets. As part of a comprehensive U.S. NIH BRAIN cyberinfrastructure initiative, BIL encompasses the deposition of datasets, the integration of datasets into a searchable web-accessible system, the redistribution of datasets, and a High Performance Computing enclave to allow researchers to process datasets in-place and share restricted and pre-release datasets. BIL serves a geographically distributed user base including large confocal imaging centers that are generating petabytes of confocal imaging datasets per year. For these users, the library serves as an archive facility for whole brain volumetric datasets from mammals, and a facility to provide researchers with a practical way to analyze, mine, share or interact with large image datasets. The Brain Image Library is operated as a partnership between the Biomedical Applications Group at the Pittsburgh Supercomputing Center, the Center for Biological Imaging at the University of Pittsburgh and the Molecular Biosensor and Imaging Center at Carnegie Mellon University.
In this talk, I will briefly review the characteristics of the data that the Brain Image Library will store, and the infrastructure we are building at PSC to ingest and manage the data for access.
May 15, 2018
Computational fluid-structure interaction of biological systems
Presenter(s): Hang Liu (TACC)
Principal Investigator(s): Haoxiang Luo (Vanderbilt University)
I will briefly discuss what we have done to optimize the VICAR3D codes developed by the PI's group through this ECSS project. This includes the standard procedures we usually follow in this kind of effort, such as profiling code performance characteristics, sorting out the performance glitches, reorganizing the data domain decomposition, making the code more efficient in parallel, and examining the performance portability when applying the code on architectures from Sandy Bridge and Knights Corner on Stampede1 to Knights Landing on Stampede2. I would also like to share some interesting collisions and pleasant collaborations with the PI during the project and the lessons we learned.
A historical big data analysis to disclose the social construction of juvenile delinquency
Presenter(s): Sandeep Puthanveetil (NCSA)
Principal Investigator(s): Yu Zhang (The State University of New York at Brockport)
Social construction is a theoretical position that social reality is created through human definition and interaction. As one type of social reality, juvenile delinquency is perceived as part of social problems, deeply contextualized and socially constructed in American society. The social construction of juvenile delinquency started far earlier than the first juvenile court in 1899 in the U.S. Scholars have tried traditional historical analysis to explore the timeline of the social construction of juvenile delinquency in the past, but it is inefficient to examine hundreds of years of documents using traditional paper-and-pencil methods. This project aims to study the social construction of juvenile delinquency in the United States using data analysis of scanned historic newspaper collections. It combines image and linguistic analyses with big data tools to analyze hundreds of years of scanned newspaper images and show a clear development of the social construction of juvenile delinquency in American society. Currently, the startup phase analyzes data from an archive of newspapers (1853-1921) from the Library of Congress Chronicling America website (http://chroniclingamerica.loc.gov/newspapers/). Sandeep will provide a very brief overview of the project, discuss the image analysis tools being designed and developed as part of this project, specifically with regard to segmentation of newspaper articles and OCR, their current progress, and some of the upcoming tasks in the text analysis and visualization stages of the project.
April 17, 2018
Clusters in the Cloud - Programmable, Elastic Cyberinfrastructure
Presenter(s): Eric Coulter (IU)
Principal Investigator(s): Sudhakar Pamidighantam (IU) Amit Majumdar (SDSC) Borries Demeler (UTHSC)
Eric will discuss the process of building a customized virtual cluster using OpenStack, Ansible and SLURM, the benefits of elastic resources for gateway groups, and how this can be applied to extend the compute resources available to traditional hardware systems. Eric has worked with PI Sudhakar Pamidighantam (SEAGrid science gateway), PI Amit Majumdar (Neuroscience Gateway) and PI Borries Demeler (UltraScan science gateway) to enable production-ready virtual clusters on Jetstream.
Software as a Service Gateways
Presenter(s): Eroma Abeysinghe (IU)
Principal Investigator(s): Alison Marsden (Stanford) Charles Danko (Cornell)
Research groups producing open source scientific software often face the daunting task of helping their user communities with build instructions for a wide variety of hardware platforms, assisting in optimizing the applications, and developing detailed documentation for using the software. Such software communities can ease the support burden by developing custom science gateways for this specialized software. In this talk, Eroma Abeysinghe will discuss two such efforts in developing, deploying and operating science gateways for a finite-element blood flow solver (SimVascular) and Detection of Regulatory Elements (dReg), working in collaboration with PIs Alison Marsden and Charles Danko, respectively. Eroma will discuss her experiences in developing these gateways based on the open community science gateway framework Apache Airavata and the PIs' early success in community engagement with research and education.
March 20, 2018
ECSS Symposium featuring PI Panel
Presenter(s): Michael Cianfrocco (University of Michigan) Cameron Smith (Rensselaer Polytechnic Institute) Jian Tao (Texas A&M University) Sever Tipei (University of Illinois)
Curious about XSEDE's Extended Collaborative Support Services (ECSS)? Join us at our ECSS Symposium webinar on March 20 to hear from a panel of PIs about their experiences working with ECSS! They'll share what it was like requesting ECSS support, what the collaboration was like throughout the course of the project, and how ECSS support helped them achieve results.
Michael Cianfrocco is a Research Assistant Professor at the University of Michigan's Life Sciences Institute. Michael's ECSS project, "Analysis of Cryo-EM data on Comet and Gordon," began with a postdoctoral position with Andres Leschziner's lab at UCSD. Michael has been working with Mona Wong (SDSC) through both ECSS and the Science Gateways Community Institute to develop a gateway that would offer the cryoEM science community a web-based tool to simplify the analysis of data using a standardized workflow running on XSEDE's supercomputers. This gateway will lower the barrier to high performance computing tools and contribute to the fast-growing field of structural biology.
Cameron Smith is a Computational Scientist at the Scientific Computation Research Center at Rensselaer Polytechnic Institute. Cameron's project, "Adaptive Finite-element Simulations of Complex Industrial Flow Problems," focuses on scaling and performance analysis of adaptive in-memory workflows using PHASTA CFD, EnGPar load balancing, and PUMI unstructured mesh services on Stampede2's Knights Landing processors. The workflows are executed through the PHASTA science gateway. Cameron worked with ECSS staff Lars Koesterke and Lei Huang (both at TACC) on this project.
Jian Tao is a Research Scientist in the Strategic Initiatives Group at Texas A&M Engineering Experiment Station and High Performance Research Computing at Texas A&M University. Jian's work, "Deploying Containerized Coastal Model on XSEDE Resources," first began while he was at Louisiana State University. The goal is to develop and deploy enhancements into the SIMULOCEAN science gateway, integrating new Docker features of Bridges and Globus capabilities for authentication, file transfer and sharing. The PI worked with Mona Wong and Andrea Zonca (SDSC) and Stuart Martin from the Globus team.
Sever Tipei is a Professor of Composition-Theory in the School of Music at the University of Illinois' College of Fine and Applied Arts. His project, "DISSCO, a Digital Instrument for Sound Synthesis and Composition," involves optimization and parallelization of the multi-threaded code DISSCO (developed jointly at the UIUC Computer Music Project and at Argonne National Laboratory). DISSCO combines the field of Computer-assisted Composition with that of Sound Design in a seamless process. Sever has worked with ECSS staff Paul Rodriguez and Bob Sinkovits (both at SDSC) on this project.
February 20, 2018
Deep Learning: An Increasingly Common HPC Task
Presenter(s): Paola Buitrago (PSC) Joel Welling (PSC)
Presentation Slides: Joel Welling
Presentation Slides: Paola Buitrago
Deep learning is a highly compute- and data-intensive category of tasks with wide applicability in science as well as industry. Join Paola Buitrago and Joel Welling from the Pittsburgh Supercomputing Center in two talks that will provide an overview of the current deep learning landscape and examples of the deep learning environments available to XSEDE users. Paola will provide a brief history of the field and an update on its technical performance, with examples from domains as diverse as vision and theorem proving. Joel will follow with a description of the PSC's support for two major deep learning packages, TensorFlow and Caffe.
January 16, 2018
An Introduction to Jetstream
Presenter(s): Virginia Trueheart (TACC)
Jetstream is an interactive computing resource designed to make High Performance Computing accessible to users that are not part of traditional HPC fields. This tutorial aims to introduce Jetstream's capabilities to this expanded user base. It will demonstrate how to access the system, make use of the various Virtual Machines available, and use publicly available images to assist with research. It will also cover how to create, modify, and save personal images that can be customized to individual workflows and be saved long term for reference in publication.
Visualizations of Simulated Supercell Storm Data
Presenter(s): Greg Foss (TACC)
Principal Investigator(s): Amy McGovern (University of Oklahoma)
XSEDE ECSS project: High Performance Computing Resources in Support of Spatiotemporal Relational Data Mining for Anticipation of Severe Weather.
Amy McGovern and her collaborator Corey Potvin from the National Severe Storms Laboratory are developing and applying novel spatiotemporal data mining techniques to supercell thunderstorm simulations, with the goal of identifying tornado precursors. The overall goal of the project is to improve tornado warning lead time and accuracy by integrating into the "Warn on Forecast" project, a National Oceanic and Atmospheric Administration research program tasked to increase tornado, severe thunderstorm, and flash flood warning lead times.
XSEDE ECSS staff was enlisted to see what could be found using 3D visualization techniques and an interactive user interface. The resulting images and animations will assist in defining storm features and serve as input to the data mining, ensuring that automatically extracted objects match visually identified ones. This talk will feature graphics from a selection of three (5.7 TB) datasets, with visualization samples identifying various supercell thunderstorm features.
December 19, 2017
Jupyter Notebooks deployments at scale for Gateways and Workshops
Presenter(s): Andrea Zonca (SDSC)
Andrea Zonca (SDSC) will give an overview of deployment options for Jupyter Notebooks at scale on XSEDE resources. They are all based on deploying JupyterHub on Jetstream, then either spawning Notebooks on a traditional HPC system or setting up a distributed scalable system on Jetstream instances via either Docker Swarm or Kubernetes.
Deployment and benchmarking of RDMA Hadoop, Spark, and HBase on SDSC Comet
Presenter(s): Mahidhar Tatineni (SDSC)
Data-intensive computing middleware (such as Hadoop, Spark) can potentially benefit greatly from the hardware already designed for high performance and scalability with advanced processor technology, large memory/core, and high performance storage/filesystems. Mahidhar Tatineni (SDSC) will give an overview of the deployment and performance of Remote Direct Memory Access (RDMA) Hadoop, Spark, and HBase middleware on the XSEDE Comet HPC resource. These packages have been developed by Dr. D.K. Panda's Network-Based Computing (NBC) Laboratory at the Ohio State University. The talk will cover details of the integration with the HPC scheduling framework, the design and components of the packages, and the performance benefits of the design. Applications tested include the Kira toolkit (astronomy image processing), latent Dirichlet allocation (LDA) for topic modeling, and BigDL (distributed deep learning library).
October 17, 2017
Geodynamo Simulation Code for Paleomagnetic Observations
Presenter(s): Shiquan Su (NCAR) Chad Burdyshaw (NICS)
Principal Investigator(s): David Gubbins (Scripps Institution of Oceanography at UCSD)
This study characterizes a geodynamo simulation code for paleomagnetic observations targeted to run on the TACC Stampede2 KNL cluster, a hybrid, distributed many-core parallel architecture. Issues examined are parallel scaling across distributed nodes and within the many-core architecture, as well as vectorization efficiency, arithmetic intensity and memory throughput.
This presentation includes two parts. In the first part, Shiquan Su from NCAR will introduce the background of the project, the parallelization algorithm, the experience on the Stampede KNL cluster, OpenMP treatment, and the two approaches to running the project jobs on the machines. In the second part, Chad Burdyshaw from UTK takes a close look at the code. Chad will discuss the tools used to interrogate performance, observations, remedies, and potential solutions.
September 19, 2017
COSMIC2 - A Science Gateway for Cryo-Electron Microscopy with Globus for Terabyte-sized Dataset
Presenter(s): Mona Wong-Barnum (SDSC)
Principal Investigator(s): Andres Leschziner (UCSD) Michael Cianfrocco (University of Michigan)
Structural biology is in the midst of a revolution. Instrumentation and software improvements have allowed for the full realization of cryo-electron microscopy (cryo-EM) as a tool capable of determining atomic structures of protein and macromolecular samples. These advances open the door to solving new structures that were previously unattainable, which will soon make cryo-EM a ubiquitous tool for structural biology worldwide, serving both academic and commercial purposes. However, despite its power, new users to cryo-EM face significant obstacles. One major barrier consists of the handling of large datasets (10+ terabytes), where new cryo-EM users must learn how to interface with the Linux command line while also dealing with managing and submitting jobs to high performance computing resources. To address this barrier, we are developing the COSMIC2 Science Gateway as an easy, web-based science gateway to simplify cryo-EM data analysis using a standardized workflow. Specifically, we have adapted the successful and mature Cyberinfrastructure for Phylogenetic Research (CIPRES) Workbench and integrated Globus Auth and Globus Transfer to enable federated user identity management and large dataset transfers to the Extreme Science and Engineering Discovery Environment's (XSEDE) high performance computing (HPC) systems. With the support of XSEDE's Extended Collaborative Support Services (ECSS) and the Science Gateways Community Institute's (SGCI) Extended Developer Support (EDS), this gateway will lower the barrier to high performance computing tools and facilitate the growth of cryo-EM to become a routine tool for structural biology. Talk previously given at PEARC'17.
First steps in optimising Cosmos++: A C++ MPI code for simulating black holes
Presenter(s): Damon McDougall (ICES)
Principal Investigator(s): Patrick C. Fragile (College of Charleston)
The goal of this ECSS project is to have Cosmos++ run effectively on Stampede2. Stampede2, at present, is made up entirely of Intel Xeon Phi nodes. These are low clock-frequency but high core-count nodes, and there are some challenges associated with running on this hardware efficiently. Although the project's end goal is to hybridise a pure MPI code, this talk will focus on some of the initial steps we have taken to improve serial performance and how these steps relate to C++ software design. Prior knowledge of compiled languages and custom types would be beneficial but isn't required.
August 15, 2017
HTC with a Sprinkle of HPC: Finding Gravitational Waves with LIGO
Presenter(s): Lars Koesterke (TACC)
Principal Investigator(s): Duncan Brown (Syracuse University) Josh Willis (Abilene Christian University)
XSEDE is supporting the LIGO project to detect signatures of gravitational waves in a stream of data generated by (currently) two observatories in the U.S., located in Washington State and Louisiana. I will report on an ECSS project tasked to improve the performance of one of the largest (most resource demanding) pipelines, called pycbc (python compact binary coalescence). The software evolved from a slow and performance-unaware state to a high-performing pipeline capable of utilizing Xeon, Xeon Phi, and Nvidia GPU architectures alike. Achieving high performance required only a few sprinkles of HPC (High Performance Computing) on top of an HTC (High Throughput Computing) pipeline. While the HPC pieces relevant for this particular project are all well known to ECSS staff, it may be surprising what was missing in the considerations of the software developers. Hence this is more a story of how to educate users than a story of new and groundbreaking HPC concepts. Nevertheless, I am confident that my fellow ECSS staffers will find this project interesting and enlightening.
Enabling multi-events 3D simulations for earthquake hazard assessment
Presenter(s): Yifeng Cui (SDSC)
Principal Investigator(s): Morgan Moschetti (USGS)
Researchers from USGS use Stampede to perform a series of computationally intensive simulations for improved understanding of earthquake hazards. The calculations are made with Hercules, a finite element solver developed at CMU that combines meshing, partitioning and solving functions in a single, self-contained code. Meshing employs a highly efficient octree-based algorithm that scales well. The simulation results are used to investigate the effects of complex geologic structure and topography on seismic wave propagation and ground-shaking hazards, and to evaluate model uncertainties in U.S. seismic hazard models. This talk will provide an overview of the current status of the seismic hazard analysis research and introduce the code's performance and the optimizations involved in supporting multi-event simulations for this study through the ECSS project.