ECSS staff share technical solutions to scientific computing challenges monthly in this open forum.
The ECSS Symposium allows the more than 70 ECSS staff members to exchange information each month about successful techniques used to address challenging science problems. Tutorials on new technologies may be featured. Two 30-minute, technically focused talks are presented each month, each including a brief question and answer period. This series is open to everyone.
Day and Time: Third Tuesdays @ 1 pm Eastern / 12 pm Central / 10 am Pacific
Add this event to your calendar.
Webinar (PC, Mac, Linux, iOS, Android): Launch Zoom webinar
iPhone one-tap (US Toll): +16468769923,,114343187# (or) +16699006833,,114343187#
Telephone (US Toll): Dial (for higher quality, dial a number based on your current location):
US: +1 646 876 9923 (or) +1 669 900 6833 (or) +1 408 638 0968
Meeting ID: 114 343 187
Due to the large number of attendees, only the presenters and host broadcast audio. Attendees may submit chat questions to the presenters through a moderator.
February 20, 2018
Deep Learning: An Increasingly Common HPC Task
Presenter(s): Paola Buitrago (PSC) Joel Welling (PSC)
Deep learning is a highly compute- and data-intensive category of tasks with wide applicability in science as well as industry. Join Paola Buitrago and Joel Welling from the Pittsburgh Supercomputing Center in two talks that will provide an overview of the current deep learning landscape and examples of the deep learning environments available to XSEDE users. Paola will provide a brief history of the field and an update on its technical performance, with examples from domains as diverse as vision and theorem proving. Joel will follow with a description of the PSC's support for two major deep learning packages, TensorFlow and Caffe.
January 16, 2018
An Introduction to Jetstream
Presenter(s): Virginia Trueheart (TACC)
Jetstream is an interactive computing resource designed to make High Performance Computing accessible to users that are not part of traditional HPC fields. This tutorial aims to introduce Jetstream's capabilities to this expanded user base. It will demonstrate how to access the system, make use of the various Virtual Machines available, and use publicly available images to assist with research. It will also cover how to create, modify, and save personal images that can be customized to individual workflows and be saved long term for reference in publication.
Visualizations of Simulated Supercell Storm Data
Presenter(s): Greg Foss (TACC)
Principal Investigator(s): Amy McGovern (University of Oklahoma)
XSEDE ECSS project: High Performance Computing Resources in Support of Spatiotemporal Relational Data Mining for Anticipation of Severe Weather.
Amy McGovern and her collaborator Corey Potvin from the National Severe Storms Laboratory are developing and applying novel spatiotemporal data mining techniques to supercell thunderstorm simulations, with the goal of identifying tornado precursors. The overall goal of the project is to improve tornado warning lead time and accuracy by integrating into the "Warn on Forecast" project, a National Oceanic and Atmospheric Administration research program tasked to increase tornado, severe thunderstorm, and flash flood warning lead times.
XSEDE ECSS staff were enlisted to see what could be found using 3D visualization techniques and an interactive user interface. The resulting images and animations will assist in defining storm features and will serve as input to the data mining, ensuring that automatically extracted objects match visually identified ones. This talk will feature graphics from a selection of three (5.7 TB) datasets, with visualization samples identifying various supercell thunderstorm features.
December 19, 2017
Jupyter Notebooks deployments at scale for Gateways and Workshops
Presenter(s): Andrea Zonca (SDSC)
Andrea Zonca (SDSC) will give an overview of deployment options for Jupyter Notebooks at scale on XSEDE resources. All of them are based on deploying JupyterHub on Jetstream and then either spawning Notebooks on a traditional HPC system or setting up a distributed, scalable system on Jetstream instances via Docker Swarm or Kubernetes.
Deployment and benchmarking of RDMA Hadoop, Spark, and HBase on SDSC Comet
Presenter(s): Mahidhar Tatineni (SDSC)
Data-intensive computing middleware (such as Hadoop, Spark) can potentially benefit greatly from the hardware already designed for high performance and scalability with advanced processor technology, large memory/core, and high performance storage/filesystems. Mahidhar Tatineni (SDSC) will give an overview of the deployment and performance of Remote Direct Memory Access (RDMA) Hadoop, Spark, and HBase middleware on the XSEDE Comet HPC resource. These packages have been developed by Dr. D.K. Panda's Network-Based Computing (NBC) Laboratory at the Ohio State University. The talk will cover details of the integration with the HPC scheduling framework, the design and components of the packages, and the performance benefits of the design. Applications tested include the Kira toolkit (astronomy image processing), latent Dirichlet allocation (LDA) for topic modeling, and BigDL (distributed deep learning library).
October 17, 2017
Geodynamo Simulation Code for Paleomagnetic Observations
Presenter(s): Shiquan Su (NCAR) Chad Burdyshaw (NICS)
Principal Investigator(s): David Gubbins (Scripps Institution of Oceanography at UCSD)
This study characterizes a geodynamo simulation code for paleomagnetic observations targeted to run on the TACC Stampede2 KNL cluster, a hybrid, distributed many-core parallel architecture. Issues examined are parallel scaling across distributed nodes and within the many-core architecture, as well as vectorization efficiency, arithmetic intensity, and memory throughput.
This presentation has two parts. In the first part, Shiquan Su from NCAR will introduce the background of the project, the parallelization algorithm, the experience on the Stampede KNL cluster, the OpenMP treatment, and the two approaches to running the project jobs on the machines. In the second part, Chad Burdyshaw from UTK takes a close look at the code. Chad will discuss the tools used to interrogate performance, observations, remedies, and potential solutions.
September 19, 2017
COSMIC2 - A Science Gateway for Cryo-Electron Microscopy with Globus for Terabyte-sized Dataset
Presenter(s): Mona Wong-Barnum (SDSC)
Principal Investigator(s): Andres Leschziner (UCSD) Michael Cianfrocco (University of Michigan)
Structural biology is in the midst of a revolution. Instrumentation and software improvements have allowed for the full realization of cryo-electron microscopy (cryo-EM) as a tool capable of determining atomic structures of protein and macromolecular samples. These advances open the door to solving new structures that were previously unattainable, which will soon make cryo-EM a ubiquitous tool for structural biology worldwide, serving both academic and commercial purposes. However, despite its power, new users to cryo-EM face significant obstacles. One major barrier consists of the handling of large datasets (10+ terabytes), where new cryo-EM users must learn how to interface with the Linux command line while also dealing with managing and submitting jobs to high performance computing resources. To address this barrier, we are developing the COSMIC2 Science Gateway as an easy, web-based science gateway to simplify cryo-EM data analysis using a standardized workflow. Specifically, we have adapted the successful and mature Cyberinfrastructure for Phylogenetic Research (CIPRES) Workbench and integrated Globus Auth and Globus Transfer to enable federated user identity management and large dataset transfers to the Extreme Science and Engineering Discovery Environment's (XSEDE) high performance computing (HPC) systems. With the support of XSEDE's Extended Collaborative Support Services (ECSS) and the Science Gateway Community Institute's (SGCI) Extended Developer Support (EDS), this gateway will lower the barrier to high performance computing tools and facilitate the growth of cryo-EM to become a routine tool for structural biology. Talk previously given at PEARC'17.
First steps in optimising Cosmos++: A C++ MPI code for simulating black holes
Presenter(s): Damon McDougall (ICES)
Principal Investigator(s): Patrick C. Fragile (College of Charleston)
This ECSS project is to have Cosmos++ run on Stampede2 effectively. Stampede2, at present, is made up entirely of Intel Xeon Phi nodes. These are low clock-frequency but high core-count nodes, and there are some challenges associated with running on this hardware efficiently. Although the project's end goal is to hybridise a pure MPI code, this talk will focus on some of the initial steps we have taken to improve serial performance and how these steps relate to C++ software design. Prior knowledge of compiled languages and custom types would be beneficial but isn't required.
August 15, 2017
HTC with a Sprinkle of HPC: Finding Gravitational Waves with LIGO
Presenter(s): Lars Koesterke (TACC)
Principal Investigator(s): Duncan Brown (Syracuse University) Josh Willis (Abilene Christian University)
XSEDE is supporting the LIGO project to detect signatures of gravitational waves in a stream of data generated by (currently) two observatories in the U.S., located in Washington State and Louisiana. I will report on an ECSS project tasked to improve the performance of one of the largest (most resource-demanding) pipelines, called pycbc (python compact binary collision). The software evolved from a slow and performance-unaware state to a high-performing pipeline capable of utilizing Xeon, Xeon Phi, and Nvidia GPU architectures alike. Achieving high performance required only a few sprinkles of HPC (High Performance Computing) on top of an HTC (High Throughput Computing) pipeline. While the HPC pieces relevant for this particular project are all well known to ECSS staff, it may be surprising what was missing in the considerations of the software developers. Hence this is more a story of how to educate users than a story of new and groundbreaking HPC concepts. Nevertheless, I am confident that my fellow ECSS staffers will find this project interesting and enlightening.
Enabling multi-events 3D simulations for earthquake hazard assessment
Presenter(s): Yifeng Cui (SDSC)
Principal Investigator(s): Morgan Moschetti (USGS)
Researchers from USGS use Stampede to perform a series of computationally intensive simulations for improved understanding of earthquake hazards. Hercules, a finite element solver developed at CMU, is used for the calculations; it combines meshing, partitioning, and solving functions in a single, self-contained code. Meshing employs a highly efficient octree-based algorithm that scales well. The simulation results are used to investigate the effects of complex geologic structure and topography on seismic wave propagation and ground-shaking hazards, and to evaluate model uncertainties in U.S. seismic hazard models. This talk will provide an overview of the current status of the seismic hazard analysis research and introduce the code's performance and the optimizations made through the ECSS project to support multi-event simulations for this study.
June 20, 2017
Visualization of simulated white dwarf collisions as a primary channel for type Ia supernovae
Presenter(s): David Bock (NCSA)
Principal Investigator(s): Doron Kushnir (Princeton)
Type Ia supernovae are an important and significant class of supernovae. While it is known that these events result from thermonuclear explosions of white dwarfs, there is currently no satisfactory scenario to achieve such explosions. Direct collisions of white dwarfs are simulated to study the possibility that the resulting explosions are the main source of type Ia supernovae. An adaptive mesh refinement grid simulates the varying levels of detail and a custom volume renderer is used to visualize density, temperature, and the resulting nickel production during the collision.
Humanities Computing With XSEDE: The Role of ECSS in Past, Present, and Future (upcoming) Projects
Presenter(s): Alan Craig (NCSA)
This symposium will address the role of ECSS in humanities related projects carried out in XSEDE. Humanities related disciplines are typically underrepresented in the XSEDE ecosystem. I will address my experiences and attempt to answer questions such as: "Where do these projects come from?", "What kinds of things are humanities scholars doing with XSEDE?", "What are some hurdles that need to be overcome for successful projects?", "How do the ECSS collaborations work?", and "How do we know if a project is successful?" in the context of several example projects.
May 16, 2017
Presenter(s): Kwai Wong (UTK)
Principal Investigator(s): Matthew DeAngelis (Georgia State University)
Motivated by the increasing tendency for computing power to assist, or even replace, human effort in the acquisition and analysis of financial information and in the execution of trading strategies, this project examines the "scriptability" of firm disclosures, or the relative ease with which a computer program can transform the large amounts of unstructured data contained in various firm disclosures into usable information. The objective of this ECSS project is to provide support to manage and run a set of computer codes examining the scriptability of a large volume of documents. The performance and the workflow procedure of the computations on will be presented.
Visual exploration and analysis of time series earthquake data
Presenter(s): Amit Chourasia (SDSC)
Principal Investigator(s): Keith Richards-Dinger (UC Riverside) James Dieterich (UC Riverside) Yifeng Cui (SDSC)
Earthquake hazard estimation requires systematic investigation of past records as well as of the fundamental processes that cause earthquakes. Robust risk estimation requires detailed long-term records of earthquakes at all scales (magnitude, space, time), which are not available. A synthetic method based on first principles could generate such records and bridge this critical gap of missing data. RSQSim is such a simulator; it generates seismic event catalogs spanning several thousand years at various scales. This synthetic catalog contains rich detail about the events and their corresponding properties.
Exploring this data is vital for validating the simulator, for identifying features of interest such as quake time histories, and for conducting analyses such as the mean recurrence interval of events on each fault section. This work describes and demonstrates a prototype web-based visual tool that enables scientists and students to explore this rich dataset. It also discusses refining and streamlining data management and analysis so that they are less error-prone and more scalable.
This work was performed in collaboration with Keith Richards-Dinger, James Dieterich, and Yifeng Cui and was supported by ECSS.
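One of the analyses named in this abstract, the mean recurrence interval of events per fault section, can be illustrated with a minimal Python sketch. The flat list of (section, time) pairs and the section labels are hypothetical stand-ins chosen for illustration; they are not the RSQSim catalog format, and this is not the tool's actual implementation.

```python
from collections import defaultdict


def mean_recurrence_intervals(events):
    """Return {section_id: mean years between consecutive events}.

    `events` is an iterable of (section_id, time_in_years) pairs. This layout
    is a hypothetical simplification, not the RSQSim catalog format.
    """
    times_by_section = defaultdict(list)
    for section_id, time_years in events:
        times_by_section[section_id].append(time_years)

    result = {}
    for section_id, times in times_by_section.items():
        if len(times) < 2:
            continue  # a single event defines no interval
        times.sort()
        # The mean of consecutive gaps equals (last - first) / (n - 1).
        result[section_id] = (times[-1] - times[0]) / (len(times) - 1)
    return result


# Made-up catalog: section "C" has one event and is therefore omitted.
catalog = [("A", 10.0), ("B", 40.0), ("A", 130.0),
           ("A", 250.0), ("B", 540.0), ("C", 5.0)]
intervals = mean_recurrence_intervals(catalog)
print(intervals)
```

Sections with fewer than two events are left out rather than reported as zero or infinity, since no interval can be measured for them.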
April 18, 2017
Securing Access to Science Gateways with CILogon and Role-based Access Control
Presenter(s): Marcus Christie (IU)
CILogon is a service that allows users to securely access cyberinfrastructure resources by authenticating with their home institutions. Users benefit by not needing to learn a new username and password, and science gateway administrators benefit by not needing to securely manage user passwords.
Apache Airavata is a software framework for building science gateways. Apache Airavata provides abstractions for describing compute and storage resources and the applications that can run on them. Through a web interface, users can launch and monitor applications running on a local cluster, the commercial cloud, or national cyberinfrastructure.
The Apache Airavata project recently integrated support for CILogon into its web portal. This talk will discuss that integration along with the role-based access control authorization system developed for Airavata. Together, CILogon and role-based access control significantly ease the burden on users and administrators of securing access to science gateways.
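The general idea of role-based access control mentioned in this abstract can be shown with a minimal Python sketch: permissions are attached to roles, and a user is authorized through the roles they hold. The role names, permission names, and the `is_allowed` helper are all hypothetical and chosen for illustration; this is not Airavata's API or its actual authorization model.

```python
# Hypothetical roles and permissions, for illustration only.
ROLE_PERMISSIONS = {
    "admin": {"launch_experiment", "view_experiment", "manage_users"},
    "user": {"launch_experiment", "view_experiment"},
    "read-only": {"view_experiment"},
}


def is_allowed(user_roles, permission):
    """Return True if any of the user's roles grants the permission.

    Unknown roles grant nothing, so access is denied by default.
    """
    return any(
        permission in ROLE_PERMISSIONS.get(role, set())
        for role in user_roles
    )


print(is_allowed(["user"], "launch_experiment"))       # a "user" may launch
print(is_allowed(["read-only"], "launch_experiment"))  # "read-only" may not
```

Because permissions are checked against roles rather than individual accounts, an administrator changes what a whole group may do by editing one role entry.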
Statistical Analysis for Partially-Observed Markov Processes with Marked Point Process Observation
Presenter(s): Mitchel Horton (NICS) Junqi Yin (NICS)
Principal Investigator(s): Professor Yong Zeng (University of Missouri at Kansas City)
Volatility is influential in investment, monetary policy making, risk management and security valuation, and is regarded as one of the most important financial market indicators. Recently, a general partially-observed framework of Markov processes with Marked Point Process (MPP) observations has been proposed for streaming financial ultra-high frequency (UHF) data.
For this project, particle Markov Chain Monte Carlo (PMCMC) is applied to parameter estimation for two models: Geometric Brownian Motion (GBM) and Heston Stochastic Volatility (HSV). Both models operate under 1/8 and 1/100 tick mark rules.
This method combines particle filtering with Markov Chain Monte Carlo (MCMC) to achieve sequential parameter learning in a Bayesian way. MCMC is used to propose new values for model parameters; particle filtering is used to detect values of marginal likelihood in the state-space model based on the proposed parameters.
The CUDA codes to compute the Bayes factors for model comparison and selection between GBM and HSV are done and in the simulation testing stage. With the time remaining for this project, new features will be added to HSV (another, even more highly parallelizable particle filtering method, namely sequential Monte Carlo, will be used to solve the same filtering equations, which are stochastic PDEs).
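The combination described in this abstract, with MCMC proposing parameter values and a particle filter estimating the marginal likelihood for each proposal, can be sketched in plain Python. The sketch is a particle marginal Metropolis-Hastings loop under loud assumptions: it uses a toy linear Gaussian state-space model (not GBM or HSV, and without marked point process observations), a uniform prior on (-1, 1), and hypothetical function names. It is not the project's CUDA code.

```python
import math
import random


def log_normal_pdf(x, mean, std):
    """Log density of a normal distribution."""
    return (-0.5 * math.log(2.0 * math.pi * std * std)
            - (x - mean) ** 2 / (2.0 * std * std))


def simulate(phi, n_steps, rng):
    """Toy model (an assumption): x_t = phi*x_{t-1} + N(0,1), y_t = x_t + N(0,1)."""
    x, ys = 0.0, []
    for _ in range(n_steps):
        x = phi * x + rng.gauss(0.0, 1.0)
        ys.append(x + rng.gauss(0.0, 1.0))
    return ys


def particle_log_likelihood(phi, ys, n_particles, rng):
    """Bootstrap particle filter estimate of log p(ys | phi)."""
    particles = [0.0] * n_particles
    total = 0.0
    for y in ys:
        # Propagate every particle through the state equation.
        particles = [phi * p + rng.gauss(0.0, 1.0) for p in particles]
        # Weight by the observation density, shifted by the max for stability.
        log_w = [log_normal_pdf(y, p, 1.0) for p in particles]
        max_lw = max(log_w)
        weights = [math.exp(lw - max_lw) for lw in log_w]
        total += max_lw + math.log(sum(weights) / n_particles)
        # Multinomial resampling in proportion to the weights.
        particles = rng.choices(particles, weights=weights, k=n_particles)
    return total


def pmmh(ys, n_iters, n_particles, step, rng):
    """Random-walk Metropolis-Hastings on phi with a particle-filter likelihood."""
    phi = 0.0
    log_like = particle_log_likelihood(phi, ys, n_particles, rng)
    chain = []
    for _ in range(n_iters):
        proposal = phi + rng.gauss(0.0, step)
        if abs(proposal) < 1.0:  # outside the uniform prior: reject outright
            prop_ll = particle_log_likelihood(proposal, ys, n_particles, rng)
            accept_prob = math.exp(min(0.0, prop_ll - log_like))
            if rng.random() < accept_prob:
                phi, log_like = proposal, prop_ll
        chain.append(phi)
    return chain


rng = random.Random(0)
observations = simulate(phi=0.8, n_steps=50, rng=rng)
chain = pmmh(observations, n_iters=100, n_particles=100, step=0.1, rng=rng)
print("mean of second half of chain:", sum(chain[50:]) / len(chain[50:]))
```

The proposal is symmetric and the prior is flat, so the acceptance ratio reduces to the ratio of estimated likelihoods. The estimate for the current state is stored and reused rather than recomputed, which is what keeps the chain targeting the correct posterior despite the noisy likelihood.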
March 21, 2017
The Paleoscape Project for Studies of Modern Human Origins
Presenter(s): David O'Neal (PSC)
Principal Investigator(s): Curtis Marean (Arizona State)
There is widespread consensus in human origins research (paleoanthropology) that the modern human lineage evolved in Africa and all modern humans are descended from that population. The archaeological record for the behavior of this crucial phase is richest in the southern African sub-region and particularly rich in the Cape. It has been hypothesized that the Cape, due to its uniquely rich coastal and terrestrial food resources, may have been the refuge region for the progenitor lineage of all modern humans during harsh global glacial phases.
During this phase of human origins, the economy was based entirely on hunting and gathering, and hunter-gatherer adaptations are tied to the way that climate and environment shape the food and technological resource base. For this reason human origins research recognizes the evolutionary significance of paleoclimate and paleoenvironment, and has a long tradition of engaging with climate and environmental scientists in an effort to understand if and how bio-behavioral evolution in the hominin line responded to climate change.
This XSEDE ECSS project implements the following workflow:
1) run a South African regional climate model to hindcast the climate parameters needed to project vegetation and other resources into the past, 2) run vegetation projections from these climate projections, and 3) run multiple agent-based simulations of the foragers on these ancient paleoscapes. This unique endeavor is made possible by an unprecedented collaboration of scientists from several countries and many disciplines.