ECSS staff share technical solutions to scientific computing challenges monthly in this open forum.
The ECSS Symposium allows the more than 70 ECSS staff members to exchange information each month about successful techniques used to address challenging science problems. Tutorials on new technologies may be featured. Two 30-minute, technically focused talks are presented each month, each followed by a brief question-and-answer period. This series is open to everyone.
Day and Time: Third Tuesdays @ 1 pm Eastern / 12 pm Central / 10 am Pacific
Add this event to your calendar.
Webinar (PC, Mac, Linux, iOS, Android): Launch Zoom webinar
iPhone one-tap (US Toll): +16468769923,,114343187# (or) +16699006833,,114343187#
Telephone (US Toll): for higher quality, dial a number based on your current location:
US: +1 646 876 9923 (or) +1 669 900 6833 (or) +1 408 638 0968
Meeting ID: 114 343 187
Upcoming events are also posted to the Training category of XSEDE News.
Due to the large number of attendees, only the presenters and host broadcast audio. Attendees may submit chat questions to the presenters through a moderator.
September 19, 2013
Introducing the XSEDE Workflow Community Applications Team
Presenter: Marlon Pierce, (IU)
Many computational problems have sophisticated execution patterns, or scientific workflows, that build on top of the basic scheduling and queuing structures offered by XSEDE service providers. Examples include techniques for large-scale parameter space exploration and the coupling of several independently developed applications into new, composite computational experiments. Such executions may also span multiple resources, using the optimal resource for each application in the dependency chain. Managing these complex executions is only part of the problem: scientists must be able to capture the metadata associated with a particular set of runs and share their workflows within their teams and with collaborators. Many academic groups have devoted significant research effort to building workflow software that addresses one or more of these problems. The goal of the newly constituted XSEDE Workflow Community Applications Team is to assist scientists who face such workflow problems in applying workflow software to their research on XSEDE. The long-term goal of the XSEDE workflow team is to help XSEDE provide, in partnership with workflow software development teams, a well-documented, reliable environment for the execution of scientific workflows.
August 20, 2013
Engineering Breakthroughs at NCSA (XSEDE, Blue Waters, Industry)
Presenter: Seid Koric, (NCSA)
Examples of some of my recent academic and industrial HPC work and collaborations are provided. Application examples range from manufacturing and multiphysics material-processing simulations to massively parallel linear solvers. The parallel scalability of engineering codes on iForge (NCSA's dedicated industrial HPC resource) and Blue Waters (the NSF-funded sustained-petascale system) is also discussed.
Identification of Mechanism Based Inhibitors of Oncogene Pathways Using High-Performance Docking
Presenter: Bhanu Rekepalli, (NICS)
Principal Investigator: Yuri Peterson (MUSC)
Modern high-throughput drug discovery is a complex and costly endeavor. Since small-molecule chemical space currently numbers in the millions of compounds and is growing rapidly, performing in vitro and cell-based screening is complex and cost-prohibitive even for small subsets of molecules without significant investment in expertise and infrastructure. Computational approaches are therefore increasingly incorporated into drug discovery. Because of limited computational resources, most computational drug-discovery efforts either use rigid conformer libraries to bypass the need for more computationally intensive flexible docking, or perform flexible docking on small subsets (tens of thousands of molecules). High-performance computing resources such as Kraken give biomedical researchers access to unprecedented amounts of computing power and open the possibility of exploring in hours an immense chemical space that would previously have taken years on local clusters. Our working hypothesis is that High-Performance Docking (HP-D) allows probing of vast amounts of chemical space that is impractical by any other means. To test this hypothesis we instrumented the docking program DOCK6 and compared performance between a small academic parallel cluster and Kraken. We show a dramatic and scalable increase in performance that allows the exploration of vast amounts of chemical space to identify compounds that dock to proteins in the ovarian cancer pathway.
June 18, 2013
Atomistic Characterization of Stable and Metastable Alumina Surfaces
Presenter: Sudhakar Pamidighantam, (NCSA)
Principal Investigator: Douglas Spearot, Co-PI: Shawn Coleman (University of Arkansas)
This presentation describes work in progress on the extended collaborative support requested to assist the PIs with (1) improving the parallel scalability of the virtual diffraction algorithm implemented as a "compute" in LAMMPS, (2) creating a workflow that automates the transfer of atomistic simulation results from TACC Stampede to SDSC Gordon in order to perform the virtual diffraction analysis, and (3) visualization. Before the virtual diffraction algorithm is made publicly available and incorporated into the main distribution of LAMMPS, the performance of the code must scale to larger atomic models. Currently the algorithm is memory-bound, likely due to the simple MPI parallelization used. Strategies for and implementation of scaling methods in the compute will be described. Visualization using the VisIt system with various configuration protocols is being implemented. Integration into the GridChem science gateway and user interactions will be discussed, along with a plan for the workflow implementation.
April 16, 2013
Data Analysis on Massive Online Game Logs
Presenter: Dora Cai (NCSA)
Principal Investigator: Marshall Scott Poole (UIUC)
This presentation will discuss the work with Professor Marshall Scott Poole entitled "Some Assembly Required: Using High Performance Computing to Understand the Emergence of Teams and Ecosystems of Teams". Massively Multiplayer Online Games (MMOGs) provide unique opportunities to investigate large social networks. This project is a multidisciplinary social sciences project dedicated to the study of communication-related behaviors using data from MMOGs. A twenty-person team of scholars from four universities is engaged in the study, which has performed systematic work in many research areas, such as social network analysis, gamer behavior, and virtual world simulation. The Gordon supercomputer has provided substantial support for this project. This talk will provide an overview of the project, describe my involvement as an ECSS consultant, and present recent progress on developing a tool to visualize the social networks in MMOGs.
Development of Novel Quantum Chemical Molecular Dynamics for Materials Science Modeling
Presenter: Jacek Jakowski (NICS)
Co-Principal Investigators: Jacek Jakowski (NICS), Sophya Garashchuk, (U. of South Carolina), Steve Stuart (Clemson University), Predrag Krstic (University of Tennessee& ORNL), Stephan Irle (Nagoya University)
I will present my work on the project titled "Modeling of nano-scale carbon and metalized carbon materials for the EPSCoR Desktop to TeraGrid EcoSystems project". This project contains several subprojects that focus on the development and application of various molecular dynamics approaches to materials science problems. I will particularly discuss the development and parallelization of Bohmian dynamics for modeling quantum nuclear effects of selected nuclei. Implementation and scaling on Kraken at NICS will be presented, along with illustrative science problems.
March 19, 2013
Visualization of Volcanic Eruption Simulations (CFDLib)
Presenter: Amit Chourasia (SDSC)
Principal Investigator: Darcy Ogden (SIO, UCSD)
Eruptive conduits feeding volcanic jets and plumes are connected to the atmosphere through volcanic vents that, depending on their size and 3D shape, can alter the dynamics and structure of these eruptions. The host rock comprising the vent, in turn, can collapse, fracture, and erode in response to the eruptive flow field. This project uses visualization to illustrate and analyze results from fully coupled numerical simulations of high-speed, multiphase volcanic mixtures erupting through erodible, visco-plastic host rocks. This work explores the influence of different host rock rheologies and eruptive conditions on the development of simulated volcanic jets. The visualizations show the dependence of lithic segregation in the plume on eruption pressure.
Using Hybrid MPI+OpenMP Approach to Improve the Scalability of a Phase-Field-Crystal Code
Presenter: Reuben Budiardja (NICS)
Principal Investigator: Katsuyo Thornton (University of Michigan)
The Phase-Field-Crystal (PFC) model is a recent development in the computational modeling of nanostructured materials that addresses the challenge of understanding complex processes in nanostructure growth and self-assembly. A PFC-based code requires good scalability and time-to-solution to perform calculations at sufficient resolution on the dynamics of metals. In this talk we will describe the work of improving the scalability of a PFC code. At the heart of the code is the solution of multiple indefinite Helmholtz equations. We will discuss the hybrid OpenMP + MPI approach used to improve the time-to-solution by exploiting the different levels of parallelism that exist in the code.
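The two-level parallelism behind a hybrid MPI+OpenMP design can be sketched in plain Python (this is an illustration of the general decomposition idea, not the PI's code): the grid is first split into contiguous slabs, one per MPI rank, and each rank's slab is then subdivided among its OpenMP threads.

```python
def chunk(n, parts, index):
    """Half-open [start, stop) range of the index-th of `parts`
    near-equal contiguous chunks of n items."""
    base, extra = divmod(n, parts)
    start = index * base + min(index, extra)
    stop = start + base + (1 if index < extra else 0)
    return start, stop

def hybrid_ranges(n_points, n_ranks, n_threads):
    """Map every (rank, thread) pair to its slice of a 1D grid:
    ranks get slabs, threads subdivide their rank's slab."""
    ranges = {}
    for rank in range(n_ranks):
        r0, r1 = chunk(n_points, n_ranks, rank)
        for t in range(n_threads):
            t0, t1 = chunk(r1 - r0, n_threads, t)
            ranges[(rank, t)] = (r0 + t0, r0 + t1)
    return ranges

# 4 MPI ranks x 8 OpenMP threads over a 1000-point grid:
# every grid point is owned by exactly one (rank, thread) pair.
ranges = hybrid_ranges(1000, 4, 8)
```

The practical payoff of this layout is fewer MPI ranks for the same core count, which shrinks the communication volume of the distributed solves while threads share their rank's memory.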
February 19, 2013
I/O Analysis for the Community Multiscale Air Quality (CMAQ) Simulation
Presenter: Kwai Wong (University of Tennessee)
Principal Investigator: Joshua Fu (University of Tennessee)
The Community Multiscale Air Quality (CMAQ) Model is commonly used by many researchers to simulate ozone, particulate matter (PM), toxics, visibility, and acidic and nutrient pollutants throughout the troposphere. The scale of the model ranges from urban (a few kilometers) to regional (hundreds of kilometers) to inter-continental (thousands of kilometers) transport. Depending on the time and length scales of a simulation, the amount of I/O can affect the overall performance of the simulation. In this presentation, we will examine the steps and results of the I/O procedures used in the code.
Improving the performance and efficiency of an inverse stiffness mapping problem
Presenter: Carlos Rosales (TACC)
Principal Investigator: Lorraine Olson (Rose-Hulman Institute of Technology)
In this talk we will discuss improvements to a legacy Fortran code used by the PI to investigate early-stage breast cancer detection using an inverse stiffness mapping approach. The talk will describe the basic methodology, the original naive attempts at improving its performance, and the computational technique used to replace the original solver with a more effective MUMPS and BLAS combination.
January 15, 2013
Graph Analytics: An XSEDE Introduction
Presenter: Nick Nystrom (PSC)
Novel and Innovative Projects
Effectively analyzing "big data" is increasingly essential to problems of scientific and societal importance. That analysis can take on various forms, depending on the nature of both the data and the intended results. In particular, researchers often seek to discover relationships in complex networks of data that can be represented as sets of nodes and edges, i.e. graphs. Graphs can represent unstructured data in a very general and extensible way, with nodes and edges representing concepts and relationships that need not be defined in an a priori schema. Complex networks readily expressed as graphs include, for example, social networks, protein interaction and gene expression networks, epidemiology, security, citation and authorship networks, supply chains, and datasets for machine learning. The mathematical analysis of graphs is a well-developed field, and standards, software, and even purpose-built computers now support multiple approaches to computational graph analytics. In many cases the graphs of interest are not partitionable; that is, they cannot be divided into subgraphs that can be tackled separately and efficiently. This makes graph analytics a particularly challenging aspect of dealing with big data.
This XSEDE Symposium will cover the following topics:
- an introduction to graph analytics, to establish motivation and terminology,
- several important W3C standards and community software technologies, e.g. RDF, SPARQL, and GraphLab, that offer powerful capability for working with large graphs, and
- XSEDE and other NSF-funded computational resources, namely Blacklight and Sherlock, that are well-suited to large-scale graph analytics.
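The graph/triple model that RDF and SPARQL formalize can be illustrated with a toy pure-Python sketch (the data and the `match` helper below are invented for illustration; real work would use standard RDF tooling): facts are stored as (subject, predicate, object) triples and queried by pattern matching, where strings beginning with "?" act as variables.

```python
# Invented toy dataset: a tiny authorship/citation graph as triples.
triples = {
    ("alice",  "authored", "paper1"),
    ("bob",    "authored", "paper1"),
    ("bob",    "authored", "paper2"),
    ("paper2", "cites",    "paper1"),
}

def match(pattern, triples):
    """Yield variable bindings for one triple pattern. Components
    starting with '?' bind to anything; others must match exactly."""
    for t in triples:
        binding = {}
        for p, v in zip(pattern, t):
            if p.startswith("?"):
                binding[p] = v
            elif p != v:
                break
        else:
            yield binding

# Roughly analogous to: SELECT ?who WHERE { ?who :authored :paper1 }
authors = {b["?who"] for b in match(("?who", "authored", "paper1"), triples)}
# -> {"alice", "bob"}
```

The schema-free quality mentioned above is visible here: adding a new kind of relationship is just another triple with a new predicate, with no table or schema change required.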
(Videos not listed above, from prior years, can be found here)
December 18, 2012
Realizing the Universe on XSEDE: Simulation Support for the Dark Energy Survey
Presenter: Raminder Singh (IU)
Principal Investigator: August Evrard (University of Michigan)
The Dark Energy Survey Simulation Working Group is using XSEDE resources to produce multiple synthetic sky surveys of galaxies and large-scale structure in support of science analysis for the Dark Energy Survey. In order to scale up our production to the level of fifty 10-billion-particle simulations, we are collaborating with the XSEDE ECSS group to embed production control using the Apache Airavata Science Gateway Framework. I will talk about the current integration efforts and future plans to integrate post-processing steps. Based on our set of production runs using Airavata, we find the workflow has reduced production time by more than 40% compared to manual management.
Computational Explorations into Population-Level Disease Spread Using Agent-Based Modeling
Presenter: David O'Neal (PSC)
Principal Investigator: Shawn Brown (University of Pittsburgh)
FRED (Framework for Reconstructing Epidemic Dynamics) is an open source agent-based modeling system that uses detailed synthetic populations derived from census data to capture geographic and demographic distributions. The original serial implementation required approximately 96 hours and 540 GB of memory to complete a 100-day US simulation. The current version completes the same analysis in 3-4 hours using less than 200 GB of memory.
Following brief overviews of the MIDAS project and FRED model, details of the ECSS collaboration that yielded these results will be presented.
September 18, 2012
DNS of Spatially Developing Turbulent Boundary Layers
Presenters: David Bock and Darren Adams (NCSA)
Principal Investigator: Antonino Ferrante (University of Washington)
Darren Adams and David Bock of NCSA will discuss their collaborative work for the ECSS project, "Direct Numerical Simulation of Spatially Developing Turbulent Boundary Layers" for PI Antonino Ferrante of the University of Washington. Specifically, Darren will outline his software development efforts providing a high-level HDF5 parallel IO library to assist in data management while David will discuss the processes involved in visualizing the data and demonstrate results with a variety of images and animations.
Efficient Implementation of Novel MD Simulation Methods in Optimized MD Codes
Presenters: Lonnie Crosby (NICS) and Phil Blood (PSC)
Principal Investigator: Greg Voth (University of Chicago)
Conducting molecular dynamics (MD) simulations involving chemical reactions in large-scale condensed phase systems (liquids, proteins, fuel cells, etc.) is a computationally prohibitive task. Many force fields utilized in these simulations are not capable of simulating the breaking and forming of chemical bonds necessary to describe chemical reactions. Additionally, chemical processes occur over a wide range of length scales and are coupled to slow (long-time-scale) system motions, which makes adequate sampling a challenge. Ab initio methodologies are capable of describing these changing chemical bonds; however, these methods are computationally expensive, which makes adequate sampling even more prohibitive. Multistate methodologies, such as the multistate empirical valence bond (MS-EVB) method, which are based on effective force fields, are more computationally efficient and enable the simulation of chemical reactions over the time and length scales necessary to properly converge statistical properties.
The typical parallel scaling bottleneck in both reactive and nonreactive all-atom MD simulations is the accurate treatment of long-range electrostatic interactions. Currently, Ewald-type algorithms rely on three-dimensional Fast Fourier Transform (3D-FFT) calculations. The parallel scaling of these 3D-FFT calculations can be severely degraded at higher processor counts due to necessary MPI all-to-all communication. This poses an even bigger problem in MS-EVB calculations, since the electrostatics, and hence the 3D-FFT, must be evaluated many times during a single time step.
Due to the limited scaling of the 3D-FFT in MD simulations, the traditional single-program-multiple-data (SPMD) parallelism model is only able to utilize several hundred CPU cores, even for very large systems. However, with a proper implementation of a multi-program (MP) model, large systems can scale to thousands of CPU cores. This talk will discuss recent efforts in collaboration with XSEDE advanced support to implement the MS-EVB model in the scalable LAMMPS MD code, and to further improve parallel scaling by implementing MP parallelization algorithms in LAMMPS. These algorithms improve parallel scaling in both the standard LAMMPS code and LAMMPS with MS-EVB, thus facilitating the efficient simulation of large-scale condensed phase systems, which include the ability to model chemical reactions.
June 19, 2012
Prediction and Control of Compressible Turbulent Flows
Presenter: Robert McLay (TACC)
Principal Investigator: Daniel Bodony (University of Illinois at Urbana Champaign)
In this talk, I will review the lessons learned (or re-learned) in measuring and improving the performance of the PI's code. The talk has two parts. The first part focuses on measuring the code with TAU and on what the analysis showed; it covers the overuse of MPI_Barrier and improvements to the communication pattern. The second part focuses on replacing the way solution files were generated. The code used a separate file per task per unknown. A test program was created to show how to write a single file, containing all the unknowns, using HDF5 in parallel. Even with a relatively small local size of 50K points per task, the test program was able to write 1.5 GB/sec on Lonestar. This part also covers how to dynamically size the number of writers and the number of stripes to maximize I/O performance.
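The shift from one file per task per unknown to a single shared file can be sketched with a stdlib-only Python illustration (the talk's actual solution used parallel HDF5; the file layout, names, and sizes below are invented for illustration): each task writes its block at a precomputed byte offset, so all tasks target one file instead of thousands.

```python
import os
import struct
import tempfile

POINTS_PER_TASK = 4           # stand-in for the ~50K points per task
RECORD = struct.Struct("<d")  # one little-endian double per point

def write_block(path, task_id, values):
    """Write one task's values at its fixed offset in the shared file."""
    offset = task_id * POINTS_PER_TASK * RECORD.size
    with open(path, "r+b") as f:
        f.seek(offset)
        for v in values:
            f.write(RECORD.pack(v))

path = os.path.join(tempfile.mkdtemp(), "solution.bin")
n_tasks = 3

# Pre-size the file once, then let each "task" (serial here, MPI ranks
# in the real code) fill its own non-overlapping slice.
with open(path, "wb") as f:
    f.truncate(n_tasks * POINTS_PER_TASK * RECORD.size)
for task in range(n_tasks):
    write_block(path, task, [task + 0.1 * i for i in range(POINTS_PER_TASK)])
```

Because each writer owns a disjoint byte range, no coordination beyond the offset arithmetic is needed; parallel HDF5 layers a self-describing format and collective I/O on top of the same idea.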
Presenter: Dmitry Pekurovsky (SDSC)
Principal Investigator: P.K.Yeung (Georgia Tech)
Fourier and related transforms are widely used in the scientific community. Three-dimensional Fast Fourier Transforms (3D FFT), for example, are used in many areas such as DNS turbulence, astrophysics, materials science, chemistry, oceanography, and X-ray crystallography. In many cases this is a very compute-intensive operation. Lately there has been a need for implementations of 3D FFT and related algorithms that scale well on petascale parallel machines. Most existing implementations of 3D FFT use a one-dimensional task decomposition and are therefore subject to a scaling limitation once the number of cores reaches the domain size. The P3DFFT library overcomes this limitation. It is an open-source, easy-to-use software package providing a general solution for 3D FFT based on a two-dimensional decomposition. In this way it differs from the majority of other libraries such as FFTW, PESSL, MKL, and ACML. In this talk we will discuss what P3DFFT does, its evolution, and its scaling on sub-petascale platforms.
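The scaling limits of the two decompositions can be made concrete with a small sketch (an illustration of the general idea, not P3DFFT's internals): for an N-cubed grid, a 1D ("slab") decomposition can keep at most N processes busy, while a 2D ("pencil") decomposition over a prow-by-pcol process grid can use up to N-squared.

```python
def slab_owner(k, n, p):
    """1D decomposition: which of p processes owns plane k of an
    n^3 grid (each process gets a contiguous slab of planes)."""
    return k * p // n

def pencil_owner(j, k, n, prow, pcol):
    """2D decomposition over a prow x pcol process grid: each process
    owns a pencil, full along x and blocked along y (j) and z (k)."""
    return (j * prow // n) * pcol + (k * pcol // n)

n = 8
# Slabs: only n = 8 processes can ever have work on an 8^3 grid.
owners_1d = {slab_owner(k, n, 8) for k in range(n)}
# Pencils: up to n*n = 64 processes (here an 8x8 process grid).
owners_2d = {pencil_owner(j, k, n, 8, 8) for j in range(n) for k in range(n)}
```

With slabs, adding a 9th process to the 8-cubed grid leaves it idle; pencils push that ceiling from N to N-squared, which is exactly the headroom needed on petascale core counts.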