ECSS staff share technical solutions to scientific computing challenges monthly in this open forum.
The ECSS Symposium allows the more than 70 ECSS staff members to exchange information each month about successful techniques used to address challenging science problems. Tutorials on new technologies may be featured. Two 30-minute, technically focused talks are presented each month, each including a brief question-and-answer period. This series is open to everyone.
Day and Time: Third Tuesdays @ 1 pm Eastern / 12 pm Central / 10 am Pacific
Add this event to your calendar.
Webinar (PC, Mac, Linux, iOS, Android): Launch Zoom webinar
iPhone one-tap (US Toll): +16468769923,,114343187# (or) +16699006833,,114343187#
Telephone (US Toll): Dial (for higher quality, dial a number based on your current location):
US: +1 646 876 9923 (or) +1 669 900 6833 (or) +1 408 638 0968
Meeting ID: 114 343 187
Upcoming events are also posted to the Training category of XSEDE News.
Due to the large number of attendees, only the presenters and host broadcast audio. Attendees may submit chat questions to the presenters through a moderator.
April 16, 2013
Data Analysis on Massive Online Game Logs
Presenter: Dora Cai (NCSA)
Principal Investigator: Marshall Scott Poole (UIUC)
This presentation covers my work with Professor Marshall Scott Poole on the project "Some Assembly Required: Using High Performance Computing to Understand the Emergence of Teams and Ecosystems of Teams". Massively Multiplayer Online Games (MMOGs) provide unique opportunities to investigate large social networks. This multidisciplinary social sciences project is dedicated to the study of communication-related behaviors using data from MMOGs. A twenty-person team of scholars from four universities is engaged in the study. The project has performed systematic studies in many research areas, such as social network analysis, gamer behavior studies, and virtual world simulation. The Gordon supercomputer has provided great support for this project. This talk will provide an overview of the project, describe my involvement as an ECSS consultant, and present recent progress on developing a tool to visualize the social networks in MMOGs.
Development of Novel Quantum Chemical Molecular Dynamics for Materials Science Modeling
Presenter: Jacek Jakowski (NICS)
Co-Principal Investigators: Jacek Jakowski (NICS), Sophya Garashchuk (U. of South Carolina), Steve Stuart (Clemson University), Predrag Krstic (University of Tennessee & ORNL), Stephan Irle (Nagoya University)
I will present my work on the project titled "Modeling of nano-scale carbon and metalized carbon materials for the 'EPSCoR Desktop to TeraGrid EcoSystems project'", with co-PIs Sophya Garashchuk (U. of South Carolina), Steve Stuart (Clemson University), Predrag Krstic (University of Tennessee & ORNL), and Stephan Irle (Nagoya University). This project contains several subprojects that focus on the development and application of various molecular dynamics approaches to materials science problems. I will discuss in particular the development and parallelization of Bohmian dynamics for modeling quantum nuclear effects of selected nuclei. The implementation, its scaling on Kraken at NICS, and illustrative science problems will be presented.
March 19, 2013
Visualization of Volcanic Eruption Simulations (CFDLib)
Presenter: Amit Chourasia (SDSC)
Principal Investigator: Darcy Ogden (SIO, UCSD)
Eruptive conduits feeding volcanic jets and plumes are connected to the atmosphere through volcanic vents that, depending on their size and 3D shape, can alter the dynamics and structure of these eruptions. The host rock comprising the vent, in turn, can collapse, fracture, and erode in response to the eruptive flow field. This project uses visualization to illustrate and analyze results from fully coupled numerical simulations of high-speed, multiphase volcanic mixtures erupting through erodible, visco-plastic host rocks. This work explores the influence of different host rock rheologies and eruptive conditions on the development of simulated volcanic jets. The visualizations show the dependence of lithic segregation in the plume on eruption pressure.
Using Hybrid MPI+OpenMP Approach to Improve the Scalability of a Phase-Field-Crystal Code
Presenter: Reuben Budiardja (NICS)
Principal Investigator: Katsuyo Thornton (University of Michigan)
The Phase-Field-Crystal (PFC) model is a recent development in the computational modeling of nanostructured materials that addresses the challenges of understanding complex processes in nanostructure growth and self-assembly. A PFC-based code requires good scalability and time-to-solution to perform calculations of the dynamics of metals at sufficient resolution. In this talk we will describe our work on improving the scalability of a PFC code. At the heart of the code is the solution of multiple indefinite Helmholtz equations. We will discuss the hybrid OpenMP + MPI approach used to improve the time-to-solution by exploiting the different kinds of parallelism that exist in the code.
February 19, 2013
I/O Analysis for the Community Multiscale Air Quality (CMAQ) Simulation
Presenter: Kwai Wong (University of Tennessee)
Principal Investigator: Joshua Fu (University of Tennessee)
The Community Multiscale Air Quality (CMAQ) Model is commonly used by many researchers to simulate ozone, particulate matter (PM), toxics, visibility, and acidic and nutrient pollutants throughout the troposphere. The scale of the model ranges from urban (a few kilometers) to regional (hundreds of kilometers) to inter-continental (thousands of kilometers) transport. Depending on the time and length scales of a simulation, the amount of I/O will affect the overall performance of the simulation. In this presentation, we will examine the steps and results of the I/O procedures used in the code.
Improving the performance and efficiency of an inverse stiffness mapping problem
Presenter: Carlos Rosales (TACC)
Principal Investigator: Lorraine Olson (Rose-Hulman Institute of Technology)
In this talk we will discuss improvements to a legacy Fortran code used by the PI to investigate early-stage breast cancer detection using an inverse stiffness mapping approach. The talk will describe the basic methodology, the original naive attempts at improving its performance, and the computational trick used to replace the original solver with a more effective MUMPS and BLAS combination.
January 15, 2013
Graph Analytics: An XSEDE Introduction
Presenter: Nick Nystrom (PSC)
Novel and Innovative Projects
Effectively analyzing "big data" is increasingly essential to problems of scientific and societal importance. That analysis can take on various forms, depending on the nature of both the data and the intended results. In particular, researchers often seek to discover relationships in complex networks of data that can be represented as sets of nodes and edges, i.e. graphs. Graphs can represent unstructured data in a very general and extensible way, with nodes and edges representing concepts and relationships that need not be defined in an a priori schema. Complex networks that are readily expressed as graphs include, for example, social networks, protein interaction and gene expression networks, epidemiology, security, citation and authorship networks, supply chains, and datasets for machine learning. The mathematical analysis of graphs is a well-developed field, and standards, software, and even purpose-built computers now support multiple approaches to computational graph analytics. In many cases the graphs of interest are not partitionable; that is, they cannot be divided into subgraphs that can be tackled separately and efficiently. This makes graph analytics a particularly challenging aspect of dealing with big data.
This XSEDE Symposium will cover the following topics:
- an introduction to graph analytics, to establish motivation and terminology,
- several important W3C standards and community software technologies, e.g. RDF, SPARQL, and GraphLab, that offer powerful capability for working with large graphs, and
- XSEDE and other NSF-funded computational resources, namely Blacklight and Sherlock, that are well-suited to large-scale graph analytics.
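As a small illustration of the nodes-and-edges idea described above, the following Python sketch stores a handful of made-up (subject, relation, object) triples, loosely in the spirit of RDF, and finds the connected components with a breadth-first search. The data, the relation names, and the function names are all hypothetical; this is not code from the talk and it does not use RDF, SPARQL, or GraphLab.

```python
from collections import defaultdict, deque

# Hypothetical toy data: (subject, relation, object) triples.
TRIPLES = [
    ("alice", "coauthor_of", "bob"),
    ("bob", "coauthor_of", "carol"),
    ("carol", "cites", "dave"),
    ("erin", "coauthor_of", "frank"),
]


def build_adjacency(triples):
    """Build an undirected adjacency map; each edge keeps its relation label."""
    adjacency = defaultdict(set)
    for subject, relation, obj in triples:
        adjacency[subject].add((relation, obj))
        adjacency[obj].add((relation, subject))
    return adjacency


def reachable(adjacency, start):
    """Breadth-first search: every node connected to `start` by some path."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for _relation, neighbor in adjacency[node]:
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return seen


def connected_components(adjacency):
    """Group nodes into components; each component is a sorted list of names."""
    components = []
    visited = set()
    for node in sorted(adjacency):
        if node not in visited:
            component = reachable(adjacency, node)
            visited |= component
            components.append(sorted(component))
    return components


if __name__ == "__main__":
    for component in connected_components(build_adjacency(TRIPLES)):
        print(component)
```

No schema is declared up front: adding a new relation type only means adding another triple. The traversal follows edges wherever they lead, which hints at why graphs that do not split cleanly into independent subgraphs are hard to distribute.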
(Videos not listed above, from prior years, can be found here)
December 18, 2012
Realizing the Universe on XSEDE: Simulation Support for the Dark Energy Survey
Presenter: Raminder Singh (IU)
Principal Investigator: August Evrard (University of Michigan)
The Dark Energy Survey Simulation Working Group is using XSEDE resources to produce multiple synthetic sky surveys of galaxies and large-scale structure in support of science analysis for the Dark Energy Survey. In order to scale up our production to the level of fifty 10-billion-particle simulations, we are collaborating with the XSEDE ECSS group to embed production control using the Apache Airavata Science Gateway Framework. I will talk about the current integration efforts and future plans to integrate post-processing steps. Based on our set of production runs using Airavata, we find the workflow has reduced production time by more than 40% compared to manual management.
Computational Explorations into Population Level Disease Spread using Agent Based Modeling
Presenter: David O'Neal (PSC)
Principal Investigator: Shawn Brown (University of Pittsburgh)
FRED (Framework for Reconstructing Epidemic Dynamics) is an open source agent-based modeling system that uses detailed synthetic populations derived from census data to capture geographic and demographic distributions. The original serial implementation required approximately 96 hours and 540 GB of memory to complete a 100-day US simulation. The current version completes the same analysis in 3-4 hours using less than 200 GB of memory.
Following brief overviews of the MIDAS project and FRED model, details of the ECSS collaboration that yielded these results will be presented.
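The figures quoted above imply a rough speedup and memory reduction, which the following Python sketch works out. It assumes the quoted numbers are taken at face value ("approximately 96 hours", "3-4 hours", "less than 200 GB"), so the results are bounds rather than measurements; the variable and function names are illustrative only.

```python
def ratio(before, after):
    """How many times smaller `after` is than `before`."""
    return before / after


# Figures as quoted in the abstract (approximate).
SERIAL_HOURS = 96.0
CURRENT_HOURS_RANGE = (3.0, 4.0)
SERIAL_MEMORY_GB = 540.0
CURRENT_MEMORY_GB_UPPER = 200.0  # "less than 200 GB", so this is an upper bound

speedup_low = ratio(SERIAL_HOURS, CURRENT_HOURS_RANGE[1])   # slowest current run
speedup_high = ratio(SERIAL_HOURS, CURRENT_HOURS_RANGE[0])  # fastest current run
memory_reduction_min = ratio(SERIAL_MEMORY_GB, CURRENT_MEMORY_GB_UPPER)

if __name__ == "__main__":
    print(f"speedup: {speedup_low:.0f}x to {speedup_high:.0f}x")
    print(f"memory reduction: at least {memory_reduction_min:.1f}x")
```

Under these assumptions the runtime improved by roughly 24x to 32x and the memory footprint shrank by at least a factor of about 2.7.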
September 18, 2012
DNS of Spatially Developing Turbulent Boundary Layers
PRESENTER(S): David Bock and Darren Adams (NCSA)
Principal Investigator: Antonino Ferrante (University of Washington)
Darren Adams and David Bock of NCSA will discuss their collaborative work for the ECSS project, "Direct Numerical Simulation of Spatially Developing Turbulent Boundary Layers" for PI Antonino Ferrante of the University of Washington. Specifically, Darren will outline his software development efforts providing a high-level HDF5 parallel IO library to assist in data management while David will discuss the processes involved in visualizing the data and demonstrate results with a variety of images and animations.
Efficient Implementation of Novel MD Simulation Methods in Optimized MD Codes
PRESENTER(S): Lonnie Crosby (NICS) and Phil Blood (PSC)
Principal Investigator: Greg Voth (University of Chicago)
Conducting molecular dynamics (MD) simulations involving chemical reactions in large-scale condensed phase systems (liquids, proteins, fuel cells, etc…) is a computationally prohibitive task. Many force fields utilized in these simulations are not capable of simulating the breaking and forming of chemical bonds necessary to describe chemical reactions. Additionally, chemical processes occur over a wide range of length scales and are coupled to slow (long time scale) system motions which make adequate sampling a challenge. Ab initio methodologies are capable of describing these changing chemical bonds; however, these methods are computationally expensive which makes adequate sampling even more prohibitive. Multistate methodologies, such as the multistate empirical valence bond (MS-EVB) method, which are based on effective force fields, are more computationally efficient and enable the simulation of chemical reactions over the necessary time and length scales to properly converge statistical properties.
The typical parallel scaling bottleneck in both reactive and nonreactive all-atom MD simulations is the accurate treatment of long-range electrostatic interactions. Currently, Ewald-type algorithms rely on three-dimensional Fast Fourier Transform (3D-FFT) calculations. The parallel scaling of these 3D-FFT calculations can be severely degraded at higher processor counts due to necessary MPI all-to-all communication. This poses an even bigger problem in MS-EVB calculations, since the electrostatics, and hence the 3D-FFT, must be evaluated many times during a single time step.
Due to the limited scaling of the 3D-FFT in MD simulations, the traditional single-program-multiple-data (SPMD) parallelism model is only able to utilize several hundred CPU cores, even for very large systems. However, with a proper implementation of a multi-program (MP) model, large systems can scale to thousands of CPU cores. This talk will discuss recent efforts in collaboration with XSEDE advanced support to implement the MS-EVB model in the scalable LAMMPS MD code, and to further improve parallel scaling by implementing MP parallelization algorithms in LAMMPS. These algorithms improve parallel scaling in both the standard LAMMPS code and LAMMPS with MS-EVB, thus facilitating the efficient simulation of large-scale condensed phase systems, which include the ability to model chemical reactions.
June 19, 2012
Prediction and Control of Compressible Turbulent Flows
Presenter: Robert McLay (TACC)
Principal Investigator: Daniel Bodony (University of Illinois at Urbana Champaign)
In this talk, I will review the lessons learned (or re-learned) in measuring and improving the performance of the PI's code. The talk is in two parts. The first part will focus on measuring the code with TAU and what the analysis showed. It will cover the overuse of MPI_Barrier and improvements to the communication pattern. The second part will focus on replacing the way solution files were generated. The code used separate files per task per unknown. A test program was created to show how to write a single file using HDF5 in parallel. This single file contained all the unknowns. Even with a relatively small local size of 50K points per task, the test program was able to write 1.5 GB/sec on Lonestar. This second part will also cover how to dynamically size the number of writers and the number of stripes to maximize I/O performance.
Robert McLay can be reached via this website
Presenter: Dmitry Pekurovsky (SDSC)
Principal Investigator: P.K.Yeung (Georgia Tech)
Fourier and related types of transforms are widely used in the scientific community. Three-dimensional Fast Fourier Transforms (3D FFT), for example, are used in many areas such as DNS turbulence, astrophysics, materials science, chemistry, oceanography and X-ray crystallography. In many cases this is a very compute-intensive operation. Lately there has been a need for implementations of 3D FFT and related algorithms with good scaling on petascale parallel machines. Most existing implementations of 3D FFT use one-dimensional task decomposition, and are therefore subject to a scaling limitation when the number of cores reaches the domain size. The P3DFFT library overcomes this limitation. It is an open-source, easy-to-use software package providing a general solution for 3D FFT based on two-dimensional decomposition. In this way it differs from the majority of other libraries, such as FFTW, PESSL, MKL and ACML. In this talk we will discuss what P3DFFT does, its evolution, and its scaling on sub-petascale platforms.
Dmitry Pekurovsky can be reached via this website
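The scaling limit of one-dimensional decomposition mentioned in the P3DFFT abstract can be made concrete with a little arithmetic. The Python sketch below assumes a cubic N x N x N grid and an even division of the grid among tasks; the function names and the example grid size are hypothetical and are not part of the P3DFFT API.

```python
def max_tasks_slab(n):
    """1D (slab) decomposition: each task owns whole planes, so at most n tasks."""
    return n


def max_tasks_pencil(n):
    """2D (pencil) decomposition: each task owns whole lines, so at most n * n tasks."""
    return n * n


def pencil_shape(n, rows, cols):
    """Local block of an n^3 grid on a rows x cols task grid.

    One dimension stays complete on every task, so a 1D FFT along it needs
    no communication. This sketch assumes rows and cols divide n evenly.
    """
    if n % rows or n % cols:
        raise ValueError("this sketch assumes rows and cols divide n")
    return (n, n // rows, n // cols)


if __name__ == "__main__":
    n = 4096  # example grid size, chosen only for illustration
    print("slab limit:  ", max_tasks_slab(n))
    print("pencil limit:", max_tasks_pencil(n))
    print("local block on a 64 x 128 task grid:", pencil_shape(n, 64, 128))
```

For a 4096^3 grid, a slab decomposition cannot use more than 4096 tasks, while a pencil decomposition can in principle use up to 4096^2. The price is a second transpose (all-to-all exchange) between the 1D FFT stages.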
May 8, 2012
Visualizing Residual Based Turbulence Models for Large Eddy Simulation
Presenter: Mark W. Vanmoer (NCSA)
Principal Investigator: Arif Masud (UIUC)
This ECSS project supplies visualization support for large eddy simulation (LES) on unstructured tetrahedral meshes using residual-based turbulence models. LES aims to provide inexpensive solutions for turbulent flow problems. This residual-based approach avoids the use of filter functions and is consistent with the Navier-Stokes equations. The development of this computational method leads to testing against some standard turbulent flow problems: boxes with periodic boundaries, turbulent flows in channels, and flow around embedded objects. Visualization efforts consist of preparing animations and still shots for these various simulations.
Mark W. Vanmoer can be reached via this website
Running WIEN2K on Ranger with both Coarse and Fine Parallelism
Presenter: Hang Liu (TACC)
Principal Investigator: Luis Smith (Clark University)
Like some other DFT-based packages, WIEN2K implements both coarse-grained and fine-grained parallelism. However, it uses its own internal management to handle the task geometry for that multi-level parallel execution, which is not directly supported by the ibrun-like parallel job launching wrapper on Ranger. This ECSS project aims first to determine the optimal compilation of WIEN2K on Ranger, and then to develop a utility script that can be used in an ordinary SGE job script to generate the task geometry needed for parallel execution of WIEN2K.
Download video (.mov) file of these seminars here.
Hang Liu can be reached via this website
April 10, 2012
SCALing the knowledge base for the Never-Ending Language Learner (NELL): A step toward large-scale computing for automated learning
PRESENTER: Joel Welling (PSC)
The Never-Ending Language Learner is a system of Java programs that continuously learns new beliefs from the World Wide Web; see the Twitter feed for 'cmunell' to follow its discoveries. This is a project of Tom Mitchell and William Cohen at the CMU CS department. NELL learns by repeatedly cycling through a corpus of language statistics using information from its knowledge base. The sizes of the corpus and of the knowledge base make each cycle a time-consuming process. The ultimate goal of our interaction with this group is to bring this kind of work to high performance computing, allowing larger datasets and speeding the cycle time by a factor of hundreds.
Part of this project is just bringing up the necessary Java environment on Blacklight; I will discuss lessons learned in that process. The current ECSS project involves migrating the NELL ontology software from Tokyo Cabinet, a simple record-based database, to a graph database like Neo4J. This should provide a more natural representation for the ontology and will move the software from an obsolete base to a well-supported one. I will discuss the structure of the ontology and the representations of that structure under the new and old databases.
Joel Welling can be reached via this website
Use of Global Federated File System (GFFS)
PRESENTER: Andrew Grimshaw (University of Virginia)
The GFFS was born out of a need to access and manipulate remote resources such as file systems in a federated, secure, standardized, scalable, and transparent manner without requiring either data owners or applications developers and users to change how they store and access data in any way.
The GFFS accomplishes this by employing a global path-based namespace, e.g., /data/bio/file1. Data in existing file systems, whether they are Windows file systems, MacOS file systems, AFS, Linux, or Lustre file systems, can then be exported, or linked, into the global namespace. For example, a user could export a local rooted directory structure on their "C" drive, C:\work\collaboration-with-Bob, into the global namespace at /data/bio/project-Bob. Files and directories on the user's "C" drive in \work\collaboration-with-Bob would then, subject to access control, be accessible to users in the GFFS via the /data/bio/project-Bob path. Transparent access to data (and resources more generally) is realized by using OS-specific file system drivers (e.g. FUSE) that understand the underlying standard security, directory, and file access protocols employed by the GFFS. These file system drivers map the GFFS global namespace onto a local file system mount. Data and other resources in the GFFS can then be accessed exactly the same way local files and directories are accessed; applications cannot tell the difference.
Three examples illustrate typical GFFS use cases: accessing data at an NSF center from a home or campus, accessing data on a campus machine from an NSF center, and directly sharing data with a collaborator at another institution. In all three cases client access to data will be via the GFFS-aware FUSE driver.
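The export example above amounts to a prefix mapping from the global namespace onto a local directory tree. The Python sketch below shows only that mapping idea, using the paths from the example. The `EXPORTS` table and the `resolve` function are hypothetical; they are not part of the GFFS, its protocols, or its FUSE driver, and access control is ignored entirely.

```python
from pathlib import PurePosixPath, PureWindowsPath

# Hypothetical export table: global namespace prefix -> local rooted directory.
EXPORTS = {
    PurePosixPath("/data/bio/project-Bob"): PureWindowsPath(
        r"C:\work\collaboration-with-Bob"
    ),
}


def resolve(global_path):
    """Translate a global-namespace path to the exported local path.

    Returns None when no export covers the path.
    """
    path = PurePosixPath(global_path)
    for prefix, local_root in EXPORTS.items():
        try:
            relative = path.relative_to(prefix)
        except ValueError:
            continue  # this export does not cover the requested path
        return local_root.joinpath(*relative.parts)
    return None


if __name__ == "__main__":
    print(resolve("/data/bio/project-Bob/results/run1.dat"))
    print(resolve("/data/chem/other"))
```

A user of the global namespace sees only the POSIX-style path; the owner of the data keeps it where it already lives, which is the transparency property the paragraph describes.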
March 13, 2012
GPU-based Radiation Treatment Planning Applications
Presenter: Dong Ju (DJ) Choi
Graphics processing units (GPUs) have been actively adopted in the development of various applications for the radiation treatment planning process. Recent progress in GPU-based applications promises the ability to deliver an optimal treatment in response to daily patient anatomic variation. In this talk we will show some of the applications developed for research in the radiation treatment planning process. We will introduce the applications along with their thread parallelism schemes and computational performance improvements.
Dong Ju (DJ) Choi can be reached via this website
Presenter: Thomas Uram
GPSI (pronounced "gypsy") provides computational scientists with a general purpose workbench for developing, testing, and using complex workflows for simulation and analysis. A key aspect that differentiates GPSI from other environments that support large parallel workflows is that it's not tied to any particular science domain. GPSI provides tools that have emerged as common in many of our past science gateway efforts, doing the heavy lifting that is often required in starting a new science gateway, while facilitating customization to a particular domain or collaboration. The GPSI environment integrates support for job execution and management; data management and browsing; and application development and reuse. We are working with research groups in power grid simulation and analysis and in phylogenetics to improve productivity via GPSI.
Thomas Uram can be reached via this website