ECSS staff share technical solutions to scientific computing challenges monthly in this open forum.
The ECSS Symposium allows the more than 70 ECSS staff members to exchange information each month about successful techniques used to address challenging science problems. Tutorials on new technologies may be featured. Two 30-minute, technically focused talks are presented each month and include a brief question and answer period. This series is open to everyone.
Day and Time: Third Tuesdays @ 1 pm Eastern / 12 pm Central / 10 am Pacific
Add this event to your calendar.
Note – Symposium not held in July and November due to conflicts with PEARC and SC conferences.
Webinar (PC, Mac, Linux, iOS, Android): Launch Zoom webinar
iPhone one-tap (US Toll): +16468769923,,114343187# (or) +16699006833,,114343187#
Telephone (US Toll): Dial (for higher quality, dial a number based on your current location):
US: +1 646 876 9923 (or) +1 669 900 6833 (or) +1 408 638 0968
Meeting ID: 114 343 187
Upcoming events are also posted to the Training category of XSEDE News.
Due to the large number of attendees, only the presenters and host broadcast audio. Attendees may submit chat questions to the presenters through a moderator.
February 19, 2019
Sustaining Science Gateway Operations through SciGaP Service
Presenter(s): Suresh Marru (Science Gateways Research Center, Indiana University)
Science Gateways dramatically accelerate scientific discovery by providing crucial user- and science-centric points of entry to cyberinfrastructure resources while shielding users from the technicalities of interacting with XSEDE-like distributed infrastructure. XSEDE's Extended Collaborative Support Services (ECSS) has collaborated in making it as easy as possible for scientific communities to create such Science Gateways and in helping them integrate with XSEDE. However, it is important to sustain these collaborative efforts and assist XSEDE communities in operating these gateways. In this talk we will present ECSS project exemplars that have adopted the hosted Apache Airavata services operated by the NSF-funded Science Gateway Platform (SciGaP) project, thus decreasing the overhead for gateway operations. The talk will conclude by providing references for future ECSS projects to take advantage of the out-of-the-box Gateway platform with customizable user interfaces, or to integrate a la carte via direct programmatic access from existing community Gateway implementations.
Ansible on the Cloud: A match made in heaven
Presenter(s): Eric Coulter (Science Gateways Research Center, Indiana University)
One of the major difficulties facing researchers in getting started with national cyberinfrastructure (CI) is the pain of actually *using* it. For support staff, it is a continual struggle to effectively onboard new users and provide interfaces to compute resources. With the advent of cloud-based research CI, it has become possible to provide highly customized resources for a variety of scientific domains, while at the same time giving access to those resources through gateways. I will discuss how customized infrastructure can enable a wide range of scientific projects, from bioinformatics to real-time data gathering. I will also demonstrate how the use of Ansible makes it relatively easy to create configurable, replicable infrastructure on Jetstream's OpenStack cloud, and provide participants with a starting point for building their own customized infrastructure.
January 15, 2019
Searching through the SRA - A focus on the ECSS work
Presenter(s): Mats Rynge (USC)
The Sequence Read Archive (SRA), the world's largest database of sequences, hosts approximately 10 petabases (10^16 bp) of sequence data and is growing at the alarming rate of 10 TB per day. Yet this rich trove of data is inaccessible to most researchers: searching through the SRA requires large storage and computing facilities that are beyond the capacity of most laboratories. Enabling scientists to analyze existing sequence data will provide insight into ecology, medicine, and industrial applications. As a prototype project, we specifically focus on providing a search capability against metagenomic sequences (whole community datasets from different environments). These data represent approximately 46 TB of data in the SRA. We provided two different search algorithms that domain scientists can use to explore this data. The presentation includes details on how XSEDE ECSS helped to create a science gateway using the open community science gateway framework Apache Airavata, together with an auto-scaled processing setup that uses Jetstream and directly mounted Wrangler storage for efficient data access for the growing user community of Searching the SRA.
Hyperglyphs: Pushing the Limits of Glyph Structure to Gain Insight Into Large Datasets
Presenter(s): Jeff Sale (SDSC)
The concept of a glyph in scientific visualization is well known and has found numerous applications over the years. However, the limits to the level of complexity of glyph structure have only begun to be fully explored. At the same time, a growing percentage of the big data torrent consists of semi-structured, unstructured, and non-traditional data, presenting a challenge for conventional visualization methods. Some data are so complex it is difficult to know where to begin to gain insight into trends and anomalies hidden within. We need new and innovative ways to visually explore such massive amounts of complex data. In this symposium I will provide a brief history of glyphs in scientific visualization and describe the conditions in which their use is appropriate and beneficial. I will then make the case that conventional, simple glyphs should be extended and complexified into what I call 'hyperglyphs', highly complex visual structures designed to encapsulate much more information within a single glyph and which, when thousands are arrayed in an interactive 3D space, can significantly enhance perception and information assimilation, leading to new knowledge and insight. I will provide a wide range of examples from diverse fields including education, physiology, meteorology, public health, and social media.
December 18, 2018
Bioinformatics: Working with Campus Champion Fellows
Presenter(s): Alex Ropelewski (PSC)
Problems which require Bioinformatics skills are attractive to a wide variety of researchers, including researchers at Research Intensive Universities as well as researchers at smaller institutions. In this talk, I will highlight two projects involving the analysis of Next Generation Sequencing data that I've worked on with XSEDE Campus Champion Fellows – one involving Cancer Data and one involving Metagenomics. I will conclude the talk with advice for integrating a Campus Champion Fellow into an ECSS project.
Dream Lens: Exploration and Visualization of Large-Scale Generative Design Datasets
Presenter(s): Justin Matejka (Autodesk Research)
With traditional Computer Aided Design, users typically create a single model. In contrast, generative design allows users to specify high-level goals and constraints, and the system then automatically generates hundreds or thousands of candidate designs, all meeting the design criteria. Once a large collection of design variations is created, the designer is left with the task of finding the design, or set of designs, which best meets their requirements. This is a complicated task which could require analyzing the structural characteristics and visual aesthetics of the designs. In this talk we present Dream Lens, an interactive visual analysis tool for exploring and visualizing large-scale generative design datasets.
October 16, 2018
PolyRun - Polymer Microstructure Exploration HPC Gateway
Presenter(s): Amit Chourasia (SDSC) Christopher Thompson (Purdue)
Polymers are long-chain macromolecules with physical properties that make them appealing for a wide range of uses in structural support, organic electronics, and biomedical applications. The microscopic structure adopted by polymers plays a key role in determining their suitability for advanced applications. Computational simulation tools provide a convenient and powerful method to guide experiments to create desirable structures. In this talk we will discuss ECSS activity to support development of the PolyRun Gateway, which allows both seasoned and non-HPC users to easily perform complex computations and use simulations as an aid in designing experiments towards desired materials.
Efficient construction of limit order books for financial markets
Presenter(s): Robert Sinkovits (SDSC)
A limit order book (LOB) is a record of unexecuted orders to buy or sell a stock at a specified price. The LOB can then be used as a starting point for deeper analysis of markets, leading to a better understanding of the impact of trading behaviors, suggestions for regulations to make markets more effective or identification of manipulative practices such as quote stuffing. Construction of full-resolution LOBs is computationally demanding and, as a consequence, approximations are often employed. Unfortunately, this limits the utility of the LOBs in the era of high frequency trading. In this collaboration with Mao Ye (U. Illinois), we describe how we were able to first optimize the performance of existing full-resolution LOB construction software to achieve a 100x reduction in run time, and then refactor the software to ultimately improve time to solution by 1000-3000x.
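To make the definition above concrete, here is a minimal Python sketch of replaying order messages into a limit order book. It is not the software described in the talk: the `Message` fields, the "add"/"cancel" actions, and the sample data are all hypothetical, and real exchange feeds also carry executions, modifications, and timestamps that are ignored here.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    """One hypothetical order message; the field names are illustrative."""
    action: str      # "add" or "cancel"
    order_id: int
    side: str = ""   # "buy" or "sell" (only needed for "add")
    price: int = 0   # price in cents, to avoid floating-point keys
    shares: int = 0


def build_book(messages):
    """Replay messages in order and return resting shares per price level."""
    open_orders = {}  # order_id -> the "add" message that is still unexecuted
    for msg in messages:
        if msg.action == "add":
            open_orders[msg.order_id] = msg
        elif msg.action == "cancel":
            open_orders.pop(msg.order_id, None)

    # Aggregate the surviving orders by side and price level.
    book = {"buy": defaultdict(int), "sell": defaultdict(int)}
    for order in open_orders.values():
        book[order.side][order.price] += order.shares
    return {side: dict(levels) for side, levels in book.items()}


def best_bid_ask(book):
    """Highest resting buy price and lowest resting sell price (or None)."""
    bid = max(book["buy"]) if book["buy"] else None
    ask = min(book["sell"]) if book["sell"] else None
    return bid, ask


# Made-up sample messages, purely for illustration.
messages = [
    Message("add", 1, "buy", 10000, 200),
    Message("add", 2, "buy", 9990, 100),
    Message("add", 3, "sell", 10010, 150),
    Message("add", 4, "buy", 10000, 50),
    Message("cancel", 2),
]
book = build_book(messages)
print(book)
print(best_bid_ask(book))
```

This sketch rebuilds the book only once, at the end of the message stream. A full-resolution reconstruction would need the state of the book after every message, which is presumably part of what makes the problem computationally demanding.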
September 18, 2018
The XSEDE Monthly HPC Workshops
Presenter(s): John Urbanic (PSC)
I will talk about the XSEDE Monthly Workshop Series, which uses the Wide Area Classroom (WAC). It has exceeded 10,500 actual-sitting-in-the-classroom students over the past 5 years, with growth continuing. The HPC topics core to the series will be discussed, as will the benefits of the WAC approach. We will discuss audience satisfaction and demographics as well as the latest improvements and developments. The intention is that many of these techniques will be of use to other XSEDE outreach, training and education efforts.
GISandbox: A Science Gateway for Geospatial Computing
Presenter(s): Davide Del Vento (NCAR)
Science gateways provide easy access to domain-specific tools and data. The field of Geographic Information Science and Systems (GIS) uses myriad tools and datasets, which raises challenges in designing a science gateway to meet users' diverse research and teaching needs. GISandbox is a new science gateway designed to meet the needs of researchers and educators leveraging geospatial computing. The GISandbox is built on Jupyter Notebooks to create an easy, open, and flexible platform for geospatial computing. Jupyter Notebooks is a widely used interactive computing environment running in the browser that integrates live code, narrative, equations and images. We extend the Jupyter Notebook platform to enable users to run interactive notebooks on the cloud resource Jetstream or computationally intensive notebooks on the Bridges supercomputer located at the Pittsburgh Supercomputing Center. A novel Job Management platform allows the user to easily submit a Jupyter Notebook for batch execution on Bridges (and eventually Comet), monitor the SLURM job, and retrieve output files. GISandbox Virtual Machines are created in Jetstream's Atmosphere interface and then deployed and configured using a series of Ansible scripts. When properly used, Ansible scripts make it possible to create an easily reproducible and scalable system. In this talk we will highlight use cases of GISandbox, give a bird's-eye view of how we have met their requirements in our implementation, and discuss future plans, including how it could be applied in other domains.
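As a rough illustration of what batch execution of a notebook can look like, the Python sketch below generates a SLURM batch script that runs a notebook non-interactively with `jupyter nbconvert --execute`. This is not the GISandbox Job Management platform: the function name, the job name, the output naming scheme, and the default partition `RM` are assumptions made for the example.

```python
import shlex


def make_batch_script(notebook, partition="RM", minutes=30):
    """Return the text of a SLURM batch script that executes a notebook.

    The partition name and the "-executed" output suffix are placeholders.
    """
    hours, mins = divmod(minutes, 60)
    output = notebook.rsplit(".ipynb", 1)[0] + "-executed.ipynb"
    lines = [
        "#!/bin/bash",
        "#SBATCH --job-name=notebook",
        f"#SBATCH --partition={partition}",
        "#SBATCH --nodes=1",
        f"#SBATCH --time={hours:02d}:{mins:02d}:00",
        # Run every cell and write the results to a new notebook file.
        "jupyter nbconvert --to notebook --execute "
        f"{shlex.quote(notebook)} --output {shlex.quote(output)}",
    ]
    return "\n".join(lines) + "\n"


script = make_batch_script("analysis.ipynb", minutes=90)
print(script)
```

The generated text could be saved to a file and submitted with `sbatch`, with `squeue` used to monitor the job. How the gateway itself submits and tracks jobs is not described in the abstract.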
August 21, 2018
OpenTopography: A gateway to high resolution topography data and services
Presenter(s): Choonhan Youn (SDSC)
Over the past decade, there has been dramatic growth in the acquisition of publicly funded high-resolution topographic and bathymetric data for scientific, environmental, engineering and planning purposes. Because of the richness of these data sets, they are often extremely valuable beyond the application that drove their acquisition and thus are of interest to a large and varied user community. However, because of the large volumes of data produced by high-resolution mapping technologies such as lidar, it is often difficult to distribute these datasets. Furthermore, the data can be technically challenging to work with, requiring software and computing resources not readily available to many users. Some of the complex algorithms involved require high performance computing resources to run efficiently, especially in an on-demand processing and analysis environment. With steady growth in the number of users and in the complex, resource-intensive algorithms used to generate derived products from these invaluable datasets, HPC resources are becoming more necessary to meet the increasing demand. By utilizing the Comet XSEDE resource, OpenTopography aims to democratize access to and processing of these high-resolution topographic data.
Development of multiple scattering theory method: the recent progress and applications
Presenter(s): Yang Wang (PSC)
Multiple scattering theory is an ab initio electronic structure calculation method in the framework of density functional theory. It differs from other ab initio methods in that it is an all-electron method and is not based on a variational approach. Its advantage of having easy access to the Green function makes it a unique tool for the study of random alloys and electronic transport. In this presentation, I will give a brief overview of multiple scattering theory and will discuss the recent ECSS projects relevant to the development and applications of the multiple scattering theory method.
June 19, 2018
An Innovative Tool for IO Workload Management on Supercomputers
Presenter(s): Si Liu (TACC)
Modern supercomputer applications have been driving a high demand for capable storage resources in addition to fast computing resources. However, these storage systems, especially parallel shared filesystems, have become the Achilles' heel of powerful supercomputers. A single user's improper IO workload can easily result in global filesystem performance degradation and even unresponsiveness. In this project, we developed an innovative IO workload management system that optimally controls the IO workload from the users' side. This system automatically detects and restricts improper IO workloads from supercomputer users to protect parallel shared filesystems.
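The abstract does not say how the system restricts IO, so the Python sketch below shows one generic way to cap a client's rate of IO operations, a token bucket. It is offered only as an illustration of rate limiting in general, not as the tool presented in the talk. The class name, the parameters, and the sample rates are assumptions.

```python
class TokenBucket:
    """Allow about `rate` operations per second, with bursts up to `capacity`."""

    def __init__(self, rate, capacity):
        self.rate = rate          # tokens added per second
        self.capacity = capacity  # maximum tokens that can accumulate
        self.tokens = capacity    # start with a full bucket
        self.last = 0.0           # time of the previous request, in seconds

    def allow(self, now, cost=1.0):
        """Return True if an operation costing `cost` tokens may run at `now`."""
        elapsed = now - self.last
        self.last = now
        # Refill in proportion to elapsed time, never beyond capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


bucket = TokenBucket(rate=2.0, capacity=3.0)
# Five operations arrive in one burst at t=0, then one more at t=1.
decisions = [bucket.allow(0.0) for _ in range(5)] + [bucket.allow(1.0)]
print(decisions)
```

Time is passed in explicitly so the behaviour is deterministic. A real limiter would read a monotonic clock and would delay or queue rejected operations rather than simply refusing them.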
The Brain Image Library
Presenter(s): Derek Simmel (PSC)
The Brain Image Library (BIL) is a national public resource enabling researchers to deposit, analyze, mine, share and interact with large brain image datasets. As part of a comprehensive U.S. NIH BRAIN cyberinfrastructure initiative, BIL encompasses the deposition of datasets, the integration of datasets into a searchable web-accessible system, the redistribution of datasets, and a High Performance Computing enclave to allow researchers to process datasets in-place and share restricted and pre-release datasets. BIL serves a geographically distributed user base including large confocal imaging centers that are generating petabytes of confocal imaging datasets per year. For these users, the library serves as an archive facility for whole brain volumetric datasets from mammals, and a facility to provide researchers with a practical way to analyze, mine, share or interact with large image datasets. The Brain Image Library is operated as a partnership between the Biomedical Applications Group at the Pittsburgh Supercomputing Center, the Center for Biological Imaging at the University of Pittsburgh and the Molecular Biosensor and Imaging Center at Carnegie Mellon University.
In this talk, I will briefly review the characteristics of the data that the Brain Image Library will store, and the infrastructure we are building at PSC to ingest and manage the data for access.
May 15, 2018
Computational fluid-structure interaction of biological systems
Presenter(s): Hang Liu (TACC)
Principal Investigator(s): Haoxiang Luo (Vanderbilt University)
I will briefly discuss what we have done through this ECSS project to optimize the VICAR3D code developed by the PI's group. This includes the standard procedures we usually follow in this kind of effort: profiling the code's performance characteristics, sorting out performance glitches, reorganizing the data domain decomposition, making the code more efficient in parallel, and examining performance portability across architectures, from Sandy Bridge and Knights Corner on Stampede1 to Knights Landing on Stampede2. I would also like to share some interesting collisions and pleasant collaborations with the PI during the project and the lessons we learned.
A historical big data analysis to disclose the social construction of juvenile delinquency
Presenter(s): Sandeep Puthanveetil (NCSA)
Principal Investigator(s): Yu Zhang (The State University of New York at Brockport)
Social construction is a theoretical position that social reality is created through human definition and interaction. As one type of social reality, juvenile delinquency is perceived as part of social problems, deeply contextualized and socially constructed in American society. The social construction of juvenile delinquency started far earlier than the first juvenile court in 1899 in the U.S. Scholars have tried traditional historical analysis to explore the timeline of the social construction of juvenile delinquency in the past, but it is inefficient to examine hundreds of years of documents using traditional paper-and-pencil methods. This project aims to study the social construction of juvenile delinquency in the United States using data analysis of scanned historic newspaper collections. It combines image and linguistic analyses with big data tools to analyze hundreds of years of scanned newspaper images and trace the development of the social construction of juvenile delinquency in American society. Currently, the startup phase analyzes data from an archive of newspapers (1853-1921) from the Library of Congress Chronicling America website (http://chroniclingamerica.loc.gov/newspapers/). Sandeep will provide a very brief overview of the project, discuss the image analysis tools being designed and developed as part of this project, specifically with regard to segmentation of newspaper articles and OCR, their current progress, and some of the upcoming tasks in the text analysis and visualization stages of the project.
April 17, 2018
Clusters in the Cloud - Programmable, Elastic Cyberinfrastructure
Presenter(s): Eric Coulter (IU)
Principal Investigator(s): Sudhakar Pamidighantam (IU) Amit Majumdar (SDSC) Borries Demeler (UTHSC)
Eric will discuss the process of building a customized virtual cluster using OpenStack, Ansible and SLURM, the benefits of elastic resources for gateway groups, and how this can be applied to extend the compute resources available to traditional hardware systems. Eric has worked with PI Sudhakar Pamidighantam (SEAGrid science gateway), PI Amit Majumdar (Neuroscience Gateway) and PI Borries Demeler (UltraScan science gateway) to enable production-ready virtual clusters on Jetstream.
Software as a Service Gateways
Presenter(s): Eroma Abeysinghe (IU)
Principal Investigator(s): Alison Marsden (Stanford) Charles Danko (Cornell)
Research groups producing open source scientific software often face the daunting task of helping their user communities with build instructions for a wide variety of hardware platforms, assisting in optimizing the applications, and developing detailed documentation for using the software. Such software communities can ease the support burden by developing custom science gateways for this specialized software. In this talk, Eroma Abeysinghe will discuss two such efforts in developing, deploying and operating science gateways for Finite-Element Blood flow solver (SimVascular) and Detection of Regulatory Elements (dReg), working in collaboration with PIs Alison Marsden and Charles Danko, respectively. Eroma will discuss her experiences in developing these gateways based on the open community science gateway framework Apache Airavata and the PIs' early success in community engagement with research and education.
March 20, 2018
ECSS Symposium featuring PI Panel
Presenter(s): Michael Cianfrocco (University of Michigan) Cameron Smith (Rensselaer Polytechnic Institute) Jian Tao (Texas A&M University) Sever Tipei (University of Illinois)
Curious about XSEDE's Extended Collaborative Support Services (ECSS)? Join us at our ECSS Symposium webinar on March 20 to hear from a panel of PIs about their experiences working with ECSS! They'll share what it was like requesting ECSS support, what the collaboration was like throughout the course of the project, and how ECSS support helped them achieve results.
Michael Cianfrocco is a Research Assistant Professor at the University of Michigan's Life Sciences Institute. Michael's ECSS project, "Analysis of Cryo-EM data on Comet and Gordon," began with a postdoctoral position with Andres Leschziner's lab at UCSD. Michael has been working with Mona Wong (SDSC) through both ECSS and the Science Gateways Community Institute to develop a gateway that would offer the cryoEM science community a web-based tool to simplify the analysis of data using a standardized workflow running on XSEDE's supercomputers. This gateway will lower the barrier to high performance computing tools and contribute to the fast-growing field of structural biology.
Cameron Smith is a Computational Scientist at the Scientific Computation Research Center at Rensselaer Polytechnic Institute. Cameron's project, "Adaptive Finite-element Simulations of Complex Industrial Flow Problems," focuses on scaling and performance analysis of adaptive in-memory workflows using PHASTA CFD, EnGPar load balancing, and PUMI unstructured mesh services on Stampede2's Knights Landing processors. The workflows are executed through the PHASTA science gateway. Cameron worked with ECSS staff Lars Koesterke and Lei Huang (both at TACC) on this project.
Jian Tao is a Research Scientist in the Strategic Initiatives Group at Texas A&M Engineering Experiment Station and High Performance Research Computing at Texas A&M University. Jian's work, "Deploying Containerized Coastal Model on XSEDE Resources," first began while he was at Louisiana State University. The goal is to develop and deploy enhancements into the SIMULOCEAN science gateway, integrating new Docker features of Bridges and Globus capabilities for authentication, file transfer and sharing. The PI worked with Mona Wong and Andrea Zonca (SDSC) and Stuart Martin from the Globus team.
Sever Tipei is a Professor of Composition-Theory in the School of Music at University of Illinois' College of Fine and Applied Arts. His project, "DISSCO, a Digital Instrument for Sound Synthesis and Composition," involves optimization and parallelization of the multi-threaded code DISSCO (developed jointly at the UIUC Computer Music Project and at Argonne National Laboratory). DISSCO combines the field of Computer-assisted Composition with that of Sound Design in a seamless process. Sever has worked with ECSS staff Paul Rodriguez and Bob Sinkovits (both at SDSC) on this project.