ECSS staff share technical solutions to scientific computing challenges monthly in this open forum.
The ECSS Symposium allows the more than 70 ECSS staff members to exchange information each month about successful techniques used to address challenging science problems. Tutorials on new technologies may be featured. Two 30-minute, technically focused talks are presented each month and include a brief question-and-answer period. This series is open to everyone.
Day and Time: Third Tuesdays @ 1 pm Eastern / 12 pm Central / 10 am Pacific
Add this event to your calendar.
Note – Symposium not held in July and November due to conflicts with PEARC and SC conferences.
Webinar (PC, Mac, Linux, iOS, Android): Launch Zoom webinar
Meeting ID: 892 8873 8446
One tap mobile
+13462487799,,89288738446# US (Houston)
+16027530140,,89288738446# US (Phoenix)
Find your local number to join by phone: https://illinois.zoom.us/u/konD1P8cl
Upcoming events are also posted to the Training category of XSEDE News.
Due to the large number of attendees, only the presenters and host broadcast audio. Attendees may submit chat questions to the presenters through a moderator.
To better secure Zoom meetings, all participants are now required to log in to their Zoom account (personal or university/institution) to access any XSEDE meeting. If you do not currently have an account, you can create one at https://zoom.us/signup
Previous years' ECSS seminars may be accessed through these links:
August 18, 2020
RDA Recommendations and Outputs
Presenter(s): Anthony Juehne (RDA Foundation)
The RDA was launched in 2013 to fill the identified need for a neutral, collaborative space gathering the diverse data communities and, through informed consensus, building the social and technical bridges to enable open data sharing. Since its founding, the RDA principles - Open, Consensus, Balance, Harmonization, Community-driven, Non-profit, and Technology-Neutral - have resonated across research communities. RDA membership currently includes over 11,000 participants representing 144 countries from all populated continents, collaborating in 97 working or interest groups. The RDA is focused on actively building outcomes to accelerate the work to support open data interoperability, sharing, and use. This happens through the development and deployment of two primary output categories: i) technical infrastructure (e.g., tools, models, registries); and ii) social infrastructure (e.g., common standards, best practices, policies). This presentation will discuss an approach to implementing RDA-developed outputs and recommendations across multiple areas of organizational operation, including human development and education, data laws and policies, research practices, data and metadata formats and standards, data sharing workflows, and infrastructure management for enhanced interoperability.
FAIR Data and SEAGrid Gateway a Research Data Alliance Adoption Project
Presenter(s): Rob Quick (Indiana University)
The Science and Engineering Grid (SEAGrid) Gateway has been an active resource for the computational community since 2016. During this time the utility of persistent identifiers (PIDs) for research data products, as defined in the FAIR principles for open data, has become prevalent in research communities. At the beginning of 2020 the Research Data Alliance funded an adoption project to integrate RDA outputs and recommendations focused on PID issuance to the data and software components that make up a science workflow within the SEAGrid environment. This presentation will summarize the project and describe the gateway and data infrastructure components required for integration, along with the details of the integration process. The work done in this adoption project can be used to inform future gateway projects that adopt the technical components of FAIR, which are reliant on a persistent identifier resolution infrastructure.
June 16, 2020
Scalable Research Automation using Globus
Presenter(s): Rachana Ananthakrishnan (Globus)
REST APIs exposed by the Globus service, combined with high-speed networks and Science DMZs, create a data management platform that can be leveraged to increase efficiency in research workflows. In many cases, current ad hoc or human centered processes fall short of addressing the needs of researchers as their work becomes more data intensive. As data volumes grow, the overhead introduced by such non-scalable processes hampers core research activities, sometimes to the point where research takes a back seat to wrangling with IT infrastructure. However, technologies exist for reducing this burden and reengineering processes such that they can easily cope with growing data velocity and volume. One such technology is the Globus platform-as-a-service that facilitates access to advanced data management capabilities, and enables integration of these capabilities into existing and new scientific workflows to automate repetitive tasks: data replication, ingest from instruments, backup, archival, data distribution, etc. We will present real-world examples that illustrate how Globus can be used to perform data management tasks at scale, with no or minimal effort on the part of the researcher. Examples include streamlined data flows at the Advanced Photon Source data sharing system, used to distribute data from light source experiments. We will describe how the Globus platform provides intuitive access to authentication, authorization, sharing, transfer, and synchronization capabilities that can be included in simple scripts or integrated into more full-featured applications.
Building Source-to-Source Tools for High-Performance Computing
Presenter(s): Chunhua "Leo" Liao (LLNL)
Computational scientists face numerous challenges when trying to exploit powerful and complex high-performance computing (HPC) platforms. These challenges arise in multiple aspects including productivity, performance, correctness and so on. In this talk, I will introduce a source-to-source approach to addressing HPC challenges. Our work is based on a unique compiler framework named ROSE. Developed at Lawrence Livermore National Laboratory, ROSE encapsulates advanced compiler analysis and optimization technologies into easy-to-use library APIs so developers can quickly build customized program analysis and transformation tools for C/C++/Fortran and OpenMP programs. Several example tools will be introduced, including the AST inliner, outliner, and a variable move tool. I will also briefly mention ongoing work related to benchmarks, composable tools, and training for compiler/tool developers. This work was performed under the auspices of the U.S. Department of Energy by Lawrence Livermore National Laboratory under Contract DE-AC52-07NA27344 (LLNL-ABS-810981).
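As a loose analogy for the source-to-source workflow described above (parse source into an AST, transform the tree, emit source again), here is a minimal sketch using Python's standard `ast` module. It is not ROSE and does not use any ROSE API; the `FoldConstants` class and `transform` function are hypothetical names chosen for illustration, and `ast.unparse` assumes Python 3.9 or later.

```python
import ast
import operator

# Arithmetic operators this sketch knows how to fold (an illustrative subset).
_OPS = {ast.Add: operator.add, ast.Sub: operator.sub, ast.Mult: operator.mul}


class FoldConstants(ast.NodeTransformer):
    """Replace binary operations on two numeric literals with their value."""

    def visit_BinOp(self, node):
        self.generic_visit(node)  # transform child expressions first
        op = _OPS.get(type(node.op))
        if (
            op is not None
            and isinstance(node.left, ast.Constant)
            and isinstance(node.right, ast.Constant)
            and isinstance(node.left.value, (int, float))
            and isinstance(node.right.value, (int, float))
        ):
            folded = ast.Constant(op(node.left.value, node.right.value))
            return ast.copy_location(folded, node)
        return node


def transform(source):
    """Source in, transformed source out."""
    tree = ast.parse(source)
    tree = FoldConstants().visit(tree)
    ast.fix_missing_locations(tree)
    return ast.unparse(tree)


print(transform("y = 2 * 3 + x"))
```

The same parse, transform, unparse shape underlies tools such as inliners and outliners; only the tree rewrite differs.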
May 19, 2020
Gateway Production Monitoring
Presenter(s): Kenneth Yoshimoto (SDSC)
In order to monitor the function of a production gateway, Neuroscience Gateway (NSG), two programs were developed to test gateway functions: data upload, job submission, and output retrieval. NSG uses the Workbench Framework (WF) code base. Other gateways using WF are COSMIC2 and CIPRES. The WF gateway can provide both a non-API web interface and a RESTful API. NSG makes both these interfaces available to users. For routine monitoring of production status, programs were written to do a daily test of both interfaces. The programs and testing process will be presented.
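The three functions under test (data upload, job submission, output retrieval) can be sketched as a chained check. This is a minimal illustration, not the NSG monitoring programs: `FakeGateway` and its method names are hypothetical stand-ins, and a real check would call the gateway's web or REST interface in their place.

```python
import time


class FakeGateway:
    """Hypothetical stand-in for a gateway client (not the NSG or WF API)."""

    def __init__(self):
        self._jobs = {}

    def upload(self, name, data):
        return f"data-{name}"

    def submit(self, data_id):
        job_id = f"job-{len(self._jobs) + 1}"
        self._jobs[job_id] = f"output for {data_id}"
        return job_id

    def fetch_output(self, job_id):
        return self._jobs[job_id]


def run_check(gateway):
    """Run upload -> submit -> retrieve and record the outcome of each step."""
    state = {}
    steps = [
        ("upload", lambda: state.update(data_id=gateway.upload("probe", b"test input"))),
        ("submit", lambda: state.update(job_id=gateway.submit(state["data_id"]))),
        ("retrieve", lambda: state.update(output=gateway.fetch_output(state["job_id"]))),
    ]
    results = []
    for name, step in steps:
        start = time.monotonic()
        try:
            step()
            ok, error = True, None
        except Exception as exc:  # any failure marks the step as failed
            ok, error = False, repr(exc)
        results.append(
            {"step": name, "ok": ok, "error": error, "seconds": time.monotonic() - start}
        )
        if not ok:
            break  # later steps depend on earlier ones
    return results


for row in run_check(FakeGateway()):
    print(row["step"], "ok" if row["ok"] else row["error"])
```

Running such a check once a day from a scheduler, once per interface, and alerting on any failed step would match the routine described, though the scheduling and alerting details here are assumptions.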
Essentials for a Successful XRAC Proposal: Code Performance and Scaling
Presenter(s): Lars Koesterke (TACC)
Many PIs struggle to put together a sound computational plan based on code performance and scaling information. In fact, for first-time PIs the most common reason for rejection is an insufficient computational plan. This new training module tries to address the problem by answering two questions: why are scaling and performance data important and how do reviewers use them, and how can these data be used to put together a computational plan? Currently the module is geared towards traditional HPC communities, and we are working on extending the content towards new communities. The purpose of my talk at the ECSS symposium is to bring staff members onto the same page and to raise awareness that a new resource is available that may help educate users struggling to write a successful XRAC proposal.
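To make the link between scaling data and a computational plan concrete, here is a minimal sketch that derives strong-scaling speedup and parallel efficiency from benchmark timings, then estimates a node-hour request. The timings, the 70% efficiency threshold, and all function names are hypothetical; none of them come from the training module or from XRAC policy.

```python
from dataclasses import dataclass


@dataclass
class ScalingPoint:
    nodes: int
    seconds: float  # wall-clock time of one fixed-size benchmark run


def scaling_table(points):
    """Return (nodes, speedup, efficiency) relative to the smallest run."""
    pts = sorted(points, key=lambda p: p.nodes)
    base = pts[0]
    rows = []
    for p in pts:
        speedup = base.seconds / p.seconds
        efficiency = speedup / (p.nodes / base.nodes)
        rows.append((p.nodes, speedup, efficiency))
    return rows


def largest_efficient_run(points, min_efficiency=0.7):
    """Largest node count whose efficiency stays at or above the threshold."""
    return max(n for n, _, eff in scaling_table(points) if eff >= min_efficiency)


def node_hours(points, nodes, runs):
    """Node-hours needed for `runs` production runs at the chosen node count."""
    seconds = next(p.seconds for p in points if p.nodes == nodes)
    return nodes * seconds / 3600.0 * runs


# Hypothetical benchmark timings for one fixed problem size.
measurements = [
    ScalingPoint(1, 3600.0),
    ScalingPoint(2, 1900.0),
    ScalingPoint(4, 1000.0),
    ScalingPoint(8, 700.0),
]

for nodes, speedup, eff in scaling_table(measurements):
    print(f"{nodes:>3} nodes  speedup {speedup:5.2f}  efficiency {eff:5.2f}")

chosen = largest_efficient_run(measurements)
print(f"request: {node_hours(measurements, chosen, runs=100):.1f} node-hours at {chosen} nodes")
```

With these made-up numbers, efficiency drops below the threshold at 8 nodes, so the request is sized at 4 nodes. A real plan would justify the threshold and the number of runs from the science goals.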
April 21, 2020
Beginners tutorial on cloud devops on Jetstream focused on Kubernetes and JupyterHub
Presenter(s): Andrea Zonca (SDSC)
This symposium assumes no previous knowledge of cloud technologies and will cover the following topics:
* Example virtual machine setup with OpenStack command line tools
* Deploying a Kubernetes cluster on Jetstream
* How Kubernetes works: architecture, differences between containers and virtual machines
* Deploying JupyterHub on Jetstream for a workshop
March 17, 2020
AMP Gateway: A portal for atomic, molecular and optical physics simulations.
Presenter(s): Sudhakar Pamidighantam (Indiana University)
We describe the creation of a new Atomic and Molecular Physics science gateway (AMPGateway). The gateway is designed to bring together a subset of the AMP community to work collectively to make their software suites available and easier to use by the partners as well as others. By necessity, a project such as this requires the developers to work on issues of portability, documentation, and ease of input, as well as making sure the codes can run on a variety of architectures. The gateway was built using the Apache Airavata gateway middleware framework. Initially it was deployed using the Airavata PHP client on the web but has since been redeployed under a Django web framework. Here we outline the organization and facility of the Django deployment and how it has been used, and discuss future directions for the AMP gateway.
Bursting into the public Cloud – Sharing my experience doing it at large scale for IceCube
Presenter(s): Igor Sfiligoi (SDSC)
When compute workflow needs spike well in excess of the capacity of a local compute resource, capacity should be temporarily provisioned from somewhere else to both meet deadlines and to increase scientific output. Public Clouds have become an attractive option due to their ability to be provisioned with minimal advance notice. I have recently helped IceCube expand their resource pool by a few orders of magnitude, first to 380 PFLOP32s for a few hours and later to 170 PFLOP32s for a whole workday. In the process we moved O(50 TB) of data to and from the clouds, showing that networking is not a limiting factor, either. While there was a non-negligible dollar cost involved with each, the effort involved was quite modest. In this session I will explain what was done and how, alongside an overview of why IceCube needs so much compute.