XSEDE is supported by the National Science Foundation.
- A key component of distributed computing and collaboration is a functional and efficient network. It is well known that scientific data sets continue to increase in number, size, and importance for numerous research and education (R&E) communities. Migrating data from the instruments that observe or create it (e.g., particle accelerators, telescopes) to the resources that process it (e.g., supercomputers) involves network infrastructure, communication protocols, and the end-user applications that direct workflow. The tie that binds scientific use cases on a network is a reliable network experience, free of architectural flaws and physical limitations. Operational staff are limited in the support they can deliver; innovative tools are required to solve the "end-to-end" performance problems that can delay the delivery of data or affect activities such as processing.
We will present an overview of network performance tools and techniques, focusing on the pS-Performance Toolkit (pSPT) and select XSEDE monitoring tools. The pSPT is an "all-in-one" monitoring solution that allows local control while providing a global view of performance that directly impacts the use of networks. While network problems have traditionally been the realm of 'operators', the definition of "the network" now includes the end hosts and applications as well, both candidates for inducing problems into the flow of end-to-end information. Goals include familiarizing attendees with how these tools aid in debugging networks, hosts, and applications, and with the proper way to install and configure the software.
- This half-day tutorial will provide a hands-on introduction to the OpenACC programming model. OpenACC is the new programming standard for accelerators (especially GPUs and Xeon Phis) that allows directive-based, incremental parallelization of codes at a high level. This is a major change from past programming approaches, which are hardware-specific and fairly low-level. With this advantage we can take C and Fortran programmers from serial programming to successful parallelization of exercises in a half day, assuming only introductory knowledge, while also providing a useful experience for intermediate parallel programmers. John Urbanic, a Parallel Computing Specialist at the Pittsburgh Supercomputing Center, will lead the lectures; Tom Maiden, also from PSC, will assist.
- The Innovative Technology component of the recently deployed XSEDE Stampede supercomputer at TACC provides access to 8 PetaFlops of computing power in the form of the new Intel Xeon Phi coprocessor, also known as MIC. While the MIC is x86-based, hosts its own Linux OS, and is capable of running most user codes with little porting effort, its architecture differs significantly from that of present x86 CPUs, and optimal performance requires an understanding of the possible execution models and basic details of the architecture. This tutorial is designed to introduce XSEDE users to the MIC architecture in a practical manner. Multiple lectures and hands-on exercises will be used to acquaint the user with the MIC platform and to explore the different execution modes as well as parallelization and optimization through example testing and reports. Compiler reports are particularly helpful on the MIC because the slower clock speed, many cores, and increased vector depth make vectorization critical for achieving high performance. The tutorial will be divided into four sections: introduction to the MIC architecture; native execution and optimization; offload execution; and symmetric execution. In each section users will spend half the time doing guided hands-on exercises.
- This tutorial presents state-of-the-art performance tools for leading-edge HPC systems, founded on the Score-P community instrumentation and measurement infrastructure, and demonstrates how they can be used for performance engineering of scientific applications based on standard MPI, OpenMP, hybrid combinations of both, and the increasingly common use of accelerators (e.g., Nvidia GPUs & Intel Xeon Phi). Parallel performance evaluation tools from the VI-HPS (Virtual Institute – High Productivity Supercomputing) are introduced and featured in hands-on exercises with Scalasca, Vampir, and TAU. We present the complete performance-engineering workflow: instrumentation, measurement (profiling and tracing, timing and PAPI counters), data storage, analysis, and visualization. Emphasis is placed on how the tools are used in combination to identify performance problems and investigate optimization alternatives, illustrated with a case study using a major application code. Participants will use their own notebook computers to connect to an XSEDE HPC system where the tools are installed for the exercises, preparing them to locate and diagnose performance bottlenecks in their own parallel programs.
- The HPC community now uses powerful supercomputing systems composed of heterogeneous nodes built from multi-core processors and accelerators such as NVIDIA GPUs and the Intel Xeon Phi (MIC). Numerical libraries are among the most heavily used libraries on these HPC systems, and the routines they provide are usually the most time-consuming part of scientific and engineering applications. This tutorial will introduce state-of-the-art numerical libraries that help computational scientists develop parallel code using robust and scalable solutions to achieve better performance. The tutorial, suitable for attendees with a beginning to intermediate level in parallel programming, will provide a comprehensive overview of selected numerical libraries most used by scientific applications. It will focus on simple examples of using these libraries and demonstrate programming and running code efficiently on XSEDE computing resources. Tools to be described include LAPACK, ScaLAPACK, FFTW, and PETSc, along with the PLASMA and MAGMA projects and vendor libraries (MKL, ACML, LibSci) that take advantage of multicore processors and accelerators like GPUs and MICs. Participants are encouraged to bring laptop computers and follow live demonstrations with detailed examples using the numerical libraries on XSEDE resources. Source code for the demonstrations and exercises will be available for download.
- SAGA-BigJob provides a framework for running both very large-scale parallel simulations and many small high-throughput tasks across a variety of middleware. These applications may also utilize a variety of storage systems when staging data in and out. BigJob has been used for parameter sweeps, many instances of the same task (ensembles), chained tasks, loosely coupled but distinct tasks, and tasks with data or compute dependencies. BigJob has seen its widest usage across the heterogeneous resources that XSEDE provides. Simple installation into user space on any resource that supports Python >2.6 makes the uptake of BigJob virtually seamless. BigJob supports thousands of jobs (millions of SUs) and has been at the heart of two recent and successful ECSS projects. It has been used by a wide range of application types, from computational chemistry applications (uncoupled ensembles) to loosely coupled applications. BigJob is a reference implementation of the P* Model of Pilot-Jobs; Pilot-Jobs decouple workload submission from resource assignment. BigJob and its data-management layer (BigData) address the fundamental challenges of co-placement and scheduling of data and compute in heterogeneous and distributed environments, with interoperability and extensibility as first-order concerns. We present a half-day introductory tutorial on using BigJob to effectively manage submission of computations and movement of data on XSEDE. Attendees will need to bring their laptops.
- The R language is the lingua franca of data analysis and statistical computing. However, it has issues with scalability. This tutorial will introduce attendees to both the R language and pbdR, its high-performance extension. Examples will utilize common techniques from data analytics, such as principal components and clustering. No background in R is assumed, but even R veterans will benefit from the session. The tutorial will include hands-on opportunities for attendees to actively follow along.
- The rapid growth of data in scientific research endeavors is placing massive demands on campus computing centers and high-performance computing (HPC) facilities to provide robust user services and supporting network infrastructure that scales with the needs of their users. Existing research data management (RDM) services are typically difficult to use and error-prone, and the underlying networking and security infrastructure is complex and inflexible, resulting in user frustration and sub-optimal use of resources. An approach that is increasingly common in HPC facilities, such as those managed by XSEDE service providers, is to adopt software-as-a-service (SaaS) solutions like Globus Online for moving, syncing, and sharing large data sets. The SaaS approach allows HPC resource owners and systems administrators to deliver enhanced RDM services to end users at optimal quality of service, while minimizing the administrative and operations overhead associated with traditional software. Globus Online is the first service to pass the XSEDE Operations Acceptance Test and be accepted for production deployment, making it an official software service on XSEDE resources. Usage has grown rapidly, with more than 8,000 registered users and over 12 petabytes moved. Globus Online's reliable file transfer, combined with the recently announced data sharing service, is key functionality for bridging between campus and external resources, and is enabling scientists to more easily scale their research workflows. Tutorial attendees will explore the challenges such facilities face in delivering scalable RDM solutions. They will be introduced to the RDM functions of Globus Online, learn how other resource owners are using Globus Online, and have the opportunity for hands-on interaction with the service at various levels of technical depth.
- Many HPC developers still use command-line tools and tools with disparate, and sometimes confusing, user interfaces for the different aspects of the HPC project life cycle. The Eclipse Parallel Tools Platform (PTP) combines tools for coding, debugging, job scheduling, tuning, revision control, and more into an integrated environment for increased productivity. Leveraging the successful open-source Eclipse platform, PTP helps manage the complexity of HPC scientific code development and optimization on diverse platforms, and provides tools to gain insight into complex code that is otherwise difficult to attain. Moreover, PTP is preconfigured to work seamlessly with a large number of XSEDE resources. This tutorial will provide attendees with a hands-on introduction to Eclipse and PTP, with an advanced section focusing on the use of performance tools through Eclipse PTP.
- Researchers largely understand and use visualization as a communication tool. This narrow view often keeps scientists from fully using and developing their visualization skill set. This tutorial will provide a "from the ground up" understanding of visualization and its utility in error diagnostics and in exploring data for scientific insight. When used effectively, visualization provides a complementary and effective toolset for data analysis, one of the most challenging problems in computational domains. In this tutorial we bridge these gaps by providing end users with fundamental visualization concepts, execution tools, customization, and usage examples. The tutorial will be presented in three sessions covering visualization fundamentals, a hands-on introduction to the VisIt software, and advanced, new features of VisIt.
- Security is crucial to the software that we develop and use. With the growth of both Grid and Cloud services, security is becoming even more critical. This tutorial is relevant to anyone wanting to learn about minimizing security flaws in the software they develop. We share our experiences gained from performing vulnerability assessments of critical middleware. You will learn skills critical for software developers and analysts concerned with security. This tutorial presents coding practices subject to vulnerabilities, with examples of how they commonly arise, techniques to prevent them, and exercises to reinforce them. Most examples are in Java, C, C++, Perl and Python, and come from real code belonging to Cloud and Grid systems we have assessed. This tutorial is an outgrowth of our experiences in performing vulnerability assessment of critical middleware, including Google Chrome, Wireshark, Condor, SDSC Storage Resource Broker, NCSA MyProxy, INFN VOMS Admin and Core, and many others.
- The goal of this tutorial is to enable application users and developers to readily optimize the performance of their applications on Stampede. The heterogeneous compute nodes of Stampede include both Sandy Bridge multicore chips and a manycore Knights Corner chip. This tutorial will demonstrate the use of PerfExpert and MACPO for optimizing applications for the Sandy Bridge chips, and a simple process for partitioning MPI-based applications across the Sandy Bridge and Knights Corner chips to most effectively use the computational power of both. The tutorial will be hands-on, with participants following the demonstrations and doing the laboratory sessions on their laptops. Each participant will have a guest account on Stampede and should bring a laptop with which to access these systems. Example and demonstration applications are provided as part of the tutorial, but participants are encouraged to come prepared to optimize one of their own applications. Performance optimization has four phases: measurement, diagnosis of performance bottlenecks, selection of optimizations, and implementation of optimizations. Past versions of PerfExpert have almost fully automated the first three phases. The optimization process is based on an extended version of PerfExpert that integrates data from the MACPO data-structure analysis with the PerfExpert analyses of code segments; this extended version fully automates a subset of commonly used optimizations.
- As more scientific applications are executed on large-scale platforms, how to utilize the underlying parallel storage system to provide matching I/O performance becomes critical to ensuring scientific productivity. Due to complicated system architectures and constraints from many directions, we must look for a revolutionary way to enable fast I/O. One of the major challenges is how to write and read large self-describing datasets quickly and efficiently at scale. Since many applications want to process data in an efficient and flexible manner, in terms of both data formats and the operations performed (e.g., files, data streams), I/O abstractions need to allow easy techniques to transform I/O across the wide variety of I/O choices. Our group has researched, developed, and created an I/O framework, ADIOS, which abstracts the API from the implementation and allows a large and growing set of users to write and read data efficiently at large levels of concurrency. The tutorial is arranged in three parts. Part I introduces parallel I/O and the ADIOS framework to the audience; specifically, we will discuss the concept of the ADIOS I/O abstraction, the binary-packed file format, and the I/O methods, along with their benefits to applications. Part II includes a hands-on session on how to write and read data and how to use different I/O componentizations inside ADIOS. Part III shows users how complex codes can take advantage of the ADIOS framework.
- This tutorial will provide training and hands-on activities to help new users learn and become comfortable with the basic steps necessary to first obtain, and then successfully employ, an XSEDE allocation to accomplish their research goals. The tutorial will consist of three sections. The first part will explain the XSEDE allocations process and how to write and submit successful allocation proposals; the instructor will describe the contents of an outstanding proposal and the process for generating each part. Topics covered will include the scientific justification, the justification of the request for resources, techniques for producing meaningful performance and scaling benchmarks, and navigating the POPS system through the XSEDE Portal for electronic submission of proposals. The second section, "Information Security Training for XSEDE Researchers," will review basic information security principles for XSEDE users, including: how to protect yourself from on-line threats and risks, how to secure your desktop/laptop, safe practices for social networking, email, and instant messaging, how to choose a secure password, and what to do if your account or machine has been compromised. The last part of the tutorial will cover the New User Training material that has been delivered remotely each quarter, but will delve deeper into these topics. New topics will be covered, including how to troubleshoot why a job has not run and how to improve job turnaround by understanding differences in batch job schedulers on different platforms. We will demonstrate how to perform the various tasks on the XSEDE Portal with live, hands-on activities and personalized help. We will practice submitting a job, figuring out why it has not run, and transferring files between supercomputers. In the event of network issues we will have demos available as a backup.
We anticipate significant interest from Campus Champions, and therefore we will explain how attendees can assist others, as well as briefly describe projects currently being carried out in non-traditional HPC disciplines.
- This tutorial will provide an introduction to writing scalable scientific applications that are robust, elastic, and portable across different distributed systems. We will use Makeflow and Work Queue, developed by the Cooperative Computing Lab at the University of Notre Dame, to build and run such applications. Makeflow and Work Queue are designed to be lightweight, easy to install, and easy to use with simple programming interfaces. They are used around the world to attack large problems in fields such as life sciences, data mining, chemistry, biology, and more. The tutorial will introduce participants to Makeflow and Work Queue and teach their use in building scalable scientific applications. The tutorial will consist of two parts: (i) a lecture that gives an overview of Work Queue and Makeflow, their architecture, and salient features, and (ii) a hands-on instruction on installing and using these tools to build and run scalable applications. The hands-on instruction will be conducted using the HPC/HTC compute resources and services offered in XSEDE.
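A Makeflow workflow is a plain-text file of Make-style rules, each naming its output files, its input files, and the command that produces the outputs; Makeflow then dispatches independent rules to available workers (e.g., via Work Queue). A minimal hypothetical example, with purely illustrative file and program names:

```make
# Hypothetical Makeflow file: each rule is "outputs: inputs",
# followed by a tab-indented command. The two simulate rules
# have no dependency on each other, so they can run concurrently.
part1.dat: simulate input1.txt
	./simulate input1.txt > part1.dat

part2.dat: simulate input2.txt
	./simulate input2.txt > part2.dat

result.dat: merge part1.dat part2.dat
	./merge part1.dat part2.dat > result.dat
```

Because every rule declares its files explicitly, the workflow engine can stage data to remote workers and recover from failures by re-running only the affected rules.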
- This tutorial aims to introduce parallel computing capability to the large group of researchers, from engineering and the physical sciences to the social sciences, who use MATLAB either as the primary software for their computing needs or for processing data from simulations performed with other software. With the MATLAB Distributed Computing Server (MDCS) now available on the supercomputer Blacklight at the Pittsburgh Supercomputing Center, these researchers can scale up their MATLAB codes on multiple processing cores, often with a minimum of code changes. In some cases, the only change needed to achieve significant parallelism may be to parallelize an appropriate, computationally heavy loop by replacing the keyword for with parfor. Additional parallelism, as well as more complicated algorithms, may require more sophisticated parallelization. This half-day tutorial, conducted jointly by MathWorks and PSC, will provide a comprehensive overview of the different parallel computing capabilities of MATLAB, as well as how to submit code to run on Blacklight.
- The Hadoop framework is extensively used for scalable distributed processing of large datasets. The SDSC Gordon compute cluster is ideally suited to running Hadoop, with fast SSD drives enabling HDFS performance and the high-speed InfiniBand interconnect providing scalability. Hadoop can be set up on Gordon in two ways: 1) using the myHadoop framework through the regular batch queue, and 2) utilizing dedicated I/O nodes with associated compute nodes. All users interested in utilizing Hadoop on XSEDE resources are invited to attend. The tutorial will provide a short introduction to Hadoop, information on how to run Hadoop jobs on Gordon, performance results on Gordon, and an overview of Hadoop-based tools for data-intensive computing. We begin the tutorial with an introduction to Hadoop and a simple illustrative example. We then give an overview of the Gordon architecture and its advantages for running Hadoop, go over the basic configuration files needed to set up Hadoop, and detail how the myHadoop framework enables this for users within the regular queue structure. After completing the overview of Hadoop configuration and the Gordon architecture, we will have several hands-on examples (e.g., TestDFS and TeraSort) illustrating the use of Hadoop and myHadoop on Gordon. We will detail the SSD storage configurations on Gordon and the modes of access for users in the context of running Hadoop. The various network configuration options will also be detailed, with performance results from Hadoop benchmarks on Gordon. The remainder of the tutorial will go over various API options with a hands-on example of Hadoop streaming. A brief description of tools that use Hadoop for data-intensive computing (HIVE, HBase, Pig, and Apache Mahout's scalable machine-learning algorithms) will be provided. SDSC staff will be available to meet with individual users at the conclusion of the tutorial to further discuss the Hadoop environment on Gordon.
- This session provides a broad overview of High Performance Computing (HPC). Topics include: what is supercomputing?; the fundamental issues of HPC (storage hierarchy, parallelism); hardware primer; introduction to the storage hierarchy; introduction to parallelism via an analogy (multiple people working on a jigsaw puzzle); Moore's Law; the motivation for using HPC.
- This session provides an intuitive, nontechnical analogy for understanding distributed parallelism (desert islands), as a precursor for understanding the MPI programming model: distributed execution, communication, message passing, independence, privacy, latency vs. bandwidth; parallel strategies (client-server, task parallelism, data parallelism, pipelining). Assumed background: 1 semester of programming in C or C++, recently; basic Unix/Linux experience, recently.
- The Predictive Analytics Center of Excellence (PACE) strives to promote, educate, and innovate in the area of predictive analytics to enhance the wellbeing of the global population and economy. With the ubiquity of computing, databases, and automated data collection tools like sensors, all connected and integrated by sophisticated networks, the world has become saturated with data. While advances in technology have led to an unprecedented ability to compute, collect, and store huge amounts of data, the ability to extract useful information from raw data has become progressively critical. By integrating the methodologies of mathematics, statistics, visualization, and computer science, domain knowledge experts can address the Big Data challenge. This half-day tutorial is designed as an introduction for researchers seeking to extract meaningful predictive information from within massive volumes of data. The scientific community will get an introduction to the field of predictive analytics and learn to use a variety of data analysis tools to discover patterns and relationships in data that can contribute to building valid predictions. The tutorial is designed to provide professionals in business enterprises and scientific communities with the skills critical to design, build, verify, and test predictive data models. Data mining, the art and science of learning from data, covers a number of different procedures. This hands-on course emphasizes key learning techniques: decision trees, numeric prediction, clustering, Bayesian learning, artificial neural networks (ANNs), support vector machines (SVMs), etc. Workshop participants will have access to a comprehensive set of data mining tools available on SDSC's Gordon, one of the world's most powerful supercomputers with 300 terabytes of flash memory. Moreover, with access to this computing resource, participants will be able to sharpen their skills, apply data mining algorithms to real data, and interpret the results.
- This tutorial provides an introduction to visualization by exploring the underlying principles used in information and scientific visualization. Hands-on exercises using Gephi (information visualization) and VisIt (scientific visualization) are designed to give participants experience in discerning what type of visualization tool would be most effective in gaining insight into different ways of interpreting data. The visualization process is presented as a vehicle for knowledge discovery, gaining insight, and making better-informed decisions when analyzing data. The format will serve both those who wish to participate hands-on (using their own laptop) and those who wish to observe and ask questions.