Welcome!


SCHEDULE: JULY 17-22, 2011



SUNDAY JULY 17

TIME EVENT LOCATION
4:00PM - 7:00PM Registration Pre-Function Area
4:00PM - 9:00PM Campus Champions Salon GHIJ
6:00PM - 8:00PM Champions Dinner (Campus Champion Representatives and Invited Guests only) Deer Valley 1,2,3

TeraGrid New User Tutorial

Presenters: Philip Blood, Ken Hackworth, Jim Marsteller, Tom Maiden

Level: Introductory to Intermediate

Length: Half day

Time: 8:00am - 12:00pm

Abstract: This tutorial will provide training and hands-on activities to help new users learn and become comfortable with the basic steps necessary to obtain and successfully employ a TeraGrid allocation to accomplish their research goals.
For the first part of the tutorial, the topics and content will follow the TeraGrid New User Training sessions delivered quarterly via Readytalk, but we will delve deeper into these topics and demonstrate how to perform the various tasks on the TeraGrid with live, hands-on activities and personalized help. In the event of network issues we will have canned demos available as a backup.
The second part of the tutorial will provide "Cybersecurity 101" training. This session will review basic information security principles for TeraGrid users including: user responsibilities, how to protect yourself from on-line threats and risks, what to do if your account or machine is compromised, and how to secure your desktop/laptop.
The final part of the tutorial will help attendees to understand the TeraGrid allocations process and train them in writing successful proposals for review by the TeraGrid Research Allocations Committee (TRAC). The instructor will describe the contents of an outstanding proposal and the process for generating each part. Topics covered will include the scientific justification, the justification of the request for resources, techniques for producing meaningful performance and scaling benchmarks, and navigating the POPS system for electronic submission of proposals. As we anticipate significant interest from Campus Champions, emphasis will be placed on how attendees can assist others through this process.

Prerequisites: Active TeraGrid account with access to a compute resource; TeraGrid User Portal username and password

Requirements: A laptop with wireless network access and a current web browser, Java 1.5 or better, and software for connecting to TeraGrid machines via SSH.

Preparing for XD: How TeraGrid Resource Providers Can Easily Enable Globus Online for Data Movement

Presenters: Borja Sotomayor, Rajkumar Kettimuthu, Stuart Martin

Level: Intermediate

Length: Half day

Time: 8:00am - 12:00pm

Abstract: TeraGrid resource providers preparing for the upcoming transition to XD will benefit from this tutorial, which covers how to enable Globus Online as a data movement service for RP end users. An important element of XD going forward, Globus Online is a secure, reliable file transfer service that makes data movement fast and easy. Globus Online provides a solution to large data transfer challenges over GridFTP by providing a robust and highly monitored environment for file transfers that has powerful yet easy-to-use interfaces. This hosted service simplifies secure data movement between TeraGrid endpoints, or from a TeraGrid endpoint to the user's local server or laptop, without requiring construction of custom end-to-end systems. This hands-on session gives RPs the opportunity to learn from the Globus Online expert team and get their questions answered as they prepare to enable Globus Online for their end users.

Prerequisites: GridFTP familiarity.

Requirements: Attendees will need to bring a laptop.

XSEDE Client Installation and Use - From the Global Federated File System to Running Jobs

Presenters: Andrew Grimshaw, Karolina Sarnowska-Upton

Level: Intermediate

Length: Half day

Time: 8:00am - 12:00pm

Abstract: XSEDE introduces a new approach to satisfying user needs. The approach combines an emphasis on interoperability and standards as a means to reduce risk and provide a diversity of software sources, inclusion of campus and research group resources as first-class members of XSEDE, as well as particular attention to non-functional quality attributes such as ease-of-use and availability. This tutorial introduces students to using XSEDE access layer tools and sharing capabilities such as the Global Federated File System (GFFS), the XSEDE queue, and the XSEDE client user interfaces.

Prerequisites: Familiarity with Unix and a shell such as BASH

Requirements: Each student must provide his/her own laptop. We also assume that there will be reasonable bandwidth to the internet.

Hands-on Tutorial for Building Science Gateway Applications on Cyberinfrastructure

Presenters: Yan Liu, Shaowen Wang, Raminder Singh, Suresh Marru, Marlon Pierce, Nancy Wilkins-Diehr

Level: Intermediate/Advanced

Length: Half day

Time: 8:00am - 12:00pm

Abstract: The science gateway approach has been widely adopted to establish bridges between cyberinfrastructure (CI) and domain science communities and to enable end-to-end domain-specific computations on CI through efficient management of CI complexities within science gateways. As CI resources become increasingly available and accessible to researchers and scientists, the effectiveness of gateways depends on their community-wide usability and on the degree to which researchers are able to concentrate on their domain problem-solving. This tutorial uses SimpleGrid, a toolkit for efficient learning and development of science gateway building blocks, to provide hands-on experience in leveraging TeraGrid for domain-specific scientific computing, developing TeraGrid-enabled science gateway application services, and integrating application services as modular and highly usable OpenSocial gadgets. The Open Gateway Computing Environments (OGCE) gadget container is used to build a prototype gateway portal. The intended audience for this tutorial includes researchers and developers who are interested in building CI-powered science gateways.

Prerequisites: General understanding of Grid computing, Web 2.0 technologies (JavaScript, AJAX, OpenSocial gadgets), Web application development (PHP and Java), and Web services

Requirements: Web browser (Firefox is strongly recommended), SSH client (e.g., PuTTY)

Getting the most out of the TeraGrid SGI Altix UV System

Presenters: Scott Simmerman, Mahin Mahmoodi, Galen Arnold, Pragnesh Patel, Amy Szczepanski

Level: Advanced

Length: Half day

Time: 8:00am - 12:00pm

Abstract: The SGI Altix UV has an SMP architecture that allows large amounts of memory to be accessed from a single thread of execution. This allows researchers a way to approach some computational problems that are not well-suited for clusters and distributed memory systems. Currently, three SGI Altix UV systems have been deployed on the TeraGrid: Blacklight at PSC, Ember at NCSA, and Nautilus at RDAV. This tutorial will include an overview of the architecture of these systems, with an emphasis on NUMA (Non-Uniform Memory Access) and how to place processes in a way that maximizes performance. We will also describe strategies for developing code for these systems, including development case studies, approaches to parallelism, memory management and the GRU. Finally we will give an overview of available performance tools and how they can be used to achieve optimizations. This tutorial is ideal for researchers whose computational work would benefit from the large memory available on the SGI Altix UV.
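
To make the NUMA discussion concrete, here is a minimal sketch of the common "first touch" placement idiom in C with OpenMP (an editor's illustration, not part of the tutorial materials; the file name and array size are arbitrary). On Linux-based NUMA systems such as the Altix UV, memory pages are physically placed on the node of the thread that first writes them, so initializing data with the same loop decomposition later used for computation keeps most accesses local. Pinning threads and processes (for example with SGI's dplace) is a separate, complementary step covered in the tutorial.

    /* first_touch.c -- illustrative "first touch" data placement sketch.
     * Build example: gcc -fopenmp first_touch.c -o first_touch */
    #include <stdio.h>
    #include <stdlib.h>

    int main(void)
    {
        const size_t n = 1u << 25;           /* ~33M doubles per array */
        double *a = malloc(n * sizeof *a);
        double *b = malloc(n * sizeof *b);
        if (!a || !b) return 1;

        /* First touch: initialize in parallel with the same static
         * schedule the compute loop uses, so each thread's pages are
         * allocated on its own NUMA node. */
        #pragma omp parallel for schedule(static)
        for (size_t i = 0; i < n; i++) { a[i] = 1.0; b[i] = 2.0; }

        double sum = 0.0;
        #pragma omp parallel for schedule(static) reduction(+:sum)
        for (size_t i = 0; i < n; i++) sum += a[i] * b[i];

        printf("dot = %g\n", sum);
        free(a); free(b);
        return 0;
    }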

Prerequisites: Familiarity with UNIX and either Fortran 90 or C/C++ is preferred. Knowledge of a parallel programming method (e.g. MPI or OpenMP) is helpful but not required.

Requirements: This tutorial will include the use of performance tools, many of which are open source, that are installed on TeraGrid UV systems.
Materials: Tutorial materials are available at: http://www.nics.tennessee.edu/~tg11-uv-tutorial

An Introduction to the TeraGrid Track 2D Systems: FutureGrid, Gordon, and Keeneland

Presenters: Robert Sinkovits, Renato Figueiredo, Jeffrey Vetter

Level: Introductory

Length: Full day

Time: 8:00am - 5:00pm

Abstract: This tutorial will consist of a series of three sessions, each focusing on one of the Track 2D systems: FutureGrid, Gordon, and Keeneland. Each session will start with a brief overview of the performance metrics and overall organization of the system, with an emphasis on its unique architectural features. The majority of the time, though, will be spent addressing the issues that impact end users and application developers. We will describe the types of codes, algorithms, and middleware that are expected to run well on each system. Where applicable, this will be based on the results of benchmarking studies carried out on the production hardware or early prototypes. We will then cover the use of performance monitoring tools and techniques for achieving better performance. Programming techniques and important features of language extensions and run-time environments (e.g. OpenMP, CUDA, Hadoop/MapReduce) needed to efficiently use the machines will also be described. Each session will conclude with advice on writing a successful allocations proposal, stressing the need to provide a strong justification for access to the requested resource.

Prerequisites:

Requirements: None

A New and Improved Eclipse Parallel Tools Platform: Advancing the Development of Scientific Applications

Presenters: Greg Watson, Beth Tibbitts, Jay Alameda, Galen Arnold, Jeff Overbey

Level: Introductory/Intermediate/Advanced

Length: Full day

Time: 8:00am - 5:00pm

Abstract: Many HPC developers still use command-line tools and tools with disparate, and sometimes confusing, user interfaces for the different aspects of the HPC project life cycle. The Eclipse Parallel Tools Platform (PTP) combines tools for coding, debugging, job scheduling, tuning, revision control, and more into an integrated environment for increased productivity. Leveraging the successful open-source Eclipse platform, PTP helps manage the complexity of HPC scientific code development and optimization on diverse platforms, and provides tools to gain insight into complex code that is otherwise difficult to attain. This tutorial will provide attendees with a hands-on introduction to Eclipse and PTP.

Prerequisites:

Requirements: Bring a laptop and pre-install Eclipse and PTP. See http://wiki.eclipse.org/PTP/tutorials/TG11 for installation instructions.

Opening Reception

Time: 6:00pm - 8:00pm

Abstract: TeraGrid '11 conference chair John Towns will host a discussion with two of the senior statesmen in our community--John Connolly and Sid Karin--to look at the winding road that has brought us to the dawn of the eXtreme Digital program from NSF. They will look at the early days of the NSF Supercomputing Center Program, the PACI Program, and the series of activities collectively known as TeraGrid. Along the way we will get to hear some of the history that has preceded us, reflections on how it all really happened, perspectives on the events of the past 25 years, and perhaps some words of wisdom on where we are going.

Converting a Serial Code to a Parallel AMR Code using the Cactus Computational Framework

Presenters: Steven Brandt, Frank Löffler

Level: Intermediate/Advanced

Length: Full day

Time: 8:00am - 5:00pm

Abstract: This tutorial will demonstrate the process of modifying a serial example code to run in parallel with adaptive mesh refinement using the Cactus Computational Framework and the Carpet adaptive mesh refinement driver. The demonstration code will be suitable for running on either a laptop or a high-end supercomputing resource. The tutorial will explain the basics of Cactus programming and the build system, and showcase examples of large-scale applications that have been constructed with it. We will also walk students through using other features built into the framework, such as checkpoint/restart and remote and parallel visualization. Finally, there will be a clinic session in which we will assist users in beginning the process of adapting their codes to the Cactus/Carpet framework.

Prerequisites:

Requirements: The target audience should consist of computer-proficient scientists able to write basic codes in either C/C++ or Fortran.

Selecting and Using TeraGrid/XD Resources for Maximum Productivity

Presenters: Kimberley Dillman

Level: Introductory to Intermediate

Length: Half day

Time: 1:00pm - 5:00pm

Abstract: The TeraGrid/XD program provides a wide variety of resources to the research and academic community. Due to the varied nature of these resources, it is not always easy or even obvious which resources to select for a project. This tutorial will provide an overview of the requirements that need to be determined and how to find resources that "match" these requirements.
It will also cover some basic information about the computational and data resources available to the user community as well as the tools and information provided to assist in the selection of these resources. Some examples of "matching users to resources" will be provided as well as information on the various methods of accessing the resources. Tips on making the most of the resources selected will also be covered.
Example serial and parallel code as well as access to several resources for a "hands-on" workshop will be provided in the second half of the tutorial. Some of the code examples will highlight the utilization of the DC-WAN file system for data input/output during job execution on the Steele and Condor resources. Usage of the new CUE environment will also be highlighted.

Prerequisites: An active TeraGrid account with access to a compute resource is preferred but training accounts will be made available to those who do not have a current allocation.

Requirements: A laptop with wireless network access and a current web browser, Java 1.5 or better, and software for connecting to TeraGrid machines via SSH.

Using Globus Online to Move Data To/From TeraGrid Resource Providers

Presenters: Steve Tuecke, Bryce Allen, Stuart Martin

Level: Introductory

Length: Half day

Time: 1:00pm - 5:00pm

Abstract: In this tutorial, TeraGrid researchers will learn to use Globus Online (GO) to move data between TeraGrid endpoints, or from a TeraGrid endpoint to the user's local server or laptop. GO provides a solution to large data transfer challenges over GridFTP by providing a robust, reliable, secure, and highly monitored environment for file transfers that has powerful yet easy-to-use interfaces. This hosted service simplifies data movement without requiring construction of custom end-to-end systems. Instructors will provide a brief overview and have participants move data using various interfaces. Attendees will use a web browser for simple transfers and a command line for more advanced tasks. We will show attendees how to use the Globus Connect feature to create a local transfer endpoint (e.g. campus server or home computer/laptop) even if behind a firewall or NAT. We will also teach attendees how to use GO's command-line interface in scripted workflows and how to use GO's REST-style transfer API for programmatic integration with other systems such as web portals and science gateways.

Prerequisites:

Requirements: Attendees will need to bring a laptop

XSEDE/Genesis II Installation, Configuration, and Management

Presenters: Andrew Grimshaw and Karolina Sarnowska-Upton

Level: Intermediate

Length: Half day

Time: 1:00pm - 5:00pm

Abstract: XSEDE introduces a new approach to satisfying user needs. The approach combines an emphasis on interoperability and standards as a means to reduce risk and provide a diversity of software sources, inclusion of campus and research group resources as first-class members of XSEDE, as well as particular attention to non-functional quality attributes such as ease-of-use and availability.
One of the software components that comprises XSEDE is Genesis II. Genesis II will be used in XSEDE initially in three roles: 1) as the software implementing the Global Federated File System that spans the centers, campuses, and research groups, 2) as the mechanism for sharing compute resources located at campuses and research groups, and 3) as one of the access layer components by which users interact with back-end resources.
By the end of the tutorial, the attendees will understand the XSEDE system and resource model, be able to install and configure the client-side tools, and be able to install, configure, and manage XSEDE resources such as compute clusters, shared queues, and shared data.

Prerequisites: Familiarity with basic system administration tasks, accounts, the Unix file system model, shells, etc.

Requirements: Attendees must bring a laptop. Students will need to decide whether they intend to install on Windows, Linux, or Mac (Mac users will have to install on their own machine; we do not have access to MacOS VMs). Each student machine should have an X-Windows server and ssh client. For Windows they will need a remote desktop client.

TeraGrid REST API Tutorial

Presenters: Rion Dooley, Stephen Mock, John-Paul Navarro

Level: Intermediate

Length: Half day

Time: 1:00pm - 5:00pm

Abstract: This hands-on tutorial will educate developers and technology-savvy users on the REST APIs available in the TeraGrid. It primarily targets developers and users who are familiar with web technologies and/or consuming REST services and are interested in using them for their projects.
During the second half of the tutorial, we will be holding a hands-on, learn-as-you-go developer's contest for the most innovative application using TeraGrid REST services. Attendees are not required to participate, but everyone in attendance at the tutorial may do so. The contest will run for 24 hours starting immediately after the mid-tutorial break. Who knows, you may even become famous. Come to the tutorial for more information.

Prerequisites: Attendees should have a familiarity with client-server and/or database driven applications, web technologies, and REST. Other skill sets that may help attendees get the most out of this tutorial are jQuery, Prototype, YUI, PHP, Python, and Java. Familiarity with these additional technologies is beneficial but not required.

Requirements:

First Steps with OpenMP: Parallel Programming for Everyone

Presenters: Amy Szczepanski, Yashema Mack

Level: Introductory

Length: Half day

Time: 1:00pm - 5:00pm

Abstract: OpenMP is an easy way to get started writing code for HPC systems because OpenMP programs can run in parallel on many architectures with multiple cores. This includes desktop computers with multiple cores, larger shared memory systems, and multicore nodes of distributed memory systems. This tutorial is an elementary introduction to OpenMP, from the ground up, assuming only a beginner's level of programming skill, and we present the material in a way that will be valuable to those coming to HPC from all backgrounds. We introduce the concepts behind OpenMP, present several examples that illustrate how to write the code, discuss common pitfalls, show how to compile the code, and explain how to run and monitor the program on an HPC system.
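
As a taste of what these first steps look like in practice, the following minimal C program (an editor's sketch, not taken from the tutorial materials) estimates pi with a single OpenMP parallel loop; the reduction clause combines the per-thread partial sums safely.

    /* pi_omp.c -- minimal OpenMP example: estimate pi by midpoint
     * integration of 4/(1+x^2) over [0,1].
     * Build example: gcc -fopenmp pi_omp.c -o pi_omp */
    #include <stdio.h>
    #include <omp.h>

    int main(void)
    {
        const long n = 100000000;          /* number of intervals */
        const double h = 1.0 / n;
        double sum = 0.0;

        /* Each thread accumulates a private partial sum; OpenMP adds
         * them together at the end of the loop. */
        #pragma omp parallel for reduction(+:sum)
        for (long i = 0; i < n; i++) {
            double x = (i + 0.5) * h;
            sum += 4.0 / (1.0 + x * x);
        }

        printf("pi ~= %.12f (up to %d threads)\n", h * sum,
               omp_get_max_threads());
        return 0;
    }

The number of threads is chosen at run time, for example via the OMP_NUM_THREADS environment variable.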

Prerequisites: Familiarity with a programming language such as C or C++ at the lower division undergraduate level.

Requirements: (Optional) Attendees who wish to work along with our examples should have a text editor of their choice and a compiler that supports the OpenMP API. (See: http://openmp.org/wp/openmp-compilers/.)
Materials: Tutorial materials are available at: http://rdav.nics.tennessee.edu

Scientific Software Optimization and Parallelization

Presenters: Axel Kohlmeyer, Christopher Macdermaid, Raymond Lauff

Level: Introductory to Intermediate

Length: Half day

Time: 1:00pm - 5:00pm

Abstract: With the example of a simple force kernel for molecular dynamics simulations, a number of different optimization and parallelization techniques are discussed and tested. The tutorial intermixes short presentations of specific aspects of the problem with hands-on exercises using provided example codes. The focus is on how the problem motivates the implementation(s).
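
For readers who want a concrete picture of the kind of kernel being optimized, here is a minimal pairwise Lennard-Jones force loop in C with a simple OpenMP parallelization (an editor's sketch with an assumed function name, lj_forces; it is not the tutorial's example code). Newton's third law is deliberately not exploited so that each thread writes only its own force entries and the parallel loop stays race-free; exploiting it, adding cutoffs and neighbor lists, and vectorizing are exactly the kinds of optimizations the tutorial explores.

    /* lj_forces.c -- illustrative O(N^2) Lennard-Jones force kernel.
     * Pair potential: U(r) = 4*eps*((sigma/r)^12 - (sigma/r)^6). */
    void lj_forces(int n,
                   const double *x, const double *y, const double *z,
                   double *fx, double *fy, double *fz,
                   double eps, double sigma)
    {
        const double s2 = sigma * sigma;
        const double s6 = s2 * s2 * s2;

        #pragma omp parallel for schedule(static)
        for (int i = 0; i < n; i++) {
            double fxi = 0.0, fyi = 0.0, fzi = 0.0;
            for (int j = 0; j < n; j++) {
                if (j == i) continue;
                double dx = x[i] - x[j];
                double dy = y[i] - y[j];
                double dz = z[i] - z[j];
                double r2  = dx * dx + dy * dy + dz * dz;
                double ir2 = 1.0 / r2;
                double sr6 = s6 * ir2 * ir2 * ir2;     /* (sigma/r)^6 */
                /* |F|/r = 24*eps*(2*(sigma/r)^12 - (sigma/r)^6)/r^2 */
                double fr = 24.0 * eps * ir2 * sr6 * (2.0 * sr6 - 1.0);
                fxi += fr * dx;  fyi += fr * dy;  fzi += fr * dz;
            }
            fx[i] = fxi;  fy[i] = fyi;  fz[i] = fzi;
        }
    }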

Prerequisites: Basic programming skills in C or Fortran are required, familiarity with basic MPI programming desirable.

Requirements: Programming and benchmarks can be performed on local (laptop) and remote facilities (e.g. TeraGrid).

Science Agency Uses of Clouds and Grids

Presenters:

Level: Intermediate/Advanced

Length: Full day

Time: 8:00am - 5:00pm

Abstract: The "Science Agency Uses of Clouds and Grids" workshop comes as a follow-on to the USDOE-sponsored workshop on High Performance Applications of Cloud and Grid Tools held in April, 2011.

This workshop will serve as a venue to develop formal input to several ongoing standards and software roadmapping efforts, including the European SIENA Cloud Standards Roadmap project and the US National Institute of Standards and Technology's Cloud Computing Business Use Cases group, as well as to OGF's own standards efforts. International contributions to this workshop are encouraged and will be accepted.

Project status updates on cloud and grid framework software efforts and major projects are being solicited for presentation at this workshop. The primary goal of this effort will be to produce a process that will effectively inventory and document essentially all large-scale use of cloud and grid infrastructures for projects that support science agency user communities.

The schedule will begin with an introduction to the goals of the above roadmapping efforts, presented by representatives from these projects and from the workshop sponsors. Materials relating to the inventory aspects of this effort are in preparation and will be distributed before the meeting. The morning session will be used to present the materials gathered to that point, to recruit and gather further information from workshop participants, and to review the gathered material for completeness.

The afternoon session will be devoted to presentations from projects that have not already presented at the previous HPACG workshop. We anticipate special presentations from the Open Science Grid on how to create a new Virtual Organization and a preview of the upcoming evolution of the TeraGrid project into the eXtreme Digital era by representatives in attendance. A special presentation on the Globus Online service and other projects of the Globus group will also be made.

Prerequisites:

Sponsors: This workshop is co-sponsored by the US National IT R&D Large Scale Network (LSN) Middleware And Grid Infrastructure Coordination (MAGIC) Team, Internet2, and the US Department of Energy, with funding from the US DOE Office of Advanced Scientific Computing Research.

TeraGrid 2011 Opening Reception

Bob Borchers, John Connolly, Sid Karin--all of whom were there at the start of the NSF supercomputer centers program--will highlight accomplishments from the past 25 years and will give their thoughts on the future of high-performance computing.

 

MONDAY JULY 18

TIME LEVEL LENGTH PRESENTATION PRESENTER LOCATION TYPE
7:00AM - 6:00PM Registration Pre-Function Area
7:15AM - 8:00AM Breakfast Salons E & F
8:00AM - 12:00PM Introductory to Intermediate Half Day TeraGrid New User Tutorial Philip Blood, Ken Hackworth, Jim Marsteller, Tom Maiden Salon G Tutorial
8:00AM - 12:00PM Intermediate Half Day Preparing for XD: How TG Resource Providers Can Easily Enable Globus Online for Data Movement Borja Sotomayor, Rajkumar Kettimuthu, Stuart Martin Salon H Tutorial
8:00AM - 12:00PM Intermediate Half Day XSEDE Client Installation and Use - From the Global Federated File System to Running Jobs Andrew Grimshaw, Karolina Sarnowska-Upton Salon J Tutorial
8:00AM - 12:00PM Intermediate to Advanced Half Day Building Science Gateway Applications on Cyberinfrastructure Yan Liu, Shaowen Wang, Raminder Singh, Suresh Marru, Marlon Pierce, Nancy Wilkins-Diehr Deer Valley 1 Tutorial
8:00AM - 12:00PM Advanced Half Day Getting the Most out of the TG SGI Altix UV Systems Scott Simmerman, Mahin Mahmoodi, Galen Arnold, Pragnesh Patel, Amy Szczepanski Deer Valley 2 Tutorial
12:00PM - 1:00PM Lunch Buffet Salons E&F
8:00AM - 5:00PM Introductory Full Day An Introduction to the TG Track 2D Systems: FutureGrid, Gordon, & Keeneland Robert Sinkovits, Renato Figueiredo, Jeffrey Vetter Salon I Tutorial
8:00AM - 5:00PM Introductory/ Intermediate/ Advanced Full Day A New & Improved Eclipse Parallel Tools Platform: Advancing the Development of Scientific Applications Greg Watson, Beth Tibbitts, Jay Alameda, Galen Arnold, Jeff Overbey Snowbird Tutorial
8:00AM - 5:00PM Intermediate to Advanced Full Day Converting a Serial Code to a Parallel AMR Code Using the Cactus Computational Framework Steven Brandt, Frank Löffler Brighton Tutorial
8:00AM - 5:00PM Intermediate to Advanced Full Day Science Agency Uses of Clouds and Grids   Solitude Tutorial
2:15PM - 2:30PM Break Salons E & F
1:00PM - 5:00PM Introductory to Intermediate Half Day Selecting and Using TG/XD Resources for Maximum Productivity Kimberley Dillman Salon G Tutorial
1:00PM - 5:00PM Introductory Half Day Using Globus Online to Move Data to/from TG Resource Providers Steve Tuecke, Bryce Allen, Stuart Martin Salon H Tutorial
1:00PM - 5:00PM Intermediate Half Day XSEDE/Genesis II Installation, Configuration, and Management Andrew Grimshaw and Karolina Sarnowska-Upton Salon J Tutorial
1:00PM - 5:00PM Intermediate Half Day TeraGrid REST API Tutorial Rion Dooley, Stephen Mock, John-Paul Navarro Deer Valley 1 Tutorial
1:00PM - 5:00PM Introductory Half Day First Steps with OpenMP: Parallel Programming for Everyone Amy Szczepanski, Yashema Mack Deer Valley 2 Tutorial
1:00PM - 5:00PM Introductory to Intermediate Half Day Scientific Software Optimization & Parallelization Axel Kohlmeyer, Christopher Macdermaid, Raymond Lauff Alta Tutorial
6:00PM - 8:00PM Opening Reception and Panel Discussion: Sid Karin and John Connolly Grand Ballroom Salon E & F

(Invited talk) Kinetic Simulation of Magnetic Fusion Plasmas on High Performance Computing Platforms

Speaker:

W. W. Lee (Princeton Plasma Physics Laboratory, Princeton University)


Time:

10:00am - 12:00pm


Abstract:

Rapid progress has been made in recent years in the area of numerical simulation of magnetic fusion plasmas on modern massively parallel computers. These self-consistent simulations are based on the kinetic description of the plasma via the Boltzmann (Vlasov) equation and the associated electromagnetic Maxwell's equations. The basic algorithms involve 1) Monte-Carlo type calculations for the particle trajectories, 2) gather and scatter operations for obtaining the information necessary for the time advance of the particles, i.e., particle pushing, and 3) iterative solutions of Maxwell's equations. This type of numerical procedure, which has been dubbed Particle-In-Cell (PIC) simulation in the plasma physics community, enables us to solve a 3-dimensional problem in configuration space (x) by carrying out particle pushing in Lagrangian coordinates rather than solving the problem on a 6-dimensional phase-space (x, v) Eulerian grid. As such, PIC codes, with their reduced dimensionality, are easily parallelizable on MPP platforms using OpenMP and MPI protocols with domain decomposition.
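
To make the three phases above concrete, the following is a schematic one-dimensional electrostatic PIC time step in C (an editor's sketch in normalized units with an assumed function name, pic_step, and a deliberately crude periodic field solve; the gyrokinetic production codes discussed in this talk are far more sophisticated):

    /* pic_step_1d.c -- schematic 1D electrostatic particle-in-cell step:
     * scatter (charge deposition), field solve, then gather and push. */
    #include <math.h>

    void pic_step(int np, double *xp, double *vp,       /* particles       */
                  int ng, double *rho, double *efield,  /* grid quantities */
                  double lbox, double qm, double dt)    /* box, q/m, dt    */
    {
        const double dx = lbox / ng;

        /* 1. Scatter: deposit particle charge onto the grid with linear
         *    (cloud-in-cell) weighting. */
        for (int i = 0; i < ng; i++) rho[i] = 0.0;
        for (int p = 0; p < np; p++) {
            double s = xp[p] / dx;
            int    j = (int)floor(s);
            double w = s - j;                /* fraction toward cell j+1 */
            rho[j % ng]       += (1.0 - w) / dx;
            rho[(j + 1) % ng] += w / dx;
        }

        /* 2. Field solve: integrate Gauss's law dE/dx = rho on the
         *    periodic grid and remove the mean field. */
        efield[0] = 0.0;
        for (int i = 1; i < ng; i++)
            efield[i] = efield[i - 1] + dx * rho[i - 1];
        double mean = 0.0;
        for (int i = 0; i < ng; i++) mean += efield[i] / ng;
        for (int i = 0; i < ng; i++) efield[i] -= mean;

        /* 3. Gather and push: interpolate E back to each particle and
         *    advance velocity and position (leapfrog), with periodic
         *    wrap-around of the positions. */
        for (int p = 0; p < np; p++) {
            double s = xp[p] / dx;
            int    j = (int)floor(s);
            double w = s - j;
            double e = (1.0 - w) * efield[j % ng] + w * efield[(j + 1) % ng];
            vp[p] += qm * e * dt;
            xp[p] += vp[p] * dt;
            xp[p]  = fmod(xp[p] + lbox, lbox);
        }
    }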

However, the original Vlasov-Maxwell system of equations has many undesirable numerical properties in terms of small time steps and grid size, and enhanced numerical noise, when applying the PIC methods for studying magnetic fusion problems. For the last twenty-plus years, researchers in the fusion community 1) have systematically developed a new reduced description for the Vlasov-Maxwell system based on the gyrokinetic ordering of ρ/L << 1, where ρ is the particle gyroradius and L is the scale length of the background magnetic field, 2) have made use of magnetic flux coordinates with straight field lines for more efficient descriptions of the plasma dynamics, and 3) have devised various perturbative particle simulation techniques to reduce the intrinsic particle noise and to expedite the time advance of the particles. These advances, together with the recent close collaborations with the applied mathematics, computer science and scientific visualization communities, have not only enabled us to gain a better understanding of the complex behavior of fusion plasmas but have also put fusion research at the forefront of high performance computing.

This talk will emphasize 1) the theoretical work that makes it possible to carry out petascale simulations on modern-day supercomputers with hundreds of thousands of cores and billions of particles, 2) the invaluable simulation results obtained, which can now be compared with experimental observations, as well as 3) the future prospects of global PIC codes for low- and high-frequency gyrokinetics.

Open-source Astrophysics: the Enzo Community Code

Authors:

Brian O'Shea (Michigan State University), Matthew Turk (Columbia University), Michael Norman (University of California San Diego/SDSC), Greg Bryan (Columbia University)


Time:

10:00am - 12:00pm


Abstract:

Enzo is an adaptive mesh refinement code for doing computational astrophysics--primarily, but not exclusively, cosmological structure formation. Enzo has dozens of users, more than two dozen user-developers at 10 institutions, has been the primary computational tool for more than 25 PhD theses, and will be one of the first codes to run on NSF's upcoming Blue Waters platform. In this talk, I will discuss the evolution of the Enzo code and user community from a very small, closed collaboration to the fully open-source, widely used simulation tool that it is today. I will discuss the challenges (and advantages) of distributed code development, particularly in the case where all of the people doing development are doing so part-time. I will also present some of the most recent scientific results from Enzo calculations, on topics ranging from the formation of the first generation of stars in the universe to the evolution of massive clusters of galaxies, and will discuss our plans for Blue Waters and future development efforts.

Petascale Kinetic Simulation of the Magnetosphere

Authors:

Homa Karimabadi (SciberQuest Inc. and University of California San Diego), Hoanh Vu (University of California San Diego), Burlen Loring (Lawrence Berkeley National Lab), Yuri Omelchenko (University of California San Diego), Tamara Sipes (SciberQuest Inc.), Vadim Roytershteyn (University of California San Diego), William Daughton (Los Alamos National Laboratory), Mahidhar Tatineni (University of California San Diego/SDSC), Amit Majumdar (University of California San Diego/SDSC), Umit Catalyurek (Ohio State University), Alper Yilmaz (Ohio State University)


Time:

10:00am - 12:00pm


Abstract:

The term space weather has been coined to describe the conditions in space that affect the Earth and its technological systems, such as global positioning system satellites, geosynchronous communication and weather satellites, large electric power grids on the ground, and navigation and communication systems through the ionosphere. Understanding the complex interactions of the solar wind with the Earth's magnetosphere, and how the mass, momentum and energy of the solar wind are transferred to the Earth's magnetosphere, is not only of scientific interest but has direct implications for the development of accurate space weather forecasting models. The Earth's magnetic field provides a shield from the effects of space weather, but this shield is not perfect, and a physical process known as magnetic reconnection enables the solar wind to enter the magnetosphere. Reconnection is triggered on electron scales but it leads to global rearrangement of magnetic fields and particles. The details of magnetic reconnection and its effect on the magnetosphere are not well understood. Currently, global simulations are predominantly based on single-fluid magnetohydrodynamics (MHD). MHD simulations have proven useful in studies of the global dynamics of the magnetosphere, with the goal of predicting prominent features of substorms and other global events, but the magnetosphere is dominated by ion kinetic effects, and many key aspects of the magnetosphere relating to transport and the structure of boundaries have had to wait for global kinetic simulations. Some regions, such as the ion foreshock or ring currents, do not even form in MHD.

Given the dominance of ion kinetic effects in the magnetosphere, it has long been desired to perform 3D global hybrid (electron fluid, kinetic ions) simulations of the magnetosphere. However, a major impediment to such simulations has been the high computational cost. The ion skin depth in the solar wind, which is resolved in hybrid codes, is ~1/25-1/50 RE (RE, the Earth's radius, is ~6400 km), while the ion inverse gyrofrequency is ~1 second. A global hybrid simulation would need to resolve such scales in a box extending 256 x 64 x 64 RE and run for an equivalent of several hours of magnetospheric time. As a result, carrying out such calculations has remained an elusive goal, but one that can be overcome with petascale computing, as we have been demonstrating in the past year or so. These important simulations are opening new horizons for scientific understanding of the Sun-Earth system and will, for the first time, enable exploration of many long-standing issues that are forever out of reach of MHD simulations. Examples include self-consistent formation of ring currents, access of ionospheric ions to various parts of the magnetosphere, the effect of ionospheric oxygen ions on reconnection, and mass filtering effects at the magnetopause, among others.
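
A rough back-of-the-envelope estimate, using only the numbers quoted above, shows why this is a petascale problem: resolving the solar-wind ion skin depth over a 256 x 64 x 64 RE box implies roughly

    N_{\mathrm{cells}} \approx \frac{256 \times 64 \times 64 \, R_E^3}{(\Delta x)^3}
      \approx 1.6 \times 10^{10} \ (\Delta x = R_E/25)
      \quad\text{or}\quad 1.3 \times 10^{11} \ (\Delta x = R_E/50),

while several hours of magnetospheric time at an ion gyro-period of about one second corresponds to on the order of 10^4 gyro-periods, each requiring multiple time steps.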

Our multi-disciplinary research team has created the most accurate picture to date of the reconnection process and its consequences in the magnetosphere, using machine-scale runs on Kraken at NICS.

We have performed a scaling study of the hybrid codes on the Kraken machine using up to ~98K cores and will present the scaling results using the metric of effective particle pushes per second per processor, which matches the underlying computational metrics of the physics study. Our production runs have been carried out regularly at ~98K cores of Kraken in the past year. We have done significant parallel I/O optimization and implementation and have achieved a ~19 GB/sec I/O rate on Kraken's Lustre file system, which is two-thirds of the theoretical peak I/O rate achievable on that file system. Our simulations produce many tens of terabytes of data, and we have applied data mining and computer vision techniques to visualize the data and to observe and track new physics events, such as Flux Transfer Events (FTEs), among others. These are of immense importance in space plasma physics and are only now possible to simulate and observe due to petascale computing. The breakthrough science results, computational analysis (scaling, I/O) results, and novel approaches to visualization and analysis of massive data sets will be presented.

On the Density Distribution in Star-Forming Interstellar Clouds

Authors:

Alexei Kritsuk (University of California San Diego), Michael Norman (University of California San Diego/SDSC), Rick Wagner (University of California San Diego/SDSC)


Time:

10:00am - 12:00pm


Abstract:

We use deep adaptive mesh refinement simulations of isothermal self-gravitating supersonic turbulence to study the imprints of gravity on the mass density distribution in molecular clouds. The simulations show that the density distribution in self-gravitating clouds develops an extended power-law tail at high densities on top of the usual lognormal. We associate the origin of the tail with self-similar collapse solutions and predict the power index values in the range from -7/4 to -3/2 that agree with both simulations and observations of star-forming molecular clouds.
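
For orientation, a standard way to write the non-gravitating baseline (the parameterization below is the editor's, not quoted from the abstract) is a lognormal PDF of the logarithmic density s = ln(rho/rho_0),

    p(s)\,\mathrm{d}s = \frac{1}{\sqrt{2\pi\sigma_s^2}}
        \exp\!\left[-\frac{(s-\bar{s})^2}{2\sigma_s^2}\right]\mathrm{d}s ,

whereas in the self-gravitating runs described here the high-density side develops a power-law tail whose index the authors place between -7/4 and -3/2.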

DataONE Member Node Pilot Integration with TeraGrid

Authors:

Nicholas Dexter (University of Tennessee), John Cobb (Oak Ridge National Laboratory), Matt Jones (UC Santa Barbara), Mike Lowe (Indiana University Pervasive Technology Institute), Dave Vieglais (University of Kansas)


Time:

10:00am - 12:00pm


Abstract:

The NSF DataONE DataNet project and the NSF TeraGrid project have started a pilot collaboration to deploy and operate the DataONE Member Node software stack on TeraGrid infrastructure. The appealing feature of this collaboration is that it opens up the possibility to add large scale computing as an adjunct to DataONE data, metadata, and workflow manipulation and analysis tools. Additionally, DataONE data archive and curation services are exposed as an option for large scale computing and storage efforts such as TeraGrid. Beginning with an April 1st, 2011 allocation, the DataONE Core Cyberinfrastructure Team has been working with the IU Quarry virtual hosting service and more generally with the TeraGrid data area to implement this pilot.

The implementation includes multiple virtual servers in order to test different reference implementations of the common DataONE Member Node RESTful web-service functions. These include a Metacat server implementation as well as the Python Generic Member Node developed by DataONE. The implementations will also mount TeraGrid-wide global storage services (DC-WAN and Albedo) and thus allow integration of the input and output of large-scale computational runs with wide-area archival data and metadata services.

Initial results show a smooth and easy Metacat configuration and implementation, due mainly to concise and clean interface designs on the part of both DataONE and the Quarry VM infrastructure. We are also testing mounted TeraGrid-wide storage services such as DC-WAN and Albedo, and will initiate data replication from DataONE nodes external to TeraGrid to Quarry member nodes. Additionally, we will present preliminary results on the scale and performance of replication and science metadata consistency traffic.

We are also exploring using the Quarry DataONE Member Node implementation in production to simplify and accelerate the transport of results data to the Cornell Lab of Ornithology for the 2012 eBird/State of the Birds statistical analyses that will be conducted on TeraGrid resources.

A Roadmap for Using NSF Cyberinfrastructure with InCommon

Authors:

William Barnett (Indiana University), Von Welch (Indiana University), Alan Walsh (Indiana University), Craig Stewart (Indiana University)


Time:

10:00am - 12:00pm


Abstract:

The InCommon Federation is an identity federation serving the higher education community. Prior work by the authors has demonstrated the successful use of InCommon to access the TeraGrid. Based on a desire from the broader community for more guidance on the use of federated identity for cyberinfrastructure (CI), the "InCommon Roadmap for NSF CI" provides guidance and practical how-to information for using the InCommon identity federation to enable researchers to access NSF CI. It is intended for both technical and non-technical readers, avoiding detail by referencing existing InCommon documentation whenever possible. It presents the big picture of using InCommon for CI, including benefits, alternatives, challenges, and guidance in overcoming the challenges.

The Roadmap has three main sections. The first, "Benefits, Challenges and Overview," is intended for campus and project leadership, scientists and engineers using CI. It provides a summary of InCommon, relevant technologies and the benefits and challenges their adoption brings. The second section is intended for information technology professionals, from campuses and NSF cyberinfrastructure projects, and is a guide for deployment of InCommon software and services. The third section is intended for managers and policy makers, and discusses the policy, privacy, financial and other factors of InCommon deployment. The document is dual-tracked in that it provides guidance both for staff from campuses and NSF cyberinfrastructure projects.

Our presentation will focus on using InCommon from the perspective of a CI project. The presentation will cover benefits, alternatives, challenges, and current best practices.

(Invited talk) The XSEDE Architecture -- A Renewed Emphasis on Quality Attributes

Speaker:

Andrew Grimshaw (University of Virginia)


Time:

10:00am - 12:00pm


Abstract:

N/A

(Invited talk) An API to feed the World

Speaker:

Rion Dooley (University of Texas/TACC)


Time:

10:00am - 12:00pm


Abstract:

N/A

UltraScan Gateway Enhancements through TeraGrid Advanced User Support

Authors:

Borries Demeler (Department of Biochemistry, University of Texas Health Science Center), Raminderjeet Singh (Pervasive Technology Institute at Indiana University), Marlon Pierce (Pervasive Technology Institute at Indiana University), Suresh Marru (Indiana University) and Emre H Brookes (Department of Biochemistry, University of Texas Health Science Center)


Time:

10:00am - 12:00pm


Abstract:

The gateway provides an interface between the UltraScan modeling software, TeraGrid and local computational resources for the evaluation of experimental analytical ultracentrifuge data. The UltraScan gateway is consistently among the top five gateway community account users, but its continued growth and sustainability needed additional support. In this paper we describe the enhancements to the gateway middleware infrastructure provided through the TeraGrid Advanced User Support (AUS) program. These advanced support efforts primarily focused on a) expanding and updating the number of TeraGrid resources used by UltraScan; b) upgrading UltraScan's job management interfaces to use GRAM5 in place of the deprecated WS-GRAM; c) providing realistic parallel computing usage scenarios to the GRAM5 and INCA resource testing and monitoring teams; d) creating general-purpose, resource-specific, and UltraScan-specific error handling and fault tolerance strategies based on production usage; and e) providing forward and backward compatibility for the job management system between UltraScan's version 2 (currently in production) and version 3 (expected to be released mid-2011).

Developing an Integrated End-to-end TeraGrid Climate Modeling Environment

Authors:

Lan Zhao (Purdue University), Carol Song (Purdue University), Cecelia Deluca (NOAA/CIRES) and Don Middleton (NCAR)


Time:

10:00am - 12:00pm


Abstract:

The Community Earth System Model (CESM) is a widely used community model for studying the Earth's climate system. The CESM model is both data- and computationally intensive, and its software is complex, hence it is difficult for users to set up and run CESM simulations using local resources. In this paper, we describe an integrated climate modeling environment that supports CESM simulations on the TeraGrid, online data analysis, and automatic archival of model data and metadata. This system builds upon and integrates several existing efforts -- the Purdue CCSM modeling portal, the Earth System Grid, the Earth System Modeling Framework, and the Earth System Curator. We present the design and implementation of our prototype system as well as an end-to-end usage scenario, which is broken down into three workflows: model execution, data publishing, and metadata collection/publishing. The system will be used to support research and education on climate systems. We describe our plan and early efforts to engage users and obtain their feedback.

A CyberGIS Gateway Approach to Interoperable Access to the National Science Foundation TeraGrid and the Open Science Grid

Authors:

Anand Padmanabhan (University of Illinois at Urbana-Champaign), Shaowen Wang (University of Illinois at Urbana-Champaign) and John-Paul Navarro (Argonne National Laboratory)


Time:

10:00am - 12:00pm


Abstract:

The vision of creating a "virtual supercomputing" environment to solve large-scale scientific problems has largely been facilitated by the development and deployment of Grid middleware. However, with the deployment of multiple disconnected Grid environments, we are now faced with the problem of interoperable access to resources from multiple environments to meet the requirements of scientific applications. Within the U.S. cyberinfrastructure environments, both the National Science Foundation TeraGrid and the Open Science Grid (OSG) provide varied but important capabilities and resources needed by diverse computational communities. Hence, it is critical to understand how these communities can benefit from bridging these different environments and utilize them when needed. In this paper we present a novel approach to providing users with interoperable access to both OSG and TeraGrid through the CyberGIS Gateway -- an online geographic information system. In particular, five key bridging themes are described: authentication/authorization, information services, data management, computation management, and auditing. We take a scientific application use case (viewshed analysis) on the CyberGIS Gateway to demonstrate how it is able to leverage resources on both TeraGrid and OSG.

Extending BioVLab Cloud Workbench to a TeraGrid Gateway

Authors:

Suresh Marru (Indiana University), Marlon Pierce (Indiana University), Patanachai Tangchaisin (Indiana University), Heejoon Chae (Indiana University), Kenneth Nephew (Indiana University) and Sun Kim (Indiana University)


Time:

10:00am - 12:00pm


Abstract:

BioVLab is an Open Gateway Computing Environments (OGCE)-based system that is currently used as a reconfigurable cloud computing workbench.

Reducing the Barrier to Entry Using Portable Apps

Authors:

Dirk Colbry (Michigan State University)


Time:

10:00am - 12:00pm


Abstract:

An increasing number of turnkey, domain specific software packages are available to help users take advantage of advanced cyber-infrastructure and resources such as TeraGrid. However, novice users of cyber-infrastructure are often overwhelmed by the complexities of using cyber-infrastructure. For instance, the user may need to install multiple software tools just to connect with advanced hardware, and successfully installing and navigating this software frequently requires the use of Command Line Interfaces (CLI) that are unfamiliar to novice users. Even when applications provide a Graphical User Interface (GUI), special software (such as an X11 server) may be required to use the interface. Installing, configuring and running this software is generally a multi-step process that can be overly confusing to novice users and presents a barrier to entry, particularly in research domains not traditionally associated with advanced computation.

Scientific gateways (such as the TeraGrid Portal) are one possible solution to this problem. However, not all research projects or High Performance Computing (HPC) centers have the resources necessary to provide scientific gateways. We have developed an alternative solution: a "plug and play" HPC system portal stored on a USB thumb drive. The thumb drive contains all the software necessary to connect to traditional cyber-infrastructure and all programs run directly from the thumb drive -- no installation or setup is required. To access the software from a Windows-based machine, the user simply connects the thumb drive and runs the desired programs. The current thumb drive includes all the typical software necessary to connect to an HPC resource, such as x11, ssh, and scp. Since the software is pre-installed on the drive, it can also be preconfigured with the necessary preferences required to immediately connect to the resource.

This presentation will describe the development process for the "Portable Apps" HPC thumb drive, including lessons learned and suggestions for adapting the paradigm for other systems. The Portable Apps drive has been successfully distributed to both expert and novice HPC users at Michigan State University (MSU) and has proved to be a popular and easy-to-use tool for accessing local and national cyber-infrastructure resources, including TeraGrid. This presentation will offer specific suggestions for adapting the Portable Apps idea and developing similar outreach and educational tools for other institutions and resources.

Educational Virtual Clusters for On-demand MPI/Hadoop/Condor in FutureGrid

Authors:

Renato Figueiredo (University of Florida), David Wolinsky (University of Florida) and Panoat Chuchaisri (University of Florida)


Time:

10:00am - 12:00pm


Abstract:

FutureGrid (www.futuregrid.org) is an experimental computational testbed slated to become part of the TeraGrid infrastructure. FutureGrid provides unique capabilities that enable researchers to deploy customized environments for their experiments in grid and cloud computing. A key enabling technology for this is virtualization and the provisioning of Infrastructure-as-a-Service (IaaS) through cloud computing middleware. A previous presentation on TeraGrid 2010 has described the core technology that forms the basis for FutureGrid training and education activities -- a system which leverages virtualization, cloud computing and self-configuring capabilities to create self-contained, flexible, plug-and-play "educational appliances".

In this abstract, and in the presentation at the conference, we will describe and demonstrate how educational virtual appliances are automatically self-configured to enable on-demand deployment of three popular distributed/cloud computing stacks (Condor, MPI, and Hadoop) within and/or across FutureGrid sites. The main benefit of this capability to educational activities is that isolated virtual clusters for individuals or classes can be created on demand by instructors (or students themselves), and the entire middleware is self-configured, allowing activities to focus on the application of distributed/cloud computing platforms rather than the configuration of middleware. The provisioning time of an isolated, educational virtual cluster is of the order of a few minutes. It requires no configuration from instructors or students other than joining a Web 2.0 site to create and manage a group for their class.

In terms of design and implementation, the on-demand educational virtual cluster uses Condor as the underlying scheduler by leveraging previous work on the Grid Appliance. Condor is then used to dispatch MPI and Hadoop daemons -- users create on-demand MPI or Hadoop virtual clusters by submitting Condor jobs. MPI tasks are submitted together with the job that creates an MPI ring, while Hadoop tasks are submitted to the virtual cluster using standard Hadoop tools.

The integration with FutureGrid is such that new users can deploy a single virtual machine on FutureGrid (using Nimbus or Eucalyptus), and the machine automatically connects to a small "sandbox" public shared resource pool where the user is able to go through simple tutorials and interact with a virtual cluster with short turn-around time. Users can also deploy private virtual clusters by creating a group using a simple Web 2.0 user interface.

At the conference, the presentation of this abstract will describe the design principles, architecture, and deployment experiences with on-demand virtual clusters based on virtual appliance technologies used in education and training in FutureGrid. This includes an overview of the underlying virtual appliance technology, and the process of automatic configuration of Condor, MPI and Hadoop on-demand. In addition to training/education, it is expected that these techniques can be of interest to a broader TeraGrid'11 audience because they can be applied for the creation of virtual clusters on demand to support other applications and experiments on FutureGrid and TeraGrid. The presentation will describe examples of using these appliances in hands-on educational activities, including a TeraGrid introduction to MPI module, with a brief hands-on demonstration using the presenter's laptop. Attendees of this presentation will be exposed to the process of how they would go about creating a virtual cluster of their own on FutureGrid for educational activities involving Condor, MPI and/or Hadoop.

The Shape of the TeraGrid: Social Network Analysis of an Affiliation Network of TeraGrid Users and Projects

Authors:

Richard Knepper (Indiana University)


Time:

10:00am - 12:00pm


Abstract:

I examine the makeup of the users and projects of the TeraGrid using social network analysis techniques. Analyzing the TeraGrid as an affiliation (two-mode) network allows for understanding the relationships among types of users, fields of science, and the allocation sizes of projects. The TeraGrid data show that while less than half of TeraGrid users are involved in projects that are connected to each other, a considerable core of the TeraGrid emerges that constitutes the most commonly related projects. The largest complete subgraph of TeraGrid users and projects constitutes a denser and more centralized network core of TeraGrid users. I perform latent network analysis on the largest complete subgraph in order to identify additional groupings of projects and users within the TeraGrid. This analysis of users and projects provides substantive information about the connections of individual scientists, project groups, and fields of science in a large-scale environment that incorporates both competition and cooperation between actors.
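
For readers unfamiliar with two-mode analysis, the basic construction is standard (this formulation is the editor's, not quoted from the paper): with a binary incidence matrix A whose rows index users and whose columns index projects (A_{ij} = 1 if user i participates in project j), the one-mode projections

    (A A^{\mathsf{T}})_{ik} = \text{number of projects shared by users } i \text{ and } k ,
    \qquad
    (A^{\mathsf{T}} A)_{jl} = \text{number of users shared by projects } j \text{ and } l ,

give the user-user and project-project co-affiliation networks whose connectivity and core structure are examined in the analysis above.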

Building Cyberlearning Communities at the Crossroads: Successes, Challenges, Lessons Learned, and the Extreme Road Ahead

Authors:

Jeff Sale (University of California San Diego/SDSC), Ange Mason (University of California San Diego/SDSC), and Diane Baxter (University of California San Diego/SDSC)


Time:

10:00am - 12:00pm


Abstract:

The Internet is vastly different from what it was only 10 years ago. Technologies such as grids, clouds, and smart phones and the applications they support are changing rapidly. We must find more effective ways to keep pace with these advances and to educate ourselves as we pursue our research. Three years ago, in an effort to respond to this challenge, the National Science Foundation (NSF) established a task force on cyberlearning that included many of the leading researchers in distributed learning, and they made the following recommendations:

- Help build a vibrant cyberlearning field by promoting cross-disciplinary communities of cyberlearning researchers and practitioners
- Instill a "platform perspective"--shared, interoperable designs of hardware, software, and services--into the NSF's cyberlearning activities
- Emphasize the transformational power of information and communications technology for learning, from K to grey
- Adopt programs and policies to promote open educational resources
- Take responsibility for sustaining NSF-sponsored cyberlearning innovations

Since then, the NSF appears to have listened and is beginning to support cyberlearning communities of users with common goals and interests.

The TeraGrid serves a community of over 4,000 users distributed across the nation and the world. Each one of these users is or has been or will be a cyberlearner in one capacity or another. Online distributed learning is an essential part of every TeraGrid user's efforts to use the grid better. Within this comparatively large community exist multiple potential sub-communities made up of users who share common interests in one or more aspects of high-performance computing (HPC). Many have already undertaken the process of forming their own communities in the form of TeraGrid Gateways using grid toolkits to integrate directly with the TeraGrid. Some of these already include useful education and training components.

Other communities do not require that level of grid portal technology and prefer a more user-friendly, customizable portal interface with a large user base, numerous extensions, and strong social network support. We are currently involved in the design, development, implementation, and administration of portals for six such communities, including the TeraGrid Campus Champions, MSI-CIEC, the HPC University, the SDSC TeacherTECH, StudentTECH, and Discovering Data in Education communities.

In this talk we share experiences and lessons learned from several years of working to nurture and grow some of these sub-communities online. These include some great successes and others that are still works in progress. Each is unique, but we have developed some overarching principles useful in guiding our decision-making when considering new cyberlearning communities. These are:

  • Obtain adequate funding (develop a clear idea of what this is and make a case for it; identify both start-up funding and support methods for continuation)
  • If you build it, they will not necessarily come (plan a strategy to build interest and create synergy, and assign specific tactics to identified individuals)
  • Know your communities' technical limitations (and your own)
  • Know your communities' online social skills and inclinations (ask them how they like to interact; use surveys if possible to learn more)
  • Identify community leaders, synergy catalysts, and change agents, and give them incentives if possible (both leaders and incentives are essential for adoption)
  • Identify examples of appropriate uses of social networking technologies, tied to the community culture
  • Practice what you preach (or find a partner who does)

(Invited talk) High Performance Computational Nanotechnology for Research and Education on nanoHUB.org

Speaker:

Gerhard Klimeck (Purdue University)


Authors:

Gerhard Klimeck, Krishna P.C. Madhavan, Lynn Zentner, George B. Adams III, Victoria Farnsworth, Michael Zentner, Nathan Denny, Mathieu Luisier, Sebastian Steiger, Michael Povolotsky, Hong-Hyun Park, Tillmann C. Kubis (Network for Computational Nanotechnology, Purdue University)


Time:

2:45pm - 5:15pm


Abstract:

Established in 2002, the Network for Computational Nanotechnology (NCN) is an NSF infrastructure and research network. NCN supports the National Nanotechnology Initiative with nanoHUB.org, a highly successful cyber-community for theory, modeling, and simulation now serving over 170,000 researchers, educators, students, and professionals annually. We seek to 1) engage an ever-larger and more diverse cybercommunity sharing novel, high-quality research and educational resources that spark new modes of discovery, innovation, learning, and engagement; 2) accelerate the transformation of nanoscience to nanotechnology through tight linkage of simulation to experiment; 3) develop open-source software; and 4) inspire and educate the next workforce generation.

nanoHUB.org now delivers over 200 online simulation tools and accesses national computing resources. These tools range in their computational demands from seconds on a single core to hours on hundreds of cores. The highly scalable nanoelectronic modeling tools OMEN and NEMO5 scale to hundreds of thousands of cores and also power several tools on nanoHUB that have served over 6,000 users. 719 papers in the scholarly literature cite nanoHUB.org; in turn, these 719 papers are cited 5.6 times on average, for a total of 4,013 citations. These citations give nanoHUB an h-index of 30, demonstrating high research quality. Experimental data is reported alongside simulation results in 176 (29%) of the 605 nano research papers. nanoHUB impacts experimental research!
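For readers unfamiliar with the h-index quoted above, the following is a minimal sketch of the standard calculation applied to an illustrative list of per-paper citation counts; it is not tied to the actual nanoHUB citation data.

```python
# Minimal sketch of the standard h-index calculation; the citation
# counts below are illustrative only.
def h_index(citations):
    """Largest h such that at least h papers have >= h citations each."""
    counts = sorted(citations, reverse=True)
    h = 0
    for rank, c in enumerate(counts, start=1):
        if c >= rank:
            h = rank
        else:
            break
    return h

print(h_index([10, 8, 5, 4, 3]))  # -> 4
```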

nanoHUB.org provides over 2,300 educational resources, such as tutorials and courses, that are offered for self-study, to augment traditional courses, and to serve as models for new ways of education. We count around 1,000 downloads monthly from the Beyond Campus section of iTunes U. Since January 2010 we have contributed high-quality animations to Wikipedia and are averaging almost 3,000 monthly visitors arriving via Wikipedia. In calendar year 2010, 135 graduate and undergraduate classes at 81 institutions made use of nanoHUB.

This presentation will provide an overview of aspects of HPC for nanoelectronics, its delivery to the masses, and the development of impact metrics.

Investigation of Substrate-recognition and the Active-site Stability of CYP2B4 and CYP3A4 X-ray Structures Using Large-scale Molecular Dynamics Simulations on TeraGrid Resources

Authors:

Kiumars Shahrokh (University of Utah), Garold S Yost (University of Utah), Thomas Cheatham (University of Utah)


Time:

2:45pm - 5:15pm


Abstract:

Recent X-ray crystallographic structures of mammalian P450 enzymes with and without bound substrates demonstrate a profound conformational plasticity of the enzyme active site during P450-substrate binding. Because the crystal structures impose particular packing depending on the crystallization conditions, and because each structure is time averaged, it is unclear what the functional significance of these structural changes is. To provide greater insight into the influence of classically defined substrate recognition in the context of these plastic regions in the structures, we applied large-scale molecular dynamics simulations of various P450 enzymes with varied substrates in explicit solvent on TeraGrid resources. A key question is whether these resources are sufficient, defined in terms of the effective computational power, conformational sampling, and force field performance, to model the plasticity and conformational influences of ligand interaction. A series of molecular dynamics (MD) simulations using the AMBER ff99SB and ff99SBildn force fields was performed on different crystal structures of two microsomal P450s shown to adopt multiple conformations: CYP2B4, which displays some of the largest-scale rearrangements yet observed in P450 enzymes in response to changes in the X-ray crystallography conditions and choice of bound ligand, and CYP3A4, the major drug-metabolizing P450 enzyme, which demonstrates smaller-scale but similar rearrangements. To assess and validate the AMBER protein force fields, including newly developed heme parameters across the full catalytic cycle of the heme, simulations included ligand-free monomeric units of 2B4 in the "open" structure from the crystallographic dimer (PDB code: 1PO5), as well as "closed" structures bound to various ligands: 4-(4-chlorophenyl)imidazole (PDB code: 1SUO) and bifonazole (PDB code: 2BDM), plus various permutations in which the drugs were swapped and docked into the alternative structures. Similar permutations of ligand switching for ligand-bound and ligand-free structures have also been performed for CYP3A4 structures: the closed ligand-free form (PDB code: 1TQN), and the forms bound to erythromycin (PDB code: 2JOD), ketoconazole (PDB code: 2VOM), and ritonavir (PDB code: 3NXU). A considerable challenge is in supporting the simulation workflow and data analysis for the large series of molecular dynamics runs.
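The run matrix implied by these structure, ligand, and force-field permutations can be enumerated mechanically; the sketch below is a hypothetical piece of book-keeping (not the authors' workflow), with structure and ligand labels taken loosely from the abstract.

```python
# Hypothetical book-keeping sketch (not the authors' workflow): enumerate
# the structure x ligand x force-field permutations described above so
# each combination can be tracked as a separate MD run.
from itertools import product

cyp2b4_structures = ["1PO5_open", "1SUO_closed", "2BDM_closed"]
ligands = ["none", "4-CPI", "bifonazole"]          # swapped/docked ligands
force_fields = ["ff99SB", "ff99SBildn"]

runs = [
    {"structure": s, "ligand": l, "force_field": ff}
    for s, l, ff in product(cyp2b4_structures, ligands, force_fields)
]

for run in runs:
    print("{structure} + {ligand} ({force_field})".format(**run))
```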

Mechanism of 150-cavity Formation in Influenza Neuraminidase

Authors:

Rommie Amaro (University of California Irvine), Ross Walker (University of California San Diego/SDSC), Wilfred Li (University of California San Diego), Robin Bush (University of California Irvine), Robert Swift (University of California Irvine), Lane Votapka (University of California Irvine)


Time:

2:45pm - 5:15pm


Abstract:

The recently discovered 150-cavity in the active site of group-1 influenza A neuraminidase (NA) proteins provides a target for rational structure-based drug development to counter the increasing frequency of antiviral resistance in influenza. Surprisingly, the 2009 H1N1 pandemic virus (09N1) was crystallized without the 150-cavity characteristic of group-1 NAs. Here we demonstrate key insights into the mechanism of 150-cavity formation in both N1 and N2 NA enzymes through 1.6 microseconds of biophysical simulations, carried out on TeraGrid machines, and an extensive bioinformatics analysis. Our results provide a new atomic-level understanding of antiviral compound activity across both human-infecting subtypes.


Biomolecular Simulation and Computer Aided Drug Design on TeraGrid Resources: Promise and Peril

Authors:

Thomas Cheatham (University of Utah)


Time:

2:45pm - 5:15pm


Abstract:

Although access to TeraGrid resources has allowed significantly greater insight into biomolecular structure, dynamics, and interactions through simulation, the continued growth in resources and explosion in data have stretched our ability to most effectively utilize these powerful resources. The paradigm of running batch jobs for weeks to months followed by transfer of the data back to local resources for analysis is breaking down. Moreover, new modes of operation involving coupled ensembles of simulations lead to greater workflow and data management issues while adding extra layers of complexity for analysis. Despite these challenges, we are improving the underlying models and methods and providing scientific insight into problems ranging from the design of hemostatic peptides to better drug-RNA interactions.

From Fullerenes to Nano-devices - Modeling Reactions between Carbon Nano-structures from Quantum Chemical Molecular Dynamics Simulations

Authors:

Jacek Jakowski (National Institute for Computational Sciences)


Time:

2:45pm - 5:15pm


Abstract:

Carbon nanostructures (fullerenes, nanotubes, etc.) are promising building blocks of nanotechnology. Potential applications include optical and electronic devices, sensors, and nanoscale machines. We characterize the coalescence of carbon nanostructures by means of direct molecular dynamics in which electrons are treated quantum mechanically via the self-consistent-charge density-functional tight-binding (SCC-DFTB) method. In agreement with experimental data, we find that the highest probability for collision-induced coalescence of buckminsterfullerenes occurs in the incident energy range of 120 to 140 eV. In this energy region, fusion occurs by way of the formation of hot, vibrationally excited peanut-shaped structures within 1 ps. These nanopeanuts further relax into short carbon nanotubes and cool by evaporating short carbon chains over the next 200 ps. In contrast to fullerenes, open structures such as bubbles are very reactive and can fuse very easily. Carbon-carbon bonds in open nano-bubbles are observed to be very labile, and the structures can easily rearrange bonds to allow the two structures to aggregate.

Globus GridFTP: What's new in 2011?

Authors:

Rajkumar Kettimuthu (Argonne National Laboratory and University of Chicago)


Time:

2:45pm - 5:15pm


Abstract:

GridFTP is a high-performance, secure data transfer protocol designed to move large datasets rapidly and reliably between geographically distributed systems. The Globus implementation of GridFTP is widely used to carry out secure, robust, high-speed bulk data transport in distributed computing environments. It is the recommended tool for transferring large files to or from a TeraGrid resource.

GridFTP in Globus Toolkit 5 has many new features that are of interest to the TeraGrid community. In this presentation, we will provide details on the new features such as synchronizing datasets, chrooting a GridFTP server, a new command that enables data channel security for certain new use cases, and bandwidth limiting. We will also discuss future plans including integration of network reservation in GridFTP, enhancements for emerging 100G networks and beyond.

Globus XIO Pipe Open Driver: Enabling GridFTP to Leverage Standard Unix Tools

Authors:

Rajkumar Kettimuthu (Argonne National Lab and University of Chicago), Steven Link (Northern Illinois University)


Time:

2:45pm - 5:15pm


Abstract:

Scientific research in all disciplines unavoidably creates substantial volumes of data throughout the process of discovery, analysis, and conclusion. Given the necessity of data sharing and data relocation, members of the scientific community often face a productivity loss proportional to the time spent transferring data. The GridFTP protocol was developed to improve this situation by addressing the performance, reliability, and security limitations of standard FTP and other commonly used data movement tools such as SCP. The Globus implementation of GridFTP is widely used to rapidly and reliably move data between geographically distributed systems. Traditionally, GridFTP performs well for datasets containing large files; when the data is partitioned into many small files, it suffers from lower transfer rates. Though pipelining and concurrency in GridFTP improve transfer rates for datasets with many small files, these solutions cannot be applied in environments that have strict firewall rules. In such scenarios, tarring up the files in a dataset on the fly helps. In other scenarios, compression is desired, or a checksum of the files after they are written to disk. There are robust system tools in Unix that perform these tasks (tar, compress, checksum, etc.). In this paper, we present the Globus XIO Pipe Open Driver (Popen), which enables GridFTP to leverage standard Unix tools to perform such tasks. We show how this driver is used in GridFTP to provide a number of useful features, and we demonstrate the effectiveness of this functionality through an experimental study.
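The sketch below illustrates only the general idea behind the Popen driver -- using a standard Unix tool as the source of a transfer stream so that many small files travel as one archive -- written in Python with subprocess; it is not the Globus XIO driver itself, and the function name is invented for illustration.

```python
# Conceptual sketch only (not the Globus XIO Popen driver): stream the
# output of a standard Unix tool -- here "tar" bundling a directory on
# the fly -- so many small files travel as a single stream.
import subprocess

def stream_tarred_directory(path, chunk_size=1 << 20):
    """Yield chunks of a gzipped tar stream of `path` produced by tar."""
    proc = subprocess.Popen(
        ["tar", "-czf", "-", path],          # write the archive to stdout
        stdout=subprocess.PIPE,
    )
    try:
        while True:
            chunk = proc.stdout.read(chunk_size)
            if not chunk:
                break
            yield chunk                      # hand each chunk to the sender
    finally:
        proc.stdout.close()
        proc.wait()

# Example: count how many bytes the on-the-fly archive would transfer.
total = sum(len(c) for c in stream_tarred_directory("."))
print(total, "bytes in the tar stream")
```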

Using Globus Online for Reliable, Secure File Transfer

Authors:

Lisa Childers (Argonne National Laboratory and The University of Chicago)
Steve Tuecke (University of Chicago and Argonne National Laboratory) will present on her behalf


Time:

2:45pm - 5:15pm


Abstract:

File transfer is both a critical and frustrating aspect of computational research. For a relatively mundane task, moving terabytes of data reliably and efficiently can be surprisingly complicated. One must discover endpoints, determine available protocols, negotiate firewalls, configure software, manage space, negotiate authentication, configure protocols, detect and respond to failures, determine expected and actual performance, identify, diagnose and correct network misconfigurations, integrate with file systems, and a host of other things. Automating these makes researchers' lives much, much easier. In this presentation we will provide a technical overview of Globus Online: a fast, reliable file transfer service that simplifies large-scale, secure data movement without requiring construction of custom end-to-end systems. We will walk the audience through the process of signing up to Globus Online and using both GUI and CLI interfaces to move data between TeraGrid endpoints. We will also cover use of the Globus Connect feature, which allows users to transfer files between TeraGrid sites and their local servers or laptops, even if behind a firewall, without the complexity of a full Globus install. The presentation will include a demonstration as well as highlights from several user case studies.

Managing Appliance Launches in Infrastructure Clouds

Authors:

John Bresnahan (Argonne National Laboratory), Tim Freeman (University of Chicago), David Labissoniere (University of Chicago), Kate Keahey (Argonne National Laboratory)


Time:

2:45pm - 5:15pm


Abstract:

Infrastructure cloud computing introduces a significant paradigm shift that has the potential to revolutionize how scientific computing is done. However, while it is actively being adopted by a number of scientific communities, it still lacks the well-developed and mature ecosystem that would allow the scientific community to fully leverage the capabilities it offers. This paper introduces a specific addition to the infrastructure cloud ecosystem: the cloudinit.d program, a tool for launching, configuring, monitoring, and repairing a set of interdependent virtual machines in an Infrastructure-as-a-Service (IaaS) cloud or over a set of IaaS clouds. The cloudinit.d program was developed in the context of the Ocean Observatory Initiative (OOI) project to help it launch and maintain complex virtual platforms provisioned on demand on top of infrastructure clouds. Like the UNIX init.d program, cloudinit.d can launch specified groups of services, and the VMs in which they run, at different run levels representing dependencies of the launched VMs. Once launched, cloudinit.d monitors the health of each running service to ensure that the overall application is operating properly. If a problem is detected in a service, cloudinit.d will restart only that service and any other failed services that depended upon it.
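The following toy sketch illustrates the init.d-style behavior described above -- launching groups of services in run-level order and restarting only a failed service -- in plain Python; it is a conceptual stand-in, not the cloudinit.d implementation, and all service names are hypothetical.

```python
# Toy sketch of the init.d-style idea described above (launching groups
# of services in run-level order and restarting failures); not the
# cloudinit.d implementation, just an illustration of the concept.
launch_plan = {
    1: ["message-broker"],               # run level 1: base services
    2: ["datastore", "cache"],           # run level 2: depends on level 1
    3: ["web-frontend"],                 # run level 3: depends on level 2
}

def launch(service):
    print("launching", service)          # pretend the VM/service came up
    return True

def healthy(service):
    return service != "cache"            # pretend one service went bad

# Launch each run level only after the previous one is up.
for level in sorted(launch_plan):
    for service in launch_plan[level]:
        launch(service)

# Monitoring pass: restart only the services that report a problem.
for level in sorted(launch_plan):
    for service in launch_plan[level]:
        if not healthy(service):
            print("restarting", service)
            launch(service)
```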

Generic FutureGrid Image Management

Authors:

Javier Diaz (Indiana University), Gregor Von Laszewski (Indiana University), Fugang Wang (Indiana University), Geoffrey C. Fox (Indiana University)


Time:

2:45pm - 5:15pm


Abstract:

N/A

Methods of Creating Student Cluster Competition Teams

Authors:

Stephen Harrell (Purdue University), Preston Smith (Purdue University), Doug Smith (University of Colorado), Torsten Hoefler (University of Illinois at Urbana-Champaign), Anna Labutina (Nizhni Novgorod State University) and Trinity Overmyer (Purdue University)


Time:

2:45pm - 5:15pm


Abstract:

N/A

A Multi-Cultural Success Story for Achieving Diversity in Multi-Core/ many-Core Internships

Authors:

Jennifer Houchins (Shodor), Jeff Krause (Shodor), Robert Panoff (Shodor) and Scott Lathrop (National Center for Supercomputing Applications)


Time:

2:45pm - 5:15pm


Abstract:

One of the foremost challenges in high performance computing is promoting involvement of historically underrepresented undergraduate students. The Blue Waters Undergraduate Petascale Education Program, funded by the National Science Foundation Office of Cyberinfrastructure, has been facing this challenge head-on while supporting undergraduate internship experiences that involve the application of HPC to problems in the sciences, engineering, or mathematics. This paper describes an evolving approach to the recruitment efforts for undergraduate interns and mentors that demonstrates the importance of formative assessment supporting proactive program changes leading to success in generating a large, diverse applicant pool including substantial numbers of qualified women and minority candidates.

Coming to Consensus on Competencies for Petascale Computing Education and Training

Authors:

Steven Gordon and Judith Gardiner (Ohio Supercomputing Center)


Time:

2:45pm - 5:15pm


Abstract:

At the TeraGrid 2010 Conference we began the process of defining a set of education competencies for Petascale Computing. At that time, we launched a survey of computational science faculty and professionals focused on the skills needed by those undertaking research that uses petascale computational resources. The 89 responses indicated that competencies should probably be divided into several categories: competencies for all students (basic competencies) that introduce a wide range of computational skills and code development issues, advanced computer science skills that may apply only to a subset of code developers, and domain-specific skills that apply to the special issues arising from the mathematical basis of particular classes of models.

Using the results of the survey, we have engaged several groups of computational science experts in focus group sessions during which the division of competencies was discussed in depth. Although there was not unanimous agreement on the basic competencies, there was broad acceptance of a large number of them.

We will present the findings of the survey and focus groups. The basic competencies will be reviewed along with a preliminary draft of the more advanced and domain specific competencies. We will also discuss the opportunities for devising formal and informal curriculum offerings focused on those competencies that could be incorporated into certificate and degree programs, as well as within institutes, summer schools, and training sessions.

The TeraGrid'11 audience is a uniquely qualified community to provide constructive feedback to ensure that the competencies reflect the needs and requirements of the broad HPC constituency. We will use this meeting to engage the audience in a discussion of the draft competencies and potential paths to implementation of the resulting curricula.

Finally, we will use this opportunity to recruit additional participants in further focus groups as we work towards refining the set of competencies with a goal of publishing them during 2012.

A Training roadmap for New HPC Users

Authors:

Mark Richards and Scott Lathrop (National Center for Supercomputing Applications)


Time:

2:45pm - 5:15pm


Abstract:

Many new users of TeraGrid or other HPC resources are scientists or other domain experts by training and are not necessarily familiar with core principles, practices, and resources within the HPC community. As a result, they often make inefficient use of their own time and effort and of the computing resources as well. In this work, we present a training roadmap for new members of the HPC community. The roadmap outlines key concepts, technologies, and tools that new HPC users will benefit from being aware of, even if they are not required to develop immediate proficiency in all of them. In addition to providing a brief overview of a wide variety of topics, the roadmap includes links and pointers to numerous resources that users can refer to for their own training as the need arises.

Extending Cyberinfrastructure Beyond its own Boundaries: Campus Champions Program - Panel Discussion

Authors:

Kay Hunt (Purdue University)


Time:

2:45pm - 5:15pm


Abstract:

As we prepare for the TeraGrid "eXtreme Digital" (XD) future, establishing a national Campus Champion group focused on meeting the requirements of a much broader national community has tremendous potential to increase participation and provide a foundation from which TeraGrid and other cyberinfrastructure (CI) resources and services may be broadened among traditional communities of TeraGrid users as well as among under-represented communities. Particular emphasis must be placed on including women, minorities, and people with disabilities, as well as under-represented institutions such as minority-serving and EPSCoR institutions.

The vision is to realize the potential of CI and high-performance computing (HPC) to empower a larger and more diverse set of individuals and institutions to participate in science and engineering education, research and innovation.

The mission is to develop and evaluate an extensible, scalable, and comprehensive program to assist researchers, educators, staff and students at academic institutions to effectively use CI, TG, and XD resources and services, to advance scientific discovery in all fields.

The goal of the Campus Champion Program is to identify key people on campuses who can be pro-active advocates and brokers of information among local campus users to help them make informed choices among the range of national, regional and local CI resources available and advance scientific discovery in all fields.

While the Campus Champions Program has been quite successful, including the recruitment of over 80 member institutions, there is tremendous potential to further improve the program to better serve diverse communities across the country. TeraGrid and its collaborating organizations continue to seek advice and recommendations from all Campus Champion institutions and potential member institutions to further improve the Program's offerings and services.

Using Hybrid Parallelism to Improve Memory Use in the Uintah Framework

Authors:

Qingyu Meng (SCI Institute, University of Utah), Martin Berzins (SCI Institute, University of Utah), John Schmidt (University of Utah)


Time:

2:45pm - 5:15pm

Abstract:

The Uintah software framework was developed to provide an environment for solving large-scale, long-running, data-intensive fluid-structure interaction problems on structured adaptive grids. Uintah uses a combination of fluid-flow solvers and particle-based methods for solids, together with a novel asynchronous task-based approach with fully automated load balancing. Uintah's memory use associated with ghost cells and global meta-data has become a barrier to scalability beyond O(100K) cores. A hybrid memory approach that addresses this issue is described and evaluated. The new approach, based on a combination of Pthreads and MPI, is shown to greatly reduce memory usage, as predicted by a simple theoretical model, with comparable CPU performance.
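To make the memory argument concrete, here is a toy model (my own illustrative numbers, not the paper's) of why one multithreaded MPI rank per node reduces per-node memory: global meta-data that would be replicated in every rank under an MPI-only configuration is instead shared by all threads in the node's single rank.

```python
# Toy memory model with made-up numbers (not the paper's model): per-node
# footprint of global meta-data that every MPI rank replicates, comparing
# one rank per core with one multithreaded rank per node.
cores_per_node = 12          # hypothetical node size
metadata_mb = 400            # hypothetical per-rank copy of global meta-data

mpi_only = cores_per_node * metadata_mb   # one rank per core, one copy each
hybrid = 1 * metadata_mb                  # one threaded rank per node, shared copy

print("MPI-only: %d MB/node, hybrid: %d MB/node" % (mpi_only, hybrid))
```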

Autotuned Parallel I/O for Highly Scalable Biosequence Analysis

Authors:

Haihang You (University of Tennessee/NICS), Bhanu Rekapalli (University of Tennessee/NICS), Qing Liu (University of Tennessee/NICS), Shirley Moore (University of Tennessee/NICS)


Time:

2:45pm - 5:15pm

Abstract:

N/A

Runtime Analysis Tools for Parallel Scientific Applications

Authors:

Oleg Korobkin (Louisiana State University), Gabrielle Allen (Louisiana State University), Eloisa Bentivegna (Max-Planck-Institute for Gravitational Physics), Steven Brandt (Louisiana State University), Peter Diener (Louisiana State University), Jinghua Ge (Louisiana State University), Frank Löffler (Louisiana State University), Erik Schnetter (Perimeter Institute for Theoretical Physics), Jian Tao (Louisiana State University)


Time:

2:45pm - 5:15pm

Abstract:

This paper describes the Alpaca runtime tools. These tools leverage the component infrastructure of the Cactus Framework in a novel way to enable runtime steering, monitoring, and interactive control of a simulation. Simulation data can be observed graphically, or by inspecting values of variables. When GPUs are available, images can be generated using volume ray casting on the live data. In response to observed error conditions or automatic triggers, users can pause the simulation to modify or repair data, or change runtime parameters. In this paper we describe the design of our implementation of these features and illustrate their value with three use cases.

Performance Metrics and Auditing Framework for High Performance Computer Systems

Authors:

Thomas R. Furlani (CCR SUNY at Buffalo), Matthew D. Jones (CCR SUNY at Buffalo), Steven M. Gallo (CCR SUNY at Buffalo), Andrew E. Bruno (CCR SUNY at Buffalo), Charng-Da Lu (CCR SUNY at Buffalo), Amin Ghadersohi (CCR SUNY at Buffalo), Ryan J. Gentner (CCR SUNY at Buffalo), Abani Patra (CCR SUNY at Buffalo), Robert L. Deleon (CCR SUNY at Buffalo), Gregor Von Laszewski (Indiana University), Lizhe Wang (Indiana University), Ann Zimmerman (University of Michigan)


Time:

2:45pm - 5:15pm

Abstract:

This paper describes a comprehensive auditing framework, XDMoD, for use by high performance computing centers to readily provide metrics regarding resource utilization (CPU hours, job size, wait time, etc.), resource performance, and the center's impact in terms of scholarship and research. This role-based auditing framework is designed to meet the following objectives: (1) provide the user community with an easy-to-use tool to oversee their allocations and optimize their use of resources, (2) provide staff with easy access to performance metrics and diagnostics to monitor and tune resource performance for the benefit of the users, (3) provide senior management with a tool to easily monitor utilization, user base, and performance of resources, and (4) help ensure that the resources are effectively enabling research and scholarship. While initially focused on the NSF TeraGrid (TG) and follow-on XSEDE (XD) program, this auditing system is intended to have wide applicability to any HPC system.

The XDMoD auditing system is architected using a set of modular components that facilitate the use of community-contributed components. It includes an active and reactive (as opposed to passive) service set accessible through a variety of endpoints such as a web-based user interface, RESTful web services, and provided development tools. One component also provides a computationally lightweight and flexible application kernel auditing system that uses best-in-class performance kernels to measure overall system performance with respect to the applications actually being run by users. This allows continuous resource auditing that monitors all aspects of system performance, most critically from a completely user-centric point of view.

Automatically Mining Program Build Information via Signature Matching

Authors:

Charng-Da Lu (Center for Computational Research, SUNY at Buffalo), Matthew Jones (Center for Computational Research, SUNY at Buffalo), Thomas Furlani (Center for Computational Research, SUNY at Buffalo)


Time:

2:45pm - 5:15pm

Abstract:

N/A

Data and Compute-Driven Modern Science: How Do We Prepare the Future for It?

Speaker:

Nora Sabelli (SRI International)

Time:

8:15am - 9:45am

Abstract:

Much has been said about the data- and compute-driven transformation that science is undergoing and about the impacts this transformation will have on society--impacts that are not limited to its professionals. This talk will ask whether we know enough about what the multiple institutions that educate society's members should be doing to serve these new needs, and discuss some challenging research questions that could inform future action.

Hierarchical Data Storage Strategy in XSEDE

Time:

5:30pm - 7:00pm

Abstract:

The problem of data management looms ever larger for the users of XSEDE as we enter the era of sustained petaflop simulations and petabyte data analytics. Over a wide spectrum of scientific and scholarly disciplines, it has become imperative to build and deploy reliable processes for dealing with the data life cycle, including long-term repositories with role-based access by collaborators and colleagues. At the same time, there is rapid evolution in the space of technologies for hierarchical and distributed data storage. This BOF proposes to launch a dialog among the XSEDE providers and users towards a common vision and implementation strategy for addressing these problems.

Parallel Tools Platform

Time:

5:30pm - 7:00pm

Abstract:

The Eclipse Parallel Tools Platform (PTP, http://eclipse.org/ptp) is an open-source project providing a robust, extensible platform for parallel application development. PTP includes a parallel runtime, parallel debugger, remote development tools, static analysis tools, and support for the integration of existing external tools. It supports development in C/C++, UPC, and Fortran. PTP is the basis for the NCSA Blue Waters application development workbench. The BOF will consist of brief demos and discussions about PTP features, including:

  • PTP support for job schedulers/resource managers, including the recent addition of a completely configurable resource manager
  • Remote services, including editing, launching, and debugging on remote targets, including a new capability to support remote synchronized projects
  • Performance tools and the External Tools Framework (ETFw), which integrates existing command-line tools into Eclipse, as well as providing mechanisms for linking feedback from external tools to source code displayed in Eclipse.
  • Static and dynamic analyses for MPI programs
  • Fortran development and refactoring support via the Eclipse Photran project
  • The NCSA Blue Waters workbench, based on PTP
  • Use of PTP for teaching parallel programming

We will be especially looking to engage potential and current users of Eclipse PTP, to help us understand what directions future improvements of PTP should take, as well as TeraGrid/XSEDE Resource/Service Providers, to make sure we can properly support TeraGrid/XSEDE resources. We are also looking to build relationships with TeraGrid/XSEDE for optimal support of capabilities embodied in Eclipse PTP. The discussion will also be designed to find possible collaborators and directions for future development.

XSEDE TEOS Plans for Community Engagement

Time:

5:30pm - 7:00pm

Abstract:

The XSEDE Training, Education, and Outreach Services (TEOS) Team proposes to conduct a BOF during TeraGrid '11.

The purpose is to share information about the XSEDE TEOS plans during the first 15-20 minutes. The team will describe the training, education and outreach plans for the XSEDE program.

Since XSEDE will have only recently been made public prior to the TeraGrid '11 Conference, the TEOS team would like to ensure that the community is aware of the programs and services to be offered, and the range of opportunities for the community to become involved and benefit from XSEDE. Included will be the initial plans and activities with dates and topics, along with information about long-term plans to engage the community.

The TEOS team will then spend time answering questions from the audience about the planned activities.

The TEOS team will also use the time to collect community needs and requirements that can help to inform and enhance the TEOS offerings.

MapReduce applications and environments

Time:

5:30pm - 7:00pm

Abstract:

As the computing landscape becomes increasingly data-centric, data-intensive computing environments are poised to transform scientific research. In particular, MapReduce-based programming models and run-time systems such as the open-source Hadoop system have increasingly been adopted by researchers with data-intensive problems, in areas including bioinformatics, data mining and analytics, and text processing. While MapReduce run-time systems such as Hadoop are currently not supported across all TeraGrid systems (Hadoop is available on systems including FutureGrid), there is increased demand for these environments from the science community. This BOF session will provide a forum for discussions with users on challenges and opportunities for the use of MapReduce. It will be moderated by Geoffrey Fox, who will start with a short overview of MapReduce and the applications for which it is suitable. These include pleasingly parallel applications and many loosely coupled data analysis problems, with genomics, information retrieval, and particle physics as examples.

We will discuss the interest of users, the possibility of using TeraGrid and commercial clouds, and the type of training that would be useful. The BOF will assume only broad knowledge and will not need or discuss details of technologies like Hadoop, Dryad, Twister, or Sector/Sphere (MapReduce variants).
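For attendees new to the model, the following is a minimal in-memory illustration of MapReduce (word count) in plain Python, independent of Hadoop, Dryad, Twister, or Sector/Sphere; real run-time systems distribute the map, shuffle, and reduce phases across many nodes.

```python
# Minimal in-memory illustration of the MapReduce model (word count),
# independent of any particular run-time system.
from collections import defaultdict

documents = ["teragrid hosts data", "mapreduce maps then reduces data"]

# Map phase: emit (key, value) pairs from each input record independently.
def map_phase(doc):
    return [(word, 1) for word in doc.split()]

# Shuffle phase: group values by key.
grouped = defaultdict(list)
for doc in documents:
    for key, value in map_phase(doc):
        grouped[key].append(value)

# Reduce phase: combine the values for each key.
counts = {key: sum(values) for key, values in grouped.items()}
print(counts)   # e.g. {'data': 2, 'teragrid': 1, ...}
```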

Enhancing the "MATLAB on the TeraGrid" Experimental Resource with GPGPUs

Time:

5:30pm - 7:00pm

Abstract:

In a new research partnership with NVIDIA, Dell, and MathWorks, the Cornell University Center for Advanced Computing (CAC) is testing the performance of general purpose graphics processing units with MATLAB applications. MATLAB GPU computing capabilities include data manipulation on NVIDIA GPUs, GPU-accelerated MATLAB operations, and the use of multiple GPUs on the desktop via the Parallel Computing Toolbox and a computer cluster via MATLAB Distributed Computing Server.

Testing at Cornell is occurring on Dell C6100 servers with the C410x PCIe expansion chassis which supports server connections to NVIDIA Tesla 2070 GPGPU processors. This capability has been added to the NSF-funded 512-core experimental "MATLAB on the TeraGrid" resource operated by Cornell in partnership with Purdue University. The resource serves as a bridge to high end national resources by introducing students to computational science and data analysis and by helping less experienced researchers realize the benefits of parallel computing and application scaling. 504,000 jobs have run on the resource in the past 18 months, directly and through Science Gateways such as nanoHUB.org.

This BOF will present an overview of the resource, focusing on the addition of NVIDIA nodes and other system improvements. How to profile code to determine if it is suitable for GPGPUs will be demonstrated. The use and availability of the resource and NVIDIA nodes will be discussed.

XSEDE Town Hall

Time:

1:30pm - 2:30pm

Abstract:

XSEDE principal investigator John Towns and leaders of the various XSEDE initiatives (including Ralph Roskies, Patricia Kovatch, Scott Lathrop, Nancy Wilkins-Diehr, and Jay Boisseau) will give an overview of the project and will take questions.

 

TUESDAY JULY 19

TIME PRESENTATION PRESENTER LOCATION TYPE
7:00AM - 6:00PM Registration Pre-Function Area
7:15AM - 8:15AM Breakfast Salons E & F
8:15AM - 9:45AM Data and Compute-Driven Modern Science: How Do We Prepare the Future for It?
[download presentation]
Nora Sabelli Grand Ballroom A-D Invited Talk
9:45AM - 10:00AM Break Salons E & F
10:00AM - 10:30AM Kinetic Simulation of Magnetic Fusion Plasmas on High Performance Computing Platforms W. W. Lee Salon G&H Science Session
10:30AM - 11:00AM Open-source Astrophysics: the Enzo Community Code Brian O'Shea, Matthew Turk, Michael Norman, Greg Bryan Salon G&H
11:00AM - 11:30AM Petascale Kinetic Simulation of the Magnetosphere Homa Karimabadi, Hoanh Vu, Burlen Loring, Yuri Omelchenko, Tamara Sipes, Vadim Roytershteyn, William Daughton, Mahidhar Tatineni, Amit Majumdar, Umit Catalyurek, Alper Yilmaz Salon G&H
11:30AM - 12:00PM On the Density Distribution in Star-Forming Interstellar Clouds Alexei Kritsuk, Michael Norman, Rick Wagner Salon G&H
10:00AM - 10:30AM DataONE Member Node Pilot Integration with TeraGrid
[download presentation]
Nicholas Dexter, John Cobb, Dave Vieglais Salon I&J Technology Session
10:30AM - 11:00AM A Roadmap for Using NSF Cyberinfrastructure with InCommon William Barnett, Von Welch, Alan Walsh, Craig Stewart Salon I&J
11:00AM - 11:30AM The XSEDE Architecture -- A Renewed Emphasis on Quality Attributes
[download presentation]
Andrew Grimshaw Salon I&J
11:30AM - 12:00PM An API to feed the World
[download presentation]
Rion Dooley Salon I&J
10:00AM - 10:30AM UltraScan Gateway Enhancements through TeraGrid Advanced User Support Borries Demeler, Raminderjeet Singh, Marlon Pierce, Suresh Marru, Emre H Brookes Deer Valley 1&2 Gateway Session
10:30AM - 11:00AM Developing an Integrated End-to-end TeraGrid Climate Modeling Environment Lan Zhao, Carol Song, Cecelia Deluca, Don Middleton Deer Valley 1&2
11:00AM - 11:30AM A CyberGIS Gateway Approach to Interoperable Access to the National Science Foundation TeraGrid and the Open Science Grid Anand Padmanabhan, Shaowen Wang, John-Paul Navarro Deer Valley 1&2
11:30AM - 12:00PM Extending BioVLab Cloud Workbench to a TeraGrid Gateway
[download presentation]
Suresh Marru, Marlon Pierce, Patanachai Tangchaisin, Heejoon Chae, Kenneth Nephew, Sun Kim Deer Valley 1&2
10:00AM - 10:30AM Reducing the Barrier to Entry Using Portable Apps
[download presentation]
Dirk Colbry Solitude Training, Education, and Outreach Session
10:30AM - 11:00AM Educational Virtual Clusters for On-demand MPI/Hadoop/Condor in Future Grid
[download presentation]
Renato Figueiredo, David Wolinsky, Panoat Chuchaisri Solitude
11:00AM - 11:30AM The Shape of the TeraGrid: Social Network Analysis of an Affiliation Network of TeraGrid Users and Projects
[download presentation]
Richard Knepper Solitude
11:30AM - 12:00PM Building Cyberlearning Communities at the Crossroads: Successes, Challenges, Lessons Learned, and the Extreme Road Ahead
[download presentation]
Jeff Sale, Ange Mason, Diane Baxter Solitude
12:00PM - 1:00PM Lunch Buffet Salons E & F
1:30PM - 2:30PM XSEDE Town Hall Grand Ballroom A-D
2:30PM - 2:45PM Break Salons E & F
2:45PM - 3:15PM High Performance Computational Nanotechnology for Research and Education on nanoHUB.org Gerhard Klimeck Salon G&H Science Session
3:15PM - 3:45PM Investigation of Substrate-recognition and the Active-site Stability of CYP2B4 and CYP3A4 X-ray Structures Using Large-scale Molecular Dynamics Simulations on TeraGrid Resources
[download presentation]
Kiumars Shahrokh, Garold S Yost, Thomas Cheatham Salon G&H
3:45PM - 4:15PM Mechanism of 150-cavity Formation in Influenza Neuraminidase
[download presentation]
Rommie Amaro, Ross Walker, Wilfred Li, Robin Bush, Robert Swift, Lane Votapka Salon G&H
4:15PM - 4:45PM Biomolecular Simulation and Computer Aided Drug Design on TeraGrid Resources: Promise and Peril
[download presentation]
Thomas Cheatham Salon G&H
4:45PM - 5:15PM From Fullerenes to Nano-devices - Modeling Reactions between Carbon Nano-structures from Quantum Chemical Molecular Dynamics Simulations
[download presentation]
Jacek Jakowski Salon G&H
2:45PM - 3:15PM Globus GridFTP: What's new in 2011? Rajkumar Kettimuthu Salon I&J Technology Session
3:15PM - 3:45PM Globus XIO Pipe Open Driver: Enabling GridFTP to Leverage Standard Unix Tools Rajkumar Kettimuthu, Steven Link Salon I&J
3:45PM - 4:15PM Using Globus Online for Reliable, Secure File Transfer Steve Tuecke Salon I&J
4:15PM - 4:45PM Managing Appliance Launches in Infrastructure Clouds John Bresnahan, Tim Freeman, David Labissoniere, Kate Keahey Salon I&J
4:45PM - 5:15PM Generic FutureGrid Image Management Javier Diaz, Gregor Von Laszewski, Fugang Wang, Geoffrey C Fox Salon I&J
2:45PM - 3:15PM Methods of Creating Student Cluster Competition Teams Stephen Harrell, Preston Smith, Doug Smith, Torsten Hoefler, Anna Labutina, Trinity Overmyer Deer Valley Training, Education, and Outreach Session
3:15PM - 3:45PM A Multi-Cultural Success Story for Achieving Diversity in Multi-Core/ many-Core Internships Jennifer Houchins, Jeff Krause, Robert Panoff, Scott Lathrop Deer Valley
3:45PM - 4:15PM Coming to Consensus on Competencies for Petascale Computing Education and Training Steven Gordon, Judith Gardiner Deer Valley
4:15PM - 4:45PM A Training roadmap for New HPC Users
[download presentation]
Mark Richards, Scott Lathrop Deer Valley
4:45PM - 5:15PM Extending Cyberinfrastructure Beyond its own Boundaries: Campus Champions Program - Panel Discussion
[download presentation]
Kay Hunt Deer Valley
2:45PM - 3:15PM Using Hybrid Parallelism to Improve Memory Use in the Uintah Framework Qingyu Meng, Martin Berzins, John Schmidt Solitude Technology Session
3:15PM - 3:45PM Autotuned Parallel I/O for Highly Scalable Biosequence Analysis Haihang You, Bhanu Rekapalli, Qing Liu, Shirley Moore Solitude
3:45PM - 4:15PM Runtime Analysis Tools for Parallel Scientific Applications
[download presentation]
Oleg Korobkin, Gabrielle Allen, Eloisa Bentivegna, Steven Brandt, Peter Diener, Jinghua Ge, Frank Loffler, Erik Schnetter, Jian Tao Solitude
4:15PM - 4:45PM Performance Metrics and Auditing Framework for High Performance Computer Systems Thomas R Furlani, Matthew D Jones, Steven M Gallo, Andrew E Bruno, Charng-Da Lu, Amin Ghadersohi, Ryan J Gentner, Abani Patra, Robert L Deleon, Gregor Von Laszewski, Lizhe Wang, Ann Zimmerman Solitude
4:45PM - 5:15PM Automatically Mining Program Build Information via Signature Matching
[download presentation]
Charng-Da Lu, Matthew Jones, Thomas Furlani Solitude
5:30PM - 7:00PM Hierarchical Data Storage Strategy in XSEDE Snowbird Birds-of-a-Feather
5:30PM - 7:00PM Parallel Tools Platform Park City Birds-of-a-Feather
5:30PM - 7:00PM XSEDE TEOS Plans for Community Engagement Alta Birds-of-a-Feather
5:30PM - 7:00PM MapReduce applications and environments Brighton Birds-of-a-Feather
5:30PM - 7:00PM Enhancing the "MATLAB on the TeraGrid" Experimental Resource with GPGPUs Salon A Birds-of-a-Feather
7:00PM - 9:00PM Poster Session/Viz Gallery Reception Grand Ballroom D,E&F

 

WEDNESDAY JULY 20

TIME PRESENTATION PRESENTER LOCATION TYPE
8:00AM - NOON Registration Pre-Function Area
7:15AM - 8:30AM Breakfast Salons E & F
8:30AM - 9:45AM The Sunset of TeraGrid John Towns Grand Ballroom Salons A-D
9:45AM - 10:00AM Break Salons E & F
10:00AM - 10:30AM A Model-Driven Partitioning and Auto-tuning Integrated Framework for Sparse Matrix-Vector Multiplication on GPUs
[download presentation]
Ping Guo, He Huang, Qichang Chen, Liqiang Wang, En-Jui Lee, Po Chen Salon G&H Joint Science and Training, Education, and Outreach Session
10:30AM - 11:00AM Ultrascalable Fourier Transforms in Three Dimensions
[download presentation]
Dmitry Pekurovsky Salon G&H
11:00AM - 11:30AM Using the TeraGrid to Teach Scientific Computing
[download presentation]
Frank Loffler, Gabrielle Allen, Werner Benger, Andrei Hutanu, Shantenu Jha, Erik Schnetter Salon G&H
11:30AM - 12:00PM Can High Performance Computing Accelerate High Performance Innovation?
[download presentation]
Leslie J. Button Salon G&H
10:00AM - 10:30AM Data-intensive CyberShake Computations on an Opportunistic Cyberinfrastructure
[download presentation]
Allan Espinosa, Daniel Katz, Michael Wilde, Ian Foster, Scott Callaghan, Philip Maechling, Ketan Maheshwari Salon I&J Technology Session
10:30AM - 11:00AM Developing an Infrastructure to Support Multiscale Modelling and Simulation
[download presentation]
Stefan Zasada, Derek Groen, Peter Coveney Salon I&J
11:00AM - 11:30AM Facilitating Data-Intensive Science with RAM disk and Hadoop on Large Shared-Memory Systems Bryon Gill, Chad Vizino, Philip Blood Salon I&J
11:30AM - 12:00PM SLASH2 -- File System for Widely Distributed Systems
[download presentation]
Paul Nowoczynski, Zhihui Zhang, Jared Yanovich Salon I&J
10:00AM - 10:30AM SURFconext: Collaboration without Limits
[download presentation]
Harold Teunissen and Paul Van Dijk Deer Valley 1&2 Gateway Session
10:30AM - 11:00AM The CIPRES Science Gateway: A Community Resource for Phylogenetic Analyses
[download presentation]
Mark Miller, Wayne Pfeiffer, Terri Schwartz Deer Valley 1&2
11:00AM - 11:30AM An OAuth Service for Issuing Certificates to Science Gateways for TeraGrid Users
[download presentation]
Jim Basney, Jeff Gaynor Deer Valley 1&2
11:30AM - 12:00PM Securing Science Gateways
[download presentation]
Victor Hazlewood, Matthew Woitaszek Deer Valley 1&2
12:00PM - 1:00PM Lunch Buffet Salons E&F
1:00PM - 1:30PM Break Salons E&F
1:30PM - 2:30PM Gold Sponsors Presentation Grand Ballroom A-D
2:30PM - 2:45PM Break Salons E & F
2:45PM - 3:15PM RNA Structure Refinement and Force Field Evaluation Using Molecular Dynamics Simulations Niel Henriksen, Thomas Cheatham Salon G&H Science Session
3:15PM - 3:45PM A Scalable Multi-scale Framework for Parallel Simulation and Visualization of Microbial Evolution Vadim Mozhayskiy, Bob Miller, Kwan-Liu Ma, Ilias Tagkopoulos Salon G&H
3:45PM - 4:15PM Towards High-Throughput and High-Performance Computational Estimation of Binding Affinities for Patient Specific HIV-1 Protease Sequences Owain Kenway, David Wright, Helmut Heller, Andre Merzky, Gavin Pringle, Jules Wolfrat, Peter Coveney, Shantenu Jha Salon G&H
4:15PM - 4:45PM neoGRID: Enabling Access to Teragrid Resources - Application for Sialylmotif Analysis in the Protozoan Toxoplasma gondii Arun Datta, Anthony Sinai Salon G&H
4:45PM - 5:15PM Statistical Modeling of Avian Distributional Dynamics on the TeraGrid Daniel Fink, Theodoros Damoulas, Andrew Dolgert, John W. Cobb, Nivan Ferreira, Lauro Lins, Claudio Silva, Juliana Freire, Steve Kelling Salon G&H
2:45PM - 3:15PM Accessible, Transparent, Reproducible Data Analysis with Galaxy
[download presentation]
James Taylor Salon I&J Invited Talk: Technology Session
3:15PM - 3:45PM Gateway Hosting Past, Present, and Future
[download presentation]
Mike Lowe, David Hancock, Matt Link, Craig Stewart Salon I&J
3:45PM - 4:15PM Trestles: A High-Productivity HPC System Targeted to Modest-Scale and Gateway Users
[download presentation]
Richard Moore, David Hart, Wayne Pfeiffer, Mahidhar Tatineni, Kenneth Yoshimoto, William Young Salon I&J
4:15PM - 4:45PM Audited Credential Delegation: A Usable Identity Management Solution for Grid Environments
[download presentation]
Stefan Zasada, Ali Haidar, Peter Coveney Salon I&J
4:45PM - 5:15PM An Information Architecture Based on Publish/Subscribe Messaging
[download presentation]
Warren Smith Salon I&J
2:45PM - 3:10PM Automated Grid-Probe System to Improve End-To-End Grid Reliability for a Science Gateway
[download presentation]
Lynn Zentner, Steven Clark, Krishna Madhavan, Swaroop Shivarajapura, Victoria Farnsworth, Gerhard Klimeck Deer Valley 1&2 Joint Gateway and Training, Education, and Outreach Session
3:10PM - 3:35PM Building Gateways for Life-Science Applications using the Distributed Adaptive Runtime Environment (DARE) Framework Joohyun Kim , Sharath Maddineni, Shantenu Jha Deer Valley 1&2
3:35PM - 4:00PM Benefits of NoSQL Databases for Portals & Science Gateways Matthew Hanlon, Rion Dooley, Stephen Mock, Maytal Dahan, Praveen Nuthulapati, Patrick Hurley Deer Valley 1&2
4:00PM - 4:25PM A solution looking for lots of problems: Generic Portals for Science Infrastructure
[download presentation]
Thomas Uram, Michael Papka, Mark Hereld, Michael Wilde Deer Valley 1&2
4:25PM - 4:50PM Nascent HPC Activities at an HBCU M. Farrukh Khan, C. J. Tymczak Deer Valley 1&2
4:50PM - 5:15PM Developing Effective Instructional Materials Sandie Kappes Deer Valley 1&2
5:30PM - 7:00PM Carry It Forward: From TeraGrid to XSEDE Solitude Birds-of-a-Feather
5:30PM - 7:00PM XSEDE Advanced User Support and Campus Champion Fellows program information Alta Birds-of-a-Feather
5:30PM - 7:00PM FutureGrid: What an Experimental Infrastructure Can Do for You Brighton Birds-of-a-Feather
5:30PM - 7:00PM Globus GRAM 5: Requirements and Use Cases for XSEDE Park City Birds-of-a-Feather
5:30PM - 7:00PM Local Campus Cyberinfrastructure and Campus Bridging Canyons Birds-of-a-Feather
6:30PM - 10:00PM Excursion to Clark Planetarium Grand Ballroom A-D

A Model-Driven Partitioning and Auto-tuning Integrated Framework for Sparse Matrix-Vector Multiplication on GPUs

Authors:

Ping Guo (University of Wyoming), He Huang (University of Wyoming), Qichang Chen (University of Wyoming), Liqiang Wang (University of Wyoming), En-Jui Lee, Po Chen (University of Wyoming)


Time:

10:00am - 12:00pm


Abstract:

Sparse Matrix-Vector Multiplication (SpMV) is very common in scientific computing. Graphics processing units (GPUs) have recently emerged as a high-performance computing platform due to their massive processing capability. This paper presents an innovative model-based approach for partitioning a sparse matrix into appropriate formats and auto-tuning the configurations of CUDA kernels in order to obtain optimal performance of SpMV on GPUs. The paper makes the following contributions: (1) propose an empirical CUDA performance model to predict the execution time of SpMV CUDA kernels; (2) design and implement a model-driven partitioning framework that predicts how to partition the target sparse matrix into one or more partitions and transforms each partition into an appropriate storage format based on the differences in storage characteristics between partitions, the design principle being that the storage format of a sparse matrix can significantly affect SpMV performance; (3) design and implement an auto-tuning framework to automatically adjust CUDA-specific parameters for obtaining optimal performance on specific GPUs; (4) integrate the model-driven partitioning framework and the auto-tuning framework to maximize the performance benefit. Compared to existing implementations of SpMV, our approach shows a substantial (210 percent on average) performance improvement.
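The premise that storage format matters for SpMV can be illustrated on the CPU with scipy; the sketch below is not the paper's CUDA code, and since GPU-oriented formats such as ELL/HYB are not available in scipy, CSR and COO stand in. It runs the same matrix-vector product under two formats and checks that the results agree.

```python
# Small CPU-side sketch (not the paper's CUDA kernels): the same sparse
# matrix-vector product run under two storage formats, the kind of
# per-partition choice the partitioning framework makes.
import numpy as np
from scipy import sparse

rng = np.random.default_rng(0)
dense = rng.random((1000, 1000))
dense[dense < 0.99] = 0.0                # keep roughly 1% nonzeros

A_csr = sparse.csr_matrix(dense)         # compressed sparse row format
A_coo = sparse.coo_matrix(dense)         # coordinate format
x = rng.random(1000)

y_csr = A_csr @ x
y_coo = A_coo @ x
print(np.allclose(y_csr, y_coo))         # same result, different layouts
```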

Ultrascalable Fourier Transforms in Three Dimensions

Authors:

Dmitry Pekurovsky (University of California San Diego/SDSC)


Time:

10:00am - 12:00pm


Abstract:

Fourier and related types of transforms are widely used in the scientific community. Three-dimensional Fast Fourier Transforms (3D FFT), for example, are used in many areas such as DNS turbulence, astrophysics, material science, chemistry, oceanography, and X-ray crystallography. In many cases this is a very compute-intensive operation. Lately there has been a need for implementations of 3D FFT and related algorithms that scale well on petascale parallel machines. Most existing implementations of 3D FFT use one-dimensional task decomposition and are therefore subject to a scaling limitation once the number of cores reaches the domain size. To overcome this limitation and fill the void in scalable FFT packages, the P3DFFT library has been created. P3DFFT is an open-source, easy-to-use software package (http://code.google.com/p/p3dfft) providing a general solution for 3D FFT based on two-dimensional decomposition. The library is written in Fortran90 and MPI, with a C interface available. P3DFFT has been demonstrated to scale quite well up to tens of thousands of cores on several platforms, including Kraken at NICS/ORNL. Theoretically it is scalable up to N-squared cores, where N is the domain size, provided suitable hardware support. A test benchmark P3DFFT program has shown about 50 percent efficiency in strong scaling from 4k to 64k cores on a Cray XT5. More details will be included in the presentation.

P3DFFT has been used in a number of simulations to date by groups throughout the world. For example, a team led by TeraGrid user P.K. Yeung (Georgia Tech), studying fundamental turbulence properties through Direct Numerical Simulations, has been able to scale its code up to 128k cores and employ 4096-cubed grid dimensions, the largest grid used to date in the world.

The evolution of the library is ongoing, supported by NSF grant OCI-085-0684, and includes a number of further performance enhancements taking advantage of modern architectures, such as overlap of communication with computation, and MPI/OpenMP implementation. In addition, the package will be extended to support other types of transforms such as Chebyshev. To the author's knowledge no other publicly available library provides the same functionality and performance.
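The pencil (2D) decomposition used by P3DFFT rests on the fact that a 3D FFT factors into 1D FFTs applied along each axis in turn, with data transposes between stages in the parallel case; the small numpy check below verifies that factorization on a toy array and is not P3DFFT itself.

```python
# Small numpy check (not P3DFFT itself) of the factorization that pencil
# decompositions exploit: a 3D FFT is 1D FFTs applied along each axis in
# turn, with (in the parallel case) a data transpose between stages.
import numpy as np

rng = np.random.default_rng(1)
a = rng.random((8, 8, 8))

staged = np.fft.fft(a, axis=0)
staged = np.fft.fft(staged, axis=1)
staged = np.fft.fft(staged, axis=2)

print(np.allclose(staged, np.fft.fftn(a)))   # True
```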

Using the TeraGrid to Teach Scientific Computing

Authors:

Frank Loffler (Louisiana State University), Gabrielle Allen (Louisiana State University), Werner Benger (Louisiana State University), Andrei Hutanu (Louisiana State University), Shantenu Jha (Louisiana State University) and Erik Schnetter (Perimeter Institute For Theoretical Physics, Canada)


Time:

10:00am - 12:00pm


Abstract:

We describe how a new graduate course in scientific computing, taught during Fall 2010 at Louisiana State University, utilized TeraGrid resources to familiarize students with some of the real world issues that computational scientists regularly deal with in their work. The course was designed to provide a broad and practical introduction to scientific computing, creating the basic skills and experience to very quickly get involved in research projects involving modern cyberinfrastructure and complex real world scientific problems. As an integral part of the course, students had to utilize various TeraGrid resources, e.g., by deploying, using and extending scientific software within the national cyberinfrastructure.

(Invited Talk) Can High Performance Computing Accelerate High Performance Innovation?

Speaker:

Leslie J. Button, Research Fellow, Modeling & Simulation, Science and Technology Division; Corning, Incorporated


Time:

10:00am - 12:00pm


Abstract:

N/A

Data-intensive CyberShake Computations on an Opportunistic Cyberinfrastructure

Authors:

Allan Espinosa (Department of Computer Science, University of Chicago), Daniel Katz (Computation Institute, University of Chicago and Argonne National Laboratory), Michael Wilde (Mathematics and Computer Science Division, Argonne National Laboratory), Ian Foster (Computation Institute, University of Chicago and Argonne National Laboratory), Scott Callaghan (Southern California Earthquake Center, University of Southern California), Philip Maechling (Southern California Earthquake Center, University of Southern California), Ketan Maheshwari (Mathematics and Computer Science Division, Argonne National Laboratory)


Time:

10:00am - 12:00pm


Abstract:

This abstract describes the aggregation of TeraGrid and Open Science Grid to run the SCEC CyberShake application faster than on TeraGrid alone. Because the resources are distributed and data movement is required to use more than one resource, a careful analysis of the cost of data movement vs. the benefits of distributed computation has been done in order to best distribute the work across the resources.
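A back-of-the-envelope version of that trade-off, with entirely made-up numbers, is sketched below: offloading a fraction of the work to a second grid shortens the makespan only if the compute time saved exceeds the added data-movement time.

```python
# Back-of-the-envelope sketch with made-up numbers of the trade-off named
# above: offloading part of the work to a second grid helps only when the
# compute time saved exceeds the extra data-movement time.
def makespan(total_core_hours, local_cores, remote_cores,
             offload_fraction, data_gb, net_gbps):
    transfer_h = (data_gb * 8 / net_gbps) / 3600.0
    local_h = total_core_hours * (1 - offload_fraction) / local_cores
    remote_h = transfer_h + total_core_hours * offload_fraction / remote_cores
    return max(local_h, remote_h)

base = makespan(50000, local_cores=2000, remote_cores=1000,
                offload_fraction=0.0, data_gb=0, net_gbps=10)
split = makespan(50000, local_cores=2000, remote_cores=1000,
                 offload_fraction=0.3, data_gb=2000, net_gbps=10)
print(round(base, 2), "h vs", round(split, 2), "h with offloading")
```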

Developing an Infrastructure to Support Multiscale Modelling and Simulation

Authors:

Stefan Zasada (University College London), Derek Groen (University College London), Peter Coveney (University College London)


Time:

10:00am - 12:00pm


Abstract:

Today scientists and engineers are commonly faced with the challenge of modelling, predicting and controlling multiscale systems which cross scientific disciplines and where several processes acting at different scales coexist and interact. Such multidisciplinary multiscale models, when simulated in three dimensions, require large-scale or even extreme-scale computing capabilities. The MAPPER project [1] develops computational strategies, software and services for distributed multiscale simulations across disciplines, exploiting existing and evolving European e-infrastructure. To facilitate such an infrastructure, the MAPPER project is developing and deploying a multi-tiered software stack, which we will describe in this talk. The MAPPER software stack seeks to deploy a set of services to facilitate the execution of multiscale scientific applications, that is, single applications composed of multiple different single-scale models, with each model usually executed by a different application code. This set of services, building on computational resources from EGI [2], PRACE [3] and TeraGrid/XD, as well as testbed resources and services run by the MAPPER project, constitutes a Europe-wide infrastructure for multiscale modelling and science. By necessity, some of these components run on target compute resources (such as those operated by PRACE and EGI) and some components run at a higher level, on resources operated by the MAPPER project.
MAPPER makes a distinction between loosely coupled and tightly coupled application scenarios, the difference been that tightly coupled applications require constant communication between components, whereas loosely coupled applications are executed in a chain of dependent steps. To drive the development of the multi-scale modelling infrastructure, the MAPPER project works with exemplar applications from five rep- resentative scientific domains (fusion, clinical decision making, systems biology, nano science, engineering). The nano science domain's involves a loosely coupled application consisting of three levels of simulation. The lowest level simulates the electronic degrees of freedom, using the Car-Parrinello Molecular Dynamics (CPMD) code. This code is a parallelized plane wave/pseudopotential implementation of Density Functional Theory, particularly designed for ab-initio molecular dynamics. The high level of accuracy of this method provides a mechanism for deriving accurate atomic charges which can be used in classical molecular dynamics, where the electronic degrees of freedom are removed. The atomic charges are passed to the initial models simulated using LAMMPS classical molecular dynamics code. To increase the size and length of simulation we use the classical molecular dynamics simulation to create input model parameters for Coarse-Grained Molecular Dynamics (CGMD) simulations, again using the LAMMPS code. The CGMD simulations have reduced degrees of freedom, by combining atoms into single larger particles. The parameters which are transferred between these levels are the interparticle positions and interactions, calculated to reduce the structural details of the simulation. We will use this exemplar to illustrate the operation of the MAPPER infrastructure in our talk.

Facilitating Data-Intensive Science with RAM disk and Hadoop on Large Shared-Memory Systems

Authors:

Bryon Gill (Pittsburgh Supercomputing Center, Carnegie Mellon University), Chad Vizino (Pittsburgh Supercomputing Center, Carnegie Mellon University), Philip Blood (Pittsburgh Supercomputing Center, Carnegie Mellon University)


Time:

10:00am - 12:00pm


Abstract:

Traditionally, Hadoop is run on parallel machines with modest processing capabilities and large amounts of disk. A large shared-memory system may not, at first blush, seem a likely host for a Hadoop cluster, but when researchers asked to run Hadoop on our 16 TB shared memory system, Blacklight, we set out to determine how effectively Hadoop could work in this non-traditional environment. We will begin by discussing the technical hurdles involved with making the Hadoop platform run efficiently on Blacklight, including configuration management and dynamic port allocation. Then we will discuss the logistics of using RAM disk on Blacklight, where a portion of the system's memory is allocated on a per-job basis to be used in place of the local disk expected by Hadoop. To conclude, we'll share some test results highlighting the cases where our shared-memory system running Hadoop outperforms a traditional parallel cluster. We expect this work to increase user productivity by making fast memory-based I/O available to solve a large set of data-intensive problems, including those already using, or easily adaptable to, the ubiquitous MapReduce framework.
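As a rough illustration of the kind of per-job setup mentioned above (RAM-disk scratch space plus dynamic port allocation), the sketch below generates Hadoop-1.x-style site files for one job. The property names are standard Hadoop settings of that era, but the paths, port choices, and helper functions are assumptions for illustration, not PSC's actual Blacklight configuration.

```python
# Illustrative only: write per-job Hadoop config files whose local storage
# points at a RAM-disk allocation and whose service ports are chosen
# dynamically, so several jobs can coexist on one large shared-memory host.
import os
import socket
from xml.sax.saxutils import escape

def free_port():
    """Ask the OS for an unused TCP port (a simple form of dynamic port allocation)."""
    with socket.socket() as s:
        s.bind(("", 0))
        return s.getsockname()[1]

def write_site(path, properties):
    """Write a minimal Hadoop *-site.xml file."""
    with open(path, "w") as f:
        f.write('<?xml version="1.0"?>\n<configuration>\n')
        for name, value in properties.items():
            f.write("  <property><name>%s</name><value>%s</value></property>\n"
                    % (escape(name), escape(str(value))))
        f.write("</configuration>\n")

def make_job_config(conf_dir, ramdisk_dir):
    """Hypothetical per-job setup: RAM-disk scratch plus unique ports."""
    os.makedirs(conf_dir, exist_ok=True)
    host = socket.gethostname()
    write_site(os.path.join(conf_dir, "core-site.xml"), {
        "hadoop.tmp.dir": ramdisk_dir,                     # scratch lives in memory
        "fs.default.name": "hdfs://%s:%d" % (host, free_port()),
    })
    write_site(os.path.join(conf_dir, "mapred-site.xml"), {
        "mapred.local.dir": os.path.join(ramdisk_dir, "mapred"),
        "mapred.job.tracker": "%s:%d" % (host, free_port()),
    })

if __name__ == "__main__":
    # The RAM-disk path would come from the batch system's per-job allocation.
    make_job_config("./conf.job123", "/dev/shm/job123")
```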

SLASH2 -- File System for Widely Distributed Systems

Authors:

Paul Nowoczynski (Pittsburgh Supercomputing Center, Carnegie Mellon University), Zhihui Zhang (Pittsburgh Supercomputing Center, Carnegie Mellon University), Jared Yanovich (Pittsburgh Supercomputing Center, Carnegie Mellon University)


Time:

10:00am - 12:00pm


Abstract:

SLASH2 is a portable user mode file system which has been created to aid users who frequently deal with large datasets in widely distributed environments. SLASH2 has been designed from the ground up to meet the technological needs in this emerging space. SLASH2 is highly portable and may be layered atop existing storage system deployments so that users may more easily leverage various elements of organically changing storage topologies. This project recognizes that such a storage topology may span a multitude of heterogeneous systems and possibly several institutions. The SLASH2 software stack has been positioned so that it may link independent storage systems in a manner which is relatively benign to system administrators. By binding otherwise independent storage systems with the SLASH2 object-based protocol, SLASH2 aims to provide users with fine grained control of data locality while presenting a single name space and access through POSIX file system calls.

(Invited Talk) SURFconext: Collaboration without Limits

Speakers:

Harold Teunissen (SURFnet, Netherlands), Niels Van Dijk (SURFnet, Netherlands) and Paul Van Dijk (SURFnet, Netherlands)


Time:

10:00am - 12:00pm


Abstract:

This short paper discusses the online collaboration needs of users and the implications for universities, institutions, and Virtual Organizations, and briefly describes SURFnet's vision regarding future collaboration platforms. It addresses dealing with institutional and vendor-provided services and showcases how combining standards, concepts and technology can be turned into a fully operational collaboration platform, setting the stage for a new collaboration paradigm.

The CIPRES Science Gateway: A Community Resource for Phylogenetic Analyses

Authors:

Mark Miller (San Diego Supercomputer Center), Wayne Pfeiffer (San Diego Supercomputer Center) and Terri Schwartz (San Diego Supercomputer Center)


Time:

10:00am - 12:00pm


Abstract:

The CIPRES Science Gateway (CSG) provides researchers and educators with browser-based access to community codes for inference of phylogenetic relationships from DNA and protein sequence data. The CSG allows users to deploy jobs on the scalable computational resources of the TeraGrid without requiring detailed knowledge of the complexities of high-performance computing clusters. Use of the CSG is growing rapidly; to date it has had more than 2,300 users and has enabled more than 180 peer-reviewed publications. The rapid growth in resource consumption was accommodated by deploying codes on the new TeraGrid resource Trestles. Tools and policies were developed to ensure efficient and effective resource use. This report describes progress in managing the rapid growth of this public cyberinfrastructure resource and reviews the domain science which it has enabled.

An OAuth Service for Issuing Certificates to Science Gateways for TeraGrid Users

Authors:

Jim Basney (NCSA) and Jeff Gaynor (NCSA).


Time:

10:00am - 12:00pm


Abstract:

In this paper, we present a TeraGrid OAuth service, integrated with the TeraGrid User Portal and TeraGrid MyProxy service, that provides certificates to science gateways. The OAuth service eliminates the need for TeraGrid users to disclose their TeraGrid passwords to science gateways when accessing their individual TeraGrid accounts via gateway interfaces. Instead, TeraGrid users authenticate at the TeraGrid User Portal to approve issuance of a certificate by MyProxy to the science gateway they are using. We present the design and implementation of the TeraGrid OAuth service, describe the underlying network protocol, and discuss design decisions and security considerations we made while developing the service in consultation with TeraGrid working groups and staff.
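For readers unfamiliar with the flow described above, the sketch below shows, in broad strokes, what the gateway side of an OAuth-for-certificates exchange could look like. The endpoint URLs, parameter names, and the form of the final response are assumptions for illustration only; they are not the actual TeraGrid OAuth protocol presented in the paper.

```python
# Illustrative gateway-side sketch of an OAuth-style certificate request.
# All URLs and parameter names below are hypothetical placeholders.
import requests

PORTAL = "https://portal.example.org/oauth"   # hypothetical authorization server

def start_authorization(callback_url):
    """Step 1: obtain a temporary request token and build the URL the user
    should visit (the real service authenticates the user at the portal)."""
    resp = requests.post(PORTAL + "/request_token",
                         data={"oauth_callback": callback_url})
    resp.raise_for_status()
    token = resp.json()["oauth_token"]
    return token, PORTAL + "/authorize?oauth_token=" + token

def fetch_certificate(token, verifier):
    """Step 2: after the user approves at the portal, exchange the authorized
    token for a short-lived certificate issued on the user's behalf."""
    resp = requests.post(PORTAL + "/certificate",
                         data={"oauth_token": token, "oauth_verifier": verifier})
    resp.raise_for_status()
    return resp.text          # e.g., a PEM-encoded certificate in this sketch

if __name__ == "__main__":
    token, url = start_authorization("https://gateway.example.org/callback")
    print("Send the user to:", url)
    # ... the portal redirects back to the gateway with a verifier ...
    # cert = fetch_certificate(token, verifier_from_callback)
```

The key point, as the abstract notes, is that the user's TeraGrid password is only ever entered at the portal, never at the gateway.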

Securing Science Gateways

Authors:

Victor Hazlewood (NICS) and Matthew Woitaszek (NCAR)


Time:

10:00am - 12:00pm


Abstract:

Science gateways began to emerge and evolve on the NSF-sponsored national HPC cyberinfrastructure, known today as the TeraGrid, in the early 2000s. Currently, the TeraGrid supports twenty-five science gateways that utilize a diverse collection of software and methods for integrating with the TeraGrid. This paper surveys TeraGrid science gateway implementations and security models, details a pilot study highlighting changes employed by the GridAMP science gateway to securely access the Kraken supercomputer, and describes possible solutions and recommendations for improving the security posture of science gateway implementations across the TeraGrid. Employing one or more methods that balance security, developer ease-of-use, and end-user ease-of-use improves the overall security posture of science gateway implementations across the TeraGrid.

RNA Structure Refinement and Force Field Evaluation Using Molecular Dynamics Simulations

Authors:

Niel Henriksen (University of Utah), Thomas Cheatham (University of Utah)


Time:

2:45pm - 5:15pm


Abstract:

In this study we investigate the structure of two similar RNA hairpins using molecular dynamics. Both structures were originally determined using solution NMR. Despite having a nearly identical base sequence, the published 3D structures have striking differences. To further understand these differences we used AMBER to perform a series of explicitly solvated simulations with the experimental structure restraints enforced. These simulations revealed incorrect restraints in one hairpin and a problematic nucleotide conformation in the other. Removal of the bad restraints and addition of a heating step to the structure refinement procedure yielded two high-quality structural ensembles with very low pairwise RMSD to one another. Thus our restrained simulations suggest that these two RNA hairpins have quite similar 3D structures and the differences in the published structures are likely due to refinement problems. In addition to the structure refinement study, we were also interested in how the AMBER RNA force fields performed in unrestrained simulations. We found that non-canonical features, such as bulge and loop regions, were modeled very poorly. In some cases, pathological conformations dominated the entire simulation trajectory. The results of both the restrained and unrestrained simulations show that while the molecular dynamics environment is a powerful tool for structure refinement of RNA, unrestrained simulations must be treated with great caution until force field performance is improved.

A Scalable Multi-scale Framework for Parallel Simulation and Visualization of Microbial Evolution

Authors:

Vadim Mozhayskiy (UC Davis), Bob Miller (UC Davis), Kwan-Liu Ma (UC Davis), Ilias Tagkopoulos (UC Davis)


Time:

2:45pm - 5:15pm


Abstract:

Bacteria are some of the most ubiquitous, simple and fastest-evolving life forms on the planet, yet even in their case, evolution is painstakingly difficult to trace in a laboratory setting. However, the evolution of microorganisms in controlled and/or accelerated settings is crucial to advance our understanding of how various behavioral patterns emerge, or to engineer new strains with desired properties (e.g. resilient strains for recombinant protein or bio-fuel production). We present a microbial evolution simulator, a tool to study and analyze hypotheses regarding microbial evolution dynamics. The simulator employs multi-scale models and data structures that capture a whole ecology of interactions between the environment, populations, organisms, and their respective gene regulatory and biochemical networks. The evolutionary "fossil record" is recorded at each time point of each run. This dataset (stored in HDF5 format for scalability) includes all environmental and cellular parameters, cellular events (division, death) and evolutionary events (mutations, HGT). This leads to the creation of a coherent dataset that could not have been obtained experimentally. To analyze it efficiently, we have developed a novel visualization tool that projects information at multiple levels (population, phylogeny, networks, and phenotypes). Additionally, we present some of the unique insights into microbial evolution that were made possible through simulations on the TeraGrid, and we describe further steps to address scalability issues for populations beyond 32,000 cells.
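To make the "fossil record" idea concrete, the snippet below appends event records to an extensible HDF5 dataset. It is a minimal sketch under assumed field names and layout; the simulator's actual schema is not described here.

```python
# Illustrative sketch (not the simulator's actual schema): append cellular and
# evolutionary events to an HDF5 "fossil record" so whole-run histories can be
# analyzed after the fact.
import h5py
import numpy as np

event_dtype = np.dtype([("time", "f8"), ("cell_id", "i8"),
                        ("event", "S16"), ("fitness", "f8")])

with h5py.File("fossil_record.h5", "w") as f:
    # Resizable dataset: grows as the simulation emits events.
    events = f.create_dataset("events", shape=(0,), maxshape=(None,),
                              dtype=event_dtype, chunks=True)
    batch = np.array([(0.1, 7, b"division", 0.92),
                      (0.3, 7, b"mutation", 0.88)], dtype=event_dtype)
    events.resize((len(events) + len(batch),))
    events[-len(batch):] = batch
```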

Towards High-Throughput and High-Performance Computational Estimation of Binding Affinities for Patient Specific HIV-1 Protease Sequences

Authors:

Owain Kenway (University College London), David Wright (University College London), Helmut Heller ( Leibniz-Rechenzentrum), Andre Merzky (Louisiana State University), Gavin Pringle (EPCC Edinburgh), Jules Wolfrat (SARA Computing and Networking Services, Amsterdam), Peter Coveney (University College London), Shantenu Jha (Rutgers University and Louisiana State University)


Time:

2:45pm - 5:15pm


Abstract:

AIDS has been identified as one of the top three causes of human death. The rapid acquisition of mutations conferring resistance to particular drugs remains a significant cause of treatment failure, and it is impossible to identify straightforward relationships between genotype and drug response. Informatics-based techniques give resistance scores to individual mutations, which can be combined additively to assess the resistance levels of complete sequences. It is likely, however, that the full picture is more complicated, with non-linear epistatic effects between combinations of mutations playing an important role in determining the level of viral resistance. Incorporating insight into combinatorial mutational effects from a broader range of sources, including computational modeling, offers the potential to further improve the treatment advice produced by decision-support software. Molecular dynamics is one simulation technique which offers the ability to derive quantitative as well as qualitative insight into the interplay of resistance-causing mutations. Models can then be simulated over time and the free energy change associated with the binding of the drug calculated.

Building upon earlier work done to evaluate the strength of binding between the drug lopinavir (LPV) and variants of the HIV-1 protease, a major drug target, which are known to exhibit a range of resistance levels, this work aims to extend our investigations to look at other FDA-approved inhibitors.

Not only do the size, complexity and time-scales of individual models make accurate binding-affinity calculations computationally demanding, but the patient-specific genotypes and the large number of models that need to be covered for each genotype make this a computational grand-challenge problem that requires extreme scales of throughput for high-performance simulations. This makes high-end infrastructure such as the TeraGrid/XD the infrastructure of choice.

This paper presents initial results and experience of using the TeraGrid in conjunction with DEISA.

neoGRID: Enabling Access to TeraGrid Resources - Application for Sialylmotif Analysis in the Protozoan Toxoplasma gondii

Authors:

Arun Datta (National University), Anthony Sinai (University of Kentucky)


Time:

2:45pm - 5:15pm


Abstract:

neoGRID is under development on Quarry, a virtual hosting environment, for working with Taverna-based workflows that utilize grid computing. Taverna is a graphical workbench often used for biomedical informatics. neoGRID offers an HPC-supported collaborative environment in which researchers from multidisciplinary scientific fields can gather, integrate and analyze data using TeraGrid resources. A significant number of resources, including bioinformatics tools, have been deployed on the TeraGrid, an NSF-funded project. Here we demonstrate the usage of neoGRID for sialylmotif analysis. The presence of sialylmotifs is the cardinal feature of mammalian sialyltransferases, a group of enzymes that transfer sialic acid from CMP-NeuAc to the terminal carbohydrate groups of various glycoproteins and glycolipids. Sialic acid has been recognized as the key determinant of diverse oligosaccharide structures involved in biological events as diverse as animal cell-cell interaction and oncogenic transformation. These conserved protein domains, the sialylmotifs, have been shown to be involved in binding the donor substrate, the acceptor substrate, or both. A thorough analysis of these motifs is now under study for drug discovery research using sialyltransferase as a target, and the high-performance computing power such analysis demands is available through neoGRID.

Here we show the usage of neoGRID for bioinformatics analysis of the protozoan parasite Toxoplasma gondii. T. gondii, which chronically infects roughly 30 percent of the world's population, is typically asymptomatic but can cause life-threatening disease in immune-compromised individuals and birth defects if acquired during pregnancy. Chronic infection is mediated by the heavily glycosylated tissue cyst, although the specific mechanisms of glycosylation remain unknown. Lectin affinity binding studies suggest that the glycoproteins in the cyst wall may be terminally sialylated; the origin or enzymatic basis of this modification, however, is unclear. Here we show that detailed bioinformatics analysis of the parasite genome (www.toxodb.org) fails to demonstrate that mammalian-type sialyltransferase genes are present. However, these results do not rule out the possibility of previously unknown sialyltransferase gene(s) that are Toxoplasma (or protozoan) specific and have no sequence similarity with either mammalian- or bacterial-type enzymes. Furthermore, the potential trans-sialidase gene(s) in T. gondii are being explored. Failure to identify novel classes of sialyltransferases and/or trans-sialidases would suggest that activities in the host are responsible for this unique modification of critical parasite proteins.

(Invited talk) Statistical Modeling of Avian Distributional Dynamics on the TeraGrid

Speakers:

Daniel Fink (Cornell Lab of Ornithology), Theodoros Damoulas (Computing and Information Science, Cornell University), Andrew Dolgert (Cornell Center for Advanced Computing), John W. Cobb (Oak Ridge National Lab, Computer Science & Math Division), Nivan Ferreira, Lauro Lins, Claudio Silva and Juliana Freire, (NYU-Poly), Steve Kelling (Cornell Lab of Ornithology)


Time:

2:45pm - 5:15pm


Abstract:

Today biodiversity faces serious large-scale threats including invasive species, emergent diseases, pollution, and climate change. To better anticipate and mitigate these threats we need to understand species occurrence and abundance across wide geographic and temporal extents for large groups of species. There are several challenges to species distribution modeling at this scale including data collection, modeling, computation, visualization, and interpretation. The goal of our research program is to advance data intensive ecology to meet these challenges and improve our understanding of the broad-scale dynamics of continent-scale bird migrations.

First, we assembled a data warehouse that links bird observation data collected by the citizen science project eBird with local-scale environmental covariates such as climate, habitat, and vegetation phenology. eBird is unique among broad-scale bird monitoring projects because it collects data year-round. Next, we employed a novel multi-scale statistical model designed to automatically adapt to both large-scale patterns of movement and local-scale habitat associations. By associating environmental inputs with observed patterns of bird occurrence, predictive models provide a convenient statistical framework to harness available data for predicting distributions and identifying the habitats that species depend on. Finally, we developed BirdVis, a spatiotemporal visualization tool that allows ecologists and land managers to efficiently explore and interpret the information from these analyses.

As part of the scientific Exploration, Visualization, and Analysis (EVA) working group of the Data Observation Network for Earth (DataONE) (https://dataone.org/), an NSF DataNet project, we have begun work scaling our analysis as an exemplar project of data-intensive analysis and visualization. Using a startup allocation from the TeraGrid in 2010, we calculated year-round species distribution estimates for over 100 bird species, including several species of conservation concern, on Lonestar. This is essential information to advance ecology and develop more comprehensive science-based management strategies. Ecologists are using these results to study how local-scale ecological processes vary across a species' range, through time, and between species. This work was used to generate the data for the 2011 State of the Birds Report, a national conservation report produced by the Department of the Interior.

We will discuss how we are using a 2011 TeraGrid research allocation to extend this analysis and develop a phenological atlas of North American birds. Our goal is to produce year-by-year migration estimates across a broader set of species at finer spatial resolution. This will provide a unique resource for studying the connections between environmental change and the response of complex ecological systems.

(Invited talk) Title TBD

Speaker:

James Taylor


Time:

2:45pm - 5:15pm


Abstract:

High-throughput sequencing has transformed biomedical research; however, making sense of this resource requires sophisticated computational tools. The Galaxy project seeks to make these tools available to a wide audience of researchers, while ensuring that analyses are reproducible and can be communicated transparently. The Galaxy framework provides a consistent, accessible user interface for complex tools; however, many such tools require significant computational resources, which may not be cost effective for a small research group to maintain. Here we describe Galaxy Cloud, which allows researchers to instantiate an analysis environment that can be scaled up and down on demand, using nothing more than a web browser.

Gateway Hosting Past, Present, and Future

Authors:

Mike Lowe (Indiana University), David Hancock (Indiana University), Matt Link (Indiana University), Craig Stewart (Indiana University)


Time:

2:45pm - 5:15pm


Abstract:

The Gateway Hosting Project at IU has been active for three years, and several TeraGrid services and gateways use the service. The current state of the project and its future direction will be reviewed. Design choices for the current service will be examined, and both successes and shortcomings will be highlighted. Later this year a complete redesign of the service will take place, including both hardware and software changes. Details of the transition process will be outlined.

Trestles: A High-Productivity HPC System Targeted to Modest-Scale and Gateway Users

Authors:

Richard Moore (UCSD/SDSC), David Hart (NCAR), Wayne Pfeiffer (UCSD/SDSC), Mahidhar Tatineni (UCSD/SDSC), Kenneth Yoshimoto (UCSD/SDSC), William Young (UCSD/SDSC)


Time:

2:45pm - 5:15pm


Abstract:

Trestles is a new 100TF HPC resource at SDSC designed to enhance scientific productivity for modest-scale and gateway users within the TeraGrid. This paper discusses the Trestles hardware and user environment, as well as the rationale for targeting this user base and the planned operational policies and procedures to optimize scientific productivity, including a focus on turnaround time in addition to the traditional system utilization. A surprisingly large fraction of TeraGrid users run modest-scale jobs (e.g. <1K cores), and an increasing fraction of TeraGrid users access HPC resources via gateways; while these users represent a large percentage of the user base, they consume a smaller fraction of the TeraGrid resources. Thus, while Trestles is the fifth-largest HPC resource in TeraGrid, it will be able to support this large class of TeraGrid users in an environment designed to enhance their productivity. This targeted usage model also frees up other TeraGrid systems for users/jobs that require large-scale, SMP or other specific resource features. A key differentiator for Trestles is that it will be allocated and scheduled to optimize queue wait times and expansion factors, as well as the standard system utilization metric. In addition, the node design, with 32 cores and 64GB DRAM, will accommodate many jobs without inter-node communications, while the 120GB local flash memory will speed up many applications. A robust set of application software, including Gaussian, BLAST, Abaqus, GAMESS, Amber and NAMD, is installed on the system. Standard job limits are 32 nodes (1K cores) and 48 hour runtime, but exceptions can be made, particularly for long jobs up to two weeks. Standing system reservations ensure that some nodes are always set aside for shorter, smaller jobs and user-settable reservations are available to ensure users predictable access to the system. Nodes can be accessed in exclusive or shared mode. Finally, Trestles is the only TeraGrid resource with automatic on-demand access; a limited number of nodes is configured for jobs to "run at risk" (with a discount in the usage rate charged) and be subject to being pre-emptively killed by on-demand jobs (which carry a premium in the usage rate). The allocation, scheduling and software environments will be adjusted and tuned over time as usage patterns emerge and users provide feedback to further enhance their productivity.

Audited Credential Delegation: A Usable Identity Management Solution for Grid Environments

Authors:

Stefan Zasada (University College London), Ali Haidar (University College London), Peter Coveney (University College London)


Time:

2:45pm - 5:15pm


Abstract:

Grid infrastructures are dedicated to providing access to high performance computing resources distributed across multiple administrative domains. One major problem faced by end-users and administrators of computational grid environments arises from the usability of the security mechanisms usually deployed in these environments, in particular identity management solutions. Many of the existing computational grid environments use Public Key Infrastructure (PKI) and X.509 digital certificates as a cornerstone of their security architectures to provide the authentication and authorisation security goals. However, it is well documented that security solutions based on PKI lack user friendliness for both administrators and end-users, which is essential for the uptake of any grid security solution. The problems stem from the process of acquiring X.509 digital certificates, which can be a lengthy one, and from generating proxy certificates to access remote resources as part of the authentication process. As a result, many users engage in practices that weaken the security of the environment, such as sharing the private key of a single personal certificate to get on with their tasks.

In order to design a usable grid security solution that addresses authentication, authorisation and auditing, it is fundamental to understand end-users' and resource providers' requirements. End-users, such as scientists who are not security experts, are concerned with the results of the analyses they perform on grids rather than how to acquire and use digital certificates. Administrators are concerned with setting up virtual organizations (VOs) and administering their security infrastructures in an efficient way. Resource providers are concerned with securing access to their shared resources, tracing users responsible for performing a task on their resources, and avoiding the consequences of security breaches, including negative publicity and fines. We introduce the Audited Credential Delegation (ACD) security solution, which accommodates the above requirements. The solution has been developed using Java and Web services technologies. This allows its integration with any tool developed in a programming language that has web services libraries compliant with current Web Services standards. A model of ACD has been developed based on formal notation of the kind used for building safety-critical systems. The recommendations of the Open Web Application Security consortium for developing secure software were adopted in the implementation. ACD has been successfully integrated with the Application Hosting Environment (AHE), a lightweight grid middleware that provides a simple set of services, allowing users to interact with grid resources without requiring specific knowledge of the details of each grid resource they wish to use.

In this talk we will describe the architecture of the ACD system and discuss its relevance to the TeraGrid/XD communities.

An Information Architecture Based on Publish/Subscribe Messaging

Authors:

Warren Smith (University of Texas)


Time:

2:45pm - 5:15pm


Abstract:

Distributed cyberinfrastructures such as the TeraGrid have typically deployed information systems that are based on querying. Users query (or pull) information from an information system, and the services that make up an information system often query each other to distribute information around the system. The problem with this approach is that frequent queries can generate excess load, while less frequent queries result in information being stale when it arrives. This problem is exacerbated when information travels through several information services before reaching a destination.

Our proposed approach is to use publish/subscribe messaging as the foundation of an information system. This messaging model supports a push-style information flow where information is moved from where it is generated to where it is needed very quickly without any polling loops. In addition, this messaging model allows a variety of information producers in a number of locations and administrative domains to publish information to a centralized message broker without knowing the ultimate destination of this information. Similarly, an information consumer can subscribe for information from a message broker without knowing what component is producing the information.
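A minimal sketch of the push-style flow described above is given below. It uses a toy in-process broker purely for illustration; it is not the messaging software, topic layout, or protocol actually proposed in the paper.

```python
# Illustrative in-process publish/subscribe broker: producers publish to
# topics without knowing who consumes, and subscribers receive pushed
# updates without polling.
from collections import defaultdict

class Broker:
    def __init__(self):
        self._subscribers = defaultdict(list)   # topic -> list of callbacks

    def subscribe(self, topic, callback):
        """Register interest in a topic; the callback is invoked for each message."""
        self._subscribers[topic].append(callback)

    def publish(self, topic, message):
        """Deliver a message to every current subscriber of the topic."""
        for callback in self._subscribers[topic]:
            callback(message)

if __name__ == "__main__":
    broker = Broker()
    # An information consumer (e.g., a resource catalog) subscribes once...
    broker.subscribe("queue.status", lambda m: print("catalog got:", m))
    # ...and producers at each site publish updates as they happen,
    # so no component ever has to poll for fresh information.
    broker.publish("queue.status", {"site": "siteA", "jobs_queued": 42})
```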

Automated Grid-Probe System to Improve End-To-End Grid Reliability for a Science Gateway

Authors:

Lynn Zentner (Purdue University -- NCN), Steven Clark (Purdue University -- ITaP), Krishna Madhavan (Purdue University -- Engineering Education), Swaroop Shivarajapura (Purdue University -- NCN), Victoria Farnsworth (Purdue University -- NCN) and Gerhard Klimeck (Purdue University -- School of Electrical and Computer Engineering)


Time:

2:45pm - 5:15pm


Abstract:

In 2010, the science gateway nanoHUB.org, the world's largest nanotechnology user facility, hosted 9809 simulation users who performed 372,404 simulation runs. Many of these jobs are compute intensive runs that benefit from submission to clusters at Purdue, TeraGrid, and Open Science Grid. Most of the nanoHUB users are not computational experts but end-users who expect a complete and uninterrupted level of service. Within the ecology of grid computing resources we need to manage the grid submissions of these users transparently with the highest possible degree of user satisfaction. In order to best utilize grid computing resources, we have developed a grid probe protocol to test the job submission system from end to end. Beginning in January 2009, we have collected a total of 1.2 million probe results from job submissions to TeraGrid, OSG, Purdue, and nanoHUB compute clusters. We then utilized these results to intelligently submit jobs to various grid sites using a model for probability of success based in part on probe test history. In this paper we present details of our grid probe model, results from the grid probe runs, and discuss the data from production runs over the same time period. These results have allowed us to begin assessing our utilization of grid resources while providing our users with satisfactory outcomes.
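The site-selection idea sketched in the abstract (weighting submission targets by recent probe success) can be illustrated roughly as follows. The smoothing constants, history window, and weighting scheme are assumptions for illustration, not the model used by nanoHUB.

```python
# Illustrative sketch: estimate per-site success probability from recent
# grid-probe results and pick a submission site weighted by that estimate.
import random

def success_probability(probe_results, prior_success=1, prior_failure=1):
    """Smoothed success rate from a list of recent probe outcomes (True/False).

    The pseudo-counts keep new or rarely probed sites from being pinned to
    0 or 1 by a handful of results.
    """
    successes = sum(probe_results) + prior_success
    total = len(probe_results) + prior_success + prior_failure
    return successes / total

def choose_site(probe_history):
    """probe_history: {site_name: [True, False, ...]} most recent probes first."""
    sites = list(probe_history)
    weights = [success_probability(probe_history[s]) for s in sites]
    return random.choices(sites, weights=weights, k=1)[0]

if __name__ == "__main__":
    history = {
        "teragrid_siteA": [True, True, True, False, True],
        "osg_siteB":      [True, False, False, True, False],
        "purdue_cluster": [True] * 10,
    }
    print("submit to:", choose_site(history))
```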

A solution looking for lots of problems: Generic Portals for Science Infrastructure

Authors:

Thomas Uram (Mathematics and Computer Science Division, Argonne National Laboratory), Michael Papka (Mathematics and Computer Science Division, Argonne National Laboratory), Mark Hereld (Mathematics and Computer Science Division, Argonne National Laboratory) and Michael Wilde (Mathematics and Computer Science Division, Argonne National Laboratory)


Time:

2:45pm - 5:15pm


Abstract:

Science gateways have dramatically simplified the work required by science communities to run their codes on TeraGrid resources. Gateway development typically spans the duration of a particular grant, with the first production runs occurring some months after the award and concluding near the end of the project. Scientists use gateways as a means to interface with large resources. Our gateway infrastructure facilitates this by hiding the various details of the underlying resources and presenting an intuitive way to interact with them. In this paper, we present our work on GPSI, a general-purpose science gateway infrastructure that can be easily customized to meet the needs of an application, reducing the time to deployment and improving scientific productivity. Our contribution in this paper is two-fold: we elaborate our vision for a user-driven gateway infrastructure that includes components required by multiple science domains, thus aiding the speedy development of gateways, and we present our experience in moving from our initial portal implementations to the current effort based on Python and Django.

Benefits of NoSQL Databases for Portals & Science Gateways

Authors:

Matthew Hanlon (Texas Advanced Computing Center), Rion Dooley(Texas Advanced Computing Center), Stephen Mock (Texas Advanced Computing Center), Maytal Dahan (Texas Advanced Computing Center), Praveen Nuthulapati (Texas Advanced Computing Center) and Patrick Hurley (Texas Advanced Computing Center)


Time:

2:45pm - 5:15pm


Abstract:

Portals and gateways are increasingly offering users complex interfaces to interact with massive data sets. As dealing with big data becomes more commonplace, portal and gateway developers need to readdress how data is stored and rethink the supporting infrastructure that enables quick and simple access and analysis of data. It is becoming evident that traditional, relational databases are not always the most appropriate solution to allow users on-demand access to big data sets. In this study we show that using non-relational, "NoSQL" databases such as key-value stores and document stores can offer large benefits in performance, accessibility, and availability. We present a use case from the TeraGrid User Portal that demonstrates solutions for processing and auditing user job data efficiently in order to provide users rapid access to this data.
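As one concrete, purely illustrative example of the document-store approach, the snippet below stores portal job records in MongoDB and queries them per user. The collection layout and field names are assumptions, not the TeraGrid User Portal schema described in the paper.

```python
# Illustrative sketch: job records as self-contained documents in a NoSQL
# store, so the portal can serve per-user job views without relational joins.
from datetime import datetime
from pymongo import MongoClient, DESCENDING

client = MongoClient("mongodb://localhost:27017")   # placeholder connection
jobs = client["portal"]["jobs"]

# Index the fields the portal filters on most often.
jobs.create_index([("username", 1), ("submit_time", DESCENDING)])

# Ingest a job record as one document, audit data included.
jobs.insert_one({
    "username": "jdoe",
    "resource": "trestles",
    "cores": 32,
    "state": "COMPLETED",
    "submit_time": datetime(2011, 7, 1, 9, 30),
    "audit": {"charged_sus": 128.0, "queue": "normal"},
})

# On-demand view: the ten most recent jobs for one user.
for job in jobs.find({"username": "jdoe"}).sort("submit_time", DESCENDING).limit(10):
    print(job["resource"], job["state"], job["submit_time"])
```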

Building Gateways for Life-Science Applications using the Distributed Adaptive Runtime Environment (DARE) Framework

Authors:

Joohyun Kim (Center for Computation and Technology, LSU), Sharath Maddineni (Center for Computation and Technology, LSU) and Shantenu Jha (Center for Computation and Technology, LSU)


Time:

2:45pm - 5:15pm


Abstract:

This work is predicated on three important trends: (i) the importance, impact and percentage of TeraGrid/XD resources assigned to the life sciences is increasing at a rate that is probably greater than in other disciplines; (ii) gateways have proven to be a very effective access mechanism to the distributed HPC resources provided by TG/XD, and in particular a very successful model for shared/community access; and (iii) there are missing capabilities and abstractions to enable the use of the collective capacity of distributed cyberinfrastructure such as TG/XD, especially ones that can be used to develop gateways in an easy, extensible and scalable fashion for both compute-intensive and data-intensive applications. We introduce the Distributed Adaptive Runtime Environment (DARE) framework, a SAGA-based higher-level abstraction from which extensible, versatile gateways that seamlessly use scalable infrastructure can be built effectively for life-science applications. We discuss the architecture of DARE-framework-based gateways and four life-science gateways -- DARE-RFOLD, DARE-DOCK, DARE-HTHP and DARE-NGS -- that use the DARE framework to provide a wide range of life-science capabilities.

Nascent HPC Activities at an HBCU

Authors:

M. Farrukh Khan and C. J. Tymczak (Texas Southern University)


Time:

2:45pm - 5:15pm


Abstract:

We describe the formative activities related to HPC applications and the TeraGrid at Texas Southern University (TSU). TSU is an urban MSI (minority-serving institution) and HBCU with a very small student body in STEM areas (7% of the undergraduate population) and a vibrant, diverse faculty. Limited resources compound the challenges faced by the HPC community at TSU. We describe both the difficulties and the successes we encountered in building HPC access and use at TSU.

Developing Effective Instructional Materials

Authors:

Sandie Kappes (National Center for Supercomputing Applications)


Time:

2:45pm - 5:15pm


Abstract:

The purpose of instruction is to build new knowledge and skills in the target audience. Its success is measured by whether students have been transformed by the new knowledge and skills acquired; in other words, they need to be able to do something they could not do before receiving the instruction. It is not enough to simply convey information on a topic and call it instruction. Information transmittal is not without value, but it only teaches about something, not how to do it.

There are two primary activities in designing successful, or effective, instruction. They are: 1) decide what the learner should be able to do at the end of the instruction and 2) design the instruction so that the learner acquires the knowledge and/or skills to do it. Note that both of these activities are learner-centered and performance-based. They address the key issues of instruction – 1) the new knowledge the learner needs to build and 2) the skills they need to attain in order to perform the job correctly.

This paper provides an overview of a simple instructional design process that subject matter experts can use to write effective instructional materials for online delivery to TeraGrid users. This process is based on instructional design and learning theories but does not require expertise in these areas to be applied successfully. Development of both new instructional materials and retrofit of existing content-rich materials into effective instruction will be covered. Development of a sample course in CI-Tutor (www.citutor.org), one of the TeraGrid's online course delivery environments, will be described to illustrate the key components of the instructional design process.

Excursion to Clark Planetarium

 

This event is FREE for registered TeraGrid '11 participants; $30 for guests, which includes the planetarium show and meal. Please sign up to attend this event when you check in or register on-site for TeraGrid '11.


Time:

6:30pm - 10:00pm


Abstract:

N/A

The Sunset of TeraGrid

Speaker:

John Towns (TeraGrid)


Time:

8:30am - 9:45am


Abstract:

Since 2001, the TeraGrid has developed into a world-class integrated, national-scale computational science infrastructure with funding from the NSF's Office of Cyberinfrastructure (OCI). Recently, the TeraGrid project came to an end and has been succeeded by the NSF's eXtreme Digital program, opening a new chapter in cyberinfrastructure by creating the most advanced, powerful, and robust collection of integrated advanced digital resources and services in the world. This talk will provide a brief retrospective on the TeraGrid project and introduce the new program.

Carry It Forward: From TeraGrid to XSEDE

Time:

5:30pm - 7:00pm

Abstract:

In preparation for the transition from TeraGrid to XSEDE, a BOF of TeraGrid users will discuss the lessons learned from TeraGrid that are pertinent to XSEDE. The session will include an overview of XSEDE that highlights the differences from TeraGrid, focusing on those differences most likely to impact users.

The session will be used to gather feedback from users regarding their experiences with TeraGrid, to determine what worked well and what did not. We will discuss four topics:

  • Impact on Science and Engineering
  • Impact on Researchers, Educators and the Broader Community
  • Impact on Education, Outreach and Training
  • Administration and Organization

We will ask attendees for suggestions about what general and specific lessons should be learned from the TeraGrid experience. These suggestions will help guide the TeraGrid in writing its final report, which will influence future cyberinfrastructure projects as well as future large-scale projects with distributed management.

The discussion will also address users' hopes, concerns, and suggestions regarding XSEDE.

Time:

5:30pm - 7:00pm

Abstract:

[missing]

XSEDE Advanced User Support and Campus Champion Fellows program information

Time:

5:30pm - 7:00pm

Abstract:

Come learn about new opportunities to apply the latest cyberinfrastructure offerings to your science challenges by working for an extended period with XSEDE's Advanced User Support team.

The program has four areas of focus:
  • Advanced Support for Research Teams (ASRT) provides dedicated support to individual research teams. Typical ASRT projects will last for six months to one year and might include optimizing and scaling application codes; aggregating petabyte databases from distributed heterogeneous sources and mining them interactively; or helping to discover and adapt the best workflow and dataflow solution for simulations generating large amounts of data.
  • Advanced Support of Community Capabilities (ASCC) efforts are aimed at deploying, hardening, and optimizing software systems used by large research communities. Collaborations may involve projects from NSF-funded programs such as PetaApps, SDCI, STCI, SI2 and MREFC. Projects of this type will also support communities that wish to use XD resources via science gateways or data repositories.
  • Advanced Support for Training, Education and Outreach (ASTEO) will include the development and delivery (in person or online) of training modules on many aspects of cyberinfrastructure - petascale programming, workflow building, co-scheduled data transfers, data reduction concurrent with simulation, and algorithms for petascale data mining - as well as domain-specific results. Users may recommend a topic they'd like to see addressed or request a presentation or tutorial at their institution.
  • Novel and Innovative Projects (NIP) will extend the impact of XSEDE to new user communities. Examples of novel science areas might include plant science, epidemiology, economics, neuroscience, and many subfields of computer science. Examples of demographic diversity will include researchers based at MSIs and EPSCoR institutions, and SBIR recipients. Examples of innovative technologies might include applications supporting mobile computing clients and seamless integration of distributed, heterogeneous databases.

Also new is the Campus Champion Fellows Program which will expand expertise at a campus level by sending 8 campus champions per summer to an AUSS site to work side-by-side with AUSS mentors on real world, high priority projects. Fellows will develop expertise that will enable them to teach training classes related to their newly acquired skills, and will present their experiences at a symposium before returning to their home institutions.

FutureGrid: What an Experimental Infrastructure Can Do for You

Time:

5:30pm - 7:00pm

Abstract:

The rapidly evolving technology landscape creates the need for experimenting with new paradigms, technologies and tools. Domain scientists need a way to test whether a "new way of doing things" does indeed satisfy their needs, technology experts need an environment in which to develop and experiment with new approaches to solving problems in distributed environments, and students need an environment in which to learn, try out new things, and develop solutions. All these activities create a need for an environment that provides resources in which such experiments and educational activities can take place.

FutureGrid is a scientific instrument designed to support experiment-driven research in all areas of computer science related to parallel, large-scale or distributed computing and networking. This includes providing an environment that is flexible enough to support Computer Science research as well as stable enough to support paradigm exploration and trial runs from the domain sciences. The ability to conduct consistent, controlled and repeatable large-scale experiments in all areas of computer science related to parallel, large-scale or distributed computing and networking, as well as the availability, repeatability, and open sharing of electronic products, are core to the FutureGrid mission.

This Birds-of-a-Feather session will provide a forum for users to get informed about the opportunities available on FutureGrid, present success stories describing different ways in which it has been used in the past year, and encourage discussion from the participants on features that they would like to see supported by this infrastructure. The BOF will be organized and moderated by Renato Figueiredo, Kate Keahey and Warren Smith who will also drive the discussion on various aspects of use of FutureGrid.

Globus GRAM 5: Requirements and Use Cases for XSEDE

Time:

5:30pm - 7:00pm

Abstract:

As we develop Globus GRAM5, we've been gathering definitions, requirements and use cases from TeraGrid members to ensure the product meets the needs of this important user community. At TG11, we plan on continuing this level of engagement with the XSEDE community. In this BOF, we will review the current plans for GRAM5 and open the floor to discuss any changes or improvements you'd like to see. This will be a valuable session for end-users, administrators, gateway developers and others who use (or plan to use) Globus GRAM in XSEDE. Please join us to ensure we develop GRAM5 to meet your needs.

Some of the changes we will be discussing and prioritizing include:

  • How to eliminate the need to maintain custom GRAM LRM patches by adding configuration files or procedural callouts
  • How to better interface with newer machines like Kraken that require job startup using aprun
  • How to provide a better way to specify job "geometry" on machines like Kraken and Blacklight
  • How to provide a better mechanism whereby an external process can gather accounting information per GRAM job
  • How to allow an admin to configure additional LRM parameters that can be added to every job

Local Campus Cyberinfrastructure and Campus Bridging

Time:

5:30pm - 7:00pm

Abstract:

Many academic institutions provide local cyberinfrastructure for computational research, but on occasion more computational resources are needed.

Campus bridging is an effort within XSEDE to foster information exchange and strategic planning, designed to assist campus leaders, researchers, and educators in effectively utilizing local, regional and national CI resources to advance scientific discovery.

This BoF session will begin with a forum on how to facilitate a smooth migration from Campus Clusters to XSEDE, with strategies for Operations and Software Deployment. The session targets individuals who extensively use or administer local computational resources, to discuss operational issues and how best to help users adapt research software and scripts from local to national systems.

The BOF will provide the community with the opportunity to learn about the XSEDE architecture plans and how those plans will affect the campus bridging effort. Feedback on these plans will be welcomed.

Finally, the BOF will provide the community and XSEDE the opportunity to discuss plans for engaging a few pilot campuses in helping to prototype the XSEDE architecture within their campuses during the next year.

 

THURSDAY JULY 21

TIME PRESENTATION PRESENTER LOCATION TYPE
7:15AM - 8:15AM Breakfast Salons E & F
8:15AM - 8:45AM Award Ceremony Grand Ballroom Salons A-D
11:00AM - 12:00PM The Data and Compute-Driven Transformation of Modern Science: The Role of TeraGrid and Beyond Ed Seidel Grand Ballroom Salons A-D
9:00AM - 9:30AM Solving the Strong Interactions of Quarks and Gluons through High Performance Computing Carleton DeTar Salon G&H Science Session
9:30AM - 10:00AM Subset Removal On Massive Data with Dash Jonathan Myers, Robert Sinkovits, Mahidhar Tatineni Salon G&H
10:00AM - 10:30AM Modern, Scalable, Reliable Modeling of Turbulent Combustion S. Levent Yilmaz, Patrick Pisciuneri, Mehdi Nik, Peyman Givi Salon G&H
10:30AM - 11:00AM An Efficient Parallelized Discrete Particle Model for Dense Gas-Solid Flows on Unstructured Mesh Chunliang Wu, K. Nandakumar Salon G&H
9:00AM - 9:30AM The Development of Mellanox/NVIDIA GPUDirect over InfiniBand -- a New Model for GPU to GPU Communications Gilad Shainer, Pak Lui Salon I&J Technology Session
9:30AM - 10:00AM Early Experiences with the Intel Many Integrated Cores Accelerated Computing Technology Lars Koesterke Salon I&J
10:00AM - 10:30AM Visualization of Multiscale Simulation Data: Brain Blood Flow Joseph Insley, Leopold Grinberg, Michael Papka Salon I&J
10:30AM - 11:00AM Interactive Large Data Exploration over the Wide Area Joseph Insley, Mark Hereld, Eric Olson, Michael Papka, Venkatram Vishwanath, Michael Norman, Rick Wagner Salon I&J
9:00AM - 9:30AM Enabling online geospatial isotopic model development and analysis Hyojeong Lee, Lan Zhao, Gabriel Bowen, Christopher Miller, Ajay Kalangi, Tonglin Zhang, Jason West Deer Valley 1&2 Gateway Session
9:30AM - 10:00AM Virtual Laboratory for Planetary Materials (VLab): An Updated Overview of System Service Architecture Pedro Da Silveira, Maribel Nunez, Renata Wentzcovitch, Marlon Pierce, Cesar Da Silva, David Yuen Deer Valley 1&2
10:00AM - 10:30AM Molecular Parameter Optimization Gateway (Paramchem) Jayeeta Ghosh, Ye Fan, Nikhil Singh, Kenno Vanomesslaeghe, Suresh Marru, Sudhakar Pamidighantam Deer Valley 1&2
10:30AM - 11:00AM A European framework to build Science Gateways: architecture and use cases Valeria Ardizzone, Roberto Barbera, Antonio Calanducci, Marco Fargetta, Elisa Ingra', Giuseppe La Rocca, Salvatore Monforte, Fabrizio Pistagna, Riccardo Rotondo, Diego Scardaci Deer Valley 1&2
9:00AM - 11:00AM XSEDE TEOS Plans for Student Engagement XSEDE TEOS Student Engagement team: Laura McGinnis, PSC, Beth Albert, PSC Solitude Birds-of-a-Feather
12:00PM - 1:00PM Lunch Salons E & F

(Invited talk) Solving the Strong Interactions of Quarks and Gluons through High Performance Computing

Speaker:

Carleton DeTar (University of Utah)


Time:

10:00am - 12:00pm


Abstract:

Quarks and gluons make up the bulk of the visible universe. They combine to make protons and neutrons among other particles. The well-accepted theory of quantum chromodynamics (QCD) explains their interactions, but solving this innocuously simple theory from first principles is possible, as far as we know, only through massive numerical simulation. Over the past decade the algorithms and techniques for solving QCD through numerical simulation have become increasingly refined and effective, while, at the same time, the capabilities of high performance computing tools and national resources have grown. Today, as a result, we are able to do calculations of the properties of strongly interacting matter with unprecedented accuracy. The implications are far reaching: (1) We are subjecting the Standard Model of the fundamental interactions of elementary particles to increasingly stringent tests. Combining the "precision frontier" of numerical simulation with the "high energy frontier" of the Large Hadron Collider could very well lead to the discovery of new, fundamental physics beyond the Standard Model. (2) We are gaining insights into the properties of strongly interacting matter in the extreme environment of the early universe, moments after the big bang. (3) We are determining fundamental constants of nature with increasing accuracy, which puts strong constraints on models of physics beyond the Standard Model.

I will give a brief review of the history of numerical QCD calculations during the TeraGrid era, give highlights of recent results, and speculate on advances to come in the Extreme Digital era.

Subset Removal On Massive Data with Dash

Authors:

Jonathan Myers (Large Synoptic Survey Telescope, Tucson, AZ) Robert Sinkovits (University of California San Diego/SDSC), Mahidhar Tatineni (University of California San Diego/SDSC)


Time:

10:00am - 12:00pm


Abstract:

Ongoing efforts by the Large Synoptic Survey Telescope (LSST) involve the study of asteroid search algorithms and their performance on both real and simulated data. Images of the night sky reveal large numbers of events caused by the reflection of sunlight from asteroids. Detections from consecutive nights can then be grouped together into tracks that potentially represent small portions of the asteroids' sky-plane motion.

The analysis of these tracks is extremely time consuming and there is strong interest in the development of techniques that can eliminate unnecessary tracks, thereby rendering the problem more manageable. One such approach is to collectively examine sets of tracks and discard those that are subsets of others. Our implementation of a subset removal algorithm has proven to be fast and accurate on modest sized collections of tracks, but unfortunately has extremely large memory requirements for realistic data sets and cannot effectively use conventional high-performance computing resources. We report our experience running the subset removal algorithm on the TeraGrid Appro Dash system, which uses the vSMP software developed by ScaleMP to aggregate memory from across multiple compute nodes to provide access to a large, logical shared memory space. Our results show that Dash is ideally suited for this algorithm and has performance comparable to or superior to that obtained on specialized, heavily demanded, large-memory systems such as the SGI Altix UV.
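To make the operation concrete, here is a small, purely illustrative sketch of subset removal over track collections, treating a track as a set of detection IDs. It is not the LSST implementation and deliberately ignores the memory-management issues the paper is actually about.

```python
# Illustrative only: discard tracks whose detections are a subset of some
# other track's detections. Real LSST-scale inputs are far too large for
# this naive in-memory approach, which is exactly why large shared memory matters.
def remove_subsets(tracks):
    """tracks: list of iterables of detection IDs; returns the non-subset tracks."""
    track_sets = [frozenset(t) for t in tracks]
    # Process longer tracks first: a track can only be a proper subset of a
    # track at least as long, so checking against the kept list suffices.
    order = sorted(range(len(track_sets)), key=lambda i: -len(track_sets[i]))
    keep = []
    for i in order:
        if not any(track_sets[i] < track_sets[j] for j in keep):
            keep.append(i)
    return [tracks[i] for i in sorted(keep)]

if __name__ == "__main__":
    tracks = [(1, 2, 3), (2, 3), (4, 5), (1, 2, 3, 6)]
    print(remove_subsets(tracks))   # (2, 3) and (1, 2, 3) are dropped
```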

Modern, Scalable, Reliable Modeling of Turbulent Combustion

Authors:

S. Levent Yilmaz (University of Pittsburgh), Patrick Pisciuneri (University of Pittsburgh), Mehdi Nik (University of Pittsburgh), Peyman Givi (University of Pittsburgh)


Time:

10:00am - 12:00pm


Abstract:

"Turbulence is the most important unsolved problem of classical physics." That was Richard Feynman decades ago, referring to a century old problem. Today, the situation is no different. Turbulent combustion, which deals with a fluid mixture reacting and mixing under turbulent conditions (as found in rockets, jet engines, power generators, car engines, furnaces, ...), is harder still. While a solution that would satisfy a physicist is yet to be found, engineers all over the world are tackling the problem with computational modeling and simulation. There are a plethora of models for turbulence and combustion with a whole wide range of competing characteristics of applicability, accuracy, reliability and computational cost. Nowadays, reliability is the key feature required of such modeling (but, most often than not, sacrificed or oversight) for the design of environment friendly and efficient machines.

There exists an unproven (but undeniable) direct correlation between reliability and computational cost. However, the era of sacrificing the former because one cannot overcome or afford the latter for a full-scale engineering application is over, thanks to the TeraGrid and other open-research resources, coupled with the relentless efforts of countless developers to provide software that runs faster and better. This project is one example of how these resources are utilized to overcome an important research problem. We take on the Filtered Density Function (FDF) for large eddy simulation (LES) of turbulent reacting flow, a novel and robust methodology that can provide very accurate predictions for a wide range of flow conditions. FDF involves an expensive particle/mesh algorithm in which stiff chemical-reaction computations cause interesting, problem-specific, and in most cases extremely imbalanced (a couple of orders of magnitude) computational loads. We introduce an advanced implementation with a simple and smart parallelization strategy that combines optimized solvers and high-level parallelization libraries (e.g., Zoltan). A brief outline of the methodology and a discussion of the implementation will be presented, along with results and benchmarks on the TeraGrid.
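
As a rough illustration of the load-imbalance issue described above, and not the authors' Zoltan-based implementation, the following Python sketch assigns particle ensembles with widely varying chemistry costs to ranks using a greedy longest-processing-time heuristic; the cost values and rank count are hypothetical:

```python
import heapq

def balance_particles(costs, n_ranks):
    """Greedily assign per-ensemble chemistry costs to ranks so the total
    estimated work per rank is roughly equal.

    costs: list of estimated work per particle ensemble (hypothetical units).
    Returns one list of ensemble indices per rank.
    """
    # Min-heap of (accumulated_cost, rank_id); always add to the lightest rank.
    heap = [(0.0, r) for r in range(n_ranks)]
    heapq.heapify(heap)
    assignment = [[] for _ in range(n_ranks)]
    # Place the most expensive ensembles first (classic LPT heuristic).
    for idx in sorted(range(len(costs)), key=lambda i: costs[i], reverse=True):
        load, rank = heapq.heappop(heap)
        assignment[rank].append(idx)
        heapq.heappush(heap, (load + costs[idx], rank))
    return assignment

if __name__ == "__main__":
    # Stiff-chemistry regions can cost orders of magnitude more than others.
    costs = [100.0, 1.0, 1.0, 90.0, 2.0, 1.0, 80.0, 1.0]
    print(balance_particles(costs, 2))
```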

An Efficient Parallelized Discrete Particle Model for Dense Gas-Solid Flows on Unstructured Mesh

Authors:

Chunliang Wu (Louisiana State University), K. Nandakumar (Louisiana State University)


Time:

10:00am - 12:00pm


Abstract:

An efficient, parallelized implementation of the discrete particle/element model (DPM/DEM) coupled with a computational fluid dynamics (CFD) model has been developed. Two parallelization strategies are used to partly overcome the poor load balancing caused by the heterogeneous particle distribution in space. First, at the coarse-grained level, the solution domain is decomposed into partitions using a bisection algorithm that minimizes the number of faces at the partition boundaries while keeping an almost equal number of cells in each partition. The solution of the gas-phase governing equations is performed on these partitions. Particles, and the solution of their dynamics, are associated with partitions according to their hosting cells. As a result, no data exchange between processors is needed when calculating the hydrodynamic forces on particles. By introducing proper data mapping between partitions, the cell void fraction is calculated accurately even if a particle is shared by several partitions. Neighboring partitions are grouped by a gross evaluation before the simulation, with each group having a similar number of particles. The computation for a group of partitions is assigned to a compute node, which has multiple cores or processors sharing memory. Each core or processor in a node takes the computation of the gas governing equations in one partition. Processors communicate and exchange data through the Message Passing Interface (MPI) at this coarse-grained level. Second, multithreading is used to parallelize the computation of particle dynamics in each partition. The number of compute threads is determined by the number of particles in the partition and the number of cores in the compute node. In this way, threads within a compute node rarely wait on one another. Since the particle counts across compute nodes are nearly the same, this strategy yields efficient load balancing among compute nodes. Numerical experiments on the TeraGrid HPC cluster Queen Bee show that the developed code is efficient and scales to simulations of dense gas-solid flows with tens of millions of particles on 128 compute nodes. Bubbling in a mid-scale fluidized bed and the granular Rayleigh-Taylor instability are well captured by the parallel code.
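
A purely illustrative sketch of the two-level structure described in the abstract, not the authors' code: one process would own each mesh partition, and a thread pool inside the process is sized by the local particle count. The particle representation and sizing rule are assumptions, and Python threads are used only to show the structure (a real implementation would use MPI plus native threads):

```python
from concurrent.futures import ThreadPoolExecutor

def advance_particle(p, dt):
    """Toy particle update standing in for the DEM force/position solve."""
    p["v"] = [v + 9.81 * dt for v in p["v"]]   # gravity only, for illustration
    p["x"] = [x + v * dt for x, v in zip(p["x"], p["v"])]
    return p

def advance_partition(particles, dt, max_threads=8):
    """Second-level parallelism: the thread count scales with the local
    particle count, so lightly loaded partitions use fewer threads."""
    n_threads = max(1, min(max_threads, len(particles) // 1000 + 1))
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        return list(pool.map(lambda p: advance_particle(p, dt), particles))

if __name__ == "__main__":
    # In the real code each MPI rank owns one mesh partition; here we fake one.
    particles = [{"x": [0.0, 0.0, 0.0], "v": [0.0, 0.0, 0.0]} for _ in range(2500)]
    particles = advance_partition(particles, dt=1e-3)
    print(len(particles), particles[0]["x"])
```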

The Development of Mellanox/NVIDIA GPUDirect over InfiniBand -- a New Model for GPU to GPU Communications

Authors:

Gilad Shainer (Mellanox Technologies)


Time:

10:00am - 12:00pm


Abstract:

The usage and adoption of general-purpose GPUs (GPGPUs) in HPC systems is increasing due to the unparalleled performance advantage of the GPUs and their ability to fulfill the ever-increasing demand for floating-point operations. While the GPU can offload many of an application's parallel computations, the system architecture of a GPU-CPU-InfiniBand server does require the CPU to initiate and manage memory transfers between remote GPUs via the high-speed InfiniBand network. In this paper we introduce, for the first time, an innovative new technology, GPUDirect, that enables Tesla GPUs to transfer data via InfiniBand without the involvement of the CPU or buffer copies, hence dramatically reducing GPU communication time and increasing overall system performance and efficiency. We also explore, for the first time, the performance benefits of GPUDirect using the Amber and LAMMPS applications.

Early Experiences with the Intel Many Integrated Cores Accelerated Computing Technology

Authors:

Lars Koesterke (TACC)


Time:

10:00am - 12:00pm


Abstract:

N/A

Visualization of Multiscale Simulation Data: Brain Blood Flow

Authors:

Joseph Insley (Argonne National Laboratory), Leopold Grinberg (Brown University), Michael Papka (Argonne National Laboratory)


Time:

10:00am - 12:00pm


Abstract:

Accurately modeling many physical and biological systems requires simulating at multiple scales. This results in large, heterogeneous data sets on vastly differing scales, both physical and temporal. To address the challenges in multiscale data analysis and visualization, we have developed and successfully applied a set of tools, which we describe in this paper.

Interactive Large Data Exploration over the Wide Area

Authors:

Joseph Insley (Argonne National Laboratory), Mark Hereld (Argonne National Laboratory), Eric Olson (University of Chicago), Michael Papka (Argonne National Laboratory), Venkatram Vishwanath (Argonne National Laboratory), Michael Norman (San Diego Supercomputer Center/University of California, San Diego), Rick Wagner (San Diego Supercomputer Center/University of California, San Diego)


Time:

10:00am - 12:00pm


Abstract:

The top supercomputers typically have aggregate memories in excess of 100 TB, with simulations running on these systems producing datasets of comparable size. The size of these datasets and the speed with which they are produced define the minimum performance that modern analysis and visualization must achieve. We report on interactive visualizations of large simulations performed on Kraken at the National Institute for Computational Sciences using the parallel cosmology code Enzo, with grid sizes ranging from 1024³ to 6400³. In addition to the asynchronous rendering of over 570 timesteps of a 4096³ simulation (150 TB in total), we developed the ability to stream the rendering result to multi-panel display walls, with full interactive control of the renderer(s).

To enable the Enzo scientists to interactively explore their multiterabyte datasets, we have developed a flexible framework that couples our hybrid-parallel volume renderer vl3, running on Eureka at Argonne National Laboratory, to parallel display software running on a tiled display wall at the San Diego Supercomputer Center, over dedicated high-bandwidth dynamic virtual local area networks, using an AJAX-based Web client for control of the vl3 processes. Vl3 is a modular parallel volume rendering system, enabling various components to be easily swapped out and extended. It uses a ray casting method to implement direct volume rendering using OpenGL and the OpenGL Shading Language (GLSL). In order to manipulate the Enzo vl3 volume rendering, its controls are exposed through a control channel made available via a web browser.

This control is connectionless, so there is added latency, but it has the advantages of being cross-platform and not requiring a client installation. The rendered image is divided into rectangular pieces, which are sent as separate streams to the tiled display cluster. The picture is divided up because the display responsibilities in a cluster-based tiled display are split between the computers in the cluster. We use the Celeritas library to facilitate high-throughput streaming of the raw pixel data over TCP. We have created flTile, a tiled display application that determines which machines show which parts of an image or movie so that they look correct on the full display. It uses simple OpenGL for rendering and MPI to handle synchronization between machines.
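
The tile-streaming idea can be sketched generically; the code below does not use Celeritas or flTile, and the display-node host name in the comment is hypothetical. It splits a raw RGB framebuffer into rectangular regions and shows how each region's pixels would be streamed over TCP with a small header:

```python
import socket
import struct

def split_into_tiles(frame, width, height, cols, rows):
    """Divide a raw RGB framebuffer (bytes, row-major) into rectangular tiles.

    Yields (col, row, tile_width, tile_height, tile_bytes). For simplicity it
    assumes width and height divide evenly by cols and rows.
    """
    tw, th = width // cols, height // rows
    for r in range(rows):
        for c in range(cols):
            scanlines = []
            for y in range(r * th, (r + 1) * th):
                start = (y * width + c * tw) * 3  # 3 bytes per RGB pixel
                scanlines.append(frame[start:start + tw * 3])
            yield c, r, tw, th, b"".join(scanlines)

def send_tile(host, port, col, row, tw, th, pixels):
    """Stream one tile to the display node responsible for (col, row):
    a small fixed header (position and size) followed by the raw pixels."""
    with socket.create_connection((host, port)) as sock:
        sock.sendall(struct.pack("!4I", col, row, tw, th))
        sock.sendall(pixels)

if __name__ == "__main__":
    width, height = 640, 480
    frame = bytes(width * height * 3)  # all-black test frame
    for col, row, tw, th, pixels in split_into_tiles(frame, width, height, 2, 1):
        print(f"tile ({col},{row}): {tw}x{th}, {len(pixels)} bytes")
        # In a real deployment each tile would be sent to its display node, e.g.:
        # send_tile("tile-node-0.example.org", 9000, col, row, tw, th, pixels)
```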

An Information Architecture Based on Publish/Subscribe Messaging

Abstract:

Distributed cyberinfrastructures such as the TeraGrid have typically deployed information systems that are based on querying. Users query (or pull) information from an information system, and the services that make up an information system often query each other to distribute information around the system. The problem with this approach is that frequent queries can generate excess load, while less frequent queries result in information being stale when it arrives. This problem is exacerbated when information travels through several information services before reaching a destination.

Our proposed approach is to use publish/subscribe messaging as the foundation of an information system. This messaging model supports a push-style information flow in which information moves from where it is generated to where it is needed very quickly, without any polling loops. In addition, this messaging model allows a variety of information producers across locations and administrative domains to publish information to a centralized message broker without knowing the ultimate destination of that information. Similarly, an information consumer can subscribe to information from a message broker without knowing which component produces it.
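
A minimal, in-process sketch of the push model being proposed, not the TeraGrid information services themselves; the broker class and topic name are hypothetical:

```python
from collections import defaultdict

class MessageBroker:
    """Toy in-process broker: producers publish to topics, and subscribers
    have messages pushed to their callbacks immediately, with no polling."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        self._subscribers[topic].append(callback)

    def publish(self, topic, message):
        # Push-style delivery: every subscriber is called as soon as the
        # information is produced, so nothing goes stale in a poll cycle.
        for callback in self._subscribers[topic]:
            callback(message)

if __name__ == "__main__":
    broker = MessageBroker()
    # A consumer does not need to know which resource provider publishes.
    broker.subscribe("queue_status", lambda m: print("received:", m))
    # A producer does not need to know who ultimately consumes the update.
    broker.publish("queue_status", {"resource": "example-cluster",
                                    "jobs_queued": 42})
```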

Enabling online geospatial isotopic model development and analysis

Authors:

Hyojeong Lee (Purdue University), Lan Zhao (Purdue University), Gabriel Bowen (Purdue University), Christopher Miller (Purdue University), Ajay Kalangi (Purdue University), Tonglin Zhang (Purdue University) and Jason West (Texas A&M)


Time:

10:00am - 12:00pm


Abstract:

In recent years, there has been rapid growth in the amount of environmental data collected over large spatial and temporal scales. This growth presents unprecedented opportunities for new scientific discovery, while at the same time posing significant challenges to the research community in effectively identifying these datasets and integrating them into research models and tools. In this paper, we describe the design and implementation of IsoMAP - a gateway for Isoscapes (isotopic landscapes) modeling, analysis, and prediction. IsoMAP provides an online workspace that helps researchers access and integrate a number of disparate and diverse datasets, develop Isoscapes models over selected spatio-temporal domains using geostatistical algorithms, and predict maps of the stable isotope ratios of water, plants, and soils. The IsoMAP system leverages the computation resources available on the TeraGrid to perform geospatial data operations and geostatistical model calculations. It builds on a variety of open source technologies for GIS, geospatial data management and processing, grid computing, and gateway development. The system was successfully used to teach a tutorial at the 2011 conference on the Roles of Stable Isotopes in Water Cycle Research. A post-tutorial survey was conducted; we review the users' feedback and present a future development plan based on it.

Virtual Laboratory for Planetary Materials (VLab): An Updated Overview of System Service Architecture

Authors:

Pedro Da Silveira (University of Minnesota), Maribel Nunez (University of Minnesota), Renata Wentzcovitch (University of Minnesota), Marlon Pierce (Indiana University), Cesar Da Silva (Universidade Federal de Uberlândia) and David Yuen (University of Minnesota)


Time:

10:00am - 12:00pm


Abstract:

In this paper we review the main features and illustrate the use of VLab, a Science Gateway that provides cyberinfrastructure (CI) for distributed first-principles computations in materials science. The VLab CI includes web services running on different computers, controlled from a web portal that runs predefined, distributed, and interactive workflows for multiple users. Currently, in addition to simple electronic structure calculations, it supports calculations of materials properties important for mineral physics. These properties, such as elastic constants, thermodynamic properties, and static and thermal equations of state, usually require a substantial number of tasks. VLab uses the Quantum ESPRESSO software for first-principles computations. Here, we show details of the task distribution in the batch system using a "bag of tasks" approach. We also explain VLab's approach to interactive user tools.
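
A minimal sketch of the "bag of tasks" pattern mentioned above, not VLab's actual batch-system integration; the task function stands in for an independent first-principles calculation, and its inputs and toy result are hypothetical:

```python
from concurrent.futures import ProcessPoolExecutor, as_completed

def run_task(params):
    """Stand-in for one independent calculation, e.g. one volume point of an
    equation-of-state scan. The quadratic 'energy' is a toy placeholder."""
    volume = params["volume"]
    return {"volume": volume, "energy": (volume - 10.0) ** 2}

if __name__ == "__main__":
    # The "bag": independent tasks with no ordering constraints.
    bag = [{"volume": v} for v in range(6, 15)]
    results = []
    with ProcessPoolExecutor(max_workers=4) as pool:
        futures = [pool.submit(run_task, t) for t in bag]
        # Workers pull tasks as they free up; completion order does not matter.
        for fut in as_completed(futures):
            results.append(fut.result())
    print(sorted(results, key=lambda r: r["volume"]))
```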

Molecular Parameter Optimization Gateway (Paramchem)

Authors:

Jayeeta Ghosh (NCSA), Ye Fan (NCSA), Nikhil Singh (NCSA), Kenno Vanomesslaeghe (University of Maryland), Suresh Marru (Indiana University) and Sudhakar Pamidighantam (NCSA)


Time:

10:00am - 12:00pm


Abstract:

Parameter optimization for chemical systems requires the generation of initial guesses. These parameters should be generated by systematically sampling the parameter space and minimizing the differences between output data and the corresponding reference data. In this paper we discuss the ParamChem project, which is creating reusable and extensible infrastructure for the computational chemistry community that will reduce unnecessary computation and eliminate redundancies in parametrized computations, using modern software engineering tools.
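
As a toy illustration of the sampling-and-minimization idea, not ParamChem's algorithm, the sketch below scans a small parameter grid and picks the initial guess that minimizes the squared difference from reference data; the model, grid, and data are all hypothetical:

```python
import itertools

def model_output(params, x):
    """Toy surrogate for a parametrized energy term (hypothetical)."""
    k, x0 = params
    return k * (x - x0) ** 2

def objective(params, reference):
    """Sum of squared differences between model output and reference data."""
    return sum((model_output(params, x) - y) ** 2 for x, y in reference)

if __name__ == "__main__":
    # Reference data the parameters should reproduce (hypothetical).
    reference = [(x / 10.0, 2.5 * (x / 10.0 - 1.2) ** 2) for x in range(25)]
    # Systematic sampling of parameter space to select an initial guess.
    k_values = [1.0, 2.0, 2.5, 3.0]
    x0_values = [1.0, 1.1, 1.2, 1.3]
    best = min(itertools.product(k_values, x0_values),
               key=lambda p: objective(p, reference))
    print("best initial guess (k, x0):", best)
```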

The paper particularly focuses on constructing and executing coupled molecular chemistry models as complicated workflow graphs. These workflow management capabilities have been integrated with the GridChem Science Gateway infrastructure through the TeraGrid advanced user support program. Further, we describe how the project is enabling sustainable growth of science gateway infrastructure by building upon tools provided by the Open Gateway Computing Environments project. The paper also discusses plans for integrating TeraGrid information, monitoring, and prediction services to provide automated job scheduling with resource-maintenance and fault-aware services.

A European framework to build Science Gateways: architecture and use cases

Authors:

Valeria Ardizzone (Consorzio COMETA), Roberto Barbera (University of Catania and INFN), Antonio Calanducci (Consorzio COMETA), Marco Fargetta (Consorzio COMETA), Elisa Ingra' (Consortium GARR and INFN), Giuseppe La Rocca (Italian National Institute of Nuclear Physics, Division of Catania), Salvatore Monforte (Italian National Institute of Nuclear Physics, Division of Catania), Fabrizio Pistagna (Italian National Institute of Nuclear Physics, Division of Catania), Riccardo Rotondo (National Institute of Nuclear Physics (Catania)) and Diego Scardaci (Italian National Institute of Nuclear Physics, Division of Catania)


Time:

10:00am - 12:00pm


Abstract:

Science Gateways are playing an important role in scientific research performed using e-Infrastructures, and their relevance will further increase with the development of more sophisticated user interfaces and easier access mechanisms. Through the highly collaborative environment of a Science Gateway, users spread around the world and belonging to various Virtual Research Communities can easily cooperate to reach common goals and exploit all the resources of the cyberinfrastructure they are entitled to use.

One of the major tasks of a Science Gateway is to supervise user access to the available services, denying use to those who are not authorised. This activity has to respect the roles of users within the Virtual Research Community (VRC).

Users operating in a Science Gateway can belong to different organisations, each with its own security policies, and the Virtual Research Community has to comply with them. As a result, the security chain inside the Science Gateway has to allow each organisation to keep control of its users while, at the same time, hiding the complexity of the security mechanisms underneath the portal.

In this work we present a general framework for building Science Gateways and the customisations made to meet the requirements of two use cases coming from different scientific communities: those of the European Union funded DECIDE (www.eu-decide.eu) and INDICATE (www.indicate-project.eu) projects. The goal of the DECIDE project is to design, implement, and validate a Science Gateway for the computer-aided extraction of diagnostic markers from medical images for the early diagnosis of Alzheimer's disease and other forms of dementia. The INDICATE project aims instead to demonstrate, with real-life examples, the advantages of adopting e-Infrastructures in the digital cultural heritage domain.

The framework is powered by Liferay and takes advantage of its features and its whole set of web 2.0 tools and services. These have been integrated with a more flexible security workflow and a new set of portlets to access the Grid services. The newly developed security system merges three different security mechanisms into a single workflow, allowing users to access Grid resources based on the credentials provided by the organisations they belong to. The idea behind it is to combine Shibboleth 2 identities with X.509 proxies generated by robot certificates. The former enables the federation of organisations with different authentication policies, while the latter allows users to access Grid resources without personal certificates, whose request and management procedures are very often judged quite cumbersome by non-experts. The "glue" between the two layers is an LDAP server running in the back-end that implements a mechanism to map authorised users onto Grid resources.

Once the user is authenticated, the developed portlets provide the functionality to manage Grid credentials in order to access the e-Infrastructure behind the gateway. The portlet-based interface to the Grid is built on the OGF-standard SAGA Java API and is not bound to any particular middleware.

Besides the interaction with the computational services of an e-Infrastructure, the proposed framework includes the possibility to easily build and manage data repositories by interacting with the gLibrary framework, and to encrypt/decrypt sensitive data with the Secure Storage System.
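
A conceptual Python sketch of the identity-to-credential mapping step described above; it does not use Shibboleth, LDAP, robot certificates, or the SAGA API, and every name in it is hypothetical:

```python
# Conceptual sketch only: the real gateway combines Shibboleth identities,
# an LDAP back-end, robot-certificate proxies, and the SAGA Java API.
# None of those APIs appear here; all names below are hypothetical.

AUTHORISED_USERS = {
    # federated identity (asserted by the home organisation) -> VRC role
    "alice@university-a.example": "decide-researcher",
    "bob@hospital-b.example": "decide-clinician",
}

ROBOT_PROXIES = {
    # VRC role -> proxy credential used on the Grid on the user's behalf
    "decide-researcher": "/tmp/x509up_robot_decide_research",
    "decide-clinician": "/tmp/x509up_robot_decide_clinical",
}

def grid_credential_for(federated_identity):
    """Map an authenticated federated identity to a robot-certificate proxy,
    mimicking the directory lookup the abstract attributes to the LDAP layer."""
    role = AUTHORISED_USERS.get(federated_identity)
    if role is None:
        raise PermissionError(f"{federated_identity} is not authorised in the VRC")
    return ROBOT_PROXIES[role]

if __name__ == "__main__":
    print(grid_credential_for("alice@university-a.example"))
```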

The Data and Compute-Driven Transformation of Modern Science: The Role of TeraGrid and Beyond

Speaker:

Ed Seidel (National Science Foundation)


Time:

8:15am - 9:45am


Abstract:

We all know that modern science is undergoing a profound transformation as it aims to tackle the complex problems of the 21st century. It is becoming highly collaborative; problems as diverse as climate change, renewable energy, or the origin of gamma-ray bursts require understanding processes that no single group or community alone has the skills to address. At the same time, after centuries of little change, compute, data, and network environments have grown by 9-12 orders of magnitude in the last few decades. Moreover, science is not only compute-intensive but is now dominated by data-intensive methods. This dramatic change in the culture and methodology of science will require a much more integrated and comprehensive approach to the development and deployment of hardware, software, and algorithmic tools and environments supporting research, education, and, increasingly, collaboration across disciplines. The TeraGrid has been a huge step forward in this direction, providing a great foundation for future developments to support this new science. Using motivating examples ranging from astrophysics to emergency forecasting, I will discuss the kinds of capabilities needed for the next-generation national cyberinfrastructure and the central role that TeraGrid can play.

XSEDE TEOS Plans for Student Engagement

Speaker:

Laura McGinnis (PSC), Beth Albert (PSC)


Time:

10:00am - 12:00pm


Abstract:

XSEDE Outreach is making opportunities available to the next generation of STEM practitioners through the Student Engagement program. The program will match undergraduate and graduate students with projects provided and managed by researchers and staff in the XSEDE community, to provide out-of-classroom experience for the students.

The purpose of this session is to share information about the initial design of the program, and solicit input from students as well as XSEDE researchers and staff as we refine the program. This input is critical to implementing an experience that is valuable to both the students and the XSEDE community.

We invite students interested in deepening their involvement with XSEDE, as well as XSEDE researchers and staff interested in working with students, to join us for an open discussion about how to make this program work at the national scale. We have a lot of ideas, but we need your input as well to be sure we're addressing the community needs and requirements that will make this program a success.