Science Track

Sessions demonstrate the impact of the TeraGrid through scientific results or the emergence of new communities. Presenters will articulate the scientific problem; describe the scientific and computational methods (algorithms, techniques, software) and TeraGrid resources used; share their results; and discuss future plans.

Ross Walker. Acceleration of AMBER Molecular Dynamics Simulations using NVIDIA GPUs: Performance Improvements, Validation and Lessons Learned
Download the presentation (PDF)

Abstract: Recent work in close collaboration with NVIDIA has produced a GPU-accelerated version of the AMBER molecular dynamics code PMEMD. This is available as a patch on the AMBER website and will be made generally available as part of AMBER v11, to be released in spring 2010. The code supports both explicit solvent particle mesh Ewald (PME) and implicit solvent simulations, providing speedups on a single Tesla C1060 card of between 20 and 30x over a single Intel Nehalem core, and more for multiple GPUs. This talk will provide an overview of the AMBER software, the background behind this GPU work, benchmarks, and the impact that GPU-accelerated MD can have on the field, and then discuss the validation methods used. Ensuring that a GPU implementation of an MD package provides results that are indistinguishable from the CPU code is extremely tricky, and the desire to take shortcuts to boost performance can affect accuracy with unpredictable results. We have developed a comprehensive validation suite that can be used to perform the detailed testing required to ensure that the approximations necessary for GPU performance do not impact the scientific results. Additionally, we will discuss how we have made careful use of mixed single- and double-precision arithmetic in the AMBER implementation to achieve equivalence in the results without excessively compromising performance. Finally, we provide examples of recent breakthrough simulations conducted using GPU-enabled AMBER 11.
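The precision issue described above can be illustrated with a schematic reduction. This is a generic sketch, not AMBER's actual force kernels: individual terms are held in single precision, but the running total is kept in double precision, which keeps the result close to the double-precision reference while a purely single-precision accumulation drifts.

```python
import numpy as np

def accumulate(values, acc_dtype):
    """Sequentially sum `values`, keeping the running total in `acc_dtype`."""
    total = acc_dtype(0.0)
    for v in values:
        total = acc_dtype(total + v)
    return float(total)

# 100,000 small single-precision contributions, as in a long force reduction.
forces = np.full(100_000, 1e-4, dtype=np.float32)
reference = 100_000 * float(np.float32(1e-4))   # exact sum, computed in double

err_single = abs(accumulate(forces, np.float32) - reference)  # SP accumulator drifts
err_mixed = abs(accumulate(forces, np.float64) - reference)   # SP terms, DP accumulator
```

On a run like this the single-precision accumulator is off by a measurable amount while the mixed-precision error stays near machine epsilon; in an MD code the same pattern applies to force and energy reductions.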

Emre Brookes and Borries Demeler. UltraScan: High-resolution modeling of analytical ultracentrifugation experiments on TeraGrid

Download the presentation (PDF)
Abstract: UltraScan is a comprehensive software package for the analysis of hydrodynamic data from analytical ultracentrifugation experiments. Results from such experiments provide insight into the dynamic interactions among macromolecules involved in the processes of the living cell, and allow their study in the solution state, which most closely resembles the physiological conditions in the cell. Current studies involving UltraScan focus on the role of macromolecular properties involved in disease and cancer, such as aggregation, and on the basic understanding of the structure and function of biological polymers like proteins, RNA and DNA. The implementation of efficient optimization algorithms in UltraScan, using parallel resources on TeraGrid and similar resources in Europe and Australia, has attracted a world-wide community of biochemists, biophysicists, cell and structural biologists, and materials scientists for the study of a wide range of experimental applications. The main parallel algorithm implemented in UltraScan employs the 2-dimensional spectrum analysis, which fits a linearized set of finite element solutions of the Lamm equation to experimental data by non-negatively constrained least squares. The results are refined by Monte Carlo methods and parsimonious regularization. These computationally intensive procedures have been optimized to utilize the high performance computing resources of TeraGrid.
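The non-negatively constrained least-squares step at the core of the 2-D spectrum analysis can be sketched with a generic basis. This is a simplified stand-in, not UltraScan's solver: the columns of `A` play the role of simulated Lamm-equation solutions on a parameter grid, and a plain projected-gradient iteration replaces the production NNLS algorithm.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical basis: each column of A stands in for a simulated Lamm-equation
# solution at one (sedimentation, diffusion) grid point; b is the "data".
A = np.abs(rng.normal(size=(200, 30)))       # 200 radial points, 30 grid models
x_true = np.zeros(30)
x_true[[3, 17]] = [0.7, 0.3]                 # two species actually present
b = A @ x_true + rng.normal(0.0, 1e-3, 200)  # small measurement noise

# Projected gradient descent for  min ||Ax - b||^2  subject to  x >= 0.
step = 1.0 / np.linalg.norm(A, 2) ** 2       # 1/L step size ensures descent
x = np.zeros(30)
for _ in range(5000):
    x = np.maximum(x - step * (A.T @ (A @ x - b)), 0.0)  # gradient step + projection

residual = np.linalg.norm(A @ x - b)         # should approach the noise floor
```

The non-negativity constraint is what makes the recovered spectrum physically interpretable: negative concentrations are never produced, no matter how noisy the data.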

Brian O'Shea, Michael Norman, Jack Burns, Matthew Turk, Britton Smith, Sam Skillman and John Wise. Cosmological structure formation at the largest and smallest scales
Abstract: In this presentation I will describe recent progress made in studying cosmological structure formation using the Enzo adaptive mesh refinement code and a range of TeraGrid resources. Galaxies are complicated beasts, and require a large dynamical range and variety of input physics to model accurately. In addition, given that their evolutionary time scales are so long compared to a human lifetime, we are forced to compare our models to statistics obtained from astronomical surveys, so large numbers of galaxies and large volumes of the universe must be simulated. As a result of these two factors, progress in our understanding of galaxy formation and evolution is driven almost entirely by the available computational power and our ability to harness it effectively. To this end, the Enzo adaptive mesh code has been extremely successful, having demonstrated the ability to scale to almost 100,000 cores on Kraken, and to use TeraGrid systems very efficiently to produce ground-breaking scientific results. I will highlight three areas where much recent progress has been made with Enzo on TeraGrid resources: (1) Population III stars and the first galaxies in the universe, (2) calculations of self-consistent radiation transport and the reionization of the universe, and (3) studies of the detailed properties of galaxy clusters and the cosmic web. These topics are of particular interest to the astrophysical community due to the impending launch of the James Webb Space Telescope, the construction of the ALMA radio telescope, and the design of the next generation of thirty-meter-class optical telescopes, which will give us unprecedented views of the formation and evolution of cosmological structure.

Henian Xia, Kwai Wong, Wenjun Ying and Xiaopeng Zhao. Numerical Simulations of Coupled Electro-Mechanical Dynamics in a Dog Ventricle
Abstract: We develop a coupled electromechanical model which integrates properties of cardiac electrophysiology, electro-mechanics, and mechano-electrical feedback. The model is implemented on the Kraken supercomputer. Numerical simulations are carried out on a dog ventricle to investigate the interaction of electrical and mechanical functions in the heart and their influence on cardiac arrhythmias.

Yaakoub El Khamra, Shantenu Jha and Christopher White. Modelling Data-driven CO2 Sequestration Using Distributed HPC Cyberinfrastructure
Abstract: In this paper, we lay out the computational challenges involved in effectively simulating complex phenomena such as sequestering CO2 in oil and gas reservoirs. The challenges arise at multiple levels: (i) the computational complexity of simulating the fundamental processes; (ii) the resource requirements of the computationally demanding simulations; (iii) the need to integrate real-time, data-intensive workloads with computationally intensive simulations; and (iv) the need to implement all of these in a robust, scalable and extensible approach. We will outline the architecture and implementation of the solution we developed in response to these requirements, and discuss results that validate claims that our solution scales to effectively solve the desired problem sizes and thus provides the capability to generate novel scientific insight.

Steven Kelling, Daniel Fink, Robert Cook, Suresh SanthanaVannan, John Cobb, Kevin Webb and William Michener. Patterns in Bird Migration Phenology Explored through Data Intensive Computation
Abstract: Assessing and mitigating threats to biodiversity must overcome three challenges: 1) species' distributions vary through time and space; 2) sufficient data are seldom available; and 3) conventional analyses are not effective for facilitating pattern discovery. The goal of our DataONE project is to advance data intensive computation by developing new techniques that describe the broad-scale dynamics of continent-scale bird migration.
First, we developed a data warehouse where locations of bird observations made through eBird are linked to environmental covariates such as climate, habitat and human demography (land cover composition and configuration, elevation, human population, urbanization), and vegetation phenology (MODIS land products) from multiple sources. At present, more than 300,000 bird observation locations have been linked to over 500 environmental variables. Next, we employ novel semi-parametric spatiotemporal models to analyze dynamic patterns of species occurrence and identify predictors of habitat suitability. The models produce highly accurate simulations of intra-annual bird migrations. Finally, the VisTrails scientific workflow software is used to provide a systematic means to organize, document, explore, and visualize data and results from these analyses. Encapsulating this research in a workflow environment allows others to efficiently reproduce and validate results and use this data and methodology for other applications.
To exploit these data and workflow resources fully, we will deploy the analysis on TeraGrid. The HPC challenges include porting memory-intensive analyses written in the R language to an HPC environment, and organizing and storing large volumes of output. For example, a small-scale analysis of 200 species transforms 3 GB of input data into 1 TB of valuable output; continental analysis will require several TB of input data.
Birds engage in the most spectacular long-distance migrations of any animal on the planet and demonstrate the biological integration of seemingly disparate ecosystems around the globe. Furthermore, they are unrivaled windows into biotic processes at all levels and are proven indicators of ecological well-being. By interpreting and analyzing these models, novel patterns “born from the data” provide valuable insight into the underlying ecological processes of bird migration. DataONE techniques and the TeraGrid resources will allow scientists to analyze bigger and more complex systems efficiently, and complement more traditional scientific processes of hypothesis generation and experimental testing for a better understanding of the natural world.

Yang Wang and Anthony Rollett. A Novel Approach to Parallel 3-D FFT
Abstract: Fast Fourier transform (FFT) plays an important role in numerous scientific applications. In electronic structure calculations for materials science and engineering applications, for example, FFT is widely used for solving the Schrödinger equation for one electron in a mean field in the framework of density functional theory in the local density approximation (or generalized gradient approximation), as well as for solving the Poisson equation for the electrostatic potential in solids. For data on a uniform mesh of N points, FFT demands O(N) memory accesses versus O(N log N) floating point operations, requiring not only high computation throughput but also high memory bandwidth. Conventional approaches to parallel FFT of multi-dimensional data consist of many one-dimensional transforms along each dimension of the data array. For 3-D data, in particular, most parallel FFT algorithms are implemented in such a way that the data array is distributed along its z-dimension. To perform the FFT, each processor first transforms the x-y dimensions of the data that are completely local to it; then, the processors have to perform a transpose of the data in order to make the remaining dimension local to a processor. This dimension is then Fourier transformed, and the transformed data usually needs to be transposed back to its original order. The transpose involves all-to-all global communication, in which every processor sends a different message to every other processor, and when the data size is sufficiently large, the transpose forms a major performance bottleneck of multi-dimensional, distributed FFT. In this presentation, we will show a novel parallel 3-D FFT method that is able to take full advantage of data localization in practice without the need for data re-distribution and, most importantly, does not require data transpose operations, so that the all-to-all communication bottleneck can be entirely avoided.
We will demonstrate the performance of this method on massively parallel distributed supercomputing systems with more than 100,000 compute cores.
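The conventional transpose-based decomposition that the proposed method avoids can be sketched serially with NumPy: transform the x-y planes first, then transform z, and the two-stage result equals the direct 3-D transform. In the parallel version, an all-to-all transpose would sit between the two stages.

```python
import numpy as np

rng = np.random.default_rng(2)
data = rng.normal(size=(8, 8, 8)) + 1j * rng.normal(size=(8, 8, 8))

# Stage 1: transform the x-y planes, which are local to each processor
# when the array is distributed along z (here one slab stands in for all).
stage1 = np.fft.fft2(data, axes=(0, 1))

# Stage 2: after the all-to-all transpose would make z local, transform it.
stage2 = np.fft.fft(stage1, axis=2)

# The two-stage decomposition reproduces the direct 3-D FFT.
assert np.allclose(stage2, np.fft.fftn(data))
```

The separability of the transform is what makes the decomposition exact; the cost lies entirely in the data movement between the two stages, which is precisely the step the proposed method eliminates.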

Jeffry Madura, Eliana Asciutto and Bonnie Merchant. Biomolecular Simulations
Abstract: The structure, function, and dynamics of biological molecules remain a challenging problem to experimentalists as well as computational scientists. Using computational methods such as multiconfigurational thermodynamic integration (MCTI), as implemented in NAMD, and replica exchange molecular dynamics (REMD), as implemented in AMBER, we are studying the function of monoamine transporters and the structure of 21 amino acid peptides in salt solutions, respectively. In the MCTI calculation we are examining the transport of molecules through our comparative model for the dopamine transporter. In the REMD calculations we are comparing our calculated structures and free energy profiles for the unfolding of a 21 amino acid peptide in salt solutions against ultraviolet resonance Raman spectroscopic data. These calculations have been performed on TeraGrid resources located at Pittsburgh Supercomputing Center and National Institute for Computational Sciences.

Martin Berzins, Justin Luitjens, Qingyu Meng, Todd Harman, Charles Wight and Joseph Peterson. Uintah: a scalable framework for hazard analysis

Download the presentation (PDF)
Abstract: The Uintah software system was developed to provide an environment for solving fluid-structure interaction problems on structured adaptive grids for large-scale, long-running, data-intensive problems. Uintah uses a novel asynchronous task-based approach with fully automated load balancing. The application of Uintah to a petascale problem in hazard analysis arising from "sympathetic" explosions, in which the collective interactions of a large ensemble of explosives result in dramatically increased explosion violence, is considered. The advances in scalability and combustion modeling needed to begin to solve this problem are discussed and illustrated by prototypical computational results.

Homa Karimabadi. 3D Global Hybrid Simulations of the Magnetosphere and I/O Strategies for Massively Parallel Kinetic Simulations
Abstract: The main goal in magnetospheric physics is to understand how the solar wind transfers its mass, momentum and energy to the magnetosphere. Aside from its theoretical interest, the interaction of the solar wind with the magnetosphere has great practical relevance. The Earth's magnetic field provides an imperfect shield from the direct effects of the solar wind. The term "space weather" has been coined to describe the conditions in space that affect the Earth and its technological systems. Space weather affects the global community through its impact on (i) GPS satellites, (ii) geosynchronous communication and weather satellites, (iii) large-scale power grids on the ground, and (iv) navigation and communications systems through the magnetosphere and ionosphere. A major reason for the complexity of the solar wind-magnetosphere interaction is the dominance of ion kinetic effects, which occur on small ion scales but affect the large-scale dynamics of the magnetosphere. Global single-fluid MHD simulations have proven useful as a key component of space weather studies, but many key aspects relating to transport and particle energization cannot be addressed with them. Hybrid simulations, which treat the electrons as a massless fluid and the ions as kinetic particles, overcome these deficiencies. However, a major impediment to such simulations has been the enormous computational cost. We have been pushing the state of the art in kinetic simulations and have recently performed the largest 3D global hybrid simulation to date, using 98,304 cores on the NSF Kraken system. These unprecedented simulations on Kraken have been instrumental in revealing new science which has now been confirmed in spacecraft data. In particular, this led to the discovery that vortex flows are often associated with flux transfer events (FTEs).
Evidence for such association was recently reported from collaborators at UCLA using spacecraft data and we are currently working with them on a direct comparison of simulation and NASA THEMIS mission data to understand the physics behind this association.
This work also highlights the challenges of running simulations at such large core counts. One of the significant issues, which we overcame in our simulation, is the large I/O requirement of particle-based codes. For instance, a particle code's checkpoint files can easily be a factor of 200 larger than those produced by corresponding fluid codes. Current particle codes can produce more than 200 TB in a single run and are expected to scale to tens of PBs in the next few years. However, in large part the I/O issues we face stem from file-system-specific limitations and the relative underspecification of I/O hardware compared to compute hardware. There are numerous examples of file-system-specific limitations that impact codes running at large scale. File systems such as Lustre (LFS) that implement the POSIX I/O serial consistency semantics are not designed for massive shared-file I/O. On LFS the maximum number of physical I/O targets (OSTs) for a shared file is limited to 160. The interprocess file lock contention that can occur during shared-file I/O drastically impacts performance. File locking mechanisms vary significantly among file systems, and often what reduces contention on one file system increases it on another. The naive or inappropriate use of striping parameters can introduce unnecessary lock contention, and contention for the lock manager itself can be an issue for both file-per-process and shared-file I/O. Cluster-specific hardware limitations are also an issue. A relative underspecification of I/O hardware can preclude fully parallel I/O during a large run. For example, the LFS file system deployed on Kraken, where our largest runs were made, has a single MDS and 336 OSTs to service the over 99,000 cores available during a full-scale run. These hardware and file system limits can introduce insurmountable resource contention during large-scale runs, which can severely, even fatally, degrade simulation performance.
In preparation for our full-scale run on Kraken we found that a single checkpoint dump took nearly two hours to complete, making a large-scale science run impractical. We subsequently developed a partial serialization technique that resulted in a factor-of-8 speedup in our I/O kernel, making our unprecedented simulation possible. We have compared file-per-process POSIX I/O with our technique applied to each of POSIX I/O, MPI independent I/O and MPI collective I/O, making use of the new Lustre-specific, lock-protocol-based file domain partitioning scheme implemented in Cray MPT. Detailed scaling results, the associated parallel I/O analysis and new science derived from our simulation will be presented.
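The idea behind partial serialization can be sketched abstractly (the abstract does not give the authors' implementation details, so the grouping below is a hypothetical illustration): ranks are divided into waves and only one wave writes at a time, bounding the number of concurrent writers that the metadata server and OSTs must service.

```python
def io_waves(n_ranks, max_concurrent):
    """Partition ranks into waves of at most `max_concurrent` simultaneous writers."""
    return [list(range(start, min(start + max_concurrent, n_ranks)))
            for start in range(0, n_ranks, max_concurrent)]

# Throttling ~99,000 writers to the 160-OST shared-file limit mentioned above
# serializes a checkpoint dump into ceil(99072 / 160) = 620 waves.
waves = io_waves(99072, 160)
```

The tradeoff is the usual one: fewer concurrent writers means less lock contention per wave, but more waves means more sequential passes, so the wave size must be tuned to the filesystem.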

Shantenu Jha. High-throughput computing of large-scale ensembles on the TeraGrid and DEISA for HIV-1 Protease Simulation
Abstract: HIV-1 protease is one of the enzymes responsible for the replication of the HIV virus and as a result is the target of numerous HIV drugs (known as protease inhibitors). The active site of the enzyme is gated by a pair of flexible structures known generally as the "flaps". The behaviour of these flaps is not generally well understood, and given their apparent function this behaviour is believed to be important for drug inhibition. Considerable work has been carried out [1] investigating computational methods for estimating the binding affinity of drugs to the protease (i.e. how strongly the drug binds to the protease), as this is considered to be a useful metric in categorising drug resistance. Recent work indicates that ensemble simulations offer the best (closest to experiment) computational estimate of the binding affinity.
We have validated experimental binding affinity rankings and discrimination of a series of HIV-1 protease mutants of varying resistance levels to the inhibitory drug lopinavir (using a simulation and analysis protocol based on running 50 replicas each resulting in 4ns of production data) [1]; we are extending this to clinically relevant situations. In conjunction with the EU Virolab project a series of patient viral sequences were identified for which existing clinical decision support systems gave discordant resistance ratings. Performing simulations on both the full sequence and the constituent mutations individually we have begun to elucidate both the overall level of resistance and the interactions between mutations which produce it.
The accurate binding affinity calculation for a particular mutant of HIV-1 protease with the available drugs requires multiple, large-scale ensemble-based molecular dynamics (EnMD) simulations; each individual MD simulation can require hundreds of cores, and the number of ensemble members required can be of O(100). EnMD aims to explore configurational space more effectively than traditional molecular dynamics by running a large number of replicas of a system, varying a physical parameter (usually temperature), and then possibly exchanging configurations between replicas at pre-defined intervals. From a performance point of view, EnMD/replica-exchange molecular dynamics (REMD) is a good match for distributed resources, because most communication is kept within the replicas and the exchange step happens relatively infrequently (for REMD; there is no exchange for EnMD). Typically the sampling improves as the number of replicas increases.
The greater sampling gained by using EnMD simulations to obtain the results presented here requires the use of high performance computing (HPC) resources in a high-throughput computing (HTC) mode. Achieving HTC on HPC resources is not trivial; it requires application-level and runtime-level abstractions. We use SAGA (Simple API for Grid Applications) based Pilot-Jobs (called BigJob) to facilitate the use of multiple distributed resources. Multiple SAGA-based pilot-jobs can be started, and ensembles are assigned to appropriate pilot-jobs. Ensembles migrate to different resources as they become available (i.e., as the pilot-job goes active) or expand/contract as resources become available or more scarce. The dynamic execution of many tasks (ensembles) enables a highly reduced time-to-solution. We will discuss our work in the context of an NSF-funded project (via HPCOPS) to utilize interoperability across multiple Grids (TeraGrid and DEISA in the EU) to demonstrate effective and large-scale science. Using SAGA-based pilot-jobs, we have furthered research into HIV, both in the areas of flap behaviour of the HIV-1 protease (and mutants thereof), and in simulation of the binding properties of protease mutants to various drugs. We will present new results and the resultant capability to discriminate between different potential mutations, arising from greater sampling than would be possible otherwise. More reliable binding affinity calculations offer the potential of patient-specific medicine, with time-scales improved by transparent access to federated Grid resources.
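For readers unfamiliar with REMD, the exchange step mentioned above uses the standard Metropolis criterion between neighbouring temperature replicas. A minimal sketch of the textbook formula (not the authors' code):

```python
import math

def exchange_prob(E_i, E_j, T_i, T_j, kB=0.0019872041):   # kB in kcal/mol/K
    """Metropolis probability of swapping configurations between replicas
    held at temperatures T_i and T_j with current energies E_i and E_j."""
    delta = (1.0 / (kB * T_i) - 1.0 / (kB * T_j)) * (E_i - E_j)
    return min(1.0, math.exp(delta))

# Moving the lower-energy configuration to the colder replica is always accepted;
# the reverse move is accepted only probabilistically.
p = exchange_prob(E_i=-100.0, E_j=-120.0, T_i=300.0, T_j=310.0)
```

Because only the pair of scalar energies crosses replica boundaries at each exchange, the communication volume is tiny, which is why REMD maps so well onto distributed resources.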

Scott Michael, Stephen Simms, W. B. (Trey) Breckenridge, Roger Smith and Matthew Link. A Compelling Case for a Centralized Filesystem on the TeraGrid: Enhancing an astrophysical workflow with the Data Capacitor WAN as a test case

Download the presentation (PPT)
Abstract: In this article we explore the utility of a centralized filesystem provided by the TeraGrid to both TeraGrid and non-TeraGrid sites. We highlight several common cases in which such a filesystem would be useful in obtaining scientific insight. We present results from a test case using Indiana University's Data Capacitor over the wide area network as a central filesystem for simulation data generated at multiple TeraGrid sites and analyzed at Mississippi State University. Statistical analyses of the I/O patterns and rates, via detailed trace records generated with VampirTrace, are provided for both the Data Capacitor and a local Lustre filesystem. The benefits of a centralized filesystem and potential hurdles in adopting such a system for both TeraGrid and non-TeraGrid sites are discussed.

Anand Padmanabhan, Wenwu Tang and Shaowen Wang. Agent-based Modeling of Agricultural Land Use on TeraGrid

Download the presentation (PDF)
Abstract: Agent-based modeling is a simulation approach that has been widely applied in the investigation of complex land use systems. To obtain a better understanding of the dynamics of land use systems, agent-based models require significant computational support. The TeraGrid, a representative cyberinfrastructure, provides massive high-performance computing resources to meet the computational requirements of agent-based models. Leveraging TeraGrid computing resources for agent-based models, however, poses challenges. In this paper we present an agent-based model of agricultural land use supported by the TeraGrid to gain insight into these challenges.
Our study area includes rural counties in Illinois. Farmers within the study area are represented as intelligent geospatial agents who make adaptive land use decisions in response to environmental and socioeconomic drivers. Farmer agents are associated with a set of parcels, each assigned specific crops (e.g., biofuel and food crops). Specifically, farmer agents decide which crop types to plant on the parcels they own in order to optimize their economic return. A machine learning algorithm (reinforcement learning) was designed to represent the adaptive mechanisms that farmers use to adjust their land use behavior in response to changes in environmental and socioeconomic variables, including crop yield and market prices. This makes the land use agent-based model data-intensive, since every agent consumes a number of spatio-temporal datasets to make appropriate land use decisions. Because of the stochastic parameters involved and the multiple model repetitions required to obtain reasonable simulation results, the agent-based model presented in this research is also highly compute-intensive.
We deploy the agent-based model to the TeraGrid so that we can leverage supercomputing resources for simulation modeling. With support from a TeraGrid supercomputer, Abe at NCSA, we conduct simulation experiments to examine, in a computationally tractable manner, how changes in environmental and socioeconomic drivers influence the land use decisions of farmers. Our modeling efforts demonstrate that TeraGrid-supported agent-based simulation has great potential for enhancing our understanding of complex land use systems.
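The reinforcement-learning mechanism can be illustrated with a minimal sketch. The crop names, payoffs, and epsilon-greedy Q-learning below are hypothetical stand-ins for the model's actual decision rule, which the abstract does not specify.

```python
import random

random.seed(7)
CROPS = ["corn", "soy", "biofuel"]                       # hypothetical choices
MEAN_RETURN = {"corn": 1.0, "soy": 0.8, "biofuel": 1.4}  # assumed mean profits

Q = {c: 0.0 for c in CROPS}                # agent's learned value of each crop
alpha, eps = 0.1, 0.1                      # learning rate, exploration rate
for season in range(2000):
    if random.random() < eps:
        crop = random.choice(CROPS)        # explore occasionally
    else:
        crop = max(Q, key=Q.get)           # otherwise exploit the best estimate
    reward = MEAN_RETURN[crop] + random.gauss(0.0, 0.1)  # noisy market return
    Q[crop] += alpha * (reward - Q[crop])  # incremental value update

best_crop = max(Q, key=Q.get)              # agent settles on the top payoff
```

Multiplying one such learning loop by thousands of agents, each pulling its own spatio-temporal inputs, across many stochastic repetitions, is what makes the full model both data- and compute-intensive.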

Yan Liu and Shaowen Wang. Asynchronous Implementation of A Parallel Genetic Algorithm for the Generalized Assignment Problem
Abstract: As an effective meta-heuristic to search for optimal or near-optimal solutions, genetic algorithm (GA) is inherently parallel, allowing us to leverage parallel computing resources for more computationally-intensive evolutionary computation. In this paper we develop a scalable parallel genetic algorithm (PGA) to exploit massively parallel high-end computing resources for solving large problem instances of a classic combinatorial optimization problem -- the Generalized Assignment Problem (GAP).
GAP belongs to the class of NP-hard 0-1 knapsack problems. Numerous capacity-constrained problems in a wide variety of domains can be abstracted as GAP. In practice, problem instances are often large, while problem-solving requires quick response times and the capability of finding alternative solutions of a specified quality, posing a significant computational challenge. A coarse-grained PGA is developed to address this challenge by focusing on scalability to the large numbers of processors available from high-end computing resources such as those provided by the National Science Foundation TeraGrid cyberinfrastructure. PGAs are often implemented by synchronizing each iteration on operations involving communication among processors. The overhead of such synchronization increases dramatically as more processors are used. An asynchronous migration strategy is thus designed to achieve desirable scalability through controlled migration operators (i.e., export and import) and buffer-based asynchronous communication among processors connected through a regular grid topology. This asynchronous migration mechanism not only lowers the cost of inter-processor communication, but also increases the overlap of computation and communication within a single process by allowing delayed migration operations.
Experiments are conducted on the NCSA Abe and TACC Ranger clusters. Results show that our PGA is able to fully utilize 2,048 processors with marginal communication cost and exhibits better numerical performance than a synchronous implementation. Linear and super-linear speedups were observed in solving large GAP instances. The parallel algorithm adapts to different system characteristics of the underlying clusters, such as system message buffer sizes and different sending and receiving delays, in order to efficiently propagate good solutions across processors and inject random noise to diversify the solution-space search.
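The island model with buffered, asynchronous-style migration can be sketched serially (onemax stands in for the GAP objective; the ring topology and parameters are illustrative, not the paper's): each island evolves independently and periodically drops its best individual into a neighbour's buffer, which that neighbour absorbs whenever it next steps, with no synchronization barrier.

```python
import random

random.seed(42)
L, POP, GENS, N_ISL = 20, 12, 60, 4  # genome length, island size, generations, islands

def fitness(ind):                    # onemax: count 1-bits (GAP-objective stand-in)
    return sum(ind)

def step(pop, inbox):
    """One generation: absorb buffered migrants, select, mutate, keep the best."""
    pool = sorted(pop + inbox, key=fitness, reverse=True)[:POP]
    kids = []
    for _ in range(POP):
        parent = random.choice(pool[:POP // 2])                      # truncation selection
        kids.append([b ^ (random.random() < 0.05) for b in parent])  # bit-flip mutation
    return sorted(pool + kids, key=fitness, reverse=True)[:POP]      # elitist survival

islands = [[[random.randint(0, 1) for _ in range(L)] for _ in range(POP)]
           for _ in range(N_ISL)]
inboxes = [[] for _ in range(N_ISL)]  # per-island migration buffers
best_start = max(fitness(ind) for isl in islands for ind in isl)

for gen in range(GENS):
    for k in range(N_ISL):
        islands[k] = step(islands[k], inboxes[k])  # imports happen lazily, no barrier
        inboxes[k] = []
    if gen % 5 == 0:                  # export the best individual to the ring neighbour
        for k in range(N_ISL):
            inboxes[(k + 1) % N_ISL].append(islands[k][0])

best_final = max(fitness(ind) for isl in islands for ind in isl)
```

In the parallel version the inbox is a communication buffer: an island never blocks waiting for migrants, which is the property that removes the per-iteration synchronization cost described above.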

Michael Norman and Allan Snavely. Accelerating Data-Intensive Science with Gordon and Dash
Abstract: In 2011 SDSC will deploy Gordon, an HPC architecture specifically designed for data-intensive applications. We describe the Gordon architecture and the thinking behind the design choices by considering the needs of two targeted application classes: massive database/data mining and data-intensive predictive science simulations. Gordon employs two technologies that have not been incorporated into HPC systems heretofore: flash SSD memory, and virtual shared memory software. We report on application speedups obtained with a working prototype of Gordon in production at SDSC called Dash, currently available as a TeraGrid resource.

Abhinav Bhatele, Eric Lee, Ly Le, Leonardo Trabuco, Eduard Schreiner, Jen Hsin, James C. Phillips, Laxmikant V. Kale and Klaus Schulten. Biomolecular modeling using NAMD on TeraGrid machines

Download the presentation (PDF)
Abstract: NAMD is a highly scalable parallel molecular dynamics program. It is publicly available and installed at most supercomputing centers. It has been developed over the last two decades to perform simulations of biomolecular systems of varying sizes over a broad range of platforms. In recent years, NAMD has also been ported to accelerators such as the Cell BE and GPGPUs. Performance improvements in the Charm++ runtime and NAMD have been exploited over the years for scientific discoveries. Recently, NAMD has been used for several large-scale, long-timescale simulations on TeraGrid machines (Ranger, Abe and Bigben), and we present three characteristic studies in this talk.

Shawn Brown, Phil Cooley, Bruce Y. Lee, William Wheaton, John Grefenstette and Donald Burke. Computational Explorations into the H1N1 Pandemic
Abstract: During the 2009 H1N1 pandemic, the University of Pittsburgh MIDAS Center of Excellence engaged in several agent-based modeling efforts for academic research and government advising. The Pitt/RTI agent-based model is a high-performance computing program capable of supporting infectious disease models of US regional areas. Synthetic census-based populations are created within the program, providing a computational laboratory for exploring disease dynamics and control measures for mitigation. Using PSC and TeraGrid resources, the Pitt/RTI model was used to explore school closure scenarios in Allegheny County and the Commonwealth of Pennsylvania, workplace vaccination strategies in the DC metro area, and the benefits of vaccination priority recommendations. In this talk, an overview of the agent-based model and its computational requirements will be given, along with a review of results produced with the model for H1N1 influenza.
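The core loop of an agent-based epidemic model of this general kind can be sketched in a few lines; the contact structure, transmission probability, and infectious period below are toy assumptions, not the Pitt/RTI model's calibrated values.

```python
import random

random.seed(3)
N, CONTACTS, P_TRANSMIT, DAYS_INF = 2000, 10, 0.03, 5  # toy parameters

# Per-agent state: 0 = susceptible, k > 0 = infectious for k more days, -1 = recovered.
state = [0] * N
for i in random.sample(range(N), 10):
    state[i] = DAYS_INF                                # seed initial infections

total_infected = 10
for day in range(100):
    infectious = [i for i in range(N) if state[i] > 0]
    for i in infectious:
        for j in random.sample(range(N), CONTACTS):    # random daily contacts
            if state[j] == 0 and random.random() < P_TRANSMIT:
                state[j] = DAYS_INF
                total_infected += 1
    for i in infectious:                               # advance the disease clock
        state[i] -= 1
        if state[i] == 0:
            state[i] = -1                              # recovered and immune
```

Intervention scenarios like the school closures and vaccination strategies described above correspond to removing contacts or pre-setting agents to the recovered state, then re-running the simulation many times to estimate the effect.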

Liwei Li, Khuchtumur Bum-Erdene, Josh Rosen, Marlon Pierce and Samy Meroueh. BioDrugScreen: a computational drug design resource for ranking molecules docked to the human proteome
Abstract: BioDrugScreen is a computational drug design and discovery resource and server. The portal contains the DOPIN (Docked Proteome Interaction Network) database. The DOPIN database contains millions of pre-docked and pre-scored complexes from thousands of targets from the human proteome and thousands of drug-like small molecules from the NCI diversity set and other sources. The portal is also a server that can be used to (i) customize scoring functions and apply them to rank molecules and targets in DOPIN; (ii) dock against pre-processed targets of the PDB; and (iii) search for off-targets. BioDrugScreen enables users to create their own scoring functions — using affinity, structures, and pre-computed descriptors from existing databases such as PDBbind, BindingDB, and PDBcal — to validate these scoring functions with enrichment curves and ROC plots, and to apply these scores to rank molecules in the DOPIN database. The portal provides users the option to dock molecules to pre-identified cavities within targets of the PDB. In a few simple steps, the user submits the docking and GBSA scoring to the TeraGrid using our account. The jobs can be monitored within the portal. Finally, the portal provides off-target information for all molecules within DOPIN, making it possible to identify new uses for molecules and drugs within the database. We present data on our first attempt to characterize the pharmacology of molecules within the DOPIN database. In addition, we describe two examples whereby the Web resource was used to identify molecules that were acquired and experimentally validated for inhibition of their intended targets. Some of these compounds were found to possess anti-cancer properties by inhibiting growth of lung tumor cells in culture.
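The "customize a scoring function and rank" workflow can be illustrated with a minimal sketch. This is not BioDrugScreen's actual API; the descriptor names (`vdw`, `elec`, `gb_solv`) and the weights are hypothetical stand-ins for the pre-computed descriptors the portal exposes:

```python
# Hypothetical pre-docked complexes with pre-computed interaction descriptors
# (values and field names are illustrative, not from DOPIN).
complexes = [
    {"ligand": "mol_001", "vdw": -42.1, "elec": -10.3, "gb_solv": 18.7},
    {"ligand": "mol_002", "vdw": -35.6, "elec": -22.9, "gb_solv": 25.2},
    {"ligand": "mol_003", "vdw": -48.0, "elec": -5.1,  "gb_solv": 12.4},
]

def score(c, w_vdw=1.0, w_elec=0.5, w_solv=1.0):
    """A user-defined scoring function: a weighted linear combination of
    pre-computed descriptors (lower = more favorable binding)."""
    return w_vdw * c["vdw"] + w_elec * c["elec"] + w_solv * c["gb_solv"]

# Rank all complexes by the custom score, best first.
ranked = sorted(complexes, key=score)
```

The point of pre-docking and pre-scoring millions of complexes is exactly this: once the descriptors exist, re-ranking the whole database under a new user-defined scoring function is a cheap sort rather than a new docking run.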

Alexei Kritsuk, Sergey Ustyugov and Michael Norman. Super-Alfvenic Turbulence in Molecular Clouds: Inferences from Simulations of Multiphase Interstellar Medium
Abstract: Highly compressible magnetized turbulence is a major agent shaping the hierarchical structure of interstellar clouds on a wide range of scales from 0.1 to 100 parsecs. Giant molecular clouds are believed to form from large-scale compressions in the diffuse interstellar medium. Turbulent motions within molecular clouds generate shock waves that tear the cloud material into a hierarchy of smaller and smaller clumps. They also provide the necessary kick to overcome the outward pressure and cause the densest cloud cores to collapse leading to the birth of stars. This "turbulent fragmentation" is believed to shape the initial mass function of newly born stars. Details of the fragmentation process, however, depend critically on the degree of magnetization of molecular cloud material which is very hard to measure observationally. A competing "traditional" view of molecular cloud fragmentation assumes that the clouds are magnetically supported and that star formation is instead mainly controlled by ambipolar diffusion. We use numerical simulations of molecular cloud formation to explore which of these two star formation scenarios is more viable.
Our numerical experiments exploit self-organization of structures in multiphase MHD turbulence in a periodic domain of 100 pc threaded by a uniform magnetic field. Our statistical analysis of the turbulence in simulations with different degrees of magnetization demonstrates that molecular clouds are born super-Alfvenic even in models with the equipartition of kinetic and magnetic energy established on large scales. This gives strong support to the turbulent fragmentation scenario. Our next step is to include self-gravity of the gas to get a more complete picture of molecular cloud formation. We will be using a version of the ENZO code developed by the Laboratory for Computational Astrophysics at UC San Diego, which incorporates our new state-of-the-art solver for supersonic MHD turbulence simulation.
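The super-Alfvenic criterion can be made concrete: turbulence is super-Alfvenic when the rms velocity exceeds the Alfven speed v_A = B / sqrt(4*pi*rho) (Gaussian cgs units). A small sketch, using illustrative order-of-magnitude molecular-cloud numbers rather than values from these simulations:

```python
import math

def alfven_mach(v_rms_cm_s, B_gauss, rho_g_cm3):
    """Alfven Mach number M_A = v_rms / v_A, with v_A = B / sqrt(4*pi*rho)
    in Gaussian cgs units. M_A > 1 means super-Alfvenic turbulence."""
    v_alfven = B_gauss / math.sqrt(4.0 * math.pi * rho_g_cm3)
    return v_rms_cm_s / v_alfven

# Illustrative numbers (order of magnitude only): v_rms ~ 2 km/s,
# B ~ 10 microgauss, rho ~ 3.9e-22 g/cm^3 (n ~ 100 molecules per cm^3).
m_a = alfven_mach(2.0e5, 1.0e-5, 3.9e-22)
```

With these rough values the Mach number comes out slightly above 1, i.e. mildly super-Alfvenic; the simulations measure this statistic self-consistently inside the forming clouds rather than from bulk averages.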
This research is supported in part by NSF grants AST-0808184 and AST-0908740, and by TeraGrid allocation MCA07S014. We utilized computing resources provided by the National Institute for Computational Sciences (Cray XT5 Kraken) and by the San Diego Supercomputer Center.

J Kinter. Dedicated High-End Computing to Revolutionize Climate Modeling: An International Collaboration
Abstract: The presentation will describe a collaboration bringing together an international team of over 30 people, from six institutions on three continents, including climate and weather scientists and modelers, and experts in high-performance computing (HPC), to demonstrate the feasibility of using dedicated HPC resources to rapidly accelerate progress in addressing one of the most critical problems facing the global community, namely, global climate change. The scientific basis for undertaking this project was established in the World Modeling Summit, held in May 2008 in Reading, UK. In this project, two types of computationally intensive experiments used the entire 18,048-core Athena Cray XT-4 supercomputer at the University of Tennessee’s National Institute for Computational Sciences (NICS) for the period October 2009 – March 2010. The numerical experiments were intended to determine whether increasing weather and climate model resolution to accurately resolve mesoscale phenomena in the atmosphere can improve the fidelity of the models in simulating the mean climate and the distribution of variances and covariances. Explicitly resolving cloud processes in the atmosphere, without approximation by parameterization, was also examined. The effect of increasing greenhouse gas concentrations, associated with global warming, on the regional aspects of extreme temperature and precipitation, storminess, floods and droughts in key regions of the world was also evaluated in these experiments.
IFS Experiments: Experimental versions of the European Centre for Medium-range Weather Forecasts (ECMWF) Integrated Forecast System (IFS), a global atmospheric general circulation model used operationally every day to produce 10-day weather forecasts, were run at several resolutions down to 10-km grid spacing to evaluate the statistical distribution and nature of high-impact and extreme events in 20th and 21st century simulations.
NICAM Experiments: The NICAM global atmospheric model from the Japan Agency for Marine-Earth Science and Technology (JAMSTEC) was run at 7-km grid resolution to simulate the boreal summer climate, over many years, focusing on tropical cyclones, monsoon systems, and summer flood and drought situations.
The project has stretched the limits of CPU, disk, I/O, metadata management and tape archive resources. Both models were run in long simulations for the first time in the U.S. The data generated by this project will be invaluable both for the large community of climate scientists interested in the impact of high-resolution modeling and for computational scientists who will learn about the operational considerations of running dedicated production at nearly petascale.

Ron Dror. Long-timescale simulations of GPCRs on Anton, a specialized molecular dynamics machine
Abstract: A mounting body of evidence indicates that G-protein–coupled receptors (GPCRs)—which represent the largest class of both human membrane proteins and drug targets—can interconvert between numerous conformational states with distinct intracellular signaling profiles. Molecular dynamics (MD) simulation offers a promising method to characterize these states and the transitions between them, but the timescales on which such transitions occur extend well beyond those traditionally accessible by MD. We recently completed a special-purpose machine, named Anton, that accelerates MD simulations of biomolecular systems by orders of magnitude compared with the previous state of the art, enabling for the first time all-atom protein simulations as long as a millisecond. This talk will describe ongoing studies of GPCRs on Anton, which have provided a hitherto elusive glimpse of the conformational dynamics underlying GPCR-mediated signaling by both endogenous ligands and drugs.

He Huang and Liqiang Wang. PLSQR: An MPI based Parallel Implementation of LSQR Algorithm for Seismic Tomography
Abstract: LSQR (Sparse Equations and Least Squares) is a Krylov subspace method widely used in seismic tomography. It typically runs for hundreds or thousands of iterations, each dominated by sparse matrix-vector multiplication (SpMV). In real seismic applications the matrix can be very large and sparse, making it nearly impossible to execute LSQR sequentially on such datasets within time and memory limits. In this paper, we propose and implement a parallelized LSQR (PLSQR) based on MPI. The main features of PLSQR include: (1) it partitions both the matrix and the vector; (2) it avoids keeping a transposed copy of the matrix by reusing the original matrix; (3) to exploit emerging GPU (graphics processing unit) computing, CUDA-based SpMV kernels are developed and integrated with PLSQR; and (4) a load-balance strategy distributes the data partitions approximately evenly among compute processes. PLSQR has been tested and evaluated on the TeraGrid Lincoln cluster.
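Feature (2) — computing both the A·x and Aᵀ·x products that every LSQR iteration needs from one stored matrix — can be sketched with raw CSR arrays. This is a serial toy illustration, not the MPI/CUDA PLSQR code; the reference solve uses SciPy's `lsqr` on a small random problem:

```python
import numpy as np
from scipy.sparse import random as sprandom
from scipy.sparse.linalg import lsqr

def spmv(indptr, indices, data, x, n_rows):
    """y = A @ x computed directly from CSR arrays."""
    y = np.zeros(n_rows)
    for i in range(n_rows):
        for k in range(indptr[i], indptr[i + 1]):
            y[i] += data[k] * x[indices[k]]
    return y

def spmv_transpose(indptr, indices, data, x, n_cols):
    """y = A.T @ x from the SAME CSR arrays -- no transposed copy is stored;
    each stored entry A[i, j] scatters x[i] into y[j]."""
    y = np.zeros(n_cols)
    for i in range(len(indptr) - 1):
        for k in range(indptr[i], indptr[i + 1]):
            y[indices[k]] += data[k] * x[i]
    return y

# Small synthetic least-squares problem (sizes are illustrative).
rng = np.random.default_rng(0)
A = sprandom(50, 20, density=0.1, format="csr", random_state=0)
x_true = rng.standard_normal(20)
b = A @ x_true
x_est = lsqr(A, b, atol=1e-10, btol=1e-10)[0]   # reference LSQR solve
```

In PLSQR the row blocks of the CSR structure are distributed over MPI ranks, so the forward product needs no communication while the transpose product requires a reduction of the scattered partial sums; the CUDA kernels replace the inner loops above.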


Contact Science Track Co-Chairs Philip Blood (PSC), and Amit Majumdar (SDSC).