PIs at the University of California, Davis, active in the last 90 days
Allocations with low numbers of SUs (10,000 or fewer) are usually educational allocations, startup allocations, or extensions. Allocations with fewer than 10 SUs are usually used for storage purposes.


Name | Project Title | Teragrid Resource | Discipline | Board Type | Base Allocation
Hajar Amini | Construction of De novo transcriptome assembly to identify candidate pathway involved in the production of medicinal compounds in Ferula assafoetida | IU/TACC Jetstream | Biological Sciences | Startup | 300,000
" | " | IU/TACC Storage (Jetstream Storage) | " | " | 2,000
Varaprasad Bandaru | Developing Spatially Explicit Regional Modeling Framework for Studying Impacts of Poplar based Bioenergy Systems | PSC Regular Memory (Bridges) | Ecological Studies | Startup | 50,000
" | " | SDSC Dell Cluster with Intel Haswell Processors (Comet) | " | " | 50,000
" | " | PSC Storage (Bridges Pylon) | " | " | 500
" | " | SDSC Medium-term disk storage (Data Oasis) | " | " | 500
Sebastian Bender | Soil biodiversity and ecosystem functioning in agricultural systems | IU/TACC Jetstream | Ecological Studies | Startup | 50,000
" | " | IU/TACC Storage (Jetstream Storage) | " | " | 100
C. Titus Brown | Compute Infrastructure to Support the Data Intensive Biology Summer Institute for Sequence Analysis at UC Davis | IU/TACC Jetstream | Biological Sciences | Educational | 432,000
" | UC Davis GGG 201b: lab section | IU/TACC Jetstream | " | " | 23,100
Nancy Chen | Recombination rate variation in a wild population of Florida Scrub-Jays | PSC Regular Memory (Bridges) | Biological Sciences | Startup | 50,000
" | " | PSC Large Memory Nodes (Bridges Large) | " | " | 1,000
" | " | PSC Storage (Bridges Pylon) | " | " | 500
Roland Faller | Direct phase equilibrium simulation of NIPAM oligomers in water and optimization of the potential | XStream/Stanford University GPU Supercomputer (Cray CS-Storm, Intel Ivy-Bridge, NVIDIA K80) | Chemistry | Startup | 5,000
Melissa Kardish | The role of microbiota in mediating local adaptation and plant influence on ecosystem function in a marine foundation species, Zostera marina | SDSC Dell Cluster with Intel Haswell Processors (Comet) | Ecological Studies | Startup | 50,000
" | " | SDSC Medium-term disk storage (Data Oasis) | " | " | 500
Louise Kellogg | CIG Science Gateway and Community Codes for the Geodynamics Community | TACC Dell/Intel Knights Landing, Skylake System (Stampede2) | Geophysics | Research | 51,070
" | " | HP/NVIDIA Interactive Visualization and Data Analytics System (Maverick) | " | " | 15,000
" | " | TACC Dell PowerEdge C8220 Cluster with Intel Xeon Phi coprocessors (Stampede) | " | " | 12,063
" | " | TACC Long-term tape Archival Storage (Ranch) | " | " | 10,000
Yong Jae Lee | Large-scale Video Object Detection | PSC Storage (Bridges Pylon) | Robotics and Machine Intelligence | Startup | 15,000
" | " | PSC Bridges GPU (Bridges GPU) | " | " | 6,250
John Naliboff | Testing the scalability and numerical efficiency of long-term tectonic models of continental extension | SDSC Dell Cluster with Intel Haswell Processors (Comet) | Geophysics | Startup | 50,000
" | " | SDSC Medium-term disk storage (Data Oasis) | " | " | 500
N Tessa Pierce | Assessment of RNA editing over Doryteuthis opalescens development | IU/TACC Jetstream | Genetics and Nucleic Acids | Startup | 100,000
" | " | IU/TACC Storage (Jetstream Storage) | " | " | 8,000
Yundi Quan | surveying binary superconducting hydrides using ab-initio methods | PSC Regular Memory (Bridges) | Physics | Startup | 50,000
" | " | IU/TACC Jetstream | " | " | 50,000
" | " | PSC Storage (Bridges Pylon) | " | " | 1,000
NAVNEET RAI | Prediction of cellular state using deep neural networks | Open Science Grid (OSG) | Biological and Critical Systems | Startup | 200,000
" | " | SDSC Comet GPU Nodes (Comet GPU) | " | " | 2,500
" | " | SDSC Medium-term disk storage (Data Oasis) | " | " | 1,000
Anandkumar Surendrarao | : Genome assembly, annotation and characterization of Fusarium strains to understand their evolution in the context of chickpea pathogenicity | TACC Dell/Intel Knights Landing, Skylake System (Stampede2) | Biological Sciences | Startup | 1,600
" | " | TACC Long-term tape Archival Storage (Ranch) | " | " | 500
Dean Tantillo | MECHANISMS OF BIOORGANIC AND ORGANOMETALLIC CYCLIZATION REACTIONS | SDSC Dell Cluster with Intel Haswell Processors (Comet) | Organic and Macromolecular Chemistry | Research | 1,660,603
" | " | TACC Dell/Intel Knights Landing, Skylake System (Stampede2) | " | " | 48,979
" | " | TACC Long-term tape Archival Storage (Ranch) | " | " | 500
" | " | SDSC Medium-term disk storage (Data Oasis) | " | " | 500
Igor Vorobyov | Elucidation of molecular mechanisms of sex-dependent pro-arrhythmia through hERG block by drugs and steroid hormones | SDSC Dell Cluster with Intel Haswell Processors (Comet) | Biophysics | Startup | 50,000
" | " | XStream/Stanford University GPU Supercomputer (Cray CS-Storm, Intel Ivy-Bridge, NVIDIA K80) | " | " | 5,000
" | " | SDSC Comet GPU Nodes (Comet GPU) | " | " | 2,500
" | " | SDSC Medium-term disk storage (Data Oasis) | " | " | 1,000
" | Atomistic simulations to elucidate molecular mechanisms of drug- and hormone-induced pro-arrhythmia proclivities | SDSC Dell Cluster with Intel Haswell Processors (Comet) | " | Research | 1,503,790
" | " | SDSC Comet GPU Nodes (Comet GPU) | " | " | 154,733
" | " | TACC Dell/Intel Knights Landing, Skylake System (Stampede2) | " | " | 26,038
" | " | SDSC Medium-term disk storage (Data Oasis) | " | " | 22,500
" | " | TACC Long-term tape Archival Storage (Ranch) | " | " | 500
Andrew Wetzel | Simulating the Local Group | TACC Dell/Intel Knights Landing, Skylake System (Stampede2) | Astronomical Sciences | Research | 116,674
" | " | TACC Long-term tape Archival Storage (Ranch) | " | " | 30,000
Matthew Williamson | Spatially explicit estimates of the likelihood of conservation action | Open Science Grid (OSG) | Ecological Studies | Startup | 100,000
" | " | PSC Regular Memory (Bridges) | " | " | 10,000
" | " | PSC Large Memory Nodes (Bridges Large) | " | " | 1,000
" | " | PSC Storage (Bridges Pylon) | " | " | 500
Vladimir Yarov-Yarovoy | State-dependent drug modulation of sodium channels | Open Science Grid (OSG) | Biophysics | Startup | 200,000
" | " | PSC Regular Memory (Bridges) | " | " | 50,000
" | " | LSU Cluster (superMIC) | " | " | 50,000
" | " | XStream/Stanford University GPU Supercomputer (Cray CS-Storm, Intel Ivy-Bridge, NVIDIA K80) | " | " | 5,000
" | " | PSC Bridges GPU (Bridges GPU) | " | " | 2,500
" | " | PSC Storage (Bridges Pylon) | " | " | 1,000

Project Abstract

Construction of De novo transcriptome assembly to identify candidate pathway involved in the production of medicinal compounds in Ferula assafoetida

PI: Hajar Amini



Ferula assafoetida is an important source of oleo-gum-resins such as asafoetida, which is used therapeutically for conditions such as inflammation, neurological disorders, digestive disorders, rheumatism, headache, arthritis and dizziness. It is therefore important to determine the biological properties of oleo-gum-resin compounds isolated from F. assafoetida. However, in spite of the known medicinal attributes of compounds from F. assafoetida, most of these compounds, as well as the enzymes involved in their biosynthesis, remain uncharacterized at the molecular level. We therefore decided to evaluate the transcriptome and metabolome of different tissues of F. assafoetida to identify candidate mechanisms and pathways involved in the production of some important medicinal compounds. This proposal requests resources on the Jetstream cloud for assembling the transcriptome of Ferula from RNA-Seq reads generated from three different plant species and from four different tissues. The de novo transcriptome assembly will be constructed using Trinity after the reads have been subjected to quality trimming and digital normalization. De novo assembly is a highly memory-intensive and time-consuming process, and it involves several iterations of running the assembler with different k-mer sizes until an optimum assembly is generated. Once the assembly is constructed, it will be assessed using a variety of tools such as Transrate and BUSCO. The final part of the analysis will be annotating the assembled transcriptome using the Dammit software. Currently I am using the High Performance Computing cluster at UC Davis for the initial assembly, but there is a long wait time to start any kind of analysis on the cluster. An allocation of resources on the public Jetstream cloud, with persistent storage, will allow us to explore this data set further and to run the whole pipeline easily and rapidly.
Our results from this analysis will facilitate studies on the functions of genes involved in the secondary metabolite biosynthesis pathway in other medicinal plants. Furthermore, information about the metabolic pathways in this transcriptome is very valuable for understanding the biosynthesis of oleo-gum-resin, such as where it is produced and which tissue transfers it to other parts. Resources Request Information: In order to achieve these goals, I request the following: 100,000 SUs; an s1.xxlarge instance (44 CPUs, 120 GB memory, 480 GB disk); 1 TB of external volume space for storing my raw RNA-Seq reads as well as all the outputs and intermediate files generated from

Project Abstract

Developing Spatially Explicit Regional Modeling Framework for Studying Impacts of Poplar based Bioenergy Systems

PI: Varaprasad Bandaru



As a renewable energy source, biofuels are expected to play an important role in sustainably meeting U.S. long-term energy goals. As part of a larger regional research and development initiative focused on sustainable ways of producing biofuel from poplar in the U.S. Pacific Northwest (PNW) region (for details, visit http://hardwoodbiofuels.org), we are interested in modeling hybrid poplar to understand different aspects at the regional level, including 1) identifying potential locations for growing poplar plantations; 2) assessing the inherent biomass potential of suitable locations; 3) evaluating the environmental and economic impacts of adopting hybrid poplar in place of current croplands and conserved grasslands. For this assessment, we are planning to implement the Environmental Policy Integrated Climate Model (EPIC) at high spatial resolution. EPIC is an integrated biophysical and biogeochemical simulation model that can be used to assess available feedstock for bioenergy, water and soil quality, greenhouse gas emissions, and nutrient loss under various climate and management conditions. Since EPIC is a point-scale model, when it is applied at the spatial scale each pixel is treated as one simulation point. Using earlier allocations, we were able to build a framework to run the EPIC model using parallel computing and ran simulations for small regions in the Pacific Northwest, but the assessment requires running the model over all croplands and grasslands in the PNW region. As such, we need access to GORDON and would like to request renewal of our project for another year.

Project Abstract

Soil biodiversity and ecosystem functioning in agricultural systems

PI: Sebastian Bender



Soils are among the most species rich habitats on Earth, and are of fundamental importance for terrestrial ecosystems. It is increasingly being recognized that human land-use, such as intensive agricultural land management, has adverse effects on soil biota and their diversity. Moreover, recent research findings suggest that soil organisms are key players for ecosystem functioning and, hence, determine the ecosystem services delivered by soils. Therefore, reductions in soil biodiversity induced by human land-use may also lead to a decline in ecosystem functioning. First evidence for this has been generated in model systems in greenhouse experiments showing that reductions in soil biodiversity lead to a decline of several ecosystem functions simultaneously (i.e. ecosystem multifunctionality), but field-based evidence for this relationship is rare. A detailed understanding of the factors and processes determining ecosystem-service delivery is, however, of pivotal importance for human well-being. In this project, the effect of agricultural management intensity on soil biota and ecosystem multifunctionality will be investigated in a range of fields differing in management intensity across Northern California. Moreover, it will be tested whether the removal of soil organisms has stronger effects in natural and extensively managed ecosystems as compared to intensively managed systems. It is hypothesized that natural ecosystems possess a high capacity for internal self-regulation, provided by soil organisms. With increasing land-use intensity, the capacity of ecosystems for internal self-regulation is reduced, as these systems comprise lower soil biodiversity and depend on external resource inputs (e.g. fertilizers). Assessments of soil biodiversity and ecosystem functioning will be complemented with state-of-the-art metagenomic analyses to analyse the functional capacities of soils and to identify potential indicator species for the respective land-use types. 
This project will provide important basic information on the role played by soil biodiversity in ecosystem functioning and how this is affected by land-use intensity.

Project Abstract

Compute Infrastructure to Support the Data Intensive Biology Summer Institute for Sequence Analysis at UC Davis

PI: C. Titus Brown



Large datasets have become routine in biology. However, performing a computational analysis of a large dataset can be overwhelming, especially for novices. From June 18 to July 21, 2017 (30 days), the Lab for Data Intensive Biology will be running several different computational training events at the University of California, Davis for 100 people and 25 instructors. In addition, there will be a week-long instructor training in how to reuse our materials, and focused workshops, such as: GWAS for veterinary animals, shotgun environmental -omics, binder, non-model RNAseq, introduction to Python, and lesson development for undergraduates. The materials for the workshop were previously developed and tested by approximately 200 students on Amazon Web Services cloud compute services at Michigan State University’s Kellogg Biological Station between 2010 and 2016, with support from the USDA and NIH. Materials are and will continue to be CC-BY, with scripts and associated code under BSD; the material will be adapted for Jetstream cloud usage and made available for future use.

Project Abstract

UC Davis GGG 201b: lab section

PI: C. Titus Brown



Prokaryotic and eukaryotic genomes. Experimental strategies and analytical challenges of modern genomics research and the theory and mechanics of data analysis. Structural, functional, and comparative genomics. Related issues in bioinformatics. In this course, we run 10 practical computational labs and have three associated homeworks. The labs cover software install, shell-level data analysis, Jupyter Notebook & RStudio-based visualization, etc.

Project Abstract

Recombination rate variation in a wild population of Florida Scrub-Jays

PI: Nancy Chen



Meiotic recombination plays an important role in determining levels of genetic diversity in eukaryotic genomes. Understanding the causes and consequences of variation in recombination rates is therefore crucial for predicting how populations respond to selection and studying genotype-phenotype associations. However, our knowledge of the factors governing recombination rate variation in natural populations remains limited. We propose to investigate the recombination landscape and individual variation in recombination rates using extensive pedigree and genomic data in a wild population of Florida Scrub-Jays (Aphelocoma coerulescens). We will build a high-density linkage map using CRIMAP to estimate individual recombination rates across the genome for males and females, then test for environmental and genetic factors associated with recombination rate variation. A detailed linkage map for the Florida Scrub-Jay will also provide insights into the avian recombination landscape and serve as an important resource for studies of evolution.

Project Abstract

Direct phase equilibrium simulation of NIPAM oligomers in water and optimization of the potential

PI: Roland Faller



N-isopropylacrylamide-based polymers (PNIPAM) are among the best-studied thermoresponsive materials. They can be used in a wide range of applications, including catalysis, sensors, enzyme encapsulation and drug delivery, which makes it very desirable to understand the molecular behavior of PNIPAM. In water, PNIPAM shows a lower critical solution temperature (LCST) at 305 K and a conformational transition of single chains at the same temperature. Below this temperature PNIPAM is completely soluble in water, but above the LCST water and PNIPAM separate into two pure phases. In recent years this behavior has been studied in atomistic molecular dynamics (MD) simulations to gain a deeper understanding of the mechanisms leading to the phase separation. Because the molecular mechanisms are very complex, to date the correct phase behavior has not been simulated without an error in the LCST. For better results in MD simulations, a modification of the potential for PNIPAM can be introduced such that the LCST is shifted to the experimentally observed value. The objective of this work is to adapt a modification of the potential such that it fits the experimentally observed data. To achieve this goal, MD simulations of oligo-NIPAM using Amber94 + TIP3P force fields will be performed. The parameter for modification of the potential will then be fitted to match the experimental results. For this purpose, experimental results for NIPAM oligomers (ONIPAM), synthesized at the “Leibniz-Institut für Interaktive Materialien” (DWI), are available. Thus a model that can simulate the real LCST of ONIPAM will be developed.

Project Abstract

The role of microbiota in mediating local adaptation and plant influence on ecosystem function in a marine foundation species, Zostera marina

PI: Melissa Kardish



A growing body of research suggests that microbiota interact with plants and animals to alter host fitness and disease resistance. Furthermore, microbiome composition can vary among host genotypes and environments, and may contribute to observed variation in host phenotype. Individual variation in phenotype within key species, such as foundation plant species or keystone consumers, affects the structure and functioning of entire ecosystems, providing a potentially important mechanism by which microbiomes contribute to the functioning of macroscopic ecosystems. However, few experiments test causal links between host phenotype and microbiome composition, and, outside of a few model systems, virtually no studies examine the cascading effects of variation in a host’s microbiome on communities or ecosystems. I conducted a series of reciprocal transplants of the marine angiosperm Zostera marina (eelgrass) and have sequenced the V4-V5 region of the 16S rRNA gene of bacteria associated with leaves, roots, and adjacent sediment. This will allow me to examine the sources of natural variation in the eelgrass microbiome, and the potential consequences of microbiome composition for host fitness, host local adaptation, and the effect of eelgrass on ecosystem structure and functioning. To accomplish this analysis, I would like to use QIIME on the Gordon Computing Cluster to process the 16S data from these transplants as well as from temporal data.

Project Abstract

CIG Science Gateway and Community Codes for the Geodynamics Community

PI: Louise Kellogg



The Computational Infrastructure for Geodynamics (CIG), an NSF cyberinfrastructure facility, aims to enhance the capabilities of the geodynamics community by developing software that can be used to address a range of challenging problems in geophysics. CIG supports code development and benchmarking, user training, and new users by providing small allocations of computation time along with user support for CIG codes. CIG supports these efforts in the following areas of activity: mantle dynamics, seismic wave propagation, geodynamo, and crustal and lithospheric dynamics on both million-year and earthquake time-scales. These efforts have resulted in successful allocation requests by our community and the involvement of international researchers in benchmarking the next generation of geodynamo codes, all of which were enabled by our community allocation.

Project Abstract

Large-scale Video Object Detection

PI: Yong Jae Lee



Visual object detection is a fundamental problem in computer vision, and has broad applicability in numerous fields including AI, defense, medicine, and agriculture. While there has been a long history of research in detecting objects in static images, there has been relatively little research in detecting objects in videos. However, cameras on robots, surveillance systems, unmanned vehicles, wearable devices, etc., receive videos and not static images. Thus, for these systems to recognize the key objects and their interactions, it is critical that they be equipped with accurate video object detectors. In this project, we propose a novel machine perception framework for detecting objects in video. The key contribution is a deep recurrent spatial-temporal memory network that models the long-term temporal dependencies of an object's appearance and motion. We have begun development of the model and have initial results on the ImageNet-VID dataset, which contains 5000+ videos across 30 object categories. We would like to improve the approach in both speed and accuracy, and to evaluate it on a larger-scale dataset (e.g., the YouTube-BB dataset, which has 240,000 videos). The compute requirements to do so are beyond the current resources available at UC Davis. Our model currently takes 0.2 seconds to process each frame, which by itself is not slow, but this quickly adds up when we have to process, e.g., 240,000 videos each with thousands of frames. Thus, we would like to parallelize the work on a GPU cluster. We are requesting 2500 GPU hours on the Bridges GPU (PSC) cluster and 15 terabytes of storage on the PSC Pylon.
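The scale of the per-frame cost quoted above can be illustrated with a short back-of-the-envelope calculation. This is only a sketch: the 0.2 seconds per frame, the 240,000 videos, and the 2500 GPU hours come from the abstract, while the figure of 1,000 frames per video is a hypothetical stand-in for "thousands of frames" and the function names are invented for illustration.

```python
def serial_gpu_hours(num_videos: int, frames_per_video: int,
                     seconds_per_frame: float) -> float:
    """Hours needed to process every frame one after another on one GPU."""
    total_seconds = num_videos * frames_per_video * seconds_per_frame
    return total_seconds / 3600.0


def frames_within_budget(gpu_hours: float, seconds_per_frame: float) -> float:
    """Number of frames that fit into a given GPU-hour budget."""
    return gpu_hours * 3600.0 / seconds_per_frame


NUM_VIDEOS = 240_000        # from the abstract (YouTube-BB)
SECONDS_PER_FRAME = 0.2     # from the abstract
REQUESTED_GPU_HOURS = 2500  # from the abstract
FRAMES_PER_VIDEO = 1_000    # ASSUMPTION: placeholder for "thousands of frames"

hours_needed = serial_gpu_hours(NUM_VIDEOS, FRAMES_PER_VIDEO, SECONDS_PER_FRAME)
frames_covered = frames_within_budget(REQUESTED_GPU_HOURS, SECONDS_PER_FRAME)

print(f"Serial GPU hours for the full dataset: {hours_needed:,.0f}")
print(f"Frames covered by the requested budget: {frames_covered:,.0f}")
```

Under the assumed 1,000 frames per video, a full pass would take roughly 13,333 GPU hours, which is why the work has to be spread across many GPUs; the real figure depends on the actual video lengths.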

Project Abstract

Testing the scalability and numerical efficiency of long-term tectonic models of continental extension

PI: John Naliboff



This request for a startup allocation on the XSEDE-supported cluster Comet follows preliminary scaling tests on Comet and prior work on the Stampede1 cluster related to my research at the Computational Infrastructure for Geodynamics (CIG). As a project scientist at CIG, my work centers on developing, testing and applying the finite-element code ASPECT to simulations of long-term tectonic deformation (viscous and brittle behavior) in the solid earth. ASPECT is built on the open source finite element library deal.II, which provides massive scalability across 10^3-10^4 cores, adaptive mesh-refinement capabilities and robust linear and non-linear solvers. To date, strong and weak scaling tests with ASPECT have been performed on a wide range of clusters, including Stampede1, Lonestar, HLRN (Berlin) and many additional smaller clusters. These scaling results have been published in multiple peer-reviewed articles and are also contained in the CIG proposal for computing on Stampede1: “CIG Science Gateway and Community Codes for the Geodynamics Community” (TG-EAR080022N). Here, I am applying for a startup allocation of 50,000 core-hours on Comet to perform additional scaling tests with ASPECT and test the relative efficiency of different model configurations (non-linear solver tolerances, linear solver schemes, etc.) for 3-D simulations of continental extension. These simulations of continental extension are built on extensive 2-D and 3-D sensitivity tests for relatively small model sizes (< 10^7 degrees of freedom) and a limited (< 5) number of large 3-D simulations (> 10^8 degrees of freedom) run on Stampede1. Through a small trial allocation (1000 SUs) on Comet, I have performed strong and weak scaling tests on up to 96 cores for models that range from ~60,000 to ~16,000,000 degrees of freedom. This trial allocation will be used in part for scaling tests that examine models with up to 10^9 degrees of freedom run across hundreds or thousands of cores (up to 1536).
Notably, these scaling tests are based on relatively simple models that only require linear solvers and do not contain large variations in material parameters. To ensure the code scales efficiently on Comet for models using a non-linear rheology and large (orders of magnitude) variations in material properties, I will perform a second series of scaling tests with a model setup derived from earlier simulations of continental breakup. While these two series of scaling tests will likely require on the order of 10-20 thousand core-hours, the models are only run for a small number (1-2) of time steps. In contrast, the simulations of continental breakup require thousands of time steps, during which the dynamics can change significantly. To ensure that the predicted scaling behavior extends throughout the model duration, the remaining core-hours (30-40 thousand) will be used to run one large simulation to completion. This estimate is based on the preliminary 3-D simulation of continental extension (~10^8 degrees of freedom) run on Stampede1. As an example, one model required 10.333 hours on 960 cores (~9920 core-hours) to run for 25% of the simulation time planned for future models. The results of the scaling tests and trial simulation outlined above will form the basis of a proposal requesting further computing time on Comet for a series of production runs. If further details regarding ASPECT or the planned scaling tests are required, I will provide them promptly. Thank you for considering this startup allocation request of 50,000 core-hours; I look forward to hearing from you.
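The core-hour budget described above can be checked with a few lines of arithmetic. The sketch below uses only figures quoted in the abstract (10.333 hours on 960 cores for 25% of the planned simulation time, 10-20 thousand core-hours for scaling tests, a 50,000 core-hour request). It assumes, as the abstract implicitly does, that cost scales linearly with simulated time and that the Stampede1 cost carries over to Comet; the variable and function names are illustrative.

```python
def core_hours(wall_clock_hours: float, cores: int) -> float:
    """Core-hours consumed by a job: wall-clock time multiplied by core count."""
    return wall_clock_hours * cores


# Figures quoted in the abstract.
PARTIAL_RUN_HOURS = 10.333     # wall-clock hours of the example model
PARTIAL_RUN_CORES = 960
FRACTION_COMPLETED = 0.25      # the run covered 25% of the planned time
SCALING_TESTS = (10_000, 20_000)  # estimated range for both test series
REQUEST = 50_000               # startup allocation requested

partial_cost = core_hours(PARTIAL_RUN_HOURS, PARTIAL_RUN_CORES)

# ASSUMPTION: cost grows linearly with simulated time, so a full run costs
# the partial cost divided by the fraction completed.
full_run_cost = partial_cost / FRACTION_COMPLETED

# Core-hours left for the full run after the low and high scaling-test estimates.
remaining_high = REQUEST - SCALING_TESTS[0]
remaining_low = REQUEST - SCALING_TESTS[1]

print(f"Partial run:  {partial_cost:,.0f} core-hours")
print(f"Full run:     {full_run_cost:,.0f} core-hours (linear extrapolation)")
print(f"Remaining after scaling tests: {remaining_low:,} to {remaining_high:,} core-hours")
```

The extrapolated full run comes to just under 40,000 core-hours, which sits at the upper end of the 30-40 thousand core-hours the abstract reserves for one complete simulation.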

Project Abstract

Assessment of RNA editing over Doryteuthis opalescens development

PI: N Tessa Pierce



This proposal is requesting resources from the IU/TACC Jetstream cloud system for the purpose of assessing differential RNA editing over development in the California market squid, Doryteuthis opalescens. Recent work has identified extensive RNA editing of coding sequences as a unique characteristic of adult coleoid cephalopods, and suggested that it may contribute to the neural and behavioral plasticity that characterizes these animals [1]. Preliminary identification of putative transcriptome-wide editing sites and RNA editing enzymes in Doryteuthis opalescens developmental transcriptomes suggests a role for RNA editing in developmental plasticity as well. To elucidate this role, I will analyze corresponding RNA and DNA Illumina data from Doryteuthis opalescens to investigate the prevalence of RNA editing across a time course of replicated samples from six time points ranging from early development until hatching. [1] Liscovitch-Brauer, Noa, Shahar Alon, Hagit T. Porath, Boaz Elstein, Ron Unger, Tamar Ziv, Arie Admon, Erez Y. Levanon, Joshua JC Rosenthal, and Eli Eisenberg. "Trade-off between transcriptome plasticity and genome evolution in cephalopods." Cell 169, no. 2 (2017): 191-202.

Project Abstract

surveying binary superconducting hydrides using ab-initio methods

PI: Yundi Quan



The discovery of superconductivity in pressurized hydrogen sulfides has stimulated renewed interest in hydrides. Much research effort has been focused on structure prediction, while the issue of numerical convergence with respect to k- and q-mesh is often neglected. In this project, we carry out systematic ab-initio calculations based on Wannier function interpolation of electronic and lattice degrees of freedom to study all the binary hydrides discovered so far under various pressures. We aim to build a database of highly accurate electronic structures and phonon spectra of existing binary compounds to help understand the possible connection between the Tc of a hydride and its various physical properties.

Project Abstract

Prediction of cellular state using deep neural networks

PI: NAVNEET RAI



Accurate prediction of cellular state in new conditions is of significant interest in biology due to its impact on food, medicine and the environment. To capture the complexity of cellular organisms, a predictive model should exhibit a knowledge representation capacity on par with the cell itself. Deep neural networks provide high representation capacity, but their dependence on big datasets is a challenge for biological predictive tasks that depend on OMICS data. Even for the most well-studied microbe, Escherichia coli, the largest OMICS compendium contains only 4389 genome-wide profiles for 649 conditions. To circumvent this gap, we generate large, realistic OMICS datasets using simulation software, given various assumed biophysical properties of a cellular organism. This enables exploration of various neural network architectures and helps identify the applicability of each architecture given the circumstances (e.g. organism complexity, data size, etc.). Using such simulation data for the task of predicting steady-state gene expression in novel conditions, we developed a novel neural network architecture that outperforms existing models (10%-40% higher PCC) when evaluated on small sub-organisms (2-100 genes). We want to use OSG to help finish the current evaluation and provide evidence for the applicability of our novel approach. For larger models we expect to need GPUs, hence the request for Comet.

Project Abstract

: Genome assembly, annotation and characterization of Fusarium strains to understand their evolution in the context of chickpea pathogenicity

PI: Anandkumar Surendrarao



Fusarium oxysporum is a fungal pathogen with a very broad host range. It can associate with animals, including humans, and also plants, including both gymnosperms and angiosperms. One of the plant species that is adversely affected is chickpea, Cicer spp. The cultivated C. arietinum crop can experience losses of up to 100% due to wilt caused by F. oxysporum forma specialis ciceris (FOC). A major goal of our research group at UC Davis is to re-domesticate chickpea using wild germplasm. One of the agronomic traits of interest that we wish to introduce into new varieties is Fusarium resistance. This requires an understanding of Fusarium genomics and evolution in its own right, and also in the context of chickpea genomics and co-evolution. Towards this goal, 290 strains of FOC were collected from a wide range of geographies in Ethiopia. Currently I am using Farm, a High Performance Computing cluster at UC Davis, for my bioinformatic needs. However, my user account limits me to no more than 7 jobs at a time, due to RAM availability limits. This is a major computational bottleneck that a STAMPEDE allocation can help overcome quickly and efficiently. Results from my Fusarium genome analyses will serve many scientific goals: to understand the evolution of core and accessory genomes across these 290 pathovars, to extrapolate these findings to other forma specialis strains of Fusarium oxysporum (that infect other plant hosts), and to inform breeding strategies for the development of Fusarium-resistant chickpea varieties, locally adapted to various growing regions of the world. The STAMPEDE allocation will be used to perform typical genomics analysis steps, including but not limited to Illumina read quality control and pre-processing, de novo genome assembly, gene prediction, genome annotation, orthology determination, core/accessory genome prediction, gene flow analyses, and correlating pathogen phenotypes with structural and single nucleotide variants (i.e. GWAS).

Project Abstract

MECHANISMS OF BIOORGANIC AND ORGANOMETALLIC CYCLIZATION REACTIONS

PI: Dean Tantillo



The focus of the research proposed herein, a renewal of CHE030089N, is to apply modern quantum chemical methods to the elucidation of molecular mechanisms of organic chemical reactions that are used in the synthesis and biosynthesis of polycyclic organic molecules. During this award period, we focus on reactions for which we suspect non-statistical dynamics effects play important roles. Consequently, we will focus our efforts on direct/ab initio molecular dynamics calculations, which are the most time-consuming calculations we carry out (other routine calculations will be carried out in-house).

Project Abstract

Elucidation of molecular mechanisms of sex-dependent pro-arrhythmia through hERG block by drugs and steroid hormones

PI: Igor Vorobyov



Common and sometimes fatal heart rhythm disorders such as long QT syndrome (LQTS) have been linked to mutations in cardiac ion channels as well as unwanted drug interactions with those proteins. Female sex has been shown to be an independent risk factor for both inherited and acquired LQTS as well as associated arrhythmias. This risk has been tentatively correlated with differential levels of sex hormones (estradiol, progesterone and testosterone), which play opposing roles in the proclivity for heart rhythm disturbances. There is a critical need to understand cardiac ion channel modulation by drugs and/or sex hormones at the molecular level to develop safer and more effective therapeutics. We will focus on drug and/or hormone interactions with the human ether-a-go-go (hERG) potassium channel (KV11.1), a major contributor to cardiac action potential repolarization and an anti-target for diverse drug molecules. As a first step, we propose atomistic modeling and simulation approaches to compute binding affinities of hormones and LQTS-inducing drugs such as dofetilide. A recent cryo-EM structure of hERG will be used for those studies. We will use quantum mechanical (QM) calculations with the Gaussian software to develop and/or validate drug and hormone force field parameters. Drug and hormone binding to hERG will be tested using both long unbiased drug/hormone “flooding” and multi-window restrained umbrella sampling (US) molecular dynamics (MD) simulations with NAMD. Therefore we request the following XSEDE allocations: Comet (SDSC), 50,000 CPU hours (to be used for QM and US MD simulations); XStream (Stanford), 5,000 GPU hours (or Comet GPU when it becomes available) for “flooding” MD. These estimates are based on benchmarks on our local resources (a small 10-node GeForce GTX 1080 / Xeon E5-2620 GPU/CPU cluster and workstations). Equivalent resource substitution or using Open Science Grid up to an allowed maximum is a reasonable alternative for this project as well.
The proposed allocation will be used for a few of the runs described above to provide preliminary scientific data (including feasibility and convergence) and benchmarking data for a larger research allocation request to be submitted in the near future.
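The multi-window restrained umbrella sampling mentioned in this abstract can be illustrated with a minimal sketch: a reaction coordinate is divided into evenly spaced windows, and each window applies a harmonic restraint centered on its own target value. The coordinate range, window spacing, and force constant below are hypothetical placeholders, not values from the proposal, and the code is plain Python rather than NAMD input.

```python
import numpy as np

def window_centers(z_min, z_max, spacing):
    """Return evenly spaced restraint centers covering [z_min, z_max]."""
    n_windows = int(round((z_max - z_min) / spacing)) + 1
    return np.linspace(z_min, z_max, n_windows)

def harmonic_bias(z, center, k):
    """Harmonic restraint energy 0.5 * k * (z - center)^2 for one window."""
    return 0.5 * k * (z - center) ** 2

# Hypothetical setup: coordinate from -10 to 10 (arbitrary length units),
# one window per unit length, force constant k in arbitrary energy units.
centers = window_centers(-10.0, 10.0, 1.0)
k = 2.5

# Bias energy felt in each window by a configuration sitting at z = 0.3
bias_at_z = harmonic_bias(0.3, centers, k)
nearest_window = int(np.argmin(bias_at_z))
```

Each window would be simulated independently, and the biased histograms are combined afterwards (for example with WHAM) to recover a free energy profile along the coordinate.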

Project Abstract

Atomistic simulations to elucidate molecular mechanisms of drug- and hormone-induced pro-arrhythmia proclivities

PI: Igor Vorobyov



The human voltage-gated potassium channel Kv11.1, encoded by the hERG gene, conducts the key repolarizing K+ current, IKr, in cardiomyocytes. Many drugs are known to block IKr, which can lead to acquired long QT syndrome (LQTS), the standard clinical ECG-based indicator for an increased risk of ventricular arrhythmias such as Torsades de Pointes (TdP). In fact, hERG block is a common side effect for drugs and drug candidates, leading to their withdrawal from the market. However, hERG blockers have very different proclivities for arrhythmogenesis and thus different cardiac safety profiles. We hypothesize that the fundamental mode of drug interaction, derived from the unique structure-activity relationship of a drug, determines the resultant effects on cardiac electrical activity in cells and tissue. In addition, female sex has been shown to be an independent risk factor for acquired LQTS and TdP, and while our recent work has shown that drugs and estrogen can coexist in the hERG pore, the molecular mechanisms of these interactions and their effects remain unknown. Multiple unbiased and restrained molecular dynamics (MD) simulations on XSEDE CPU and GPU resources, reaching tens of microseconds in total, are ideally suited to explore the atomic-scale basis of these effects. Now is also the best time for these studies, since high-resolution cryo-EM structures of an open-state hERG channel and a closed state of a homologous rEAG channel have recently become available. We have already begun testing hERG open-state stability through long unbiased MD simulations using our local resources and an active DE Shaw Anton 2 allocation. We have also been working on homology models of the hERG inactivated state through structural modeling with ROSETTA, supported by electrophysiological measurements on mutant channels by our experimental collaborators.
Multiple MD simulations will be crucially important for the validation of those models by considering different applied voltages, as well as channel mutants that are known to preferentially stabilize hERG in distinct conformational states. Accurate empirical force field parameters for several cardiac-safe and pro-arrhythmic hERG-blocking drugs have also been developed in our laboratory using both local resources and a startup XSEDE allocation on Comet at SDSC. We will validate those parameters and compute drug membrane partitioning thermodynamics and permeation rates using umbrella sampling MD simulations of drug/hydrated lipid membrane systems. This information will be used to perform multi-microsecond unbiased drug and hormone “flooding” simulations and umbrella sampling all-atom MD simulations with hERG in distinct conformational states. We will also investigate the effects of drug ionization states and applied voltage, both of which can modulate drug binding affinities and thus pro-arrhythmia proclivities, in the hope of obtaining an accurate molecular picture of channel state-dependent drug binding and egress pathways along with the corresponding energetics. This information will allow us to predict pro-arrhythmia determinants at the molecular level and will help to develop new pharmaceuticals with improved cardiac safety profiles.

Project Abstract

Simulating the Local Group

PI: Andrew Wetzel



A wealth of exciting ongoing and upcoming observational projects are targeted to near-field cosmology and galactic archaeology in the Local Group, measuring stellar populations and the phase-space distribution of stars in and around the Milky Way (MW), Andromeda (M31), and their satellite dwarf galaxies at unprecedented levels (for example, Hubble Space Telescope, SDSS-APOGEE, the Dark Energy Survey, Gaia, and LSST). These observational campaigns are revolutionizing our understanding of galaxy formation as well as the nature of dark matter on the smallest cosmological scales. However, interpreting and understanding these results, including making predictions for upcoming observations, requires ultra-high-resolution cosmological simulations, which can resolve structure on 1 pc scales, and which include the necessary physics of hydrodynamics, star formation, and feedback, all carefully targeted to the environment of the Local Group. We request a renewal allocation to continue our ultra-high-resolution simulations of galaxy evolution, star formation, and stellar feedback, with which we will study the physics of the interstellar medium (ISM), the formation of stars, stellar feedback, galaxy formation, and the cosmological distribution of dark matter, with new physics and unprecedented resolution. This renewal will allow us to build on the significant numerical and physical advances that our previous XSEDE research allocations have enabled, to run a targeted suite of simulations to understand the Local Group, comprising the MW, M31, the Large Magellanic Cloud (LMC), and numerous satellite dwarf galaxies. Each of our simulated systems will be resolved with > 200 million particles and followed self-consistently over its entire history to the present day in live cosmological settings carefully matched to the Local Group environment. Our proposed simulations will address a wide array of timely scientific questions.
For galaxies like the MW and M31, we will study in detail (1) gas accretion, angular momentum transport, and its role in disk formation, including the impact of close pairs of galaxies like the MW and M31, (2) stellar migration and chemical mixing within the disk, and (3) the impact of massive satellites/subhalos on kinematic heating of the disk. Our simulations also span the dynamic range needed to model the satellite dwarf galaxies that are observed around the MW and M31, including the relevant baryonic physics to predict the properties of their observed stars: from massive satellites like the LMC with Mstar = 2 × 10^9 M⊙ to faint dwarf galaxies with Mstar ∼ 10^5 M⊙. Because they are so faint and dark-matter dominated, such “dwarf” galaxies represent a key frontier field for testing (1) the Cold Dark Matter (CDM) paradigm of cosmology, (2) the epoch of reionization, and (3) the most extreme regimes of galaxy formation. In this renewal, we request 180,000 SUs (node-hours) on Stampede2 to run a suite of simulations targeted to the environment of the Local Group. Specifically, we will run a realization of a Local Group-like pair of MW- and M31-like galaxies (120,000 SUs) and a realization of a MW-like galaxy with an LMC-like satellite (60,000 SUs). To compare with observations of the Local Group, we must run each simulation across its entire formation history to the present day (z = 0).
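As a quick consistency check on the request, the two per-simulation figures quoted in the abstract can be summed and compared with the stated total. The dictionary keys below are descriptive labels chosen for this sketch, not names used in the proposal.

```python
# Node-hour (SU) figures as quoted in the abstract.
requested_sus = {
    "Local Group-like MW+M31 pair": 120_000,
    "MW-like galaxy with LMC-like satellite": 60_000,
}

# Total request and the fraction of the budget assigned to each run.
total_sus = sum(requested_sus.values())
shares = {name: sus / total_sus for name, sus in requested_sus.items()}

for name, sus in requested_sus.items():
    print(f"{name}: {sus:,} SUs ({shares[name]:.0%} of total)")
print(f"Total: {total_sus:,} SUs")
```

The two runs account for the full 180,000 SUs, with two thirds of the budget going to the paired MW and M31 realization.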

Project Abstract

Spatially explicit estimates of the likelihood of conservation action

PI: Matthew Williamson



Conservation is an inherently human endeavor - initiated, designed, and deployed by humans to alter future behavior or undo previous impacts to effect positive changes in biodiversity. Predicting where conservation will occur in the future, however, remains a challenge. We propose a conceptual model where conservation action is determined by gradients of ecological value, individual willingness, and institutional ability. Conservation action may occur at any location along this 3-dimensional continuum, but becomes increasingly likely as ecological values, individual willingness, and institutional ability simultaneously approach their maxima. We demonstrate this approach using available high-resolution, spatially explicit data on demographics, economic drivers, institutional characteristics, and environmental conditions to evaluate the degree to which the spatial coincidence of these factors affects the likelihood of conservation. We use Bayesian hierarchical models that treat past conservation actions as probabilistic outcomes of the interaction of ecological, institutional, and social covariates. Using multi-model inference and hierarchical variance partitioning, we identify the key explanatory variables influencing the likelihood of conservation action and evaluate the relative importance of each factor in explaining past conservation. We then implement these models in a GIS to generate probabilistic surfaces of the likelihood of conservation action and to identify where conservation is likely in the future.
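The conceptual model described here, in which action becomes more likely as all three gradients approach their maxima together, can be illustrated with a toy function. This is only a sketch under stated assumptions: the multiplicative form, the logistic link, and the `steepness` parameter are hypothetical choices made for illustration and are not the Bayesian hierarchical models used in the project.

```python
import math

def conservation_likelihood(ecological_value, willingness, ability, steepness=6.0):
    """Toy score in (0, 1) that rises as all three inputs approach 1.

    Each input is assumed to be rescaled to [0, 1]. Multiplying them means
    a low value on any one axis keeps the joint score low, which mimics
    the requirement that the three factors be high simultaneously.
    """
    for x in (ecological_value, willingness, ability):
        if not 0.0 <= x <= 1.0:
            raise ValueError("inputs must be scaled to [0, 1]")
    joint = ecological_value * willingness * ability
    # Logistic link centered at joint = 0.5 (a hypothetical choice).
    return 1.0 / (1.0 + math.exp(-steepness * (joint - 0.5)))

all_high = conservation_likelihood(1.0, 1.0, 1.0)
one_low = conservation_likelihood(1.0, 1.0, 0.2)
all_low = conservation_likelihood(0.0, 0.0, 0.0)
```

Evaluating such a function over gridded covariates would give a surface analogous in spirit to the probabilistic maps the abstract describes, although the real surfaces come from fitted model posteriors rather than a fixed formula.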

Project Abstract

State-dependent drug modulation of sodium channels

PI: Vladimir Yarov-Yarovoy



The goal of this project is to study the molecular mechanisms of voltage-gated sodium (Nav) channel gating and modulation using molecular dynamics (MD) simulations. Our proposal will take advantage of several recent breakthroughs in the field of Nav channel structure: (1) a cryo-electron microscopy (cryoEM) structure of the first eukaryotic Nav channel (with the pore-forming domain in the closed state and voltage-sensing domains in either an activated or an intermediate state); (2) new X-ray structures of bacterial Nav channels (with the pore-forming domain in its open state); and (3) stable ion-conductive open-state models of a bacterial Nav channel, which we generated using Rosetta computational modeling software and MD simulations. We propose to simulate our new Rosetta structural models of human Nav channels in open, closed and inactivated states. This will help reveal the molecular mechanisms of channel activation and inactivation. Experimental studies have identified structural regions forming the binding sites of small molecule inhibitors on Nav channels, yet the molecular mechanisms of modulation remain unclear. The proposed simulations on XSEDE supercomputers will significantly advance our basic knowledge of Nav channel gating and modulation, providing new understanding that may lead to novel therapeutics for neurological, muscular and cardiac diseases.