Science Success Story


Supercomputers Aid in Maternal and Paternal DNA Research at UC Irvine

XSEDE Allocations May Help Improve Medicine and Food Production

By Kimberly Mann Bruch, SDSC Communications

 

Genomes of the Chardonnay grape (Vitis vinifera; 490 Mb), mosquito (Anopheles funestus; 200 Mb) and thorny skate (Amblyraja radiata; 2650 Mb) were analyzed using multiple supercomputers.

Credit: UC Irvine.

University of California Irvine scientists recently used National Science Foundation Extreme Science and Engineering Discovery Environment (XSEDE) allocations on Comet at the San Diego Supercomputer Center (SDSC) at UC San Diego and on Bridges at the Pittsburgh Supercomputing Center (PSC) to better understand contributions from maternal and paternal lineages in genome sequences.

"While sequencing genomes has become a fundamental goal and tool in science, the problem has been that many genomes have been difficult to fully resolve by sequencing because they contain distinct contributions from maternal and paternal lineages," said Brandon Gaut, an ecology and evolutionary biology professor at UC Irvine. "Our work used computer science optimization methods to help resolve the accuracy of separating maternal and paternal DNA in genomes."

Why It's Important

This novel research, detailed in a January 2021 BMC Bioinformatics journal article, not only improves genome completeness but also helps scientists better understand the genetic relationships between individuals, populations and species. In turn, that may lead to improvements in medicine and food production for a range of populations.

"Comet and Bridges were powerful enough to run our new genome sequence haplotype separation and optimization method called HapSolo," said NSF Graduate Student Fellow Edwin Solares, first author of the journal article and also funded by the UC President's Pre-Professoriate Fellowship. "With the help of the XSEDE allocations on supercomputers, we were able to illustrate the performance of HapSolo on genome data from three species: the Chardonnay grape, a mosquito and the thorny skate."

How XSEDE Helped

NSF Graduate Student Fellow Edwin Solares

Solares explained that Comet and Bridges ran calculations for the genomes of the Chardonnay grape (Vitis vinifera; 490 Mb), a mosquito (Anopheles funestus; 200 Mb) and the thorny skate (Amblyraja radiata; 2650 Mb). "The use of supercomputers for these analyses cut our run time in half for several of our samples," he said. "Being able to use XSEDE resources allowed us to focus on the science – instead of computational issues that are certain to arise without the use of supercomputers like Comet and Bridges."

Solares is supported by an NSF Graduate Research Fellowship Program grant (DGE-1321846), which funded his time to formulate and execute the study. Additional support came from the NSF (grant no. 1741627), NIH (grant nos. R01OD010974 and R01GM115562) and XSEDE awards (ACI-1548562, ACI-1445606 and TG-MCB180035).

About SDSC 

The San Diego Supercomputer Center (SDSC) is a leader and pioneer in high-performance and data-intensive computing, providing cyberinfrastructure resources, services, and expertise to the national research community, academia, and industry. Located on the UC San Diego campus, SDSC supports hundreds of multidisciplinary programs spanning a wide variety of domains, from astrophysics and earth sciences to disease research and drug discovery. In December 2020, SDSC's newest National Science Foundation-funded supercomputer, Expanse, entered production. At over twice the performance of Comet, Expanse supports SDSC's theme of 'Computing without Boundaries' with a data-centric architecture, public cloud integration, and state-of-the-art GPUs for incorporating experimental facilities and edge computing.

About PSC

The Pittsburgh Supercomputing Center (PSC) is a joint computational research center of Carnegie Mellon University and the University of Pittsburgh. Established in 1986, PSC is supported by several federal agencies, the Commonwealth of Pennsylvania and private industry and is a leading partner in XSEDE, the National Science Foundation cyberinfrastructure program. PSC provides university, government and industrial researchers with access to several of the most powerful systems for high-performance computing, communications and data storage available to scientists and engineers nationwide for unclassified research. PSC advances the state of the art in high-performance computing, communications and data analytics and offers a flexible environment for solving the largest and most challenging problems in computational science.

Media Contact: 

Kimberly Mann Bruch, SDSC Communications, kbruch@sdsc.edu

Related Links:

San Diego Supercomputer Center: https://www.sdsc.edu/

Pittsburgh Supercomputing Center: https://www.psc.edu/

UC San Diego: https://ucsd.edu/

UC Irvine: https://uci.edu/

National Science Foundation: https://www.nsf.gov/

XSEDE: https://www.xsede.org/

 

At a Glance:

  • XSEDE supercomputers were used to showcase the performance of a novel computer science optimization method on genome data from a thorny skate, a mosquito and a Chardonnay grape.
  • The optimization method, known as HapSolo, helps researchers better understand an array of genetic relationships.
  • This new knowledge may lead to improved medicine and food production for an array of populations.