Science Success Story
Cloud Computing Expands Brain Sciences
Leading neuroscientist relies on XSEDE resources for Brainlife.io platform
By Faith Singer, Texas Advanced Computing Center
|White matter anatomy segmentation using diffusion-weighted magnetic resonance imaging. Major white matter tracts were created using the White Matter Anatomy segmentation App from the brainlife.io platform. Photo courtesy of UT Research Fellow Sandra Hanekamp. Adapted from: Hanekamp, S., Ćurčić-Blake, B., Caron, B. et al. Scientific Reports (2021).|
An expert in vision science, neuroinformatics, brain imaging, computational neuroscience, and data science, Franco Pestilli has advanced the understanding of human cognition and brain networks through his research over the last 15 years.
|Franco Pestilli, Neuroscientist, Department of Psychology, The University of Texas at Austin|
"The field of neuroscience looks at the brain in multiple ways," says Pestilli, a neuroscientist at The University of Texas at Austin (UT Austin). "For example, we're interested in how neurons compute and allow us to quickly react—it's a fast response requiring visual attention and motor control. Understanding the brain needs big data to capture all dimensions of human behavior."
How XSEDE Helped
New cloud technologies are becoming necessary to help researchers collaborate and to process, visualize, and manage large amounts of data at unprecedented scales. These include supercomputers, many of which are allocated by the National Science Foundation-funded Extreme Science and Engineering Discovery Environment (XSEDE).
A key aspect of Pestilli's work started in 2017 when he received a grant from the BRAIN Initiative through the National Science Foundation (NSF) to launch Brainlife.io.
The Brainlife.io computing platform provides a full suite of web services to support reproducible research on the cloud. More than 1,600 scientists from around the world have accessed the platform thus far. Brainlife.io allows them to upload, manage, track, analyze, and share their data and to visualize the results.
The platform runs its simulations on high-performance computing (HPC) infrastructure. "National systems like Jetstream (Indiana University/TACC), Stampede2 (TACC), and Bridges-2 (Pittsburgh Supercomputing Center) are fundamental to what we do," Pestilli said. "We've received a lot of support from XSEDE."
The platform currently serves scientists in fields ranging from psychology to medical science to neuroscience and includes more than 600 data-processing tools. Brainlife.io brings together different kinds of expertise and mechanisms for developing code and publishing it on the cloud, while tracking every step applied to the data.
"We've processed more than 300,000 datasets thus far—and we're serving many new users as the number of scientists accessing our platform has exploded during the pandemic," Pestilli said. "A lot of new people came to Brainlife.io because they lost access to their physical facilities."
Brainlife.io is also funded via collaborative awards from the National Institutes of Health (NIH) and the Department of Defense.
Aina Puce is a professor in Psychological and Brain Sciences at Indiana University. Although she describes herself as a neophyte with regard to Brainlife.io, she is a world expert in neuroimaging and the principal investigator of an NIH grant that supports the development of neurophysiological data management and analyses on the platform.
"I jumped in at the deep end to help Franco and his team expand the functionality of the platform to neurophysiological data," Puce said.
"Brainlife.io is allowing us to start to perform cutting-edge analyses, integrating neurophysiological data and MRI-based data. Studies include research explicitly linking brain structure to brain function, such as how information gets transported from region to region, and how blood flow and brain electrical activity change when performing particular tasks."
Soon, a suite of new tools will be available on Brainlife.io for users to integrate EEG (electroencephalography), MEG (magnetoencephalography), and MRI (magnetic resonance imaging) data, which is "unique and will be tremendously helpful for both science and society," she said.
"This is what we are bringing to Brainlife.io for the first time," Puce said.
Why It's Important
The field of neuroscience is moving from small data sets to large data sets. Larger data sets mean that scientists can extract more statistically powerful insights from the information they collect.
From 1,000 subjects to 10,000 subjects to 500,000 subjects — the data sets keep growing.
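The link between sample size and statistical power can be illustrated with a little arithmetic. The sketch below is a textbook illustration, not part of Brainlife.io or of any study mentioned here: it assumes a measurement with a standard deviation of 1.0 (an arbitrary value chosen for the example) and computes the standard error of the mean, which shrinks in proportion to the square root of the number of subjects.

```python
import math


def standard_error(sd: float, n: int) -> float:
    """Standard error of a sample mean: sd / sqrt(n)."""
    return sd / math.sqrt(n)


# Cohort sizes taken from the article; sd = 1.0 is an assumed, arbitrary value.
cohort_sizes = [1_000, 10_000, 500_000]
errors = {n: standard_error(1.0, n) for n in cohort_sizes}

for n, se in errors.items():
    print(f"n = {n:>7,}: standard error = {se:.4f}")
```

Under these assumptions, moving from 1,000 to 500,000 subjects narrows the uncertainty on an estimated average by a factor of about 22, which is why smaller effects become detectable in larger cohorts.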
|Aina Puce, Eleanor Cox Riggs Professor, Psychological & Brain Sciences, Programs in Neuroscience & Cognitive Science Affiliate, Indiana University / Indiana University Network Institute|
For example, the Adolescent Brain Cognitive Development Study is one of the largest long-term studies of brain development and child health in the United States. The study collects data from more than 10,000 adolescents to understand biological and behavioral development from adolescence into young adulthood. In another part of the world, the UK Biobank contains in-depth health information from more than 500,000 participants who donated their genetic and clinical data for the good of science; 100,000 of these participants donated brain scans.
"As each new project scales up," Pestilli said, "the size of the data set also scales up, and as a result, the needs for storage and computing change. We're building datasets of a size and impact that only supercomputers can effectively cope. With the recent advent of machine-learning and artificial-intelligent methods, and their potential to help humans understand the brain, we need to change our paradigm for data management, analysis, and storage."
Pestilli says that neuroscience research can't survive unless the community builds a cohesive ecosystem that matches scientists' needs with the necessary hardware and software, given the tremendous amount of data and the next-generation questions to be explored.
He says that many of the tools developed so far are not easily integrated into a typical workflow or ready to use.
"To make an impact in neuroscience and connect the discipline to the most cutting-edge technologies such as machine learning and artificial intelligence, the community needs a cohesive infrastructure for cloud computing and data science to bring all these tremendous tools, libraries, data archives, and standards closer to the researchers who are working for the good of society," he said.
Pestilli found a like-minded collaborator who shares this vision in Dan Stanzione, the executive director of the Texas Advanced Computing Center (TACC) and a nationally recognized leader in HPC.
Together, they plan to create a national infrastructure that provides a registry of permanent records for data and analyses. Researchers will be able to find data and see more transparently how an analysis was conducted. The infrastructure will support both what the NSF requires in data proposals and what researchers want: scientific impact and reproducibility.
In addition, this means that access to data, analysis methods, and computational resources will move toward a more equitable model, providing opportunities for many more students, educators, and researchers than ever before.
"I'm confident that we can get it done—this vision is a crucial part of my efforts here."
At a Glance:
Neuroscientist Franco Pestilli advances the understanding of human cognition and brain networks. He is advocating for new cloud technologies and infrastructure to help researchers collaborate and to process, visualize, and manage large amounts of data at unprecedented scales.
One of his key projects, Brainlife.io, is a computing platform that provides a full suite of web services to support reproducible research on the cloud.
Brainlife.io runs simulations on supercomputers, including Jetstream, Stampede2, and Bridges-2, all of which are XSEDE resources.