Science Success Story
Sipping from the Fire Hose
XSEDE GPU Resources Help WVU Scientists Create AI Package to Make Processing Thousands of Fast Radio Burst Candidates Manageable by Human Experts
By Ken Chiacchia, Pittsburgh Supercomputing Center
This artist's impression represents the path of the fast radio burst FRB 181112 traveling from a distant host galaxy to reach the Earth. Credit: European Southern Observatory/M. Kornmesser
Fast radio bursts (FRBs) puzzle astronomers. They're so brief—lasting only a few thousandths of a second—that scientists haven't quite been able to identify their points of origin or how they are generated. Using the GPU nodes of the XSEDE-allocated Bridges supercomputing platform, a team from West Virginia University created a package of artificial intelligence (AI) programs that can sift through the thousands of FRB candidates expected to be detected in upcoming surveys quickly enough for astronomers to figure out where to point their telescopes to learn more.
Why It's Important
One of the most difficult challenges in astronomy is figuring out the origin of FRBs. Incredibly quick and unpredictable, these events flash out for only a few thousandths of a second before disappearing without any lingering trace—at least, not that scientists have yet detected. Despite that brevity, astronomers have used radio-frequency telescopes to detect more than 100 of them since they were discovered by researchers at West Virginia University (WVU) in 2007.
Scientists do know one thing about FRBs. They're coming from incredibly far away, outside our Milky Way galaxy. Though their signals are relatively weak when they get to Earth, they have to be pretty powerful at their distant points of origin. But that's it. Scientists have some educated guesses as to what might be causing FRBs: flaring neutron stars with powerful magnetic fields and interacting pairs of neutron stars and black holes are among the possible sources. We just don't know. One big problem has been that, because FRBs are so brief, they don't give astronomers any warning time to redirect visible-light or X-ray telescopes to their location to check if there are lingering signals in those frequencies that could help them decide between the candidate causes.
"Because we are basically observing all the time, we're getting data 24 hours all days of the week … If you're getting thousands and thousands of candidates in your pipeline every day, you need to automatically detect out of these thousands maybe 10 that are real."—Kshitij Aggarwal, WVU
Upcoming surveys will discover many more of these events by monitoring broad swaths of the sky. Researchers expect them to detect about a dozen FRBs every day, which would be great for the science. The problem is they'll also detect many thousands of false signals—similar but distinct astronomical signals, interfering radio signals from Earth-bound sources like mobile phones and satellites, as well as random noise. It's a "sipping from the fire hose" problem. Human experts can tell FRBs from these other signals, but with many thousands coming daily they simply can't sort them fast enough to re-aim other telescopes to look for non-radio signals. That's why Devansh Agarwal and Kshitij Aggarwal, graduate students working with advisors Duncan Lorimer and Sara Burke-Spolaor of WVU, respectively, wanted to use AI to make an automated "first cut" that reduced the number of candidates to a number manageable by humans.
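To put rough numbers on the fire hose, the short Python sketch below estimates how many candidates would be left for human review after an automated first cut. The figures are illustrative assumptions, not survey values: 10,000 candidates per day stands in for "many thousands," 12 stands in for "about a dozen" real FRBs, and the recall and false-positive rate are hypothetical.

```python
def candidates_for_review(total, real, recall, false_positive_rate):
    """Estimate how many candidates a classifier passes on to human experts.

    total: all candidates per day (assumed value)
    real: genuine FRBs among them (assumed value)
    recall: fraction of genuine FRBs the classifier keeps (hypothetical)
    false_positive_rate: fraction of false signals it wrongly keeps (hypothetical)
    """
    false_signals = total - real
    kept_real = real * recall
    kept_false = false_signals * false_positive_rate
    return kept_real + kept_false


# Illustrative numbers only: 10,000 candidates a day, 12 of them real.
remaining = candidates_for_review(
    total=10_000, real=12, recall=0.99, false_positive_rate=0.005
)
print(f"Candidates left for human review: about {remaining:.0f} of 10,000")
```

Under these assumed values, roughly 62 candidates a day would reach a human instead of 10,000. Even a small false-positive rate matters here, because false signals outnumber real ones by so much.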
How XSEDE Helped
The type of AI that Agarwal and Aggarwal used is called a convolutional neural network (CNN). In CNNs, the computer creates several layers that represent different characteristics of an image. It also creates a network of connections between the data in each layer. It then trains itself on images that have been identified by humans. Somewhat like a developing biological brain, it removes faulty connections until the network succeeds in correctly identifying those images. Next, scientists test it on data that have not been labeled, going back and forth between training and testing until it achieves a high success rate. Then the CNN can be used on real data.
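For readers curious about what a single CNN layer computes, here is a minimal pure-Python sketch of the core operation: sliding a small filter over an image to highlight one characteristic. This is not code from FETCH. The 5x5 toy image, the hand-written vertical-line filter, and the function names are all illustrative assumptions. In a real CNN the filter values are learned during training rather than written by hand.

```python
def convolve2d(image, kernel):
    """Slide a small kernel over a 2D image (no padding, stride 1).

    As in most CNN libraries, this is technically cross-correlation:
    the kernel is not flipped before it is applied.
    """
    k_rows, k_cols = len(kernel), len(kernel[0])
    out_rows = len(image) - k_rows + 1
    out_cols = len(image[0]) - k_cols + 1
    output = []
    for r in range(out_rows):
        row = []
        for c in range(out_cols):
            total = 0.0
            for i in range(k_rows):
                for j in range(k_cols):
                    total += image[r + i][c + j] * kernel[i][j]
            row.append(total)
        output.append(row)
    return output


def relu(feature_map):
    """Keep positive responses and set negative ones to zero."""
    return [[max(0.0, value) for value in row] for row in feature_map]


# Toy image (assumed): a bright vertical streak in the middle column.
toy_image = [[0, 0, 1, 0, 0] for _ in range(5)]

# Hand-written filter (assumed) that responds to vertical lines.
vertical_line_filter = [
    [-1, 2, -1],
    [-1, 2, -1],
    [-1, 2, -1],
]

feature_map = convolve2d(toy_image, vertical_line_filter)
activated = relu(feature_map)
for row in activated:
    print(row)
```

The filter responds strongly only where the streak sits under its center column, so the output marks where that characteristic appears in the image. Stacking many such filters in several layers is what lets a CNN represent progressively more complex image characteristics.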
The WVU scientists faced several challenges in their plan. First, whatever they came up with had to be fast. If it couldn't create a smaller list of candidates fast enough for humans to spot the real FRBs and then redirect telescopes to search for non-radio signals that followed the radio burst, it wouldn't improve the situation. Second, they decided to shorten development time by using pre-existing, freely available image-classifying CNNs. By training thousands of them, they could winnow the pool down to a small set that was really good at FRB classification. The end result would be a package of CNNs that is openly available to researchers.
Lastly, CNNs work best when run on graphics processing units, or GPUs. Originally developed to create realistic images in video games, GPUs turned out to have huge scientific applications in processing image data and in AI. But training thousands of CNNs would require many more GPUs than the team had available through local resources at WVU.
"FETCH could not have been possible without Bridges-GPU. There's no way we could have done this project … When it came time to train thousands of models, each taking several hours, it cannot be done locally on a desktop [computer]. XSEDE offered this Big Data workshop at our university … and we got really excited to learn there was an XSEDE machine nearby in Pittsburgh [that could fill this role]."—Devansh Agarwal, WVU
The solution to all three problems came in the form of an XSEDE workshop offered at WVU. There the WVU scientists learned about Bridges, an XSEDE-allocated supercomputing platform at the Pittsburgh Supercomputing Center that possesses a total of 58 powerful, late-model GPU nodes. Bridges offered them the GPU power they needed to scale up their testing.
As student researchers, the pair found it easy to obtain a startup allocation (via the XSEDE Resource Allocation System) on Bridges-GPU, the platform's GPU nodes. That ease proved critical for getting their work started quickly and effectively.
Using Bridges-GPU, the team winnowed down thousands of candidate CNNs to a list of 11 that were over 99.5 percent accurate in classifying FRBs. A task that would have taken months using other available resources could now be done in a week. Their package of CNNs, available for free to scientists, is called FETCH, for Fast Extragalactic Transient Candidate Hunter. They reported their results, and offered FETCH to astronomers carrying out upcoming FRB surveys, in a paper published online in the journal Monthly Notices of the Royal Astronomical Society in June 2020.
You can read their paper here.
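The winnowing step can be pictured with a short Python sketch: score every trained model on held-out data, then keep only the best performers. The article does not describe the team's exact selection criteria, so this is just one simple way to do it. The model names and accuracy values below are hypothetical placeholders, not results from the FETCH paper; only the 99.5 percent threshold and the shortlist size of 11 come from the text above.

```python
def select_models(scores, threshold, keep):
    """Return up to `keep` model names scoring above `threshold`, best first."""
    passing = [(name, score) for name, score in scores.items() if score > threshold]
    passing.sort(key=lambda item: item[1], reverse=True)
    return [name for name, _ in passing[:keep]]


# Hypothetical validation accuracies for a handful of trained models.
validation_accuracy = {
    "model_a": 0.9971,
    "model_b": 0.9890,
    "model_c": 0.9956,
    "model_d": 0.9962,
    "model_e": 0.9410,
}

shortlist = select_models(validation_accuracy, threshold=0.995, keep=11)
print(shortlist)
```

With thousands of models, the same function applies unchanged. The expensive part is producing the scores, since each model takes hours of GPU time to train.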
At a Glance:
Fast radio bursts outside our galaxy last only a few thousandths of a second, giving scientists little time to identify their points of origin or how they are generated.
Using the GPU nodes on XSEDE resources, a team from West Virginia University created a package of artificial intelligence (AI) programs that can sift through thousands of FRB candidates quickly.
This "first cut" can allow astronomers to figure out where to point telescopes to learn more about these mysterious phenomena.