Okay, back to how they did it.
When the function fits
The first thing to understand here is that neural networks are fundamentally function approximators. (Say what?) When they’re training on a data set of paired inputs and outputs, they’re actually calculating the function, or series of math operations, that will transpose one into the other. Think about building a cat detector. You’re training the neural network by feeding it lots of images of cats and things that are not cats (the inputs) and labeling each group with a 1 or 0, respectively (the outputs). The neural network then looks for the best function that can convert each image of a cat into a 1 and each image of everything else into a 0. That’s how it can look at a new image and tell you whether or not it’s a cat. It’s using the function it found to calculate its answer–and if its training was good, it’ll get it right most of the time.
Conveniently, this function approximation process is what we need to solve a PDE. We’re ultimately trying to find a function that best describes, say, the motion of air particles over physical space and time.
Now here’s the crux of the paper. Neural networks are usually trained to approximate functions between inputs and outputs defined in Euclidean space, your classic graph with x, y, and z axes. But this time, the researchers decided to define the inputs and outputs in Fourier space, which is a special type of graph for plotting wave frequencies. The intuition that they drew upon from work in other fields, says Anima Anandkumar, a Caltech professor who oversaw the research, is that something like the motion of air can actually be described as a combination of wave frequencies. The general direction of the wind at a macro level is like a low frequency with very long, lethargic waves, while the little eddies that form at the micro level are like high frequencies with very short and rapid ones.
Why does this matter? Because it’s far easier to approximate a Fourier function in Fourier space than to wrangle with PDEs in Euclidean space, which greatly simplifies the neural network’s job. Cue major accuracy and efficiency gains: in addition to its huge speed advantage over traditional methods, their technique achieves a 30% lower error rate when solving Navier-Stokes than previous deep-learning methods.
The whole thing is extremely clever, and also makes the method more generalizable. Previous deep-learning methods had to be trained separately for every type of fluid, whereas this one only needs to be trained once to handle all of them, as confirmed by the researchers’ experiments. Though they haven’t yet tried extending this to other examples, it should also be able to handle every earth composition when solving PDEs related to seismic activity, or every material type when solving PDEs related to thermal conductivity.
Anandkumar and the lead author of the paper, Zongyi Li, a PhD student in her lab, didn’t do this research just for the theoretical fun of it. They want to bring AI to more scientific disciplines. It was through talking to various collaborators in climate science, seismology, and materials science that Anandkumar first decided to tackle the PDE challenge with her students. They’re now working to put their method into practice with other researchers at Caltech and the Lawrence Berkeley National Laboratory.
One research topic Anandkumar is particularly excited about: climate change. Navier-Stokes isn’t just good at modeling air turbulence; it’s also used to model weather patterns. “Having good, fine-grained weather predictions on a global scale is such a challenging problem,” she says, “and even on the biggest supercomputers, we can’t do it at a global scale today. So if we can use these methods to speed up the entire pipeline, that would be tremendously impactful.”
There are also many, many more applications, she adds. “In that sense, the sky’s the limit, since we have a general way to speed up all these applications.”