Attention Mechanisms for CFD/FEA - Part 1
How the most essential part of ChatGPT can be leveraged to predict fields in the huge 3D datasets of our CFD/FEA simulations. Part 1 is an introduction.
Even back in 2019 I was obsessed with the idea that machine learning’s ability to learn complex non-linear patterns could be leveraged in full 3D engineering simulation to better understand the physics we were modeling. The image below is from a film cooling (jet-in-crossflow) CFD simulation I did, where identifying turbulence structures was essential to understanding the resulting heat transfer.
At the time, I was processing 3D fields of data into an anisotropy scalar metric based on Lumley triangles (what’s that?), so that I could understand where our RANS (eddy-viscosity-based) turbulence models might fail us the hardest (and so that I could try to fix them by adding source terms). I then ran the 3D anisotropy fields through t-distributed stochastic neighbor embedding (t-SNE) to visualize the higher-dimensional data (different levels of anisotropic turbulence) in a reduced latent space.
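The workflow above can be sketched in a few lines. This is a minimal, hypothetical stand-in (not my original pipeline): random numbers play the role of per-cell anisotropy tensor components, and scikit-learn's `TSNE` projects them into a 2D latent space for plotting.

```python
import numpy as np
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)

# Stand-in for a CFD field: one row per mesh cell, columns are the six
# unique components of the Reynolds-stress anisotropy tensor b_ij.
# (In the real workflow these came from the RANS solution.)
n_cells = 200
b_components = rng.normal(size=(n_cells, 6))

# Project the 6-D anisotropy states down to 2-D for visualization.
tsne = TSNE(n_components=2, perplexity=30, init="pca", random_state=0)
embedding = tsne.fit_transform(b_components)

print(embedding.shape)  # (200, 2): one 2-D point per cell
```

Each cell of the 3D field becomes one point in the 2D embedding, so clusters in the scatter plot correspond to regions of similar turbulence anisotropy.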
It worked okay, but there were no major breakthroughs. However, new methods have emerged in the last five years that are total game changers, offering orders-of-magnitude improvements. I don’t like to talk like a hype person, but the method we will focus on in this series of posts is the key element of ChatGPT (and similar LLMs): the attention mechanism.
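As a preview of what Part 2 will dig into: at its core, attention is scaled dot-product attention, softmax(QK^T / sqrt(d_k)) V. A bare-bones NumPy sketch (shapes and dimensions here are illustrative, not from any specific model):

```python
import numpy as np

def attention(Q, K, V):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V."""
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)          # similarity of each query to each key
    # Row-wise softmax (shifted by the max for numerical stability).
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V                       # weighted average of the values

rng = np.random.default_rng(0)
Q = rng.normal(size=(4, 8))    # 4 queries of dimension 8
K = rng.normal(size=(10, 8))   # 10 keys
V = rng.normal(size=(10, 8))   # 10 values
out = attention(Q, K, V)
print(out.shape)  # (4, 8): one output vector per query
```

In language models the queries, keys, and values come from token embeddings; the appeal for CFD/FEA is that nothing in the math requires tokens — the rows could just as well represent mesh cells or nodes.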
This is Part 1 of the series. Part 2 will be a massive literature review of attention mechanisms in CFD/FEA, coming out in <7 days.