NVIDIA Part 1: Code, Tutorials, Papers, & the Conference You Should Attend
Shoutout to one of the biggest AI/scientific computing conferences every year (& free)
Part I: I want to make you aware of the NVIDIA GTC conference, with some practical details/tips (how to register, talks of interest, notes). Note that it starts Monday, March 17th.
Part II: I want to introduce NVIDIA Modulus, a powerful open-source framework for building, training, and fine-tuning Physics-ML models with a simple Python interface. Part II will extend into a series of further Substack posts, but here we will only scratch the surface: an introduction to Modulus and basic suggestions on where to start. In the next post, I will cover the publications behind the methods inside Modulus and point to more tutorials and exercises you can try. I tried to do this all in one post, but Part II is simply enormous.
Many of my posts are purely technical and much longer, so this one may feel a bit different. I think this is good information to share, and I hope you enjoy it nevertheless.
Part I: NVIDIA GTC
I get asked every single week, without fail, “what are some conferences/events I can attend to upskill?” Well, this is definitely the post for those folks. Especially the ‘Science and Computing’ sessions track.
You can select the free virtual registration option when you sign up.
Here are some talks I can highlight for you to consider attending. There are many more you can search for in the calendar (here). Note the label on each session, which designates whether you can access it under your registration choice.
This one below has a ‘No Recording’ label, so I suggest trying to dig up the open-access paper associated with the talk, if applicable.
Workflow is usually the bottleneck for adoption, in my experience. The capabilities of the AI/ML methods are sufficient (high enough accuracy, generality, the ability to judge whether a prediction can be trusted, etc.). However, if the AI/ML-based tools require an annoying workflow to execute (e.g., users leaving their existing tools to manually launch runs and manually pass data back and forth), then that is usually enough to motivate divestment and kill the ROI.
Foundation models: the phrase for the future we are all watching closely to see what happens. In case you haven’t heard it, foundation models (generally) refer to models so general that they can be used on a very wide variety of problems out of the box, rather than the typical paradigm of having to re-train on your own data from scratch. I think this is a long-term vision, but in my opinion there is some possibility in the mid-term for ‘narrow’ foundation models. For example, a foundation model could be one pre-trained ML model offered to a user base (engineers from different turbomachinery companies) that can predict accurate flow fields for any large gas turbine stage (rows of stators/rotors). Getting accurate predictions over such a huge, open-ended space of possible geometries is clearly ambitious. However, you could narrow the scope, for example to only certain classes of turbines (high pressure, a specific size, with or without endwall contouring, etc.), such that the pre-trained model should work relatively well for that narrower group of scenarios. I think the natural progression is that people will offer pre-trained models that still require training, but that train much faster and more stably for a narrow breadth of cases.
I have seen some ML methods that simulate particle flows, but the limitation is often how many particles can be modeled at one time (among other things). So, right off the bat, I am curious what method facilitates 1 trillion particles.
The automotive industry has been aggressively taking in AI/ML technologies and implementing them in its practices. So, simply put, I am curious to learn from this talk where the bleeding is, i.e., what problems they are currently working through.
Part II: Modulus — Papers, Code, Tutorials
As said earlier, this is a powerful framework you should probably get eyes on if you care about training AI/ML models for science/physics-based problems (at least to get an idea of what’s inside). There is a hefty amount of resources (blogs, tutorials, articles, …) on their web page, so I hope to cut to some of the most interesting ones for you to focus on.

Before getting overwhelmed with information, I would personally recommend jump-starting by reading the user guide (after making an account). That page shows the topics/table of contents for theory, tutorials, and code on the left-hand side.

One of my personal favorites is their ‘Turbulence Super Resolution’ example. The basic idea is that you train a model to take in ‘low-fidelity’ simulation data, and the model then upscales the resolution to that of much higher-fidelity simulation data, which would normally be prohibitively expensive to generate. The value to industry is clear: higher-quality results akin to simulation, but produced rapidly by an ML model. It’s a cool example for a lot of reasons, but one in particular is that it exposes you to pyJHTDB, the Python client for the Johns Hopkins Turbulence Databases (JHTDB). JHTDB is a cloud-based data resource provided by Johns Hopkins University, and it contains a ton of turbulence datasets available for download (high-quality simulation data too, as you can imagine).
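If you want to poke at the data source itself, here is a minimal sketch of pulling velocity samples through pyJHTDB. The dataset name, interpolation flags, and the public testing token below are assumptions based on my reading of the pyJHTDB README; check it (and request your own access token) before doing anything at scale.

```python
# pip install pyJHTDB
# Minimal sketch: query the JHTDB isotropic turbulence dataset at a few points.
# Dataset name, token, and interpolation flags are assumptions; see the
# pyJHTDB README for current values.
import numpy as np
import pyJHTDB

lJHTDB = pyJHTDB.libJHTDB()
lJHTDB.initialize()
lJHTDB.add_token("edu.jhu.pha.turbulence.testing-201311")  # public low-rate test token

# 16 random query points inside the (2*pi)^3 periodic box.
points = (2 * np.pi * np.random.random((16, 3))).astype(np.float32)

u = lJHTDB.getData(
    0.0,                            # simulation time to sample
    points,
    data_set="isotropic1024coarse",
    getFunction="getVelocity",
    sinterp=4,                      # 4th-order Lagrange spatial interpolation
    tinterp=0,                      # no temporal interpolation
)
lJHTDB.finalize()
print(u.shape)  # (16, 3): one velocity vector per query point
```

Pairs of low-resolution/high-resolution fields pulled this way are exactly the kind of training data the super-resolution example consumes.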
The second example I would recommend starting with is the DeepONet one. It helps you learn abstract operators using data-informed and physics-informed deep operator networks (DeepONets) [source].
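To make the idea concrete before you open the tutorial: a DeepONet has a branch net that encodes the input function u (sampled at m fixed sensor points) and a trunk net that encodes the query coordinate y; the operator output G(u)(y) is the dot product of the two latent vectors. Below is a minimal plain-PyTorch sketch of that unstacked architecture. It is not the Modulus implementation, and all layer sizes are arbitrary choices of mine.

```python
import torch
import torch.nn as nn

class DeepONet(nn.Module):
    """Unstacked DeepONet: G(u)(y) ~ branch(u_sensors) . trunk(y) + bias."""
    def __init__(self, m_sensors: int = 100, y_dim: int = 1, p: int = 64):
        super().__init__()
        # Branch net: encodes the input function from its m sensor samples.
        self.branch = nn.Sequential(
            nn.Linear(m_sensors, 128), nn.Tanh(), nn.Linear(128, p)
        )
        # Trunk net: encodes the query coordinate(s) y.
        self.trunk = nn.Sequential(
            nn.Linear(y_dim, 128), nn.Tanh(), nn.Linear(128, p)
        )
        self.bias = nn.Parameter(torch.zeros(1))

    def forward(self, u_sensors: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
        b = self.branch(u_sensors)  # (batch, p)
        t = self.trunk(y)           # (batch, p)
        return (b * t).sum(dim=-1, keepdim=True) + self.bias

model = DeepONet()
u = torch.randn(8, 100)   # 8 input functions, each sampled at 100 sensors
y = torch.rand(8, 1)      # one query location per function
print(model(u, y).shape)  # torch.Size([8, 1])
```

The physics-informed variant in the tutorial adds PDE residual terms to the training loss on top of this same data-fit structure.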
Modulus offers a diverse model zoo of AI/ML architectures tailored for use in scientific computing (not just CFD). Here’s a list.
The next post in this NVIDIA series will dive into the publications/tutorials inside Modulus.