Anshul Jindal & Martin Piercy
Your Next AI Needs 10,000 GPUs. Now What?
#1about 2 minutes
Introduction to large-scale AI infrastructure challenges
An overview of the topics to be covered, from the progress of generative AI to the compute requirements for training and inference.
#2about 4 minutes
Understanding the fundamental shift to generative AI
Generative AI creates novel content, moving beyond prediction to unlock new use cases in coding, content creation, and customer experience.
#3about 6 minutes
Using NVIDIA NIMs and blueprints to deploy models
NVIDIA Inference Microservices (NIMs) and blueprints provide pre-packaged, optimized containers to quickly deploy models for tasks like retrieval-augmented generation (RAG).
#4about 4 minutes
An overview of the AI model development lifecycle
Building a production-ready model involves a multi-stage process including data curation, distributed training, alignment, optimized inference, and implementing guardrails.
#5about 6 minutes
Understanding parallelism techniques for distributed AI training
Training massive models requires splitting them across thousands of GPUs using tensor, pipeline, and data parallelism to manage compute and communication.
#6about 2 minutes
The scale of GPU compute for training and inference
Training large models like Llama requires millions of GPU hours, while inference for a single large model can demand a full multi-GPU server.
#7about 3 minutes
Key hardware and network design for AI infrastructure
Effective multi-node training depends on high-speed interconnects like NVLink and network architectures designed to minimize communication latency between GPUs.
#8about 3 minutes
Accessing global GPU capacity with DGX Cloud Lepton
NVIDIA's DGX Cloud Lepton is a marketplace connecting developers to a global network of cloud partners for scalable, on-demand GPU compute.
Related jobs
Jobs that call for the skills explored in this talk.
Wilken GmbH
Ulm, Germany
Senior
Kubernetes
AI Frameworks
+3
Picnic Technologies B.V.
Amsterdam, Netherlands
Intermediate
Senior
Python
Structured Query Language (SQL)
+1
ROSEN Technology and Research Center GmbH
Osnabrück, Germany
Senior
TypeScript
React
+3
Matching moments
03:55 MIN
The hardware requirements for running LLMs locally
AI in the Open and in Browsers - Tarek Ziadé
14:06 MIN
Exploring the role and ethics of AI in gaming
Devs vs. Marketers, COBOL and Copilot, Make Live Coding Easy and more - The Best of LIVE 2025 - Part 3
00:48 MIN
The shift to on-device AI models in smartphones
Fake or News: Coding on a Phone, Emotional Support Toasters, ChatGPT Weddings and more - Anselm Hannemann
04:28 MIN
Building an open source community around AI models
AI in the Open and in Browsers - Tarek Ziadé
09:10 MIN
How AI is changing the freelance developer experience
WeAreDevelopers LIVE – AI, Freelancing, Keeping Up with Tech and More
06:44 MIN
Using Chrome's built-in AI for on-device features
Devs vs. Marketers, COBOL and Copilot, Make Live Coding Easy and more - The Best of LIVE 2025 - Part 3
01:02 MIN
AI lawsuits, code flagging, and self-driving subscriptions
Fake or News: Self-Driving Cars on Subscription, Crypto Attacks Rising and Working While You Sleep - Théodore Lefèvre
02:20 MIN
The evolving role of the machine learning engineer
AI in the Open and in Browsers - Tarek Ziadé
Featured Partners
Related Videos
WWC24 - Ankit Patel - Unlocking the Future Breakthrough Application Performance and Capabilities with NVIDIA
Ankit Patel
A Deep Dive on How To Leverage the NVIDIA GB200 for Ultra-Fast Training and Inference on Kubernetes
Kevin Klues
Efficient deployment and inference of GPU-accelerated LLMs
Adolf Hohl
Unveiling the Magic: Scaling Large Language Models to Serve Millions
Patrick Koss
How AI Models Get Smarter
Ankit Patel
AI Factories at Scale
Thomas Schmidt
Exploring LLMs across clouds
Tomislav Tipurić
Generative AI power on the web: making web apps smarter with WebGPU and WebNN
Christian Liebel
Related Articles
View all articles



From learning to earning
Jobs that call for the skills explored in this talk.

Forschungszentrum Jülich GmbH
Jülich, Germany
Intermediate
Senior
Linux
Docker
AI Frameworks
Machine Learning

Nvidia
Bramley, United Kingdom
C++
PyTorch
TensorFlow

Nvidia
Bramley, United Kingdom
£230K
Senior
API
Terraform
Kubernetes
Amazon Web Services (AWS)

Nvidia
München, Germany
€230K
Senior
API
Terraform
Kubernetes
Amazon Web Services (AWS)



Nvidia
Remote
Intermediate
C++
Python
Machine Learning
Software Architecture

Nvidia
Bramley, United Kingdom
£292K
Senior
C++
Linux
Node.js
PyTorch
+1

IC Resources
Luton, United Kingdom
Remote
£100K
Senior
API
UML
OpenCL