Paul Graham
Accelerating Python on GPUs
#1 about 1 minute
The evolution of GPU programming with Python
Python has become a first-class citizen in the CUDA ecosystem, making it easier to accelerate software on GPUs.
#2 about 2 minutes
How GPUs evolved from graphics to AI powerhouses
The development of CUDA unlocked general-purpose GPU computing, which was supercharged by the AlexNet breakthrough in AI.
#3 about 2 minutes
Understanding modern GPU architecture for parallelism
A look inside a modern data center GPU reveals thousands of cores and specialized hardware like Tensor Cores designed for massive parallelism.
#4 about 2 minutes
Navigating the CUDA Python software ecosystem
The CUDA platform provides a layered stack of libraries, frameworks, and tools to access GPU power at your preferred level of abstraction.
#5 about 3 minutes
Using high-level frameworks like RAPIDS for acceleration
Frameworks like RAPIDS provide GPU-accelerated versions of tools like pandas and scikit-learn, often requiring zero code changes for massive speedups.
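As a rough illustration of the "zero code changes" idea, the sketch below is ordinary pandas code with nothing GPU-specific in it. The data and column names are made up for the example. Assuming RAPIDS cuDF is installed on a machine with a supported GPU, the same file can be run through the cudf.pandas accelerator with `python -m cudf.pandas script.py` (or `%load_ext cudf.pandas` in a notebook), where `script.py` is a hypothetical file name; without it, the code simply runs on the CPU.

```python
import pandas as pd

# Hypothetical sample data, used only for illustration.
df = pd.DataFrame(
    {
        "sensor": ["a", "b", "a", "b"],
        "reading": [1.0, 2.0, 3.0, 4.0],
    }
)

# Plain pandas group-by; the source is identical whether or not
# the cudf.pandas accelerator is active.
mean_by_sensor = df.groupby("sensor")["reading"].mean()

print(mean_by_sensor)
```

The point of this design is that the acceleration is chosen at launch time, so the source stays portable to machines without a GPU.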
#6 about 1 minute
Using CuPy as a drop-in replacement for NumPy
CuPy offers a familiar NumPy-like API that allows you to move array computations to the GPU by simply changing the import statement.
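A minimal sketch of the import-swap idea follows. It is not taken from the talk: the helper `pick_array_module` and the function `normalized_dot` are hypothetical names, and the NumPy fallback is an assumption added so the example still runs on a machine without CuPy or a CUDA device.

```python
import numpy as np


def pick_array_module():
    """Return CuPy if it is installed and a GPU is usable, else NumPy."""
    try:
        import cupy as cp

        # Raises (or reports zero devices) when no CUDA device is usable.
        if cp.cuda.runtime.getDeviceCount() > 0:
            return cp
    except Exception:
        pass
    return np


# The rest of the code only talks to `xp`, so it does not care
# which library is behind the name.
xp = pick_array_module()


def normalized_dot(n=1000):
    a = xp.arange(n, dtype=xp.float64)
    b = xp.ones(n, dtype=xp.float64)
    # float() brings the scalar result back to the host if it lives on the GPU.
    return float(xp.dot(a, b) / n)


result = normalized_dot()
print(xp.__name__, result)
```

In the simplest case the whole change is `import cupy as np` in place of `import numpy as np`. The `xp` alias used here is just one common way to keep a single code path for both.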
#7 about 5 minutes
Optimizing code with nvmath-python and a case study
The nvmath-python library enables kernel fusion for significant speedups, as demonstrated by a supernova detection project that went from 45 minutes to one minute.
#8 about 2 minutes
A look at upcoming Python GPU programming tools
New tools like CuTe for array-based programming and Python bindings for CUDA Core Compute Libraries are making GPU development even more accessible.
#9 about 2 minutes
Strategies for scaling your code to multiple GPUs
Explore various approaches for multi-GPU programming, from high-level libraries like Dask and JAX to lower-level communication libraries like NCCL and NVSHMEM.
#10 about 2 minutes
Profiling and debugging your GPU applications
Use essential developer tools like Nsight Systems and Nsight Compute to profile your application, identify bottlenecks, and optimize performance.
#11 about 2 minutes
Resources for getting started with GPU programming
Find examples, labs, and free courses through the NVIDIA Accelerated Computing Hub and the NVIDIA Developer Program to begin your GPU programming journey.