High-performance Computing Engineer

The Next Chapter
Amsterdam, Netherlands
7 days ago

Role details

Contract type
Temporary contract
Employment type
Full-time (> 32 hours)
Working hours
Shift work
Languages
English
Experience level
Senior
Compensation
€ 13K

Job location

Remote
Amsterdam, Netherlands

Tech stack

Artificial Intelligence
C++
Computer Engineering
Distributed Systems
InfiniBand
Python
Linux kernel
Machine Learning
Parallel Computing
Performance Tuning
Graphics Processing Unit (GPU)
Go

Job description

We are seeking an experienced HPC Engineer to join a dedicated high-performance computing optimization team. This team sits at the intersection of R&D, hardware engineering, and distributed systems, focusing on maximizing computational throughput and efficiency rather than traditional system administration. You'll work with cutting-edge HPC technology to optimize parallel computing environments, GPU clusters, and interconnect systems, meeting the demanding requirements of AI and machine learning workloads.

You will focus on optimizing the performance of large-scale GPU clusters, targeting latency reduction, computational efficiency, and enhanced parallel processing capabilities. Working with InfiniBand networks and high-performance computing infrastructure, you'll collaborate with cross-functional teams to deliver scalable HPC solutions for client needs.

The role requires balancing operational optimization and troubleshooting (50%) with HPC architecture design and performance tuning projects (50%). You'll maintain and optimize distributed computing systems, managing over 30,000 GPUs across 10+ InfiniBand networks, while ensuring the optimal performance of global HPC infrastructure and driving continuous computational improvements.

Requirements

Do you have experience in Python? Do you have a Master's degree?

  • 5+ years of experience in HPC environments and parallel computing systems.
  • Strong proficiency in Linux kernel optimization for HPC workloads.
  • Strong proficiency in C++ or C development for high-performance applications.
  • Experience with Golang and/or Python for HPC tooling and automation.
  • Experience with InfiniBand networking and high-speed interconnects.
  • Experience with distributed computing architectures and cluster management.

About the company

Our client is a rapidly growing organization at the forefront of the AI revolution, specializing in providing high-performance computing infrastructure to run heavy LLMs and AI products. They operate a global network of data centers with capacity designed specifically for extreme-scale computational workloads.
