CUDA Kernels Engineer
Palo Alto, Paris, Abu Dhabi / Engineering / Full Time / On-site
Key Responsibilities
+ Develop high-performance GPU/CPU kernels and make trade-offs that maximize end-to-end hardware utilization
+ Use knowledge of hardware features and performance characteristics to make aggressive optimizations
+ Work with our other platform teams to deploy your kernels, manage our training uptime, and find opportunities for optimization
+ Develop low-precision algorithms that deliver high performance with little loss of ML accuracy
+ Work with ML engineers to develop model architectures that are amenable to efficient training and inference
+ Work with hardware vendors and advise on HW/SW co-design
Qualifications
+ Are a strong coder with excellent skills in C/C++ and Python
+ Have a deep understanding of GPU, CPU, or other AI accelerator architectures
+ Have experience writing and optimizing compute kernels with CUDA or similar languages
+ Are familiar with LLM (and/or other foundation model) architectures and training infrastructure
+ Have experience driving ML accuracy with low-precision formats
+ Have 1+ years of relevant industry experience
+ Get a great deal of satisfaction from every percentage point of performance improvement
+ Level: Senior Engineer for candidates with a Ph.D. and 1-3 years of work experience, or Engineer for candidates with an M.S. and 3-5 years of work experience
Preferred Qualifications
+ Ph.D. in Computer Science and Engineering with a specialization in Computer Architecture, Parallel Computing, Compilers, or another systems area
+ Experience building compilers
+ Experience working with hardware developers