EXALAT School: GPU coding for lattice practitioners

Virtual school, Monday and Tuesday, 15-16 February 2021

Course description:
This two-day course will provide an introduction to GPU computing with CUDA, aimed at researchers in Lattice Quantum Field Theory. The course begins with background on the differences between CPU and GPU architectures as a prelude to introductory exercises in CUDA programming. It will cover the execution of kernels, memory management, and shared memory operations. Common performance issues are discussed and their solutions addressed, and a sparse linear solver is implemented as a typical lattice application, with an emphasis on efficiency and parallelism.
At the end of the course, attendees should be in a position to make an informed decision on how to approach GPU parallelisation in their applications in an efficient and portable manner.
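To give a flavour of the introductory material, the sketch below shows the canonical first CUDA exercise: an element-wise vector addition that touches each topic named above (a kernel, device memory allocation, and host/device copies). This is an illustrative example, not actual course material; the kernel name and sizes are arbitrary.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Simple kernel: one thread computes one element of c = a + b.
__global__ void vec_add(const float *a, const float *b, float *c, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;  // global thread index
    if (i < n) c[i] = a[i] + b[i];                  // guard against overrun
}

int main(void)
{
    const int n = 1 << 20;
    size_t bytes = n * sizeof(float);

    float *h_a = (float *) malloc(bytes);
    float *h_b = (float *) malloc(bytes);
    float *h_c = (float *) malloc(bytes);
    for (int i = 0; i < n; i++) { h_a[i] = 1.0f; h_b[i] = 2.0f; }

    // Allocate device memory and copy host data to the GPU.
    float *d_a, *d_b, *d_c;
    cudaMalloc(&d_a, bytes);
    cudaMalloc(&d_b, bytes);
    cudaMalloc(&d_c, bytes);
    cudaMemcpy(d_a, h_a, bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(d_b, h_b, bytes, cudaMemcpyHostToDevice);

    // Launch: one thread per element, 256 threads per block.
    int threads = 256;
    int blocks = (n + threads - 1) / threads;
    vec_add<<<blocks, threads>>>(d_a, d_b, d_c, n);

    // Copy the result back; this cudaMemcpy synchronises with the kernel.
    cudaMemcpy(h_c, d_c, bytes, cudaMemcpyDeviceToHost);
    printf("c[0] = %f\n", h_c[0]);

    cudaFree(d_a); cudaFree(d_b); cudaFree(d_c);
    free(h_a); free(h_b); free(h_c);
    return 0;
}
```

Compiling and running this (e.g. `nvcc vec_add.cu && ./a.out` on a machine with an NVIDIA GPU) is essentially the "simple kernel" practical on Day I.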

Requirements:
Attendees must be familiar with programming in C or C++. Some knowledge of parallel/threaded programming models would be useful, as would a good understanding of sparse linear solvers such as conjugate gradient. Access to a GPU machine will be provided. Participants must bring a laptop with a Mac, Linux, or Windows operating system (not a tablet, Chromebook, etc.) for which they have administrative privileges.

The lecturer:
The course will be given primarily by Nick Johnson, a Software Architect from EPCC, The University of Edinburgh. Nick has programmed most things from FPGAs and microcontrollers to supercomputers. Joining him will be Kevin Stratford, Senior Research Fellow at EPCC.

Timetable
Day I
10:00 Intro
10:20 GPU Concepts and differences between CPU/GPU architectures
11:00 Break
11:20 CUDA programming: kernels; memory spaces and copies
12:00 Practical exercise: simple kernel
13:00 Lunch

14:00 CUDA optimisations: caching vs memory coalescing, etc.
14:30 Practical exercise on optimisation
15:00 Break
15:20 Constant and shared memory
16:00 Practical exercises
17:00 Close
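The constant and shared memory session on Day I covers the GPU's fast on-chip memory. As a hedged illustration of the kind of pattern involved (this kernel and its sizes are the author's invention, not course material), here is a per-block tree reduction that sums an array using `__shared__` memory:

```cuda
#include <cuda_runtime.h>

// Each block sums its 256-element slice of the input in fast on-chip
// shared memory, writing one partial sum per block to out[].
__global__ void block_sum(const float *in, float *out, int n)
{
    __shared__ float s[256];                 // one slot per thread in the block
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    s[threadIdx.x] = (i < n) ? in[i] : 0.0f;
    __syncthreads();                         // all loads visible before reducing

    // Tree reduction: halve the number of active threads each step.
    for (int stride = blockDim.x / 2; stride > 0; stride /= 2) {
        if (threadIdx.x < stride)
            s[threadIdx.x] += s[threadIdx.x + stride];
        __syncthreads();                     // keep the whole block in step
    }
    if (threadIdx.x == 0) out[blockIdx.x] = s[0];
}
```

Reductions such as this are the building block of the dot products inside a conjugate gradient solver, which is why shared memory sits directly before the solver-oriented material.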

Day II
10:00 Recap
10:10 OpenCL and directive-based OpenACC
11:00 Break
11:20 OpenCL and/or OpenACC exercises
13:00 Lunch

14:00 Multi-GPU processing
14:30 Practical exercise (reworking a linear solver example to span multiple GPUs)
15:00 Break
15:20 Using more than one GPU in a single node
15:40 Practical exercises
16:30 Close
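The afternoon of Day II turns to using several GPUs at once. A minimal sketch of the single-node case, assuming at least one CUDA device and using only standard runtime calls (the `fill` kernel is a placeholder of the author's invention):

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Placeholder work: set every element of x to v.
__global__ void fill(float *x, int n, float v)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) x[i] = v;
}

int main(void)
{
    int ndev = 0;
    cudaGetDeviceCount(&ndev);
    printf("found %d device(s)\n", ndev);

    const int n = 1 << 16;
    for (int d = 0; d < ndev; d++) {
        cudaSetDevice(d);        // subsequent allocations/launches target device d
        float *x;
        cudaMalloc(&x, n * sizeof(float));
        fill<<<(n + 255) / 256, 256>>>(x, n, (float) d);
        cudaDeviceSynchronize(); // kernel launches are asynchronous
        cudaFree(x);
    }
    return 0;
}
```

Because launches are asynchronous, the loop above actually overlaps work across devices up to each synchronisation point; spanning a linear solver across GPUs (the 14:30 exercise) adds halo exchange between the per-device sub-lattices on top of this pattern.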

Registration: Please contact us.