TU Dortmund





011316, summer semester 2014 (SS14)
Lecturers
Special lecture, 2+1
Location and time
Block course as part of the Summer School. The course begins in June 2014
and will then be held as 4+2.
CIP pool (9th floor), Mon 16:00, 2h
CIP pool (9th floor), Tue 16:00, 2h
Module assignment (no guarantee)
DPL:B:-:2 – Mathematics, Diplom (being phased out)
DPL:F:-:1 – Mathematics for other subjects (service)
MAMA:-:7:MAT-722 – Applied Scientific Computing
SRV:-:-:S-R20x – Mathematics for Automation and Robotics
DPL:E:-:- – Mathematics, doctoral programme
TMAMA:-:7:MAT-722 – Applied Scientific Computing
WIMAMA:-:7:MAT-722 – Applied Scientific Computing
Office hours for the course
Whenever the door is open
Start of the course
Recommended prior knowledge
Good to advanced programming skills are mandatory. Coding examples throughout the lecture are based on C/C++ on Linux, so participants are expected to be able to read and understand C/C++ code snippets and to work with a Linux command line. Access to capable Linux GPU machines will be provided. If necessary, most practicals can also be done in other languages (for instance Fortran or Python) or on other operating systems (Windows or Mac).

Over the past few years, graphics processing units (GPUs) have evolved
from specialised processors for computer graphics and gaming into a
capable architecture for general purpose scientific computing. Peak
performance numbers exceed those of multicore CPUs by at least an order
of magnitude, with current high-end models reaching more than 3 TFLOP/s
of floating point performance and 300 GB/s memory bandwidth. Many
researchers have demonstrated substantial speedups on GPUs for a wide
range of application domains, including fluid dynamics, structural
mechanics, seismic wave propagation, protein folding, medical physics,
astrophysics, databases and big data. However, this potential
performance improvement comes at the cost of a much more fine-grained
parallel programming model with literally tens of thousands of
simultaneously active threads, which can be challenging for beginners
to master.

This block course teaches GPU Computing from scratch, assuming (besides
curiosity and a healthy degree of eagerness to get one's hands dirty
writing code) only very basic knowledge of parallel programming in a
shared memory setting. We will briefly review basic OpenMP to set a
common stage. Topics then include an overview of the hardware, the
programming model, language extensions to C/C++, debugging and analysis
tools, and a selection of established (advanced) performance tuning
techniques and guidelines to 'program for performance'. We will use
NVIDIA CUDA, but also briefly cover the platform-independent OpenCL
standard, the annotation-based OpenACC approach, and GPU-enabled plug-in
libraries that require little effort to use.


This course is designed as a 'programming lab', with both classroom
lectures and hands-on sessions. Access to GPU machines will be provided.
Interested students (Master's and PhD level) from all DoWiR institutes
(MINT departments) are strongly encouraged to participate. Non-graded
participation certificates will be awarded for successful participation,
as demonstrated by surviving the practicals. Graded participation
requires additional participation in a research/programming project.
Details will be provided in the first lecture.

Recommended literature
  • To be announced


Exercise number