The assignments below are due on the dates listed. Please come to each class having completed the reading and assignment(s) shown for that date.
- Friday, January 17 – A canonical problem: matrix-matrix multiplication
  - Read: Chacon & Straub Chapters 1–2 and Eijkhout (HPC) Chapter 20
  - Assignment: As you read Eijkhout (HPC) Chapter 20, use one of the workstations (or your own Linux or Mac computer) to try as many of the in-line exercises as you need to become comfortable working at the Linux command line.
- Wednesday, January 22 – Introduction to HPC (KOS 244)
  - Read: Eijkhout (HPC) Chapters 21–22
  - Assignment: Using one of the workstations (or your own Linux or Mac computer), create a git repository and practice working with it as outlined in Chapter 2 of Chacon and Straub. In particular, practice adding files, making commits, editing files, committing the changes, examining the status, etc.
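A practice session along those lines might look like the following (the repository and file names are arbitrary, and the config lines are only needed if git has not already been configured globally):

```shell
mkdir git-practice && cd git-practice
git init                                       # create an empty repository
git config user.name "Student Name"            # skip if already set globally
git config user.email "student@example.edu"
echo "first line" > notes.txt
git add notes.txt                              # stage the new file
git commit -m "Add notes.txt"                  # first commit
echo "second line" >> notes.txt                # edit the file
git status                                     # see the modified file
git commit -am "Update notes.txt"              # commit the change
git log --oneline                              # review the history
```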
- Friday, January 24 – A brief history and overview of HPC
  - Read: Barlas 1.1–1.3
  - Assignment: (HW01) Finish the hands-on exercise: using your notes from the hands-on exercise, write a report that provides your observed timings and FLOP rates along with your observations about the runs. Include what seems important to you, but be sure to address at least these questions: Which parallel program and which ijk ordering performed best? Did others observe similar behavior? What conclusions can you draw? Also, explore HPC University's student internship listings.
- Monday, January 27 – Performance metrics, prediction, and measurement
  - Read: Barlas 1.4–1.5 and Eijkhout (HPC) 2.2 through 2.2.3
- Wednesday, January 29 – Debugging and profiling programs (KOSC 244)
  - Read: Familiarize yourself with the Gprof and Valgrind user manuals. Spend enough time to develop a basic understanding of the purpose of each tool.
  - Assignment: (HW02) Barlas Chapter 1: Exercises 4, 5, and 6 (pg 26)
- Friday, January 31 – Dense matrix algebra and libraries
  - Read: Eijkhout (HPC) 2.4
  - Assignment: Finish the hands-on exercise if you were not able to complete it on Wednesday. Also, check out the assignment for Monday (listed just below) and start working on it as you are able.
- Monday, February 3 – Dealing with data; data files and HDF5
  - Read: Eijkhout (HPC) Chapter 24, and familiarize yourself with Chapter 1 of the HDF5 User's Guide.
  - Assignment: (HW03) Modify the program mult-template.c so that it uses the CBLAS routine cblas_dgemm() to perform the matrix-matrix product. Documentation on CBLAS is a little sketchy, but if you search the web for “cblas examples” you'll find what you need. Be sure to load the atlas module so that the CBLAS library is accessible: module load atlas. You'll need to link your program with the CBLAS and ATLAS libraries; do this by including -lcblas -latlas on the compiler command line. If you want to use smake to compile the program, make sure the smake module is loaded: module load smake. The Smake line for compiling the program will be something like:

        /*
         * $Smake: gcc -Wall -O3 -o %F %f -lcblas -latlas
         */
- Wednesday, February 5 – Reading and writing HDF5 files (KOSC 244)
  - Assignment: (HW04) Barlas Chapter 1 Exercise 7 (pg 26), modified as follows: (1) Rather than reading and writing data from/to files, create and sort a list of random integers; the file makeList.cc can be used as a starter program for this assignment. (2) Feel free to base your mergesort functions on mergesort() and the “Efficient variant” of merge() found at http://www.iti.fh-flensburg.de/lang/algorithmen/sortieren/merge/mergen.htm. Note: You should modify both functions so the first parameter is int a[] (or the equivalent int* a), a pointer to the array to be sorted. You should also modify merge() to add the line int* b = new int [m-lo+1]; before the first loop, and add the line delete [] b; just before the end of the function.
- Friday, February 7 – A model HPC problem: Finite difference solution of the unsteady heat equation in multiple dimensions
  - Read: The first two pages (up to 15.2.1) of Finite Difference Approximation of Derivatives
  - Assignment: (HW05) Do both of the following:
    - Finish the HDF5 hands-on exercise and email the professor with (1) the full path to the source and executable versions of your program, and (2) the full paths to the HDF5 files produced by your program when run with the small.h5 and big.h5 files.
    - Modify the program you wrote for Barlas Chapter 1 Exercise 7 so that it reads integer data from an HDF5 file and writes the sorted list to another HDF5 file. The file containing the unsorted data is stored in /gc/cps343/random_list.dat on the workstations. You can use the h5dump utility with the -H flag to examine the file header and determine the dataset name.
- Monday, February 10 – Parallel algorithm analysis and design
  - Read: Barlas 2.1–2.3. See also Eijkhout (HPC) 2.3
  - Assignment: Start working on Barlas Chapter 2 Exercise 1 (pg 54). (See the note immediately below that clarifies this assignment.)
- Wednesday, February 12 – Memory hierarchy & data organization (KOS 244)
  - Read: Eijkhout (HPC) 1.7
  - Assignment: (HW06) Barlas Chapter 2 Exercise 1 (pg 54). Note: Compute the total communication volume, as was done in equation (2.4). This is not the same as the number of communication operations; you need to compute the total amount of data that is communicated.
- Friday, February 14 – Parallel algorithm analysis and design; Parallel architectures
  - Read: Barlas 2.4–2.5
  - Assignment: (HW07) Working with a partner, complete the hands-on exercise for Memory hierarchy and data organization. Your team should submit a report as described in the assignment section of the exercise.
- Monday, February 17 – Shared memory programming: threads, semaphores & monitors
  - Read: Barlas 3.1–3.5
  - Assignment: Start working on Parallel program design: PCAM
- Wednesday, February 19 – Using threads and OpenMP (KOS 244)
  - Read: Barlas 3.6–3.7
  - Assignment: (HW08) Parallel program design: PCAM
- Friday, February 21 – Shared memory programming made easy: OpenMP
  - Read: Barlas 4.1–4.4
  - Assignment: Project 1 due. Also turn in (HW09), your report from the hands-on exercise Using OpenMP.
- Monday, February 24 – Distributed memory programming: Introduction to MPI
  - Read: Barlas 5.1–5.3
- Wednesday, February 26 – Introduction to cluster computing with MPI (KOS 244)
  - Read: Barlas 5.4–5.7
- Friday, February 28 – MPI collective communication
  - Read: Barlas 5.8–5.11
  - Assignment: (HW10) Complete the hands-on exercise Introduction to cluster computing with MPI and turn in your report, including printed copies of your well-documented source code for the ring-pass3 program.
- Monday, March 2 – MPI derived datatypes
  - Read: Barlas 5.12–5.13
  - Assignment: (HW11) Consider the problem of forming the transpose of an N×N matrix A. Suppose that there are N processes and process i contains an N-element array u that is the ith row of A. Write a single MPI_Alltoall() function call so that after the function is called, process i contains the ith row of Aᵀ stored in the N-element array v. Draw “before” and “after” diagrams to show the contents of u and v, and show that A is transformed into Aᵀ.
- Wednesday, March 4 – Midterm Exam
  - Assignment: Review for exam
- Monday, March 16 – COVID-19 Reset day
- Wednesday, March 18 – Project 2 work day
- Friday, March 20 – MPI derived datatypes example: Cartesian grids
  - Read: Barlas 5.12–5.13
- Monday, March 23 – Parallel I/O in MPI with HDF5
  - Read: Barlas 5.15
- Wednesday, March 25 – Working with Cartesian grids in MPI (Zoom)
- Friday, March 27 – MPI Example: Parallel sorting
- Monday, March 30 – Project 3 work day
- Wednesday, April 1 – Parallel sorting with MPI on Canaan cluster (Zoom)
- Friday, April 3 – Introduction to GPU programming and CUDA
  - Read: Barlas 6.1–6.3
  - Assignment: Complete hands-on exercise Using the Canaan parallel cluster.
- Monday, April 6 – CUDA memory types
  - Read: Barlas 6.4–6.5
- Wednesday, April 8 – Introduction to CUDA (Zoom)
  - Read: Barlas 6.6 (focus on 6.6.1 and 6.6.2)
- Friday, April 10 – Good Friday
- Monday, April 13 – Easter Monday
- Wednesday, April 15 – Global and shared memory in CUDA (Zoom)
  - Assignment: Complete hands-on exercise Introduction to CUDA.
- Friday, April 17 – CUDA optimization
  - Read: Barlas 6.7.1–6.7.3
  - Assignment: Complete hands-on exercise CUDA shared memory.
- Monday, April 20 – CUDA optimization example: parallel reduction
  - Read: Barlas 6.7.4–6.7.7
- Wednesday, April 22 – CUDA profiling and debugging
  - Read: Barlas 6.9–6.10
- Friday, April 24 – Introduction to the Thrust template library
  - Read: Barlas 7.1–7.4.1
- Monday, April 27 – Thrust algorithms
  - Read: Barlas 7.4.2–7.4.5
- Wednesday, April 29 – Using Thrust (Zoom)
- Friday, May 1 – OpenACC: Accelerator programming made easy
- Monday, May 4 – More about OpenACC
- Wednesday, May 6 – Using OpenACC (Zoom)