Lecture 26: MPI Parallel Programming. General Introduction. Point-to-Point Communication.
Lecture Summary
Last time
Wrapped up the “Critical thinking” segment with a case study that showed a speedup of more than 100X
Techniques used: removed function-call blockers, applied loop unrolling and re-association, added wide-register vectorization
Started discussion about parallel computing via message passing (multi-process parallel computing)
Covered the hardware aspects related to HPC
Today
HPC via MPI: discuss the basic ideas/paradigms
MPI point-to-point communication
MPI
Introduction to message passing and MPI
CUDA: A kernel (a small snippet of code) is run by all threads spawned via an execution configuration
OpenMP: All threads execute an omp parallel region, work sharing
MPI: The entire code is executed in parallel by all processes
MPI does branching based on the process rank
Very similar to GPU computing, where one thread does work based on its thread index
Very similar to OpenMP, where a thread branches on the value returned by omp_get_thread_num()
Each MPI process has its own program counter and virtual address space
The variables have the same names in every process, but they live in different virtual address spaces and can hold different values
MPI can be used whenever it is possible for processes to exchange messages:
Distributed memory systems
Network of workstations
One workstation with many cores
Data is passed through the main memory instead of a network
Different ranks share the same physical memory, but they are each tied to separate virtual memory spaces
Point-to-Point (P2P) Communication
P2P: Simplest form of message passing communication
One process sends a message to another process (MPI_Send, MPI_Recv)
int MPI_Send(void *buf, int count, MPI_Datatype datatype, int dst, int tag, MPI_Comm comm)
buf: starting point of the message with count elements, each described with datatype
dst: rank of the destination process within the comm communicator
tag: used to distinguish between different messages
int MPI_Recv(void *buf, int count, MPI_Datatype datatype, int source, int tag, MPI_Comm comm, MPI_Status *status)
Envelope information is returned in an MPI_Status object
A custom communicator can be created using
MPI_Comm_create(MPI_COMM_WORLD, new_group, &MY_COMM_WORLD);
MPI data types and their C counterparts: see table below
The order of messages is preserved: two messages sent from the same sender to the same receiver on the same communicator do not overtake each other
The receiver can use wildcards to receive from any source or with any tag: MPI_ANY_SOURCE / MPI_ANY_TAG
For a communication to succeed:
Sender must specify a valid destination rank
Receiver must specify a valid source rank (or MPI_ANY_SOURCE)
The communicator must be the same
Tags must match (or the receiver uses MPI_ANY_TAG)
Message data types must match
Receiver's buffer must be large enough
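MPI_Send and MPI_Recv are blocking: MPI_Send returns only once the send buffer can be safely reused, and MPI_Recv returns only once the message has been placed in the receive buffer. Whether MPI_Send also waits for the matching receive depends on the mode used by the runtime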
Eager mode vs. Rendezvous mode
Eager mode: Small messages; the MPI runtime picks up the content of the buffer right away, so the send can return before the matching receive is posted
Rendezvous mode: Large messages; the send call waits until the receiver has posted a matching receive, and only then does the runtime transfer the actual message data