Lecture 27: MPI Parallel Programming. Point-to-point communication: Blocking vs. non-blocking sends.

Lecture Summary

  • Last time

    • HPC via MPI

    • MPI point-to-point communication: The blocking flavor

  • Today

    • Wrap up point-to-point communication

    • Collective communication

Point-to-point communication

  • Different "send" modes:

    • Synchronous send: MPI_Ssend

      • Risk of deadlock/waiting -> idle time

      • Higher latency, but better bandwidth than MPI_Bsend (no intermediate buffer copy)

    • Buffered (async) send: MPI_Bsend

      • Low latency, but lower bandwidth (the extra buffer copy costs throughput)

    • Standard send: MPI_Send

      • Up to the MPI implementation to decide whether to do rendezvous or eager

      • Less overhead if in eager mode

      • Blocks in rendezvous mode, effectively behaving like a synchronous send

    • Ready send: MPI_Rsend

      • Works only if the matching receive has already been posted

      • Rarely used, very dangerous: the send is erroneous if the receive is not there yet

  • Receiving, all modes: MPI_Recv (one receive call matches every send mode; see the first sketch at the end of this section)

  • Buffered send

    • Reduces overhead associated with data transmission

    • Relies on a user-provided buffer, attached with MPI_Buffer_attach. Buffering incurs an extra memory copy

    • Return from an MPI_Bsend does not guarantee the message was sent: the message remains in the buffer until a matching receive is posted (see the second sketch below)
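
As a concrete reference, here is a minimal sketch of the blocking flavor (standard-mode MPI_Send matched by MPI_Recv), assuming exactly two ranks; the tag value and message length are arbitrary illustrative choices:

    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv) {
        MPI_Init(&argc, &argv);

        int rank;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        const int TAG = 0;                         /* arbitrary message tag */
        double payload[4] = {1.0, 2.0, 3.0, 4.0};

        if (rank == 0) {
            /* Standard send: the implementation picks eager or rendezvous */
            MPI_Send(payload, 4, MPI_DOUBLE, 1, TAG, MPI_COMM_WORLD);
            /* Safe to overwrite payload here: MPI has copied the data out */
        } else if (rank == 1) {
            /* The same MPI_Recv matches any send mode on the sender's side */
            MPI_Recv(payload, 4, MPI_DOUBLE, 0, TAG, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            printf("rank 1 got %.1f ... %.1f\n", payload[0], payload[3]);
        }

        MPI_Finalize();
        return 0;
    }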
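
And a sketch of the buffered flavor, where the sender must attach its own buffer before calling MPI_Bsend; MPI_Pack_size plus MPI_BSEND_OVERHEAD gives a safe buffer size. The message size here is illustrative:

    #include <mpi.h>
    #include <stdlib.h>

    int main(int argc, char **argv) {
        MPI_Init(&argc, &argv);

        int rank;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        double msg[256] = {0};

        if (rank == 0) {
            /* Attach a buffer large enough for the message plus MPI overhead */
            int size;
            MPI_Pack_size(256, MPI_DOUBLE, MPI_COMM_WORLD, &size);
            size += MPI_BSEND_OVERHEAD;
            char *buf = malloc(size);
            MPI_Buffer_attach(buf, size);

            /* Returns as soon as msg is copied into the attached buffer */
            MPI_Bsend(msg, 256, MPI_DOUBLE, 1, 0, MPI_COMM_WORLD);

            /* Detach blocks until all buffered messages have been transmitted */
            MPI_Buffer_detach(&buf, &size);
            free(buf);
        } else if (rank == 1) {
            MPI_Recv(msg, 256, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        }

        MPI_Finalize();
        return 0;
    }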

Non-blocking point-to-point

  • Blocking send: Covered above. Upon return from a send, you can safely modify the send buffer: MPI has copied the data out, though it may not have been delivered yet

  • Non-blocking send: The call returns immediately; there is no guarantee that the data has been transmitted

    • Routine names start with MPI_I (e.g., MPI_Isend, MPI_Irecv)

    • The caller gets to do useful work (overlap communication with computation) upon return from the non-blocking call; see the first sketch after this list

    • Use a synchronization call to wait for the communication to complete

  • MPI_Wait: Blocks until a given request has completed

    • Wait on multiple requests: MPI_Waitall, MPI_Waitany, MPI_Waitsome

  • MPI_Test: Non-blocking; returns immediately, with a flag indicating whether the request has completed

    • int MPI_Test(MPI_Request *request, int *flag, MPI_Status *status);

  • MPI_Probe: Allows incoming messages to be queried (e.g., for their size) prior to receiving them; see the second sketch after this list
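
A sketch of the overlap pattern, assuming exactly two ranks: post MPI_Isend/MPI_Irecv, do independent work while the transfer may be in flight, then synchronize. compute_on_other_data() is a hypothetical stand-in for work that touches neither message buffer:

    #include <mpi.h>

    /* Hypothetical independent work; must not touch the message buffers */
    static void compute_on_other_data(void) { /* ... */ }

    int main(int argc, char **argv) {
        MPI_Init(&argc, &argv);

        int rank;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        double out[1024] = {0}, in[1024];
        MPI_Request reqs[2];
        int peer = 1 - rank;   /* assumes exactly two ranks */

        /* Post both operations; neither call blocks */
        MPI_Isend(out, 1024, MPI_DOUBLE, peer, 0, MPI_COMM_WORLD, &reqs[0]);
        MPI_Irecv(in,  1024, MPI_DOUBLE, peer, 0, MPI_COMM_WORLD, &reqs[1]);

        /* Overlap communication with computation */
        compute_on_other_data();

        /* The buffers are safe to reuse (or read) only after the wait */
        MPI_Waitall(2, reqs, MPI_STATUSES_IGNORE);

        MPI_Finalize();
        return 0;
    }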
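
Two query idioms, sketched as stand-alone helpers: polling a request with MPI_Test while interleaving other work, and using MPI_Probe plus MPI_Get_count to size the receive buffer when the incoming message length is unknown. Both helper names are hypothetical:

    #include <mpi.h>
    #include <stdlib.h>

    /* Poll a pending request, doing other work until it completes */
    void poll_until_done(MPI_Request *req) {
        int flag = 0;
        while (!flag) {
            MPI_Test(req, &flag, MPI_STATUS_IGNORE);
            /* ... a slice of independent work per iteration ... */
        }
    }

    /* Receive a message of doubles whose length is not known in advance */
    double *recv_unknown_size(int src, int tag, MPI_Comm comm, int *count) {
        MPI_Status status;

        MPI_Probe(src, tag, comm, &status);         /* block until a message is pending */
        MPI_Get_count(&status, MPI_DOUBLE, count);  /* how many doubles were sent? */

        double *buf = malloc(*count * sizeof(double));
        MPI_Recv(buf, *count, MPI_DOUBLE, src, tag, comm, MPI_STATUS_IGNORE);
        return buf;                                 /* caller frees */
    }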

Collective communications

  • Three types of collective actions:

    • Synchronization (barrier)

    • Communication (e.g., broadcast)

    • Operation (e.g., reduce)

  • Broadcast: MPI_Bcast

  • Gather: MPI_Gather

  • Scatter: MPI_Scatter

  • Reduce: MPI_Reduce

    • Result is collected by the root only (see the first sketch at the end of this list)

  • Allreduce: MPI_Allreduce

    • Result is sent out to all ranks in the communicator

  • Prefix scan: MPI_Scan

  • User-defined reduction operations: Register using MPI_Op_create() (see the second sketch below)
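
A sketch that strings several of the collectives together: the root broadcasts a parameter, scatters equal chunks of an array, every rank computes a partial sum, and MPI_Reduce/MPI_Allreduce combine the results. Array sizes are illustrative and assume at most 16 ranks:

    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv) {
        MPI_Init(&argc, &argv);

        int rank, size;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        const int CHUNK = 4;                      /* elements per rank */
        double full[64];                          /* holds up to 16 ranks x CHUNK */
        double part[4], scale = 2.5, local = 0.0, total, everywhere;

        if (rank == 0)
            for (int i = 0; i < size * CHUNK; i++) full[i] = (double)i;

        /* Root sends the same 'scale' to every rank */
        MPI_Bcast(&scale, 1, MPI_DOUBLE, 0, MPI_COMM_WORLD);

        /* Root deals out CHUNK elements to each rank (itself included) */
        MPI_Scatter(full, CHUNK, MPI_DOUBLE, part, CHUNK, MPI_DOUBLE, 0, MPI_COMM_WORLD);

        for (int i = 0; i < CHUNK; i++) local += scale * part[i];

        /* Sum of all partial results, available at the root only */
        MPI_Reduce(&local, &total, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);

        /* Same sum, but every rank gets the answer */
        MPI_Allreduce(&local, &everywhere, 1, MPI_DOUBLE, MPI_SUM, MPI_COMM_WORLD);

        if (rank == 0) printf("total = %f\n", total);

        MPI_Finalize();
        return 0;
    }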
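
And a sketch of a user-defined reduction: maxabs (a hypothetical operation that keeps the value of largest magnitude) is registered with MPI_Op_create and then used like any built-in op. The function must match the MPI_User_function signature; commute = 1 declares it commutative:

    #include <mpi.h>
    #include <math.h>

    /* Signature required by MPI_Op_create; this version assumes MPI_DOUBLE */
    void maxabs(void *in, void *inout, int *len, MPI_Datatype *type) {
        double *a = (double *)in, *b = (double *)inout;
        for (int i = 0; i < *len; i++)
            if (fabs(a[i]) > fabs(b[i])) b[i] = a[i];
    }

    int main(int argc, char **argv) {
        MPI_Init(&argc, &argv);

        int rank;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        MPI_Op op;
        MPI_Op_create(maxabs, /* commute = */ 1, &op);

        double local = (rank % 2) ? -(double)(rank + 1) : (double)rank;
        double result;
        MPI_Reduce(&local, &result, 1, MPI_DOUBLE, op, 0, MPI_COMM_WORLD);

        MPI_Op_free(&op);
        MPI_Finalize();
        return 0;
    }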
