# Lecture 21: OpenMP Work Sharing

## Lecture Summary

* Last time: OpenMP nested parallelism, work sharing (for loops, sections)
* Today
  * OpenMP: nested parallelism, work sharing (tasks)
  * OpenMP: variable scoping, synchronization, loose ends

## OpenMP Work Sharing

### omp sections

Ending example

![](https://1313833672-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-MMTslgmrrtRXvxD2lk9%2F-MWSRPz34AGvLjdJKN2u%2F-MWSTn2X37MJRJRVI-tw%2FScreen%20Shot%202021-03-23%20at%2012.07.33%20AM.png?alt=media\&token=c198110d-2cac-4d1f-ac4f-801bf42b207c)

![](https://1313833672-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-MMTslgmrrtRXvxD2lk9%2F-MWSRPz34AGvLjdJKN2u%2F-MWSTr_VW11tfM0YFYDo%2FScreen%20Shot%202021-03-23%20at%2012.07.53%20AM.png?alt=media\&token=8459a2ea-a72a-4097-a19b-f0461e8602a7)

### omp tasks

* Pros: Allows parallelization of irregular problems
  * Unbounded loops
  * Recursive algorithms
  * Producer/consumer
* Cons: Relatively tricky to use, and they introduce some overhead
* Motivations
  * OpenMP was originally tailored to large array-based applications
  * For a long time, something like a dynamic list traversal could not be parallelized directly in OpenMP, and the workarounds were costly
  * Workaround 1, storing pointers to the list elements in an array: High overhead for constructing the array (not easy to parallelize)
  * Workaround 2, using `single nowait` inside a parallel region: High cost of the `single` construct. Also, each thread needs to traverse the entire list to determine whether another thread has already processed each element
* Who does what and when?
  * The developer
    * Uses a pragma to specify where & what the tasks are
    * Ensures that there are no dependencies (that is, tasks can be executed independently)
  * The OpenMP runtime system
    * Generates a new task whenever a thread encounters a task construct
    * Decides when each task executes (immediately or deferred)
* Definition: A task is a specific instance of executable code together with its data environment (the shared & private data manipulated by the task) and its ICVs (internal control variables, which hold OpenMP settings such as thread scheduling and values taken from environment variables)
* Synchronization issues: a task may be deferred, so code that needs its result has to wait for it. Solution: use `#pragma omp taskwait` (waits for the child tasks of the current task) or `#pragma omp barrier` (all tasks created by the team so far are complete once the barrier is passed) to ensure the completion of tasks.
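The list-traversal motivation above can be sketched with tasks as follows. This is a minimal illustration, not the code from the lecture slides: the `node_t` type, `process`, and the helper functions are hypothetical names made up for the example. One thread walks the list inside `single` and creates one task per node, and the rest of the team picks the tasks up. It assumes a compiler with OpenMP enabled (for example `-fopenmp` with GCC or Clang); without that flag the pragmas are ignored and the code runs serially with the same result.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical list node: `value` is the input, `result` is filled in by a task. */
typedef struct node {
    int value;
    int result;
    struct node *next;
} node_t;

/* Stand-in for an expensive per-element computation with no dependencies. */
static int process(int v) { return v * v; }

/* Builds the list 1 -> 2 -> ... -> n. */
node_t *make_list(int n) {
    assert(n >= 0);
    node_t *head = NULL;
    for (int i = n; i >= 1; --i) {
        node_t *p = malloc(sizeof *p);
        if (p == NULL) abort();
        p->value = i;
        p->result = 0;
        p->next = head;
        head = p;
    }
    return head;
}

void free_list(node_t *head) {
    while (head != NULL) {
        node_t *next = head->next;
        free(head);
        head = next;
    }
}

/* One thread traverses the list and creates a task per node. */
void process_list(node_t *head) {
    #pragma omp parallel
    {
        #pragma omp single
        {
            for (node_t *p = head; p != NULL; p = p->next) {
                /* firstprivate(p): each task captures the pointer value it was created with. */
                #pragma omp task firstprivate(p)
                {
                    p->result = process(p->value);
                }
            }
            /* Wait for all tasks created above before leaving the single block. */
            #pragma omp taskwait
        }
    }
}

/* Serial check: sums the per-node results. */
long sum_results(const node_t *head) {
    long sum = 0;
    for (; head != NULL; head = head->next) sum += head->result;
    return sum;
}
```

Each task writes only to its own node, which is what makes the tasks independent. For a five-element list built by `make_list(5)`, `sum_results` after `process_list` is 1 + 4 + 9 + 16 + 25 = 55.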

![](https://1313833672-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-MMTslgmrrtRXvxD2lk9%2F-MWSRPz34AGvLjdJKN2u%2F-MWSWArBj6G2eAqlx9TX%2FScreen%20Shot%202021-03-23%20at%2012.18.00%20AM.png?alt=media\&token=98424788-6e23-481c-aeaf-ff8fd316eba5)

![](https://1313833672-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-MMTslgmrrtRXvxD2lk9%2F-MWSRPz34AGvLjdJKN2u%2F-MWSWfDdLK1Hrm_64t0x%2FScreen%20Shot%202021-03-23%20at%2012.20.09%20AM.png?alt=media\&token=1671d3e8-1bd8-4e1c-9cd6-4bf63f2b5b3a)

![](https://1313833672-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-MMTslgmrrtRXvxD2lk9%2F-MWSRPz34AGvLjdJKN2u%2F-MWSXBEAWGJLJYw73vYk%2FScreen%20Shot%202021-03-23%20at%2012.22.23%20AM.png?alt=media\&token=4b45f062-4719-4a42-b2f9-5131054495b2)
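For the recursive case listed among the pros of tasks, a common textbook illustration is a task-based Fibonacci. The sketch below is one possible version, not necessarily what the slides show. The function names and the `SERIAL_CUTOFF` value are assumptions chosen for the example; the cutoff is arbitrary rather than tuned and only serves to keep task-creation overhead in check.

```c
#include <assert.h>

/* Below this size the recursion runs serially (arbitrary, untuned assumption). */
#define SERIAL_CUTOFF 20

static long fib_serial(int n) {
    return n < 2 ? n : fib_serial(n - 1) + fib_serial(n - 2);
}

static long fib_task(int n) {
    if (n < SERIAL_CUTOFF) return fib_serial(n);
    long x, y;
    /* x and y are locals, so a task would get its own copy by default;
       shared(...) makes each task write into the parent's variable. */
    #pragma omp task shared(x)
    x = fib_task(n - 1);
    #pragma omp task shared(y)
    y = fib_task(n - 2);
    /* Both child tasks must finish before x and y are read. */
    #pragma omp taskwait
    return x + y;
}

long fib(int n) {
    assert(n >= 0);
    long result = 0;
    #pragma omp parallel
    {
        /* Only one thread starts the recursion; the team executes the tasks. */
        #pragma omp single
        result = fib_task(n);
    }
    return result;
}
```

Without the `taskwait`, the function could return before the child tasks have written `x` and `y`. With it, `fib(25)` evaluates to 75025 regardless of the number of threads.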

## OpenMP Variable Scoping Issues

* Threads have access to a pool of memory that is shared
* Threads can also have private data
* Basic rule: Any variable declared prior to a parallel region is shared in that parallel region by default
* The private clause gives each thread its own uninitialized copy of every variable listed as private in the pragma
* Some variables are treated as private by default
  * Stack (local) variables in functions called from within parallel regions
  * Loop iteration variables
  * Automatic variables within a statement block
* When in doubt, explicitly declare a variable private
* firstprivate: Specifies that each thread should have its own instance of a variable. Moreover, the variable is initialized using the value of the variable of the same name from the master thread
  * Usage: `#pragma omp parallel num_threads(4) firstprivate(i)`
* lastprivate: The enclosing context's version of the variable is set equal to the private version of whichever thread executes the final iteration of a `for` loop or the lexically last `section` of a `sections` construct
* Data scoping is a common source of errors in OpenMP. It is the programmer's responsibility to make sure data dependencies do not lead to race conditions

![](https://1313833672-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-MMTslgmrrtRXvxD2lk9%2F-MWSRPz34AGvLjdJKN2u%2F-MWSZIRLN6GvZpV7Wtu8%2FScreen%20Shot%202021-03-23%20at%2012.31.39%20AM.png?alt=media\&token=e1fbcbde-6897-43b1-a724-1045fdc05f48)

![Example of what's being shared and what's not](https://1313833672-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-MMTslgmrrtRXvxD2lk9%2F-MWSRPz34AGvLjdJKN2u%2F-MWSf4FMi0EBydTdwIaK%2FScreen%20Shot%202021-03-23%20at%201.01.16%20AM.png?alt=media\&token=ffca8b33-dce0-42d9-ab63-287ff17f231a)

## OpenMP Synchronization

* Explicit barrier: `#pragma omp barrier`
* Implicit barriers: at the end of the `parallel`, `for`, `single`, and `sections` constructs
* Unnecessary barriers hurt performance and can be removed with the `nowait` clause (applicable to `for`, `single`, `sections`)

![The nowait clause](https://1313833672-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-MMTslgmrrtRXvxD2lk9%2F-MWSRPz34AGvLjdJKN2u%2F-MWScsVyj_zlTjnb_axv%2FScreen%20Shot%202021-03-23%20at%2012.51.38%20AM.png?alt=media\&token=a623f79f-a32f-450f-9742-10e2e280115f)

![The critical construct: prevents race conditions and protects access to shared, modifiable data](https://1313833672-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-MMTslgmrrtRXvxD2lk9%2F-MWSRPz34AGvLjdJKN2u%2F-MWScy5Eon9-swpWx3bD%2FScreen%20Shot%202021-03-23%20at%2012.52.01%20AM.png?alt=media\&token=269d2dbe-910c-4e0a-8781-a2b252edee12)

![The critical construct in action. Note that naming the critical construct RES\_lock is optional but highly recommended](https://1313833672-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-MMTslgmrrtRXvxD2lk9%2F-MWSRPz34AGvLjdJKN2u%2F-MWSdeBK4I5YQxxeN8OF%2FScreen%20Shot%202021-03-23%20at%2012.55.01%20AM.png?alt=media\&token=8b9f4a67-7bba-4292-8e61-10472f6fa413)
