Notes for Parallelism
Task: a unit of computation, i.e. a distinct part of a program or algorithm.
Task parallelism: different tasks execute at the same time.
Data parallelism: the same task executes on different data items at the same time (e.g. two chefs, each slicing one tomato).
Dependencies: constrain the execution order of two tasks A and B: A must complete before B executes.
Notation: A > B
Dependencies lead to a partial ordering.
A and B can execute in parallel if
there is no path in the dependence graph from A to B, and
no path in the dependence graph from B to A.
Note that dependency is transitive: A > C and C > B imply A > B.
Task parallelism example
Computing min,max,avg for a large dataset.
The task can be divided into
T1: MIN, T2: MAX, T3: AVG
One way to execute these tasks in parallel is to use POSIX threads.
Data parallelism example
Computing pairwise sum of two arrays
Use a SIMD extension of Intel processors to do the sum in one step.
When summing the values of two arrays, a conventional CPU needs one + operation per index, because its scalar registers hold only one data item at a time.
Vector processors hold several values of the same data type; the registers of the SSE extension of Intel's CPUs are 128 bits wide.
v4sf is a GCC extension: a typedef that takes a primitive data type and spreads it across a whole SSE register:
typedef float v4sf __attribute__ ((vector_size (16)));
v4sf v;                          /* declares vector v, which consists of 4 fp numbers */
v = (v4sf){1.0, 2.0, 3.0, 4.0};  /* assigns values to the elements of v */
SSE can apply an operation to all elements of a vector at once.
Addition example:
v4sf VA, VB, VC;
VC = __builtin_ia32_addps(VA, VB);  /* 4 float additions in a single instruction */