Notes for Parallelism
Task: a distinct unit of computation, i.e. a part of a program or algorithm.
Task-Parallelism: different tasks execute at the same time
Data-Parallelism: the same task is applied to different data items at the same time (e.g. two chefs, each slicing their own tomato)
Dependency: a constraint on the execution order of two tasks A and B; A must complete before B can execute.
Notation: A -> B
Dependencies lead to a partial ordering.
A and B can execute in parallel if
- there is no path in the dependence graph from A to B
- there is no path in the dependence graph from B to A
Note that dependencies are transitive: A -> C and C -> B imply A -> B.
----------------------Task parallelism example-----------------------
Computing min, max, and avg for a large data set.
The task can be divided into
T1 - MIN, T2 - MAX, T3 - AVG
One way to execute the tasks in parallel is to use POSIX threads (the threads are created via library calls, not by the compiler).
----------------------Data parallelism example-----------------------
Computing the pair-wise sum of two arrays.
A SIMD extension of an Intel processor can perform several additions in one step.
When summing two arrays, a conventional CPU needs one + operation per index, because a scalar register holds only one data item at a time.
A vector processor holds several values of the same data type in one register; the registers of Intel's SSE extension are 128 bits wide (room for four 32-bit floats).
typedef float v4sf __attribute__ ((vector_size (16)));
        ^
        |
        an extension to gcc: takes a primitive data type and spreads it across a whole 16-byte SSE register
v4sf v; <------- declares vector v, which consists of 4 fp numbers
v = (v4sf){1.0, 2.0, 3.0, 4.0}; <----------- assigns values to the elements of v
SSE can apply an operation to all elements of a vector at once.
Addition example:
v4sf VA, VB, VC;
VC = __builtin_ia32_addps(VA, VB);