Sl.No | Chapter Name | MP4 Download |
---|---|---|
1 | Introduction to Parallel Programming | Download |
2 | Parallel Architectures and Programming Models | Download |
3 | Pipelining | Download |
4 | Superpipelining and VLIW | Download |
5 | Memory Latency | Download |
6 | Cache and Temporal Locality | Download |
7 | Cache, Memory bandwidth and Spatial Locality | Download |
8 | Intuition for Shared and Distributed Memory architectures | Download |
9 | Shared and Distributed Memory architectures | Download |
10 | Interconnection networks in Distributed Memory architectures | Download |
11 | OpenMP: A parallel Hello World Program | Download |
12 | Program with Single thread | Download |
13 | Program Memory with Multiple threads and Multi-tasking | Download |
14 | Context Switching | Download |
15 | OpenMP: Basic thread functions | Download |
16 | OpenMP: About OpenMP | Download |
17 | Shared Memory Consistency Models and the Sequential Consistency Model | Download |
18 | Race Conditions | Download |
19 | OpenMP: Scoping variables and some race conditions | Download |
20 | OpenMP: thread private variables and more constructs | Download |
21 | Computing sum: first attempt at parallelization | Download |
22 | Manual distribution of work and critical sections | Download |
23 | Distributing for loops and reduction | Download |
24 | Vector-Vector operations (Dot product) | Download |
25 | Matrix-Vector operations (Matrix-Vector Multiply) | Download |
26 | Matrix-Matrix operations (Matrix-Matrix Multiply) | Download |
27 | Introduction to tasks | Download |
28 | Task queues and task execution | Download |
29 | Accessing variables in tasks | Download |
30 | Completion of tasks and scoping variables in tasks | Download |
31 | Recursive task spawning and pitfalls | Download |
32 | Understanding LU Factorization | Download |
33 | Parallel LU Factorization | Download |
34 | Locks | Download |
35 | Advanced Task handling | Download |
36 | Matrix Multiplication using tasks | Download |
37 | The OpenMP Shared Memory Consistency Model | Download |
38 | Applications: finite element method | Download |
39 | Applications: deep learning | Download |
40 | Introduction to MPI and basic calls | Download |
41 | MPI calls to send and receive data | Download |
42 | MPI calls for broadcasting data | Download |
43 | MPI non-blocking calls | Download |
44 | Application: distributed histogram update | Download |
45 | MPI collectives and MPI broadcast | Download |
46 | MPI gathering and scattering collectives | Download |
47 | MPI reduction and Alltoall collectives | Download |
48 | Discussion on MPI collectives design | Download |
49 | Characterization of interconnects | Download |
50 | Linear arrays, 2D mesh and torus | Download |
51 | d-dimensional torus | Download |
52 | Hypercube | Download |
53 | Trees and cliques | Download |
54 | Hockney model | Download |
55 | Broadcast and Reduce with recursive doubling | Download |
56 | Scatter and Gather with recursive doubling | Download |
57 | Reduce-scatter and Allgather with recursive doubling | Download |
58 | Discussion of message sizes in analysis | Download |
59 | Revisiting Reduce-scatter on the 2D mesh | Download |
60 | Reduce-scatter and Allreduce on the Hypercube | Download |
61 | Alltoall on the Hypercube | Download |
62 | Lower bounds | Download |
63 | Pipeline-based algorithm for Allreduce | Download |
64 | An improved algorithm for Alltoall on the Hypercube using E-cube routing | Download |
65 | Pipeline-based algorithm for Broadcast | Download |
66 | Introduction to parallel graph algorithms | Download |
67 | Breadth-First Search (BFS) using matrix algebra | Download |
68 | BFS: Shared memory parallelization using OpenMP | Download |
69 | Distributed memory settings and data distribution | Download |
70 | Distributed BFS algorithm | Download |
71 | Performance considerations | Download |
72 | Prim's Algorithm | Download |
73 | OpenMP-based shared memory parallelization for MST | Download |
74 | MPI-based distributed memory parallelization for MST | Download |
75 | Sequential Algorithm Adaptation from Prim's | Download |
76 | Parallelization Strategy for Prim's algorithm | Download |
77 | Dry run with the parallel strategy | Download |
78 | Johnson's algorithm with 1D data distribution | Download |
79 | Speedup analysis on a grid graph | Download |
80 | Floyd's algorithm for all-pairs shortest paths | Download |
81 | Floyd's algorithm with 2D data distribution | Download |
82 | Adaptation to transitive closures | Download |
83 | Parallelization strategy for connected components | Download |
84 | Analysis for parallel connected components | Download |
Sl.No | Chapter Name | Transcript (English) |
---|---|---|
1 | Introduction to Parallel Programming | PDF unavailable |
2 | Parallel Architectures and Programming Models | PDF unavailable |
3 | Pipelining | PDF unavailable |
4 | Superpipelining and VLIW | PDF unavailable |
5 | Memory Latency | PDF unavailable |
6 | Cache and Temporal Locality | PDF unavailable |
7 | Cache, Memory bandwidth and Spatial Locality | PDF unavailable |
8 | Intuition for Shared and Distributed Memory architectures | PDF unavailable |
9 | Shared and Distributed Memory architectures | PDF unavailable |
10 | Interconnection networks in Distributed Memory architectures | PDF unavailable |
11 | OpenMP: A parallel Hello World Program | PDF unavailable |
12 | Program with Single thread | PDF unavailable |
13 | Program Memory with Multiple threads and Multi-tasking | PDF unavailable |
14 | Context Switching | PDF unavailable |
15 | OpenMP: Basic thread functions | PDF unavailable |
16 | OpenMP: About OpenMP | PDF unavailable |
17 | Shared Memory Consistency Models and the Sequential Consistency Model | PDF unavailable |
18 | Race Conditions | PDF unavailable |
19 | OpenMP: Scoping variables and some race conditions | PDF unavailable |
20 | OpenMP: thread private variables and more constructs | PDF unavailable |
21 | Computing sum: first attempt at parallelization | PDF unavailable |
22 | Manual distribution of work and critical sections | PDF unavailable |
23 | Distributing for loops and reduction | PDF unavailable |
24 | Vector-Vector operations (Dot product) | PDF unavailable |
25 | Matrix-Vector operations (Matrix-Vector Multiply) | PDF unavailable |
26 | Matrix-Matrix operations (Matrix-Matrix Multiply) | PDF unavailable |
27 | Introduction to tasks | PDF unavailable |
28 | Task queues and task execution | PDF unavailable |
29 | Accessing variables in tasks | PDF unavailable |
30 | Completion of tasks and scoping variables in tasks | PDF unavailable |
31 | Recursive task spawning and pitfalls | PDF unavailable |
32 | Understanding LU Factorization | PDF unavailable |
33 | Parallel LU Factorization | PDF unavailable |
34 | Locks | PDF unavailable |
35 | Advanced Task handling | PDF unavailable |
36 | Matrix Multiplication using tasks | PDF unavailable |
37 | The OpenMP Shared Memory Consistency Model | PDF unavailable |
38 | Applications: finite element method | PDF unavailable |
39 | Applications: deep learning | PDF unavailable |
40 | Introduction to MPI and basic calls | PDF unavailable |
41 | MPI calls to send and receive data | PDF unavailable |
42 | MPI calls for broadcasting data | PDF unavailable |
43 | MPI non-blocking calls | PDF unavailable |
44 | Application: distributed histogram update | PDF unavailable |
45 | MPI collectives and MPI broadcast | PDF unavailable |
46 | MPI gathering and scattering collectives | PDF unavailable |
47 | MPI reduction and Alltoall collectives | PDF unavailable |
48 | Discussion on MPI collectives design | PDF unavailable |
49 | Characterization of interconnects | PDF unavailable |
50 | Linear arrays, 2D mesh and torus | PDF unavailable |
51 | d-dimensional torus | PDF unavailable |
52 | Hypercube | PDF unavailable |
53 | Trees and cliques | PDF unavailable |
54 | Hockney model | PDF unavailable |
55 | Broadcast and Reduce with recursive doubling | PDF unavailable |
56 | Scatter and Gather with recursive doubling | PDF unavailable |
57 | Reduce-scatter and Allgather with recursive doubling | PDF unavailable |
58 | Discussion of message sizes in analysis | PDF unavailable |
59 | Revisiting Reduce-scatter on the 2D mesh | PDF unavailable |
60 | Reduce-scatter and Allreduce on the Hypercube | PDF unavailable |
61 | Alltoall on the Hypercube | PDF unavailable |
62 | Lower bounds | PDF unavailable |
63 | Pipeline-based algorithm for Allreduce | PDF unavailable |
64 | An improved algorithm for Alltoall on the Hypercube using E-cube routing | PDF unavailable |
65 | Pipeline-based algorithm for Broadcast | PDF unavailable |
66 | Introduction to parallel graph algorithms | PDF unavailable |
67 | Breadth-First Search (BFS) using matrix algebra | PDF unavailable |
68 | BFS: Shared memory parallelization using OpenMP | PDF unavailable |
69 | Distributed memory settings and data distribution | PDF unavailable |
70 | Distributed BFS algorithm | PDF unavailable |
71 | Performance considerations | PDF unavailable |
72 | Prim's Algorithm | PDF unavailable |
73 | OpenMP-based shared memory parallelization for MST | PDF unavailable |
74 | MPI-based distributed memory parallelization for MST | PDF unavailable |
75 | Sequential Algorithm Adaptation from Prim's | PDF unavailable |
76 | Parallelization Strategy for Prim's algorithm | PDF unavailable |
77 | Dry run with the parallel strategy | PDF unavailable |
78 | Johnson's algorithm with 1D data distribution | PDF unavailable |
79 | Speedup analysis on a grid graph | PDF unavailable |
80 | Floyd's algorithm for all-pairs shortest paths | PDF unavailable |
81 | Floyd's algorithm with 2D data distribution | PDF unavailable |
82 | Adaptation to transitive closures | PDF unavailable |
83 | Parallelization strategy for connected components | PDF unavailable |
84 | Analysis for parallel connected components | PDF unavailable |
Sl.No | Chapter Name | Transcript (Hindi) |
---|---|---|
1 | Introduction to Parallel Programming | Download |
2 | Parallel Architectures and Programming Models | Download |
3 | Pipelining | Download |
4 | Superpipelining and VLIW | Download |
5 | Memory Latency | Download |
6 | Cache and Temporal Locality | Download |
7 | Cache, Memory bandwidth and Spatial Locality | Download |
8 | Intuition for Shared and Distributed Memory architectures | Download |
9 | Shared and Distributed Memory architectures | Download |
10 | Interconnection networks in Distributed Memory architectures | Download |
11 | OpenMP: A parallel Hello World Program | Download |
12 | Program with Single thread | Download |
13 | Program Memory with Multiple threads and Multi-tasking | Download |
14 | Context Switching | Download |
15 | OpenMP: Basic thread functions | Download |
16 | OpenMP: About OpenMP | Download |
17 | Shared Memory Consistency Models and the Sequential Consistency Model | Download |
18 | Race Conditions | Download |
19 | OpenMP: Scoping variables and some race conditions | Download |
20 | OpenMP: thread private variables and more constructs | Download |
21 | Computing sum: first attempt at parallelization | Download |
22 | Manual distribution of work and critical sections | Download |
23 | Distributing for loops and reduction | Download |
24 | Vector-Vector operations (Dot product) | Download |
25 | Matrix-Vector operations (Matrix-Vector Multiply) | Download |
26 | Matrix-Matrix operations (Matrix-Matrix Multiply) | Download |
27 | Introduction to tasks | Download |
28 | Task queues and task execution | Download |
29 | Accessing variables in tasks | Download |
30 | Completion of tasks and scoping variables in tasks | Download |
31 | Recursive task spawning and pitfalls | Download |
32 | Understanding LU Factorization | Download |
33 | Parallel LU Factorization | Download |
34 | Locks | Download |
35 | Advanced Task handling | Download |
36 | Matrix Multiplication using tasks | Download |
37 | The OpenMP Shared Memory Consistency Model | Download |
38 | Applications: finite element method | Not Available |
39 | Applications: deep learning | Not Available |
40 | Introduction to MPI and basic calls | Not Available |
41 | MPI calls to send and receive data | Not Available |
42 | MPI calls for broadcasting data | Not Available |
43 | MPI non-blocking calls | Not Available |
44 | Application: distributed histogram update | Not Available |
45 | MPI collectives and MPI broadcast | Not Available |
46 | MPI gathering and scattering collectives | Not Available |
47 | MPI reduction and Alltoall collectives | Not Available |
48 | Discussion on MPI collectives design | Not Available |
49 | Characterization of interconnects | Not Available |
50 | Linear arrays, 2D mesh and torus | Not Available |
51 | d-dimensional torus | Not Available |
52 | Hypercube | Not Available |
53 | Trees and cliques | Not Available |
54 | Hockney model | Not Available |
55 | Broadcast and Reduce with recursive doubling | Not Available |
56 | Scatter and Gather with recursive doubling | Not Available |
57 | Reduce-scatter and Allgather with recursive doubling | Not Available |
58 | Discussion of message sizes in analysis | Not Available |
59 | Revisiting Reduce-scatter on the 2D mesh | Not Available |
60 | Reduce-scatter and Allreduce on the Hypercube | Not Available |
61 | Alltoall on the Hypercube | Not Available |
62 | Lower bounds | Not Available |
63 | Pipeline-based algorithm for Allreduce | Not Available |
64 | An improved algorithm for Alltoall on the Hypercube using E-cube routing | Not Available |
65 | Pipeline-based algorithm for Broadcast | Not Available |
66 | Introduction to parallel graph algorithms | Not Available |
67 | Breadth-First Search (BFS) using matrix algebra | Not Available |
68 | BFS: Shared memory parallelization using OpenMP | Not Available |
69 | Distributed memory settings and data distribution | Not Available |
70 | Distributed BFS algorithm | Not Available |
71 | Performance considerations | Not Available |
72 | Prim's Algorithm | Not Available |
73 | OpenMP-based shared memory parallelization for MST | Not Available |
74 | MPI-based distributed memory parallelization for MST | Not Available |
75 | Sequential Algorithm Adaptation from Prim's | Not Available |
76 | Parallelization Strategy for Prim's algorithm | Not Available |
77 | Dry run with the parallel strategy | Not Available |
78 | Johnson's algorithm with 1D data distribution | Not Available |
79 | Speedup analysis on a grid graph | Not Available |
80 | Floyd's algorithm for all-pairs shortest paths | Not Available |
81 | Floyd's algorithm with 2D data distribution | Not Available |
82 | Adaptation to transitive closures | Not Available |
83 | Parallelization strategy for connected components | Not Available |
84 | Analysis for parallel connected components | Not Available |
Sl.No | Language | Book link |
---|---|---|
1 | English | Download |
2 | Bengali | Not Available |
3 | Gujarati | Not Available |
4 | Hindi | Not Available |
5 | Kannada | Not Available |
6 | Malayalam | Not Available |
7 | Marathi | Not Available |
8 | Tamil | Not Available |
9 | Telugu | Not Available |