Parallel Programming with MPI
Morgan Kaufmann, 1997 - 418 pages

"...the detailed discussion of many complex and confusing issues makes the book an important information source for programmers developing large applications using MPI." -- L.M. Liebrock, ACM Computing Reviews

A hands-on introduction to parallel programming based on the Message-Passing Interface (MPI) standard, the de facto industry standard adopted by major vendors of commercial parallel systems. This textbook/tutorial, based on the C language, contains many fully developed examples and exercises. The complete source code for the examples is available in both C and Fortran 77. Students and professionals will find that the portability of MPI, combined with a thorough grounding in parallel programming principles, will allow them to program any parallel system, from a network of workstations to a parallel supercomputer.

Features:
+ Proceeds from basic blocking sends and receives to the most esoteric aspects of MPI.
+ Includes extensive coverage of performance and debugging.
+ Discusses a variety of approaches to the problem of basic I/O on parallel machines.
+ Provides exercises and programming assignments.
Contents
Introduction | 1 |
1.2 The Need for Parallel Computing | 3 |
1.3 The Bad News | 5 |
1.4 MPI | 6 |
1.5 The Rest of the Book | 7 |
1.6 Typographic Conventions | 9 |
An Overview of Parallel Computing | 11 |
2.2 Software Issues | 25 |
2.3 Summary | 36 |
2.4 References | 38 |
Greetings | 41 |
3.2 Execution | 42 |
3.3 MPI | 43 |
3.4 Summary | 50 |
3.5 References | 51 |
3.6 Exercises | 52 |
An Application: Numerical Integration | 53 |
4.2 Parallelizing the Trapezoidal Rule | 56 |
4.3 I/O on Parallel Systems | 60 |
4.4 Summary | 63 |
4.7 Programming Assignments | 64 |
Collective Communication | 65 |
5.2 Broadcast | 69 |
5.3 Tags, Safety, Buffering, and Synchronization | 71 |
5.4 Reduce | 73 |
5.5 Dot Product | 75 |
5.6 Allreduce | 76 |
5.7 Gather and Scatter | 78 |
5.8 Allgather | 82 |
5.9 Summary | 83 |
5.10 References | 86 |
5.12 Programming Assignments | 87 |
Grouping Data for Communication | 89 |
6.2 Derived Types and MPI_Type_struct | 90 |
6.3 Other Derived Datatype Constructors | 96 |
6.4 Type Matching | 98 |
6.5 Pack/Unpack | 100 |
6.6 Deciding Which Method to Use | 103 |
6.7 Summary | 105 |
6.8 References | 107 |
6.9 Exercises | 108 |
6.10 Programming Assignments | 109 |
Communicators and Topologies | 111 |
7.2 Fox's Algorithm | 113 |
7.3 Communicators | 116 |
7.4 Working with Groups, Contexts, and Communicators | 117 |
7.5 MPI_Comm_split | 120 |
7.6 Topologies | 121 |
7.7 MPI_Cart_sub | 124 |
7.8 Implementation of Fox's Algorithm | 125 |
7.9 Summary | 128 |
7.10 References | 132 |
7.12 Programming Assignments | 133 |
Dealing with I/O | 137 |
8.1 Dealing with stdin, stdout, and stderr | 138 |
8.2 Limited Access to stdin | 154 |
8.3 File I/O | 156 |
8.4 Array I/O | 158 |
8.5 Summary | 171 |
8.6 References | 176 |
8.8 Programming Assignments | 177 |
Debugging Your Program | 179 |
9.2 More on Serial Debugging | 188 |
9.5 An Example | 191 |
9.5.11 Finishing Up | 210 |
9.7 Summary | 212 |
9.8 References | 215 |
Design and Coding of Parallel Programs | 217 |
10.1 Data-Parallel Programs | 218 |
10.3 Parallel Jacobi's Method | 220 |
10.4 Coding Parallel Programs | 225 |
Sorting | 226 |
10.6 Summary | 240 |
10.7 References | 241 |
10.9 Programming Assignments | 242 |
Performance | 245 |
The Serial Trapezoidal Rule | 247 |
11.3 What about the I/O? | 248 |
11.4 Parallel Program Performance Analysis | 249 |
11.5 The Cost of Communication | 250 |
The Parallel Trapezoidal Rule | 252 |
11.7 Taking Timings | 254 |
11.8 Summary | 256 |
11.9 References | 257 |
11.11 Programming Assignments | 258 |
More on Performance | 259 |
12.2 Work and Overhead | 261 |
12.3 Sources of Overhead | 262 |
12.4 Scalability | 263 |
12.5 Potential Problems in Estimating Performance | 265 |
12.5.4 Collective Communication | 269 |
12.7 Summary | 275 |
12.8 References | 277 |
12.10 Programming Assignments | 278 |
Advanced Point-to-Point Communication | 279 |
Coding Allgather | 280 |
13.2 Hypercubes | 284 |
13.3 Send-receive | 293 |
13.4 Null Processes | 295 |
13.5 Nonblocking Communication | 296 |
13.6 Persistent Communication Requests | 301 |
13.7 Communication Modes | 304 |
13.8 The Last Word on Point-to-Point Communication | 309 |
13.10 References | 313 |
13.12 Programming Assignments | 314 |
Parallel Algorithms | 315 |
14.2 Sorting | 316 |
14.4 Parallel Bitonic Sort | 320 |
14.5 Tree Searches and Combinatorial Optimization | 324 |
14.6 Serial Tree Search | 325 |
14.7 Parallel Tree Search | 328 |
14.8 Summary | 335 |
14.9 References | 336 |
14.11 Programming Assignments | 337 |
Parallel Libraries | 339 |
15.2 Using More than One Language | 340 |
15.3 ScaLAPACK | 342 |
15.4 An Example of a ScaLAPACK Program | 345 |
15.5 PETSc | 350 |
15.6 A PETSc Example | 352 |
15.7 Summary | 358 |
Wrapping Up | 361 |
16.2 The Future of MPI | 362 |
Summary of MPI Commands | 363 |
A.2 Derived Datatypes and MPI_Pack/Unpack | 372 |
A.3 Collective Communication Functions | 376 |
A.4 Groups, Contexts, and Communicators | 381 |
A.6 Environmental Management | 391 |
A.7 Profiling | 393 |
A.9 Type Definitions | 396 |
MPI on the Internet | 399 |
B.2 The MPI FAQ | 400 |
B.6 Parallel Programming with MPI | 401 |