ParCo'95                                                                                                                                                                 BELGACOM

Shared Memory Parallel Programming on Networks of Workstations

Willy Zwaenepoel
Department of Computer Science
Rice University
Houston, TX 77251-1892
Phone: 713-285-5402
Fax: 713-285-5930


This tutorial will explore the potential of shared memory parallel programming on networks of workstations and on distributed memory machines such as the IBM SP-2. Shared memory parallel programming is easier than message passing, and allows codes to be ported easily from conventional shared-memory multiprocessors such as the Cray Y-MP or the SGI Power Challenge to networks of workstations. It also allows the expression of parallelization strategies that are not easily supported by HPF or other parallelizing-compiler approaches.

This tutorial will discuss (i) programming using distributed shared memory, (ii) efficient software implementation of distributed shared memory on machines without hardware support for shared memory, and (iii) experience using distributed shared memory on networks of workstations for real applications from the domains of operations research (mixed integer programming), computational genetics (genetic linkage analysis), and computational seismology (seismic inversion).

Description of Tutorial and Intended Audience

Networks of workstations are increasingly being used as platforms for parallel computing. By building on existing infrastructure, networks of workstations provide a low-cost, low-risk entry into the parallel computing arena. Unfortunately, programming networks of workstations using the native network message passing facilities has proven to be a daunting task. In contrast, with a shared memory programming model, familiar from conventional shared memory multiprocessors, existing sequential codes can be parallelized with much less programmer effort than with message passing. Also, parallel codes written for shared-memory multiprocessors can be ported with little effort. The combination of the two, shared memory programming on a network of workstations, thus provides excellent leverage for both hardware and software investments.

The goal of this tutorial is to give the audience a complete picture of the current state of the art in shared memory parallel programming on networks of workstations.

The first part of the tutorial will address the "what" and "why" questions, discussing how to program with shared memory, and why this is a good idea. Several examples will be given, and the shared memory approach will be contrasted with message passing approaches like PVM or implicitly parallel approaches like HPF.

The second part of the tutorial will focus on implementation techniques for shared memory on networks of workstations, or, more generally, on distributed memory machines. The main implementation problems will be presented (sequential consistency and false sharing), and solutions from various systems will be discussed. In particular, we will look at lazy release consistency and multiple-writer protocols as used in the TreadMarks system, and entry consistency and software write detection as used in the Midway system.

The third part of the tutorial will describe some application experience with shared memory on networks of workstations. We will look at results obtained with SPLASH and NAS benchmarks on an ATM network, and compare them with PVM implementations of the same benchmarks. We will also look at complete applications implemented in this environment, including mixed integer programming, genetic linkage analysis, and seismic inversion.

The final part of the tutorial will provide a perspective on where we are and what more is needed. In particular, efforts to integrate shared memory on networks of workstations with compilers, code reorganizers, and performance and debugging tools will be presented.

The intended audience consists of anyone interested in the design, implementation, and/or use of shared memory programming on distributed memory machines. An undergraduate background in operating systems, parallel programming, or networking should prove useful, but most concepts will be developed from the basics up.

As part of the tutorial, pointers to relevant publications and systems will be made available.


Total time: 3 hours.

First hour: Programming in distributed shared memory.

Second hour: Techniques for efficient implementation.

Third hour: Application experience.