Bulk synchronous parallel

The Bulk Synchronous Parallel (BSP) computer is a model for designing parallel algorithms. It serves a similar purpose to the PRAM model. BSP differs from PRAM by not taking communication and synchronisation for granted. An important part of analysing a BSP algorithm rests on quantifying the synchronisation and communication needed.

The model

A BSP computer consists of processors connected by a communication network. Each processor has a fast local memory, and may follow different threads of computation.

A BSP computation proceeds in a series of global "supersteps". A superstep consists of three ordered stages:

# "Concurrent computation" : Several computations take place on every participating processor. Each process only uses values stored in the local memory of the processor. The computations are independent in the sense that they occur asynchronously of all the others.
# "Communication" : At this stage, the processes exchange data between themselves.
# "Barrier synchronisation" : When a process reaches this point (the "barrier"), it waits until all other processes have finished their communication actions.

The processes are not regarded as having a particular linear order (from left to right or otherwise), and may be mapped to processors in any way.
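To make the superstep structure concrete, the sketch below expresses one superstep on top of a message-passing library. The use of mpi4py, and all names in the sketch, are illustrative assumptions; the BSP model itself does not prescribe any particular library.

```python
# A minimal sketch of one BSP superstep using mpi4py (an illustrative
# choice, not part of the BSP model itself).
# Run under an MPI launcher, e.g.: mpiexec -n 4 python superstep.py
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()
size = comm.Get_size()

# 1. Concurrent computation: each process works only on local data.
local_values = [rank * size + j for j in range(size)]
local_results = [v * v for v in local_values]      # purely local work

# 2. Communication: every process sends one value to every other process.
#    alltoall takes a list with one item per destination process.
received = comm.alltoall(local_results)

# 3. Barrier synchronisation: no process starts the next superstep until
#    all processes have finished their communication actions.
comm.Barrier()

# Data received in this superstep may only be used from the next one on;
# here the "next superstep" simply prints it.
print(f"process {rank} received {received}")
```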

Communication

In many parallel programming systems, communications are considered at the level of individual actions: sending and receiving a message, memory-to-memory transfer, etc. This is difficult to work with, since there are many simultaneous communication actions in a parallel program, and their interactions are typically complex. In particular, it is difficult to say much about the time any single communication action will take to complete.

The BSP model considers communication actions "en masse". This has the effect that an upper bound on the time taken to communicate a set of data can be given. BSP considers all communication actions of a superstep as one unit, and assumes all messages have a fixed size.

The maximum number of incoming or outgoing messages for a superstep is denoted by h. The ability of a communication network to deliver data is captured by a parameter g, defined such that it takes time hg for a processor to deliver h messages of size 1.

A message of length m obviously takes longer to send than a message of size 1. However, the BSP model does not make a distinction between a message of length m and m messages of length 1: a message of length m counts as m unit-size messages when determining h, so in either case the cost is mg.
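As a small illustration of this accounting, the hypothetical helper below (not part of any BSP library) folds message lengths into h, so a single message of ten words and ten single-word messages cost the same:

```python
# Hypothetical helper: message lengths fold into h, the number of
# unit-size messages a processor sends or receives in a superstep.
def communication_cost(message_lengths, g):
    """Return the cost h*g for one processor, with lengths in unit-size words."""
    h = sum(message_lengths)
    return h * g

# One message of 10 words costs the same as ten messages of 1 word.
assert communication_cost([10], g=4) == communication_cost([1] * 10, g=4)  # both 40
```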

The parameter g is dependent on the following factors:

* The protocols used to interact within the communication network.
* Buffer management by both the processors and the communication network.
* The routing strategy used in the network.
* The BSP runtime system.

A value for g is, in practice, determined empirically for each parallel computer. Note that g is not the normalised single-word delivery time, but the single-word delivery time under continuous traffic conditions.
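One way such a measurement might look is sketched below: time repeated all-to-all exchanges so the network stays under continuous traffic, then divide by the number of words delivered per process. The procedure, payload size, and use of mpi4py are illustrative assumptions, not a standard benchmark.

```python
# Rough, illustrative estimate of g: time all-to-all traffic and divide
# by the words delivered per process. Real measurements would correct
# for latency, barrier cost, and serialisation overheads.
# Run with: mpiexec -n <p> python measure_g.py
from mpi4py import MPI

comm = MPI.COMM_WORLD
p = comm.Get_size()

words_per_dest = 4096                      # keep the network under continuous traffic
payload = [list(range(words_per_dest)) for _ in range(p)]
rounds = 20

comm.Barrier()
start = MPI.Wtime()
for _ in range(rounds):
    comm.alltoall(payload)                 # every process sends h words in total
comm.Barrier()
elapsed = MPI.Wtime() - start

h = words_per_dest * p                     # words sent (and received) per process, per round
g = elapsed / (rounds * h)                 # time per word under continuous traffic
if comm.Get_rank() == 0:
    print(f"estimated g = {g:.3e} seconds per word")
```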

Barriers

On most of today's architectures, barrier synchronisation is often expensive, so it should be used sparingly. However, future architecture developments may make it much cheaper. The cost of barrier synchronisation is influenced by two issues:

# The cost imposed by the variation in the completion time of the participating concurrent computations. Take the example where all but one of the processes have completed their work for this superstep, and are waiting for the last process, which still has a lot of work to complete. The best that an implementation can do is ensure that each process works on roughly the same problem size.
# The cost of reaching a globally-consistent state in all of the processors. This depends on the communication network, but also on whether there is special-purpose hardware available for synchronising, and on the way in which interrupts are handled by processors.

The cost of a barrier synchronisation is denoted by l. In practice, a value of l is determined empirically.
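A rough sketch of such a measurement is shown below: repeatedly synchronise and divide the elapsed time by the number of barriers. As with g, the details (repetition count, use of mpi4py) are illustrative assumptions rather than a standard procedure.

```python
# Rough, illustrative estimate of the barrier cost l; real measurements
# subtract timer overhead and average over many independent runs.
from mpi4py import MPI

comm = MPI.COMM_WORLD
rounds = 1000

comm.Barrier()                             # warm up and align the processes
start = MPI.Wtime()
for _ in range(rounds):
    comm.Barrier()
elapsed = MPI.Wtime() - start

if comm.Get_rank() == 0:
    print(f"estimated l = {elapsed / rounds:.3e} seconds per barrier")
```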

Barriers are potentially costly, but have a number of attractions. They do not introduce the possibility of deadlock or livelock, since barriers do not create circular data dependencies. Therefore tools to detect and deal with them are unnecessary. Barriers also permit novel forms of fault tolerance.

The cost of a BSP algorithm

The cost of a superstep is determined as the sum of three terms: the cost of the longest-running local computation, the cost of global communication between the processors, and the cost of the barrier synchronisation at the end of the superstep. The cost of one superstep for p processors is:

\max_{i=1}^{p}(w_i) + \max_{i=1}^{p}(h_i g) + l

where w_i is the cost of the local computation in process i, and h_i is the number of messages sent or received by process i. Note that homogeneous processors are assumed here. It is more common for the expression to be written as w + hg + l, where w and h are maxima. The cost of the algorithm is then the sum of the costs of each superstep:

W + Hg + Sl = \sum_{s=1}^{S} w_s + g \sum_{s=1}^{S} h_s + Sl

where S is the number of supersteps.

W, H, and S are usually modelled as functions that vary with the problem size. These three characteristics of a BSP algorithm are usually described in terms of asymptotic notation, e.g. H ∈ O(n/p).
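Since the model reduces to these few parameters, the cost of a candidate algorithm can be evaluated directly once w_i, h_i, g and l are known. The helper below is a hypothetical sketch of that bookkeeping, with made-up numbers rather than measurements of a real machine.

```python
# Hypothetical helpers evaluating the BSP cost model; the functions and
# example numbers are illustrative, not measurements of a real machine.
def superstep_cost(w, h, g, l):
    """Cost of one superstep: max_i(w_i) + max_i(h_i)*g + l.
    w[i] is the local computation time of process i; h[i] is the number
    of unit-size messages process i sends or receives."""
    return max(w) + max(h) * g + l

def algorithm_cost(supersteps, g, l):
    """Total cost W + Hg + Sl, where each entry of `supersteps` is a
    (w, h) pair of per-process lists for one superstep."""
    return sum(superstep_cost(w, h, g, l) for (w, h) in supersteps)

# Example: 4 processes, 2 supersteps, g = 4 and l = 100 (arbitrary units).
supersteps = [
    ([50, 60, 55, 40], [10, 12, 8, 9]),   # w_1 = 60, h_1 = 12
    ([30, 30, 90, 25], [5, 5, 20, 5]),    # w_2 = 90, h_2 = 20
]
print(algorithm_cost(supersteps, g=4, l=100))  # (60+48+100) + (90+80+100) = 478
```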

See also

* Computer cluster
* Concurrent computing
* Concurrency
* Grid computing
* Parallel computing
* ScientificPython
* LogP machine

External links

* [http://www.bsp-worldwide.org/ BSP Worldwide]
* [http://www.bsp-worldwide.org/implmnts/oxtool/papers.html BSP related papers]
* [http://web.comlab.ox.ac.uk/oucl/work/bill.mccoll/oparl.html WWW Resources on BSP Computing]

