A Distributed Algorithm for Embedding Trees in Hypercubes with Modifications for Run-Time Fault Tolerance

  • Rami Melhem
  • Foster Provost

In this paper we present a distributed algorithm for embedding binary trees in hypercubes. Starting with the root (invoked in some cube node by a host), each node is responsible for determining the addresses of its children and for invoking the embedding algorithm, in the proper cube node, for the subtree rooted at each child. This distributed embedding, together with the wealth of communication links in the hypercube, gives a high potential for fault tolerance. We demonstrate this capability by introducing restructuring techniques that may be used to tolerate faults during the initial embedding but are especially useful for remapping nodes that fail at run-time. The distributed nature of the embedding eliminates the need for global knowledge of faulty nodes: each node need only know the status of its neighbors. In addition, only the neighbors of a faulty node need be aware of any change.
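The locality property can be illustrated with a minimal sketch. It is not the embedding algorithm of the paper, whose child-address rule is not given in this abstract. It relies only on the standard fact that two hypercube nodes are adjacent exactly when their binary addresses differ in one bit. The function names, the greedy lowest-dimension-first choice, and the `faulty_neighbors` set are assumptions made for illustration.

```python
from typing import List, Optional, Set


def neighbors(addr: int, dim: int) -> List[int]:
    """Addresses adjacent to `addr` in a `dim`-dimensional hypercube.

    Adjacent nodes differ in exactly one address bit.
    """
    return [addr ^ (1 << i) for i in range(dim)]


def pick_children(
    addr: int,
    dim: int,
    parent: Optional[int],
    faulty_neighbors: Set[int],
    count: int = 2,
) -> List[int]:
    """Hypothetical placement rule: choose up to `count` cube nodes for the
    children of the tree node mapped to `addr`.

    Only local information is used: the node's own address, its parent's
    address, and the fault status of its immediate neighbors.
    """
    chosen: List[int] = []
    for n in neighbors(addr, dim):
        if n == parent or n in faulty_neighbors:
            continue  # skip the parent and any neighbor known to be faulty
        chosen.append(n)
        if len(chosen) == count:
            break
    return chosen


if __name__ == "__main__":
    # Root at node 000 of a 3-cube, with neighbor 001 marked faulty.
    print(pick_children(0b000, 3, parent=None, faulty_neighbors={0b001}))
```

With the root at node 000 of a 3-cube and neighbor 001 faulty, this rule places the children at 010 and 100. A complete scheme would also have to prevent two tree nodes from claiming the same cube node, which this sketch does not attempt.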