Searching for Bandwidth-Constrained Clusters

In The 31st International Conference on Distributed Computing Systems (ICDCS 2011), April 2011.

Sukhyun Song, Peter J. Keleher, Alan Sussman

Data-intensive distributed applications can increase their performance by running on a cluster of hosts connected via high-bandwidth interconnections. However, there is no effective method to find such a bandwidth-constrained cluster in a decentralized fashion. Our work is inspired by prior work that treats Internet bandwidth as an approximate tree metric space.

This paper presents a decentralized, accurate, and efficient method to find a cluster of Internet hosts, given the desired cluster size and minimum interconnection bandwidth. We describe a centralized polynomial time algorithm for a tree metric space, along with a proof of correctness. We then provide a decentralized version of the algorithm. Simulation experiments with two real-world datasets confirm that our clustering approach achieves high accuracy and scalability. We also discuss the costs of decentralization and how the treeness of the dataset affects clustering accuracy.

	title = "Searching for Bandwidth-Constrained Clusters",
	author = "Sukhyun Song and Peter J. Keleher and Alan Sussman",
	booktitle = { The 31st International Conference on Distributed Computing Systems (ICDCS 2011)},
	month = {April},
	year = {2011},

Available: bibtex, abstract,