We study approximation algorithms for placing replicated data in arbitrary networks. Consider a network of nodes with individual storage capacities and a metric communication cost function, in which each node periodically issues a request for an object drawn from a collection of uniform-length objects. We consider the problem of placing copies of the objects among the nodes such that the average access cost is minimized. Our main result is a polynomial-time constant-factor approximation algorithm for this placement problem. Our algorithm is based on a careful rounding of a linear programming relaxation of the problem. We also show that the data placement problem is MAXSNP-hard.
We extend our approximation result to a generalization of the data placement problem that models additional costs such as the cost of realizing the placement. We also show that when object lengths are non-uniform, a constant-factor approximation is achievable if the capacity at each node in the approximate solution is allowed to exceed that in the optimal solution by the length of the largest object.