Approximation Algorithms - General Notes

Here’s some of my notes on the readings on approximation algorithms. First we look at Greedy Algorithms for known. Mainly, load balancing, set cover, etc.

Load Balancing Problem

The greedy algorithm for assigning a set of given tasks to a set of given machines in a balanced manner is:

Start with no jobs assigned
Set $T_i = 0$ and $A(i)=\lbrace \rbrace$ for all machines $M_i$
For $j=1, \dots, n$

Let $M_i$ be a machine that acheives the minimum $\min_k T_k$
Assign job j to machine $M_i$
Set $A(i) \leftarrow A(i) \cup \lbrace j \rbrace$
Set $T_i \leftarrow T_i + t_j$
EndFor

This obviously doesn’t always give the optimal solution. So how close is it to optimal?

Well, first we know that at least one of the must do $\frac{1}{m} \sum_j t_j$ amount of work. Intuitively, this machine could be doing all the work, but there’s got to be at least one that does it.

So we already know the optimal solution has to be:

Now, for the greedy algorithm. When we add a job with load $t_j$ to some $M_i$ we know that the machine has the currently smallest load. We thus know that every other machine has a load of at least $T_i - t_j$. So all the machines have time:

which can then be refactored as

Which looks like our earlier look at the optimal solution so actually we can say:

But what can we say about $t_j$? Well, we know that:

This is because the optimal solution must be greater than or equal to the length of the longest job. Ideally, this job is placed on its own machine such that the timing is only equal to it, but in the worst case it can be much much greater. Hence the inequality. Using this inequality however, we can add teh two together:

Sorted Machine Scheduling

Obviously, the greedy algorithm has an issue where if a long job comes along at the end all the machines already have jobs running and so the job that should be scheduled on its own machine in optimum scenarios is not.

Can simply avoid this by sorting the jobs in decreasing order of time and then running the greedy algorithm. This is a 3/2-approximation algorithm. How can we prove this?

Well, we know that if there are more than m jobs, by pidgeon-hole (kind of) principle:

Since there must be two jobs running on at least one machine.

Assuming that there are in fact $> m$ jobs then, we can actually say that for a job $t_j$:

In which case the summation of the inqualities we had before becomes:

Greedy Weighted Set Cover

Each set is giving a weight. Minimize the sum of weights while covering all the elements.

What’s the greedy algorithm? Well, we use a heuristic to combine our goals: covering the elements and minimizing the weights:

Where $R$ is the set of remaining uncovered elements and $w_i$ is the weight. So the greedy algorithm is to Select a set $S_i$ that minimizes the heuristic and then remove the set elements from $R$.

Again, this is not optimal because it could miss potentially better solutions.

First, we need a good lower bound on the optimum.

Written on December 16, 2014