A ranking of n web pages is to be chosen from outputs of k search
engines. How do we choose one ranking minimizing the "disagreement" with
the k rankings ?
A clustering of n genes is to be chosen from outputs of k clustering
algorithms. How do we choose one clustering minimizing the
"disagreement" with the k clusterings?
These information aggregation problems date back to 1785, when the
French philosopher Condorcet considered voting systems where each voter
chooses a full ranking of a set of candidates. Recently, their
algorithmic and complexity aspects have been studied.
I will show that for both these problems, we can obtain improved
algorithms using essentially the same, remarkably simple principle.
Furthermore, this also applies to and yields improvements for
related graph theoretic optimization problems, known as the minimum
feedback arc set in tournaments and the correlation clustering in
complete graphs. Additionally, I will show that the problem of finding a
minimum feedback arc set in tournaments has no poly-time algorithm,
assuming NP is not contained in BPP. This almost settles a long-standing
conjecture of Bang-Jensen and Thomassen, that the problem is NP-hard.
This is joint work with Nir Ailon and Alantha Newman.