Graph Sink Detection

This page needs to be tidied, and that will require some attention from the original contributors. In particular, what are the definitions being used here? As a PhD in graph theory, to me a sink in a directed graph (not necessarily complete) is a vertex with no outgoing edges. I refer you to http://www.eli.sdsu.edu/courses/fall95/cs660/notes/Graph/Graph.html#RTFToC23 which claims the same.

Part of the confusion may be between the definition of a sink and a universal sink. According to Introduction to Algorithms (Cormen et al), a universal sink for G = (V, E) is "a vertex with in-degree |V|-1 and out-degree 0."

If you are dealing with complete digraphs then clearly there is at most one sink, and there is a linear time algorithm that will find it.


How to find a sink (if there is one) in a directed graph represented as an adjacency matrix?

(Note: A graph "sink" is _____, and the "->" notation below means _____. Find out more about stuff like this at http://_____.)

    for each vertex v:
      v.might_be_sink := true
    for each vertex v:
      if v.might_be_sink:
        for each other vertex w:
          if v -> w:
            give up on v and try next candidate
          else:
            w.might_be_sink := false
          if w -> v:
            w.might_be_sink := false
          else:
            give up on v and try next candidate
      hooray! v is a sink.

Each time around the inner loop, at least one vertex is discovered not to be a sink. (Sometimes two might be.) Therefore, the whole thing takes time at most proportional to the number of vertices in the graph.

Nice try, but in the worst case, for general vertex 2n-1, the inner loop may execute n times before vertices 2n-1 and 2n are found not to be sinks, so the time-complexity is of order v^2, where v is the number of vertices.


The following was written with the assumption that a "sink" = "a vertex with no outgoing branches". Apparently a different definition was meant.

Any algorithm which inspects each entry in the adjacency matrix at least once will have complexity v^2. Consider the alternative question: give an algorithm which answers the question if the graph has at least 1 sink. This is equivalent to asking if there is a row in the matrix with only 0's. The worst-case running scenario of an algorithm that solves this problem must be v^2; there is no short-cut for inspecting every entry.

-- StephanHouben


A clever graph creator/loader will update sink status as it goes i.e. do the v^2 bit once only :). This is the equivalent of a denormalized aggregate in a database table updated by tuple triggers.

-- RichardHenderson


The algorithm above is indeed broken, but Stephan's analysis of why it must' be broken is wrong. (A sink isn't equivalent to a row of all 0s; it's equivalent to a row of all 0s whose corresponding column has no 0s. We're ignoring the diagonal elements, of course.)

Here's an algorithm that either finds a sink, or reports that there is no sink, in linear time.

Two space indentation, and not 8 spaces? Interesting, if only Guido would wisen up also and make code more readable.

(Thanks to the original questioner for pointing out a thinko in the definition of find-possible-sink and another in the proof of theorem 1. Both now fixed.)

   def find-possible-sink(vertices):
     if there's only one vertex, return it
     good-vertices := empty-set
     pair vertices into at most n/2 pairs
     add any left-over vertex to good-vertices
     for each pair (v,w):
       if v -> w:
         add w to good-vertices
       else:
         add v to good-vertices
     return find-possible-sink(good-vertices)

def find-sink(vertices): v := find-possible-sink(vertices) if v is actually a sink, return it return "there is no spoon^H^H^H^Hink"

Theorem 1 The algorithm above takes linear time.

Proof The time taken by find-possible-sink satisfies T(n) <= T(n/2) + K.n + L for some constants K, L. Therefore, by induction, it satisfies T(n) <= constant.n.

The time taken by find-sink is linear, because checking a single vertex for sinkitude takes linear time. []

Theorem 2 The algorithm above never returns a non-sink.

Proof We never return a vertex without checking it first. []

Theorem 3 If there is a sink, the algorithm above returns it.

Proof Suppose v is a sink. It suffices to prove that find-possible-sink returns v, since it will pass the test in find-sink.

If v is the only vertex in vertices when find-possible-sink is called, then of course it will be returned. So suppose that vertices has more than one element, and includes v.

If v is unpaired, then it will automatically be entered in good-vertices. If it is paired (say, with w), then since we have w -> v and not v -> w, v will be entered in good-vertices whichever way around the vertices appear in the pair.

Hence, v will be entered in good-vertices and passed to the recursive invocation of find-possible-sink. By induction on the number of elements in vertices, it will be returned from this invocation. []

Conclusion: this algorithm finds a sink if there is one, in linear time. Corollary: there is at most one sink. (This is, of course, trivial anyway.)

When I last looked at this, I concluded that the code and proof required a minor tweak to be completely correct (i.e., work in the worst case).

Well done! Would you like to share your minor tweak, or is reading your mind left as an exercise for reader?


Wonderful work.I had tortured my brain a lot for this algo.

This algorithm works only if the graph is complete, in a sense that either v -> w or w -> vfor all v and w. Then, it is of course also impossible to have more than one sink.

False. Nothing in the proof above assumes that the graph is complete. However, the existence of a sink implies that the graph is complete. Or are you defining "sink" as "vertex with edges from some other vertices but not to any others"? Indeed, the algorithm won't find those. But I don't think that's the usual definition of "sink".

You are right, my definition differs from yours. Thanks for clarifying.

Yes that answers my question too, but it also suggests that the definition isn't very useful. Still, no-one said it had to be useful :).

I've never heard of a sink in a digraph being defined as anything other than a node with outdegree zero. And from a quick search on google, I'd conclude that no other definition is in common use.

I did various google searches before creating this page, and am satisfied the more restrictive definition of a sink is currently standard. The question appears as an exercise on various university sites.

Since there seems to be such disagreement, it's obviously not standard :-). I found three definitions myself: a node with no outgoing edges; a node reachable (via some path) from all nodes and with no outgoing edges; and a node with no outgoing edges that is reachable in one step from every other node.


Strictly speaking, the question asks for efficient detection of a (standard definition) sink if there is one. If there is no sink and the graph is not 'complete', reporting there is no sink is merely a bonus.

In theorem 1, n/2 should be annotated as rounded up to next integer if fractional. The proof still works.


BTW, this problem is in some circles known as the "Celebrity Problem", and is more colourful formulated as follows:

At a party, there is one celebrity. The celebrity has the following properties:

Now, the task is to identify the celebrity by only asking questions to people of the form: "Do you know him/her?"

The trick is that if you do it right, you can with every question eliminate some person as the celebrity. Say you ask A: "Do you know B?" If A answers yes, then A is not the celebrity. If A answers no, then B is not the celebrity. So you continue eliminating candidates until you have one left. This takes N-1 questions max. Of course, if you are not sure a celebrity exists, you will have to check explicitly. This takes another few questions.

-- StephanHouben, additions by ThomasHolenstein?


    adjacency_matrix[n, n]
    int candidate = 0
    for int i = 1; i < n; i++:
        if adjacency_matrix[candidate, i] == 1:
            candidate = i
    for int i = 0; i < n; i++:
        if i != candidate and (adjacency_matrix[i, candidate] == 0 or adjacency_matrix[candidate, i] == 1):
            return "there is no spoon^H^H^H^Hink"
    return candidate


See


CategoryAlgorithm


EditText of this page (last edited November 14, 2014) or FindPage with title or text search