Hi, and welcome to this lesson on Dijkstra’s algorithm!

This week, we’re going to talk about shortest paths in weighted graphs. The graph traversal algorithms that we’ve already looked at in the previous lesson can’t be directly applied to find shortest paths in weighted graphs.

Weighted graphs

To illustrate why the graph traversal algorithms already presented do not necessarily work for weighted graphs, consider the following toy graph.

Here, a breadth-first search (BFS) starting from vertex v_1 will not produce a spanning tree of shortest paths. For example, the actual shortest path from v_1 to v_3 goes through the edges \{v_1,v_2\} and \{v_2,v_3\}, whereas the path found by a BFS uses the single edge \{v_1,v_3\}.

This is because a BFS favors a smaller number of hops from the initial vertex, regardless of the weights.
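As a small illustration (with weights chosen for this example, since the figure is not reproduced here), we can compare the cost of the one-hop path found by a BFS with the cost of the two-hop path:

```python
# Hypothetical weights for the toy graph; the actual figure may use different values.
weights = {("v1", "v2"): 1, ("v2", "v3"): 1, ("v1", "v3"): 5}

bfs_cost = weights[("v1", "v3")]                              # 1 hop, cost 5
two_hop_cost = weights[("v1", "v2")] + weights[("v2", "v3")]  # 2 hops, cost 2

print(bfs_cost, two_hop_cost)  # 5 2: fewer hops, but a longer weighted path
```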

Dijkstra’s algorithm

Dijkstra’s algorithm is a modification of the BFS that allows us to find shortest paths, even in the presence of weights, if those are non-negative.

A quick note about weights. So far in this MOOC, we’ve only considered weights that are non-negative. In some cases it might be meaningful to add negative weights.
For example, imagine a taxi driver who travels between cities. Sometimes he earns money because he has a client, and sometimes he loses money because he travels alone. We could thus consider a graph whose vertices represent cities and whose edges are weighted by the cost of the corresponding trip. Some weights would then be positive and others negative.

Let’s get back to graphs with non-negative weights. The simplest way to describe Dijkstra’s algorithm is to imagine the starting vertex as a tap from which water is pouring out. The water spreads along the edges of the graph, taking a time proportional to the weight of each edge to traverse it, so it reaches the closest vertices first.

In other words, Dijkstra’s algorithm is a traversal algorithm in which vertices are visited in order of increasing distance from the initial vertex.

At each step, Dijkstra’s algorithm stores two kinds of information: 1) which vertices have already been explored, and 2) an approximation of the distances from v_1 to all other vertices in the graph.
The algorithm repeats two steps. In step 1, it selects an unexplored vertex v that is at minimum distance from v_1. In step 2, it updates the distances from v_1 to the other vertices of the graph, using information about v and its neighbors.
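To make these two steps concrete, here is a minimal Python sketch of the algorithm as just described. The function name dijkstra and the graph representation (a dictionary mapping each vertex to a dictionary of its weighted neighbors) are choices made for this illustration, not something imposed by the lesson.

```python
import math

def dijkstra(graph, start):
    """Return the distances from start to every vertex of graph.

    graph is assumed to be a dict mapping each vertex to a dict
    {neighbor: weight}, with non-negative weights.
    """
    explored = set()                          # vertices already explored
    distances = {v: math.inf for v in graph}  # current distance estimates
    distances[start] = 0

    while len(explored) < len(graph):
        # Step 1: select an unexplored vertex at minimum distance from start
        candidates = [v for v in graph if v not in explored and distances[v] < math.inf]
        if not candidates:
            break  # the remaining vertices are unreachable
        current = min(candidates, key=lambda v: distances[v])
        explored.add(current)

        # Step 2: update the distances using current and its neighbors
        for neighbor, weight in graph[current].items():
            distances[neighbor] = min(distances[neighbor], distances[current] + weight)

    return distances
```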

Note that the algorithm always yields a spanning tree of shortest paths in the graph, provided the weights are non-negative. We’re not going to prove this result here, but it’s not too hard and makes a very nice exercise if you’d like to go a little deeper into the concepts described in this lesson.

Let’s describe more precisely how Dijkstra’s algorithm works using the following example. We’re going to build a spanning tree of shortest paths starting from vertex v_1 in this graph.
So, we initialize two data structures: first, the set of explored vertices, which is initially empty, and second, the array of distances from v_1 to the other vertices in the graph, initialized to +\infty everywhere except at v_1, where we insert a 0.
The first step is to select an unexplored vertex that is at minimum distance from the starting vertex. Here, we select v_1, which is at distance 0, and add it to the set of explored vertices.
The second step is to update the distances taking into account the selected vertex v_1 and its neighbors. Here, v_1 has two neighbors: v_2 and v_3. Reaching them in one hop from v_1 costs 5 and 3, respectively. So we update our distances from v_1 by saying that v_2 is at distance 5 and v_3 is at distance 3. Note that the length of a shortest path from v_1 to v_2 is actually 4, but we can’t tell that yet from what the algorithm has computed so far.
Let’s now start a new iteration of the algorithm. In the first step, we select an unexplored vertex at a minimum distance from v_1. Here, we choose v_3, which is at distance 3. We add it to the set of explored vertices.
In the second step, we look at the neighbors of v_3, which are v_2 and v_7, with corresponding edge weights of 1 and 2. If we reach v_2 by going through v_3, the total cost is 3 + 1 = 4: the cost of reaching v_3, plus the cost of one hop from v_3 to v_2. This is better than the distance we previously estimated for v_2, so we update the distances accordingly. As far as v_7 is concerned, we now have an estimated distance of 3 + 2 = 5: the cost of reaching v_3, plus the cost of one hop from v_3 to v_7.
A new iteration begins. Now we select the next unexplored vertex with the smallest distance, which is v_2. We add it to the set of explored vertices.
The vertex v_2 has two unexplored neighbors, v_4 and v_7. Going through v_2, the vertex v_4 gets an estimated distance of 4 + 2 = 6, and v_7 an estimated distance of 4 + 1 = 5. So we update the distance to v_4. As far as v_7 is concerned, the newly computed distance is not better than the old one, so it has no effect.
A new iteration begins. We select v_7, at distance 5, and add it to the set of explored vertices. And so on…
Finally we obtain the following spanning tree.
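To connect this walkthrough to code, we can run the dijkstra sketch given earlier on this example. Only the edges explicitly mentioned in the walkthrough are included below; the other vertices and edges of the figure are omitted, so this is just a partial reconstruction of the graph.

```python
# Partial reconstruction: only the edges mentioned in the walkthrough.
graph = {
    "v1": {"v2": 5, "v3": 3},
    "v2": {"v1": 5, "v3": 1, "v4": 2, "v7": 1},
    "v3": {"v1": 3, "v2": 1, "v7": 2},
    "v4": {"v2": 2},
    "v7": {"v2": 1, "v3": 2},
}

print(dijkstra(graph, "v1"))
# {'v1': 0, 'v2': 4, 'v3': 3, 'v4': 6, 'v7': 5}
```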

Concluding words

That’s all for this lesson! Next we will see how to implement Dijkstra’s algorithm by using min-heaps.


Properties of Dijkstra’s algorithm

Termination & correctness

We want to show that Dijkstra’s algorithm terminates and is correct if all weights in the graph are strictly positive.

The proof of correctness and termination for Dijkstra’s algorithm is more involved than those we have already seen for simpler traversal algorithms. We’re going to prove both at the same time.

Let’s assume a priority queue (here, a min-heap) for which all operations terminate and are correct. If the notion of a min-heap does not ring a bell, you should read this article first.
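For concreteness, here is a minimal sketch of the priority-queue variant that the proof below refers to, written with Python's heapq module. The exact implementation used in the course may differ; treat this only as one possible realization.

```python
import heapq
import math

def dijkstra_heap(graph, start):
    """Priority-queue variant: pairs (distance, vertex) are extracted
    in order of increasing distance, which is what the proof relies on.

    graph is assumed to be a dict {vertex: {neighbor: weight}},
    with strictly positive weights.
    """
    distances = {v: math.inf for v in graph}
    distances[start] = 0
    queue = [(0, start)]   # min-heap of (distance, vertex) pairs
    removed = set()        # vertices already removed from the queue

    while queue:
        distance, current = heapq.heappop(queue)
        if current in removed:
            continue       # each vertex is processed only once
        removed.add(current)

        for neighbor, weight in graph[current].items():
            new_distance = distance + weight
            if new_distance < distances[neighbor]:
                distances[neighbor] = new_distance
                heapq.heappush(queue, (new_distance, neighbor))

    return distances
```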

For the proof, we will proceed by induction on the following property: all the vertices that have been removed from the priority queue are such that their distances from the initial vertex have been computed and recorded, and the remaining vertices, if any, are all at a greater distance.

We note that this statement is trivially true on the first pass through the loop, when we remove the initial vertex.

Let us now assume this statement to be true at some point during the execution of the algorithm. There are two cases:

  • Either the priority queue is empty, in which case the algorithm stops and the property remains true;
  • Or a new pair (current vertex, distance) is extracted from the priority queue.

We want to show that the extracted distance is the distance of a shortest path from the initial vertex to the current vertex.

First, note that this distance is already the length of a path in the graph. Indeed, the distance was updated at the moment when some other vertex was removed earlier. By the induction hypothesis, that removal corresponded to a shortest path, and since the two vertices are neighbors, the updated quantity is the length of a path (that shortest path extended by one edge).

Next, it is necessarily the length of a shortest path. Indeed, consider a shortest path in the graph from the initial vertex to the current vertex, and call the vertex immediately before the current vertex along this path the previous vertex. By definition of a shortest path, the subpath from the initial vertex to the previous vertex is a shortest path to that vertex. This subpath is strictly shorter, because it contains one edge fewer (and the edges have strictly positive weights). By the induction hypothesis, the previous vertex has therefore already been removed from the priority queue. Consequently, the distance to the current vertex was either set to its minimum value at that moment, or was already at that value before (it could not be less, since it is the length of a path in the graph).

By induction, we conclude that the property is true at every stage of the algorithm.

To finish proving correctness and, at the same time, obtain termination, we still have to show that:

  1. Each vertex can only be removed once from the priority queue;
  2. Any reachable vertex is removed at least once.

For 1., this follows directly from the fact that the extracted distances increase, as established by the induction above.

For 2., we notice that any reachable vertex admits a shortest path, and we proceed by induction along this path.

Complexity

The complexity of the algorithm depends heavily on the priority queue used. One can show that a complexity of O(|E| + |V| ln(|V|)) can be attained when using a suitable data structure.

In the particular case of the maze we work on in PyRat, each vertex has at most 4 neighbors. Therefore, we obtain a complexity of O(|V| ln(|V|)), since |E| \leq 4|V|. As this complexity is less than quadratic, it is quite reasonable for our problem.
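Spelled out, the bound for the maze follows from this single inequality on the number of edges:

O(|E| + |V|\ln(|V|)) \subseteq O(4|V| + |V|\ln(|V|)) = O(|V|\ln(|V|)).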


To go further

  • Fibonacci heap: a data structure that improves the execution time of Dijkstra’s algorithm.

Quiz

When running Dijkstra's algorithm on a weighted graph from vertex u, which of the following conditions are (independently) sufficient to guarantee that Dijkstra's algorithm outputs the shortest paths from vertex u to the accessible vertices of the graph?
Which of the following properties best describes the functioning of Dijkstra's algorithm, starting from a vertex u?
Dijkstra's algorithm maintains two data structures throughout the process. What are they?