Bruteforce and backtracking to solve NP-complete problems
Table of Contents
Hi all! Welcome back to this course on algorithmics and discrete mathematics.
In today’s lesson, we’re going to talk about solving NP-Complete problems with two algorithms: an exhaustive one, and backtracking. At the end of the video, you should be able to implement these two algorithms to solve the traveling salesman problem.
Computing a solution
Classicaly, NP-complete problems are tackled either through:
« exact algorithms », which are reasonably fast for small problems,
or, « sub-optimal, or heuristic algorithms », which deliver good-enough solutions, but which are not necessarily optimal.
Today, let’s explore two exact algorithms to solve the traveling salesman problem, or TSP.
Brute force algorithm
The most direct solution to the TSP is to try all possible elementary paths, or, permutations of the cities, and see which one is the cheapest. This is known as a brute force, or exhaustive search.
If the starting city is fixed, the number of such permutations equals . Consequently, the calculation time, or the time it takes for the computer to solve the problem with this approach, lies within a polynomial factor of .
The starting point for the traveling salesman is vertex , and imagine for the sake of simplicity that he does not have to return to in the end. The number of solutions to explore is therefore if is the number of cities.
Trying all possible solutions starting from can be done using a depth-first search, during which we keep in memory the shortest possible path encountered.
Algorithm
best = infinity
def bruteforce(remaining, vertex, path, weight, graph):
if remaining is empty:
if weight < best:
best = weight
best_path = path
else:
for each vertex i in remaining:
bruteforce(remaining - i, i, path + i, weight + graph[vertex][i],graph)
Backtracking solution
A quick improvement of the previous algorithm is to stop the depth search when it’s certain that the currently explored branch will not generate a better solution than the current best one.
This means that the branch that’s explored at each step has an influence on the overall calculation time.
Various strategies can be used to further optimize backtracking, by preselecting which branches to examine first. As an example, we could select the branch whose starting vertex is the closest to the current vertex.
Example
So, the backtracking solution has a much lower complexity than the brute force approach.
Concluding words
OK, that’s it for today! I hope you’ve understood how to obtain an exact solution of the TSP using a brute force or exhaustive search, and how we can solve the TSP faster using backtracking.
In the next lesson, Patrick will tell you how to identify the complexity of a problem.
The branch-and-bound algorithm
A third algorithm for solving exactly the TSP is the branch-and-bound algorithm.
The branch-and-bound algorithm explores the tree of possibilities (in the case of the traveling salesman, the tree of possible paths). To do this, it uses a Dijkstra algorithm. At each vertex it encounters, it tries to estimate an interval (as narrow as possible) estimating the length of a shorter path obtained from the branch of the tree where it currently is.
To obtain these intervals, there are several techniques.
To obtain an upper bound, it is sufficient to use a greedy algorithm, that estimates the length of the remaining path by always going to the closest vertex. This results in a path that is not necessarily optimal, but which has a length that can only be longer than that of a shorter path.
To obtain a lower bound, we can for example count the number of vertices not yet visited (let’s call this number ). In the best case, they can be visited by going through the edges with the lowest weights. The sum of these weights gives the lower bound. This reduction is often coarse, and it is better to look for the finest.
When the algorithm has determined several intervals (one per branch to explore), it can compare them to each other. In the case of searching for a path of minimum length, if the interval of a branch has as its upper bound a value smaller than the lower bound of an interval of another branch , then the sub-tree corresponding to is eliminated. Indeed, there is no point in exploring the sub-tree corresponding to the interval knowing that the best solution found will in any case be less interesting than the worst ones by continuing on branch .
Complexity
The complexity of the algorithm is in the worst case the same as an exhaustive search (if the estimation of the limits does not cost too much) this is for example the case with a complete graph in which all the edges have the same weight.
In practice, having good accuracy in estimating intervals can save a lot of time. An optimal case is one of linear complexity.
Precision-performance compromise
As in the case of the approximate algorithms that we will study in Episode 5, interval estimation poses the problem of the precision-performance trade-off. Indeed, the narrower the interval, the more it will allow parts of the tree to be pruned. However, if its estimation requires an expensive calculation, it has an impact on total complexity.
It is therefore necessary to find estimates of the bounds of the intervals that offer a good compromise between computation time and accuracy.