A* algorithm
#programming #algorithms #graphs
A* is a goal-directed extension of Dijkstra's algorithm.
Abstract
Dijkstra wastes time by exploring uniformly in all directions. When the target is a single fixed node, much of that exploration is unnecessary.
A* improves on this by adding a kind of "directional intelligence".
Where Dijkstra's priority is solely the known cost from the source, A* adds to it a heuristic estimate of the remaining cost to the destination.
So the priority function looks something like $$f(n) = g(n) + h(n)$$where $g(n)$ is the known cost from the source to $n$, and $h(n)$ is a heuristic estimate of the cost from $n$ to the destination.
By minimizing their sum, we encode directional intelligence: nodes far from the destination receive a high priority value, shrinking the effective search space.
Therefore the main loop becomes something like

node = extract_min(queue)
for neighbour of node:
    g_new = g[node] + cost(node, neighbour)
    f_new = g_new + h(neighbour)
    if neighbour not visited OR g_new < g[neighbour]:
        g[neighbour] = g_new              # update best known cost
        parent[neighbour] = node          # relax the edge
        push(queue, (f_new, neighbour))   # re-queue with new priority
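The loop above can be sketched as a runnable function (graph representation and names are illustrative; the graph is an adjacency dict `{node: [(neighbour, cost), ...]}`):

```python
import heapq

def a_star(graph, source, goal, h):
    """A* search. h(node) -> heuristic estimate of cost to goal.
    Returns (cost, path) or (None, []) if the goal is unreachable."""
    g = {source: 0}                    # best known cost from source
    parent = {source: None}
    queue = [(h(source), source)]      # priority = f(n) = g(n) + h(n)
    visited = set()
    while queue:
        f, node = heapq.heappop(queue)  # extract_min(queue)
        if node == goal:                # first pop of goal is optimal
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return g[goal], path[::-1]
        if node in visited:             # stale queue entry, skip
            continue
        visited.add(node)
        for neighbour, cost in graph[node]:
            g_new = g[node] + cost
            # relax only when a strictly better path is found
            if neighbour not in g or g_new < g[neighbour]:
                g[neighbour] = g_new
                parent[neighbour] = node
                heapq.heappush(queue, (g_new + h(neighbour), neighbour))
    return None, []
```

Note the lazy-deletion pattern: instead of a decrease-key operation, stale entries stay in the heap and are skipped when popped.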
The relaxation logic (the decision on whether to relax or not) is still made based on $g$ alone, the actual cost from the source, exactly as in Dijkstra; the heuristic only changes the order in which nodes are popped from the queue.
Critical properties of the heuristic
Admissibility $$h(n) \leq \text{true cost from } n \text{ to goal}$$
This means that the heuristic never overestimates. You will rarely know the true cost, but your heuristic must never exceed it.
This is needed because admissibility ensures that A* returns an optimal path.
Why does admissibility guarantee optimality? Sketch: suppose A* pops the goal with a suboptimal cost $C > C^*$. Some node $n$ on an optimal path is still in the queue, and by admissibility $f(n) = g(n) + h(n) \leq C^* < C$, so $n$ would have been popped before the goal, a contradiction.
Consistency $$h(n) \leq \text{cost}(n \to m) + h(m)$$for every edge $n \to m$
This is the triangle inequality applied along each edge.
Consistency guarantees that nodes never need to be reprocessed. Without consistency, a node's $g$ value may improve after the node has been popped, so we could not finalize a node the first time we extract it and might have to expand it several times.
Why does consistency guarantee that nodes do not have to be reprocessed? Sketch: consistency makes $f$ non-decreasing along any path, since for an edge $n \to m$ we have $f(m) = g(n) + \text{cost}(n \to m) + h(m) \geq g(n) + h(n) = f(n)$. So when a node is popped, its $g$ value is already minimal, exactly as in Dijkstra.
Example intuition
Think of it like this
Dijkstra : Explores outwards in a circle
A* with Manhattan distance : Explores towards the goal in a cone.
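The circle-vs-cone picture can be checked empirically. The sketch below (function name and grid size are illustrative) counts nodes expanded on an obstacle-free unit-cost grid, once with $h = 0$ (Dijkstra) and once with the Manhattan distance:

```python
import heapq

def expanded_nodes(size, source, goal, h):
    """Count nodes an A*-style search pops on a size x size
    4-connected grid with unit edge costs before reaching goal.
    With h = 0 this is exactly Dijkstra."""
    g = {source: 0}
    queue = [(h(source), source)]
    visited = set()
    while queue:
        _, node = heapq.heappop(queue)
        if node in visited:
            continue
        visited.add(node)
        if node == goal:
            return len(visited)
        x, y = node
        for nxt in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if 0 <= nxt[0] < size and 0 <= nxt[1] < size:
                g_new = g[node] + 1
                if nxt not in g or g_new < g[nxt]:
                    g[nxt] = g_new
                    heapq.heappush(queue, (g_new + h(nxt), nxt))
    return len(visited)

goal = (20, 20)
manhattan = lambda n: abs(n[0] - goal[0]) + abs(n[1] - goal[1])
dijkstra_count = expanded_nodes(41, (0, 0), goal, lambda n: 0)  # "circle"
astar_count = expanded_nodes(41, (0, 0), goal, manhattan)       # "cone"
```

On this grid, Dijkstra expands every node closer than the goal, while the Manhattan heuristic keeps the search inside the rectangle between source and goal, so `astar_count` comes out far smaller.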
Time complexity analysis
The time complexity of A* depends entirely on the quality of the heuristic. Three cases:
Zero heuristic $$h(n) = 0$$
This is painfully obvious, and in this case A* collapses to Dijkstra, with a runtime complexity of $$O(E\log V)$$
Ideal heuristic
The ideal heuristic is the true remaining cost, $$h(n) = \text{true cost from } n \text{ to goal}$$
In the face of this heuristic, the time complexity becomes $$O(d)$$where $d$ is the number of edges on the optimal path.
This is because every node on the optimal path has $f(n)$ equal to the optimal cost, while every node off it has a strictly larger $f$, so the queue always surfaces the next node on the path.
So the search degenerates into an effective linked-list traversal of the optimal path.
Consistent heuristic
In this case, we can think of A* as a Dijkstra search on a modified graph, where the heuristic reweights the edges.
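Concretely, this is the standard reweighting identity (writing $w$ for the original edge costs): $$w'(n, m) = w(n, m) + h(m) - h(n)$$Consistency is exactly the statement that $w'(n, m) \geq 0$, so Dijkstra remains valid on the reweighted graph. The cost of every source-to-goal path shifts by the same constant $h(\text{goal}) - h(\text{source})$, so shortest paths are preserved, and Dijkstra under $w'$ pops nodes in exactly the order A* pops them under $f = g + h$.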
In this case, the worst-case time complexity is also $$O(E\log V)$$since consistent A* is just Dijkstra on the reweighted graph, but this bound says little in practice.
Instead we characterize the work by the size of the search space.
Here we talk about the effective branching factor $b^*$: the branching factor a uniform tree of the solution depth would need in order to contain all the nodes actually expanded.
- Bad heuristic : $b^*$ approaches the true branching factor of the graph.
- Good heuristic : $b^*$ approaches 1.
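One standard way to compute $b^*$ numerically (function name and bisection approach are illustrative): given $N$ expanded nodes and a solution at depth $d$, solve $N = 1 + b^* + (b^*)^2 + \dots + (b^*)^d$ for $b^*$.

```python
def effective_branching_factor(n_expanded, depth, tol=1e-6):
    """Solve n_expanded = 1 + b + b^2 + ... + b^depth for b by bisection."""
    def total(b):
        # size of a uniform tree with branching factor b and given depth
        return sum(b ** i for i in range(depth + 1))
    lo, hi = 1.0, float(n_expanded)  # total() is increasing in b on [1, n]
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if total(mid) < n_expanded:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2
```

For example, expanding 7 nodes for a depth-2 solution gives $b^* = 2$ (since $1 + 2 + 4 = 7$), while a perfect heuristic expands exactly $d + 1$ nodes and gives $b^* = 1$.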