In connection with graphs there are many notions; some of them are important already now. A neighbor of a node u is any node v for which there is an edge (u, v). In this case one also says that v is adjacent to u. The nodes u and v are called the endpoints of the edge. A path from u to w is a sequence of edges (u, v_1), (v_1, v_2), ..., (v_{k - 1}, w) connecting u with w. A path is called simple if all nodes on the path are different. A cycle is a path whose first and last nodes coincide (in the example this means that u == w) and which is otherwise simple. A graph without cycles is called acyclic. A tree is an acyclic and connected graph. By extension, a directed graph is called a tree when it is a tree considered as an undirected graph and all edges are directed towards or away from a specific node called the root. A forest is a set of trees.

The length of a path is the number of edges used; in the example the path has length k. The distance from u to v is the length of a shortest path between u and v. A graph is called connected if for any pair of nodes u, v there is a path running from u to v. For directed graphs one speaks of strongly connected if the direction of the edges is taken into account for these paths, otherwise of weakly connected. A connected component is a maximal subset of the nodes that is connected. For directed graphs one speaks of strongly (weakly) connected components; often these are also called strong (weak) components. A spanning tree is a tree that reaches all nodes of a connected graph. A spanning forest is a set of trees, one for each connected component of a graph.

The degree of a node u is the number of edges incident on u. For directed graphs it is customary to distinguish the indegree, the number of edges leading to u, and the outdegree, the number of edges starting in u. The degree of the graph is the maximum of the degrees of all its nodes.
If an edge (u, v) occurs more than once (that is, the "set" of edges is actually a multiset), then we say that there are multiple edges. A self-loop is an edge (u, u). A graph without multiple edges and self-loops is said to be simple, and it is common to assume that graphs are simple. The number of nodes of a graph is often denoted by n, the number of edges by m. Simple graphs have at most n * (n - 1) edges if they are directed and at most n * (n - 1) / 2 edges if they are undirected. If m = O(n) (as in road graphs), then the graph is said to be sparse. If m is much larger than linear, the graph is called dense. This notion is not precisely defined: sometimes it means m = omega(n), sometimes it means m = Omega(n^2).
A common way to represent graphs in the computer is by creating for each node a set of all its neighbors. This is a particularly good idea for sparse graphs. In general one might use linear lists for these sets. This is called the adjacency-list representation of the graph. Such an implementation requires O(n + m) memory. The strong feature of using adjacency lists is that it becomes very easy to determine all neighbors of a node. Using linked lists, it is also trivial to add or delete a particular edge. If all nodes have about the same degree, with maximum g, then it is much more convenient to represent the graph as an array of arrays of length g, storing for each node its degree as well. Even for lists of variable length an array may be used, marking for each node where in the array its neighbors begin. This implementation requires n + m + O(1) memory. The major disadvantage of adjacency lists is that it takes time proportional to the number of neighbors of node u to test whether an edge (u, v) exists or not.
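As an illustration, a minimal adjacency-list representation might look as follows in Java. This is only a sketch; the class and method names are illustrative, not part of any fixed interface:

```java
import java.util.ArrayList;
import java.util.List;

// Adjacency-list representation of a directed graph with n nodes.
class AdjListGraph {
    final int n;
    final List<List<Integer>> adj; // adj.get(u) holds the neighbors of u

    AdjListGraph(int n) {
        this.n = n;
        adj = new ArrayList<>();
        for (int v = 0; v < n; v++)
            adj.add(new ArrayList<>());
    }

    void addEdge(int u, int v) { adj.get(u).add(v); } // O(1)

    // Testing an edge takes time proportional to the degree of u.
    boolean hasEdge(int u, int v) { return adj.get(u).contains(v); }
}
```

Iterating over all neighbors of a node u is a simple loop over adj.get(u), which is exactly the strong point mentioned above.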
For dense graphs the most appropriate representation is an adjacency matrix: an n x n boolean matrix, where a 1 at position (u, v) indicates that there is an edge (u, v). If the graph is undirected, the adjacency matrix is symmetric. This representation requires n^2 bits of memory, independently of the number of edges m. Thus, for really large graphs this representation cannot be used. Furthermore, any operation that requires access to all neighbors of a node takes O(n) time. On the other hand, this is the representation to use if the existence of single edges must be tested frequently. Furthermore, for rather dense graphs of moderate size, storing n^2 bits may require less memory than storing n + m ints.
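A sketch of such a representation, storing the n^2 bits in a single BitSet (class name and layout are illustrative assumptions):

```java
import java.util.BitSet;

// Adjacency-matrix representation of a directed graph, one bit per pair.
class AdjMatrixGraph {
    final int n;
    final BitSet matrix; // bit u * n + v is set iff the edge (u, v) exists

    AdjMatrixGraph(int n) {
        this.n = n;
        matrix = new BitSet(n * n);
    }

    void addEdge(int u, int v) { matrix.set(u * n + v); }

    // Edge test in O(1) time: the strong point of this representation.
    boolean hasEdge(int u, int v) { return matrix.get(u * n + v); }
}
```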
In case no specific processing order of the nodes of a graph is required, the following non-recursive procedure might be the simplest and most efficient:
void traversal(int[] number)
{
    for (v = 0; v < n; v++)
        number[v] = -1; // Flag value
    counter = -1;
    Bag b = new Bag(n);
    for (r = 0; r < n; r++)
        if (number[r] == -1) // r is the root of a new component
        {
            counter++;
            number[r] = counter;
            b.add(r);
            while (b.notEmpty())
            {
                v = b.remove();
                for (each neighbor w of v)
                    if (number[w] == -1) // w has not been visited yet
                    {
                        counter++;
                        number[w] = counter;
                        b.add(w);
                    }
            }
        }
}
Clearly every node is numbered only once, because only nodes that were not visited before are assigned a value. Because counter is increased just before numbering a node and never decreased, all numbers are different. All nodes are added to the bag exactly once, and upon removal all their outgoing edges are considered. This means that the algorithm has running time O(n + m).
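The pseudocode assumes a bag data structure supporting add, remove, and notEmpty; any implementation with O(1) operations will do. A minimal array-backed sketch (the class name Bag comes from the pseudocode, everything else is an assumption of this sketch):

```java
// Minimal array-backed bag: add and remove in O(1) time.
// The removal order is unspecified; here it happens to be LIFO.
class Bag {
    private final int[] items;
    private int size = 0;

    Bag(int capacity) { items = new int[capacity]; }

    void add(int v) { items[size++] = v; }

    int remove() { return items[--size]; }

    boolean notEmpty() { return size > 0; }
}
```

Capacity n suffices in the traversal above because every node is added at most once.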
The produced tree will be directed towards the root: for each node the next node on the path to the root will be computed. The roots are characterized by the fact that they are pointing to themselves. Because for each node only one value has to be stored, independently of the degree of the tree, the whole tree can simply be maintained in an array:
void spanningForest(int[] parent)
{
    for (v = 0; v < n; v++)
        parent[v] = -1; // Flag value
    Bag b = new Bag(n);
    for (r = 0; r < n; r++)
        if (parent[r] == -1) // r is the root of a new component
        {
            parent[r] = r; // r points to itself
            b.add(r);
            while (b.notEmpty())
            {
                v = b.remove();
                for (each neighbor w of v)
                    if (parent[w] == -1) // w has not been visited yet
                    {
                        parent[w] = v; // w is reached from v
                        b.add(w);
                    }
            }
        }
}
The time consumption is clearly the same as before: O(n + m). However, for a formal correctness proof, the above argument should be refined. It should be shown that all nodes in a connected component belong to the same tree. So, assume there is a node w belonging to the same component as r, which is not reached during the traversal starting from r. Consider a path from r to w. Such a path exists because r and w belong to the same component. Let u be the last node on the path for which the parent-value has been set during the traversal starting from r, and let v be the next node on the path. u has been added to the bag upon setting its parent-value. Because the traversal only stops once the bag is empty, u also has been removed from it. Upon that occasion all neighbors of u, including v, are inspected and when they were not yet reached at that point of time, their parent-value is set. This contradicts the assumption that there is a node on the path from r to w which has not been reached during the traversal starting from r.
void bfsTraversal(int[] number)
{
    for (v = 0; v < n; v++)
        number[v] = -1; // Flag value
    counter = -1;
    Queue q = new Queue(n);
    for (r = 0; r < n; r++)
        if (number[r] == -1) // r is the root of a new component
        {
            counter++;
            number[r] = counter;
            q.enqueue(r);
            while (q.notEmpty())
            {
                v = q.dequeue();
                for (each neighbor w of v)
                    if (number[w] == -1) // w has not been visited yet
                    {
                        counter++;
                        number[w] = counter;
                        q.enqueue(w);
                    }
            }
        }
}
The algorithm is identical to the previous one, except that we use a queue instead of a bag. Depending on how the bag was implemented, this may have a strong impact on the produced numbering, but not on the complexity: all nodes are enqueued and dequeued exactly once, and upon dequeuing all their outgoing edges are considered. Thus, the complexity is O(n + m).
void dfsTraversal(int[] number)
{
    for (v = 0; v < n; v++)
        number[v] = -1; // Flag value
    counter = -1;
    Stack s = new Stack(m);
    for (r = 0; r < n; r++)
        if (number[r] == -1) // r is the root of a new component
        {
            s.push(r);
            while (s.notEmpty())
            {
                v = s.pop();
                if (number[v] == -1) // v has not been visited yet
                {
                    counter++;
                    number[v] = counter;
                    for (each neighbor w of v)
                        s.push(w);
                }
            }
        }
}
As before, this algorithm numbers each node only once, with different numbers from 0 to n - 1. Here the nodes are marked as visited only after they are popped from the stack. This implies that nodes may be pushed on the stack many times, and that the size of the stack may become as large as m (even 2 * m for undirected graphs). This is not such a large disadvantage: the graph itself requires just as much storage. If one nevertheless wants to prevent this, one should either apply the recursive algorithm, where the command "s.push(w)" is replaced by a conditional recursive call to the DFS procedure, or one should push, instead of just w, the pair consisting of v and the index of w within the adjacency list of v. When such an entry is popped, it is replaced by the entry for the next neighbor of v, so that each node occupies at most one stack entry at a time.
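The second variant can be sketched as follows. This is only an illustration of the idea, assuming the graph is given as an array of adjacency arrays; the two parallel stack arrays hold the node and the index of its next unexplored neighbor:

```java
// Memory-bounded iterative DFS: the stack holds (node, next-neighbor-index)
// pairs, so it never contains more than n entries.
class IterativeDfs {
    static int[] dfsNumbers(int[][] adj) {
        int n = adj.length;
        int[] number = new int[n];
        java.util.Arrays.fill(number, -1); // Flag value
        int counter = -1;
        int[] nodeStack = new int[n], indexStack = new int[n];
        for (int r = 0; r < n; r++) {
            if (number[r] != -1) continue;  // r already visited
            number[r] = ++counter;
            int top = 0;
            nodeStack[0] = r;
            indexStack[0] = 0;
            while (top >= 0) {
                int v = nodeStack[top];
                if (indexStack[top] == adj[v].length) { top--; continue; }
                int w = adj[v][indexStack[top]++];
                if (number[w] == -1) {      // first visit: number and descend
                    number[w] = ++counter;
                    top++;
                    nodeStack[top] = w;
                    indexStack[top] = 0;
                }
            }
        }
        return number;
    }
}
```

Each node is pushed at most once (it is numbered upon pushing), so the two stack arrays of length n indeed suffice.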
void dfs(int v, int* preCounter, int* postCounter, int[] preNumber, int[] postNumber)
{
    (*preCounter)++;
    preNumber[v] = *preCounter; // preorder number
    for (each neighbor w of v)
        if (preNumber[w] == -1)
            dfs(w, preCounter, postCounter, preNumber, postNumber);
    (*postCounter)++;
    postNumber[v] = *postCounter; // postorder number
}

void dfsTraversal(int* preNumber, int* postNumber)
{
    for (v = 0; v < n; v++)
        preNumber[v] = -1; // Flag value
    preCounter = postCounter = -1;
    for (r = 0; r < n; r++)
        if (preNumber[r] == -1)
            dfs(r, &preCounter, &postCounter, preNumber, postNumber);
}
Here we computed two types of numbers: preorder DFS numbers and postorder DFS numbers, depending on whether they are assigned before or after the recursion. The preorder numbers are the same numbers that were simply called "dfs numbers" before. The importance of computing both types of numbers will become clear soon.
The last code fragment is written in a C-like style. In Java, where integer parameters cannot be passed by reference, one should either make the counters global (ugly but efficient) or pass them in some kind of object. Note that objects of the class "Integer" are immutable and therefore unsuitable; a one-element array or a small mutable counter class works.
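One simple possibility, shown here only as an illustration (the class name and the use of one-element arrays as counter "cells" are assumptions of this sketch):

```java
// Emulating pass-by-reference for the counters with one-element int arrays.
class DfsNumbering {
    static void dfs(int v, int[][] adj, int[] preCounter, int[] postCounter,
                    int[] preNumber, int[] postNumber) {
        preNumber[v] = ++preCounter[0];   // preorder number
        for (int w : adj[v])
            if (preNumber[w] == -1)
                dfs(w, adj, preCounter, postCounter, preNumber, postNumber);
        postNumber[v] = ++postCounter[0]; // postorder number
    }
}
```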
With respect to a given spanning tree for which we have computed pre- and postorder numbers, an edge (u, v) can be classified in constant time as follows:

- preNumber[u] < preNumber[v] and postNumber[u] > postNumber[v]: u is an ancestor of v, so (u, v) is a tree or forward edge;
- preNumber[u] > preNumber[v] and postNumber[u] < postNumber[v]: v is an ancestor of u, so (u, v) is a backward edge;
- preNumber[u] > preNumber[v] and postNumber[u] > postNumber[v]: (u, v) is a cross edge leading from a later-visited subtree back to an earlier one;
- preNumber[u] < preNumber[v] and postNumber[u] < postNumber[v]: (u, v) is a cross edge leading forward; such edges cannot occur with respect to a DFS tree and are therefore called forbidden.
This classification does not distinguish tree edges from forward edges. If this matters, they can be distinguished by testing whether u == parent[v], where parent[v] gives the node from which v was reached. The classification makes it possible to test in O(n + m) time whether a given spanning tree of a graph G is a DFS tree of G: compute the pre- and postorder numbers and test whether there are forbidden cross edges.
For undirected graphs this gives a particularly easy test for cycles:
boolean isAcyclic()
{
    int[] parent = new int[n];
    for (v = 0; v < n; v++)
        parent[v] = -1; // Flag value
    Bag b = new Bag(n);
    for (r = 0; r < n; r++)
        if (parent[r] == -1) // r is the root of a new component
        {
            parent[r] = r;
            b.add(r);
            while (b.notEmpty())
            {
                v = b.remove();
                for (each neighbor w of v)
                    if (parent[w] == -1) // w has not been visited yet
                    {
                        parent[w] = v;
                        b.add(w);
                    }
                    else if (w != parent[v])
                        return false;
            }
        }
    return true;
}
The above algorithm is a minor modification of the spanning-forest algorithm and has running time O(n + m). On undirected graphs one must be careful not to detect "cycles of length two": an edge (u, v) followed by the same edge in the other direction (v, u). In the algorithm these cases are singled out by the test "w != parent[v]". The applied method is simple and efficient, but one may nevertheless wonder whether it is necessary to use an additional array of n ints. One of the exercises deals with this, and the conclusion is that an array of n booleans is sufficient as well.
For directed graphs, the above algorithm might detect false cycles: reaching a node that was reached earlier does not need to imply that a directed cycle has been closed. On undirected graphs any graph-traversal algorithm can be used, but on directed graphs it appears to be essential to process the nodes in DFS order. Doing this, cycles are closed only by backward tree edges, edges leading from a descendant v to an ancestor w. The ancestors of a node are easy to recognize: these are exactly those nodes which have already been reached by the traversal, but for which the search is not yet completed. This can be characterized with two booleans per node. More formally, during the search any node may be in three possible states:
- 0: unreached
- 1: reached but not finished
- 2: finished

So, the idea can also be implemented conveniently with a three-valued array:
boolean hasCycle(int v, byte[] status)
{
    status[v] = 1;
    for (each neighbor w of v)
        if (status[w] == 0) // Unvisited node
        {
            if (hasCycle(w, status)) // Cycle in subtree
                return true;
        }
        else if (status[w] == 1) // w is an ancestor of v
            return true;
    status[v] = 2;
    return false; // No cycles in any of the subtrees
}

boolean isAcyclic()
{
    byte[] status = new byte[n];
    for (v = 0; v < n; v++)
        status[v] = 0;
    for (r = 0; r < n; r++)
        if (status[r] == 0) // Unvisited node
            if (hasCycle(r, status)) // Cycle in subtree
                return false;
    return true; // No cycles in any of the subtrees
}
void connectedComponents(int[] component)
{
    for (v = 0; v < n; v++)
        component[v] = -1; // Flag value
    Bag b = new Bag(n);
    for (r = 0; r < n; r++)
        if (component[r] == -1) // r is the root of a new component
        {
            component[r] = r;
            b.add(r);
            while (b.notEmpty())
            {
                v = b.remove();
                for (each neighbor w of v)
                    if (component[w] == -1) // w has not been visited yet
                    {
                        component[w] = r;
                        b.add(w);
                    }
            }
        }
}

Hereafter, for all nodes v, component[v] gives the index of the "root" of the component. In this case, this index equals the smallest index of the nodes belonging to the component. The graph is connected if and only if component[v] == 0 for all v. Using a counter which is increased only when finding the root of a new component, the components can be numbered consecutively, which saves memory if the sizes of the connected components are to be stored in an int array size[].
Much more interesting and relevant is to determine the strong components. This problem is non-trivial: just running a graph traversal does not achieve anything, since reaching a node v from a node r does not mean that there is also a way back from v to r. At first it may even appear that finding the strong components is of an essentially harder nature than the problems discussed so far. This is not true: O(n + m) time is enough.
The algorithm consists of two rounds of DFS. For a graph G it proceeds as follows:

1. Perform a DFS on G, computing postorder numbers.
2. Construct the graph G' in which all edges of G are inverted.
3. Perform a second DFS on G', starting a new search from the unvisited node with the largest postorder number, as long as unvisited nodes remain. The nodes reached in each of these searches constitute one strong component.
Lemma: The nodes reached by a search starting from a node r during the second DFS performed on G' precisely give the strong component to which r belongs.
Proof: Denote the strong component to which r belongs by S_r. By definition of strong components, for any node u in S_r there is a path from u to r in G. Thus, there is also a path from r to u in G'. This implies that all of S_r is reached during the search starting from r. Now consider a node v which is reached during the search starting from r. Apparently there is a path in G' from r to v, and thus there is a path from v to r in G. It remains to prove that there is also a path from r to v in G. In the second DFS, r is considered before v, which implies that r has a larger postorder number than v. If we assume that during the first DFS v was reached before r, then we get a contradiction: we know that there is a path from v to r, so r would be reached before the search from v is completed, and thus the postorder number of v would be larger than that of r. So we may assume that during the first DFS r was reached before v. In that case r only gets a larger postorder number than v when v is a descendant of r in the DFS tree. That is, there must be a path from r to v.
Lemma: For an acyclic graph G, the postorder numbers of the graph with inverted edges G' constitute a topological sorting of G.
Proof: Assume there is an edge (u, v) in G. This implies that in G' there is an edge (v, u). So, if during the DFS on G', v is reached before u, then u will be reached as a descendant of v, either directly or indirectly, and therefore u will get a smaller postorder number than v, as it should be. If u is reached before v, then u gets a smaller postorder number than v unless v is reached as a descendant of u. However, this would mean that there is a path in G' from u to v. Together with the edge (v, u), this means that u and v lie on a common cycle. That is, G', and hence G, is not acyclic: a contradiction.
This immediately leads to an O(n + m) algorithm, even though inverting the graph is slightly technical if we are working with an array-based implementation of the adjacency lists. Furthermore, the inversion doubles the memory requirement.
The above idea immediately leads to an algorithm: as long as there are nodes left, select a node with indegree zero, assign it the next number, and remove it from the graph together with its outgoing edges.
It remains to find an efficient implementation of the above. The idea is to maintain for each node its current indegree in a separate array. The original values in this array can be computed in O(n + m) time. Then, in one pass through the array, all nodes with indegree zero are detected and entered in a bag. The nodes are taken out of the bag in arbitrary order. When removing an edge (u, v), the indegree of v is reduced by 1. In this way the whole algorithm has running time O(n + m). A nice feature of the algorithm is that it is not necessary to test beforehand whether the graph is acyclic: if the bag becomes empty before all nodes are numbered, then we know there is a cycle. If this does not happen, we know that the graph must have been acyclic.
boolean topologicalSort(int[] number)
{
    int[] degree = new int[n];
    for (v = 0; v < n; v++)
        degree[v] = 0;
    for (v = 0; v < n; v++)
        for (each neighbor w of v)
            degree[w]++;
    Bag b = new Bag(n);
    for (v = 0; v < n; v++)
        if (degree[v] == 0)
            b.add(v);
    counter = 0;
    while (b.notEmpty())
    {
        v = b.remove();
        number[v] = counter;
        counter++;
        for (each neighbor w of v)
        {
            degree[w]--;
            if (degree[w] == 0)
                b.add(w);
        }
    }
    return counter == n; // Graph is acyclic
}
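A directly runnable version of the same procedure might look as follows. The adjacency-array input format and the use of an ArrayDeque as the bag are assumptions of this sketch:

```java
import java.util.ArrayDeque;

// Indegree-based topological sort, as in the pseudocode above.
class TopoSort {
    // Fills number[] with a topological numbering; returns false on a cycle.
    static boolean topologicalSort(int[][] adj, int[] number) {
        int n = adj.length;
        int[] degree = new int[n];
        for (int v = 0; v < n; v++)
            for (int w : adj[v])
                degree[w]++;
        ArrayDeque<Integer> bag = new ArrayDeque<>();
        for (int v = 0; v < n; v++)
            if (degree[v] == 0)
                bag.add(v);
        int counter = 0;
        while (!bag.isEmpty()) {
            int v = bag.remove();
            number[v] = counter++;
            for (int w : adj[v])
                if (--degree[w] == 0)
                    bag.add(w);
        }
        return counter == n; // all nodes numbered iff the graph is acyclic
    }
}
```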
The algorithm is efficient with respect to both time and memory consumption.
void unweightedSSSP(int s, int[] dist)
{
    for (v = 0; v < n; v++)
        dist[v] = INFINITY;
    Queue q = new Queue(n);
    dist[s] = 0;
    q.enqueue(s);
    while (q.notEmpty())
    {
        v = q.dequeue();
        for (each neighbor w of v)
            if (dist[w] == INFINITY) // w has not been visited yet
            {
                dist[w] = dist[v] + 1;
                q.enqueue(w);
            }
    }
}

The algorithm is such that nodes v that are unreachable from s end with dist[v] = INFINITY, which appears to be reasonable.
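A runnable version of this BFS-based routine (the adjacency-array input format and the class name are assumptions of this sketch):

```java
import java.util.ArrayDeque;

// BFS-based single-source shortest paths in an unweighted directed graph.
class UnweightedSSSP {
    static final int INFINITY = Integer.MAX_VALUE;

    static int[] distances(int[][] adj, int s) {
        int n = adj.length;
        int[] dist = new int[n];
        java.util.Arrays.fill(dist, INFINITY);
        ArrayDeque<Integer> q = new ArrayDeque<>();
        dist[s] = 0;
        q.add(s);
        while (!q.isEmpty()) {
            int v = q.remove();
            for (int w : adj[v])
                if (dist[w] == INFINITY) { // w has not been visited yet
                    dist[w] = dist[v] + 1;
                    q.add(w);
                }
        }
        return dist; // unreachable nodes keep the value INFINITY
    }
}
```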
Here and in the following dist[u] denotes the distance values in the algorithm, while distance(u, v) denotes the real distance in the graph from node u to v. Because the length of the concatenation of two paths is the sum of the lengths of each path, the triangle inequality holds for all triples of nodes u, v and w:
distance(u, w) <= distance(u, v) + distance(v, w).
Using induction over the number of performed enqueue operations, it is easy to prove that at all times dist[v] >= distance(s, v) for all nodes v. Initially this is true. So, assume it is true after t enqueue operations. If in step t + 1 we set dist[w] = dist[v] + 1, then we may assume that dist[v] >= distance(s, v), and thus, because there is an edge (v, w) and hence distance(v, w) <= 1, the triangle inequality gives dist[w] = dist[v] + 1 >= distance(s, v) + distance(v, w) >= distance(s, w). Thus, the computed values are not too small. It remains to prove that they are not too large. This is not as easy as one might think.
Lemma: The values dist[v] of the enqueued (dequeued) nodes are monotonically increasing.
Proof: We use induction over the number of performed enqueue (dequeue) operations. Initially the claim holds because any sequence of length at most 1 is monotonic. So, assume the claim holds for the first t operations. Then in operation t + 1 we are enqueuing a node w with value dist[w] = dist[v] + 1. Because v is the latest dequeued node, we may assume that dist[v] >= dist[u] for any earlier dequeued node u. But then dist[w] = dist[v] + 1 >= dist[u] + 1, the value of an arbitrary node on the queue. Here we essentially use that in a queue nodes that were enqueued earlier are also dequeued earlier (the FIFO order).
Theorem: At the end of the algorithm dist[v] = distance(s, v) for all v.
Proof: It remained to show that the values are not too large. The proof goes by contradiction. Assume that dist[w] > distance(s, w) for some w, and assume that w is the node lying closest to s which gets assigned too large a value dist[w]. Let u be the last node before w on a shortest path from s to w. So, distance(s, w) = distance(s, u) + 1. Because u lies closer to s than w, we may assume that u gets assigned the correct value: dist[u] = distance(s, u). Let v be the node which was responsible for enqueuing w. This implies dist[w] = dist[v] + 1. So, dist[v] = dist[w] - 1 > distance(s, w) - 1 = distance(s, u) = dist[u]. Thus, according to the previous lemma, node u will be dequeued before node v. Thus, w should have been enqueued by u instead of v, and we should have dist[w] = dist[u] + 1. This is a contradiction, which can be traced back to our assumption that there is a node w with dist[w] > distance(s, w).
In this kind of proof it is very common to argue by contradiction, focusing on a supposed first occasion on which the algorithm makes a mistake. If there is no first mistake, then there is no mistake at all!

void weightedSSSP(int s, int[] dist)
{
    for (v = 0; v < n; v++)
        dist[v] = INFINITY;
    PriorityQueue pq = new PriorityQueue(n, INFINITY);
    dist[s] = 0;
    pq.decreaseKey(s, dist[s]);
    while (pq.notEmpty())
    {
        v = pq.deleteMin();
        for (each neighbor w of v)
            if (dist[w] > dist[v] + weight[v, w]) // shorter path
            {
                dist[w] = dist[v] + weight[v, w];
                pq.decreaseKey(w, dist[w]);
            }
    }
}
In order to get a formulation without case distinctions, initially all nodes are inserted into the priority queue with infinite key value. This works correctly even if some nodes are not reachable from s: for these nodes the key value remains unchanged throughout the algorithm. Once all reachable nodes have been deleted from the priority queue, the unreachable ones are deleted as well and processed just like reachable nodes, but certainly this does not lead to improved distance values. So, the existence of unreachable nodes only causes some useless work.
At most n elements are inserted into and deleted from the priority queue. At most m decreaseKey operations are performed. How much time this takes depends on the priority queue used. Using a binary heap, the construction can be performed in O(n) time and each decreaseKey and deleteMin in O(log n) time. So, with an ordinary heap Dijkstra's algorithm has running time O((n + m) * log n). Better priority queues (Fibonacci heaps) allow the decreaseKey operations to be performed in O(1) amortized time, reducing the running time to O(m + n * log n). Thus, for all m = Omega(n * log n), this is O(m), which is clearly optimal. Refined algorithms perform better for sparser graphs.
In the accompanying Java implementation the priority queue is realized in a primitive way using an array. This minimizes the memory consumption and allows a decrease-key operation to be performed in constant time, but makes deleteMins expensive: they take O(n) time. So, the running time of the whole algorithm is O(n^2 + m), which for all simple graphs is O(n^2). For dense graphs this is fine, but for all other graphs it is much better to use a priority queue with faster deleteMins. In general, implementing a decrease-key operation may require an additional search tree to find the nodes, but because in this particular case all node indices lie between 0 and n - 1, it is sufficient to maintain an extra array pos[] of length n, pos[u] giving the position of node u in the priority queue (one more application of direct addressing).
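A sketch of how pos[] can be kept consistent in a binary min-heap with a percolate-up decreaseKey; all class, field, and method names here are illustrative assumptions, not the accompanying implementation itself:

```java
// Binary min-heap over the nodes 0..n-1 with direct addressing:
// heap[i] holds the node at heap position i, pos[u] the position of node u.
class IndexedHeap {
    final int[] heap, pos, key;

    IndexedHeap(int n) {
        heap = new int[n];
        pos = new int[n];
        key = new int[n];
        for (int v = 0; v < n; v++) {
            heap[v] = v;
            pos[v] = v;
            key[v] = Integer.MAX_VALUE; // "infinite" initial key
        }
    }

    private void swap(int i, int j) {
        int t = heap[i]; heap[i] = heap[j]; heap[j] = t;
        pos[heap[i]] = i; // keep the direct addressing intact
        pos[heap[j]] = j;
    }

    // Lower the key of node u to x and percolate it up in O(log n) time.
    void decreaseKey(int u, int x) {
        key[u] = x;
        int i = pos[u];
        while (i > 0 && key[heap[(i - 1) / 2]] > key[heap[i]]) {
            swap(i, (i - 1) / 2);
            i = (i - 1) / 2;
        }
    }
}
```

The essential point is that every swap updates pos[] as well, so decreaseKey can find any node in constant time. A deleteMin would percolate down in the symmetric way.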
The proof of correctness of the algorithm is similar to the proof of correctness for the unweighted case. First one shows that the nodes that are dequeued have increasing distances.
Lemma: The values dist[v] of the dequeued nodes are monotonically increasing.

Proof: We use induction over the number of performed dequeue operations. Initially the claim holds because any sequence of length at most 1 is monotonic. So, assume the claim holds for the first t operations. Let w be the node we are dequeuing in step t + 1, and let u be any node dequeued before. At the time u was dequeued by deleteMin we must have had dist[u] <= dist[w]. So, consider possible updates to dist[w] after u was dequeued. Let v be the node which caused the latest update of dist[w]. In that case dist[w] = dist[v] + weight[v, w]. From our induction assumption it follows that dist[v] >= dist[u]. But then dist[w] = dist[v] + weight[v, w] >= dist[u] + weight[v, w] >= dist[u]. It is at this point that we essentially use that weight[v, w] >= 0; otherwise this lemma cannot be proven!
Theorem: At the end of the algorithm dist[v] = distance(s, v) for all v.
Proof: As before it can be shown that the values of dist[] at all times give an upper bound on the distances: as long as they are infinite, this is clear; once they have a finite value, the value corresponds to the length of an actual path. There may be shorter paths, but this path exists for sure. So, we may assume that dist[v] >= distance(s, v) for all nodes at all times.
Consider the node w lying closest to s, that is, having the smallest value of distance(s, w), which upon dequeuing has dist[w] > distance(s, w). If weight[v, w] = 0 and the shortest path from s to w runs through v, then we will nevertheless say that v lies closer to s than w. Let v be the last node before w on a shortest path from s to w. Thus, distance(s, w) = distance(s, v) + weight[v, w]. Because weight[v, w] >= 0, we have distance(s, v) <= distance(s, w), and thus v gets the correct value, that is, dist[v] = distance(s, v). So, dist[w] > distance(s, w) = distance(s, v) + weight[v, w] = dist[v] + weight[v, w] >= dist[v], and therefore, because of the previous lemma, v will be dequeued before w. But on that occasion the algorithm would have set dist[w] = dist[v] + weight[v, w] = distance(s, v) + weight[v, w] = distance(s, w), in contradiction with the assumption that dist[w] > distance(s, w).
In the following picture the action of Dijkstra's algorithm is illustrated. Edge weights are indicated along the edges, the current values of dist[] are indicated in the nodes. 99 stands for infinity. Nodes that have been removed from pq have final distance value. The (connected) subgraph with these nodes is marked.
void findShortestPathEdges(int[] dist, int[] parent)
{
    for (all nodes w)
        parent[w] = -1;
    for (all edges (v, w))
        if (dist[w] == dist[v] + weight[v, w])
            parent[w] = v;
}
The routine is given for directed graphs. For undirected graphs (depending on the input format) it may be necessary to consider an edge (v, w) both for w and for v. In the current version parent[w] may be set several times if there are several paths of the same length from s to w. This may be prevented by performing the assignment only when parent[w] == -1, but this does not make the routine faster.
This routine takes O(n + m) time independently of the type of graph, so these costs are negligible except for unweighted graphs. For weighted graphs it will certainly be more efficient to determine the edges by this separate procedure. But even for unweighted graphs it may be profitable to perform a second pass over all edges: this reduces the amount of data that is handled in a single subroutine, and may therefore make it possible to hold a larger fraction of the graph in cache.
Afterwards every node reachable from s has a unique predecessor, and the graph defined by parent[] is acyclic provided that there are no zero-cost cycles. So, the whole graph defined by parent[] constitutes a tree directed towards s, spanning all nodes reachable from s. In particular: independently of m, this representation of all shortest paths has size O(n). Once parent[] has been computed, queries of the form "give a shortest path from s to v" can be answered using O(n) time and memory: start at v, push all edges on the path from v to s on a stack, and print them while popping them.
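The query procedure just sketched might look as follows. This is only a sketch: the class and method names are illustrative, and parent[] is assumed to contain, for every node on the path other than s, its predecessor on a shortest path:

```java
import java.util.ArrayDeque;
import java.util.Deque;

class PathQuery {
    // Returns the nodes of a shortest path from s to v, reconstructed by
    // walking the parent[] pointers from v back towards s via a stack.
    static int[] shortestPath(int s, int v, int[] parent) {
        Deque<Integer> stack = new ArrayDeque<>();
        for (int u = v; u != s; u = parent[u])
            stack.push(u); // walk towards s, reversing the order
        stack.push(s);
        int[] path = new int[stack.size()];
        for (int i = 0; i < path.length; i++)
            path[i] = stack.pop();
        return path; // path[0] == s, path[path.length - 1] == v
    }
}
```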
It is not nice that the above algorithm does not handle zero-cost cycles correctly. This problem can most easily be overcome by an alternative algorithm which is only marginally slower than the given one: run a spanning-tree algorithm on the graph G' = (V, E'), where E' = {(v, w) in E| dist[w] == dist[v] + weight[v, w]}.
Give pseudocode realizing this operation. Distinguish four cases: an adjacency-matrix representation and a representation with adjacency lists, both for directed and undirected graphs. Indicate the time consumption for each of them. The time bounds should be given in terms of n, the number of nodes; m, the number of edges; d_u, the degree of u; d_v, the degree of v; and d_G, the current degree of the graph G.
For directed graphs it is hard to find the edges (w, v). But, choosing an appropriate graph representation requiring O(n + m) memory, a slightly more complicated algorithm can perform edge contraction efficiently: Points 1, 2 and 3 can be realized in O(d_v) time. Describe the necessary graph representation and the algorithm in full detail.
Point 4 is somewhat harder. Give a trivial realization requiring O(d_v * d_G) time. Organizing the adjacency structures in a suitable way, this can be improved to O(d_v * log(d_G)). Describe how this can be achieved. Instead of changing the organization of the adjacency structures, we can also use some additional data structure, so that we reasonably can expect to perform this operation in O(d_u + d_v) time. Describe how this can be achieved.
Now take the given heap implementation and complement it with a method decreaseKey(int i, int x) based on percolating up along the lines sketched in the text above. Replace the priority queue in the program testing Dijkstra's algorithm by this heap implementation.
Repeat the experiments for the same pairs of n and m values. Compare the times. Fit a suitable function through your results. Do the experiments conform with the theoretical analysis?
Actually the edges of T can be computed without explicitly constructing G'. Give an algorithm which, in addition to the memory required for storing G, essentially only requires memory for the array parent[].