(based on Flavia Bonomo's slides)


An __algorithm__ is a sequence of steps that executes in finite time.
Algorithms must consist of simple steps that are __ordered__, __well
defined__ (meaning that every execution of the algorithm with the same
input parameters must yield the same results) and __finite__ (the number
of steps must be finite).

To define how __efficient__ an algorithm is we carry out a mathematical
analysis to determine how long its execution will take in relation to
the size of the input data. This is called __algorithmic
complexity__: it is a function that represents the execution time of
an algorithm in relation to the size of its input. We care mostly about
the __worst-case complexity__ and the __average-case complexity__.

In this document we will use __Big O notation__,
__Big-Ω (omega) notation__ and __Big-θ (theta) notation__. You
can learn more about them on the internet if you want.

As algorithm execution times tend to scale with their input, we say
that a problem is __properly solved__ if an algorithm of polynomial
complexity is known for it. __Polynomial algorithms__ are considered
satisfactory (the lower the degree, the better). Super-polynomial algorithms
are considered non-satisfactory.

There are many techniques that can be used to design algorithms. In this
document we'll just care about three: __Greedy Algorithms__,
__Backtracking__ and __Dynamic Programming__.

The idea behind greedy algorithms is to design an algorithm that at each step picks the best available alternative towards a solution without (or almost without) considering the implications of said choice.

These kinds of algorithms are easy to design, easy to implement and generally
efficient. However, they do not always work: some problems cannot be solved
this way. The technique can, nevertheless, be used to approximate (most of
the time) *good enough* solutions for these problems. The Knapsack
Problem is an example of these problems.
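
As an illustration (our own sketch, not from the original slides), here is a minimal greedy solution in Python for the *fractional* variant of the Knapsack Problem, where the greedy choice by value/weight ratio happens to be optimal; for the classic 0-1 variant the same strategy only approximates a solution. The function name and the item representation are assumptions of this sketch.

```python
def fractional_knapsack(items, capacity):
    """Greedy sketch: items is a list of (value, weight) pairs.

    At each step we take as much as possible of the item with the best
    value/weight ratio, without reconsidering earlier choices.
    """
    # Sort by value per unit of weight, best ratio first.
    items = sorted(items, key=lambda vw: vw[0] / vw[1], reverse=True)
    total = 0.0
    for value, weight in items:
        if capacity <= 0:
            break
        take = min(weight, capacity)   # take the whole item, or what fits
        total += value * (take / weight)
        capacity -= take
    return total

# Example: capacity 10, items as (value, weight) pairs.
print(fractional_knapsack([(60, 5), (50, 10), (140, 7)], 10))  # 176.0
```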

Backtracking is a technique used to traverse all possible configurations
of the solution space of a computational problem. Most of the time the
solution to a problem can be represented as an n-dimensional vector (say
*(a_{1}, a_{2}, ..., a_{n})*), with each component a_{i} taken from some
finite set of options.

At each step of a backtracking algorithm we take a step toward a possible solution in a tree-like way, going down one branch at a time until we either find the solution or we know this particular branch is not going to yield any valuable results.

If we realize that the branch we are exploring is not going to return a
valuable result, we proceed to __trim__ the branch. In layman's terms this
means that we stop exploring that branch and turn to the next one. By doing
this the number of possible solutions to check is greatly reduced. This is
what sets backtracking apart from other brute-force methods.

There are two types of trims: __optimality trims__, when we stop
exploring a branch because we know it is not going to yield an optimal result
(used in optimization problems),
and __feasibility trims__, when we stop exploring a branch because we
know it is not going to yield a valid result.

This technique is typically applied to optimization problems, where we want not just to find a solution to a problem, but to find the best one.
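
As a small illustration (our own sketch, not from the slides), the following Python code uses backtracking with a feasibility trim to decide whether some subset of positive numbers adds up to a target; the names `subset_sum` and `explore` are assumptions of this sketch.

```python
def subset_sum(numbers, target):
    """Backtracking sketch: does some subset of `numbers` sum to `target`?

    Assumes all numbers are positive. The partial solution is the vector
    of include/exclude decisions for the first i numbers.
    """
    def explore(i, remaining):
        if remaining == 0:
            return True       # found a valid solution
        if remaining < 0:
            return False      # feasibility trim: positive numbers can't recover
        if i == len(numbers):
            return False      # no decisions left to make on this branch
        # Branch 1: include numbers[i]; branch 2: exclude it.
        return explore(i + 1, remaining - numbers[i]) or explore(i + 1, remaining)

    return explore(0, target)

print(subset_sum([3, 34, 4, 12, 5, 2], 9))   # True: 4 + 5 = 9
```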

For a problem to be solvable using Dynamic Programming it must first
fulfill __Bellman's Optimality Principle__.
An optimization problem satisfies this principle if, whenever an optimal
solution is found, the sub-solution for each sub-problem is also an optimal
solution to the corresponding sub-problem.

In the Dynamic Programming technique the problem is divided into
sub-problems of smaller size that are easier to solve. These sub-problems
may also be divided into smaller sub-problems. When the sub-problems are
solved, their solutions are combined to generate a solution to the original
problem, just like in the Divide and Conquer technique.
Dynamic Programming, however, is not recursive and it's __bottom up__.

Dynamic Programming is used in optimization problems where we suspect that we'll have to repeat the same steps over and over again for a number of values. Once we solve the process for a particular value, we store the result so that, if we were required to calculate it again, we'd have it already solved for us. In many cases this yields a huge performance boost.
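
To make this concrete, here is a minimal bottom-up Python sketch (our own, not from the slides) for the 0-1 Knapsack Problem, which satisfies Bellman's principle: the best value for a given capacity is built from the stored best values of smaller sub-problems, and each entry is computed only once.

```python
def knapsack(items, capacity):
    """Bottom-up DP sketch: items is a list of (value, weight) pairs.

    best[c] holds the best value achievable with capacity c using the
    items considered so far; stored results are reused, never recomputed.
    """
    best = [0] * (capacity + 1)
    for value, weight in items:
        # Go from high to low capacity so each item is used at most once.
        for c in range(capacity, weight - 1, -1):
            best[c] = max(best[c], best[c - weight] + value)
    return best[capacity]

# 140: no pair of items fits together under capacity 10.
print(knapsack([(60, 5), (50, 10), (140, 7)], 10))
```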

A graph __G = (V, X)__ is a pair of sets. V is a set of __nodes__
or __vertices__ and X is a subset of V x V (pairs of elements of V).
The elements of X are called __edges__ or __arcs__.

The size of the set V is traditionally noted __n__ and the size of X
is noted __m__.

Given a graph G = (V, X) and two nodes v and w of G, if an edge e = (v, w)
exists in X we say that v and w are __adjacent__ and that e is
__incident__ to v and w.

According to Harary's book, in a __graph__ only one edge can exist
between two nodes (and thus be incident to
them) and a single node cannot be adjacent to itself. However, there are
other kinds of graphs: in a __multigraph__, multiple edges can exist
between the same pair of nodes, and in a __pseudograph__ not only
can many edges exist between the same pair of nodes, but a node can also
be adjacent to itself. In this document, unless explicitly noted, we will
always be talking about the first kind of graphs.

The __degree__ of a node v is the number of edges that are incident
to v. It is noted __d(v)__.

__Theorem:__

The sum of the degrees of all the nodes in a graph is equal to two times the number of edges.

sum_{i = 1 to n} d(v_{i}) = 2m

We say that a graph is __complete__ if all its nodes are adjacent
to every other node. We note these graphs as __K_{n}__.

A __path__ in a graph is a succession of edges e_{1},
e_{2}, ..., e_{k} so that one endpoint of e_{i} is the same node as an
endpoint of e_{i-1}, and the other endpoint of e_{i} is the same node
as an endpoint of e_{i+1}.

Examples of paths (in red)

A __simple path__ is a path that doesn't pass through the same node
twice. A __circuit__ is a path that starts and ends on the same node.
A __simple circuit__ is a circuit of 3 or more nodes that doesn't pass
through the same node twice.

The __length__ of a path is the number of edges in the path.
The __distance__ between two nodes v and w of a graph is the length
of the shortest path between v and w and is noted __d(v, w)__. Thus,
the distance between a node and itself, d(v, v), is 0. Also, if no path
exists between two nodes v and w, d(v, w) = ∞.

__Proposition:__

- If a path P between v and w has length d(v, w), P must be a simple path.
- For every pair of nodes v and w, d(v, w) >= 0.
- For every pair of nodes v and w, d(v, w) = 0 if and only if v = w.
- For every pair of nodes v and w, d(v, w) = d(w, v).
- For every trio of nodes u, v and w, d(u, w) <= d(u, v) + d(v, w)
*(triangular inequality)*.

A graph is __connected__ if a path exists between each pair of nodes.

Given a graph G = (V, X), a __subgraph__ of G is a graph H = (V', X')
so that V' is a subset of V and X' is a subset of X ∩ (V' x V') (this
means that the set of nodes of H is a subset of the set of nodes of G,
and the set of edges of H is a subset of the edges of G that join nodes
of H).

We say that a graph H, subgraph of G, is __induced__ if every pair
of nodes u, v of H are adjacent in H if and only if they are adjacent
in G.

A __connected component__ H of a graph G is a maximal, connected
subgraph of G (meaning that if we were to add any other node of G to H
it would stop being connected).

0-1-2 and 3-4 are connected components in the above graph.

We say that a graph G = (V, X) is __bipartite__ if a partition
V1, V2 of the set of nodes V exists so that V is the union of V1 and V2,
V1 and V2 don't share any nodes, neither V1 nor V2 is empty, and every
edge of G has one endpoint on a node of V1 and the other endpoint on a
node of V2.

Examples of bipartite graphs (V1 in red, V2 in black)

**Theorem:**

A graph with 2 or more nodes is bipartite if and only if it doesn't have simple circuits of odd length.

Two graphs G = (V, X) and G' = (V', X') are said to be __isomorphic__
if a bijective function f: V → V' exists so that for every v, w in V the
edge (v, w) exists in X if and only if the edge (f(v), f(w)) exists in X'.

A simpler way to imagine this is to say that a graph is isomorphic to another graph if - without disconnecting or reconnecting nodes - you can rearrange the nodes of the first graph to form the second one.

Example of an isomorphic graph

If two graphs G and G' are isomorphic, they obviously have the same number of nodes and edges; they also have the same number of nodes of degree k (for every k), the same number of connected components and the same number of simple paths of length j (for every j).

Sometimes it is useful to define matrices that capture certain properties
of graphs. An __adjacency matrix__ for a graph G is a matrix A belonging to
R^{n*n} where the element a_{ij} of A is set to 1 if
G has an edge that joins the nodes i and j and to 0 if not.

Likewise, the __incidence matrix__ for a graph G is a matrix
B belonging to R^{m*n} (note that it says m*n and not n*n) where
the element b_{ij} of B is set to 1 if the edge i is incident to
the node j or 0 if it is not.
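
As a quick illustration (our own sketch, not from the slides), the following Python code builds both matrices for a small graph given as an edge list; nodes are assumed to be numbered 0 to n - 1.

```python
def adjacency_matrix(n, edges):
    """n x n matrix: a[i][j] = 1 iff the edge (i, j) exists."""
    a = [[0] * n for _ in range(n)]
    for v, w in edges:
        a[v][w] = 1
        a[w][v] = 1   # undirected graph: the matrix is symmetric
    return a

def incidence_matrix(n, edges):
    """m x n matrix: b[i][j] = 1 iff edge i is incident to node j."""
    b = [[0] * n for _ in range(len(edges))]
    for i, (v, w) in enumerate(edges):
        b[i][v] = 1
        b[i][w] = 1
    return b

edges = [(0, 1), (1, 2)]           # the path 0-1-2
print(adjacency_matrix(3, edges))  # [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
print(incidence_matrix(3, edges))  # [[1, 1, 0], [0, 1, 1]]
```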

A __digraph__ (or __directed graph__) G = (V, X) is a graph where
X is a set of __ordered pairs__ of nodes of V. What this means is that
given two nodes v and w from V, if (v, w) exists in X then the graph has
an edge going from v to w; but this doesn't imply an edge going from w
to v, which exists only if (w, v) is also in X.

The edges of a digraph have a direction, while the edges on a graph do not.

The __in-degree__ of a node v of a digraph, noted d_{in}(v),
is the number of edges that go into v (the number of edges that have v as
their second element). Likewise, the __out-degree__ of a node v of a
digraph, noted d_{out}(v), is the number of edges that go out of v
(the number of edges that have v as their first element).
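
As a small illustration (our own sketch, not from the slides), the in- and out-degrees of a digraph given as an edge list can be counted like this:

```python
from collections import Counter

def degrees(edges):
    """Sketch: in- and out-degrees of a digraph given as (v, w) edge pairs."""
    d_out = Counter(v for v, w in edges)   # edges leaving each node
    d_in = Counter(w for v, w in edges)    # edges arriving at each node
    return d_in, d_out

d_in, d_out = degrees([(0, 1), (0, 2), (2, 1)])
print(d_in[1], d_out[0])   # 2 2: two edges arrive at 1, two leave 0
```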

Digraphs have paths too. A __directed path__ in a digraph is just
like a path in a graph, but every edge in the path must leave from the
node where the previous edge arrived.

Example of a directed path in a digraph (in blue)

Likewise, a __directed circuit__ is a directed path that starts and
ends on the same node.

A digraph is said to be __strongly connected__ if for every pair of
nodes u, v there is a directed path that goes from u to v and another one
that goes from v to u.

The __underlying graph__ of a digraph is the graph that results
from taking all the edges in the digraph and making them undirected (thus
turning it into a graph). If between two nodes there were two or more
edges, these are merged into one in the underlying graph.

A __tree__ is a connected graph without simple circuits.

**Theorem:**

Given a graph G = (V, X), the following statements are equivalent:

- G is a tree (a connected graph without simple circuits).
- G is a graph without simple circuits, but if we added a new edge e to G we'd end up with a graph with *exactly* one simple circuit, and that simple circuit would contain the edge e.
- There is exactly one simple path between each pair of nodes of G.
- G is connected, but if we were to remove any edge from it, G wouldn't be connected anymore.

From the above theorem we can deduce two other lemmas:

- If G = (V, X) is a connected graph and e is an edge of G, G without e is connected if and only if e belongs to a simple circuit of G.
- The union of two different simple paths between the same pair of nodes contains a simple circuit.

A __leaf__ is a node of degree 1. A __non-trivial tree__ is a tree
that has at least two leaves.

If G = (V, X) is a tree, then m = n - 1 (remember that m is |X| and n is |V|). This is true because there is an edge connecting each node to its 'parent', except for one node, that we'll call the 'root', which has no parent.

If G = (V, X) has no simple circuits and has c connected components, then m = n - c, since the root of each connected component has no parent.

If G = (V, X) has c connected components, then m >= n - c.

**Theorem:**

Given a graph G = (V, X), the following statements are equivalent:

- G is a tree (a connected graph without simple circuits).
- G is a graph without simple circuits and m = n - 1.
- G is connected and m = n - 1.

This theorem can be deduced trivially from the previous points.

A __directed tree__ G is a digraph such that it doesn't have any
cycles, its underlying graph is a tree, and there is a node r in G
from which a directed path to every other node exists.

Any node in a tree can be designated as the __root__ of the tree. A
tree doesn't have to have a root per se, though. If it has one, we say
that the tree is a __rooted tree__.

The __level__ of a node in a rooted tree is the distance from that
node to the root of the tree.

The __height__ of a rooted tree is the distance from the root to the
furthest node.

A rooted tree is said to be __m-ary__ if all its nodes except for the leaves
and the root have a degree of at most m + 1, and the root a degree of at
most m (because it has no parent).

A rooted tree is said to be __exactly m-ary__ if all its nodes except for the
leaves and the root have a degree of exactly m + 1, and the root a degree of
exactly m (because it has no parent).

A rooted tree of height h is __balanced__ if all its leaves are at level
h or h - 1.

A rooted tree of height h is __completely balanced__ if all its leaves
are at level h.

The internal nodes of a rooted tree are all those nodes that are neither leaves nor the root.

**Theorem:**

- An m-ary tree of height h has at most m^{h} leaves (because on every level the number of leaves is multiplied by at most m and, thus, a completely balanced m-ary tree of height h would have m^{h} leaves).
- An m-ary tree with l leaves has h >= ceil(log_{m} l) (for the same reason as point 1).
- If T is a balanced exactly m-ary tree, then h = ceil(log_{m} l) (for the same reason as point 1).

In order to traverse a tree, the __BFS__ (Breadth-First Search) and
__DFS__ (Depth-First Search) algorithms can be used. In particular, these
algorithms can be used in any graph, not only trees (even graphs with cycles).

The BFS (Breadth-First Search) algorithm takes a vertex, then travels to its neighbours, from there to the neighbours of those neighbours, and so on. Once all the nodes in this connected component have been visited, if there are still unvisited nodes we start again from any of those.

It's called *Breadth-First*
Search because if it were used on a tree and it were to start on the root of
said tree, it would travel first to the root, then to each node on the first
level, then to each node on the second level, etc, thus traversing each level
breadthways.

With some slight modifications it can be used to calculate the distance from one node to another, to find a node in a graph and to count the number of connected components of the graph.
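
Here is a minimal Python sketch of BFS computing distances from a start node (our own illustration, not from the slides; the graph is assumed to be given as an adjacency list):

```python
from collections import deque

def bfs_distances(graph, start):
    """BFS sketch: graph is an adjacency list {node: [neighbours]}.

    Returns d(start, v) for every node v reachable from start.
    """
    dist = {start: 0}
    queue = deque([start])
    while queue:
        v = queue.popleft()
        for w in graph[v]:
            if w not in dist:          # visit each node only once
                dist[w] = dist[v] + 1  # one level further from start
                queue.append(w)
    return dist

graph = {0: [1, 2], 1: [0, 3], 2: [0], 3: [1]}
print(bfs_distances(graph, 0))   # {0: 0, 1: 1, 2: 1, 3: 2}
```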

The DFS (Depth-First Search) algorithm takes a vertex, then travels to a neighbour of this vertex, then to a neighbour of this neighbour, and so on. When a node with no unvisited neighbours is reached, the algorithm backtracks to the most recently visited node that still has unvisited neighbours and continues from one of those. Once all the nodes in this connected component have been visited, if there are still unvisited nodes we start again from any of those.

It's called *Depth-First* search because if it were used on a tree and
it were to start on the root of said tree, it would travel from the root down
to a leaf, then to the sibling of that leaf, then to a parallel branch,
then to another branch, etc., thus travelling one branch at a time, always
going deep first.

With some slight modifications it can be used to calculate the distance from one node to another, to find a node in a graph and to count the number of connected components of the graph.
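
A minimal recursive Python sketch of DFS counting the connected components of a graph (our own illustration, using the same adjacency-list representation as the BFS sketch above):

```python
def count_components(graph):
    """DFS sketch: counts the connected components of an undirected graph.

    graph is an adjacency list {node: [neighbours]}.
    """
    visited = set()

    def dfs(v):
        visited.add(v)
        for w in graph[v]:
            if w not in visited:
                dfs(w)           # go deep before visiting siblings

    components = 0
    for v in graph:              # restart from every unvisited node
        if v not in visited:
            dfs(v)
            components += 1
    return components

graph = {0: [1, 2], 1: [0, 2], 2: [0, 1], 3: [4], 4: [3]}
print(count_components(graph))   # 2: {0, 1, 2} and {3, 4}
```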

DFS is the traversal algorithm typically used in backtracking algorithms, because it allows trimming certain branches of the tree.

It's convenient to store the graph using an adjacency list for use with these algorithms, so that "for each neighbour of the node w" can be executed in O(d(w)) time. This way, if all other operations are executed in constant time, the total complexity of the algorithms is O(n + m). The n comes from the sequential traversal that checks whether there are unvisited nodes left.

If an adjacency matrix is used instead, the complexity will be O(n^{2}),
because looking for all the neighbours of a node would take O(n).

A __Spanning Tree__ of a graph G is a subgraph
of G that is a tree and that has the same set of nodes as G.

Let T = (V, X) be a tree and l : X → R a function that assigns
lengths (or weights) to the edges of T. We define the __length__ of T
as l(T) = sum_{e in X} l(e). Thus the length of T is the sum
of the lengths of all the edges of T.

Given a graph G = (V, X), the __minimum spanning tree__ (MST) of G
(called T) is a spanning tree of G of minimum length. This means that of
all the possible spanning trees for G, it is the one with the smallest
sum of the lengths of its edges.

MST of a graph (in green)

To find the MST of a graph we can use __Prim's Algorithm__ and
__Kruskal's Algorithm__, both explained in the following sections.

Prim's algorithm is a greedy algorithm that given a connected graph G returns the minimum spanning tree of G.

To understand Prim's algorithm we must first carefully understand this lemma: let T = (V, Xt) be a spanning tree of G = (V, X), let e be an edge of G that doesn't exist in T (and thus is not in Xt), and let f in Xt be an edge of the cycle that appears when we add e to T. Then, the tree T' = (V, Xt ∪ {e} \ {f}) (T with the edge e minus the edge f) is a spanning tree of G. This is true because we add an edge to form a cycle and then remove an edge from that cycle, leaving us with a connected graph that is also a tree and has the same set of nodes as G, thus being a spanning tree of G.

The algorithm:

```
Input: G = (V, X), a connected graph with a function l : X → R
       (the length function) for its edges.

Prim's Algorithm:
    Vt = {u}    # u can be any node of G
    Xt = {}
    i = 1
    while i <= n - 1:
        pick e = (u, v) from X so that l(e) is minimum among the edges
            that have one endpoint u in Vt and the other endpoint v in V \ Vt
        Xt = Xt + {e}
        Vt = Vt + {v}
        i = i + 1
    T = (Vt, Xt)
    return T
```

A proof of correctness for Prim's algorithm can be found here.
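
For reference, here is a short runnable Python version of the pseudocode above (our own sketch, using a naive edge scan per step rather than the usual priority-queue optimisation; the names and the edge-dictionary representation are assumptions):

```python
def prim(nodes, edges):
    """Prim sketch: nodes is a set, edges a dict {(u, v): length}.

    Grows a tree from an arbitrary node, always adding the cheapest edge
    that leaves the tree. Assumes the graph is connected.
    """
    vt = {next(iter(nodes))}           # start from any node of G
    xt = set()
    while len(vt) < len(nodes):
        # Cheapest edge with exactly one endpoint inside the tree.
        e = min(((u, v) for (u, v) in edges
                 if (u in vt) != (v in vt)),
                key=lambda uv: edges[uv])
        xt.add(e)
        vt |= set(e)                   # bring the new endpoint in
    return xt

edges = {(0, 1): 1, (1, 2): 2, (0, 2): 3}
print(prim({0, 1, 2}, edges))          # the MST edges: (0, 1) and (1, 2)
```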

Kruskal's algorithm is a greedy algorithm that given a connected graph G returns the minimum spanning tree of G.

```
Input: G = (V, X), a connected graph with a function l : X → R
       (the length function) for its edges.

Kruskal's Algorithm:
    Xt = {}
    i = 1
    while i <= n - 1:
        pick e from X so that l(e) is minimum among the edges that
            don't form circuits with the edges already in Xt
        Xt = Xt + {e}
        i = i + 1
    T = (V, Xt)    # create the tree
    return T
```

A proof of correctness for Kruskal's algorithm can be found here.
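
And a short runnable Python version of Kruskal's pseudocode (our own sketch, with the same assumed representation as the Prim sketch above), using a union-find structure to detect whether an edge would close a circuit:

```python
def kruskal(nodes, edges):
    """Kruskal sketch: nodes is a set, edges a dict {(u, v): length}.

    Scans edges from shortest to longest, keeping an edge only if it
    joins two different components (i.e. it doesn't close a circuit).
    """
    parent = {v: v for v in nodes}     # union-find forest

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]   # path halving
            v = parent[v]
        return v

    xt = set()
    for (u, v) in sorted(edges, key=edges.get):
        ru, rv = find(u), find(v)
        if ru != rv:                   # the edge doesn't form a circuit
            parent[ru] = rv
            xt.add((u, v))
    return xt

edges = {(0, 1): 1, (1, 2): 2, (0, 2): 3}
print(kruskal({0, 1, 2}, edges))       # the MST edges: (0, 1) and (1, 2)
```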