Contents

Introducing Graphs

Graph is a mathematical object. Well, it sounds too complicated... :)

Let us start from something simpler. Have a look at the following picture:

A B C D E F G

Here you see 7 circles depicting nodes (or vertices) linked by edges. That is all! The graph is just the heap of nodes and edges. The particular placement of rounds and lines is not important - lines could be drawn curving for example.

So the graph is completely specified by the following data:

For example, the graph above could be specified as:

nodes:
A, B, C, D, E, F, G

edges:
A-B, B-C, C-D, D-E, E-B, B-F, F-G

Popular examples which could be represented by graphs are the:

Optionally graph can have additional information, for example:

  1. Edges could have "weights" (lengths of roads, time of signal propagation between routers etc.)
  2. Vertices can have weights too (though this could be transformed into weights of the edges)
  3. Edges could be "directed", like one-way road between cities

Terminology

Graph representation in programming

Shortly speaking, there are many ways. A lot of ways. Let us regard only few most popular - keep in mind that based on these examples you can invent other variants using common data structures.

Incidence Matrix

For graph with N vertices we can use the matrix N * N. The j-th cell in the i-th row will contain a number - the weight of the edge from vertex i to vertex j. The only question is how to represent the absent edges - usually some special value (or simply extremely large weight) could be used. For example our graph will look like:

   A B C D E F G

A  - 1 - - - - -
B  1 - 1 - 1 1 -
C  - 1 - 1 - - -
D  - - 1 - 1 - -
E  - 1 - 1 - - -
F  - 1 - - - - 1
G  - - - - - 1 -

As you see, all edges have marks in two cells (from i to j and from j to i) because they are undirected. All weights are 1 for our sample graph.

This variant was popular with languages like C or Pascal not having more efficient data structures than arrays. Good feature is that the edge between given vertex could be found in one operation. However it is significantly space-consuming (the web graph could not be represented in such way for sure).

Incidence Lists

To save the space we can store for each vertex only list of its existent edges. These lists could be grouped into an array (so that we can found a list by the vertex number etc.) Here is the example for our graph:

A: B               g['a'] = [('b', 1)]
B: A, C, E, F      g['b'] = [('a', 1), ('c', 1), ('e', 1), ('f', 1)]
C: B, D            g['c'] = [('b', 1), ('d', 1)]
D: C, E            g['d'] = [('c', 1), ('e', 1)]
E: B, D            g['e'] = [('b', 1), ('d', 1)]
F: B, G            g['f'] = [('b', 1), ('g', 1)]
G: F               g['a'] = [('f', 1)]

On the left we used simplified notation, without edge weights (because our example do not have them - or rather they are all equal) - however in general case we need to store the list of edges, each of which contains both the target vertex and the weight - like it is shown on the right, in Python-like syntax.

You may note that if there are no duplicating edges, we can use dictionaries (mapping target vertex name to weight) instead of lists. However with dictionaries we can do better.

Using Dictionary

These specific data structures are called "dictionaries" in Python or "hash maps" in Java and C++. They are also known as "associative arrays" - in languages like PHP and JavaScript all arrays are associative.

They work like ordinary array, but can use strings and other (constant) objects for indexes. So we can, for example, create a dictionary of edges, using the names of pair of vertices as the key (or index) and the weight of the edge as the value. Here is an example with our graph:

g['ab'] = 1
g['ba'] = 1
g['bc'] = 1
g['be'] = 1
    ...
g['gf'] = 1

This approach is really a space-efficient version of the incidence matrix.