NGraph

a simple (Network) Graph library in C++

Introduction

NGraph is a simple, small, C++ class library for the analysis of network graphs (i.e. those of graph theory, consisting of vertices and edges). The library provides functionality for creating and accessing graphs in a convenient way, with little overhead or complicated paradgims.

Although a mathematical Graph can consist of any enumerated type (cities, colors, people, etc.) it is convenient to think of the "nodes" as non-negative integers {0, 1, 2, ... } with "edges" simply being a pair of nodes. The number of edges in a Graph is often referred to as its size , while the number of nodes comprise its order .

This first version of NGraph implements a simple directed graph (which can also be used as the basis of a undirected graph by storing only edges (i,j) where i < j) without multiple edges or loops (edges from a vertex to itself). These conventions are not as severly limiting as they sound, as there are many simple workarounds to accomodate them.

Finally, it should be noted that NGraph is not a replacement or a substitute for more elaborate graph packages, like the Boost Graph Library (BGL), or the LEDA graph library. These well-designed packages are much more sophisticated and complex. NGraph, on the other hand, is a tiny library (several hundred lines of code) meant to demonstrate how one can harness C++ components (like sets from Standard Template Library) to create useful data structures with just a few commands.

Examples

To create a Graph, we call the insert_edge() function to build its edges:


    using namespace NGraph;

    Graph A;
    A.insert_edge(3,4);
    A.insert_edge(4,5);
    ...
    A.insert_edge(0,1);

Note that because this is not a multi-graph (i.e. one that allows repeated edges) inserting that same edge twice has no effect.

At any time, we can see find out a graph's order and size by calling the num_vertices() and num_edges(). For example,


    std::cout << "Graph has " << A.num_vertices() << " vertices and "
        << A.num_edges() << " edges.\n";

One can remove edges by calling remove_edge(), e.g.


    A.remove_edge(4,5);

If the edge is not present in the graph, this operation has no effect. We can repeatedly remove edges from a graph until it is empty (i.e. its order is zero). We can also check to see if a given edge is present by calling contains_edge(), which returns a boolean value true if the edge exsits, and false otherwise. Likewise, we can remove vertices with the remove_vertex() function and it will also automatically remove all edges from the graph containing that vertex as well.

Now that we have built a graph, what can one do with it? Well, each vertex can have a number of edges going out (out-edges) and well as coming in (in-edges). The total number of lines connected to a vertex is referred to as tis degree. Accordingly, the out_degree and in-degree tells us how many of these edges point inward, or outward. For exmaple,


 Graph::vertex a = 4;    // a more meaningful type than "unsigned int" 
 Graph::vertex b = 6;    //   
 A.degree(a);            // return the degree at vertex a
 A.in_degree(a);         // return the number of vertices that point to this one

 Graph::vertex_set S = A.out_neighbors(a);   / return vertices that point to this one

 A.subgraph(S);          // return the graph whose edge include all vertices in S

 A + B;                  // return the union of two graphs

 cout << A ;             // print out graph A onto standard output stream

Iterating through Graphs

The above functionality is fine for very simple operations with graphs. In theory, one could get by with these simple primitives. But in practice, more sophisticated graph algorithms require to be able to traverse a graph quickly, be able to access in-coming and out-going edges with each vertex.

The simplest graph structure is to maintain a list of the graph's edges as an (i,j) pair. (This is referred to as coordinate list.) While universal, it does provide an efficient scheme for finding vertex neighbors --one would need to scan the whole list to compute this.

Instead, modified graph structures are often implemented which store the graph data in a different way to better optimize such operations. A common method is an adjacency list, which just a list of neighbors for each vertex. Another approach would be to use some sort of hashing function, or clustering mechanism to optmize storage and/or accessing cost. Whatever the method used, we would like to able to use the Graph object in the same way. That is, we wouldn't want our application code to be completely dependent on the implementation details of the Graph library. In fact, we would like to keep this issue aside, so we can try several alternative strategies and potentially choose the best one for our case.

We need then, some methodology to walk (or traverse) through the graph in a way that does make any asummptions about its underlying structure. The Standard Template Library (STL) faced this very same problem and solve it quite elegantly with the concept of iterators. These pointer-like entities allow one to traverse a container without making many assumptions about the underlying implementation. A.begin() refers to the first node, and A.end() refers to the tail end of the container, while the ++ operator allows one to jump from one element to the next. We give an example below.


   //  for each node in the graph...
   //
   for ( Graph::const_iterator p = A.begin(); p != A.end(); p++)
   {
        int do = out_degree(p);
        int di = in_degree(p);

        // identify its neighbors
        //
        Graph::vertex_set Si = in_neighbors(p);
        Graph::vertex_set So = out_neighbors(p);

        // print out each out-going edge
        //
        for (Graph::vertex_set::const_iterator t = Si.begin(); t !=Si.end(); t++)
        {
            cout << *t << "\n";
        }
        cout << "\n";
   }

Another example: computing clustering coefficients

The cluster coefficient of a vertex gives an indication of how well a group of neighbors are connected. In a social netowrk graph where the nodes are people and edges represent friendships, this number determines the probability that any two of your friends will know each other. Mathematically, it is the ratio of the number of edges formed by the subgraph of the neighboring divided by the total number of possible edges in its complete graph (i.e. where everyone know each other). This gives a real number somewhere between 0.0 and 1.0, with the higher the number, the more "friendly" the vertcies seem to be. We can compute the cluserting coefficient of node "a" as follows. First, we find out the neighboring verticies (the group of friends), then form a subgraph of this group with the graph. Lastly, the size of a complete graph is N*(N-1)/2, where N is the order, so we can write


double cluster_coeff(const Graph &A, const Graph::vertex &a)
{
   Graph::vertex_set &group_of_friends = A.out_neighbors(a);
   double  N = group_of_friends.size(); 

    return   A.subgraph(group_of_friends).size() / (N * (N-1));
}

(Note that we made the variable "N" a double so that the division expression would be performed as a floating point calculation, rather than integer one. Also, as we visit each node we count each edge twice, so we need to cut the numerator in half. This cancels with halving the denominator, so there is no "2" in computation.)

Try some other examples

There are other examples in the source distribtution, along with some documented code, but this short tutorial should get you started. The best way to learn the capabilities of Ngraph is to play around with it!

Detailed Doxygen doucmentation, and another tutorial to follow.

This software was developed at the National Institute of Standards and Technology (NIST) by employees of the Federal Government in the course of their official duties. Pursuant to title 17 Section 105 of the United States Code this software is not subject to copyright protection and is in the public domain. NIST assumes no responsibility whatsoever for its use by other parties, and makes no guarantees, expressed or implied, about its quality, reliability, or any other characteristic.

We would appreciate acknowledgement if the software is incorporated in redistributable libraries or applications.

Download

NGraph: Network Graph Library
- Source code: ngraph_2_5.zip (9 KB)

Roldan Pozo