categories: math
tags: math clustering coefficient graph theory

Clustering Coefficient

BY Jonathan Paek

PUBLISHED March 20, 2023
 

Described below is the local clustering coefficient, network average clustering coefficient, and global clustering coefficient. This is my abridged summary after reviewing the content indicated in the reference(s) section.

Local Clustering Coefficient

The local clustering coefficient is a metric defined for a node,inode, i and calculated using count of links within the set of neighboring nodes, NN, for a node at vertex, i. It is a ratio of actual count of links, L, that exist within the neighboring node from the total possible links.

Ci=L(k)(k1)/2  =2L(k)(k1) C_i = \frac{L} {(k)(k-1)/2} \; = \frac{2L} {(k)(k-1)}

If you are wondering here about the 2, this is for the undirected graph and occurred during simplification of the total potential links which was divided by half to remove duplicate counts and simplified by moving to the numerator. Here, kk is the number of neighboring nodes N for our undirected graph. So, for a directed graph we have

Ci=Ltotal  =L(k)(k1) C_i = \frac{L} {total} \; = \frac{L} {(k)(k-1)}

Network Average Clustering Coefficient

The average clustering coefficient can by found by simply averaging all CiC_i values found in the local clustering coefficient.

Global Clustering Coefficient

The global clustering coefficient is also a ratio and using a similar methodology earlier, instead uses counts based on a formation of triplets formed by a node. What is a triplet? Here a triplet is defined created by 3 nodes which can be of type with 3 edges (closed triplet, a triangle), or with 2 edges (open, one which would form a triangle but a single edge removed). Keep in mind here that if only a single edge exists from these 3 nodes, this would not be defined as a triplet.

An important additional note here is that if a closed triplet is found then there are 3 closed tripets for the set of these 3 nodes which are counted from rotations of the triangle and which overlap on top of each other.

Then the global clustering coefficient for the graph, GG, is the ratio of counts of closed tripets and all triplets (both open and closed).

C=kclosedktriplets C = \frac {k_{closed}} {k_{triplets}}

The global clustering coefficient is called the transitivity.

So for a more intuitive form of calculating the transitivity, TT:

T=3trianglestripletsT = \frac{3 * triangles}{triplets}

Computation with Triangles

Using our knowledge of above there is also a relationship with the "triangles".

For example we can determine the local coefficient using the count of triangles rather than the count of edges. A proof of such equivlanece is available in [1]. Instead of discussion of this proof let's instead show below the computation using an adjacency matrix.

The triangles can be counted like so, here for a given vertex, i.

12j,kAi,jAj,kAk,i\frac{1}{2} \sum_{j,k} A_{i,j} A_{j,k} A_{k,i}

Given, kik_i as the number of degrees for vertex i, we also have the total ways to link to k nodes.

12ki(ki1)\frac{1}{2} k_i(k_i - 1)

Then the ratio of these two will give the local clustering coefficient:

Ci=j,kAi,jAj,kAk,iki(ki1)C_i = \frac{\sum_{j,k} A_{i,j} A_{j,k} A_{k,i}} {k_i(k_i - 1)}

References

[1] Wikipedia - Clustering Coefficient