Avi Chawla
@_avichawla
- Google Maps uses graph ML to predict ETA
- Netflix uses graph ML in recommendation
- Spotify uses graph ML in recommendation
- Pinterest uses graph ML in recommendation

Here are 6 must-know ways for graph feature engineering (with code):

Just as image, text, and tabular datasets have features, so do graph datasets.

This means when building models on graph datasets, we can engineer these features to achieve better performance.

Let's discuss some feature engineering techniques below!

First, let's create a dummy social network graph dataset with accounts and followers (the followers are accounts too).

We create the two DataFrames shown below, an accounts DataFrame and a followers DataFrame.

Check this code 👇
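The original code was shared as an image, so here is a minimal sketch of what the two DataFrames could look like. The column names (`account_id`, `name`, `follower_id`) and all values are assumptions made up for illustration, not the thread's actual data.

```python
import pandas as pd

# Hypothetical accounts (placeholder names, not the thread's data).
accounts = pd.DataFrame({
    "account_id": [1, 2, 3, 4, 5],
    "name": ["alice", "bob", "carol", "dave", "erin"],
})

# Each row reads: follower_id follows account_id.
# Followers are accounts too, so both columns hold account ids.
followers = pd.DataFrame({
    "account_id":  [1, 1, 1, 2, 3, 3, 4, 5],
    "follower_id": [2, 3, 4, 1, 1, 2, 3, 4],
})

print(accounts)
print(followers)
```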
The data above is tabular, so we need to convert it into a graph format.

To do this, we use the NetworkX library as follows:

• Initialize a graph G.
• Add nodes from the accounts DF.
• Add edges between the nodes using the followers DF.

Here's the code 👇
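A rough sketch of those three steps, since the original screenshot is not available. The DataFrames and their column names are hypothetical, and the edge direction (follower → account, so that in-degree counts followers) is an assumption.

```python
import networkx as nx
import pandas as pd

# Hypothetical data, assumed for illustration (not the thread's dataset).
accounts = pd.DataFrame({"account_id": [1, 2, 3, 4, 5]})
followers = pd.DataFrame({
    "account_id":  [1, 1, 1, 2, 3, 3, 4, 5],
    "follower_id": [2, 3, 4, 1, 1, 2, 3, 4],
})

# 1) Initialize a directed graph: "follows" is not symmetric.
G = nx.DiGraph()

# 2) Add one node per account.
G.add_nodes_from(accounts["account_id"].tolist())

# 3) Add one edge per follow relation, pointing follower -> account.
G.add_edges_from(
    zip(followers["follower_id"].tolist(), followers["account_id"].tolist())
)

print(G.number_of_nodes(), G.number_of_edges())
```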
This produces the following graph.

Next, let's cover the 6 graph feature engineering techniques.
1-3) Node degree

In a directed graph, there are two types of degrees:

• In-Degree: The number of incoming edges (followers) a node has.
• Out-Degree: The number of outgoing edges (followings) a node has.
Here's how we can compute them using NetworkX:

• in_degree(x) counts edges directed toward the node x.
• out_degree(x) counts edges directed away from the node x.
• degree(x) is the sum of the in-degree and out-degree of node x.

Check this code 👇
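One way the three degree features could be computed and attached to the accounts DataFrame. This is a sketch on a hypothetical edge list (follower → account), not the thread's original code.

```python
import networkx as nx
import pandas as pd

# Hypothetical data, assumed for illustration (not the thread's dataset).
accounts = pd.DataFrame({"account_id": [1, 2, 3, 4, 5]})
edges = [(2, 1), (3, 1), (4, 1), (1, 2), (1, 3), (2, 3), (3, 4), (4, 5)]

G = nx.DiGraph()
G.add_nodes_from(accounts["account_id"].tolist())
G.add_edges_from(edges)  # each edge points follower -> account

# The degree views yield (node, degree) pairs, so dict() gives a lookup table.
accounts["in_degree"] = accounts["account_id"].map(dict(G.in_degree()))
accounts["out_degree"] = accounts["account_id"].map(dict(G.out_degree()))
accounts["degree"] = accounts["account_id"].map(dict(G.degree()))

print(accounts)
```

Calling `G.in_degree(x)` with a single node returns just that node's count, which matches the per-node description above.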
These features are now part of the accounts DataFrame.

Check this 👇
4-6) Node centrality

Node degree features capture connectedness but fail to capture the influence of those connections.

For instance, a user can have many online friends just because they send friend requests to everyone.

Centrality features handle this.
4) Betweenness centrality

This measures how often a node appears on the shortest paths between other nodes.

If a node often acts as a "bridge" between other nodes, it plays a key role in facilitating information flow.

Here's the code 👇
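A minimal sketch using NetworkX's `betweenness_centrality` on a hypothetical edge list (follower → account). The graph is made up for illustration, so the scores say nothing about the thread's dataset.

```python
import networkx as nx

# Hypothetical follow edges (follower -> account), assumed for illustration.
edges = [(2, 1), (3, 1), (4, 1), (1, 2), (1, 3), (2, 3), (3, 4), (4, 5)]
G = nx.DiGraph(edges)

# Fraction of shortest paths between other node pairs that pass through
# each node. Scores are normalized by default.
betweenness = nx.betweenness_centrality(G)

for node, score in sorted(betweenness.items()):
    print(node, round(score, 3))
```

In this toy graph, node 5 follows nobody, so no path can pass through it and its score is 0.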
5) Closeness centrality

This indicates how close a node is to all other nodes in the network based on the shortest paths.

To compute closeness centrality for a node v, we sum the shortest-path lengths between v and every other node and take the reciprocal, so a smaller total distance means higher closeness.

Here's the code 👇
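A sketch with NetworkX's `closeness_centrality`, again on a hypothetical edge list. Two details of the library worth knowing: for directed graphs it measures distances coming into the node, and it scales the reciprocal by the number of nodes that can reach it. The manual check below assumes every other node can reach node 5, which holds in this toy graph.

```python
import networkx as nx

# Hypothetical follow edges (follower -> account), assumed for illustration.
edges = [(2, 1), (3, 1), (4, 1), (1, 2), (1, 3), (2, 3), (3, 4), (4, 5)]
G = nx.DiGraph(edges)

closeness = nx.closeness_centrality(G)

# Manual check for node 5: shortest-path lengths from every node to 5.
dist_to_5 = nx.shortest_path_length(G, target=5)  # includes 5 itself at 0
manual_5 = (len(dist_to_5) - 1) / sum(dist_to_5.values())

print(round(closeness[5], 4), round(manual_5, 4))
```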
6) Eigenvector centrality

If a node is connected to other influential nodes, it amplifies its own influence.

It helps identify nodes that are influential not only due to their direct ties but also due to their connections with other influential nodes.

Here's the code 👇
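A sketch with NetworkX's `eigenvector_centrality` on a hypothetical edge list. The function uses power iteration, so raising `max_iter` here is a precaution I am assuming is useful, not something taken from the thread. For directed graphs the score is driven by incoming edges, meaning an account scores high when influential accounts follow it.

```python
import networkx as nx

# Hypothetical follow edges (follower -> account), assumed for illustration.
edges = [(2, 1), (3, 1), (4, 1), (1, 2), (1, 3), (2, 3), (3, 4), (4, 5)]
G = nx.DiGraph(edges)

# Power iteration; raises PowerIterationFailedConvergence if it does not
# converge within max_iter steps.
eigenvector = nx.eigenvector_centrality(G, max_iter=1000)

for node, score in sorted(eigenvector.items()):
    print(node, round(score, 3))
```

The returned vector is scaled to unit Euclidean length, so the scores are comparable only within the same graph.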
PyTorch Geometric is a PyTorch extension specifically developed for building graph-based neural networks.

It has an intuitive API that facilitates inspecting and analyzing graphs and building ML models on graph-based datasets.

Open-source with 22k+ stars!