Social Network Analysis (SNA) is useful for many purposes. It can be used to decide how and where to launch a product or promotion, to plan how to follow up that launch with incentives, to study segmentation, and so on. While it has wide application across the physical and social sciences, it is also useful at a personal level. Most of us take part in social networks today. So, if you are curious to know what your network looks like, this article gives you a step-by-step method for gaining that insight.
To demonstrate how to visualise a social network, I use the data from my LinkedIn network. The rest of the article deals with one aspect of that data. I will not discuss here how to get hold of your LinkedIn data. You can click on this link provided by LinkedIn to get the same.
After downloading the LinkedIn data, you need to cleanse and normalise it before it can be used for the visualisation. I leave this step to you. However, I show a sample of what my data looked like once the cleansing and normalisation were complete.
I used the data from LinkedIn available in Connections.CSV. From it I had to create 2 comma-separated values (CSV) files: LinkedInFriends.CSV and LinkedInNetwork.CSV.
LinkedInFriends.CSV contained the unique list of all my connections. I retained only 3 columns for this article: the name of the connection, the company of the connection, and the number of years we have been connected. Note that this file cannot have duplicate entries in the name column; any duplicates need to be filtered out or otherwise handled. In my network, some people shared their name with another connection, although the two worked in different companies. The contents of this file are used as the NODES in the network visualisation. The LinkedInFriends.CSV file looked as shown below. I also saved this file as an Excel worksheet named LinkedInFriends.xlsx.
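As a rough illustration of the duplicate-name handling described above, the following R sketch makes shared names unique by appending the company, and derives the number of whole years connected. The column names (Name, Company, ConnectedOn, ElapsedYears), the sample rows and the reference date are all assumptions made for this example; they are not the actual contents of the file.

```r
# Hypothetical sample rows standing in for the cleansed LinkedIn export.
# Column names and values are assumptions made for this sketch.
friends <- data.frame(
  Name        = c("A Kumar", "B Singh", "A Kumar", "C Rao"),
  Company     = c("Acme", "Globex", "Initech", "Acme"),
  ConnectedOn = as.Date(c("2015-03-01", "2018-07-15", "2012-01-20", "2020-11-05")),
  stringsAsFactors = FALSE
)

# Flag every occurrence of a name that appears more than once.
is_dup <- duplicated(friends$Name) | duplicated(friends$Name, fromLast = TRUE)

# Make shared names unique by appending the company in brackets.
friends$Name[is_dup] <- paste0(friends$Name[is_dup], " (", friends$Company[is_dup], ")")

# Whole years connected, counted up to an assumed reference date.
as_of <- as.Date("2021-01-01")
days_connected <- as.numeric(difftime(as_of, friends$ConnectedOn, units = "days"))
friends$ElapsedYears <- floor(days_connected / 365.25)

# The node file must not contain duplicate names.
stopifnot(!any(duplicated(friends$Name)))

# Write the node list; a temporary directory keeps the sketch free of side effects.
nodes_file <- file.path(tempdir(), "LinkedInFriends.csv")
write.csv(friends[, c("Name", "Company", "ElapsedYears")], nodes_file, row.names = FALSE)
```

If two connections share both name and company, this approach does not separate them and the stopifnot() check fails, so such rows would need another distinguishing value.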
LinkedInNetwork.CSV contained the connections that each of the friends has on LinkedIn. The column pair Source and Target defines this relationship. The column Elapsed Years gives the number of years for which each connected pair has known each other on LinkedIn. The contents of this file are used as the EDGES in the network visualisation. The LinkedInNetwork.CSV file looked as shown below. I also saved this file as an Excel worksheet named LinkedInNetwork.xlsx.
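To make the shape of the edge file concrete, here is a minimal R sketch of an edge list with Source, Target and an elapsed-years column, together with a check that every endpoint also appears in the node list. The names, values and the ElapsedYears spelling are hypothetical, and the sketch does not show how the pairs themselves were derived.

```r
# Hypothetical node names and edge list; all values are made up for this sketch.
node_names <- c("Me", "A Kumar (Acme)", "B Singh", "C Rao")

edges <- data.frame(
  Source       = c("Me", "Me", "Me", "A Kumar (Acme)"),
  Target       = c("A Kumar (Acme)", "B Singh", "C Rao", "C Rao"),
  ElapsedYears = c(5, 2, 0, 0),
  stringsAsFactors = FALSE
)

# Every edge endpoint should be a known node; collect any that are not.
endpoints <- unique(c(edges$Source, edges$Target))
unknown <- setdiff(endpoints, node_names)
stopifnot(length(unknown) == 0)

# Drop self-loops and exact repeats of the same Source/Target pair.
edges <- edges[edges$Source != edges$Target, ]
edges <- edges[!duplicated(edges[, c("Source", "Target")]), ]

# Write the edge list; a temporary directory keeps the sketch free of side effects.
edges_file <- file.path(tempdir(), "LinkedInNetwork.csv")
write.csv(edges, edges_file, row.names = FALSE)
```

Running the endpoint check before importing the file helps catch names that were disambiguated in the node list but not in the edge list.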
First, I show you how to create a visualisation of the LinkedIn network using Gephi.
Start Gephi to obtain the screen shown below.
Click on the Data Laboratory button, shown in the red box above, to obtain the screen shown below.
Here, click on Import Spreadsheet, shown in the red box above. Gephi will ask for the path of the spreadsheet. Browse to the directory where the Excel file LinkedInNetwork.xlsx is stored and select the file. You will see something similar to what is shown below.
Click Next to obtain this.
Click Finish to see something similar to what is shown below.
Click on the radio button Append to existing Workspace and click OK. The uploaded data will be shown as follows.
Click on Overview as shown in the Red Box above. You will see a visualisation of the network similar to what is shown below.
On the left-hand side, click on the Choose Layout drop-down list, shown in the red box above. Select the Noverlap layout and click Run. Depending on how large the data is, the layout may take some time to complete. At the end, my network looked like this.
Now choose the ForceAtlas2 layout from the Choose Layout drop-down list and click Run. This will take a lot of time, as this layout has to run for about 10,000 to 20,000 iterations. At the end, my network looked like this.
I have purposely not included company names or the names of connections, although these can be added to the visualisation. I did, however, assign a different colour to each company, so the different clusters can be seen above.
Also, for this article I used only the duration of association between connections for the analysis. However, we could also use attributes such as the interactivity between connections, measured through Likes, Comments, Shares, Messages, etc.
The same results can be obtained using an R program. The R program to generate this is provided below.
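# net is an igraph object built from the node and edge files; layout.forceatlas2() comes from the ForceAtlas2 package
plot(net, layout = layout.forceatlas2(net, iterations = 10000, plotstep = 500))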
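The plot() call above assumes that an igraph object called net already exists. A minimal sketch of how such an object could be built with the igraph package follows. The sample nodes and edges, the column names and the colour scheme are assumptions made for illustration. igraph's built-in Fruchterman-Reingold layout is swapped in for ForceAtlas2 so that the example runs without the extra package.

```r
library(igraph)

# Hypothetical nodes and edges in the same shape as the two files described earlier.
nodes <- data.frame(
  Name    = c("Me", "A Kumar", "B Singh", "C Rao"),
  Company = c("Self", "Acme", "Globex", "Acme"),
  stringsAsFactors = FALSE
)
edges <- data.frame(
  Source       = c("Me", "Me", "Me", "A Kumar"),
  Target       = c("A Kumar", "B Singh", "C Rao", "C Rao"),
  ElapsedYears = c(5, 2, 0, 0),
  stringsAsFactors = FALSE
)

# The first two edge columns become the endpoints; the remaining columns
# become edge attributes. The first vertex column is used as the vertex name.
net <- graph_from_data_frame(d = edges, vertices = nodes, directed = FALSE)

# Assign one colour per company so that clusters are easier to spot.
company <- factor(V(net)$Company)
V(net)$color <- rainbow(nlevels(company))[as.integer(company)]

# Stand-in layout: Fruchterman-Reingold from igraph, not ForceAtlas2.
coords <- layout_with_fr(net)
plot(net, layout = coords, vertex.label = NA, vertex.size = 8)
```

With the ForceAtlas2 package installed, the same net object can be passed to layout.forceatlas2() in place of layout_with_fr().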