Saturday, October 10, 2015

"Your Network at Play"

From MetaFilter:
The Washington Post has a puzzle to see how well you understand social networks. The day’s political issue: whether baseball caps are fashionable. More explanation and the solution below the jump.

Even though only a minority holds the opinion, caps end up being deemed fashionable. Despite the misleading title about understanding people, what the puzzle really demonstrates is graph theory and social network analysis (SNA). The WaPo example illustrates an idea from the academic paper The Majority Illusion in Social Networks (full text PDF) by Lerman, Yan and Wu. That paper finds that a few well-connected individuals can have a great deal of influence over what their networks perceive as the common opinion, even when they are in the minority.
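As a rough illustration of that effect, here is a minimal Python sketch. The eight-person network, the node names, and the "more than half of my neighbors" threshold are assumptions made up for this example; they are not taken from the WaPo puzzle or from the paper.

```python
def majority_illusion(edges, cap_wearers):
    """Return (actual share of cap wearers, share of nodes that see a cap majority).

    Only nodes that appear in at least one edge are counted.
    """
    # Build an undirected adjacency map: node -> set of neighbors.
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)

    # A node "sees a majority" when more than half of its neighbors wear caps.
    fooled = [
        node for node, nbrs in neighbors.items()
        if 2 * sum(n in cap_wearers for n in nbrs) > len(nbrs)
    ]
    total = len(neighbors)
    return len(cap_wearers) / total, len(fooled) / total


# Hypothetical network: two well-connected hubs wear caps, six peers do not.
peers = ["p1", "p2", "p3", "p4", "p5", "p6"]
edges = [(hub, p) for hub in ("A", "B") for p in peers]
edges += [("p1", "p2"), ("p3", "p4"), ("p5", "p6")]

actual_share, perceived_share = majority_illusion(edges, {"A", "B"})
print(actual_share, perceived_share)  # 0.25 0.75
```

Only a quarter of this toy network wears a cap, yet three quarters of its members see caps on most of their friends, because the two cap wearers happen to be the best-connected nodes.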

Social network visualization is easier than you may think. There are two major graph visualizers that are open source and have active communities: Gephi and the Excel-based NodeXL (Windows only). To begin, see the guide to getting started with Gephi or a video tutorial on NodeXL.

If you'd like to play with your own data, Netvizz on Facebook (you'll have to approve the app) will let you work with group and page data. Stanford offers a number of ready-made graph files, which will let you play around with real-world data. And (thanks to the company's malfeasance) Enron's dataset provides a valuable contextualized social network of email exchanges. MIT provides an all-in-one analysis and visualization web tool if you'd like to look at your Gmail/Yahoo/Exchange data, with a handy option to delete the local copy after you exit.
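If you want to read a downloaded graph file yourself before opening it in a visualizer, something like the following Python sketch may be enough. It assumes a plain-text edge list with one whitespace-separated pair of node IDs per line and `#` for comment lines; check the format of the file you actually download. The `read_edge_list` helper and the sample data are hypothetical.

```python
import io


def read_edge_list(lines):
    """Parse an assumed 'source target' edge list, skipping blanks and # comments."""
    edges = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Keep the first two columns; any extra columns are ignored.
        source, target = line.split()[:2]
        edges.append((source, target))
    return edges


# Made-up sample standing in for a real file; with a real download you
# would pass an open file object instead, e.g. open("some_graph.txt").
sample = io.StringIO("# FromNode ToNode\n1\t2\n1\t3\n\n2\t3\n")
sample_edges = read_edge_list(sample)
print(sample_edges)  # [('1', '2'), ('1', '3'), ('2', '3')]
```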

Here are some key terms related to SNA: individuals are known as nodes, and a relationship between individuals (e.g. friendship) can be thought of as an edge. Edges may have an edge weight, meaning the strength of that relationship (e.g. amount of information shared, time spent together, etc.), as well as directionality, meaning which node is the originator and which is the receiver of a connection such as a friend request (e.g. you may follow Kanye West on Twitter, but he doesn't follow you). The number of edges that a node has is known as its degree, and how important or well-positioned a node is within the graph is measured by some metric of centrality. Depending on the structure of the graph, further automated analysis can be done. In fact, Google's PageRank is an example of graph analysis: it uses the links between websites and weights each link by the reputation of the page it comes from. This article (HTML Fulltext) by Hawe, Webster and Shiell gives a more advanced glossary of terms....
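To make these terms concrete, here is a small Python sketch of a directed, weighted graph with degree counts and a simplified PageRank-style score. The people, the weights, the damping factor of 0.85, and the fixed 50 iterations are assumptions chosen for illustration; this is a textbook-style power iteration, not Google's actual implementation.

```python
from collections import defaultdict

# Directed, weighted edges: (originator, receiver, weight). All values are made up.
edges = [
    ("you", "kanye", 1.0),   # you follow Kanye; he does not follow you back
    ("ann", "bob", 3.0),
    ("bob", "ann", 1.0),
    ("ann", "kanye", 1.0),
    ("bob", "kanye", 2.0),
]
nodes = sorted({n for s, t, _ in edges for n in (s, t)})

# Degree: number of edges leaving (out) or arriving at (in) each node.
out_degree, in_degree = defaultdict(int), defaultdict(int)
for source, target, _ in edges:
    out_degree[source] += 1
    in_degree[target] += 1


def pagerank(nodes, links, damping=0.85, iterations=50):
    """Simplified, unweighted PageRank computed by power iteration."""
    out = {n: [t for s, t in links if s == n] for n in nodes}
    rank = {n: 1 / len(nodes) for n in nodes}
    for _ in range(iterations):
        # Nodes with no outgoing links spread their score evenly over all nodes.
        dangling = sum(rank[n] for n in nodes if not out[n])
        base = (1 - damping) / len(nodes) + damping * dangling / len(nodes)
        new_rank = {n: base for n in nodes}
        for n in nodes:
            for target in out[n]:
                # Each link passes on an equal share of its source's score.
                new_rank[target] += damping * rank[n] / len(out[n])
        rank = new_rank
    return rank


rank = pagerank(nodes, [(s, t) for s, t, _ in edges])
print(in_degree["kanye"], out_degree["kanye"])  # 3 0
print(max(rank, key=rank.get))  # kanye
```

The edge weights are stored but ignored by this score, to keep the sketch short; a weighted variant would split each node's score in proportion to its edge weights instead of equally.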