Developer Matt Biddulph recently released code for calculating the social influence of developers on Github based on their location. Marak Squires took the code and generated lists of the most influential developers in different regions. He notes there are some issues that skew the results, but offers some suggestions on how to improve them.
First of all, if a user lists their location as “NYC” or “Brooklyn” instead of “New York City,” that user won’t show up in a list of the most influential Github users in New York City. Also, many users who follow a lot of projects were given a higher influence ranking than they probably deserved.
To fix the normalization issues, there are two options. We could ask Github to change the location field on profiles so it could be geocoded (unlikely to happen anytime soon), or we could manually set up aliases for each region and then change the script to perform an OR operation on these aliases. For instance, to calculate the graph for New York we could query “New York, NYC, New York City, Brooklyn, Queens, Manhattan, Staten Island”. I suppose we could also just try to take the current location string and geocode it. The point is there is room for improvement.
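The alias idea could look roughly like the Python sketch below. It is not taken from Biddulph's or Squires's code: the REGION_ALIASES table, the normalize helper and the matches_region function are all hypothetical names, and the whole-word matching rule is just one assumption about how the OR over aliases might work.

```python
import re

# Hypothetical alias table; the alias list follows the New York example above.
REGION_ALIASES = {
    "New York": [
        "new york", "nyc", "new york city", "brooklyn",
        "queens", "manhattan", "staten island",
    ],
}


def normalize(location):
    """Lowercase the free-text location and collapse punctuation to single spaces."""
    return re.sub(r"[^a-z0-9]+", " ", location.lower()).strip()


def matches_region(location, region, aliases=REGION_ALIASES):
    """Return True if any alias for the region appears as whole words (an OR over aliases)."""
    padded = " " + normalize(location) + " "
    return any(" " + alias + " " in padded for alias in aliases[region])
```

With this sketch, a profile that lists "NYC" or "Brooklyn, NY" would count toward New York, while "Queensland, Australia" would not, because aliases only match on whole words.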
To fix the results being skewed by people who just happen to follow a lot of projects, we could change the script to take other information into account such as: number of projects, number of watchers on projects, following to followers ratio, fork to ownership ratio, commit history. There is also an issue of legacy organizations and organizations having followers…but that one is a bit tricky.
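As a rough illustration of how two of those signals might be folded in, here is a minimal Python sketch. The Developer fields, the adjusted_score function and the specific weighting are assumptions made up for this example, not part of the original script, and the formula is only one of many plausible ways to damp the effect of following a lot of accounts.

```python
from dataclasses import dataclass


@dataclass
class Developer:
    # Hypothetical profile fields, named for illustration only.
    followers: int
    following: int
    owned_repos: int   # repositories the user created
    forked_repos: int  # repositories the user forked from others


def adjusted_score(dev, base_score):
    """Scale a base influence score by follower and ownership ratios.

    The weighting here is an arbitrary illustrative choice.
    """
    follow_total = dev.followers + dev.following
    # Share of the follow graph that points at the user, between 0 and 1.
    follow_ratio = dev.followers / follow_total if follow_total else 0.0

    repo_total = dev.owned_repos + dev.forked_repos
    # Share of repositories that are original work rather than forks.
    ownership = dev.owned_repos / repo_total if repo_total else 0.0

    # Ownership moves the multiplier between 0.5 (all forks) and 1.0 (all original).
    return base_score * follow_ratio * (0.5 + 0.5 * ownership)
```

Under this sketch, two users with the same base score diverge once one of them follows far more accounts than follow them back, which is the skew described above.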
With that in mind, here are the most influential users in Portland:
You can find out the rankings here.