Class Projects - Spring 2015


A tool that visualizes who influences the proliferation of hashtags.

What it does

#Influence is a web application that identifies the influencers behind a Twitter hashtag and visualizes how powerful they are. Using an easy-to-read bubble visualization, it shows the top 5 most influential people for a given hashtag as circles, and it displays each influencer’s top retweeters inside that influencer’s circle. The size of each circle represents the size of that account’s influence.

In today’s Twittersphere, a great number of hashtags are produced, mentioned and shared every second, and we use them with little awareness of their origins. That raises fair questions: who are the invisible hands behind viral hashtags, and how big is their influence?

As finding influencers becomes more important for digital marketing, established services such as Twitter’s official tools and Keyhole have begun to provide information about influencers. But that information is sometimes based on closed algorithms or offered in the form of complex statistics. In developing #Influence, we wanted to demystify the collective concept gathered under a hashtag in a more democratic and understandable way than existing services do.

How it works

Each time a user types a hashtag into the search box on the first page, the Twitter API returns the 100 most popular tweets for that hashtag at that moment, as determined by Twitter's own algorithm. D3.js sorts the tweets by number of retweets and renders the top 5 as big circles. In this first-level visualization, the size of each circle represents the number of retweets.
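The ranking and sizing step described above can be sketched in plain JavaScript. This is a minimal illustration, not the project's actual code: the field names (user, retweetCount), the helper names topTweets and circleRadius, and the choice to scale circle area (rather than radius) with the retweet count are all assumptions.

```javascript
// Sketch only: tweet objects use hypothetical field names (user, retweetCount).

// Return the `limit` most-retweeted tweets, highest first.
function topTweets(tweets, limit = 5) {
  // Copy before sorting so the caller's array is not mutated.
  return [...tweets]
    .sort((a, b) => b.retweetCount - a.retweetCount)
    .slice(0, limit);
}

// Map a retweet count to a radius. Using the square root keeps circle AREA
// proportional to the count (an assumption about the intended encoding).
function circleRadius(count, maxCount, maxRadius = 100) {
  if (maxCount <= 0) return 0;
  return maxRadius * Math.sqrt(count / maxCount);
}

// Made-up sample data for illustration.
const sampleTweets = [
  { user: "alice", retweetCount: 40 },
  { user: "bob", retweetCount: 160 },
  { user: "carol", retweetCount: 10 },
];

const ranked = topTweets(sampleTweets);
const largest = ranked.length > 0 ? ranked[0].retweetCount : 0;
for (const tweet of ranked) {
  console.log(tweet.user, circleRadius(tweet.retweetCount, largest));
}
```

Sizing by area rather than radius is a common choice for bubble charts, because readers tend to compare the areas of circles; if the radius itself should track the count, the square root can be dropped.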

For each top tweet, D3.js ranks its retweeters by follower count and displays up to 10 of the top retweeters as small circles within the big circle. In this second-level visualization, unlike the first level, the size of each circle is based on follower count.
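The second level can be sketched the same way. The example below assumes hypothetical retweeter objects with screenName and followers fields and builds a nested structure (a parent with children) of the kind a bubble layout typically consumes; buildBubble and topRetweeters are illustrative names, not functions from the project.

```javascript
// Sketch only: retweeter objects use hypothetical fields (screenName, followers).

// Return up to `limit` retweeters with the most followers, highest first.
function topRetweeters(retweeters, limit = 10) {
  return [...retweeters]
    .sort((a, b) => b.followers - a.followers)
    .slice(0, limit);
}

// Combine a top tweet with its ranked retweeters into one nested node:
// the parent is sized by retweets, each child by follower count.
function buildBubble(tweet, retweeters) {
  return {
    user: tweet.user,
    value: tweet.retweetCount, // first level: number of retweets
    children: topRetweeters(retweeters).map((r) => ({
      user: r.screenName,
      value: r.followers, // second level: follower count
    })),
  };
}

// Made-up sample data for illustration.
const bubble = buildBubble({ user: "bob", retweetCount: 160 }, [
  { screenName: "dana", followers: 1200 },
  { screenName: "eli", followers: 54000 },
  { screenName: "fay", followers: 300 },
]);
console.log(JSON.stringify(bubble, null, 2));
```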

We acknowledge that our method is far from perfect, for two main reasons. First, Twitter's API doesn't count edited retweets, so those retweets don't show up in our visualization. Second, Twitter's API doesn't track retweets of a retweet, so we evaluate the influence of the retweeters (second-level circles) based on their follower counts instead.

Key Technologies

● Twitter API via the twit client for Node.js
● Express for Node.js

Next Steps

Future work on #Influence should start with the system’s current limitations. A central problem is a lack of robustness when multiple users are active at the same time. To address this, we need to implement a new way of passing Twitter data to the frontend, one that does not require individual clients to share data resources on the server.
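One possible direction, sketched under the assumption that the problem comes from search results being held in state shared by all clients: give every search its own slot, identified by a token that only the requesting client receives. ResultStore and its methods are hypothetical names for illustration.

```javascript
// Sketch only: an in-memory store that gives every search its own slot,
// so one client's results cannot overwrite another's.
class ResultStore {
  constructor() {
    this.results = new Map();
    this.nextId = 1;
  }

  // Save one search result and return a token that identifies it.
  save(data) {
    const token = String(this.nextId++);
    this.results.set(token, data);
    return token;
  }

  // Return the result for a token and remove it so memory does not grow.
  take(token) {
    const data = this.results.get(token);
    this.results.delete(token);
    return data;
  }
}

// Two simultaneous searches keep separate results.
const store = new ResultStore();
const tokenA = store.save({ hashtag: "#a", tweets: [] });
const tokenB = store.save({ hashtag: "#b", tweets: [] });
console.log(store.take(tokenA).hashtag, store.take(tokenB).hashtag);
```

The sequential tokens here are only for readability; a deployed version would more likely use random, hard-to-guess identifiers or return the data directly in each HTTP response.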

We should also optimize the system’s performance by restructuring the Twitter API queries so that they call each other recursively, which would let the page return a result in the minimum amount of time. We might also improve performance by storing retweeter IDs in a database to avoid repeated lookup queries to Twitter. Over time, frequently influential retweeters would no longer need to be looked up.
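The caching idea can be illustrated with a small wrapper. In this sketch an in-memory Map stands in for the database, and lookupUser is a hypothetical function that would normally query Twitter; neither is taken from the project's code.

```javascript
// Sketch only: wrap a lookup function so each user ID is fetched at most once.
// A Map stands in for the database mentioned in the text.
function createCachedLookup(lookupUser) {
  const cache = new Map();
  return function cachedLookup(userId) {
    if (!cache.has(userId)) {
      // First request for this ID: do the real lookup and remember the result.
      cache.set(userId, lookupUser(userId));
    }
    return cache.get(userId);
  };
}

// A fake lookup that counts how often it is called (stand-in for a Twitter query).
let remoteCalls = 0;
function fakeLookup(userId) {
  remoteCalls += 1;
  return { id: userId, followers: 1000 };
}

const lookup = createCachedLookup(fakeLookup);
lookup("42");
lookup("42"); // served from the cache, no second remote call
lookup("7");
console.log("remote calls:", remoteCalls);
```

Because the wrapper stores whatever the lookup returns, it would also work with a lookup that returns a promise. Follower counts change over time, so a real cache would probably need an expiry policy.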

An additional feature worth adding to #Influence would parse Twitter account descriptions to discover users’ areas of expertise. This would allow someone to compare an individual account’s influence on a particular topic with its authority on that topic.
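A very simple version of this idea is keyword matching on the account description. The sketch below is only one possible approach; the topic list, its keywords and the function name topicsFromDescription are invented for illustration.

```javascript
// Sketch only: a hypothetical keyword list per topic.
const TOPIC_KEYWORDS = {
  politics: ["senator", "policy", "campaign"],
  tech: ["developer", "engineer", "startup"],
  journalism: ["reporter", "editor", "journalist"],
};

// Return the topics whose keywords appear as whole words in the description.
function topicsFromDescription(description, topicKeywords = TOPIC_KEYWORDS) {
  // Lowercase and split on anything that is not a letter or digit.
  const words = new Set(
    description.toLowerCase().split(/[^a-z0-9]+/).filter(Boolean)
  );
  return Object.keys(topicKeywords).filter((topic) =>
    topicKeywords[topic].some((keyword) => words.has(keyword))
  );
}

console.log(topicsFromDescription("Software engineer and startup founder."));
console.log(topicsFromDescription("Political reporter covering the campaign trail"));
```

Exact word matching misses plurals and synonyms, so anything beyond a prototype would likely need stemming or a richer vocabulary.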


Neal Kfoury, Neha Rathi, Yasu Saito, Jia You