Local Tweetcast predicts how people in a specific location will vote based on the content of their tweets.
What it does
“Tweetcast Your Vote” is a previous Knight Lab project and powerful tool that could predict an individual’s likely voting preferences based on the content of their tweets. Local Tweetcast builds on that legacy and uses similar technology to try to determine how much support a candidate might have in a given geography based on Twitter data.
Although a variety of ways to collect data about voter preferences already exist, Local Tweetcast gives researchers, academics, journalists, and interested citizens another data point to consider and lets them home in on a specific geographic location.
To use Local Tweetcast, a user simply inputs a location and waits for results to be displayed. Results show the percentage of users supporting Bernie Sanders, Hillary Clinton, or Donald Trump. Results for previously cached counties in the United States might also be displayed.
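The percentages shown in the results can be thought of as a simple tally over per-user predictions. The following is a minimal sketch under that assumption; the function name `support_percentages` and the input format (one predicted candidate name per user) are hypothetical and not taken from the project's code.

```python
from collections import Counter


def support_percentages(predictions):
    """Map each candidate to the percentage of users predicted to support them.

    `predictions` is an assumed input: one candidate name per Twitter user.
    """
    counts = Counter(predictions)
    total = sum(counts.values())
    if total == 0:
        return {}  # no users found for this location
    return {candidate: 100.0 * n / total for candidate, n in counts.items()}
```

For example, four users predicted as Sanders, Clinton, Sanders, and Trump would yield 50% for Sanders and 25% each for Clinton and Trump.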
How it works
Local Tweetcast collects large amounts of tweet data. This collection begins with Tweepy (a Python wrapper for the Twitter API), which allows for retrieval of tweets and user timelines, as well as filtering by keyword and location. Via Tweepy, a script continually scrapes tweets from given locations while respecting the API's rate limits. Tweets are stored in CSV files.
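The collection loop described above might look roughly like the sketch below. It is not the project's script: it uses only the standard library, `fetch_page` is a hypothetical callable standing in for a Tweepy-backed request, `RateLimited` is a hypothetical exception, and the CSV columns (location, user, text) are assumptions.

```python
import csv
import time


class RateLimited(Exception):
    """Hypothetical signal that the API asked the client to back off."""

    def __init__(self, retry_after):
        super().__init__(f"rate limited, retry after {retry_after}s")
        self.retry_after = retry_after


def collect_tweets(fetch_page, location, csv_path, max_pages, sleep=time.sleep):
    """Append tweets for one location to a CSV file, pausing on rate limits.

    fetch_page(location, page) is an assumed callable that returns a list of
    dicts with "user" and "text" keys; in practice it would wrap a Tweepy call.
    """
    written = 0
    page = 0
    with open(csv_path, "a", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        while page < max_pages:
            try:
                tweets = fetch_page(location, page)
            except RateLimited as limit:
                sleep(limit.retry_after)  # wait, then retry the same page
                continue
            if not tweets:
                break  # no more results for this location
            for tweet in tweets:
                writer.writerow([location, tweet["user"], tweet["text"]])
                written += 1
            page += 1
    return written
```

Passing `sleep` as a parameter keeps the back-off behavior easy to exercise without actually waiting.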
Using the gensim Python module, Tweetcast runs a nearest-neighbor algorithm to determine which candidate a Twitter user supports. The algorithm was trained on data collected from Twitter users who clearly supported or endorsed a certain candidate.
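To illustrate the nearest-neighbor idea, here is a minimal sketch that labels a user with the candidate of the most similar training example. It deliberately swaps in a plain bag-of-words cosine similarity from the standard library rather than gensim, since the source does not say which gensim model or features the project used; the function names and the sample phrases in the usage note are hypothetical.

```python
import math
from collections import Counter


def vectorize(text):
    """Turn text into a bag-of-words count vector (lowercased, whitespace split)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[word] for word, count in a.items() if word in b)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def predict_candidate(user_text, labeled_examples):
    """Return the label of the training example most similar to user_text.

    labeled_examples is an assumed list of (text, candidate) pairs drawn from
    users who clearly supported a candidate.
    """
    query = vectorize(user_text)
    _, best_label = max(
        labeled_examples,
        key=lambda example: cosine(query, vectorize(example[0])),
    )
    return best_label
```

With training pairs such as `("make america great again", "Trump")` and `("i'm with her stronger together", "Clinton")`, a call like `predict_candidate("stronger together today", training)` picks the label of the example that shares the most vocabulary with the input.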
The web app is powered by Flask. D3 is used to create the visualizations. The app is hosted on Heroku.
The project would benefit from a bigger database of users and tweets to improve accuracy. Faster processing, more visualizations, and support for local government elections are also likely improvements.