Class Projects - Spring 2014

Sourcerous

Sourcerous is a web-based application that helps business reporters identify new sources.

Video: "Sourcerous Screencast" by Northwestern U. Knight Lab, on Vimeo.
What it does

One of the hardest parts about being a business journalist is finding new sources for a specific story. The process can take anywhere from just under an hour to several days. Sourcerous helps shorten that timeframe by scanning previously published articles on a particular topic and pulling relevant names, professional titles, and quotes. Sourcerous displays the collected information in a table that a journalist can scan quickly for new sources.


How it works

Sourcerous uses a Python XML parser to read the RSS feeds of Google News and Yahoo! Finance and locate articles relevant to the keywords a journalist is interested in. These articles are then processed by a multithreaded backend. Each thread calls the Alchemy API to find the names of the individuals referenced in an article, along with their job titles, companies, LinkedIn pages, and associated quotes. Once the results have been collected, Sourcerous lists them in alphabetical order.
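
The pipeline described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the project's actual code: the sample feed, the function names (parse_feed, extract_sources, find_sources), and the people in the canned results are all made up, and the extraction step is a local stand-in rather than a real Alchemy API call. It also assumes the alphabetical ordering is by person name.

```python
import xml.etree.ElementTree as ET
from concurrent.futures import ThreadPoolExecutor

# Made-up sample feed; a real run would download RSS from a news source.
SAMPLE_RSS = """<rss version="2.0">
  <channel>
    <title>Sample feed</title>
    <item>
      <title>Retail sales rise</title>
      <link>http://example.com/retail</link>
    </item>
    <item>
      <title>Chipmaker expands</title>
      <link>http://example.com/chips</link>
    </item>
  </channel>
</rss>"""

# Made-up extraction results keyed by article link (assumption: the real
# system would get names, titles, and quotes from an entity-extraction API).
CANNED_RESULTS = {
    "http://example.com/retail": [
        {"name": "Maria Lopez", "title": "Analyst", "quote": "..."},
    ],
    "http://example.com/chips": [
        {"name": "Priya Nair", "title": "CEO", "quote": "..."},
        {"name": "Alan Chen", "title": "CFO", "quote": "..."},
    ],
}


def parse_feed(rss_text):
    """Return (title, link) pairs for every <item> in an RSS 2.0 document."""
    root = ET.fromstring(rss_text)
    return [(item.findtext("title"), item.findtext("link"))
            for item in root.iter("item")]


def extract_sources(article):
    """Hypothetical stand-in for the per-article extraction call."""
    _title, link = article
    return CANNED_RESULTS.get(link, [])


def find_sources(rss_text, max_workers=4):
    """Parse the feed, process articles in worker threads, sort by name."""
    articles = parse_feed(rss_text)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        per_article = list(pool.map(extract_sources, articles))
    # Flatten the per-article lists into one list of source records.
    sources = [s for found in per_article for s in found]
    return sorted(sources, key=lambda s: s["name"])


if __name__ == "__main__":
    for source in find_sources(SAMPLE_RSS):
        print(source["name"], "-", source["title"])
```

A thread pool suits this kind of work because each article spends most of its time waiting on a remote service, so several requests can be in flight at once.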


Next Steps
  • Enhanced search functionality, including an improved location list, better entity detection (i.e., distinguishing between human names and company names), and stronger quote-pulling ability in the engine.
  • Improved user interface for clarity and appeal.
  • Improved results organization that prioritizes names based on frequency within located articles and not just on alphabetical order.
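
The frequency-based ordering proposed in the last item could look something like the sketch below. It is only an illustration of the idea, assuming a flat list of name mentions gathered across articles; the function name rank_by_frequency and the sample names are hypothetical.

```python
from collections import Counter


def rank_by_frequency(mentions):
    """Order names by how often they appear, breaking ties alphabetically."""
    counts = Counter(mentions)
    # Negate the count so higher frequencies sort first; name breaks ties.
    return sorted(counts.items(), key=lambda pair: (-pair[1], pair[0]))


if __name__ == "__main__":
    # Made-up mentions, one entry per appearance in a located article.
    sample = ["Priya Nair", "Alan Chen", "Priya Nair",
              "Maria Lopez", "Alan Chen", "Priya Nair"]
    for name, count in rank_by_frequency(sample):
        print(name, count)
```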

Connect

Student team: Curtis Sprung, Thomas Morreale, Jin Sun, and Xiaofeng Zhu.

Faculty guidance: Larry Birnbaum and Rich Gordon.