Discovering Disease Outbreaks from News Headlines

pandas, scikit-learn, text extraction, K-means/DBSCAN clustering

We just launched our liveProject platform, where you can sign up for a structured project and get real-world experience.

Our pilot project, which is rather relevant at present, puts you in the role of a data scientist at the World Health Organization (WHO). The WHO is responsible for responding to international epidemics, and a critical part of that work is monitoring global news headlines for signs of disease outbreaks. However, this daily deluge of news data is too large to analyze manually. Your challenge is to pull geographic information from headlines and determine where in the world outbreaks are occurring. Problems you will have to solve include extracting information from text using regular expressions, using the Basemap Matplotlib extension to visualize map locations for patterns indicating an epidemic, and reporting your findings to your superiors so resources can be dispatched.
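To give a flavor of the text-extraction step, here is a minimal sketch of pulling a place name out of a headline with a regular expression. It is an illustration only, not the project's solution: the city list, the sample headlines, and the helper names `build_city_pattern` and `find_city` are all made up for this example, and a real run would presumably draw on a far larger list of place names.

```python
import re

# Hypothetical mini-gazetteer, for illustration only.
KNOWN_CITIES = ["Miami", "New York", "Panama", "Panama City"]


def build_city_pattern(cities):
    """Compile one regex that matches any of the given city names."""
    # Try longer names first so "Panama City" is preferred over "Panama".
    ordered = sorted(cities, key=len, reverse=True)
    alternation = "|".join(re.escape(city) for city in ordered)
    # \b keeps the match on whole words; matching ignores letter case.
    return re.compile(r"\b(?:" + alternation + r")\b", flags=re.IGNORECASE)


CITY_PATTERN = build_city_pattern(KNOWN_CITIES)


def find_city(headline):
    """Return the first known city found in the headline, or None."""
    match = CITY_PATTERN.search(headline)
    return match.group(0) if match else None


# Made-up headlines, not real data.
headlines = [
    "Zika Outbreak Hits Miami",
    "Could Zika Reach New York City?",
    "Flu season arrives early this year",
]
cities = [find_city(headline) for headline in headlines]
```

Headlines with no recognizable place come back as `None`, so they can be set aside before any mapping or clustering. Matching against a fixed list is a deliberate simplification: it is easy to reason about, but it will miss any location the list does not contain.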

Here’s the best part: the solo track for this project is FREE! Go try it out today at:

Learn more about liveProject here:

Follow Manning Publications on Medium for free content and exclusive discounts.