1 Billion rows, 1 laptop. Serious data science
In this talk, we will analyse the entire public database of the New York City Yellow Cab taxi service, which contains data for well over a billion trips. Our live demonstration will show how to find which locations are most lucrative for taxi drivers at a given time of day, how to identify interesting objects or events in the data, and how to build a machine learning model that predicts the expected tip amount for a given trip. We will do the entire exploration and analysis in Python on a single laptop, live!