Duration: 42 mins
July 18, 2019

A new approach to DataFrames and pipelines

We show how to handle massive datasets with modest resources using Vaex, a Python DataFrame library. By combining computational graphs, efficient algorithms, and efficient storage (Apache Arrow / HDF5), Vaex can easily handle up to a billion rows, even on your laptop. As a bonus, Vaex can automatically generate a machine learning pipeline from the graph structure built up internally in the DataFrame.
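
To make the idea of a pipeline falling out of a computational graph more concrete, here is a minimal sketch in plain Python. It is not the Vaex API: the classes Expr, Column and Const are hypothetical names invented for this illustration. The only assumption is that each derived column is stored as a recorded expression rather than as computed data, so the same graph can later be replayed on unseen rows.

```python
import operator


class Expr:
    """A node in a tiny lazy expression graph (illustrative only, not Vaex)."""

    def __init__(self, op, args, label):
        self.op = op        # function applied to the evaluated arguments
        self.args = args    # child nodes of this expression
        self.label = label  # human-readable form of the recorded graph

    def _binary(self, other, op, symbol):
        # Wrap plain numbers so every argument is a graph node.
        other = other if isinstance(other, Expr) else Const(other)
        return Expr(op, (self, other), f"({self.label} {symbol} {other.label})")

    def __add__(self, other):
        return self._binary(other, operator.add, "+")

    def __sub__(self, other):
        return self._binary(other, operator.sub, "-")

    def __mul__(self, other):
        return self._binary(other, operator.mul, "*")

    def __truediv__(self, other):
        return self._binary(other, operator.truediv, "/")

    def evaluate(self, row):
        # Nothing is computed until this call; it walks the graph for one row.
        return self.op(*(arg.evaluate(row) for arg in self.args))


class Column(Expr):
    """Leaf node that reads a named value from a row."""

    def __init__(self, name):
        super().__init__(None, (), name)
        self.name = name

    def evaluate(self, row):
        return row[self.name]


class Const(Expr):
    """Leaf node holding a fixed value."""

    def __init__(self, value):
        super().__init__(None, (), repr(value))
        self.value = value

    def evaluate(self, row):
        return self.value


# Building the feature only records the graph; no data is touched yet.
x, y = Column("x"), Column("y")
scaled = (x - 10) / 5
feature = scaled * scaled + y

# The recorded graph acts as the pipeline: apply it to training rows...
train_rows = [{"x": 15, "y": 1}, {"x": 20, "y": 2}]
train_features = [feature.evaluate(row) for row in train_rows]

# ...and replay exactly the same transformations on unseen rows.
new_features = [feature.evaluate(row) for row in [{"x": 10, "y": 0}]]
```

Because the graph stores operations rather than results, no intermediate column is materialized in memory, and the set of recorded expressions is already a description of the preprocessing steps that a model would need at prediction time.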

Presented by:

    • Jovan Veljanoski
      Founder of vaex.io
    • Maarten Breddels
      Founder of vaex.io