Duration: 51 mins
September 3, 2019

Modern Data Science with Vaex - A new approach to DataFrames and pipelines

Combining computational graphs, which are common in neural network libraries, with delayed (a.k.a. lazy) evaluation in a DataFrame library enables efficient memory and CPU usage. Together with memory-mapped storage (Apache Arrow, HDF5) and out-of-core algorithms, this makes it possible to process considerably larger data sets with fewer resources. As an added bonus, the computational graphs ‘remember’ all operations applied to a DataFrame, meaning that data processing pipelines can be generated automatically.
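
The idea can be illustrated with a small, library-free Python sketch. This is not Vaex's API: the names Expr, Column, Const, BinOp, iter_chunks and evaluate are hypothetical and exist only for this illustration, and plain list slices stand in for chunks that a real library would read from memory-mapped storage. Building an expression only records a graph. Values are computed chunk by chunk on request, and the same recorded graph can then be replayed on new data as a simple pipeline.

```python
import operator


class Expr:
    """Base node of a toy expression graph; building one computes nothing."""

    def __add__(self, other):
        return BinOp(operator.add, "+", self, _wrap(other))

    def __mul__(self, other):
        return BinOp(operator.mul, "*", self, _wrap(other))


class Column(Expr):
    """Leaf node that refers to a stored column by name."""

    def __init__(self, name):
        self.name = name

    def evaluate(self, chunk):
        return chunk[self.name]

    def __repr__(self):
        return self.name


class Const(Expr):
    """Leaf node holding a scalar, repeated to the chunk length on evaluation."""

    def __init__(self, value):
        self.value = value

    def evaluate(self, chunk):
        n_rows = len(next(iter(chunk.values())))
        return [self.value] * n_rows

    def __repr__(self):
        return repr(self.value)


class BinOp(Expr):
    """Inner node that records an element-wise operation on two sub-expressions."""

    def __init__(self, func, symbol, left, right):
        self.func, self.symbol, self.left, self.right = func, symbol, left, right

    def evaluate(self, chunk):
        lhs, rhs = self.left.evaluate(chunk), self.right.evaluate(chunk)
        return [self.func(a, b) for a, b in zip(lhs, rhs)]

    def __repr__(self):
        return f"({self.left!r} {self.symbol} {self.right!r})"


def _wrap(value):
    """Turn plain scalars into Const nodes so they can join the graph."""
    return value if isinstance(value, Expr) else Const(value)


def iter_chunks(data, chunk_size):
    """Yield row slices of a dict of equal-length columns (stand-in for out-of-core reads)."""
    n_rows = len(next(iter(data.values())))
    for start in range(0, n_rows, chunk_size):
        yield {name: values[start:start + chunk_size] for name, values in data.items()}


def evaluate(expr, data, chunk_size=2):
    """Evaluate the graph one chunk at a time and collect the results."""
    result = []
    for chunk in iter_chunks(data, chunk_size):
        result.extend(expr.evaluate(chunk))
    return result


# Building the graph performs no arithmetic; it only records the operations.
pipeline = {"z": Column("x") * 2 + Column("y")}

train = {"x": [1, 2, 3, 4, 5], "y": [10, 20, 30, 40, 50]}
new_data = {"x": [7, 8], "y": [1, 1]}

z_train = evaluate(pipeline["z"], train)   # [12, 24, 36, 48, 60]
z_new = evaluate(pipeline["z"], new_data)  # [15, 17], same graph replayed on new data
```

In this sketch the pipeline dictionary plays the role of the remembered operations: it holds no data, only the recipe, so it can be inspected (repr(pipeline["z"]) gives "((x * 2) + y)") or applied to any data set that has the same column names.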

Presented by:

    • Jovan Veljanoski
      Founder of vaex.io