Vaex: Lazy Out-of-Core DataFrames for Python.

Visualize and explore big tabular datasets. A billion rows per second on a single computer.

Why use vaex?

  • Visualize and explore big tabular data interactively
  • Process more than a billion objects per second on a single computer.
  • Transform the data lazily, on the fly, using regular NumPy expressions, without using extra memory.
  • Filter the dataset using visual queries and boolean expressions, to visualize subsets of the data or to do data cleansing.
  • Vaex has a graphical interface for the most common use cases.
  • Vaex integrates well in the Jupyter/IPython notebook/lab ecosystem.
  • Client/server architecture: Delegate computations to a remote server. (in development)
  • Use a cluster to visualize and explore even larger datasets (10-100 billion). (in development)
  • With a focus on astronomy and astrophysics, but widely applicable.
  • Can visualize the whole Gaia catalogue in one second.
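
The lazy transformation and boolean filtering mentioned above can be illustrated with plain NumPy. This is only a minimal sketch of the idea, not Vaex's implementation or API: the helper `lazy_filtered_mean`, the column names and the chunk size are all hypothetical. The expression and the filter are passed as functions and are evaluated one chunk at a time, so no full-length intermediate array is ever created.

```python
import numpy as np


def lazy_filtered_mean(columns, expression, selection, chunk_size=100_000):
    """Mean of expression(chunk) over the rows where selection(chunk) is True.

    Hypothetical helper: both callables receive a dict of column slices and
    are only evaluated on one chunk at a time.
    """
    n_rows = len(next(iter(columns.values())))
    total, count = 0.0, 0
    for start in range(0, n_rows, chunk_size):
        stop = min(start + chunk_size, n_rows)
        chunk = {name: col[start:stop] for name, col in columns.items()}
        mask = selection(chunk)            # boolean expression (the filter)
        values = expression(chunk)[mask]   # transformed values that pass the filter
        total += values.sum()
        count += values.size
    return total / count if count else float("nan")


# Synthetic data standing in for a real dataset.
rng = np.random.default_rng(0)
columns = {"x": rng.normal(size=1_000_000), "y": rng.normal(size=1_000_000)}


def radius(c):
    # The "virtual" quantity: never stored as a full column.
    return np.sqrt(c["x"] ** 2 + c["y"] ** 2)


def upper_half(c):
    # The boolean filter.
    return c["y"] > 0


mean_r = lazy_filtered_mean(columns, radius, upper_half)
print(mean_r)
```

Only running totals are carried between chunks, so the working memory is bounded by the chunk size rather than by the number of rows.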

How does it work?

Vaex does this by:

  • Binning or aggregating the data on a grid, using simple optimized algorithms
  • Using virtual columns, which behave like regular columns but are computed in chunks, and only when needed, so that no memory is wasted.
  • Storing the data in columns, which avoids reading unneeded data and gets the maximum performance out of hard drives.
  • Memory-mapping files, which avoids unneeded reading and copying of data. A terabyte-sized file opens in milliseconds.
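
A rough illustration of how these pieces fit together, written with plain NumPy rather than Vaex itself. It is a sketch under stated assumptions: the one-file-per-column layout, the file names, the grid shape and the chunk size are made up for the example, and Vaex's own file formats and algorithms are not shown here. Each column is stored as raw float64 values, opened with `np.memmap` (which maps the file instead of reading it), and binned onto a 2d grid chunk by chunk.

```python
import os
import tempfile

import numpy as np

n_rows = 1_000_000
rng = np.random.default_rng(42)

# Columnar storage (assumed layout): one raw float64 file per column.
tmpdir = tempfile.mkdtemp()  # left on disk here; remove it when done
paths = {}
for name in ("x", "y"):
    paths[name] = os.path.join(tmpdir, name + ".f64")
    rng.normal(size=n_rows).tofile(paths[name])

# Memory-map the columns: opening is cheap, pages are loaded on access.
x = np.memmap(paths["x"], dtype=np.float64, mode="r")
y = np.memmap(paths["y"], dtype=np.float64, mode="r")

# Bin the data on a regular grid, one chunk at a time.
shape = (64, 64)
limits = [[-4.0, 4.0], [-4.0, 4.0]]
counts = np.zeros(shape, dtype=np.int64)
chunk_size = 100_000
for start in range(0, n_rows, chunk_size):
    stop = min(start + chunk_size, n_rows)
    h, _, _ = np.histogram2d(x[start:stop], y[start:stop], bins=shape, range=limits)
    counts += h.astype(np.int64)

print(counts.shape, counts.sum())
```

The resulting `counts` grid has a fixed size no matter how many rows were binned, which is what makes it cheap to turn into a density plot.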

What is vaex?

  • A Python library/package for (data) scientists:
    • Is pip and conda installable.
    • Make custom plots and statistics.
    • Calculate statistics on an N-dimensional grid and visualize them.
    • Create interactive Jupyter/IPython notebooks.
    • Publication quality plots with matplotlib.
    • Interactive plots with bqplot or bokeh.
    • Combine the notebook with the graphical interface in one kernel
  • A standalone program/GUI that:
    • Requires no programming knowledge
    • Visualizes 1d histograms, 2d density plots, average quantities, and 3d volume renderings
    • Allows interactive navigation and selection
    • Overlays vector and tensor quantities in 2d and 3d.
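
To make "statistics on an N-dimensional grid" concrete, here is a small NumPy sketch of a per-cell mean. It is not Vaex's API: the function `binned_mean` and its parameters are hypothetical, and it relies only on `np.histogramdd`, computing the mean as the weighted sum per cell divided by the count per cell.

```python
import numpy as np


def binned_mean(coords, values, shape, limits):
    """Mean of `values` in each cell of an N-dimensional grid.

    coords: array of shape (n_points, n_dims)
    shape:  number of bins along each dimension
    limits: (low, high) pair for each dimension
    Cells that contain no points come out as NaN.
    """
    counts, _ = np.histogramdd(coords, bins=shape, range=limits)
    sums, _ = np.histogramdd(coords, bins=shape, range=limits, weights=values)
    with np.errstate(invalid="ignore", divide="ignore"):
        return sums / counts


# Three points in a unit cube, binned on a 2 x 2 x 2 grid.
coords = np.array([[0.1, 0.1, 0.1],
                   [0.2, 0.2, 0.2],
                   [0.9, 0.9, 0.9]])
values = np.array([1.0, 3.0, 10.0])
mean_grid = binned_mean(coords, values, shape=(2, 2, 2), limits=[(0, 1)] * 3)
print(mean_grid[0, 0, 0], mean_grid[1, 1, 1])
```

The first two points share a cell, so that cell holds their average, while cells without points are NaN and can be masked out when plotting.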

Desktop user? Download the standalone OSX or Linux version. *

For programming? Install the Python package:
$ pip install --user --pre vaex

Or for Anaconda users:
$ conda install -c conda-forge vaex

Latest from git:
$ pip install git+

Or see more detailed instructions.

*Not possible to combine with the IPython/Jupyter notebook.

Live demo. Yellow taxi pickup locations in New York City.

The demo shows 140 million points, rendered in real time. Zoom or pan and the plot gets updated on the fly.

Demo movies.

Fast visualization

Coming soon

Selections and linked views

Coming soon

Notebook integration

Coming soon

Gaia data.


Interactive demo showing 100 million points (10%) of the Gaia DR1 data, rendered in real time. Zoom or pan and the plot gets updated on the fly.



Vaex is open source. The source code and issues live on GitHub. Please use GitHub to report issues. Contributions are welcome as pull requests.