Duration: 44 mins
April 25, 2017

A billion stars in the Jupyter Notebook

With large astronomical catalogues (>1 billion objects) already available, we are preparing methods to visualize and explore datasets of this size. Scatter plots become cluttered at these data volumes, which call for different visualization techniques based on binned statistics, e.g. histograms, density maps, and volume rendering in 3D. The calculation of statistics on N-dimensional grids is handled by a Python library called Vaex, which I will introduce. It can process at least a billion stars/samples per second to produce, for instance, the mean of a quantity on a regular grid. These statistics can be calculated for any mathematical expression on the data (NumPy style), either on the full dataset or on subsets specified by queries/selections.
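
As a rough illustration of what "the mean of a quantity on a regular grid" means, here is a minimal pure-Python sketch. It is not Vaex's API or implementation: the function name `binned_mean`, its parameters, and the synthetic data are all assumptions made for this example, and it runs far slower than the throughput quoted above. It only shows the idea of accumulating per-cell sums and counts for an arbitrary expression, optionally restricted to a selection.

```python
import math
import random


def binned_mean(x, y, value, limits, shape, selection=None):
    """Mean of `value` on a regular 2D grid over (x, y).

    Hypothetical helper for illustration only (not part of Vaex).
    `limits` is ((xmin, xmax), (ymin, ymax)), `shape` is (nx, ny),
    and `selection` is an optional list of booleans (a mask).
    Returns (means, counts); empty cells have a mean of NaN.
    """
    (xmin, xmax), (ymin, ymax) = limits
    nx, ny = shape
    sums = [[0.0] * ny for _ in range(nx)]
    counts = [[0] * ny for _ in range(nx)]
    for i in range(len(x)):
        if selection is not None and not selection[i]:
            continue  # sample is outside the selected subset
        if not (xmin <= x[i] < xmax and ymin <= y[i] < ymax):
            continue  # sample falls outside the grid
        # Map the coordinates to cell indices; clamp against rounding.
        ix = min(int((x[i] - xmin) / (xmax - xmin) * nx), nx - 1)
        iy = min(int((y[i] - ymin) / (ymax - ymin) * ny), ny - 1)
        sums[ix][iy] += value[i]
        counts[ix][iy] += 1
    means = [
        [sums[i][j] / counts[i][j] if counts[i][j] else math.nan
         for j in range(ny)]
        for i in range(nx)
    ]
    return means, counts


# Synthetic "catalogue": positions and velocities drawn from Gaussians.
random.seed(42)
n = 10_000
x = [random.gauss(0, 1) for _ in range(n)]
y = [random.gauss(0, 1) for _ in range(n)]
vx = [random.gauss(0, 1) for _ in range(n)]
vy = [random.gauss(0, 1) for _ in range(n)]

# An arbitrary expression on the data: speed = sqrt(vx**2 + vy**2).
speed = [math.hypot(a, b) for a, b in zip(vx, vy)]

# A selection: only samples with x > 0.
mask = [xi > 0 for xi in x]

grid = {"limits": ((-3, 3), (-3, 3)), "shape": (4, 4)}
means_all, counts_all = binned_mean(x, y, speed, **grid)
means_sel, counts_sel = binned_mean(x, y, speed, selection=mask, **grid)

print(sum(map(sum, counts_all)), "samples fell inside the grid")
print(sum(map(sum, counts_sel)), "of them are in the selection")
```

The resulting grid of means can be shown directly as an image (a density map when counts are used instead), which is why binned statistics scale to catalogue sizes where a scatter plot does not: the size of the output depends on the grid shape, not on the number of samples.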

Presented by:

    • Maarten Breddels
      Founder of vaex.io