Storage of Data

tl;dr: Use HDF5

  1. HDF5
  2. BCOLZ: not designed for multidimensional data.
  3. Zarr: works with multidimensional data and also supports parallel computing.
  4. Blaze ecosystem
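
To make the HDF5 recommendation concrete, here is a minimal sketch of storing a multidimensional array and reading part of it back. It assumes the h5py package, which these notes do not name; the dataset name "grid", the "units" attribute, and the file name are made up for illustration.

```python
import os
import tempfile

import h5py
import numpy as np

# A small 3-dimensional array standing in for real data.
data = np.arange(24, dtype="float64").reshape(2, 3, 4)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example.h5")  # hypothetical file name

    # Write the array as a gzip-compressed dataset with one attribute.
    with h5py.File(path, "w") as f:
        dset = f.create_dataset("grid", data=data, compression="gzip")
        dset.attrs["units"] = "arbitrary"

    # Read back the whole array and a single slice of it.
    with h5py.File(path, "r") as f:
        shape = f["grid"].shape
        loaded = f["grid"][...]
        column = f["grid"][0, :, 1]

print(shape)
print(column)
```

Slicing the dataset reads only the requested part from disk, which is one reason HDF5 is convenient for arrays that do not fit in memory.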

An article that compares HDF5, BCOLZ, and Zarr: "To HDF5 and beyond".

I also recommend pandas, a Python library that works very well with tabular data. It can even read and write HDF5 files directly, through its optional PyTables dependency.
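
As a rough sketch of the pandas route, the snippet below writes a small DataFrame to an HDF5 file and loads it again. It assumes PyTables is installed alongside pandas; the helper `hdf5_roundtrip`, the key "measurements", and the column names are hypothetical names chosen for this example.

```python
import os
import tempfile

import numpy as np
import pandas as pd


def hdf5_roundtrip(df, path, key="measurements"):
    """Write df to an HDF5 file at path, then read it back (illustrative helper)."""
    df.to_hdf(path, key=key, mode="w")
    return pd.read_hdf(path, key=key)


# A tiny table standing in for real data.
df = pd.DataFrame({"energy": np.linspace(0.0, 1.0, 5), "count": np.arange(5)})

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example.h5")  # hypothetical file name
    restored = hdf5_roundtrip(df, path)

print(restored)
```

The `key` argument names the table inside the file, so one HDF5 file can hold several DataFrames under different keys.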


© 2016-2018, Lei Ma