Overview

Datasets Server automatically converts public datasets smaller than 5GB on the Hub to Parquet files and publishes them. Parquet is a columnar format, so a query that touches only a few columns can skip the rest of the data, which makes it well suited to large datasets. You can work with the published Parquet files using several libraries:

  • Polars, a Rust-based DataFrame library
  • Pandas, a data analysis tool for working with data structures
  • DuckDB, a high-performance SQL database for analytical queries
  • ClickHouse, a column-oriented database management system for online analytical processing
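
To give a rough intuition for why a column-based layout helps with analytical queries, here is a minimal sketch in plain Python. It is a toy illustration only: the lists and dictionaries stand in for stored data, the field names (`id`, `text`, `label`) and helper functions are hypothetical, and none of this reflects how Parquet or the libraries above are actually implemented.

```python
# Toy illustration only: in-memory Python objects stand in for stored data.
# Field names and helpers are hypothetical, not part of any real API.

# Row-oriented layout: each record keeps all of its fields together.
rows = [
    {"id": 1, "text": "first example", "label": 0},
    {"id": 2, "text": "second example", "label": 1},
    {"id": 3, "text": "third example", "label": 1},
]


def to_columns(records):
    """Regroup row records into one list per field (column-oriented layout)."""
    return {name: [record[name] for record in records] for name in records[0]}


# Column-oriented layout: each field is stored as its own array.
columns = to_columns(rows)


def label_sum_row_oriented(records):
    """Sum `label` by walking every record, so each full row is visited."""
    return sum(record["label"] for record in records)


def label_sum_column_oriented(table):
    """Sum `label` by reading only the single column the query needs."""
    return sum(table["label"])


if __name__ == "__main__":
    print(label_sum_row_oriented(rows))
    print(label_sum_column_oriented(columns))
```

Both functions return the same result, but the column-oriented version never looks at `id` or `text`. On real data, where a column such as `text` may be far larger than `label`, skipping unneeded columns is where most of the savings would come from.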