🤗 Datasets Server
Datasets Server is a lightweight web API for visualizing and exploring all types of datasets (computer vision, speech, text, and tabular) stored on the Model Database Hub. As datasets grow in size and in the variety of data types they contain, preprocessing them becomes costly in storage and compute, and time-consuming. To make these datasets easier to access, Datasets Server generates the API responses ahead of time and stores them in a database, so they are returned instantly when you query the API.
Let Datasets Server take care of the heavy lifting so you can use a simple REST API on any of the 30,000+ datasets on Model Database to:
- List the dataset splits, column names and data types
- Get the dataset size (in number of rows or bytes)
- Download and view rows at any index in the dataset
- Search for a word in the dataset
- Get insightful statistics about the data
- Access the dataset as parquet files to use in your favorite processing or analytics framework
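As a rough illustration of what the operations listed above could look like from a client, here is a minimal Python sketch that uses only the standard library. The base URL, the endpoint names (`splits`, `size`, `rows`, `search`, `statistics`, `parquet`) and the query parameter names are assumptions made for this example and are not stated on this page, so check them against the API reference before relying on them. The helpers `build_url`, `get_json` and `explore` are hypothetical names introduced here.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: base URL of the public deployment (not given in the text above).
BASE_URL = "https://datasets-server.huggingface.co"


def build_url(endpoint, **params):
    """Build the request URL for an endpoint, skipping parameters set to None."""
    query = urlencode({k: v for k, v in params.items() if v is not None})
    url = f"{BASE_URL}/{endpoint}"
    return f"{url}?{query}" if query else url


def get_json(endpoint, **params):
    """Send a GET request and decode the JSON body of the response."""
    with urlopen(build_url(endpoint, **params), timeout=30) as response:
        return json.load(response)


def explore(dataset, config, split, word):
    """Call one (assumed) endpoint per item in the list above."""
    return {
        # Splits of the dataset
        "splits": get_json("splits", dataset=dataset),
        # Size in rows and bytes
        "size": get_json("size", dataset=dataset),
        # Ten rows starting at index 0
        "rows": get_json("rows", dataset=dataset, config=config,
                         split=split, offset=0, length=10),
        # Rows that contain the given word
        "search": get_json("search", dataset=dataset, config=config,
                           split=split, query=word),
        # Per-column statistics
        "statistics": get_json("statistics", dataset=dataset,
                               config=config, split=split),
        # Locations of the Parquet files
        "parquet": get_json("parquet", dataset=dataset),
    }
```

Calling `explore(...)` with a dataset identifier, a configuration name, a split name and a search word would issue one request per endpoint. Nothing is sent when the module is imported, and `build_url` can be used on its own to inspect the URL that would be requested.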
Figure: Dataset viewer of the OpenAssistant dataset
Join the growing community on the forum or Discord today, and give the Datasets Server repository a ⭐️ if you’re interested in the latest updates!