Labelf Blog

January 11, 2021

9 Great Resources for Finding Data

Where can I find datasets to classify?

1. Hugging Face Datasets

At the time of writing, Hugging Face Datasets hosts over 600 datasets in 80+ languages, all of which can be browsed by tag in its dataset viewer.


With such a huge amount of data, a lot can be found here.

3. Kaggle

Kaggle hosts over 60,000 datasets.

4. The Pile

The Pile consists of roughly 825 GiB of English text data drawn from a wide variety of domains.

5. Pushshift

An API for querying Reddit and other social media data.

6. Stack Overflow

Search and fetch Stack Overflow posts.
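As a rough illustration of what fetching posts could look like, here is a minimal sketch that queries the public Stack Exchange API for Stack Overflow questions. The helper names `build_search_url` and `fetch_questions` are made up for this example, and the choice of pairing question titles with their tags (as text and candidate labels for a classifier) is an assumption about how you might want to use the data, not a recommendation from the API itself.

```python
import gzip
import json
import urllib.parse
import urllib.request

# Search endpoint of the public Stack Exchange API.
API_ROOT = "https://api.stackexchange.com/2.3/search/advanced"


def build_search_url(query, tag=None, page_size=20):
    """Build a search URL restricted to Stack Overflow (hypothetical helper)."""
    params = {
        "site": "stackoverflow",
        "q": query,
        "order": "desc",
        "sort": "votes",
        "pagesize": page_size,
    }
    if tag:
        params["tagged"] = tag
    return API_ROOT + "?" + urllib.parse.urlencode(params)


def fetch_questions(query, tag=None, page_size=20):
    """Return (title, tags) pairs for matching questions.

    This performs a network request, so it needs internet access.
    """
    url = build_search_url(query, tag, page_size)
    with urllib.request.urlopen(url) as resp:
        raw = resp.read()
        # The API compresses its responses; urllib does not decompress them.
        if resp.headers.get("Content-Encoding") == "gzip":
            raw = gzip.decompress(raw)
    payload = json.loads(raw.decode("utf-8"))
    return [(item["title"], item["tags"]) for item in payload.get("items", [])]
```

Calling `fetch_questions("text classification", tag="nlp")` would return a list of titles with their tags, which can serve as a starting point for a labeled dataset. Requests are subject to the API's quotas, so keep page sizes and call counts modest.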


Find data from the US government.

8. Google Dataset Search

Search for datasets with Google.


A great source for multilingual content ranging from subtitles to law.

Viktor Alm

CEO & Co-Founder @ Labelf AI
