Dataset list update - May 2019
You're receiving this email because you subscribed to updates from datasetlist.com
New datasets
2019 is shaping up to be an incredible year for public datasets. There have been several stellar releases in the past few months: Mozilla Common Voice, the largest dataset of human voice to date, and several huge question-answering datasets, including Google Natural Questions and GQA. Two huge medical datasets, MIMIC-CXR and CheXpert, were published simultaneously. The largest publicly available Chinese language corpus has also been released, along with several exciting datasets from Facebook, Nvidia and IBM, among others.
The list of datasets from before 2019 is now more comprehensive thanks to community feedback. Numerous significant datasets have been added; scroll down the list to find them.
Product Hunt
Dataset list is now on Product Hunt! I'm sharing Dataset list with the world, so feel free to join the discussion here.
New categories
Two new categories have been added: Medical and QA. There's a great deal of interest and research in these fields, and I'm hoping that adding these categories will make it easier to find relevant datasets.
Thank you!
I really appreciate all the feedback I have received over the past couple of weeks. Your suggestions, ideas and new dataset tips are helping to make this page much better and more useful than I could ever hope to achieve on my own.
As always, if you have any feedback, or you know of a dataset that belongs on the list, just reply to this email and let me know.
If you have any other ideas about content that could go on the page or a dataset-related problem you'd like to solve, hit reply!
Nikola from Dataset list