Recommender Systems Workshop
Public Data Sets
Hi all!
Please help us by keeping this list updated :)
Main homepage:
Recommender Systems Workshop
Recommender Systems and Personalization Datasets
Collection of datasets gathered for research
by Julian McAuley's group at UCSD
Download
MovieLens (Movies)
One of the most popular datasets for collaborative filtering
Time range: 1995–2019
Type of data: Explicit ratings (1–5 stars)
Size: 100K, 1M, 10M, 25M ratings
Download
Amazon Product Reviews (E-commerce)
Large-scale dataset of product reviews and ratings
Time range: 1996–2018
Type of data: Explicit ratings (1–5 stars), reviews
Size: 233M+ reviews across categories
Download
Netflix Prize (Movies)
Classic dataset from the Netflix Prize competition
Time range: ~1998–2005
Type of data: Explicit ratings (1–5 stars)
Size: 100M+ ratings from 480K users
Download
Yelp Open Dataset (Local Businesses)
Large dataset for local business recommendation
Time range: 2004–2022
Type of data: Explicit ratings (1–5 stars), reviews
Size: 8.6M reviews from 1.3M users
Download
Spotify Million Playlist Dataset (Music)
Dataset for playlist-based music recommendation
Time range: 2010–2017
Type of data: Implicit interaction (playlists)
Size: 1M playlists, 2M unique tracks
Download
MIND (News Recommendation)
Large-scale dataset of MSN News click interactions
Type of data: Implicit feedback (clicks on news articles)
Includes article titles, abstracts, categories
Download
Yahoo! Music Ratings (Music)
Explicit song and album ratings from Yahoo! Music users
Type of data: Explicit ratings (0–100 scale)
Part of the KDD Cup 2011 challenge
Download
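Note on rating scales
Most of the explicit-rating datasets above use 1–5 stars, while Yahoo! Music uses 0–100. If you want to compare or combine them, one simple option is a min-max rescale to a common range. The snippet below is only a minimal sketch of that idea: the function name `rescale_rating` and the `SCALES` table are made up for illustration and are not part of any dataset's tooling.

```python
# Hypothetical helper: map a rating from its source scale onto a target range.
# The scale bounds below come from the "Type of data" lines in this list.
SCALES = {
    "stars_1_5": (1, 5),      # MovieLens, Amazon, Netflix Prize, Yelp
    "yahoo_0_100": (0, 100),  # Yahoo! Music
}


def rescale_rating(rating, src_min, src_max, dst_min=0.0, dst_max=1.0):
    """Linearly map `rating` from [src_min, src_max] to [dst_min, dst_max]."""
    if src_max <= src_min:
        raise ValueError("src_max must be greater than src_min")
    if not src_min <= rating <= src_max:
        raise ValueError("rating is outside the source scale")
    fraction = (rating - src_min) / (src_max - src_min)
    return dst_min + fraction * (dst_max - dst_min)


if __name__ == "__main__":
    # A 3-star rating and a Yahoo! Music rating of 50 both sit mid-scale.
    print(rescale_rating(3, *SCALES["stars_1_5"]))
    print(rescale_rating(50, *SCALES["yahoo_0_100"]))
```

A linear rescale only aligns the ranges. It does not account for users rating differently on different platforms, so treat it as a starting point.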