Authors: Yufan Zhang, Ruitian Wu, Tian Jin, Edward Wang, Rundong Hu
Report: yufanbruce.com/project/dsw
This repository presents a restaurant recommendation system built on Yelp’s public dataset. Our study aims to predict user preferences for new restaurants by combining restaurant features with user interaction data. We use content-based and collaborative filtering methods, including Linear Regression, Random Forest Regression, and Alternating Least Squares (ALS), to build a multi-faceted recommendation model. In our experiments, the Linear Regression and Random Forest-based collaborative filtering models outperformed the other recommenders we evaluated. The study also highlights the limitations of collaborative filtering on sparse datasets.
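For readers unfamiliar with ALS, the following is a minimal NumPy sketch of the general technique on a toy rating matrix. It is not the code used in this project: the function name `als_factorize`, the hyperparameters (`rank`, `reg`, `iters`), and the 4x4 example matrix are all illustrative assumptions.

```python
import numpy as np

def als_factorize(ratings, observed, rank=2, reg=0.1, iters=50, seed=0):
    """Approximate the observed entries of `ratings` by U @ V.T using ALS.

    `observed` is a boolean matrix marking which ratings are known.
    All names and defaults here are illustrative, not the project's settings.
    """
    rng = np.random.default_rng(seed)
    n_users, n_items = ratings.shape
    U = rng.normal(scale=0.1, size=(n_users, rank))
    V = rng.normal(scale=0.1, size=(n_items, rank))
    ridge = reg * np.eye(rank)
    for _ in range(iters):
        # Fix item factors; solve one small ridge regression per user.
        for u in range(n_users):
            idx = observed[u]
            if idx.any():
                Vu = V[idx]
                U[u] = np.linalg.solve(Vu.T @ Vu + ridge, Vu.T @ ratings[u, idx])
        # Fix user factors; solve one small ridge regression per item.
        for i in range(n_items):
            idx = observed[:, i]
            if idx.any():
                Ui = U[idx]
                V[i] = np.linalg.solve(Ui.T @ Ui + ridge, Ui.T @ ratings[idx, i])
    return U, V

# Toy data: two groups of users with opposite tastes (hypothetical ratings).
R = np.array([
    [5.0, 5.0, 1.0, 1.0],
    [5.0, 5.0, 1.0, 1.0],
    [1.0, 1.0, 5.0, 5.0],
    [1.0, 1.0, 5.0, 5.0],
])
observed = np.ones_like(R, dtype=bool)
for u, i in [(0, 3), (1, 0), (2, 1), (3, 2)]:
    observed[u, i] = False  # pretend these ratings are unknown

U, V = als_factorize(R, observed)
pred = U @ V.T
rmse = float(np.sqrt(np.mean((pred[observed] - R[observed]) ** 2)))
print("RMSE on observed ratings:", rmse)
print("Predictions for held-out entries:", pred[~observed])
```

Each update only uses the ratings a user or item actually has. When most users have rated very few restaurants, these per-user regressions are fit on a handful of points, which is one way to see why collaborative filtering struggles on sparse data.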
Download the Yelp dataset archive from the URL, then extract it:
unzip yelp_dataset.zip
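Once the archive is extracted, the data can be read line by line. The sketch below assumes the newline-delimited JSON layout of the Yelp Open Dataset, with file names `yelp_academic_dataset_business.json` and `yelp_academic_dataset_review.json` and the fields `business_id`, `categories`, `user_id`, and `stars`. The helper names are hypothetical and are not taken from this repository; adjust paths and fields to match your copy of the data.

```python
import json
from pathlib import Path

def read_json_lines(path):
    """Yield one dict per non-empty line of a newline-delimited JSON file."""
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:
                yield json.loads(line)

def restaurant_ids(business_path):
    """Collect business_id values whose categories mention 'Restaurants'."""
    ids = set()
    for business in read_json_lines(business_path):
        # `categories` is assumed to be a comma-separated string or null.
        categories = business.get("categories") or ""
        if "Restaurants" in categories:
            ids.add(business["business_id"])
    return ids

def restaurant_ratings(review_path, keep_ids):
    """Return (user_id, business_id, stars) triples for the selected businesses."""
    return [
        (review["user_id"], review["business_id"], float(review["stars"]))
        for review in read_json_lines(review_path)
        if review["business_id"] in keep_ids
    ]

# Assumed file locations; the block is skipped if the files are not present.
data_dir = Path(".")
business_file = data_dir / "yelp_academic_dataset_business.json"
review_file = data_dir / "yelp_academic_dataset_review.json"
if business_file.exists() and review_file.exists():
    ids = restaurant_ids(business_file)
    ratings = restaurant_ratings(review_file, ids)
    print(len(ids), "restaurants,", len(ratings), "ratings")
```

The review file is large, so holding every triple in a list may need several gigabytes of memory. Iterating over `read_json_lines` directly, or sampling, is a reasonable alternative on smaller machines.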
This project grew out of the final project for CS 5304 Data Science in the Wild (Spring 2024) at Cornell Tech.