In this paper, we propose DeepRec, a novel LLM-based recommender that enables autonomous multi-turn interaction between LLMs and traditional recommendation models (TRMs) for deep exploration of the item space. In each interaction turn, the LLM reasons over user preferences and collaborates with the TRM to retrieve candidate items. After the multi-turn interaction, the LLM ranks the aggregated candidates to produce the final recommendations. We optimize the model with reinforcement learning (RL) and introduce novel contributions in three key aspects: recommendation-model-based data rollout, recommendation-oriented hierarchical rewards, and a two-stage RL training strategy. For data rollout, we design a preference-aware TRM with which the LLM interacts to construct trajectory data. For reward design, we propose a hierarchical reward function that combines process-level and outcome-level rewards to optimize the interaction process and the recommendation quality, respectively. For RL training, our two-stage strategy first guides the LLM to learn effective interactions with the TRM, and then applies recommendation-oriented RL to enhance recommendation performance.
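As a rough illustration of the reward design described above, the sketch below combines a process-level term (rewarding valid, productive interaction turns) with an outcome-level term (rewarding the rank of the ground-truth item in the final list). All function names, fields, signals, and weights here are illustrative assumptions, not the implementation used in DeepRec.

```python
# Hedged sketch of a hierarchical reward: illustrative only, not DeepRec's actual code.

def process_reward(trajectory):
    """Process-level term (assumed signal): reward each interaction turn whose
    retrieval query is well-formed and actually returns candidates."""
    return sum(
        1.0 for turn in trajectory
        if turn.get("valid_query") and turn.get("candidates")
    )

def outcome_reward(ranked_items, target_item, k=10):
    """Outcome-level term (assumed signal): reciprocal rank of the ground-truth
    item if it appears in the top-k of the final ranking, else 0."""
    topk = ranked_items[:k]
    if target_item in topk:
        return 1.0 / (topk.index(target_item) + 1)
    return 0.0

def hierarchical_reward(trajectory, ranked_items, target_item, alpha=0.1):
    """Combine the two levels: the outcome term dominates, while the process term
    shapes how the LLM interacts with the TRM (alpha is an assumed weight)."""
    return outcome_reward(ranked_items, target_item) + alpha * process_reward(trajectory)
```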
pip install torch==2.5.1
pip install transformers==4.46.3
pip install vllm==0.6.5
pip install packaging
pip install ninja
pip install flash-attn --no-build-isolation
pip install deepspeed
pip install accelerate
pip install datasets
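After installation, a quick check like the one below (illustrative, assuming a CUDA-capable GPU; not part of the repository) confirms that the core packages import correctly:

```python
# Optional post-install sanity check.
import torch
import transformers
import vllm

print("torch:", torch.__version__, "| CUDA available:", torch.cuda.is_available())
print("transformers:", transformers.__version__)
print("vllm:", vllm.__version__)
```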
You can find all the datasets we used on Google Drive. Please download the file and unzip it to the data/ folder.
You can download the model parameters of the preference-aware TRM for both datasets here. Please download the file and unzip it to the server/ folder.
You can find all the run scripts in the scripts/ folder.
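# Recall Server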
bash scripts/recall.sh
# Reward Server
bash scripts/reward.sh 5001 cold
# Training
bash scripts/cold_train.sh
# Reward Server
bash scripts/reward.sh 5002 rec
# Training
# ckpt_dir_of_cold_start is the model checkpoint directory from the cold-start RL stage
bash scripts/rec_train.sh ckpt_dir_of_cold_start
# Generation
# start_idx and end_idx are the start and end indexes of the test data, respectively
# final_ckpt_dir is the model checkpoint directory after the two-stage RL training
bash scripts/eval_generate.sh gpu_id start_idx end_idx final_ckpt_dir
# Metric Calculation
# test_dir is the directory containing the test results generated by the model
python evaluation/metric_calc_rec.py --test_results_dir test_dir
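For reference, the evaluation reduces to standard top-K ranking metrics over the generated results. The sketch below shows how HR@K and NDCG@K could be computed; the file layout and field names (ranked_items, target_item) are assumptions for illustration and do not reflect the actual format expected by metric_calc_rec.py.

```python
# Hedged sketch of top-K metric computation; the result format is assumed.
import glob
import json
import math

def hr_ndcg_at_k(ranked_items, target, k=10):
    """Return (HR@K, NDCG@K) for one test case."""
    topk = ranked_items[:k]
    if target not in topk:
        return 0.0, 0.0
    rank = topk.index(target)  # 0-based position of the ground-truth item
    return 1.0, 1.0 / math.log2(rank + 2)

hits, ndcgs = [], []
for path in glob.glob("test_dir/*.json"):  # assumed: one JSON file per generation shard
    with open(path) as f:
        for record in json.load(f):
            hr, ndcg = hr_ndcg_at_k(record["ranked_items"], record["target_item"], k=10)
            hits.append(hr)
            ndcgs.append(ndcg)

if hits:
    print(f"HR@10: {sum(hits) / len(hits):.4f}  NDCG@10: {sum(ndcgs) / len(ndcgs):.4f}")
```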