An end-to-end machine learning pipeline for predicting customer churn using a Kaggle telecom dataset. This project includes data loading, cleaning, preprocessing, feature encoding, and model training with hyperparameter tuning using RandomizedSearchCV. Evaluation is performed using classification metrics and AUC score.

reemkhaleed/Customer-Churn-Prediction-with-Scikit-learn

Customer-Churn-Prediction-with-Scikit-learn

Churn Prediction Pipeline | pandas + sklearn

This project is an end-to-end machine learning pipeline to predict customer churn based on a telecom dataset from Kaggle. It includes steps from data loading to model training and evaluation using scikit-learn.


📊 Dataset

  • Source: Customer Churn Analysis Dataset
  • The dataset contains customer information such as contract type, tenure, payment method, and service usage.
  • The target variable is Churn (Yes/No), indicating whether a customer left the company.
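The cleaning step described above could look roughly like the sketch below. It is only an illustration: the column names (`tenure`, `MonthlyCharges`, `Contract`), the helper `clean_churn_frame`, and the median fill are assumptions rather than details taken from the project, and the tiny frame stands in for the real CSV.

```python
import pandas as pd


def clean_churn_frame(df, numeric_cols, target_col="Churn"):
    """Coerce numeric columns, fill gaps, and map the Yes/No target to 1/0."""
    out = df.copy()
    for col in numeric_cols:
        # Unparseable strings (for example blanks) become NaN,
        # then are replaced by the column median (an assumed strategy).
        out[col] = pd.to_numeric(out[col], errors="coerce")
        out[col] = out[col].fillna(out[col].median())
    # Map the target to integers; rows with an unrecognised label are dropped.
    out[target_col] = out[target_col].map({"Yes": 1, "No": 0})
    out = out.dropna(subset=[target_col])
    out[target_col] = out[target_col].astype(int)
    return out


# Made-up rows standing in for the Kaggle file (hypothetical column names).
raw = pd.DataFrame({
    "tenure": [1, 24, 60, 5],
    "MonthlyCharges": ["29.85", " ", "99.10", "70.00"],
    "Contract": ["Month-to-month", "One year", "Two year", "Month-to-month"],
    "Churn": ["Yes", "No", "No", "Yes"],
})
clean = clean_churn_frame(raw, numeric_cols=["tenure", "MonthlyCharges"])
```

With the real data, `raw` would instead come from `pd.read_csv` on the downloaded file, and the list of numeric columns would need to match the actual schema.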

🚀 Features

✅ Load and explore a realistic telecom customer dataset
✅ Handle missing values and convert column data types
✅ Encode categorical variables with OneHotEncoder
✅ Scale numeric features with StandardScaler
✅ Train a Random Forest classifier
✅ Tune hyperparameters with RandomizedSearchCV
✅ Evaluate performance with a classification report and ROC AUC
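
The preprocessing, tuning, and evaluation steps listed above can be sketched as a single scikit-learn pipeline. This is a minimal illustration, not the project's actual notebook: the data is synthetic, the column names are hypothetical, and the search space and `n_iter` value are assumptions chosen only to keep the example fast.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report, roc_auc_score
from sklearn.model_selection import RandomizedSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Synthetic stand-in for the cleaned dataset (hypothetical columns).
rng = np.random.default_rng(0)
n = 200
X = pd.DataFrame({
    "tenure": rng.integers(1, 72, n),
    "MonthlyCharges": rng.uniform(20, 120, n),
    "Contract": rng.choice(["Month-to-month", "One year", "Two year"], n),
})
# Made-up target: short-tenure customers churn more often.
y = (rng.random(n) < np.where(X["tenure"] < 12, 0.7, 0.2)).astype(int)

numeric_cols = ["tenure", "MonthlyCharges"]
categorical_cols = ["Contract"]

# Scale numeric columns and one-hot encode categorical ones.
preprocess = ColumnTransformer([
    ("num", StandardScaler(), numeric_cols),
    ("cat", OneHotEncoder(handle_unknown="ignore"), categorical_cols),
])

model = Pipeline([
    ("preprocess", preprocess),
    ("clf", RandomForestClassifier(random_state=0)),
])

# Assumed search space; the "clf__" prefix targets the pipeline step.
param_distributions = {
    "clf__n_estimators": [50, 100, 200],
    "clf__max_depth": [None, 5, 10],
    "clf__min_samples_leaf": [1, 2, 5],
}

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

search = RandomizedSearchCV(
    model,
    param_distributions,
    n_iter=5,
    cv=3,
    scoring="roc_auc",
    random_state=0,
)
search.fit(X_train, y_train)

# Evaluate the best pipeline on the held-out split.
proba = search.predict_proba(X_test)[:, 1]
auc = roc_auc_score(y_test, proba)
report = classification_report(y_test, search.predict(X_test))
print(report)
print(f"ROC AUC: {auc:.3f}")
```

Keeping the scaler and encoder inside the pipeline means they are refit on each cross-validation fold, so no information from the validation folds leaks into preprocessing. Tree ensembles such as Random Forest do not strictly need scaled inputs; the scaler is kept here only because the feature list includes it.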


🧱 Tech Stack

  • Python 3
  • pandas
  • scikit-learn
  • numpy
  • Jupyter / Google Colab
