An end-to-end machine learning pipeline built on AWS SageMaker Pipelines, designed to support parallel model development and batch scoring on distributed, containerized infrastructure.
This project demonstrates the use of SageMaker Pipelines to operationalize a machine learning workflow that includes:
- Feature engineering
- Model training with XGBoost
- Model evaluation based on MSE threshold
- Conditional model registration
- Offline batch scoring using SageMaker Batch Transform
Ideal for MLOps teams looking to streamline experimentation, ensure consistency in deployment workflows, and scale processing across compute instances.
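The gating logic behind these stages can be sketched in plain Python. This is only an illustration of the control flow; the actual project expresses each step as a SageMaker Pipeline step, and the threshold and metric values below are hypothetical placeholders:

```python
# Minimal sketch of the pipeline's evaluate-then-register gate in plain
# Python. The real pipeline uses SageMaker Pipeline steps; the threshold
# and sample values here are illustrative placeholders.

def mean_squared_error(y_true, y_pred):
    """MSE between two equal-length sequences."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def run_pipeline(y_val, y_pred, mse_threshold=6.0):
    """Evaluate, then conditionally register and batch-score the model."""
    mse = mean_squared_error(y_val, y_pred)
    if mse < mse_threshold:
        # In the real pipeline: Register Model + Batch Transform steps run
        return {"mse": mse, "registered": True}
    # Models above the threshold are not promoted or used for scoring
    return {"mse": mse, "registered": False}

result = run_pipeline([3.0, 5.0, 2.5], [2.5, 5.0, 3.0], mse_threshold=6.0)
print(result)  # registered, since MSE is well below the threshold
```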
Pipeline stages:

| Stage | Description |
|---|---|
| Processing | Executes preprocessing.py to clean and split the data |
| Training | Trains an XGBoost model on the training set |
| Evaluation | Evaluates the model against the validation set using MSE |
| Register Model | Registers the model if MSE < threshold |
| Batch Transform | Scores batch data using the newly trained model |
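The handoff from Evaluation to Register Model typically flows through a JSON metrics report that the condition check reads back. A minimal stdlib sketch is below; the `evaluation.json` file name and `regression_metrics.mse.value` schema are assumptions modeled on common SageMaker examples, not taken from this repo's code:

```python
import json
import os
import tempfile

# Sketch of the evaluation-report handoff between the Evaluation and
# Register Model stages. The report schema is an assumption modeled on
# common SageMaker examples, not this repo's code.

def write_evaluation_report(path, mse):
    """Evaluation step: persist metrics for downstream condition checks."""
    report = {"regression_metrics": {"mse": {"value": mse}}}
    with open(path, "w") as f:
        json.dump(report, f)

def should_register(path, threshold):
    """Condition check: read the report and compare MSE to the threshold,
    mirroring what a SageMaker ConditionStep does via JsonGet."""
    with open(path) as f:
        report = json.load(f)
    return report["regression_metrics"]["mse"]["value"] < threshold

report_path = os.path.join(tempfile.mkdtemp(), "evaluation.json")
write_evaluation_report(report_path, mse=4.2)
print(should_register(report_path, threshold=6.0))  # True: model is registered
```

Keeping the metric in a file rather than in memory is what lets the evaluation and registration stages run as separate containers.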
With the lessons learned from this experiment, we successfully implemented parallel model development and scoring pipelines for four models, supporting both Purchase and Refinance scenarios in production.
Quick start:

1. Clone the repo: `git clone https://github.com/krishnamami/Distributed_ML_Sagemaker_Pipelines.git`
2. Install dependencies: `pip install -r requirements.txt`
3. Run the pipeline: `python sage_maker_pipeline.py`
Author: Krishna Goud
Head of Data Engineering & MLOps | Rocket LA