🚀 Self-Healing Infrastructure with Observability (Prometheus + Grafana + Loki + Jaeger + Ansible)

This project implements a monitoring and self-healing DevOps stack built from containerized services. It combines an observability pipeline (metrics, logs, and traces) with a self-recovery loop that restarts the application when an alert fires for a failure or threshold breach.

Built with:
Prometheus, Grafana, Loki, Promtail, Jaeger, Alertmanager, Ansible, Flask, Docker, and Docker Compose

Key features:

📈 Real-time system & app metrics (Prometheus + Grafana)
📬 Smart alerting & auto-remediation via Ansible
📜 Centralized logging using Loki + Promtail
🧵 Distributed tracing with Jaeger
⚙️ All services containerized and orchestrated via Docker Compose

📸 Project Screenshot

🧱 Project Architecture

graph TD
  App[Flask App]
  Prometheus -->|scrapes| App
  Prometheus --> NodeExporter
  Prometheus --> Alertmanager
  Promtail --> Loki
  Loki --> Grafana
  Jaeger --> Grafana
  Prometheus --> Grafana
  Alertmanager --> Webhook[Webhook Listener]
  Webhook --> Ansible[Ansible Playbook: restart_app.yml]
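
The `Prometheus -->|scrapes| App` edge means the Flask app serves its metrics over HTTP in Prometheus's text exposition format. The sketch below renders one counter in that format using only the standard library. The `Counter` class and the metric name `app_requests_total` are illustrative assumptions; the real `app.py` may well use a client library such as `prometheus_client` instead.

```python
import threading


class Counter:
    """A tiny thread-safe counter (hypothetical stand-in for a client library)."""

    def __init__(self, name, help_text):
        self.name = name
        self.help_text = help_text
        self._value = 0
        self._lock = threading.Lock()

    def inc(self, amount=1):
        # Counters only ever go up; reject negative increments.
        if amount < 0:
            raise ValueError("counters can only increase")
        with self._lock:
            self._value += amount

    def render(self):
        # Prometheus text exposition format: HELP line, TYPE line, sample line.
        with self._lock:
            value = self._value
        return (
            f"# HELP {self.name} {self.help_text}\n"
            f"# TYPE {self.name} counter\n"
            f"{self.name} {value}\n"
        )


# Hypothetical metric name, chosen for illustration only.
REQUESTS = Counter("app_requests_total", "Total HTTP requests handled.")


def metrics_body():
    """Return the text a scrape endpoint would serve."""
    return REQUESTS.render()
```

A Flask route at `/metrics` (the path Prometheus scrapes by default) could return `metrics_body()` with a `text/plain` content type.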

📂 Folder Structure

infra-guardian/
├── app/                      # Flask app with metrics & tracing
│   ├── app.py
│   ├── requirements.txt
│   └── Dockerfile
│
├── monitoring/
│   ├── prometheus.yml
│   ├── alert_rules.yml
│   ├── alertmanager.yml
│   ├── loki-config.yaml
│   └── promtail-config.yml
│
├── ansible/
│   ├── inventory
│   └── restart_app.yml
│
├── Dockerfile.webhook       # For webhook container
├── webhook.py               # Webhook listener (Flask)
├── requirements_webhook.txt
├── docker-compose.yml
├── screenshots/
│   ├── grafana-dashboard.png
│   ├── setup observability.webp
│   ├── firing.png
│   └── resolved.png
└── README.md

🛠️ Tools Used

Docker + Docker Compose – containerized environment
Prometheus – metrics collection
Grafana – dashboarding and visualization
Loki + Promtail – log aggregation
Jaeger – distributed tracing
Alertmanager – alert routing
Webhook Listener – custom Flask service
Ansible – automation for self-healing

🚀 How to Run the Project

git clone https://github.com/Ayush2005547/infra-guardian.git
cd infra-guardian
docker-compose up --build

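Then open the web UIs (Grafana, Prometheus, Alertmanager, Jaeger) on the host ports published in docker-compose.yml.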

✅ Workflow Summary

Prometheus scrapes metrics from App and Node Exporter.
Prometheus evaluates the alert rules and sends firing alerts to Alertmanager on a threshold breach.
Alertmanager routes the alert to the webhook listener, which triggers the Ansible playbook.
Ansible restarts the Docker container.
Logs and traces are recorded using Loki & Jaeger.
Grafana shows real-time dashboards for metrics, logs & traces.
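
The webhook step in the workflow above can be sketched as follows. This is a minimal illustration of the handler logic only, not the contents of `webhook.py`: the function names are hypothetical, the playbook and inventory paths are taken from the folder structure, and the `alerts`, `status`, and `labels.alertname` fields follow Alertmanager's webhook payload. The `run` parameter is injectable so the logic can be exercised without Ansible installed.

```python
import subprocess

# Paths from the folder structure; adjust if the webhook container
# mounts them somewhere else (assumption).
INVENTORY = "ansible/inventory"
PLAYBOOK = "ansible/restart_app.yml"


def firing_alert_names(payload):
    """Return the alertname label of every alert that is currently firing."""
    return [
        alert.get("labels", {}).get("alertname", "unknown")
        for alert in payload.get("alerts", [])
        if alert.get("status") == "firing"
    ]


def build_playbook_command(inventory=INVENTORY, playbook=PLAYBOOK):
    """Build the ansible-playbook invocation as an argument list (no shell)."""
    return ["ansible-playbook", "-i", inventory, playbook]


def handle_alert(payload, run=subprocess.run):
    """Run the restart playbook if at least one alert in the payload is firing."""
    names = firing_alert_names(payload)
    if not names:
        # Resolved-only notifications need no remediation.
        return {"status": "ignored", "alerts": []}
    result = run(build_playbook_command(), capture_output=True, text=True)
    status = "remediated" if result.returncode == 0 else "failed"
    return {"status": status, "alerts": names}
```

In a Flask service, a POST route would pass the parsed JSON body to `handle_alert` and return the resulting dictionary as the response.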

📈 Sample Alerts Configured

High CPU Usage (>80%)
Instance Down
App Unresponsive
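
The actual rules live in `monitoring/alert_rules.yml` and are not reproduced here. As a hedged illustration of the High CPU Usage condition, the sketch below shows a PromQL expression commonly used with Node Exporter and mirrors its arithmetic in plain Python. Both the expression and the function names are assumptions, not copied from the repository.

```python
# A commonly used PromQL expression for this kind of rule (assumption,
# not taken from alert_rules.yml).
HIGH_CPU_EXPR = (
    "100 - (avg by (instance) "
    '(rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100) > 80'
)


def cpu_usage_percent(idle_start, idle_end, interval_seconds, cores=1):
    """Estimate CPU usage from two readings of the summed idle-time counter.

    idle_start / idle_end: idle CPU seconds summed over all cores.
    interval_seconds: wall-clock time between the two readings.
    """
    if interval_seconds <= 0 or cores <= 0:
        raise ValueError("interval_seconds and cores must be positive")
    # Idle seconds accumulated per second, averaged across cores.
    idle_fraction = (idle_end - idle_start) / interval_seconds / cores
    return 100.0 * (1.0 - idle_fraction)


def breaches_threshold(usage_percent, threshold=80.0):
    """True when usage is strictly above the alert threshold."""
    return usage_percent > threshold
```

For example, on a 4-core host whose summed idle counter grows by 24 seconds over a 60-second window, the cores were idle 10% of the time, so usage is about 90% and the 80% threshold is breached.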

📊 Grafana Dashboards Include:

CPU, RAM, Disk (from Node Exporter)
Application Uptime & Metrics
Real-time Logs (from Loki)
Tracing Visuals (from Jaeger)
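
Log panels are usually easier to filter when the application emits one structured record per line. The sketch below is one way to do that with the standard `logging` module, assuming Promtail is configured to collect the container's stdout. The field names and the `build_logger` helper are hypothetical and not taken from `app.py`.

```python
import json
import logging
import sys


class JsonFormatter(logging.Formatter):
    """Format each log record as a single JSON object per line."""

    def format(self, record):
        entry = {
            "ts": self.formatTime(record, "%Y-%m-%dT%H:%M:%S"),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        return json.dumps(entry)


def build_logger(name="app", stream=None):
    """Return a logger that writes JSON lines to the given stream (stdout by default)."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.handlers.clear()  # avoid duplicate handlers on repeated calls
    handler = logging.StreamHandler(stream if stream is not None else sys.stdout)
    handler.setFormatter(JsonFormatter())
    logger.addHandler(handler)
    logger.propagate = False
    return logger
```

Calling `build_logger().info("app restarted")` writes one JSON line whose `level` and `message` fields can then be parsed on the Loki side.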

📄 License

MIT License. Fork freely and adapt!

👨‍💻 Authors

Ayush Ahirwar
