Exploratory Data Analysis (EDA) on a cardiovascular health dataset to uncover patterns, correlations, and possible risk factors for heart attacks.
This analysis focuses on patient vitals and lab results, identifying trends that differentiate positive and negative heart attack cases.
Heart attacks remain one of the leading causes of death worldwide.
Through data analysis, we can identify health factors that significantly impact heart attack risk.
In this project, we:
- Clean and prepare the dataset for analysis
- Explore relationships between health metrics and heart attack outcomes
- Visualize data for better understanding
- Generate statistical insights to aid early detection
Column | Description |
---|---|
age | Age of the patient (years) |
gender | Gender (1 = male, 0 = female) |
impluse | Pulse / heart rate (beats per minute) |
pressurehight | Systolic blood pressure (mmHg) |
pressurelow | Diastolic blood pressure (mmHg) |
glucose | Blood sugar level (mg/dL) |
kcm | CK-MB (creatine kinase-MB) level, a cardiac enzyme marker |
troponin | Troponin level (ng/mL), a key marker for heart damage |
class | Heart attack outcome (positive or negative) |
Sample Record:
age | gender | impluse | pressurehight | pressurelow | glucose | kcm | troponin | class |
---|---|---|---|---|---|---|---|---|
64 | 1 | 66 | 160 | 83 | 160 | 1.8 | 0.012 | negative |
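
As a minimal sketch of how a table with these columns might be loaded, the snippet below reads CSV text into Pandas, tidies the column names, and encodes the outcome as 0/1. The function name `load_heart_data`, the renamed columns (`pulse`, `systolic_bp`, `diastolic_bp`), and the `is_positive` column are assumptions made for illustration, not names from the project. The demo parses the single sample record shown above from an in-memory string; a real run would pass the path to the dataset file instead.

```python
import io

import pandas as pd

# Clearer names for the misspelled columns; the names on the right are assumptions.
RENAME_MAP = {
    "impluse": "pulse",
    "pressurehight": "systolic_bp",
    "pressurelow": "diastolic_bp",
}


def load_heart_data(source):
    """Read the CSV, tidy column names, and encode the outcome as 0/1."""
    df = pd.read_csv(source)
    # Normalise header whitespace and case before renaming.
    df.columns = df.columns.str.strip().str.lower()
    df = df.rename(columns=RENAME_MAP)
    # Labels may carry stray spaces (e.g. "negative "), so strip them.
    df["class"] = df["class"].str.strip().str.lower()
    df["is_positive"] = (df["class"] == "positive").astype(int)
    return df


# Demo on the sample record only; replace sample_csv with a file path in practice.
sample_csv = io.StringIO(
    "age,gender,impluse,pressurehight,pressurelow,glucose,kcm,troponin,class\n"
    "64,1,66,160,83,160,1.8,0.012,negative \n"
)
df = load_heart_data(sample_csv)
print(df[["age", "pulse", "systolic_bp", "is_positive"]])
```

Renaming is optional; keeping the original column names works just as well as long as later code uses them consistently.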
- Python 3.x
- Pandas: Data manipulation
- NumPy: Numerical computations
- Matplotlib / Seaborn: Visualization
- Plotly: Interactive graphs (optional)
- Data Loading: Import dataset into Pandas
- Data Cleaning: Fix column names, handle missing values, correct outliers
- Univariate Analysis: Distributions of age, glucose, blood pressure, troponin, etc.
- Bivariate Analysis: Compare medical metrics between positive and negative cases
- Correlation Analysis: Heatmaps to find relationships between features
- Feature Insights: Identify most important indicators for heart attack detection
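
The steps above could be sketched roughly as follows. This is an illustration under stated assumptions, not the project's actual notebook: the data is randomly generated placeholder data (not real patient records), the `clean` helper is hypothetical, and the median fill and the 20 to 250 bpm plausibility range for pulse are arbitrary example choices.

```python
import numpy as np
import pandas as pd

# Synthetic placeholder rows so the sketch runs standalone; NOT real patient data.
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "age": rng.integers(20, 90, n),
    "impluse": rng.integers(50, 120, n),
    "pressurehight": rng.integers(90, 200, n),
    "glucose": rng.integers(70, 300, n).astype(float),
    "troponin": rng.exponential(0.05, n),
    "class": rng.choice(["positive", "negative"], n),
})
df.loc[0, "glucose"] = np.nan  # simulate a missing value
df.loc[1, "impluse"] = 1111    # simulate an implausible pulse reading


def clean(frame):
    """Fill numeric gaps with column medians and drop implausible pulse values."""
    out = frame.fillna(frame.median(numeric_only=True))
    # The 20-250 bpm range is an assumed plausibility window, not a clinical rule.
    out = out[out["impluse"].between(20, 250)]
    return out.reset_index(drop=True)


clean_df = clean(df)

# Univariate: summary statistics for a single feature.
age_summary = clean_df["age"].describe()

# Bivariate: compare medians between positive and negative cases.
by_class = clean_df.groupby("class")[["age", "pressurehight", "troponin"]].median()

# Correlation: include a 0/1 target so it appears in the matrix.
numeric = clean_df.assign(
    target=(clean_df["class"] == "positive").astype(int)
).select_dtypes("number")
corr = numeric.corr()

print(age_summary)
print(by_class)
print(corr.round(2))
# With Seaborn installed, the matrix could be drawn via sns.heatmap(corr, annot=True).
```

Because the placeholder values are random, the printed numbers carry no medical meaning; the point is only the shape of each step.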
- Elevated troponin levels are strongly associated with heart attacks
- Patients with high systolic blood pressure (>140 mmHg) and age above 50 show higher risk
- Pulse rate anomalies may correlate with increased probability of a positive case
- Abnormal glucose and kcm values could be secondary risk factors
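
One way to check observations like these is to flag a risk group and compare outcome rates, as in the sketch below. The six rows are hand-made for illustration (only the first mirrors the sample record), and the `high_risk` and `is_positive` columns are hypothetical names; the thresholds come from the systolic pressure and age observation above. Results on toy rows say nothing about the real dataset.

```python
import pandas as pd

# Hand-made illustrative rows (hypothetical, not taken from the dataset).
df = pd.DataFrame({
    "age":           [64, 45, 58, 72, 39, 66],
    "pressurehight": [160, 120, 150, 145, 118, 130],
    "troponin":      [0.012, 0.004, 1.200, 0.850, 0.003, 0.020],
    "class": ["negative", "negative", "positive",
              "positive", "negative", "positive"],
})

# Encode the outcome so that a group mean equals the share of positive cases.
df["is_positive"] = (df["class"] == "positive").astype(int)

# Flag patients with systolic pressure above 140 mmHg and age above 50.
df["high_risk"] = (df["pressurehight"] > 140) & (df["age"] > 50)

# Share of positive cases inside and outside the flagged group.
rate_by_group = df.groupby("high_risk")["is_positive"].mean()

# Median troponin for positive versus negative cases.
troponin_median = df.groupby("class")["troponin"].median()

print(rate_by_group)
print(troponin_median)
```

On the full dataset, the same comparison would be more convincing alongside group sizes, since a rate computed from a handful of patients can be misleading.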
# Clone the repository
git clone https://github.com/ngusadeep/heart-attack.git