PhishShield is an AI-driven phishing detection agent that analyzes emails and web content to identify potential phishing threats in real time. It uses a Random Forest classifier trained on a dataset of 200k+ emails and draws on linguistic patterns, link analysis, and structural metadata to predict whether a message is phishing or legitimate, with a reported accuracy of over 96%.
PhishShield consists of:
- 🧠 Backend (Flask API): ML-powered REST API for real-time predictions.
- 🎨 Frontend (HTML/CSS/JS): Interactive UI for email inspection and manual inputs.
- 📊 ML Model: Pre-trained model using advanced feature engineering.
- 📁 Dataset: 200,000 labeled email samples with 7 fields (6 features plus the label) for training and research.
- REST API via Flask backend
- JSON-based predictions with confidence scores
- Auto model training fallback (`train.py`)
- 25+ extracted features including:
  - Domain reputation
  - Link patterns
  - Urgent language indicators
  - HTML tag frequency
  - Attachment behavior
- Random Forest Classifier
- Over 96% accuracy
- Trained on 200k labeled samples
- Explainable predictions
- Drag-and-drop email file analysis (.eml, .msg)
- Manual content entry support
- Visual risk indicator (safe/suspicious)
- Historical detection log
- Responsive UI for all devices
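
The JSON predictions with confidence scores and the safe/suspicious indicator listed above could be shaped roughly as follows. This is a minimal sketch, not the project's actual response schema: the function name `build_prediction_response`, the field names, and the 0.5 threshold are all assumptions made for illustration.

```python
import json


def build_prediction_response(phishing_probability: float, threshold: float = 0.5) -> str:
    """Turn a model probability into a JSON payload (hypothetical schema)."""
    if not 0.0 <= phishing_probability <= 1.0:
        raise ValueError("phishing_probability must be between 0 and 1")

    is_phishing = phishing_probability >= threshold
    # Confidence is the probability of whichever class was predicted.
    confidence = phishing_probability if is_phishing else 1.0 - phishing_probability

    payload = {
        "label": "phishing" if is_phishing else "legitimate",
        "risk": "suspicious" if is_phishing else "safe",
        "confidence": round(confidence, 4),
    }
    return json.dumps(payload)
```

Reporting the probability of the predicted class, rather than always the phishing probability, keeps the confidence value at or above 0.5 for either label when the default threshold is used.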
Features:
- Dual input methods (file upload + text input)
- Real-time analysis results
- Confidence visualization
Workflow:
- Paste email content
- Optionally add metadata
- Get instant phishing probability
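
The same workflow can be driven from a script instead of the UI. The sketch below only assumes that the Flask backend accepts a JSON POST; the `/predict` route, the `email_text` field, and the port are hypothetical and should be checked against `app.py` (port 5000 is Flask's development-server default, which the app may override).

```python
import json
import urllib.request

# Assumed values: the route and payload field are hypothetical, not taken
# from the project's source.
API_URL = "http://127.0.0.1:5000/predict"


def build_predict_request(email_text: str, url: str = API_URL) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request for the prediction API."""
    body = json.dumps({"email_text": email_text}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(email_text: str) -> dict:
    """Send the request and decode the JSON reply; requires a running server."""
    request = build_predict_request(email_text)
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))
```

Building the request separately from sending it makes the payload easy to inspect without a server running.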
Supports:
- `.eml` and `.msg` email files
- Automatic header parsing
- Attachment detection
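
Header parsing and attachment detection for `.eml` files can be illustrated with Python's standard `email` package. This is a sketch under assumptions, not the project's parser: the function name `inspect_eml` and the returned keys are made up for this example, and `.msg` (Outlook) files are not covered because the standard library cannot read them.

```python
from email import message_from_bytes, policy


def inspect_eml(raw_bytes: bytes) -> dict:
    """Parse raw .eml bytes and report basic headers plus attachment names."""
    msg = message_from_bytes(raw_bytes, policy=policy.default)

    # iter_attachments() skips the main body part and yields the rest.
    attachments = [part.get_filename() for part in msg.iter_attachments()]

    # Prefer the plain-text body, falling back to HTML if that is all there is.
    body_part = msg.get_body(preferencelist=("plain", "html"))

    return {
        "subject": str(msg["Subject"] or ""),
        "from": str(msg["From"] or ""),
        "has_attachment": int(bool(attachments)),
        "attachment_names": attachments,
        "body": body_part.get_content() if body_part is not None else "",
    }
```

An uploaded file could be passed in as `inspect_eml(open(path, "rb").read())`.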
Key Functions:
- Persistent scan records
- Quick-result filtering
- One-click reanalysis
PhishShield includes a robust and reusable dataset of 200,000+ labeled emails for training, evaluation, and experimentation.
Feature | Description |
---|---|
email_text | Body content of the email |
subject | Email subject line |
has_attachment | Binary flag (1 = yes, 0 = no) |
links_count | Number of hyperlinks detected |
sender_domain | Domain of sender’s email address |
urgent_keywords | Binary flag (1 = urgent words found) |
label | Target class: phishing or legitimate |
🧠 Ideal for building and enhancing phishing classifiers or integrating into broader cybersecurity AI pipelines.
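
To show how a new email could be turned into a row with the same columns as the dataset, here is a minimal sketch. The keyword list, the URL pattern, and the function name `extract_features` are assumptions for illustration; the rules actually used to build the dataset are not documented here and may differ.

```python
import re

# Hypothetical keyword list; the project's real vocabulary may differ.
URGENT_WORDS = ("urgent", "immediately", "verify", "suspended", "act now")

# Simple pattern for http(s) links; it will miss obfuscated or bare-domain links.
URL_PATTERN = re.compile(r"https?://[^\s<>\"']+", re.IGNORECASE)


def extract_features(subject: str, email_text: str, sender: str, has_attachment: bool) -> dict:
    """Build one unlabeled row shaped like the dataset table from raw fields."""
    combined = f"{subject} {email_text}".lower()

    # Take everything after the last "@" and trim a trailing ">" from
    # addresses written as "Name <user@domain>".
    if "@" in sender:
        sender_domain = sender.rsplit("@", 1)[-1].strip("> ").lower()
    else:
        sender_domain = ""

    return {
        "email_text": email_text,
        "subject": subject,
        "has_attachment": int(has_attachment),
        "links_count": len(URL_PATTERN.findall(email_text)),
        "sender_domain": sender_domain,
        "urgent_keywords": int(any(word in combined for word in URGENT_WORDS)),
    }
```

The `label` column is left out on purpose, since it is the target the classifier predicts.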
- Python 3.8+
- Git
- Node.js (for frontend development, optional)
# Clone the repository
git clone https://github.com/AtharIbrahim/Phishing-Email-Agent.git
cd Phishing-Email-Agent
# Create virtual environment
python -m venv venv
# Activate environment
# For Windows:
venv\Scripts\activate
# For macOS/Linux:
source venv/bin/activate
# Install dependencies
pip install -r requirements.txt
# Start Flask server
python app.py
- Name: Athar Ibrahim Khalid
- GitHub: https://github.com/AtharIbrahim/
- LinkedIn: LinkedIn Profile
- Website: Athar Ibrahim Khalid
This project is licensed under the MIT License. See the LICENSE file for details.