title | sdk | app_file | pinned |
---|---|---|---|
AI Shopping Assistant | docker | app.py | false |
A full-stack, LLM-powered laptop recommendation system that combines intelligent data ingestion pipelines with an interactive chatbot workflow. The system helps users find personalized laptop recommendations based on their preferences through natural language conversations.
- Project Overview
- Tech Stack
- Architecture & Workflow
- Data Flow Diagram
- Setup Instructions
- Usage
- Dependencies
- API Endpoints
- Contributing
- License
- Author & Contact
AI Shopping Assistant is a production-ready system that demonstrates advanced AI integration in e-commerce applications. The project features two distinct but interconnected workflows:
- Data Ingestion Pipeline: An admin-triggered, automated workflow for processing and mapping laptop datasets
- Chatbot Interface: An intelligent conversational agent that provides personalized laptop recommendations
The system is deployed on AWS Lambda and HuggingFace Spaces, integrating with Amazon S3 for storage, PostgreSQL (AIVEN) and Cloudflare D1 for data persistence, and OpenAI's GPT-4o-mini models for natural language processing.
- Live Demo: AI Shopping Assistant
- Source Code: GitHub Repository
- Home Page: aydie.in
- Flask 3.0.3 - Main web framework (entry point: app.py)
- AIVEN PostgreSQL - Primary storage for mapped laptop data
- Cloudflare D1 SQL - Alternative serverless database option
- SQLAlchemy 2.0.36 - ORM for database abstraction
- psycopg2 2.9.10 - PostgreSQL client
- Amazon S3 - Raw CSV file storage
- AWS Lambda - Serverless backend deployment
- HuggingFace Spaces - Application hosting
- boto3 - AWS SDK
- OpenAI API 1.102.0 - Primary language model (GPT-4o-mini)
- Google Generative AI 0.7.2 - Optional Gemini integration
- LangChain 0.2.16 - LLM orchestration
- LlamaIndex 0.12.38 - Retrieval and indexing
- Pandas 2.2.2 - Data manipulation
- NumPy 1.26.4 - Numerical computing
- PandasSQL 0.7.3 - SQL query engine for pandas
- PyArrow 17.0.0 - Parquet file handling
- FastParquet 2024.5.0 - High-performance parquet operations
- Gunicorn 22.0.0 - WSGI server
- Mangum 0.19.0 - AWS Lambda adapter
The system implements two distinct pipelines that share core functionality through the orchestrator.py module:
Purpose: Maintain a fresh, pre-processed laptop recommendation database
Trigger Methods:
- Web interface via ingestor.py with file upload through upload.html
- Direct function call via Orchestrator.start_internal_data_ingestion()
Process Flow:
CSV Upload → S3 Storage → Data Retrieval → LLM Mapping → Database Update
Key Components:
- File Upload (DataIngestion.start_data_ingestion())
  - Accepts CSV files containing laptop specifications
  - Uploads to Amazon S3 bucket via AWSConnection.upload_file_to_s3()
  - Real-time progress tracking through /upload-status endpoint
- Data Processing (ProductMapper.start_dataframe_product_mapping())
  - Reads structured data using parquet files for efficiency
  - Processes laptop descriptions through OpenAI GPT-4o-mini API
  - Handles batch processing with error recovery
- Specification Mapping (ProductMapper.do_product_mapping())
  - Maps laptop features to standardized categories using OpenAI GPT models:
    - GPU Intensity: Low (integrated graphics) / Medium (mid-range dedicated) / High (high-end dedicated like RTX)
    - Display Quality: Low (below Full HD) / Medium (Full HD+) / High (4K, HDR)
    - Portability: High (<1.51kg) / Medium (1.51-2.51kg) / Low (>2.51kg)
    - Multitasking: Low (8-12GB RAM) / Medium (16GB RAM) / High (32GB+ RAM)
    - Processing Speed: Low (i3, Ryzen 3) / Medium (i5, Ryzen 5) / High (i7+, Ryzen 7+)
- Database Update (PostgresDataBaseUpdate.update_to_postgres_database())
  - Supports both PostgreSQL (AIVEN) and Cloudflare D1 databases
  - Drops existing table for complete refresh
  - Creates new schema based on DataFrame structure
  - Batch inserts all processed data with JSON handling for mapped columns
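In the project the category mapping is performed by GPT-4o-mini, but the numeric thresholds listed above for Portability and Multitasking can also be written as plain rules, which may help when spot-checking the model's output. The sketch below is illustrative only: the function names are hypothetical, and the handling of RAM sizes the list does not mention (for example 24GB) is an assumption.

```python
def classify_portability(weight_kg: float) -> str:
    """Map laptop weight to a portability level using the listed thresholds."""
    if weight_kg < 1.51:
        return "high"
    if weight_kg <= 2.51:
        return "medium"
    return "low"


def classify_multitasking(ram_gb: int) -> str:
    """Map installed RAM to a multitasking level.

    Assumption: the list only names 8-12GB, 16GB and 32GB+, so sizes in
    between fall back to the lower neighbouring level.
    """
    if ram_gb >= 32:
        return "high"
    if ram_gb >= 16:
        return "medium"
    return "low"
```

A mismatch between these rules and an LLM-assigned level would point at rows worth reviewing by hand.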
Purpose: Provide personalized laptop recommendations through conversational AI
Conversation States:
- normal - Standard conversation flow
- awaiting_feedback - Waiting for user satisfaction confirmation
- awaiting_rating - Collecting user rating (1-5 scale)
- ended - Conversation terminated
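One way to picture these four states is as a small state machine. The sketch below is not taken from app.py: the `ChatState` enum and `next_state` helper are hypothetical names, and the transition taken when a user is unsatisfied (ending the chat after escalation) is an assumption.

```python
from enum import Enum


class ChatState(str, Enum):
    NORMAL = "normal"
    AWAITING_FEEDBACK = "awaiting_feedback"
    AWAITING_RATING = "awaiting_rating"
    ENDED = "ended"


def next_state(state: ChatState, *, recommended: bool = False, satisfied: bool = True) -> ChatState:
    """Return the state that follows `state` after one user turn."""
    if state is ChatState.NORMAL:
        # Stay in the normal flow until recommendations have been sent.
        return ChatState.AWAITING_FEEDBACK if recommended else ChatState.NORMAL
    if state is ChatState.AWAITING_FEEDBACK:
        # Assumption: an unsatisfied user is escalated and the chat ends.
        return ChatState.AWAITING_RATING if satisfied else ChatState.ENDED
    # A submitted rating ends the chat; an ended chat stays ended.
    return ChatState.ENDED
```

Because the enum subclasses `str`, a value such as `ChatState("awaiting_feedback")` can be built directly from the `state` string carried in a request body.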
Process Flow:
User Input → Moderation → Profile Building → Intent Confirmation → Recommendation → Feedback
Key Functions:
- Input Processing (/chat endpoint in app.py)
  - Greeting detection via get_chat_completion() with OpenAI GPT-4o-mini
  - Content moderation using moderation_check() with omni-moderation-latest
  - JSON filtering through filter_json_from_response()
- Profile Building (Orchestrator.get_chat_completion())
  - Extracts user preferences through guided conversation
  - Uses system instruction prompts from SystemInstruction.system_instruction
  - Builds comprehensive user profile dictionary:
    { 'GPU intensity': 'low/medium/high', 'Display quality': 'low/medium/high', 'Portability': 'low/medium/high', 'Multitasking': 'low/medium/high', 'Processing speed': 'low/medium/high', 'Budget': 'numerical_value' }
- Intent Confirmation (Orchestrator.intent_confirmation_check())
  - Validates profile completeness using IntentConfirmation.intent_confirmation prompt
  - Ensures all 6 required attributes are captured
  - Returns JSON response with confirmation status
- Recommendation Engine (ProductRecommendation.recommend_product())
  - Queries database via LoadFromDatabase.fetch_query_engine_data()
  - Filters by budget using QueryEngine.filter_budget()
  - Calculates match scores via QueryEngine.filter_by_user_score() and ProductMapper.map_the_score()
  - Returns top 3 recommendations with detailed specifications
- Response Generation
  - Formats recommendations using ProductRecommender.system_message prompt
  - Handles satisfaction feedback through /feedback endpoint
  - Routes to human agent via route_to_human_agent() if user unsatisfied
  - Collects ratings through /rate endpoint for continuous improvement
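The intent confirmation step above is prompt-based, but the same completeness rule (all 6 attributes present, five of them low/medium/high, plus a numeric budget) can be expressed as a deterministic check. This is a minimal sketch under that reading; `missing_profile_fields` is a hypothetical helper, not a function from the repository.

```python
LEVEL_KEYS = (
    "GPU intensity",
    "Display quality",
    "Portability",
    "Multitasking",
    "Processing speed",
)
ALLOWED_LEVELS = {"low", "medium", "high"}


def missing_profile_fields(profile: dict) -> list:
    """Return the profile keys that are absent or hold an invalid value."""
    problems = []
    for key in LEVEL_KEYS:
        if str(profile.get(key, "")).lower() not in ALLOWED_LEVELS:
            problems.append(key)
    # Budget arrives as a string such as '150000'; tolerate thousands separators.
    budget = str(profile.get("Budget", "")).replace(",", "")
    if not budget.isdecimal() or int(budget) <= 0:
        problems.append("Budget")
    return problems
```

An empty list means the profile is complete; otherwise the returned keys indicate which questions still need to be asked.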
ADMIN WORKFLOW (Data Ingestion)
┌─────────────────┐    ┌──────────────────┐    ┌─────────────────┐
│  Admin Upload   │───▶│    S3 Storage    │───▶│  Data Pipeline  │
│  (ingestor.py)  │    │ (aws_s3_conn.py) │    │(data_ingestion) │
│   upload.html   │    │                  │    │                 │
└─────────────────┘    └──────────────────┘    └─────────────────┘
                                                        │
                                                        ▼
                       ┌──────────────────┐    ┌─────────────────┐
                       │   Orchestrator   │◀───│   LLM Mapping   │
                       │ (orchestrator.py)│    │(product_mapper) │
                       │                  │    │   GPT-4o-mini   │
                       └──────────────────┘    └─────────────────┘
                                │                       │
                                ▼                       ▼
                       ┌──────────────────┐    ┌─────────────────┐
                       │  PostgreSQL/D1   │    │ Mapped Features │
                       │     Database     │    │  Low/Med/High   │
                       │ (parquet cache)  │    │   Categories    │
                       └──────────────────┘    └─────────────────┘
                                │
                                ▼
USER WORKFLOW (Chatbot)
┌─────────────────┐    ┌──────────────────┐    ┌─────────────────┐
│  User Chatbot   │◀───│   Orchestrator   │◀───│  Query Engine   │
│    (app.py)     │    │ (orchestrator.py)│    │(query_engine.py)│
│   index.html    │    │                  │    │                 │
└─────────────────┘    └──────────────────┘    └─────────────────┘
         │                      │                       │
         ▼                      ▼                       ▼
┌─────────────────┐    ┌──────────────────┐    ┌─────────────────┐
│  Conversation   │    │ Profile Builder  │    │ Recommendation  │
│  State Machine  │    │  Intent Confirm  │    │ Score Matching  │
│ (normal/rating) │    │ JSON Validation  │    │  Top 3 Results  │
└─────────────────┘    └──────────────────┘    └─────────────────┘
- Python 3.8+
- PostgreSQL database (AIVEN recommended) or Cloudflare D1
- AWS S3 bucket with appropriate permissions
- OpenAI API key with GPT-4o-mini access
- Clone the repository
  git clone https://github.com/aydiegithub/ai-shopping-assistant.git
  cd ai-shopping-assistant
- Install dependencies
  pip install -r requirements.txt
- Environment Configuration: Create a .env file with the following variables:
  # OpenAI Configuration
  OPENAI_API_KEY=your_openai_api_key
  # AWS Configuration
  AWS_ACCESS_KEY_ID=your_aws_access_key
  AWS_SECRET_ACCESS_KEY=your_aws_secret_key
  S3_BUCKET_NAME=your_s3_bucket_name
  AWS_DEFAULT_REGION=us-east-1
  # PostgreSQL Configuration (AIVEN)
  POSTGRES_HOST=your_postgres_host
  POSTGRES_PORT=5432
  POSTGRES_DB_NAME=your_database_name
  POSTGRES_USER=your_postgres_user
  POSTGRES_PASSWORD=your_postgres_password
  POSTGRES_TABLE_NAME=laptops
  # Cloudflare D1 Configuration (Optional)
  CLOUDFLARE_ACCOUNT_ID=your_cloudflare_account_id
  D1_SQL_DATABASE_ID=your_d1_database_id
  CLOUDFLARE_API_TOKEN=your_cloudflare_api_token
  D1_SQL_DATABASE_NAME=your_d1_database_name
- Database Setup
  # Initialize database with sample data
  from src.backend.orchestrator import Orchestrator
  orchestrator = Orchestrator()
  orchestrator.start_internal_data_ingestion()
- Run the application
  # Development server (main chatbot)
  python app.py
  # Data ingestion server (admin interface)
  python ingestor.py
  # Production server
  gunicorn -w 4 -b 0.0.0.0:8000 app:app
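Before starting either server it can be useful to confirm that the variables from the Environment Configuration step are actually set. The following is a small sketch, assuming that every non-optional variable listed there is required at startup; `missing_env_vars` is a hypothetical helper and the Cloudflare D1 variables are left out because the setup marks them optional.

```python
import os

# Assumption: all OpenAI, AWS and PostgreSQL variables are needed at startup.
REQUIRED_ENV_VARS = (
    "OPENAI_API_KEY",
    "AWS_ACCESS_KEY_ID",
    "AWS_SECRET_ACCESS_KEY",
    "S3_BUCKET_NAME",
    "AWS_DEFAULT_REGION",
    "POSTGRES_HOST",
    "POSTGRES_PORT",
    "POSTGRES_DB_NAME",
    "POSTGRES_USER",
    "POSTGRES_PASSWORD",
    "POSTGRES_TABLE_NAME",
)


def missing_env_vars(environ=None):
    """Return the names of required variables that are unset or empty."""
    env = os.environ if environ is None else environ
    return [name for name in REQUIRED_ENV_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_env_vars()
    if missing:
        raise SystemExit("Missing environment variables: " + ", ".join(missing))
    print("Environment looks complete.")
```

Since python-dotenv is already a dependency, calling its `load_dotenv()` before this check would pick up values from the .env file.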
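AWS Lambda Deployment: The application uses the Mangum adapter for Lambda deployment. Mangum wraps ASGI applications, so a Flask (WSGI) app needs a WSGI-to-ASGI wrapper first; the lines below assume asgiref for that step:
from asgiref.wsgi import WsgiToAsgi  # assumption: asgiref is installed; it is not listed in requirements.txt
from mangum import Mangum
handler = Mangum(WsgiToAsgi(app))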
HuggingFace Spaces:
Configure app.py with the appropriate port settings:
import os
port = int(os.environ.get("PORT", 7860))
app.run(host='0.0.0.0', port=port, debug=True)
- Web Interface:
  python ingestor.py
  # Navigate to http://localhost:5000
  # Upload CSV files through the interface
  # Monitor progress via real-time status updates
- Programmatic Access:
  from src.backend.orchestrator import Orchestrator
  orch = Orchestrator()
  orch.start_internal_data_ingestion(
      local_file_path="src/database/laptop_data.csv",
      s3_file_name="laptop_data.csv"
  )
- Start Conversation: Send greeting to /chat endpoint
- Profile Building: Answer guided questions about laptop needs
- Receive Recommendations: Get personalized top 3 suggestions
- Provide Feedback: Rate assistance (1-5) and optionally escalate to human agent
POST /chat
{
"message": "Hi, I need help finding a laptop",
"messages": [],
"state": "normal"
}
Response:
{
"message": "Hello there! I am here to help you. I am your personal laptop assistant. What kind of laptop are you looking for?",
"messages": [...],
"state": "normal"
}
POST /chat
{
"message": "I'm a video editor working with 4K content in After Effects",
"messages": [...],
"state": "normal"
}
Response:
{
"message": "Great! Working with After Effects and 4K content requires high GPU intensity and processing power. Do you need portability for travel work?",
"messages": [...],
"state": "normal"
}
POST /chat
{
"message": "Mostly stationary work, budget around 150000 INR",
"messages": [...],
"state": "normal"
}
Response:
{
"message": "{'GPU intensity': 'high', 'Display quality': 'high', 'Portability': 'low', 'Multitasking': 'high', 'Processing speed': 'high', 'Budget': '150000'}\n\nBased on your requirements for 4K video editing:\n\n1. **ASUS ROG Strix G15**: High-performance gaming laptop with RTX 3070, 16GB RAM, AMD Ryzen 7, 15.6\" 144Hz display - Rs. 145,000\n2. **MSI Creator 15**: Content creation focused with RTX 3060, 32GB RAM, Intel i7, 15.6\" 4K display - Rs. 149,999\n3. **HP OMEN 15**: Gaming laptop with RTX 3060, 16GB RAM, AMD Ryzen 7, 15.6\" 165Hz display - Rs. 139,000\n\nHope I have solved your request. Did this help you? (yes/no)",
"messages": [...],
"state": "awaiting_feedback"
}
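Note that in the final response the profile is embedded at the start of `message` as a single-quoted Python dict literal, which `json.loads` would reject. A client that wants the profile separately could split it off as sketched below. This is not the project's `filter_json_from_response()`; `split_profile_and_text` is a hypothetical helper that assumes a flat dict with no nested braces.

```python
import ast


def split_profile_and_text(message: str):
    """Return (profile, remaining_text); profile is None if none is found."""
    text = message.strip()
    if not text.startswith("{"):
        return None, text
    end = text.find("}")  # assumption: the profile dict has no nested braces
    if end == -1:
        return None, text
    try:
        # literal_eval parses Python literals only and does not execute code.
        profile = ast.literal_eval(text[: end + 1])
    except (ValueError, SyntaxError):
        return None, text
    if not isinstance(profile, dict):
        return None, text
    return profile, text[end + 1:].strip()
```

Messages without a leading dict, such as the greeting reply, come back unchanged with `None` as the profile.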
flask==3.0.3
python-dotenv==1.0.1
SQLAlchemy==2.0.36
sqlalchemy-cloudflare-d1==0.1.0
psycopg2==2.9.10
pandas==2.2.2
numpy==1.26.4
pandasql==0.7.3
pyarrow==17.0.0
fastparquet==2024.5.0
openai==1.102.0
google-generativeai==0.7.2
llama-index==0.12.38
langchain==0.2.16
langchain-community==0.2.16
tenacity
boto3>=1.26.0,<2
cloudflare>=3.0.0,<4
mangum==0.19.0
requests==2.32.3
gunicorn==22.0.0
pydantic>=2.7.4,<3
pydantic-core>=2.16.3,<3
black==24.8.0
pytest==8.3.3
- GET / - Main chatbot interface (app.py)
  - Serves index.html template
  - Returns: HTML chatbot interface
- POST /chat - Main conversation endpoint
  - Body: {"message": "user_input", "messages": [], "state": "normal"}
  - Returns: {"message": "bot_response", "messages": [...], "state": "normal|awaiting_feedback"}
  - Functions: Greeting detection, moderation, profile building, intent confirmation
- POST /feedback - User satisfaction feedback
  - Body: {"message": "yes/no", "messages": [...]}
  - Returns: Rating prompt or human agent escalation
  - Functions: Satisfaction assessment, routing logic
- POST /rate - User rating submission (1-5 scale)
  - Body: {"message": "1-5", "messages": [...]}
  - Returns: {"message": "Thank you for your feedback! Chat ended.", "state": "ended"}
- GET / - Upload interface (ingestor.py)
  - Serves upload.html template
  - Returns: File upload interface with progress tracking
- POST /upload-csv - CSV file upload and processing
  - Body: multipart/form-data with file
  - Returns: {"message": "Upload started", "status": "processing"}
  - Functions: File validation, S3 upload, LLM mapping, database update
- GET /upload-status - Real-time upload progress
  - Returns: {"status": "uploading|processing|completed|error", "progress": 0-100, "message": "...", "logs": [...]}
- GET /logs - Recent processing logs
  - Returns: {"logs": [{"timestamp": "HH:MM:SS", "message": "...", "level": "info|error"}]}
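Because /upload-csv returns immediately with "processing", a script that drives the ingestion interface has to poll /upload-status until it reports "completed" or "error". The sketch below shows one way to do that; `wait_for_upload` is a hypothetical helper, and the status fetcher is injected so the loop stays independent of any HTTP library.

```python
import time

TERMINAL_STATUSES = {"completed", "error"}


def wait_for_upload(fetch_status, interval_s=1.0, max_polls=60, sleep=time.sleep):
    """Poll fetch_status() until the upload finishes, then return the last payload.

    fetch_status is any callable returning the parsed /upload-status JSON.
    """
    last = None
    for _ in range(max_polls):
        last = fetch_status()
        if last.get("status") in TERMINAL_STATUSES:
            return last
        sleep(interval_s)
    raise TimeoutError(f"upload did not finish after {max_polls} polls; last status: {last}")
```

With the requests package from the dependency list, the fetcher could be `lambda: requests.get("http://localhost:5000/upload-status").json()`, assuming the ingestion server runs on the port shown in the usage section.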
- initialise_conversation() - Sets up system instruction for new chats
- get_chat_completion(input_messages, json_format=False) - OpenAI API wrapper with retry logic
- moderation_check(input_message) - Content safety validation using omni-moderation-latest
- intent_confirmation_check(input_message) - Profile completeness validation
- start_product_recommendation(input_message) - Triggers recommendation pipeline
- set_user_profile(message) - Extracts user preferences from conversation
- route_to_human_agent(input_message) - Escalation logic for unsatisfied users
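The retry logic mentioned for `get_chat_completion()` is not spelled out here, and since tenacity is in the dependency list the real wrapper may well use it. As a rough illustration of the general pattern, the standard-library sketch below retries a callable with exponential backoff; `call_with_retry` and its parameters are hypothetical.

```python
import time


def call_with_retry(func, *, attempts=3, base_delay_s=1.0, retry_on=(Exception,), sleep=time.sleep):
    """Call func(), retrying on the given exceptions with exponential backoff."""
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except retry_on:
            if attempt == attempts:
                raise  # out of attempts: surface the last error to the caller
            # Delays grow as base, 2*base, 4*base, ...
            sleep(base_delay_s * 2 ** (attempt - 1))
```

In practice `retry_on` would be narrowed to transient errors such as timeouts or rate limits, so that programming errors fail immediately.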
- do_product_mapping(laptop_description) - Maps single laptop description to categories using GPT-4o-mini
- start_dataframe_product_mapping(df) - Batch processes entire DataFrame with parquet optimization
- map_the_score(mapped_column, user_profile) - Calculates compatibility scores
- filter_budget(data, criteria) - Filters laptops by budget constraints
- filter_by_user_score(data, user_profile) - Ranks laptops by compatibility score
- recommend_product(user_profile) - End-to-end recommendation pipeline
- calculate_score(mapping_column, user_profile) - Score calculation wrapper
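The exact scoring formula is not documented in this README. To make the budget-filter, score, and top-3 sequence concrete, the sketch below assumes one simple rule: a laptop earns a point for each feature whose level meets or exceeds the level the user asked for. The function names and the record layout (`name`, `price`, `features`) are hypothetical and the project's own scoring may differ.

```python
LEVELS = {"low": 0, "medium": 1, "high": 2}
FEATURES = (
    "GPU intensity",
    "Display quality",
    "Portability",
    "Multitasking",
    "Processing speed",
)


def match_score(laptop_features: dict, user_profile: dict) -> int:
    """Count features where the laptop meets or exceeds the requested level (assumed rule)."""
    return sum(
        1
        for feature in FEATURES
        if LEVELS[laptop_features[feature]] >= LEVELS[user_profile[feature]]
    )


def top_matches(laptops: list, user_profile: dict, limit: int = 3) -> list:
    """Drop laptops over budget, then return the best-scoring ones."""
    budget = int(user_profile["Budget"])
    affordable = [laptop for laptop in laptops if laptop["price"] <= budget]
    # sorted() is stable, so equally scored laptops keep their input order.
    ranked = sorted(
        affordable,
        key=lambda laptop: match_score(laptop["features"], user_profile),
        reverse=True,
    )
    return ranked[:limit]
```

Filtering on budget before scoring keeps the ranking step small and ensures an unaffordable laptop can never displace an affordable one.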
- Fork the repository
- Create a feature branch (git checkout -b feature/AmazingFeature)
- Commit your changes (git commit -m 'Add some AmazingFeature')
- Push to the branch (git push origin feature/AmazingFeature)
- Open a Pull Request
- Follow PEP 8 style guidelines (use black==24.8.0 for formatting)
- Add comprehensive logging using the project's logging framework
- Write unit tests for new functionality using pytest==8.3.3
- Update documentation for API changes
- Ensure all environment variables are documented
- Test both ingestion and chatbot workflows before submitting
ai-shopping-assistant/
├── app.py  # Main Flask chatbot application
├── ingestor.py  # Admin data ingestion interface
├── requirements.txt  # Python dependencies
├── .env  # Environment variables (not tracked)
├── .gitignore  # Git ignore rules
├── LICENSE  # MIT License
├── README.md  # Project documentation
├── Dockerfile  # Container configuration
├── setup.py  # Package setup
├── diagnostics.py  # System diagnostics
├── templates.py  # Template utilities
├── test.py  # Test utilities
├── tester.ipynb  # Jupyter notebook for testing
├── src/
│   ├── backend/
│   │   ├── __init__.py  # Backend module initialization
│   │   ├── orchestrator.py  # Central coordination logic
│   │   ├── data_ingestion.py  # Data pipeline management
│   │   ├── product_mapper.py  # LLM-based feature mapping
│   │   ├── product_recommender.py  # Recommendation engine
│   │   ├── query_engine.py  # Database querying and filtering
│   │   └── prompts.py  # LLM prompt templates
│   ├── constants/
│   │   └── __init__.py  # Configuration constants and environment variables
│   ├── database/
│   │   ├── aws_s3_connection.py  # S3 storage operations
│   │   ├── aiven_posgresql_update.py  # PostgreSQL operations
│   │   ├── cloudflare_connection.py  # Cloudflare D1 database operations
│   │   ├── load_from_database.py  # Data retrieval operations
│   │   ├── laptop_data.csv  # Sample dataset
│   │   ├── laptop_data_copy.csv  # Dataset backup
│   │   └── laptop_data_mapped.parquet  # Pre-processed mapped data
│   ├── frontend/
│   │   ├── static/
│   │   │   ├── main.js  # Frontend JavaScript
│   │   │   └── style.css  # Frontend styling
│   │   └── templates/
│   │       ├── logo/  # Brand assets
│   │       ├── index.html  # Chatbot interface
│   │       └── upload.html  # Admin upload interface
│   ├── logging/
│   │   └── __init__.py  # Logging configuration
│   └── utils/
│       └── __init__.py  # Utility functions
This project is licensed under the MIT License. See the LICENSE file for details.
Aditya Dinesh K
- LinkedIn: linkedin.com/in/aydiemusic
- Email: developer@aydie.in | aditya@aydie.in
- Website: www.aydie.in
This project demonstrates advanced AI integration patterns in production environments, showcasing best practices for LLM-powered applications, data pipeline automation, and scalable cloud deployment. The system serves as a comprehensive example of modern AI-driven e-commerce solutions with real-world deployment considerations.