A WhatsApp chat analysis tool that combines machine learning, data science, and generative AI to analyze your exported conversations. Beyond basic message statistics, it examines mood trends, communication patterns, user personas, and conversation dynamics using NLP techniques.
Live Demo: WhatsApp Conversation Analyzer
- Message Statistics: Total messages, words, media shared, links shared
- User Activity Analysis: Most active users, message frequency patterns
- Temporal Analysis: Activity timelines, peak usage hours, monthly trends
- Content Analysis: Word clouds, most common words, emoji usage patterns
- Mood Trend Analysis: Track emotional patterns over time using sentiment analysis
- Apology & Gratitude Detection: Identify frequency of apologies and expressions of gratitude
- User Persona Analysis: Generate personality profiles based on communication patterns
- Conversation Flow Analysis: Understand dialogue dynamics and response patterns
- Topic Modeling: Discover hidden themes and subjects in conversations
- Language Detection: Automatically detect and analyze conversations in multiple languages
- Cross-Language Analytics: Support for major languages including English, Hindi, Spanish, French, and more
- Unicode Emoji Analysis: Comprehensive emoji sentiment and usage analysis
- Interactive Dashboards: Built with Streamlit for seamless user experience
- Dynamic Charts: Activity heatmaps, timeline visualizations, and statistical plots
- Word Clouds: Customizable word frequency visualizations
- Emoji Analytics: Visual emoji usage patterns and sentiment mapping
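As a rough illustration of the message statistics listed above, the sketch below counts messages, words, media placeholders, and links over a small list of already-parsed messages. It uses only the standard library and is not taken from `helper.py`. The function name `basic_stats`, the `(user, text)` tuple layout, and the assumption that media-free exports mark attachments with `<Media omitted>` are all illustrative.

```python
import re
from collections import Counter

# Assumption: exports made "Without Media" mark attachments with this placeholder.
MEDIA_PLACEHOLDER = "<Media omitted>"
URL_PATTERN = re.compile(r"https?://\S+")


def basic_stats(messages):
    """Return simple counts for a list of (user, text) tuples."""
    total_words = 0
    media = 0
    links = 0
    per_user = Counter()
    for user, text in messages:
        per_user[user] += 1
        if text.strip() == MEDIA_PLACEHOLDER:
            media += 1
            continue  # placeholders are not counted as words
        links += len(URL_PATTERN.findall(text))
        total_words += len(text.split())
    return {
        "messages": len(messages),
        "words": total_words,
        "media": media,
        "links": links,
        "most_active": per_user.most_common(1)[0][0] if per_user else None,
    }


sample = [
    ("Alice", "Hello there"),
    ("Bob", "<Media omitted>"),
    ("Alice", "See https://example.com for details"),
]
stats = basic_stats(sample)
```

For real chats the same counts would more likely be computed on a pandas DataFrame, but the logic stays the same.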
Whatsapp_analysis/
├── app.py # Streamlit application entrypoint
├── preprocessor.py # Chat file parsing and data preprocessing
├── helper.py # Statistical analysis, visualizations, and utility functions
├── requirements.txt # Python dependencies
└── README.md # Project documentation
- Python 3.11: Core programming language (strongly recommended)
- Streamlit: Web application framework for interactive dashboards
- Pandas: Data manipulation and analysis
- NumPy: Numerical computing and array operations
- Matplotlib/Seaborn: Data visualization libraries
- Plotly: Interactive plotting and visualization
- NLTK/spaCy: Natural language processing
- Scikit-learn: Machine learning algorithms for pattern recognition
- Emoji: Emoji analysis and sentiment mapping
- LangDetect: Multi-language detection capabilities
- Python 3.11 or higher
- pip package manager
- Clone the repository:
  git clone https://github.com/RajeebLochan/Whatsapp_analysis.git
  cd Whatsapp_analysis
- Create and activate a virtual environment:
  python -m venv venv
  source venv/bin/activate # On Windows: venv\Scripts\activate
- Install dependencies:
  pip install -r requirements.txt
- Run the application:
  streamlit run app.py
- For Individual Chats:
  - Open the chat in WhatsApp
  - Tap on the contact name → More → Export Chat
  - Choose "Without Media" for faster processing
  - Save the .txt file
- For Group Chats:
  - Open the group chat
  - Tap on the group name → More → Export Chat
  - Choose "Without Media"
  - Save the .txt file
- Upload Chat File: Use the file uploader to select your exported .txt file
- Select Analysis Scope: Choose between "Overall" analysis or specific user analysis
- Explore Insights: Navigate through different analysis sections:
  - Statistical Overview
  - Activity Patterns
  - Content Analysis
  - AI-Powered Insights
  - Multi-Language Analytics
The analyzer supports WhatsApp chat exports in multiple date formats:
- DD/MM/YYYY format
- MM/DD/YYYY format
- YYYY-MM-DD format
- 12-hour and 24-hour time formats
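One way to handle several date and time formats is to try candidate `datetime.strptime` patterns in order, as in the minimal sketch below. This is not necessarily how `preprocessor.py` does it. The names `DATE_FORMATS`, `TIME_FORMATS`, and `parse_timestamp` are hypothetical, and the ordering of the candidates is an assumption.

```python
from datetime import datetime

# Hypothetical candidate lists; order matters because a date such as
# 03/04/2024 is valid as both DD/MM/YYYY and MM/DD/YYYY.
DATE_FORMATS = ["%d/%m/%Y", "%m/%d/%Y", "%Y-%m-%d"]
TIME_FORMATS = ["%H:%M", "%I:%M %p"]  # 24-hour, then 12-hour with AM/PM


def parse_timestamp(date_text, time_text):
    """Try each date/time format pair and return the first match."""
    for date_fmt in DATE_FORMATS:
        for time_fmt in TIME_FORMATS:
            try:
                return datetime.strptime(
                    f"{date_text} {time_text}", f"{date_fmt} {time_fmt}"
                )
            except ValueError:
                continue  # this pair does not fit; try the next one
    raise ValueError(f"Unrecognized timestamp: {date_text!r} {time_text!r}")
```

Because day-first is tried before month-first, an ambiguous date is read as DD/MM/YYYY. A more robust parser would probably settle on one format per file, for example by looking for a first component greater than 12, instead of deciding line by line.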
- Sentiment Analysis: Tracks emotional tone using pre-trained models
- Topic Modeling: Uses LDA (Latent Dirichlet Allocation) for theme discovery
- Clustering Analysis: Groups similar conversation patterns
- Predictive Analytics: Forecast communication trends
- Conversation Summarization: AI-generated summaries of chat themes
- Persona Generation: Detailed personality profiles based on communication style
- Insight Generation: Automated discovery of interesting conversation patterns
- Relationship Dynamics: Analysis of interpersonal communication patterns
- Local Processing: All data processing happens locally on your machine
- No Data Storage: Chat data is not stored or transmitted to external servers
- Privacy First: Your conversations remain completely private
- Efficient Data Processing: Optimized pandas operations for large chat files
- Memory Management: Chunked processing for handling extensive chat histories
- Caching: Streamlit caching for improved performance
- Parallel Processing: Multi-threaded analysis for faster results
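The chunked processing mentioned above can be pictured with a small generator that hands out fixed-size batches of lines, so that only one batch is held in memory at a time. The sketch below is an assumption about the general technique, not the project's code. `iter_chunks`, `count_words_chunked`, and the default chunk size are hypothetical.

```python
from collections import Counter
from itertools import islice


def iter_chunks(lines, chunk_size):
    """Yield lists of at most chunk_size items from any iterable of lines."""
    iterator = iter(lines)
    while True:
        chunk = list(islice(iterator, chunk_size))
        if not chunk:
            return
        yield chunk


def count_words_chunked(lines, chunk_size=10_000):
    """Aggregate lowercase word frequencies one chunk at a time."""
    totals = Counter()
    for chunk in iter_chunks(lines, chunk_size):
        for line in chunk:
            totals.update(line.lower().split())
    return totals
```

Passing an open file object as `lines` streams the export from disk instead of loading it all at once.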
We welcome contributions to improve the WhatsApp Conversation Analyzer! Here's how you can help:
- Fork the repository
- Create a feature branch:
git checkout -b feature/amazing-feature
- Make your changes: Implement new features or fix bugs
- Test thoroughly: Ensure all functionality works as expected
- Submit a pull request: Describe your changes and their benefits
- Follow PEP 8 style guidelines
- Add comprehensive docstrings
- Include unit tests for new features
- Update documentation as needed
- Export Options: PDF reports and data export functionality
- Advanced ML Models: Custom-trained models for better accuracy
- Group Dynamics: Enhanced group conversation analysis
- Integration APIs: REST API for external applications
- v1.0: Initial release with basic analytics
- v1.1: Added multi-language support
- v1.2: Integrated AI-powered insights
- v1.3: Enhanced visualization capabilities
- File Upload Errors: Ensure the file is a valid WhatsApp .txt export
- Memory Issues: For large files, consider analyzing smaller date ranges
- Encoding Problems: Ensure the chat file is UTF-8 encoded
- Use Python 3.11 for optimal performance
- Close other applications to free up system memory
- For very large chats, consider splitting the analysis by date ranges
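For the encoding problems noted above, one option is to convert the export to UTF-8 before uploading it. The sketch below is a standalone helper under stated assumptions, not part of the project. `read_chat_text` and `convert_to_utf8` are hypothetical names, and the fallback to Windows-1252 is a guess about what a non-UTF-8 export might contain.

```python
import codecs
from pathlib import Path


def read_chat_text(path):
    """Decode a chat export, handling UTF-16 and UTF-8 byte order marks."""
    raw = Path(path).read_bytes()
    if raw.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return raw.decode("utf-16")  # the BOM selects the byte order
    try:
        return raw.decode("utf-8-sig")  # plain UTF-8, with or without a BOM
    except UnicodeDecodeError:
        # Last resort, assumed here: a legacy Windows code page.
        return raw.decode("cp1252", errors="replace")


def convert_to_utf8(source, target):
    """Rewrite a chat export as UTF-8 so it can be uploaded."""
    Path(target).write_text(read_chat_text(source), encoding="utf-8")
```

Calling `convert_to_utf8("chat.txt", "chat_utf8.txt")` writes a UTF-8 copy and leaves the original file untouched.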
This project is licensed under the MIT License - see the LICENSE file for details.
Rajeeb Lochan
- Twitter: @rajeeb_thedev
- LinkedIn: rajeeb-lochan
- GitHub: @RajeebLochan
- Thanks to the open-source community for providing excellent libraries
- Special recognition to the Streamlit team for their amazing framework
- Appreciation to all contributors and users who have provided feedback
If you use this tool in your research or projects, please cite:
@software{lochan2024whatsapp,
title={WhatsApp Conversation Analyzer},
author={Lochan, Rajeeb},
year={2025},
url={https://github.com/RajeebLochan/Whatsapp_analysis}
}
Note: This tool is for educational and personal use only. Please respect privacy and obtain consent before analyzing shared conversations.