Book Inventory Analysis is an end-to-end analytics project that combines web scraping, SQL analysis, and Python-based data visualization to extract and analyze book data from the Books to Scrape website. It demonstrates the full workflow, from raw web data extraction to actionable business insights.
Key Skills Applied:
- Web Scraping with Python (BeautifulSoup, Requests)
- Data Storage and Analysis using SQL
- Exploratory Data Analysis (EDA) and Visualization with Pandas, Matplotlib, and Seaborn
Web Scraping:
- Scraped book details from all pages on the website, including:
  - Title
  - Price
  - Availability
  - Rating (out of 5 stars)
- Stored the extracted data in a CSV file for further analysis.
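The scraping and CSV steps above could be sketched roughly as follows. This is a minimal illustration, not the project's actual script: the start URL, the CSS selectors (`article.product_pod`, `p.star-rating`, `p.price_color`, `p.availability`, `li.next a`), the function names, and the `books.csv` filename are all assumptions about how the site is marked up and how the code might be organized.

```python
import csv
import re
from urllib.parse import urljoin

from bs4 import BeautifulSoup

BASE_URL = "https://books.toscrape.com/"  # assumed start page
# Assumption: the rating is encoded as a CSS class word such as "Three".
RATING_WORDS = {"One": 1, "Two": 2, "Three": 3, "Four": 4, "Five": 5}
FIELDS = ["title", "price", "availability", "rating"]


def parse_books(html):
    """Parse one listing page; return (rows, href of the next page or None)."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for pod in soup.select("article.product_pod"):
        rating_classes = pod.select_one("p.star-rating")["class"]
        rating = next(RATING_WORDS[c] for c in rating_classes if c in RATING_WORDS)
        price_text = pod.select_one("p.price_color").get_text()
        rows.append({
            "title": pod.h3.a["title"],
            # Keep only the numeric part so the currency symbol is dropped.
            "price": float(re.search(r"\d+(?:\.\d+)?", price_text).group()),
            "availability": pod.select_one("p.availability").get_text(strip=True),
            "rating": rating,
        })
    next_link = soup.select_one("li.next a")
    return rows, (next_link["href"] if next_link else None)


def scrape_all(start_url=BASE_URL):
    """Follow the 'next' links until the last page and collect every book."""
    import requests  # imported here so parse_books can be used offline

    url, books = start_url, []
    while url:
        response = requests.get(url, timeout=10)
        response.raise_for_status()
        # Pass raw bytes so BeautifulSoup can detect the page encoding itself.
        rows, next_href = parse_books(response.content)
        books.extend(rows)
        url = urljoin(url, next_href) if next_href else None
    return books


def save_csv(books, path="books.csv"):
    """Write the scraped rows to a CSV file with a header line."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        writer.writeheader()
        writer.writerows(books)
```

Under these assumptions, calling `save_csv(scrape_all())` would walk every listing page and write one row per book. Keeping the parsing in its own function makes it possible to check the selectors against a saved HTML snippet without hitting the network.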
SQL Analysis:
- Loaded the CSV into a SQL database to derive key insights:
  - Count of books available in stock
  - Top 5 most expensive books
  - Average book rating
  - Distribution of books across different ratings (1 to 5 stars)
Sample Insights:
- There are X books in stock
- The most expensive book is “XYZ”, priced at £59.99
- Average rating of books is 4.2 stars
- Y books have a 5-star rating
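The four SQL insights listed above could be derived along these lines. The source does not say which database engine was used, so this sketch assumes SQLite through Python's built-in `sqlite3` module; the `books` table name, its column names, and the helper functions are hypothetical.

```python
import csv
import sqlite3


def insert_books(conn, records):
    """Create the (assumed) books table and insert an iterable of dict rows."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS books ("
        "title TEXT, price REAL, availability TEXT, rating INTEGER)"
    )
    conn.executemany(
        "INSERT INTO books VALUES (?, ?, ?, ?)",
        [(r["title"], float(r["price"]), r["availability"], int(r["rating"]))
         for r in records],
    )
    conn.commit()


def load_csv(conn, csv_path):
    """Read the scraped CSV (header row expected) into the books table."""
    with open(csv_path, newline="", encoding="utf-8") as f:
        insert_books(conn, csv.DictReader(f))


# One query per insight named in the list above.
QUERIES = {
    "in_stock_count":
        "SELECT COUNT(*) FROM books WHERE availability LIKE 'In stock%'",
    "top_5_expensive":
        "SELECT title, price FROM books ORDER BY price DESC LIMIT 5",
    "average_rating":
        "SELECT ROUND(AVG(rating), 2) FROM books",
    "rating_distribution":
        "SELECT rating, COUNT(*) FROM books GROUP BY rating ORDER BY rating",
}


def run_queries(conn):
    """Run every query and return {insight name: list of result rows}."""
    return {name: conn.execute(sql).fetchall() for name, sql in QUERIES.items()}
```

A typical run under these assumptions would be `conn = sqlite3.connect("books.db")`, then `load_csv(conn, "books.csv")`, then `run_queries(conn)`. The same SQL should carry over to other engines with little or no change.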
EDA and Visualization:
- Performed EDA using Pandas to summarize dataset characteristics (count, average price, missing values)
- Visualized key insights using Matplotlib & Seaborn:
  - Bar chart: Number of books per rating
  - Histogram: Distribution of book prices
  - Pie chart: Proportion of books in stock vs out of stock
  - Additional visualizations highlighting trends and patterns
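The summary statistics and the three named charts could be produced with something like the sketch below. It assumes a DataFrame with `title`, `price`, `availability`, and `rating` columns (for example from `pd.read_csv("books.csv")`, a hypothetical filename) and uses plain Matplotlib for all three plots; Seaborn's `countplot` and `histplot` would be natural substitutes for the first two. The function names are illustrative.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is required
import matplotlib.pyplot as plt
import pandas as pd


def summarize(df):
    """Return the row count, mean price, and total number of missing cells."""
    return {
        "count": len(df),
        "average_price": round(df["price"].mean(), 2),
        "missing_values": int(df.isna().sum().sum()),
    }


def plot_overview(df):
    """Draw the bar chart, histogram, and pie chart side by side."""
    fig, axes = plt.subplots(1, 3, figsize=(15, 4))

    # Bar chart: number of books per rating.
    counts = df["rating"].value_counts().sort_index()
    axes[0].bar(counts.index.astype(str), counts.values)
    axes[0].set(title="Books per rating", xlabel="Rating (stars)", ylabel="Books")

    # Histogram: distribution of book prices.
    axes[1].hist(df["price"].dropna(), bins=20)
    axes[1].set(title="Price distribution", xlabel="Price (£)", ylabel="Books")

    # Pie chart: in stock vs out of stock.
    # Assumption: rows with a missing availability count as out of stock.
    in_stock = df["availability"].str.startswith("In stock", na=False)
    stock = in_stock.map({True: "In stock", False: "Out of stock"}).value_counts()
    axes[2].pie(stock.values, labels=stock.index, autopct="%1.0f%%")
    axes[2].set_title("Stock status")

    fig.tight_layout()
    return fig
```

The returned figure can be written to disk with `fig.savefig("book_overview.png")`. Returning the figure rather than saving inside the function leaves the output format and location to the caller.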
Key Learnings:
- Gained hands-on experience in web scraping and data extraction
- Applied SQL for structured data analysis
- Developed skills in data visualization and EDA to generate actionable insights
- Learned to integrate Python, SQL, and visualization for end-to-end analytics projects