
Web scraping, SQL insights, and Python EDA project using the Books to Scrape dataset. Collected multi-page book data, stored it in CSV, generated SQL-based insights, and built visualizations with Pandas, Matplotlib, and Seaborn to explore ratings, prices, and availability trends.


Ujjwal611/Book_Inventory_Analysis

📚 Book Inventory Analysis

Project Leader: Neha Gupta

Team Members: Adarsh Rai, Ujjwal Jain


Project Overview

Book Inventory Analysis is an end-to-end analytics project that scrapes book data from Books to Scrape, derives insights from it with SQL, and visualizes the results in Python. It covers the full workflow, from raw web data extraction to actionable business insights.

Key Skills Applied:

  • Web Scraping with Python (BeautifulSoup, Requests)
  • Data Storage and Analysis using SQL
  • Exploratory Data Analysis (EDA) and Visualization with Pandas, Matplotlib, Seaborn

Project Workflow

1. Web Scraping (Python)

  • Scraped book details from all pages on the website, including:
    • Title
    • Price
    • Availability
    • Rating (out of 5 stars)
  • Stored the extracted data in a CSV file for further analysis.
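The scraping step above can be sketched roughly as follows. This is a minimal illustration, not the project's actual script: the start URL, the CSS selectors (`article.product_pod`, `p.star-rating`, `p.price_color`, `p.availability`, `li.next a`), the function names, and the `books.csv` file name are all assumptions that should be checked against the live site and the repository code.

```python
import csv
import re
from urllib.parse import urljoin

import requests
from bs4 import BeautifulSoup

# Assumed start URL and markup; verify both against the live site.
START_URL = "https://books.toscrape.com/catalogue/page-1.html"
RATING_WORDS = {"One": 1, "Two": 2, "Three": 3, "Four": 4, "Five": 5}
FIELDS = ["title", "price", "availability", "rating"]


def parse_books(html):
    """Parse one listing page; return (book dicts, next-page href or None)."""
    soup = BeautifulSoup(html, "html.parser")
    books = []
    for pod in soup.select("article.product_pod"):
        # The rating is assumed to be encoded as a word in the class list,
        # e.g. class="star-rating Three".
        classes = pod.select_one("p.star-rating")["class"]
        rating = next((RATING_WORDS[c] for c in classes if c in RATING_WORDS), None)
        # Keep only the numeric part of the price, dropping the currency sign.
        price_text = pod.select_one("p.price_color").get_text(strip=True)
        price = float(re.search(r"\d+(?:\.\d+)?", price_text).group())
        books.append({
            "title": pod.h3.a["title"],
            "price": price,
            "availability": pod.select_one("p.availability").get_text(strip=True),
            "rating": rating,
        })
    next_link = soup.select_one("li.next a")
    return books, (next_link["href"] if next_link else None)


def scrape_all(start_url=START_URL):
    """Follow the 'next' links until the last page and collect every book."""
    url, rows = start_url, []
    while url:
        response = requests.get(url, timeout=10)
        response.raise_for_status()
        # Passing bytes lets BeautifulSoup detect the page encoding itself.
        books, next_href = parse_books(response.content)
        rows.extend(books)
        url = urljoin(url, next_href) if next_href else None
    return rows


def save_csv(rows, path="books.csv"):
    """Write the scraped rows to a CSV file with a header row."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        writer.writeheader()
        writer.writerows(rows)
```

Calling `save_csv(scrape_all())` would then walk every listing page and write the CSV. Keeping `parse_books` separate from the network code makes the parsing logic easy to check against a saved HTML page.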

2. SQL Insights

  • Loaded the CSV into a SQL database to derive key insights:
    1. Count of books available in stock
    2. Top 5 most expensive books
    3. Average book rating
    4. Distribution of books across different ratings (1 to 5 stars)

Sample Insights (illustrative format; X, Y, and “XYZ” are placeholders):

  • There are X books in stock
  • The most expensive book is “XYZ”, priced at £59.99
  • Average rating of books is 4.2 stars
  • Y books have a 5-star rating
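The four queries listed above could look roughly like this. The README does not name the database engine, so this sketch assumes SQLite through Python's built-in `sqlite3` module; the table name `books`, the column names, and the `'In stock%'` match pattern are likewise assumptions based on the scraped fields.

```python
import csv
import sqlite3

# One query per insight from the list above (assumed table and column names).
QUERIES = {
    "in_stock_count": "SELECT COUNT(*) FROM books WHERE availability LIKE 'In stock%'",
    "top_5_expensive": "SELECT title, price FROM books ORDER BY price DESC LIMIT 5",
    "average_rating": "SELECT ROUND(AVG(rating), 2) FROM books",
    "rating_distribution": (
        "SELECT rating, COUNT(*) FROM books GROUP BY rating ORDER BY rating"
    ),
}


def load_rows(conn, rows):
    """Create the books table if needed and insert dict rows into it."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS books "
        "(title TEXT, price REAL, availability TEXT, rating INTEGER)"
    )
    conn.executemany(
        "INSERT INTO books VALUES (:title, :price, :availability, :rating)", rows
    )
    conn.commit()


def load_csv(conn, path="books.csv"):
    """Load the scraped CSV; SQLite column affinity converts the numeric strings."""
    with open(path, newline="", encoding="utf-8") as f:
        load_rows(conn, list(csv.DictReader(f)))


def run_insights(conn):
    """Run every query and return the raw result rows keyed by insight name."""
    return {name: conn.execute(sql).fetchall() for name, sql in QUERIES.items()}
```

For example, `conn = sqlite3.connect(":memory:")` followed by `load_csv(conn)` and `run_insights(conn)` would return a dictionary with one result list per insight.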

3. Exploratory Data Analysis (EDA) & Visualization

  • Performed EDA using Pandas to summarize dataset characteristics (count, average price, missing values)
  • Visualized key insights using Matplotlib & Seaborn:
    • Bar chart: Number of books per rating
    • Histogram: Distribution of book prices
    • Pie chart: Proportion of books in stock vs out of stock
    • Additional visualizations highlighting trends and patterns
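The EDA summary and the three named charts might be produced along these lines. This is a sketch under assumptions, not the project's notebook: the column names follow the scraped fields, and the function names, figure layout, bin count, and output file name are illustrative choices.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend so the script runs without a display

import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns


def summarize(df):
    """Basic dataset characteristics: row count, mean price, missing values."""
    return {
        "count": len(df),
        "average_price": round(df["price"].mean(), 2),
        "missing_values": df.isna().sum().to_dict(),
    }


def plot_overview(df, path="overview.png"):
    """Draw the bar chart, histogram, and pie chart side by side and save them."""
    fig, axes = plt.subplots(1, 3, figsize=(15, 4))

    # Bar chart: number of books per rating.
    sns.countplot(data=df, x="rating", ax=axes[0])
    axes[0].set_title("Number of books per rating")

    # Histogram: distribution of book prices.
    sns.histplot(data=df, x="price", bins=20, ax=axes[1])
    axes[1].set_title("Distribution of book prices")

    # Pie chart: in stock vs out of stock, derived from the availability text.
    in_stock = df["availability"].str.contains("In stock", case=False, na=False)
    counts = in_stock.map({True: "In stock", False: "Out of stock"}).value_counts()
    axes[2].pie(counts.values, labels=counts.index, autopct="%1.0f%%")
    axes[2].set_title("In stock vs out of stock")

    fig.tight_layout()
    fig.savefig(path)
    return fig
```

With the scraped file in place, `df = pd.read_csv("books.csv")` followed by `summarize(df)` and `plot_overview(df)` would print-ready the summary values and write the combined figure to disk.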

Outcome & Learning

  • Gained hands-on experience in web scraping and data extraction
  • Applied SQL for structured data analysis
  • Developed skills in data visualization and EDA to generate actionable insights
  • Learned to integrate Python, SQL, and visualization for end-to-end analytics projects
