MagicXML 🧙‍♂️

Advanced XML to CSV Conversion Tool

License: MIT · Python 3.8+ · FastAPI


🚀 Overview

MagicXML is a high-performance web application built with FastAPI that converts data between XML, CSV, Excel, JSON, PDF, and image formats. Designed for data analysts, developers, and e-commerce professionals, MagicXML handles complex structures with advanced parsing capabilities, asyncio-powered processing, and intelligent data classification.

Supported Conversions

  • Convert CSV to XML
  • Convert CSV to Excel
  • Convert Excel to CSV
  • Convert JSON to CSV
  • Convert CSV to JSON
  • Convert XML to JSON
  • Convert JPEG to PNG and PNG to JPEG
  • Convert PDF to CSV
  • Convert PDF to Excel
  • Convert PDF to JSON
  • Convert CSV to PDF
  • Convert Excel to PDF

🔗 Live Demo: https://magic-xml.replit.app

✨ Key Features

  • High-Performance Processing: Asynchronous architecture for efficient handling of large XML files

  • Intelligent Data Extraction: Contextual parsing of complex nested XML structures

  • Data Cleaning & Sanitization: Automatic cleaning of HTML tags and special characters

  • Multilingual Support: Interface available in English, Russian, and other languages

  • RESTful API: Programmatic access for seamless integration with your systems

  • Callback Support: Optional webhook notifications when processing is complete

  • Robust Error Handling: Comprehensive error management with detailed reporting

  • Versatile Format Conversions: Convert between CSV, XML, Excel, JSON, PDF, and JPEG/PNG images

🛠️ Technical Architecture

MagicXML is built on the following components:

  • FastAPI Backend: High-performance asynchronous API framework
  • Asyncio & Aiohttp: Non-blocking I/O operations for concurrent processing
  • XML ElementTree: Efficient XML parsing and traversal
  • BeautifulSoup: Intelligent HTML content cleaning
  • Modern Frontend: Responsive design with custom CSS and JavaScript

📊 Use Cases

  • E-commerce Data Processing: Convert product feeds from XML to CSV
  • Data Analysis: Transform XML datasets into analysis-ready CSV format
  • System Integration: Bridge XML-based systems with CSV-compatible tools
  • Catalog Management: Process large product catalogs efficiently
  • Automated Workflows: Integrate with data pipelines via API

🔧 Installation & Setup

Prerequisites

  • Python 3.8+
  • Git

Quick Start

# Clone the repository
git clone https://github.com/Solrikk/MagicXML.git
cd MagicXML

# Install dependencies
poetry install

# Run the application
poetry run uvicorn main:app --host 0.0.0.0 --port 8080 --reload

Alternatively, install dependencies with pip:

pip install -r requirements.txt

🔌 API Reference

Convert XML to CSV

curl -X 'POST' \
  'https://magic-xml.replit.app/process_link' \
  -H 'Content-Type: application/json' \
  -d '{
    "link_url": "https://example.com/data.xml",
    "preset_id": "optional-tracking-id",
    "return_url": "https://your-callback-url.com/webhook"
  }'

Response

{
  "file_url": "https://magic-xml.replit.app/download/data_files/example_com.csv",
  "preset_id": "optional-tracking-id",
  "status": "completed"
}
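The same request can be issued from Python. The following is a minimal sketch using only the standard library; the function names are hypothetical, and it assumes that preset_id and return_url may be omitted from the payload, which the example above suggests but does not confirm.

```python
import json
import urllib.request

BASE_URL = "https://magic-xml.replit.app"


def build_process_link_request(link_url, preset_id=None, return_url=None):
    """Build the POST request for /process_link without sending it."""
    payload = {"link_url": link_url}
    # Assumption: both fields are optional, so they are only sent when given.
    if preset_id is not None:
        payload["preset_id"] = preset_id
    if return_url is not None:
        payload["return_url"] = return_url
    return urllib.request.Request(
        f"{BASE_URL}/process_link",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def convert_xml_to_csv(link_url, preset_id=None, return_url=None, timeout=60):
    """Send the request and return the decoded JSON response as a dict."""
    request = build_process_link_request(link_url, preset_id, return_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling convert_xml_to_csv("https://example.com/data.xml") should return a dict shaped like the response above, with the CSV location under "file_url".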

Check Processing Status

curl -X 'GET' 'https://magic-xml.replit.app/status/{preset_id}'
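For scripted use, the status endpoint can be polled until the job finishes. The sketch below is an illustration under stated assumptions: the shape of the status response is not documented here, so the code assumes it is JSON with a "status" field that becomes "completed", as in the /process_link response. The helper names are hypothetical.

```python
import json
import time
import urllib.parse
import urllib.request

BASE_URL = "https://magic-xml.replit.app"


def fetch_status_http(preset_id):
    """GET /status/{preset_id} and decode the body as JSON (assumed format)."""
    url = f"{BASE_URL}/status/{urllib.parse.quote(preset_id, safe='')}"
    with urllib.request.urlopen(url, timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))


def wait_until_completed(fetch_status, preset_id, interval=2.0,
                         max_attempts=30, sleep=time.sleep):
    """Poll fetch_status until it reports "completed" or attempts run out."""
    for attempt in range(max_attempts):
        status = fetch_status(preset_id)
        # Assumption: a "status" key with the value "completed" marks success.
        if status.get("status") == "completed":
            return status
        if attempt < max_attempts - 1:
            sleep(interval)
    raise TimeoutError(f"{preset_id!r} not completed after {max_attempts} checks")
```

Typical use would be wait_until_completed(fetch_status_http, "optional-tracking-id"). Passing the fetch function as a parameter keeps the polling loop testable without network access.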

Download Generated CSV

curl -X 'GET' 'https://magic-xml.replit.app/download/data_files/{filename}'

📝 Implementation Details

Asynchronous Processing

MagicXML processes XML files asynchronously using Python's asyncio and aiohttp:

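async def process_offers_chunk(offers_chunk, build_category_path, format_type):
    # Convert every offer element in this chunk, preserving input order.
    offers = []
    for offer_elem in offers_chunk:
        # process_offer is a coroutine defined elsewhere in the project.
        offer_data = await process_offer(offer_elem, build_category_path, format_type)
        offers.append(offer_data)
    return {"offers": offers}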

Within a chunk, offers are awaited one after another; the gain for large XML files comes from running several chunks concurrently, so that one chunk can progress while another waits on I/O.
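One way to schedule chunks concurrently is asyncio.gather. The sketch below is an illustration, not MagicXML's actual code: process_offer is a hypothetical stand-in, and split_into_chunks, process_all_offers, and the chunk_size default are assumptions.

```python
import asyncio


async def process_offer(offer_elem, build_category_path, format_type):
    # Hypothetical stand-in for the real per-offer conversion.
    await asyncio.sleep(0)  # yield to the event loop, as real I/O would
    return {"id": offer_elem, "format": format_type}


async def process_offers_chunk(offers_chunk, build_category_path, format_type):
    offers = []
    for offer_elem in offers_chunk:
        offers.append(await process_offer(offer_elem, build_category_path, format_type))
    return {"offers": offers}


def split_into_chunks(items, chunk_size):
    """Split a list into consecutive slices of at most chunk_size items."""
    return [items[i:i + chunk_size] for i in range(0, len(items), chunk_size)]


async def process_all_offers(offer_elems, build_category_path, format_type,
                             chunk_size=100):
    chunks = split_into_chunks(offer_elems, chunk_size)
    # gather runs the chunk coroutines concurrently and returns their
    # results in the same order as the chunks were passed in.
    results = await asyncio.gather(
        *(process_offers_chunk(chunk, build_category_path, format_type)
          for chunk in chunks)
    )
    return {"offers": [offer for r in results for offer in r["offers"]]}
```

This only helps when process_offer actually awaits something (network or file I/O); purely CPU-bound parsing still runs on a single thread under asyncio.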

Text Processing & Data Cleaning

Descriptions are parsed with BeautifulSoup, and every tag except p and br is unwrapped, so only basic paragraph markup survives:

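from bs4 import BeautifulSoup

def clean_description(description):
    # Empty or missing descriptions become an empty string.
    if not description:
        return ''
    # The 'html5lib' parser requires the html5lib package to be installed.
    soup = BeautifulSoup(description, 'html5lib')
    allowed_tags = ['p', 'br']
    # Unwrap every tag not on the allow-list, keeping its inner content.
    for tag in soup.find_all(True):
        if tag.name not in allowed_tags:
            tag.unwrap()
    # Additional cleaning logic...
    return str(soup)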
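To experiment with the allow-list idea without installing BeautifulSoup or html5lib, the same technique can be sketched with the standard library's html.parser. This is a swapped-in alternative, not the project's implementation; the class and function names are hypothetical, and unlike the BeautifulSoup version it drops attributes on the tags it keeps.

```python
import html
from html.parser import HTMLParser

ALLOWED_TAGS = {"p", "br"}  # same allow-list as clean_description


class AllowListCleaner(HTMLParser):
    """Re-emit allowed tags and all text; silently drop every other tag."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in ALLOWED_TAGS:
            self.parts.append(f"<{tag}>")  # attributes are discarded

    def handle_endtag(self, tag):
        # br is a void element, so it never gets a closing tag.
        if tag in ALLOWED_TAGS and tag != "br":
            self.parts.append(f"</{tag}>")

    def handle_data(self, data):
        # Re-escape text so decoded entities cannot reintroduce markup.
        self.parts.append(html.escape(data, quote=False))


def clean_description_stdlib(description):
    if not description:
        return ""
    cleaner = AllowListCleaner()
    cleaner.feed(description)
    cleaner.close()  # flush any buffered trailing text
    return "".join(cleaner.parts)
```

For example, clean_description_stdlib('<div><p>Hello <b>world</b></p><br/></div>') yields '<p>Hello world</p><br>'.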

© 2025 MagicXML - Advanced XML to CSV Converter
