MagicXML 🧙‍♂️

Advanced XML to CSV Conversion Tool

⭐English⭐ | Russian | German | Japanese | Korean | Chinese

🚀 Overview

MagicXML is a high-performance web application built with FastAPI that converts data between XML, CSV, Excel, JSON, PDF, and image formats. Designed for data analysts, developers, and e-commerce professionals, MagicXML handles complex structures with advanced parsing capabilities, asyncio-powered processing, and intelligent data classification.

Supported Conversions

Convert CSV to XML
Convert CSV to Excel
Convert Excel to CSV
Convert JSON to CSV
Convert CSV to JSON
Convert XML to JSON
Convert JPEG to PNG
Convert PNG to JPEG
Convert PDF to CSV
Convert PDF to Excel
Convert PDF to JSON
Convert CSV to PDF
Convert Excel to PDF

🔗 Live Demo: https://magic-xml.replit.app

✨ Key Features

High-Performance Processing: Asynchronous architecture for efficient handling of large XML files
Intelligent Data Extraction: Contextual parsing of complex nested XML structures
Data Cleaning & Sanitization: Automatic cleaning of HTML tags and special characters
Multilingual Support: Interface available in English, Russian, and more languages
RESTful API: Programmatic access for seamless integration with your systems
Callback Support: Optional webhook notifications when processing is complete
Robust Error Handling: Comprehensive error management with detailed reporting
Versatile Format Conversions: Convert between CSV, XML, Excel, JSON, PDF, and JPEG/PNG images

🛠️ Technical Architecture

MagicXML is built on the following components:

FastAPI Backend: High-performance asynchronous API framework
Asyncio & Aiohttp: Non-blocking I/O operations for concurrent processing
XML ElementTree: Efficient XML parsing and traversal
BeautifulSoup: Intelligent HTML content cleaning
Modern Frontend: Responsive design with custom CSS and JavaScript
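To illustrate the ElementTree-based parsing listed above, here is a minimal sketch of flattening an XML feed into CSV with the standard library. It is not MagicXML's actual code: the sample feed, the `offer` element layout, and the `offers_to_csv` helper are all assumptions made for this example.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical sample feed; element and attribute names are illustrative only.
SAMPLE_XML = """<catalog>
  <offer id="1"><name>Mug</name><price>4.50</price></offer>
  <offer id="2"><name>Teapot</name><price>19.00</price></offer>
</catalog>"""


def offers_to_csv(xml_text: str) -> str:
    """Flatten each <offer> element into one CSV row (illustrative helper)."""
    root = ET.fromstring(xml_text)
    rows = []
    for offer in root.iter("offer"):
        row = {"id": offer.get("id", "")}
        # Each direct child element becomes a column named after its tag.
        for child in offer:
            row[child.tag] = (child.text or "").strip()
        rows.append(row)

    # Collect column names in first-seen order so the header is stable.
    fieldnames = []
    for row in rows:
        for key in row:
            if key not in fieldnames:
                fieldnames.append(key)

    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)  # missing keys are written as empty cells
    return buffer.getvalue()


if __name__ == "__main__":
    print(offers_to_csv(SAMPLE_XML))
```

Real product feeds usually nest categories and parameters more deeply, so a production converter needs more than this one-level flattening.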

📊 Use Cases

E-commerce Data Processing: Convert product feeds from XML to CSV
Data Analysis: Transform XML datasets into analysis-ready CSV format
System Integration: Bridge XML-based systems with CSV-compatible tools
Catalog Management: Process large product catalogs efficiently
Automated Workflows: Integrate with data pipelines via API

🔧 Installation & Setup

Prerequisites

Python 3.8+
Git
Poetry (or pip, as described under Quick Start)

Quick Start

# Clone the repository
git clone https://github.com/Solrikk/MagicXML.git
cd MagicXML

# Install dependencies
poetry install

# Run the application
poetry run uvicorn main:app --host 0.0.0.0 --port 8080 --reload

Alternatively, install dependencies with pip and start the server directly:

pip install -r requirements.txt
uvicorn main:app --host 0.0.0.0 --port 8080 --reload

🔌 API Reference

Convert XML to CSV

curl -X 'POST' \
  'https://magic-xml.replit.app/process_link' \
  -H 'Content-Type: application/json' \
  -d '{
    "link_url": "https://example.com/data.xml",
    "preset_id": "optional-tracking-id",
    "return_url": "https://your-callback-url.com/webhook"
  }'

Response

{
  "file_url": "https://magic-xml.replit.app/download/data_files/example_com.csv",
  "preset_id": "optional-tracking-id",
  "status": "completed"
}
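The same request can be made from Python. The sketch below uses only the standard library and mirrors the curl call above; the helper names `build_process_request` and `submit` are made up for this example, and treating `preset_id` and `return_url` as optional is an assumption based on the sample payload.

```python
import json
import urllib.request

BASE_URL = "https://magic-xml.replit.app"


def build_process_request(link_url, preset_id=None, return_url=None):
    """Build (but do not send) a POST request for /process_link."""
    payload = {"link_url": link_url}
    # Assumption: these two fields may be omitted when not needed.
    if preset_id is not None:
        payload["preset_id"] = preset_id
    if return_url is not None:
        payload["return_url"] = return_url
    return urllib.request.Request(
        f"{BASE_URL}/process_link",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def submit(link_url, preset_id=None, return_url=None, timeout=60):
    """Send the request and return the decoded JSON response."""
    request = build_process_request(link_url, preset_id, return_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `submit("https://example.com/data.xml", preset_id="my-job")` should return a dictionary shaped like the response shown above, assuming the service is reachable.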

Check Processing Status

curl -X 'GET' 'https://magic-xml.replit.app/status/{preset_id}'

Download Generated CSV

curl -X 'GET' 'https://magic-xml.replit.app/download/data_files/{filename}'
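The status and download endpoints can be combined into a simple polling loop. This is a sketch under stated assumptions: the README does not document the status response, so the code assumes it is JSON with a `status` field that eventually equals `"completed"`, as in the `/process_link` response. The helper names and the injected `fetch_json` callable are hypothetical.

```python
import time
from urllib.parse import quote

BASE_URL = "https://magic-xml.replit.app"


def status_url(preset_id):
    """URL of the status endpoint for one tracking id."""
    return f"{BASE_URL}/status/{quote(preset_id, safe='')}"


def download_url(filename):
    """URL of a generated file under /download/data_files/."""
    return f"{BASE_URL}/download/data_files/{quote(filename, safe='')}"


def wait_until_completed(preset_id, fetch_json, interval=2.0,
                         max_attempts=30, sleep=time.sleep):
    """Poll the status endpoint until it reports "completed".

    fetch_json is any callable that takes a URL and returns the decoded
    JSON body as a dict. Assumption: the body contains a "status" field.
    """
    url = status_url(preset_id)
    for attempt in range(max_attempts):
        body = fetch_json(url)
        if body.get("status") == "completed":
            return body
        # Wait between attempts, but not after the final one.
        if attempt < max_attempts - 1:
            sleep(interval)
    raise TimeoutError(f"{preset_id} not completed after {max_attempts} attempts")
```

Passing `fetch_json` in as a parameter keeps the loop independent of any particular HTTP library and makes it easy to exercise without network access.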

📝 Implementation Details

Asynchronous Processing

MagicXML processes XML files asynchronously using Python's asyncio and aiohttp:

async def process_offers_chunk(offers_chunk, build_category_path, format_type):
    # process_offer is the per-offer coroutine (not shown here).
    offers = []
    for offer_elem in offers_chunk:
        # Offers within a chunk are awaited one at a time, in order.
        offer_data = await process_offer(offer_elem, build_category_path, format_type)
        offers.append(offer_data)
    return {"offers": offers}

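Within a chunk, offers are awaited in order. Because each chunk is handled by a coroutine, several chunks can be scheduled on the event loop at once, and that overlap is what shortens conversion time for large XML files.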
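One way to run several chunks at once is `asyncio.gather`. The sketch below is not taken from MagicXML: `process_offer` is a stand-in stub, and `split_into_chunks`, `process_chunk`, and `process_all` are hypothetical names chosen for this example.

```python
import asyncio


async def process_offer(offer):
    """Stand-in for the real per-offer coroutine (hypothetical)."""
    await asyncio.sleep(0)  # yield to the event loop, as real I/O would
    return {"id": offer}


async def process_chunk(chunk):
    # Offers inside one chunk are handled in order.
    return [await process_offer(offer) for offer in chunk]


def split_into_chunks(items, size):
    """Split a list into consecutive slices of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


async def process_all(offers, chunk_size=2):
    chunks = split_into_chunks(offers, chunk_size)
    # gather() schedules every chunk concurrently and returns the
    # results in the same order as the inputs.
    results = await asyncio.gather(*(process_chunk(c) for c in chunks))
    return [offer for chunk in results for offer in chunk]


if __name__ == "__main__":
    print(asyncio.run(process_all([1, 2, 3, 4, 5])))
```

Concurrency of this kind helps when the per-offer work waits on I/O. CPU-bound parsing still runs on a single thread.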

Text Processing & Data Cleaning

Descriptions are cleaned with BeautifulSoup. Every HTML tag except p and br is unwrapped, so its text is kept and the markup is dropped:

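from bs4 import BeautifulSoup  # the 'html5lib' parser also needs the html5lib package

def clean_description(description):
    # Empty or missing descriptions become an empty string.
    if not description:
        return ''
    soup = BeautifulSoup(description, 'html5lib')
    allowed_tags = ['p', 'br']
    # unwrap() removes the tag itself but keeps its children and text.
    for tag in soup.find_all(True):
        if tag.name not in allowed_tags:
            tag.unwrap()
    # Additional cleaning logic...
    return str(soup)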
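The allow-list idea can also be tried without third-party packages. The sketch below swaps BeautifulSoup for the standard library's `html.parser`. It is a different technique from the project's function, not a copy of it: the `AllowListCleaner` class and `clean_html` helper are hypothetical, and unlike `unwrap()` this version also drops attributes on the allowed tags.

```python
from html import escape
from html.parser import HTMLParser

ALLOWED_TAGS = {"p", "br"}


class AllowListCleaner(HTMLParser):
    """Keep text plus <p>/<br> tags; drop all other markup."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_starttag(self, tag, attrs):
        # Attributes are discarded even on allowed tags.
        if tag in ALLOWED_TAGS:
            self.parts.append(f"<{tag}>")

    def handle_endtag(self, tag):
        # <br> is a void element, so it never gets a closing tag.
        if tag in ALLOWED_TAGS and tag != "br":
            self.parts.append(f"</{tag}>")

    def handle_data(self, data):
        # Re-escape text so characters like & stay valid HTML.
        self.parts.append(escape(data, quote=False))


def clean_html(fragment):
    """Return the fragment with only allow-listed tags kept."""
    if not fragment:
        return ""
    cleaner = AllowListCleaner()
    cleaner.feed(fragment)
    cleaner.close()
    return "".join(cleaner.parts)


if __name__ == "__main__":
    print(clean_html("<div><p>Soft <b>cotton</b> towel</p><br/></div>"))
```

For malformed markup, a tolerant parser such as html5lib recovers more gracefully than this event-based approach.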

