A lightweight, comprehensive solution for managing Delta tables, built on Polars and deltalake.
Updated Jan 1, 2025 - Python
An open-source Python library for simplifying local testing of Databricks workflows that use PySpark and Delta tables.
🍺 A data engineering project showcasing an ELT pipeline built with modern technologies such as delta-rs and Apache Airflow.
A modular data-driven framework leveraging Spark, Delta Lake, MLflow, Airflow, and Streamlit to build and orchestrate a full-stack data lake solution. Designed for scalable ETL, automated ML pipelines, and interactive dashboarding.
This is a prototype of a big-data system for cultural heritage data and metadata. We ingest, process, and deduplicate cultural objects from Europeana, and then recommend similar content using CLIP.
Databricks & Blueprint Hackathon: using Databricks, Spark Structured Streaming, Delta, and Azure DevOps to automate the deployment of notebooks and jobs.