Airflow, Spark and Kafka example
Updated Nov 21, 2023 - Dockerfile
BigData Pipeline is a local testing environment for experimenting with various storage solutions (RDB, HDFS), query engines (Trino), schedulers (Airflow), and ETL/ELT tools (DBT). It supports MySQL, Hadoop, Hive, Kudu, and more.
Install Airflow on a single Raspberry Pi via Ansible, using LocalExecutor backed by a Postgres database.
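A LocalExecutor setup like the one above boils down to two Airflow settings. A minimal sketch of the relevant `airflow.cfg` fragment, assuming Airflow 2.3+ (where the connection string lives under `[database]`) and hypothetical Postgres credentials `airflow:airflow` on localhost:

```ini
[core]
# LocalExecutor runs tasks as parallel subprocesses on the single host
executor = LocalExecutor

[database]
# Metadata DB must be a real RDBMS (not SQLite) for LocalExecutor;
# credentials and host here are placeholders
sql_alchemy_conn = postgresql+psycopg2://airflow:airflow@localhost:5432/airflow
```

The same values can be supplied as environment variables (`AIRFLOW__CORE__EXECUTOR`, `AIRFLOW__DATABASE__SQL_ALCHEMY_CONN`), which is usually easier to template from Ansible.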
Airflow in Docker with pip installation from requirements.txt, running DAGs from a cloned repository
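A setup like this is typically a thin layer over the official image. A minimal Dockerfile sketch, assuming the repository provides a `requirements.txt` and a `dags/` directory (the image tag and paths are illustrative):

```dockerfile
FROM apache/airflow:2.7.3

# Install extra Python dependencies from the repository's requirements file
COPY requirements.txt /requirements.txt
RUN pip install --no-cache-dir -r /requirements.txt

# Ship the cloned repository's DAGs into Airflow's default DAG folder
COPY dags/ /opt/airflow/dags/
```

Installing with `pip` at build time (rather than at container start) keeps startup fast and makes the image reproducible.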
airflow + postgresql + docker-compose + celery
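That stack wires four pieces together: Postgres as the metadata database, Redis as the Celery broker, and separate webserver/scheduler/worker containers. A trimmed docker-compose sketch, assuming the official `apache/airflow` image and placeholder credentials (a real deployment also needs a one-time `airflow db init` and user creation step, omitted here):

```yaml
version: "3.8"

# Settings shared by all Airflow containers
x-airflow-common: &airflow-common
  image: apache/airflow:2.7.3
  environment:
    AIRFLOW__CORE__EXECUTOR: CeleryExecutor
    AIRFLOW__DATABASE__SQL_ALCHEMY_CONN: postgresql+psycopg2://airflow:airflow@postgres/airflow
    AIRFLOW__CELERY__BROKER_URL: redis://redis:6379/0
    AIRFLOW__CELERY__RESULT_BACKEND: db+postgresql://airflow:airflow@postgres/airflow
  depends_on:
    - postgres
    - redis

services:
  postgres:
    image: postgres:15
    environment:
      POSTGRES_USER: airflow
      POSTGRES_PASSWORD: airflow
      POSTGRES_DB: airflow

  redis:
    image: redis:7

  webserver:
    <<: *airflow-common
    command: webserver
    ports:
      - "8080:8080"

  scheduler:
    <<: *airflow-common
    command: scheduler

  worker:
    <<: *airflow-common
    command: celery worker
```

With CeleryExecutor, adding capacity is just scaling the `worker` service (`docker compose up --scale worker=3`), which is the main reason to prefer it over LocalExecutor in a compose setup.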