Data Pipeline Automation by integrating Django Signals with Celery
Mannu Gupta (~theparadoxer02) |
In this talk I will describe how we can automate a sophisticated multi-stage data pipeline, monitor every single stage, and handle the different failure scenarios that can arise at each stage. The main components used for architecting the application are Django Signals and Celery.
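The signal-to-task hand-off at the heart of this architecture can be sketched in plain Python so it runs without a broker. The `Signal` class below is a stand-in for `django.dispatch.Signal` (or the built-in `post_save`), and `run_pipeline_stage` stands in for a Celery `@shared_task`; the model and function names are hypothetical.

```python
# Minimal sketch of the Django-Signals-to-Celery pattern, in plain Python.
# In the real application, Signal would be django.dispatch.Signal (or
# django.db.models.signals.post_save), and the receiver would enqueue the
# Celery task with .delay() instead of calling it directly.

class Signal:
    """Stand-in for django.dispatch.Signal."""
    def __init__(self):
        self._receivers = []

    def connect(self, receiver):
        self._receivers.append(receiver)

    def send(self, sender, **kwargs):
        # Like Django, return a list of (receiver, response) pairs.
        return [(r, r(sender=sender, **kwargs)) for r in self._receivers]

# Hypothetical pipeline stage; in production this would be a Celery
# @shared_task, invoked asynchronously via run_pipeline_stage.delay(...).
def run_pipeline_stage(record_id):
    return f"stage-1 processed record {record_id}"

record_saved = Signal()  # analogue of post_save for a hypothetical model

def on_record_saved(sender, instance_id, **kwargs):
    # Real code: run_pipeline_stage.delay(instance_id)
    return run_pipeline_stage(instance_id)

record_saved.connect(on_record_saved)
responses = record_saved.send(sender="DataRecord", instance_id=42)
```

The key design point is the decoupling: the model layer only emits the signal, the receiver only enqueues the task, and the worker runs the stage out of process, so each stage can be monitored and retried independently.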
Additionally, I will shed some light on the advantages and disadvantages of using Apache Airflow, which might be a good alternative to the above solution.
Airflow, developed in Python, is a historically important tool in the data engineering ecosystem. It introduced the ability to combine a strict Directed Acyclic Graph (DAG) model with Pythonic flexibility in a way that made it appropriate for a wide variety of use cases.
- Basic understanding of Django Signals
- Basic understanding of Celery
Still working on it.
He is currently employed as a Software Engineer at Essentia SoftServ.
He has a keen interest in multiple tech domains, but Backend and DevOps interest him the most.