An ETL framework with Celery, with automatic load balancing

Harshavardhan Rangan (~hrangan)



Description:

We propose to load balance an ETL system made up of discrete readers, transformers and loaders. The framework will adjust the ratio of these workers based on their data I/O rates. Our approach is to override Celery's autoscaling mechanism.
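
To make the balancing idea concrete, here is a minimal sketch of one way the worker ratio could be derived from measured I/O rates. It is not the framework's actual implementation: the function name `rebalance`, the `per_worker_rate` mapping and the sample rates are all hypothetical, and the rule used (give each stage workers in inverse proportion to its per-worker throughput) is just one plausible policy. In Celery, logic like this might be called from a custom autoscaler (a subclass of `celery.worker.autoscale.Autoscaler` selected through the `worker_autoscaler` setting); that wiring is an assumption and is left out so the example runs on its own.

```python
from fractions import Fraction
from typing import Dict


def rebalance(per_worker_rate: Dict[str, float], total_workers: int) -> Dict[str, int]:
    """Split total_workers across stages so stage throughputs are roughly equal.

    per_worker_rate maps a stage name to the records/second that ONE worker
    of that stage handles (hypothetical measurements).
    """
    stages = list(per_worker_rate)
    if total_workers < len(stages):
        raise ValueError("need at least one worker per stage")
    if any(rate <= 0 for rate in per_worker_rate.values()):
        raise ValueError("per-worker rates must be positive")

    # A slower stage needs more workers, so its weight is the inverse of its rate.
    # Fractions keep the arithmetic exact and avoid float rounding surprises.
    weights = {s: 1 / Fraction(per_worker_rate[s]) for s in stages}
    total_weight = sum(weights.values())

    # Reserve one worker per stage, then share the rest proportionally.
    spare = total_workers - len(stages)
    ideal = {s: spare * weights[s] / total_weight for s in stages}
    alloc = {s: 1 + int(ideal[s]) for s in stages}

    # Hand out workers lost to rounding down, largest fractional part first.
    leftover = total_workers - sum(alloc.values())
    by_fraction = sorted(stages, key=lambda s: ideal[s] - int(ideal[s]), reverse=True)
    for s in by_fraction[:leftover]:
        alloc[s] += 1
    return alloc
```

For example, `rebalance({"reader": 400, "transformer": 100, "loader": 200}, 10)` assigns the most workers to the transformer, the slowest stage per worker, and the fewest to the reader. Reserving one worker per stage is a design choice that keeps every stage alive even when its measured rate is very high.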

While our experience lies in ETL, Celery is a new technology for us and something we've wanted to work with. This is going to be a learning experience for us, and we look forward to having you join us.

Prerequisites:

Intermediate to advanced Python knowledge<br> Intermediate to advanced ETL knowledge<br> Basic Celery knowledge<br>

Content URLs:

https://en.wikipedia.org/wiki/Extract,_transform,_load<br> https://en.wikipedia.org/wiki/Self-stabilization<br> http://celery.readthedocs.org/<br> http://redis.io/<br> https://www.rabbitmq.com/<br>

Section: Core Python
Type: Dev Sprint
Target Audience: Intermediate
Last Updated: