An ETL framework with Celery, with automatic load balancing
Harshavardhan Rangan (~hrangan)
We propose a load-balanced ETL system built from discrete readers, transformers, and loaders. The framework adjusts the ratio of these workers based on the data I/O rate of each worker type. Our approach is to override Celery's autoscaling mechanism.
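To make the balancing idea concrete, here is a minimal sketch of how a fixed pool of workers might be split across the three stages. It is not the framework's actual code and does not touch Celery: the function name `balance_workers`, the stage names, and the assumption that each stage reports a per-worker throughput in items per second are all hypothetical. The sketch gives slower stages proportionally more workers so that the stages end up with roughly equal total throughput.

```python
from typing import Dict


def balance_workers(rates: Dict[str, float], total_workers: int) -> Dict[str, int]:
    """Split a fixed worker pool across ETL stages (hypothetical helper).

    rates: assumed per-worker throughput of each stage, in items/second.
    Slower stages receive more workers so that stage throughputs even out.
    """
    if total_workers < len(rates):
        raise ValueError("need at least one worker per stage")
    if any(rate <= 0 for rate in rates.values()):
        raise ValueError("rates must be positive")

    # Time spent per item is the weight: a slow stage has a large weight.
    weights = {stage: 1.0 / rate for stage, rate in rates.items()}
    total_weight = sum(weights.values())

    # Every stage keeps one worker; the rest are shared out by weight.
    spare = total_workers - len(rates)
    ideal = {stage: spare * w / total_weight for stage, w in weights.items()}
    allocation = {stage: 1 + int(ideal[stage]) for stage in rates}

    # Hand out workers lost to rounding, largest fractional part first.
    leftover = total_workers - sum(allocation.values())
    by_remainder = sorted(
        rates, key=lambda stage: ideal[stage] - int(ideal[stage]), reverse=True
    )
    for stage in by_remainder[:leftover]:
        allocation[stage] += 1
    return allocation


if __name__ == "__main__":
    measured = {"reader": 100.0, "transformer": 25.0, "loader": 50.0}
    print(balance_workers(measured, total_workers=10))
```

In a running system the rates would be re-measured periodically and the allocation recomputed; how that result would be fed into Celery's autoscaler is left open here.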
While our experience lies in ETL, Celery is a new technology for us and something we've wanted to work with. This is going to be a learning experience for us, and we look forward to having you join us.
Intermediate to advanced Python knowledge
Intermediate to advanced ETL knowledge
Basic Celery knowledge
https://en.wikipedia.org/wiki/Extract,_transform,_load
https://en.wikipedia.org/wiki/Self-stabilization
http://celery.readthedocs.org/
http://redis.io/
https://www.rabbitmq.com/