How to build a production-ready distributed task queue management system with Celery

Vishrut Kohli (~vishrutkohli)




Brief Introduction

We hear a lot about queuing technologies like Redis, RabbitMQ, and SQS, but building and maintaining the publishing and consuming mechanisms around those queues is difficult and time-consuming. If you work with Python, Celery is the first thing that comes to mind when you hear "distributed task queue management system". Making a Celery setup production-ready, meaning highly efficient, resilient, transparent, and scalable, is a tricky business, though. In this talk, we will see how to use Celery to build and manage a task queue distribution system with ease, along with tips and tricks for finding the right configuration for Celery at scale.

Content of talk (Outline):

  • Introduction to the talk and the speaker - 2 min
    1. About myself
    2. What do I do at Grofers?
    3. Introduction to the Talk.

  • What are task queues and why do we need them? - 4 min
    1. A real-life problem we face.
    2. Its solution using task queues.
    3. Explaining task queues.

  • What is Celery and why do we need it? - 2 min
    1. Why Celery is useful.
    2. Talking about the keywords we are going to use in the talk.
    3. Explaining task queues.

  • Building a distributed task queueing system. - 5 min
    1. Starting with a real-life problem.
    2. Talking about the approach to start building our system.
    3. Talking about pipelines and how pipelines are better.
    4. Which broker to choose?
    5. Code samples.

  • Tuning a distributed task queueing system for better efficiency. - 5 min
    1. Does the batching of tasks help our use case?
    2. Why benchmarking after each optimization is important.
    3. What are I/O-bound and CPU-bound tasks, and why do we need to split them?
    4. When to use -Ofair optimization to maximize performance.
    5. When to use prefetch multiplier to maximize performance.
    6. Store task results in the Celery result backend only if you need them.
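
The tuning points above can be sketched as a celeryconfig.py-style settings module. This is a minimal sketch, not the configuration used in the talk: `task_routes`, `worker_prefetch_multiplier`, and `task_ignore_result` are Celery setting names, while the task names, queue names, and the `proj` app name are hypothetical, and the values are starting points to benchmark rather than recommendations.

```python
# celeryconfig.py-style settings: plain module-level assignments.

# Route I/O-bound and CPU-bound tasks to separate queues so that each
# queue can be served by its own pool of workers.
# Task and queue names below are hypothetical.
task_routes = {
    "tasks.send_notification": {"queue": "io_bound"},
    "tasks.resize_image": {"queue": "cpu_bound"},
}

# Let each worker process reserve only one message at a time. This tends
# to help when tasks are long-running; a higher value tends to help when
# tasks are many and short. Benchmark both for your workload.
worker_prefetch_multiplier = 1

# Do not store return values unless a caller actually reads them.
task_ignore_result = True

# Worker invocations, shown as comments ("proj" is a hypothetical app):
#   celery -A proj worker -Q io_bound -O fair
#   celery -A proj worker -Q cpu_bound -O fair
```

Such a module is typically loaded with `app.config_from_object("celeryconfig")`. A task that does need its result stored can override the global default with `ignore_result=False` in its decorator.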

  • Adding resiliency to the system - 3 min
    1. Talking about the basic auto-retry feature of Celery.
    2. Talking about exponential backoff and when we need it.
    3. Adding max retries for circuit breaking.
    4. Using acks_late=True to handle worker failures.
    5. Using retry jitter to add randomness to retries.
    6. Using a DLQ to record circuit breaker failures.
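
In Celery, the retry behaviour listed above is usually switched on through task options such as `autoretry_for`, `retry_backoff`, `retry_backoff_max`, `retry_jitter`, `max_retries`, and `acks_late`. The plain-Python sketch below is not Celery's implementation; it only illustrates the idea behind those options. The function names, the 600-second cap, and the in-memory list standing in for a dead-letter queue are all assumptions made for the illustration.

```python
import random
from typing import Callable, List, Optional, Tuple


def backoff_delay(retries: int, factor: int = 1, maximum: int = 600,
                  jitter: bool = True) -> int:
    """Seconds to wait before retry number `retries` (0-based)."""
    # Exponential backoff: factor * 1, 2, 4, 8, ... capped at `maximum`.
    delay = min(maximum, factor * (2 ** retries))
    if jitter:
        # Full jitter: pick a random delay in [0, delay] so that many
        # failing tasks do not all retry at the same instant.
        delay = random.randrange(delay + 1)
    return delay


def run_with_retries(task: Callable[[], object], max_retries: int,
                     dead_letters: List[Tuple[str, str]],
                     sleep: Callable[[float], None]) -> Optional[object]:
    """Run `task`, retrying with backoff; dead-letter it when retries run out."""
    for attempt in range(max_retries + 1):
        try:
            return task()
        except Exception as exc:
            if attempt == max_retries:
                # Circuit breaker: stop retrying and record the failure
                # in the (hypothetical) dead-letter store for inspection.
                name = getattr(task, "__name__", repr(task))
                dead_letters.append((name, repr(exc)))
                return None
            sleep(backoff_delay(attempt))
    return None
```

Passing `sleep` in as an argument keeps the sketch easy to test: production code would pass `time.sleep`, while a test can pass a list's `append` method to record the delays instead of waiting.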

  • What to do in times of SOS? - 2 min
    1. Start by checking CPU and memory usage to see what is causing the failure, and whether scaling horizontally or vertically can help.
    2. Using max tasks per child and max memory per child when you suspect a memory leak.
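
The two worker-recycling limits mentioned above map to the Celery settings `worker_max_tasks_per_child` and `worker_max_memory_per_child` (also available as the worker flags `--max-tasks-per-child` and `--max-memory-per-child`). A minimal sketch in celeryconfig.py style follows; the numbers are assumptions chosen for illustration and should be tuned against your own workload.

```python
# celeryconfig.py-style settings: plain module-level assignments.

# Replace a worker process after it has executed this many tasks, so that
# memory leaked by task code is released periodically.
worker_max_tasks_per_child = 500  # illustrative value

# Replace a worker process once its resident memory exceeds this limit.
# The setting is expressed in kilobytes; here roughly 200 MiB.
worker_max_memory_per_child = 200 * 1024  # illustrative value
```

These limits contain a suspected leak rather than fix it, so they are best treated as a stopgap while the leaking task is tracked down.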

  • Monitoring the system we built. - 2 min
    1. Talking about how Flower helps in monitoring our Celery setup and how easy it is to set up.
    2. Using RabbitMQ admin panel for monitoring queues.

  • Questions - 5 min

  • Bad Jokes - All the time


Prerequisites:

  • Basic knowledge of Python.
  • Basic knowledge of web development.
  • Some familiarity with Celery, or a little prior use of it.
  • A good sense of humour and a love for GIFs.

Video URL:

Content URLs:

Link to slides

Work In Progress Knowledgebase to the talk

A practice talk I gave at the weekly Grofers tech meet on the same topic on 13th August, with the same slides.

If someone needs access to this for further shortlisting, they can request access and I will provide it at the earliest. I can also make it public if that's fine with the organizers.

Speaker Info:

Vishrut Kohli is a software engineer at Grofers and was formerly data science lead at Leverage Edu, a VC-funded edtech startup. He started his Python journey at the beginning of his college days and learned a great deal from running it at scale, with Python as his primary tech stack. He was also a METI Japan internship scholar, won the SBI National Hackathon, and was a world finalist at the UnitedByHCL Hackathon held in Manchester. He is highly enthusiastic about running Python at scale easily and flawlessly, and in his free time he loves to build small side projects and tools and to teach Python to his juniors.

Speaker Links:

Personal Landing Page:

LinkedIn Link:

Github Link:

Reddit Link:

Facebook Link:


UnitedByHCL Hackathon Github link:

UnitedByHCL Hackathon Demo Link:

SBI National Hackathon Github Link:

SBI National Hackathon Demo Link:

Section: Web development
Type: Talks
Target Audience: Beginner
Last Updated: