Building your own task scheduler in 30 minutes
Gowtham Nagarajan (~gowtham25)
Periodic tasks are tasks that run repeatedly at specified time intervals with minimal human intervention. Most engineers have used them in one form or another: scheduling periodic database backups, enriching data for a machine learning algorithm, polling an API, sending daily marketing emails, and so on.
Our use case was exactly this: running specific tasks on specific schedules. The scheduler became more complex when multiple tasks had to invoke the same methods internally to perform specific actions. After exploring a couple of options with crontab and Celery RedBeat, we decided to write our own scheduler.
Our task scheduler uses Redis sorted sets to schedule and run tasks. Redis is an in-memory key-value store that is often used as a cache. The sorted set is one of the data types Redis supports.
How does it work?
- In a sorted set, elements (strings) are stored along with a floating-point value called a score. The element with the lowest score is the first one to be popped.
- Each task is stored as an element, with the timestamp at which it is due as its score.
- Workers check the first element of the sorted set every millisecond.
- Once the current timestamp is greater than the timestamp of the element, the element is popped from Redis and the task is executed.
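The steps above can be sketched roughly as follows. This is a minimal illustration, not the scheduler presented in the talk: the key name `scheduled_tasks`, the function names, and the `handlers` mapping are all hypothetical. The `client` argument is assumed to be a redis-py style client exposing `zadd`, `zrange` and `zrem`.

```python
import time

TASKS_KEY = "scheduled_tasks"  # hypothetical key name for the sorted set


def schedule_task(client, task_name, run_at):
    """Store the task with its due time (a Unix timestamp) as the score."""
    client.zadd(TASKS_KEY, {task_name: run_at})


def pop_due_task(client, now=None):
    """Return the earliest task if it is due and this worker removed it, else None."""
    now = time.time() if now is None else now
    # Peek at the element with the lowest score, i.e. the earliest due time.
    first = client.zrange(TASKS_KEY, 0, 0, withscores=True)
    if not first:
        return None
    task_name, due_at = first[0]
    if due_at > now:
        return None  # the earliest task is not due yet
    # zrem returns how many members were removed; 0 means another worker got it.
    if client.zrem(TASKS_KEY, task_name) != 1:
        return None
    # A real client returns bytes unless it was created with decode_responses=True.
    return task_name.decode() if isinstance(task_name, bytes) else task_name


def run_worker(client, handlers, poll_interval=0.001, max_polls=None):
    """Poll the sorted set and run the handler registered for each due task."""
    polls = 0
    while max_polls is None or polls < max_polls:
        task_name = pop_due_task(client)
        if task_name is not None:
            handlers[task_name]()
        else:
            time.sleep(poll_interval)
        polls += 1
```

With redis-py installed and a Redis server running locally (both assumptions), the client could be created with `redis.Redis()` and passed to `schedule_task` and `run_worker`. The default `poll_interval` of one millisecond mirrors the polling frequency described above.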
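Key points to note:
- Redis executes commands on a single thread, so only one worker ever gets to pick up a given task.
- Because each Redis command is atomic, removing an element succeeds for exactly one worker. This gives us locking and synchronization between workers.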
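To illustrate why one-at-a-time command execution is enough to act as a lock, here is a small simulation. It does not talk to Redis: `SingleThreadedSortedSet` is a hypothetical in-memory stand-in whose internal lock mimics Redis running one command at a time, and `race_for_task` is an illustrative helper. Several worker threads try to remove the same task, and only the one whose removal returns 1 is allowed to run it.

```python
import threading


class SingleThreadedSortedSet:
    """In-memory stand-in (not Redis): a lock makes commands run one at a time."""

    def __init__(self):
        self._scores = {}
        self._lock = threading.Lock()

    def zadd(self, member, score):
        with self._lock:
            self._scores[member] = score

    def zrem(self, member):
        # Mirrors ZREM's return value: 1 if the member was removed, 0 if absent.
        with self._lock:
            return 1 if self._scores.pop(member, None) is not None else 0


def race_for_task(store, member, n_workers=8):
    """Let n_workers threads compete for one task; return the ids that claimed it."""
    winners = []

    def worker(worker_id):
        if store.zrem(member) == 1:
            winners.append(worker_id)  # only the worker that removed it runs the task

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return winners
```

However the threads interleave, `race_for_task` returns a list with exactly one worker id, because the removal can succeed only once.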
- Introduction to task schedulers - 2 mins
- Use cases - 2 mins
- Options explored with existing task schedulers - 3 mins
- Introduction to redis and sorted set data type - 5 mins
- Using sorted set as a task scheduler and handling locks - 7 mins
- Code walk through to set up task scheduler - 7 mins
- Redis - https://redis.io/topics/data-types
- Sorted Sets - https://www.tutorialspoint.com/redis/redis_sorted_sets.htm
- Celery Redbeat - https://github.com/sibson/redbeat
Gowtham is a full stack developer working at Mad Street Den.