Distributed scheduling leveraging multiple nodes in the cluster
Avra Sengupta (~avra)
Setting up a cron job on a machine is perhaps the easiest way to schedule a task. But in a distributed system spanning several nodes, critical tasks can't be scheduled on just a single node, as that would introduce a single point of failure (SPOF). Nor can we schedule the same set of jobs on every node, as we don't want a task to be duplicated.
The solution to this problem is a distributed cron scheduler spread across the cluster. The nodes work concurrently, each picking tasks from a shared set of jobs, in such a way that no job is missed and no job is performed twice.
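As a rough illustration of the "neither miss a job nor run it twice" idea, here is a minimal sketch of one possible approach, not necessarily the one presented in the talk. It assumes every node mounts a common directory on a shared file system that honors exclusive file creation atomically. The names `try_claim`, `run_due_jobs`, `lock_dir`, `run_slot` and `node_id` are hypothetical and chosen only for this example. Each node's local cron would call `run_due_jobs` for the same scheduled slot, and whichever node creates the lock file first runs the job.

```python
import os
import tempfile


def try_claim(lock_dir, job_name, run_slot, node_id):
    """Return True if this node won the right to run job_name for run_slot.

    O_CREAT | O_EXCL makes the create fail if the file already exists, so
    at most one node can succeed for a given (job, slot) pair. This relies
    on the (assumed) shared file system implementing exclusive create
    atomically.
    """
    path = os.path.join(lock_dir, f"{job_name}.{run_slot}.lock")
    try:
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o644)
    except FileExistsError:
        return False  # another node already claimed this run
    with os.fdopen(fd, "w") as lock_file:
        lock_file.write(node_id)  # record the owner for debugging
    return True


def run_due_jobs(lock_dir, jobs, run_slot, node_id):
    """Run every job this node manages to claim; return the names it ran."""
    ran = []
    for name, task in jobs.items():
        if try_claim(lock_dir, name, run_slot, node_id):
            task()
            ran.append(name)
    return ran


if __name__ == "__main__":
    # Simulate three nodes waking up for the same scheduled slot.
    shared_dir = tempfile.mkdtemp()  # stands in for the shared mount
    jobs = {
        "backup": lambda: print("running backup"),
        "rotate-logs": lambda: print("rotating logs"),
    }
    for node in ("node-a", "node-b", "node-c"):
        print(node, "ran", run_due_jobs(shared_dir, jobs, "201501010000", node))
```

Because every node attempts every job, the failure of a single node does not cause a job to be skipped as long as at least one other node is up. The sketch deliberately ignores lock cleanup and the case where a node crashes after claiming a job but before finishing it.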
A basic understanding of how distributed systems work and of how crond works.
I am a software engineer at Red Hat Inc., working on GlusterFS, a distributed file system. I have about four years of experience as a software developer on Linux-powered distributed systems.
Find out more about GlusterFS at http://www.gluster.org/ and if you want to get your hands dirty with the code, http://glusterhacker.blogspot.in/ should help you get started.