
October 9, 2022

You need a better solution for Cron Jobs

If you come from a technical background, chances are you have already used cron jobs, or at least heard of them. Even if you don’t, no worries, I’ve got you covered: I will briefly explain the concept.

In this article, we are going to discuss the traditional approach to cron jobs and what its flaws are. So, without further ado, let’s get a brief understanding of what exactly a cron job is.

What is a cron job?

A cron job is a program or script that runs automatically at certain times. Cron originated on the Unix operating system. On Linux systems, a cron job can be scheduled to run every minute, hour, day, week, month, quarter, year, and so on.

When you schedule something to happen at a specific time, you want it to happen exactly then. That’s why you need a cron job. If you’re looking for ways to automate processes, you can create a cron job to execute a task. For example, you can schedule a backup to occur daily.
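To make "runs at certain times" concrete, here is a rough sketch of how a schedule in the standard five-field cron format (minute, hour, day of month, month, day of week) can be checked against a point in time. The function names `fieldMatches` and `matchesCron` are made up for this illustration, and the sketch only understands `*`, `*/n`, and plain numbers. Real cron also supports ranges, lists, and names, and treats the two day fields differently when both are restricted.

```typescript
// Hypothetical helper: does one cron field accept this value?
// Supports only "*", "*/n" and a plain number.
function fieldMatches(field: string, value: number): boolean {
  if (field === "*") return true;
  if (field.indexOf("*/") === 0) {
    const step = Number(field.slice(2));
    return step > 0 && Math.floor(step) === step && value % step === 0;
  }
  return Number(field) === value;
}

// Hypothetical helper: does the date (in local time) fall on the schedule?
function matchesCron(expression: string, date: Date): boolean {
  const fields = expression.trim().split(/\s+/);
  if (fields.length !== 5) {
    throw new Error("expected 5 fields: minute hour day-of-month month day-of-week");
  }
  const [minute, hour, dayOfMonth, month, dayOfWeek] = fields;
  return (
    fieldMatches(minute, date.getMinutes()) &&
    fieldMatches(hour, date.getHours()) &&
    fieldMatches(dayOfMonth, date.getDate()) &&
    fieldMatches(month, date.getMonth() + 1) && // JavaScript months are 0-based
    fieldMatches(dayOfWeek, date.getDay()) // 0 = Sunday
  );
}
```

With this sketch, a daily backup at 04:30 would be written as `30 4 * * *`, and `matchesCron("30 4 * * *", someDate)` is true only during that minute of each day.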

I’m sorry if that was a little overwhelming, but hopefully you got the idea. Let’s first take a quick look at the traditional way to do it.

The traditional setup for cron jobs in Linux systems

The concept of cron jobs comes from Unix-like systems such as Linux, which ship with a built-in facility to run a command at a specified time.

We manage cron jobs with a command called “crontab”. A few basic commands are enough to get started.

Crontab commands in Linux systems

Here are some of the most common crontab commands.

  • crontab -e: edit the current user’s crontab to add, change, or delete cron jobs.
  • crontab -l: list all the cron jobs for the current user.
  • crontab -u username -l: list another user’s cron jobs.
  • crontab -u username -e: edit another user’s cron jobs.

If you want to learn more about crontabs, check out this awesome article I found.

Cron job commands vary from OS to OS. Linux-based systems have crontab, but what about Windows? As a developer, you might not know where your application is going to be deployed, and you would have to set up cron jobs differently for each OS. I think this is hard.

Let’s assume you are working on a project, say a NodeJS project, and you need to set up cron jobs for your application. What if, instead of setting them up at the OS level, you could do the same in your NodeJS program?

A quick Google search for that will turn up an NPM package called “node-cron”. Other programming languages also have packages for setting up cron jobs in code.
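A minimal sketch of what in-code scheduling might look like. To keep it free of dependencies, the scheduler is passed in as a function with the same shape as node-cron’s `schedule(expression, task)` call. The `ScheduleFn` type, the `registerJobs` function, and the two example jobs are hypothetical names used only for this illustration.

```typescript
// Same shape as node-cron's `schedule(expression, task)`, declared locally
// so this sketch does not need the package installed.
type ScheduleFn = (expression: string, task: () => void) => unknown;

// Hypothetical job list: registers every job with the given scheduler
// and returns the cron expressions that were registered.
function registerJobs(schedule: ScheduleFn, log: (message: string) => void): string[] {
  const jobs: Array<{ expression: string; run: () => void }> = [
    // minute hour day-of-month month day-of-week
    { expression: "0 1 * * *", run: () => log("daily backup at 01:00") },
    { expression: "0 23 2 * *", run: () => log("monthly report on the 2nd at 23:00") },
  ];
  for (const job of jobs) {
    schedule(job.expression, job.run);
  }
  return jobs.map((job) => job.expression);
}
```

In a real application you would probably import the package and pass something like `(expression, task) => cron.schedule(expression, task)` as the first argument. The jobs then live in your codebase and start with your server, independent of the OS.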

Problems with the traditional approach

1. No conditional run

The “node-cron” package behaves in much the same way as OS-level cron jobs. With “node-cron” you set up your cron jobs within your NodeJS code, and you don’t really need to dig into how the OS-level cron jobs run.

Your cron jobs also start along with your normal NodeJS server. That’s easy, right?

But the problem I faced is that there is no option for conditional job execution. Once a cron job is set up, it is committed to running at the specified time, even when there is nothing for it to do.

Also, if your application serves millions of requests, you will probably be very careful about performance, RAM usage, and so on.

In that case, if your app is running a cron job for no good reason, you may see unnecessary RAM usage, which is certainly not what you want.
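A common workaround, sketched here with hypothetical names, is to put the condition inside the task itself. The `guarded` wrapper below is not part of node-cron. Note the limitation: the scheduler still fires the callback at every scheduled time, so this only skips the work, not the wake-up.

```typescript
// Hypothetical wrapper: returns a task that only does its work when
// `shouldRun` returns true. The returned function reports whether it ran.
function guarded(shouldRun: () => boolean, task: () => void): () => boolean {
  return () => {
    if (!shouldRun()) {
      return false; // the tick still fired, but the work was skipped
    }
    task();
    return true;
  };
}
```

For example, a job that sends queued emails could be wrapped as `guarded(() => queueLength > 0, sendQueuedEmails)`, where `queueLength` and `sendQueuedEmails` stand in for your own application code.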

2. No flexible execution time

In the traditional setup, you have to provide a fixed time at which the cron job runs. So the job will run, say, every day at 1 am, or on the 2nd of every month at 11 pm.

But what if you need to run the first cron on the 22nd of November and depending on some parameter the next run needs to take place after 5 hours or so?

If you need flexibility on when to run the cron job, you might face issues with the traditional Cron Jobs or with the “node-cron” package.
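To illustrate what "flexible" means here, the sketch below computes the next run time from the outcome of the previous run instead of from a fixed expression. Everything in it is an assumption made for illustration: the `RunResult` shape, the `nextRunAt` function, and the choice of 5 hours versus 24 hours.

```typescript
const HOUR_MS = 60 * 60 * 1000;

// Hypothetical outcome of a finished run.
interface RunResult {
  moreWorkPending: boolean;
}

// Hypothetical policy: come back in 5 hours if work is left over,
// otherwise wait a full day.
function nextRunAt(finishedAt: Date, result: RunResult): Date {
  const delayHours = result.moreWorkPending ? 5 : 24;
  return new Date(finishedAt.getTime() + delayHours * HOUR_MS);
}
```

A scheduler built around this idea would run the job once on the 22nd of November, call `nextRunAt` with the result, and arm a one-shot timer for the returned time. A fixed cron expression cannot express this, because the gap between runs is only known after each run finishes.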

3. No handling for missed jobs

Suppose you have a cron job scheduled to run at 5:30 am on 1st January 2022. Now, for some random reason, your server goes down at 5:29 am and comes back online at 5:45 am. So what will happen?

The cron job that was supposed to run at 5:30 am never executes, because your server was offline at that moment. The scheduled task simply does not run on 1st January.

Even though the server comes back online, the scheduler sees that the time to run the job has already passed, so it skips the execution and waits for the next scheduled time.

So this is a flaw, right? I think we need a better solution to overcome all these problems.
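As a rough idea of what handling a missed job could involve, the sketch below detects a missed run at startup. It assumes the application stores the time of the last completed run somewhere durable, such as a database or a file. The `wasMissed` function and its parameters are hypothetical.

```typescript
// Hypothetical check: a run counts as missed when its scheduled time has
// passed and no run has completed at or after that time.
function wasMissed(scheduledAt: Date, lastCompletedAt: Date | null, now: Date): boolean {
  const due = scheduledAt.getTime() <= now.getTime();
  const doneSinceScheduled =
    lastCompletedAt !== null && lastCompletedAt.getTime() >= scheduledAt.getTime();
  return due && !doneSinceScheduled;
}
```

In the example above, the server would call this when it comes back at 5:45 am with the 5:30 am schedule and the stored time of the last completed run. The check returns true, so the application knows it should run the job immediately instead of waiting for the next slot.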

Conclusion

Cron jobs are a great way to automate stuff. Operating systems offer extensive ways to set them up, and in some use cases this setup will work for you. For example, if you are running your cron job every 2 minutes, a missed run is not a problem at all. But if you are running your cron job once a day or once a week, each execution becomes important because the gap in between is big. If one execution fails, the next one will not happen for another day or week.

If an execution is skipped somehow, there is no handling for it. So we need something more resilient and robust. We will discuss possible solutions for this, both the packaged solutions that are available and whether we can create a custom implementation.

What issues are you guys facing with the traditional cron job setup? Let me know in the comments section below. I would love to hear about your experiences.

Till then, stay safe, and keep coding.

Posted in Linux