UNIX Shell: A Job Scheduler for Batch Processing


This article describes a trivial job scheduler for UNIX shell scripts. The scheduler guarantees that no more than a predefined constant number of jobs (out of all jobs in a batch) will be scheduled for execution at the same time, where the duration of a particular job cannot be predicted in advance, with an arbitrary variance in duration of jobs in the batch.

Imagine that you have to process a large number of files, possibly having different sizes (maybe very different sizes). You found that the processing is CPU-bound (i.e. CPU quantums is the primary resource that it consumes while its demands for I/O and RAM are negligible). After a careful examination, you decided that writing a shell script is the best way to express processing of a single file. Naturally, you may like to speed up the processing by using a CPU with multiple cores. Hence the question: how to schedule processing jobs in order to prevent thrashing while maximising occupancy of CPU cores at the same time?

An obvious solution is to to distribute jobs among CPU cores on the first-vacant basis while setting core affinity of every job (that is, confining a job to a particular core). This solution, however, involves much delicate programming, it will not be portable, and it may be surprisingly inefficient in run-time because this solution does not cooperate with the kernel's scheduler, causing it to jump through hoops while satisfying that solution's capricious insistence (which is very often incorrect and counter-productive) on a particular allocation of system resources. The kernel knows better, for it sees the complete picture.

The correct way of solving the problem is by letting the kernel know that there is a job to do—when the time comes. In this case, the right time to start the next job of the batch is when the number of jobs that are running is less than the number of CPU cores. Leave particulatities of resource management up to the kernel.

Any programmer will (correctly) note that it sounds like a trivial problem that can be solved using a counting semaphore. There is one unpleasant thing though: there are no semaphores in shell, the programmer needs to come up with a mechanism that behaves like one.

The following program implements the idea of a token dispenser with an attendant. The dispenser has two openings or slots: one for fetching tokens and another for returning them. Tokens can be fetched from the dispenser or returned to it one by one. The attendant keeps public order, prevents drunken brawls and snatching of a token by one job from another. In the beginning, the dispenser is loaded with a number of tokens. A job is allowed to start only if there is at least one token in the dispenser. The attendant fetches a token from the dispenser and passes it to the job. The job holds the token during its run-time and returns the token to the dispenser (the attendant is not involved) when it is completed. If the dispenser is empty, the next job waits for a token to be returned by another job that was started earlier.

The complete source code (64 lines of shell) is available here.

The dispenser is implemented in the function jobctrl that expects one parameter, the initial number of tokens in the dispenser:

24  jobctrl ()
25  {
26      rm -f "${FIFO}"; mkfifo -m 0600 "${FIFO}" # Create the feedback channel
27  
28      exec <> "${FIFO}" # Keep both ends of the feedback channel open as stdin
29      
30      # Announce the initial availability of job slots
31      dd if=/dev/zero bs=1 count="${1}" status=none | tr '\000' 'C'
32  
33      while true; do # Process jobs
34  	    # Read a notification of a job's completion or the stop command
35  	    [ "$(dd bs=1 count=1 status=none)" = 'S' ] && break
36  	    echo -n 'C' # Re-announce the slot that just became available
37      done
38  
39      rm -f "${FIFO}" # Remove the feedback channel
40      
41      echo -n '.' # Acknowledge the stop command
42  }

The FIFO that is created on line 26 is the slot for returning tokens. Since FIFO is a global variable, its value is known to jobs which ‘return tokens’ by writing to the FIFO.

Line 28 performs a cheap but critically important trick: it associates the function's standard input stream with the FIFO both for reading and writing, thus

exec is a shell built-in command that, in the absense of its command parameter, still performs the requested I/O redirection. The function itlsef only reads from the FIFO.

Line 31 prints to the standard output stream of the function a string of letters C (for ‘continue’). The number of letters corresponds to the initial number of tokens in the dispenser.

Lines 33–37 contain the main loop of the dispenser. Line 35 blocks the function's execution until a job writes a character to the FIFO (note that dd(1) reads from the standard input stream in this case). Any letter but S is treated as a ‘token return’. The letter S (for ‘stop’) is a special character that signals to the function that it must stop. The main processing loop (described below) writes S to the FIFO when all the processing is complete. Line 36 outputs the letter C to the standard output stream of the function thus letting the attendant know that another token was just returned.

As a special note, dd bs=1 count=1 status=none seems to be the only portable way of reading a character from the standard input in shell. GNU Bash provides the -n and -N options with the built-in command read. DASH, ash, and ksh have none. I did not check the others. I need not say that portability is quite expensive in this case.

Line 39 removes the FIFO from the file system. An attempt to write to FIFO past this point will create a regular file (which, while debugging, can be used a sign that something went wrong, i.e. there are jobs that were not properly waited for, see the description of the main processing loop below).

Line 41 acknowledges the receipt of the stop command by writing a special symol . (ASCII period) to the standard output stream of the function. This symbol is expected by the main processing loop (see below).

The simulator of a job is as follows:

 5  random ()
 6  {
 7      local lim=$((${1}))
 8      local x="$(dd if=/dev/urandom bs=1 count=1 status=none \
 9      	      | hexdump -e '"%d"')"
10      echo "$((${x} % ${lim}))"
11  }
12  
13  job ()
14  {
15      local id="${1}"
16      local seconds="$(random 30)"
17  
18      echo "$(printf "%03d" "${id}"): sleeping for ${seconds} seconds."
19      sleep "${seconds}"
20      
21      echo -n 'C' > "${FIFO}"
22  }

On lines 5–11 there is a helper function random that expects one parameter, an integer lim, and prints a pseudo-random number in the interval [0..lim].

The function job on lines 13–22 simulates a job by printing the value of its only parameter (an integer identifier) and sleeping for a random period of time that does not exceed 29 seconds. When the function wakes up it ‘returns the token’ that it holds by writing the letter C to the FIFO on line 21.

I promised the description of the main processing loop:

44  process ()
45  {
46      local limit="${1}"
47      local id=0
48  
49      while [ "${id}" -lt "${limit}" ]; do
50  	    dd of=/dev/null bs=1 count=1 status=none
51  	    job "${id}" &
52  	    id="$((${id} + 1))"
53      done
54  
55      wait # for the background jobs to terminate
56  
57      # Send the stop command to the job controller 
58      echo -n 'S' > "${FIFO}"
59  
60      # Wait for the job controller to acknowledge the stop command
61      dd of=/dev/null bs=1 count=1 status=none
62  }

The function process expects one parameter, an integer number of jobs in the batch. The main loop that iterates over all jobs in the batch is on lines 49–53. On line 50 the function blocks its execution until a single character (the letter C that the function jobctrl prints) appears on its standard input stream: this is the act of passing a token to the next job of the batch by the attendant. As soon as attendant passed the token, the job is started in the background on line 51.

As soon as all jobs of the batch were launched, the function waits for termination of all background processes on line 55.

When the wait is over, the function writes the letter S to the FIFO, thus letting the function jobctrl that it is time to stop its main loop.

On line 61 the function blocks its execution until a single character (the ASCII period . that the function jobctrl prints) appears on its standard input stream. The function exits ss soon as that character arrives.

‘—Where is the attendant?’, you might ask. Here it is, the pipe between jobctrl and process, on line 64:

64  jobctrl "${1}" | process "${2}"

Please note that the mechanism that I described above is quite slow in itself, it fits well when you can sacrifice some speed for simplicity, manageability, and portability. (The program is portable: I wrote it using ksh on OpenBSD and tested using dash and bash on a machine running Debian GNU/Linux 9.8 (stretch), which are quite different systems.)

The weak spot of the program is the bottleneck that is created by the main loop inside the function jobctrl. If your jobs are very quick (shorter than about quarter of a second), the read speed of the loop becomes critical for keeping up with job switching (the mandatory sequence of open(2), write(2), and close(2) at the end of each job also does not contribute to the overall speed), while there is an execve(2) of dd(1) in the middle of the loop.

However, if you do not expect to have more than several tens of job switches per second (it should be easy to estimate), this program may suit your needs.

Another issue that you should consider is job tracking: the program completely ignores it. If you have a ‘runaway’ job that fails to return the token that it holds, the performance may immediately decrease. Signals that cause process suspension or termination are a related problem. As with any shell script, you should provide proper traps that clean up jobs that are blocked on I/O that will never happen.

Vadim Penzin, March 27th, 2019


I hereby place this article into the public domain.
I publish this information in the hope that it will be useful, but without ANY WARRANTY.
You are responsible for any and all consequences that may arise as the result of using this information.