Do you ever get the funny feeling that your computer isn't quite as fast as it should be? I used to feel that way, and then I found GNU Parallel.
GNU Parallel is a shell utility for executing jobs in parallel. It can parse multiple inputs, thereby running your script or command against sets of data at the same time. You can use all your CPU at last!
If you've ever used xargs
, you already know how to use Parallel. If you don't, then this article teaches you, along with many other use cases.
Installing GNU Parallel
GNU Parallel may not come pre-installed on your Linux or BSD computer. Install it from your repository or ports collection. For example, on Fedora:
$ sudo dnf install parallel
Or on NetBSD:
If all else fails, refer to the project homepage[1].
From serial to parallel
As its name suggests, Parallel's strength is that it runs jobs in parallel rather than, as many of us still do, sequentially.
When you run one command against many objects, you're inherently creating a queue. Some number of objects can be processed by the command, and all the other objects just stand around and wait their turn. It's inefficient. Given enough data, there's always going to be a queue, but instead of having just one queue, why not have lots of small queues?
Imagine you have a folder full of images you want to convert from JPEG to PNG. There are many ways to do this. There's the manual way of opening each image in GIMP and exporting it to the new format. That's usually the worst possible way. It's not only time-intensive, it's labor-intensive.
A pretty neat