Since they are I/O bound on the same hardware, and sometimes also CPU
bound when compiling things, running them in parallel only slows
things down.
Closes: #28
As long as the image isn't used later on, there is no point in the
automatic `get` after `put`.
Also limit the inputs that each `put` step uses.
Closes: #29
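
For illustration, the `put` changes described above might look roughly like the sketch below. It assumes a Concourse pipeline; the job, resource, and artifact names (`build-image`, `app-image`, `image`, `source`) are hypothetical and not taken from the actual change. It also assumes a Concourse version that supports `no_get` on `put` steps. On older versions, `get_params: {skip_download: true}` has a similar effect with the registry-image and docker-image resource types.

```yaml
# Hypothetical job; all names here are illustrative only.
jobs:
- name: build-image
  plan:
  - get: source
  - task: build
    file: source/ci/build.yml   # assumed to produce an output named "image"
  - put: app-image
    inputs: [image]             # make only this artifact available to the put step
    no_get: true                # skip the implicit get that normally follows a put
    params:
      image: image/image.tar
```

Listing `inputs` explicitly avoids streaming every artifact in the build plan to the `put` container, which is where the saving comes from when the plan carries large artifacts.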