Process substitution

Feature of some Unix shells From Wikipedia, the free encyclopedia

In computing, process substitution is a form of inter-process communication that allows the input or output of a command to be made available as a file path. The command is substituted in-line, where a file name would normally occur, by the command shell. This allows programs that normally only accept file paths to directly read from or write to another program.

History

Process substitution was first introduced by the KornShell from Bell Labs and was first documented in ksh86, released in 1986.[1] It was initially available only on systems that support the /dev/fd/n character device files. The rc shell provides the feature as "pipeline branching", with a different syntax, in Version 10 Unix, released in 1990.[2] The Z shell has had the feature since its initial 1.0 version, released in 1990,[3] and the Bash shell since version 1.13.4, released in 1993;[4] both use the KornShell syntax.

Since its first version in 1990, Zsh has also offered a third form of process substitution, =(cmd), which uses a temporary file instead of a pipe. The fish shell added its own process substitution, with yet another syntax, in 2005, using pipes, and then a variant using temporary files in 2007. The temporary-file variant then became the default (initially inadvertently), as the pipe variant suffered from deadlocks.[5]

Syntax

Purpose                                                             Korn/Bash      Zsh            Rc             Fish
Pass output of cmd2 as file for cmd1 to read (concurrently)         cmd1 <(cmd2)   cmd1 <(cmd2)   cmd1 <{cmd2}   cmd1 (cmd2|psub -F)[α]
Pass input of cmd2 as file for cmd1 to write into (concurrently)    cmd1 >(cmd2)   cmd1 >(cmd2)   cmd1 >{cmd2}   -
Pass path of temporary file with contents of cmd2 to cmd1           -              cmd1 =(cmd2)   -              cmd1 (cmd2|psub -f)[α]
(sequentially)

  [α] Since 2015, psub alone has been an alias for psub -f, because psub -F is currently broken: it runs the commands sequentially.

Examples

The following examples use KornShell syntax.

The Unix diff command normally accepts the names of two files to compare, or one file name and standard input. Process substitution allows one to compare the output of two programs directly:

$ diff <(sort file1) <(sort file2)

The <(command) expression tells the command interpreter to run command and make its output appear as a file. The command can be any arbitrarily complex shell code.
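
The diff example above can be tried without any pre-existing input. The following is a minimal sketch, assuming Bash; the temporary directory and the file contents are made up for illustration. Two files hold the same lines in a different order, so diff should report no differences once both are sorted.

```bash
#!/usr/bin/env bash
# Illustrative input: two files with the same lines in a different order.
workdir=$(mktemp -d)
printf 'pear\napple\n' > "$workdir/file1"
printf 'apple\npear\n' > "$workdir/file2"

# diff receives two paths; each path delivers the output of one sort command.
if diff <(sort "$workdir/file1") <(sort "$workdir/file2") > /dev/null; then
    result=same
else
    result=different
fi
echo "$result"

# Remove the illustrative input files.
rm -r "$workdir"
```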

Without process substitution, the alternatives are:

  1. Save the output of the command(s) to a temporary file, then read the temporary file(s).
    $ sort file2 > file2.sorted
    $ sort file1 | diff - file2.sorted
    $ rm file2.sorted
    
  2. Create a named pipe (also known as a FIFO), start one command writing to the named pipe in the background, then run the other command with the named pipe as input.
    $ mkfifo sort2.fifo
    $ sort file2 > sort2.fifo &
    $ sort file1 | diff - sort2.fifo
    $ rm sort2.fifo
    
  3. On systems with /dev/fd/n support, perform the KornShell approach by hand:
    $ sort file1 | { sort file2 3<&- | diff /dev/fd/3 -; } 3<&0
    

All of these alternatives are more cumbersome than process substitution.

Process substitution can also be used to capture output that would normally go to a file, and redirect it to the input of a process. The KornShell syntax for writing to a process is >(command). Here is an example using the tee, wc and gzip commands that counts the lines in a file with wc -l and compresses it with gzip in one pass:

$ tee >(wc -l >&2) < bigfile | gzip > bigfile.gz

Advantages

The main advantages of process substitution over its alternatives are:

  • Simplicity: The commands can be given in-line; there is no need to save temporary files or create named pipes first.
  • Performance: Reading directly from another process is often faster than having to write a temporary file to disk, then read it back in. This also saves disk space.
  • Parallelism: The substituted process can be running concurrently with the command reading its output or writing its input, taking advantage of multiprocessing to reduce the total time for the computation.
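
The parallelism point can be observed with a rough timing sketch, assuming Bash. The two sleep commands are stand-ins for slow producers, and SECONDS is Bash's built-in whole-second counter. Because both substituted commands are started before cat begins reading, the total should be close to the duration of one producer rather than the sum of both.

```bash
#!/usr/bin/env bash
# Record the shell's second counter before starting.
start=$SECONDS

# Each substituted command takes about 2 seconds; they run concurrently.
cat <(sleep 2; echo first) <(sleep 2; echo second) > /dev/null

# Whole-second resolution: expect roughly 2, well below the sequential 4.
elapsed=$((SECONDS - start))
echo "elapsed: ${elapsed}s"
```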

Mechanism

Under the hood, the process substitution variants in which commands run concurrently have two possible implementations. On systems that support /dev/fd (most Unix-like systems), most implementations call the pipe() system call, which returns a file descriptor $fd for a new anonymous pipe, then create the string /dev/fd/$fd and substitute it on the command line. On systems without /dev/fd support, they call mkfifo() with a new temporary filename to create a named pipe, and substitute this filename on the command line. To illustrate the steps involved, consider the following simple process substitution on a system with /dev/fd support:

$ diff file1 <(sort file2)

The steps the shell performs are:

  1. Create a new anonymous pipe. This pipe will be accessible with something like /dev/fd/63; you can see it with a command like echo <(true).
  2. Execute the substituted command in the background (sort file2 in this case), piping its output to the anonymous pipe.
  3. Simultaneously, execute the primary command, replacing the substituted command with a path to the anonymous pipe. In this case, the full command might expand to something like diff file1 /dev/fd/63.
  4. When execution is finished, close the anonymous pipe.

For named pipes, the execution differs solely in the creation and deletion of the pipe; they are created with mkfifo() (which is given a new temporary file name) and removed with unlink(). All other aspects remain the same.
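
The substitution step can be made visible from inside the shell. This is a small sketch, assuming Bash on a system with /dev/fd support; show_args is a hypothetical helper that only prints the arguments it receives, and the exact path (for example /dev/fd/63) varies between systems.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print each argument the command actually receives.
show_args() { printf 'arg: %s\n' "$@"; }

# The second argument is a path chosen by the shell, not the text "<(true)".
# The literal name file1 is never opened here, so the file need not exist.
received=$(show_args file1 <(true))
echo "$received"

# The substituted path refers to a pipe (anonymous or named), not a regular file.
if [ -p <(true) ]; then kind=pipe; else kind=other; fi
echo "kind: $kind"
```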

The variants using temporary files such as Zsh's cmd1 =(cmd2) or Fish's cmd1 (cmd2|psub -f) first run cmd2 with its output redirected to a temporary file, then run cmd1 with =(cmd2) substituted with the path of that temporary file and automatically delete that file after cmd1 terminates.
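
Shells without a temporary-file form can approximate the same sequence by hand. The following is a rough emulation in Bash using mktemp, not how Zsh or fish implement it internally; the printf and sort commands stand in for cmd2, and tail stands in for cmd1.

```bash
#!/usr/bin/env bash
# Step 1: run the producer (cmd2) to completion, writing to a temporary file.
tmp=$(mktemp)
printf 'b\na\n' | sort > "$tmp"

# Step 2: the consumer (cmd1) receives the path of an ordinary, seekable file.
if [ -f "$tmp" ]; then file_kind=regular; else file_kind=other; fi
last_line=$(tail -n 1 "$tmp")
echo "$file_kind $last_line"

# Step 3: delete the temporary file once the consumer has finished.
rm -f "$tmp"
```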

Limitations

Except for the variants using temporary files, the "files" created are neither seekable nor mmap-able, which means the process reading from or writing to the file cannot perform random access; it must read or write once, from start to finish. Programs that explicitly check the type of a file before opening it may refuse to work with process substitution, because the "file" resulting from process substitution is not a regular file. Additionally, in the KornShell, in Bash versions prior to 4.4 (2016) and in Zsh versions prior to 5.6 (2018), it is not possible to obtain the exit status of a process substitution command from the shell that created the process substitution.[6]
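
Two of these limitations can be checked directly. This sketch assumes Bash 4.4 or later, where $! refers to the most recent process substitution and wait can collect its exit status; in older versions the second half is not expected to work. The exit code 3 is an arbitrary value chosen for illustration.

```bash
#!/usr/bin/env bash
# A program that insists on a regular file would reject the substituted path.
if [ -f <(true) ]; then regular=yes; else regular=no; fi
echo "regular file: $regular"

# Assumption: Bash 4.4+, where $! is the process ID of the last process substitution.
cat <(exit 3)
wait "$!"
status=$?
echo "exit status of the substituted command: $status"
```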
