Reading Output From Child Process Using Python
Solution 1:
At issue here is buffering by the child process. Your subprocess code already works as well as it could, but if the child process buffers its output then there is nothing that subprocess pipes can do about it.
I cannot stress this enough: the buffering delays you see are the responsibility of the child process; how it handles buffering has nothing to do with the subprocess module.
You already discovered this; it is why adding sys.stdout.flush() in the child process makes the data show up sooner: the child process uses buffered I/O (a memory cache that collects written data) before sending it down the sys.stdout pipe.
Python automatically uses line buffering when sys.stdout is connected to a terminal: the buffer is flushed whenever a newline is written. When using pipes, sys.stdout is not connected to a terminal, so a fixed-size block buffer is used instead.
Now, a Python child process can be told to handle buffering differently; you can set an environment variable or use a command-line switch to alter how it buffers sys.stdout (and sys.stderr and sys.stdin). From the Python command-line documentation:
-u
Force stdin, stdout and stderr to be totally unbuffered. On systems where it matters, also put stdin, stdout and stderr in binary mode.[...]
PYTHONUNBUFFERED
If this is set to a non-empty string it is equivalent to specifying the -u option.
If you are dealing with child processes that are not Python processes and you experience buffering issues with those, you'll need to look at the documentation of those processes to see if they can be switched to use unbuffered I/O, or be switched to more desirable buffering strategies.
One thing you could try is to use the script -c command to provide a pseudo-terminal to a child process. This is a POSIX tool, however, and is probably not available on Windows.
It should be noted that when flushing a pipe, no data is 'written to disk'; all data remains entirely in memory here. I/O buffers are just memory caches that get the best performance out of I/O by handling data in larger chunks. Only for a disk-based file object does fileobj.flush() push the buffers to the OS, which usually means the data is indeed written to disk.
Solution 2:
expect has a command called 'unbuffer' that will disable buffering for any command:
http://expect.sourceforge.net/example/unbuffer.man.html