Non blocking outputs can be seen for example
when piping telnet through tee to a terminal.
In that case telnet sets its input to nonblocking mode,
which results in tee's output being nonblocking,
in which case it may receive an EAGAIN error upon write().
The same issue was seen with mpirun.
The following can be used to reproduce this
locally at a terminal (in most invocations):
$ { dd iflag=nonblock count=0 status=none;
dd bs=10K count=10 if=/dev/zero status=none; } |
tee || echo fail >/dev/tty
* src/iopoll.c (iopoll_internal): A new function refactored from
iopoll(), to also support a mode where we check the output
descriptor is writeable.
(iopoll): Now refactored to just call iopoll_internal().
(fwait_for_nonblocking_write): A new internal function which
uses iopoll_internal() to wait for writeable output
if an EAGAIN or EWOULDBLOCK was received.
(fwrite_nonblock): An fwrite() wrapper which uses
fwait_for_nonblocking_write() to handle EAGAIN.
(fclose_nonblock): Likewise.
* src/iopoll.h: Add fclose_nonblock, fwrite_nonblock.
* src/tee.c: Call fclose_nonblock() and fwrite_nonblock() wrappers,
instead of the standard functions.
* tests/misc/tee.sh: Add a test case.
* NEWS: Mention the improvement.
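The retry-on-EAGAIN idea can be sketched at the file descriptor level
as follows. This is an illustration only, not the code in src/iopoll.c:
the name write_all_nonblock is hypothetical, and it uses write() and
poll() directly rather than the stdio-based wrappers named above.

```c
#include <errno.h>
#include <poll.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not from coreutils): write LEN bytes from BUF
   to FD, waiting for FD to become writable whenever a nonblocking
   descriptor reports EAGAIN or EWOULDBLOCK.
   Return 0 on success, -1 with errno set on failure.  */
static int
write_all_nonblock (int fd, const char *buf, size_t len)
{
  while (len > 0)
    {
      ssize_t n = write (fd, buf, len);
      if (n >= 0)
        {
          buf += n;
          len -= n;
          continue;
        }
      if (errno == EINTR)
        continue;
      if (errno != EAGAIN && errno != EWOULDBLOCK)
        return -1;

      /* The output is nonblocking and currently full:
         block in poll() until it is writable, then retry.  */
      struct pollfd pfd = { .fd = fd, .events = POLLOUT };
      if (poll (&pfd, 1, -1) < 0 && errno != EINTR)
        return -1;
    }
  return 0;
}
```

On a blocking descriptor the poll() branch is never taken, so the
helper then behaves like an ordinary full-write loop.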
The idea was suggested by Kamil Dudka in
https://bugzilla.redhat.com/1615467
* bootstrap.conf (gnulib_modules): Add free-posix, tmpfile.
* src/split.c (copy_to_tmpfile): New function.
(input_file_size): Use it to split larger files when sizes cannot
easily be determined via fstat or lseek. See Bug#61386#235.
* tests/split/l-chunk.sh: Mark tests of /dev/zero as
very expensive since they exhaust /tmp.
This was introduced recently with commit v9.1-166-g6b12e62d9
* src/tee.c (tee_files): Check the return from fopen()
before passing to fileno() etc.
* tests/misc/tee.sh: Add a test case.
Problem reported by Pádraig Brady (Bug#61386#226).
* src/split.c (parse_chunk): Use die instead of error.
(main): Quote a string.
* tests/local.mk (all_root_tests): Move du/apparent.sh from here ...
(all_tests): ... to here.
Problem reported by Christoph Anton Mitterer (Bug#61884).
* src/du.c (process_file): When counting apparent sizes, count
only usable st_size members.
* tests/du/apparent.sh: New file.
* tests/local.mk (all_root_tests): Add it.
* src/split.c (create): Avoid fstat + ftruncate in the usual case
where the output file does not already exist, by trying
to create it with O_EXCL first. This costs a failed open
in the unusual case where the output file already exists,
but that’s OK.
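The O_EXCL-first approach can be sketched as below. The name
create_output is hypothetical, and the sketch leaves out checks that
split itself performs on an existing file (for example, that the
output is not the same file as the input).

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper (not from coreutils): open NAME for writing.
   Try exclusive creation first, so the usual case of a new file
   needs no fstat or ftruncate.  Return the descriptor, or -1.  */
static int
create_output (const char *name)
{
  int fd = open (name, O_WRONLY | O_CREAT | O_EXCL, 0666);
  if (fd >= 0 || errno != EEXIST)
    return fd;

  /* Unusual case: the file already exists.  Reopen it without
     O_EXCL and truncate it explicitly.  */
  fd = open (name, O_WRONLY | O_CREAT, 0666);
  if (fd < 0)
    return -1;
  if (ftruncate (fd, 0) != 0)
    {
      int saved_errno = errno;
      close (fd);
      errno = saved_errno;
      return -1;
    }
  return fd;
}
```

The only extra cost is one failed open() when the file already exists.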
Prefer signed types to uintmax_t, as this allows for better
runtime checking with gcc -fsanitize=undefined.
Also, when an integer overflows just use the maximal value
when the code will do the right thing anyway.
* src/split.c (set_suffix_length, bytes_split, lines_split)
(line_bytes_split, lines_chunk_split, bytes_chunk_extract)
(lines_rr, parse_chunk, main):
Prefer a signed type (typically intmax_t) to uintmax_t.
(strtoint_die): New function.
(OVERFLOW_OK): New macro. Use it elsewhere, where we now allow
LONGINT_OVERFLOW because the code then does the right thing on all
practical platforms (they have int wide enough so that it cannot
be practically exhausted). We can do this now that we can safely
assume intmax_t has at least 64 bits.
(parse_n_units): New function.
(parse_chunk, main): Use it.
(main): Do not worry about integer overflow when the code
will do the right thing anyway with the extreme value.
Just use the extreme value.
* tests/split/fail.sh: Adjust to match new behavior.
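As an illustration of "just use the extreme value", the sketch below
parses a count into intmax_t and lets an overflowing value saturate at
INTMAX_MAX. The name parse_count_saturating is hypothetical, and the
sketch uses plain strtoimax() without the size suffixes that split
accepts.

```c
#include <inttypes.h>
#include <stdbool.h>

/* Hypothetical helper (not from coreutils): parse S as a
   non-negative decimal count.  On overflow, strtoimax() returns
   INTMAX_MAX, and that value is deliberately accepted as is.
   Return false for empty, malformed, or negative input.  */
static bool
parse_count_saturating (const char *s, intmax_t *out)
{
  char *end;
  intmax_t v = strtoimax (s, &end, 10);
  if (end == s || *end != '\0' || v < 0)
    return false;
  *out = v;
  return true;
}
```

A caller that loops "until COUNT units have been processed" behaves
the same for INTMAX_MAX as for any larger count, since neither can be
reached in practice.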
* src/split.c (bytes_split, lines_chunk_split)
(bytes_chunk_extract, main): Prefer ssize_t to size_t when
representing the return value of ‘read’. Use a negative value
instead of SIZE_MAX to indicate a missing value.
* src/split.c: Include sys-limits.h, not safe-read.h.
(input_file_size, bytes_split, lines_split, line_bytes_split)
(lines_chunk_split, bytes_chunk_extract, lines_rr): Call read, not
safe_read, since safe_read no longer buys us anything.
(main): Reject outlandish buffer sizes right away,
rather than allocating huge buffers and never using them.
* src/split.c (closeout): There should be no need for a special
case for ECHILD, since we never wait for the same child twice.
Simplify with this in mind.
Ignore and default SIGPIPE, rather than blocking and unblocking it.
* src/split.c (default_SIGPIPE):
New static var, replacing oldblocked and newblocked.
(create): Use it.
(main): Set it.
* src/split.c (input_file_size): Do not bother with lseek if the
initial read probe reaches EOF, since the file size is known then.
This works better on macOS, which doesn’t allow lseek on /dev/null.
Do not special-case size-zero files, as the issue can occur
with any size file (though /proc files are the most common).
If the current position is past end of file, treat this as
size zero regardless of whether the file has a usable st_size.
Pass through lseek -1 return values rather than using ‘return -1’;
this makes the code a bit easier to analyze (and a bit faster).
Avoid undefined behavior if the size calculation overflows.
(lines_chunk_split): Do not bother with lseek if it would have
no effect if successful. This works better on macOS, which
doesn’t allow lseek on /dev/null.
* tests/split/l-chunk.sh: Adjust to match fixed behavior.
* src/split.c (bytes_split): New arg REM_BYTES.
Use this to split more evenly. All callers changed.
(lines_chunk_split, bytes_chunk_extract):
Be consistent with new bytes_split.
* tests/split/b-chunk.sh, tests/split/l-chunk.sh: Test new behavior.
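One way to picture the more even split: give every chunk
FILE_SIZE / N bytes and hand out the FILE_SIZE % N remainder one byte
at a time. The sketch assumes the remainder goes to the leading
chunks. The name even_chunk_size is hypothetical, and this is not the
code in src/split.c.

```c
#include <stdint.h>

/* Hypothetical helper (not from coreutils): size of chunk K
   (1-based) when FILE_SIZE bytes are divided into N chunks,
   with N > 0 and FILE_SIZE >= 0.
   Assumption: the first FILE_SIZE % N chunks get one extra byte.  */
static intmax_t
even_chunk_size (intmax_t file_size, intmax_t n, intmax_t k)
{
  intmax_t rem_bytes = file_size % n;
  return file_size / n + (k <= rem_bytes);
}
```

For 10 bytes in 3 chunks this gives sizes 4, 3, 3, so no chunk differs
from another by more than one byte.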
* src/split.c (lines_chunk_split): Simplify by having chunk_end
point to the first byte after the chunk, rather than to the last
byte of the chunk. This will reduce confusion once we allow
chunks to be empty.
* src/tee.c (pipe_check): Make this a local var instead
of a static var. This suppresses a -Wmaybe-uninitialized
diagnostic with gcc 12.2.1 20221121 (Red Hat 12.2.1-4).
(main): Don’t set pipe_check unnecessarily if a later
-p option overrides an earlier one that wants pipe_check.
Problem discovered when I investigated the GCC warning.
* src/tail.c (check_output_alive): Reuse iopoll()
rather than directly calling poll() or select().
* src/iopoll.c (iopoll): Refactor to support non blocking operation,
or ignoring descriptors by passing a negative value.
* src/iopoll.h (iopoll): Adjust to support a BLOCK parameter.
* src/tee.c (tee_files): Adjust iopoll() call to explicitly block.
* src/local.mk: Have tail depend on iopoll.c.
* src/tee.c (usage): Change from describing one (non pipe) aspect
to the more general point of being the option to use if working with
pipes, and referencing the more detailed info below.
* doc/coreutils.texi (tee invocation): s/standard/appropriate/ since
the standard operation with pipes is to exit immediately upon write
error. s/early/immediately/ as it's ambiguous as to what "early"
is in relation to.
If input is intermittent (a tty, pipe, or socket), and all remaining
outputs are pipes (eg, >(cmd) process substitutions), exit early when
they have all become broken pipes (and thus future writes will fail),
without waiting for more input to become available, as future write
attempts to these outputs will fail (SIGPIPE/EPIPE).
Only provide this enhancement when pipe errors are ignored (-p mode).
Note that only one output needs to be monitored at a time with iopoll(),
as we only want to exit early if _all_ outputs have been removed.
* src/tee.c (pipe_check): New global for iopoll mode.
(main): enable pipe_check for -p, as long as output_error ignores EPIPE,
and input is suitable for iopoll().
(get_next_out): Helper function for finding next valid output.
(fail_output, tee_files): Break out write failure/output removal logic
to helper function.
(tee_files): Add out_pollable array to track which outputs are suitable
for iopoll() (ie, that are pipes); track first output index that is
still valid; add iopoll() broken pipe detection before calling read(),
removing an output that becomes a broken pipe.
* src/local.mk (src_tee_SOURCES): include src/iopoll.c.
* NEWS: Mention tee -p enhancement in Improvements.
* doc/coreutils.texi: Mention the new early exit behavior in the nopipe
modes for the tee -p option.
Suggested-by: Arsen Arsenović <arsen@aarsen.me>
When a program's output becomes a broken pipe, future attempts to write
to that output will fail (SIGPIPE/EPIPE). Once it is known that all
future write attempts will fail (due to broken pipes), in many cases it
becomes pointless to wait for further input for slow devices like ttys.
Ideally, a program could use this information to exit early once it is
known that future writes will fail.
Introduce iopoll() to wait on a pair of fds (input & output) for input
to become ready or output to become a broken pipe.
This is relevant when input is intermittent (a tty, pipe, or socket);
but if input is always ready (a regular file or block device), then
a read() will not block, and write failures for a broken pipe will
happen normally.
Introduce iopoll_input_ok() to check whether an input fd is relevant
for iopoll().
Experimentally, broken pipes are only detectable immediately for pipes,
but not sockets. Errors for other file types will be detected in the
usual way, on write failure.
Introduce iopoll_output_ok() to check whether an output fd is suitable
for iopoll() -- namely, whether it is a pipe.
iopoll() is best implemented with a native poll(2) where possible, but
fall back to a select(2)-based implementation on platforms where there are
portability issues. See also discussion in tail.c.
In general, adding a call to iopoll() before a read() in filter programs
also allows broken pipes to "propagate" backwards in a shell pipeline.
* src/iopoll.c, src/iopoll.h (iopoll): New function implementing broken
pipe detection on output while waiting for input.
(IOPOLL_BROKEN_OUTPUT, IOPOLL_ERROR): Return codes for iopoll().
(IOPOLL_USES_POLL): Macro for poll() vs select() implementation.
(iopoll_input_ok): New function to check whether an input fd is relevant
for iopoll().
(iopoll_output_ok): New function to check whether an output fd is
suitable for iopoll().
* src/local.mk (noinst_HEADERS): add src/iopoll.h.
* NEWS: Mention the fts fix to avoid the following assert
in rm on mem pressure:
Program terminated with signal SIGSEGV, Segmentation fault.
at ../lib/cycle-check.c:60
assure (state->magic == CC_MAGIC);
* gnulib: Update to the latest to pick up fts commit f17d3977.
* tests/rm/empty-inacc.sh: Ensure we're not reading from stdin
when we're relying on no prompt to proceed. Also change the
file being tested so that a failure in one test doesn't impact
following tests causing a framework failure.
gdb was seen to hang intermittently on macOS 12.
Also gdb requires signing on newer macOS systems:
https://sourceware.org/gdb/wiki/PermissionsDarwin
So restrict its use on macOS systems for now.
* tests/rm/r-root.sh: Skip on darwin systems.
* tests/tail-2/inotify-race.sh: Restrict the test to
inotify capable systems to avoid the hang with some gdbs.
* tests/tail-2/inotify-race.sh: Likewise.
Upcoming gnulib changes may disable SEEK_HOLE
even if the system supports it, so dynamically
check if we've SEEK_HOLE enabled.
* init.cfg (seek_data_capable_): SEEK_DATA may be disabled in the build
if the system support is deemed insufficient, so also use `cp --debug`
to determine if it's enabled.
* tests/cp/sparse-2.sh: Adjust to a more general diagnostic.
* tests/cp/sparse-extents-2.sh: Likewise.
* tests/cp/sparse-extents.sh: Likewise.
* tests/cp/sparse-perf.sh: Likewise.
How a file is copied is dependent on the sparseness of the file,
what file system it is on, what file system the destination is on,
the attributes of the file, and whether they're being copied or not.
Also the --reflink and --sparse options directly impact the operation.
Given it's hard to reason about the combination of all of the above,
the --debug option is useful for users to directly identify if
copy offloading, reflinking, or sparse detection are being used.
It will also be useful for tests to directly query if
these operations are supported.
The new output looks as follows:
$ src/cp --debug src/cp file.sparse
'src/cp' -> 'file.sparse'
copy offload: yes, reflink: unsupported, sparse detection: no
$ truncate -s+1M file.sparse
$ src/cp --debug file.sparse file.sparse.cp
'file.sparse' -> 'file.sparse.cp'
copy offload: yes, reflink: unsupported, sparse detection: SEEK_HOLE
$ src/cp --reflink=never --debug file.sparse file.sparse.cp
'file.sparse' -> 'file.sparse.cp'
copy offload: avoided, reflink: no, sparse detection: SEEK_HOLE
* doc/coreutils.texi (cp invocation): Describe the --debug option.
(mv invocation): Likewise.
(install invocation): Likewise.
* src/copy.h: Add a new DEBUG member to cp_options, to control
whether to output debug info or not.
* src/copy.c (copy_debug): A new global structure to
unconditionally store debug info from the last copy_reg operations.
(copy_debug_string, emit_debug): New functions to print debug info.
* src/cp.c: if ("--debug") x->debug=true;
* src/install.c: Likewise.
* src/mv.c: Likewise.
* tests/cp/debug.sh: Add a new test.
* tests/local.mk: Reference the new test.
* NEWS: Mention the new feature.
* src/remove.c (prompt, rm_fts): In the dir-handling code of both of
these functions, relax a "get_dir_status (...) == DS_EMPTY" condition
to instead test only "get_dir_status (...) != 0", enabling flow control
to reach the prompt function also for unreadable directories. However,
that function itself also needed special handling for this case:
(prompt): Handle empty, inaccessible directories properly,
deleting them with -d (--dir), and prompting about whether to delete
with -i (--interactive).
* tests/rm/empty-inacc.sh: Add tests for the new code.
Reported by наб <nabijaczleweli@nabijaczleweli.xyz> in
bugs.debian.org/1015273
* NEWS (Bug fixes): Mention this.
* tests/chmod/setgid.sh: Try all the groups you’re a member of,
in case id -g returns 4294967295 (nogroup) which is special
and does not let you chgrp a file to it.
* init.cfg (groups): Port better to macOS 12, where
group 4294967295 (nogroup) is special: you can be a member
without being able to chgrp files to the group.
* src/copy.c: Some changes if HAVE_FCLONEFILEAT && !USE_XATTR.
(fd_has_acl): New function.
(CLONE_ACL): Default to 0.
(copy_reg): Use CLONE_NOFOLLOW to avoid races like CVE-2021-30995
<https://www.trendmicro.com/en_us/research/22/a/
analyzing-an-old-bug-and-discovering-cve-2021-30995-.html>.
Use CLONE_ACL if available and working, falling back to cloning
without it if it fails due to EINVAL.
If the only problem with fclonefileat is that it would create the
file with the wrong timestamp, or with too few permissions,
do that but fix the timestamp and permissions afterwards,
rather than falling back on a traditional copy.
* src/copy.c (infer_scantype): Do not set *SCAN_INFERENCE
when returning a value other than LSEEK_SCANTYPE.
This is just minor refactoring; it simplifies the code a bit.
Callers are unaffected.
doc: document --preserve=mode better
* src/tail.c (tail_forever): Attempt to read() from non blocking
single non regular file, which shouldn't block, but also
read data even when the mtime doesn't change.
* NEWS: Mention the improvement.
* THANKS.in: Thanks for detailed testing.
This was seen to be an issue when following a
symlink that was being updated to point to
different underlying devices.
* src/tail.c (recheck): Guard the lseek() call to only
be performed for regular files.
* NEWS: Mention the bug fix.
--raw output is the most composable format, and also is a
robust way to discard the file name without parsing (escaped) output.
Examples:
$ cksum --raw -a crc "$afile" | basenc --base16
4ACFC4F0
$ cksum --raw -a crc "$afile" | basenc --base2msbf
01001010110011111100010011110000
$ cksum --raw -a sha256 "$bfile" | basenc --base32
AAAAAAAADHLGRHAILLQWLAY6SNH7OY5OI2RKNQLSWPY3MCUM4JXQ====
* doc/coreutils.texi (cksum invocation): Describe the new feature.
* src/digest.c (output_file): Inspect the new RAW_DIGEST global,
and output the bytes directly if set.
* src/cksum.c (output_crc): Likewise.
* src/sum.c (output_bsd, output_sysv): Likewise.
* tests/misc/cksum-raw.sh: A new test.
* tests/local.mk: Reference the new test.
* NEWS: Mention the new feature.