* NEWS: Mention the change.
* bootstrap.conf (gnulib_modules): Add posix_spawn,
posix_spawnattr_setsigdefault, posix_spawn_file_actions_addclose,
posix_spawn_file_actions_adddup2, and posix_spawn_file_actions_init.
* src/split.c: Include spawn.h.
(create): Use posix_spawn instead of fork and execl.
* NEWS: Mention the improvement.
* src/fmt.c (put_line): Exit if any error writing line.
(flush_paragraph): Exit if any error writing buffer.
* tests/misc/write-errors.sh: Enable the (flush_paragraph) test case,
and add another to check the put_line() case.
* src/numfmt.c (process line): Inspect the stdio error state when
outputting each line so that we don't have to check each output function
but do eventually exit upon write error, while also remaining buffered.
(main): Also check when outputting a header for the edge case
of very long headers.
* tests/misc/write-errors.sh: Enable the numfmt test case.
* NEWS: Mention the improvement, and reorganize all numfmt improvements.
* NEWS: Mention the change.
* bootstrap.conf (gnulib_modules): Add posix_spawnattr_destroy,
posix_spawnattr_init, posix_spawnattr_setflags, and posix_spawnp.
* src/install.c (strip): Use posix_spawnp instead of fork and execlp.
* tests/numfmt/mb-non-utf8.sh: Test GB18030 delimiter search.
* tests/local.mk: Reference the new test, and move
the existing numfmt.pl test from tests/misc to tests/numfmt.
* src/numfmt.c (is_utf8_charset): A new function to efficiently
determine if running with a UTF-8 charset.
(mbsmbchr): A new function to efficiently search for
a (multi-byte) character in a multi-byte string.
(next-field): Use mbsmbchr() rather than mbstr() directly.
* bootstrap.conf: Depend on mbsstr() to robustly search for a
multi-byte delimiter character (string) within a multi-byte string.
* src/numfmt.c (main): Accept a valid multi-byte delimiter character.
(next_field): Adjust delimiter search from single byte
to multi-byte aware. Use mbsstr to find the first match.
* tests/misc/numfmt.pl: Add test case.
* NEWS: Mention the improvement.
* src/numfmt.c (process_suffixed_number): Use gnulib's
mbs_endswith() helper, which is more robust in non UTF-8 locales.
Also always output a devmsg if a suffix is specified.
* src/numfmt.c (process_line): Restore byte overwritten with NUL,
as it may be part of a multi-byte blank.
(process_suffixed_number): Skip multi-byte blanks,
and correctly determine width with mbswidth().
(parse_format_string): Use c_isblank() to explicitly
indicate that's all the format spec supports.
* tests/misc/numfmt.pl: Add test cases.
* NEWS: Mention the bug fix.
Output, accept, or disallow a string between the number and unit
as recommended in <https://physics.nist.gov/cuu/Units/checklist.html>
I.e. support outputting numbers of the form: "1234 M"
* src/numfmt.c (simple_strtod_human): Skip unit separator if present,
or disallow a unit separator if empty.
(double_to_human): Output unit separator if specified.
(main): Accept --unit-separator.
* tests/misc/numfmt.pl: Add test cases.
* doc/coreutils.texi: Describe the new option,
giving examples of interaction with --delimiter.
* NEWS: Mention the new feature.
* THANKS.in: Add Johannes Schauer Marin Rodrigues,
who provided a preliminary patch.
This does not validate grouping character placement,
and currently just ignores grouping characters.
* src/numfmt.c (simple_strtod_int): Skip grouping chars
that are part of a number.
* tests/misc/numfmt.pl: Add test cases.
* NEWS: Mention the improvement.
* src/numfmt.c (simple_strtod_human): Accept (multi-byte)
non-breaking space character between number and unit.
Note we restrict this to a single character between number
and unit, to allow less ambiguous parsing if multiple blanks
are used to delimit fields.
* tests/misc/numfmt.pl: Add test cases.
* doc/coreutils.texi (numfmt invocation): Fix stale description
--delimiter skipping whitespace.
* NEWS: Mention the improvement.
* tests/du/bigtime.sh: At least on Linux, the ext4 filesystem
doesn't support such large timestamp, while tmpfs does. Try a bit
harder to look for a filesystem with large timestamp support.
Rename to less redundant names, now that we use
a separate test directory per util.
* tests/basenc/basenc-bounded-memory.sh -> .../bounded-memory.sh
* tests/basenc/basenc-large.sh -> .../large-input.sh
* tests/local.mk: Reference new names.
* src/digest.c (split_3): Mention the ambiguity in misinterpreting
base64 characters as hex is not a practical consideration.
Also add an example of both tagged formats which makes it
easier to interpret the parsing logic.
* src/digest.c (split_3): Fallback to untagged matching in the
case where -a is specified and we have matched a TAG in
the possibly base64 data. This might happen in 1 in every 64K files.
Note we remove the modification of string S (and redundant streq) in
the tag matching, as that was not needed since v8.32-223-g217cd278e.
* tests/cksum/cksum-c.sh: Add a test case.
* NEWS: Mention the bug fix.
* src/digest.c (sha2_sum_stream): Change from unreachable()
to affirm() so that we have defined behavior unless
we configure with --disable-assert.
(sha3_sum_stream): Likewise.
(split_3): Validate SHA2-lengths before passing on.
* tests/cksum/cksum-c.sh: Add a test case.
* NEWS: Mention the bug fix.
* src/digest.c (split_3): Look up the provided tag with -a sha2
because there is not a 1:1 mapping between them.
* tests/cksum/cksum-c.sh: Add a test case.
* NEWS: Mention the bug fix.
* src/remove.c (rm_fts): When not recursive,
arrange for ‘rm -d DIR’ to behave more like ‘rmdir DIR’.
This works better for Ceph snapshot directories.
Problem reported by Yannick Le Pennec (bug#78245).
* NEWS: Mention the bug.
* src/digest.c (split_3): Check that the base64 digest matches the
length supported by the algorithm.
(digest_check): Check that the read digest matches the base64 length of
the algorithm's digest. The previous condition would not work for
'cksum -a blake2b -l 8 ...'.
* tests/cksum/cksum-base64-untagged.sh: New file.
* tests/local.mk (all_tests): Add the new test.
* configure.ac (FORTIFY_SOURCE): Don’t indent a line
where the indentation can cause trailing white space in config.h.
Problem reported by Grisha Levit (Bug#79567).
This avoids CWE-122: Heap-based Buffer Overflow
where we could write blank characters beyond
the allocated heap buffer.
* src/expand-common.c (set_max_column_width): Refactor function from ...
(add_tab_stop): ... here.
(set_extend_size): Call new function.
(set_increment_size): Likewise.
* NEWS: Mention the bug fix.
Fixes https://bugs.gnu.org/79555
This is significant for the date, od, and pr commands
which have options of the form -X[OPTIONAL], which change like:
diff -r man.orig/date.1 man/date.1
< \fB\-I[FMT]\fR, \fB\-\-iso\-8601\fR[=\fI\,FMT\/\fR]
> \fB\-I\fR[\fI\,FMT\/\fR], \fB\-\-iso\-8601\fR[=\fI\,FMT\/\fR]
diff -r man.orig/od.1 man/od.1
< \fB\-w[BYTES]\fR, \fB\-\-width\fR[=\fI\,BYTES\/\fR]
> \fB\-w\fR[\fI\,BYTES\/\fR], \fB\-\-width\fR[=\fI\,BYTES\/\fR]
* man/help2man (convert_options): Support options of the form
-X[PARAM], so that we now consistently format them (in italics).
This changes a few pages, but the changes in tail.1
concisely illustrate the resulting man page changes:
$ diff -r man.orig/tail.1 man/tail.1
< \fB\-c\fR, \fB\-\-bytes\fR=\fI\,[\/\fR+]NUM
> \fB\-c\fR, \fB\-\-bytes\fR=\fI\,[+]NUM\/\fR
< \fB\-f\fR, \fB\-\-follow[=\fR{name|descriptor}]
> \fB\-f\fR, \fB\-\-follow\fR[=\fI\,{name|descriptor}\/\fR]
< \fB\-n\fR, \fB\-\-lines\fR=\fI\,[\/\fR+]NUM
> \fB\-n\fR, \fB\-\-lines\fR=\fI\,[+]NUM\/\fR
* man/help2man: Relax the option match so more --option
variations are supported, and passed through to convert_option().
Specifically more variations after '=' are now supported.
Also split and document the regular expression.
Reported at https://github.com/coreutils/coreutils/issues/84
* gl/modules/mbbuf: New file.
* gl/lib/mbbuf.c: Likewise.
* gl/lib/mbbuf.h: Likewise.
* gl/local.mk (EXTRA_DIST): Add the new files.
* bootstrap.conf (gnulib_modules): Add mbbuf.
* src/fold.c: Include mbbuf.h.
(fold_file): Use the mbbuf functions instead of calling fread and
handling the input buffer ourselves.
* cfg.mk (exclude_file_name_regexp--sc_preprocessor_indentation)
(exclude_file_name_regexp--sc_GPL_version): Match gl/lib/mbbuf.c and
gl/lib/mbbuf.h.
* configure.ac: Add detection of AVX512 intrinsics for wc.
* src/local.mk: Build AVX512 wc libraries.
* src/wc.c: Add runtime detection of AVX512 intrinsics and call
appropriate function when detected.
* src/wc.h (wc_lines_avx512): Declare function.
* tests/wc/wc-cpu.sh: Add a test that disables AVX512 intrinsics.
* src/wc_avx512.c: New file containing the wc -l implementation using
AVX512. The logic and code is reused from the AVX2 implementation with
slight adaptations. Replaced __builtin_popcount by __builtin_popcountll
and the combination of _mm256_cmpeq_epi8 and _mm256_movemask_epi8 by a
single call to _mm512_cmpeq_epi8_mask.
* NEWS: Mention the improvement.
A coverage report indicated these weren't tested
(as generally the test shell builtin is used).
* tests/test/test-file.sh: Add a new test.
* tests/local.mk: Reference the new test.
Prompted by the syntax-check failure:
maint.mk: the above files include argmatch.h but don't use it
make: *** [maint.mk:741: sc_prohibit_argmatch_without_use] Error 1
* src/join.c: Remove include, as the previous commit changed from using
ARRAY_CARDINALITY to countof - for which system.h includes the header.
* src/tail.c (file_lines): Seek to the previous block instead of the
beginning (or a little before) of the block that was just scanned.
Otherwise, the same block is read and scanned (at least partially)
again. This bug was introduced by commit v9.7-219-g976f8abc1.
* tests/tail/basic-seek.sh: Add a new test.
* tests/local.mk: Reference the new test.
* NEWS: mention the bug fix.