On macOS, isspace(0x85) returns true,
which results in splitting within multi-byte characters.
* src/fmt.c (get_line): s/isspace/c_isspace/.
* tests/fmt/non-space.sh: Add a new test.
* tests/local.mk: Reference new test.
* NEWS: Mention the fix.
Addresses https://bugs.gnu.org/54124
* tests/misc/wc-nbsp.sh: Only the en_US.iso8859-1 form
is accepted on macOS 10.15.7 at least. GNU/Linux also
accepts ISO-8859-1 (and canonicalizes the charmap to this).
This implements my suggestion in Bug#54112.
* src/dd.c (usage): Document the change.
(parse_integer, scanargs): Implement the change.
Omit some now-obsolete checks for invalid flags.
* tests/dd/bytes.sh: Test the new behavior, while retaining
checks for the now-obsolete usage.
* tests/dd/nocache_eof.sh: Avoid now-obsolete usage.
Alias iseek=N to skip=N, oseek=N to seek=N (Bug#45648).
* src/dd.c (scanargs): Parse iseek= and oseek=.
* tests/dd/skip-seek.pl (sk-seek5): New test case.
COLORTERM is an environment used usually to expose truecolor support in
terminal emulators. Therefore support matches on that in addition
to TERM. Also set the default COLORTERM match pattern so that
we apply colors if COLORTERM is any value.
This implicitly supports a terminal like "foot"
without a need for an explicit TERM entry.
* NEWS: Mention the new feature.
* src/dircolors.c (main): Match COLORTERM like we do for TERM.
* src/dircolors.hin: Add default config to match any COLORTERM.
* tests/misc/dircolors.pl: Add test cases.
* NEWS: Mention the new feature.
* doc/coreutils.texi (dircolors invocation): Describe the new
--print-ls-colors option.
* src/dircolors.c (print_ls_colors): A new global to select
between shell or terminal output.
(append_entry): A new function refactored from dc_parse_stream()
to append the entry in the appropriate format.
(dc_parse_stream): Adjust to call append_entry().
* tests/misc/dircolors.pl: Add test cases.
This also affects ls -v in some corner cases.
Problems reported by Michael Debertol <https://bugs.gnu.org/49239>.
While looking into this, I spotted some more areas where the
code and documentation did not agree, or where the documentation
was unclear. In some cases I changed the code; in others
the documentation. I hope things are nailed down better now.
* doc/sort-version.texi: Distinguish more carefully between
characters and bytes. Say that non-identical strings can
compare equal, since they now can. Improve readability in
various ways. Make it clearer that a suffix can be the
entire string.
* src/ls.c (cmp_version): Fall back on strcmp if filevercmp
reports equality, since filevercmp is no longer a total order.
* src/sort.c (keycompare): Use filenvercmp, to treat NULs correctly.
* tests/misc/ls-misc.pl (v_files):
Adjust test to match new behavior.
* tests/misc/sort-version.sh: Add tests for stability,
and for sorting with NUL bytes.
Use more constrained argument matching
to improve forward compatibility and robustness.
For example it's better that `cksum -a sha3` is _not_
equivalent to `cksum -a sha386`, so that a user
specifying `-a sha3` on an older cksum would not be surprised.
Also argmatch() is used when parsing tags from lines like:
SHA3 (filename) = abcedf....
so it's more robust that older cksum instances to fail
earlier in the parsing process, when parsing output from
possible future cksum implementations that might support SHA3.
* src/digest.c (algorithm_from_tag): Use argmatch_exact()
to ensure we don't match abbreviated algorithms.
(main): Likewise.
* tests/misc/cksum-a.sh: Add a test case.
When the destination for mv is a directory, use functions like openat
to access the destination files, when such functions are available.
This should be more efficient and should avoid some race conditions.
Likewise for 'install'.
* src/cp.c (must_be_working_directory, target_directory_operand)
(target_dirfd_valid): Move from here ...
* src/system.h: ... to here, so that install and mv can use them.
Make them inline so GCC doesn’t complain.
* src/install.c (lchown) [HAVE_LCHOWN]: Remove; no longer needed.
(need_copy, copy_file, change_attributes, change_timestamps)
(install_file_in_file, install_file_in_dir):
New args for directory-relative names. All uses changed.
Continue to pass full names as needed, for diagnostics and for
lower-level functions that do not support directory-relative names.
(install_file_in_dir): Update *TARGET_DIRFD as needed.
(main): Handle target-directory in the new, cp-like way.
* src/mv.c (remove_trailing_slashes): Remove static var; now local.
(do_move): New args for directory-relative names. All uses changed.
Continue to pass full names as needed, for diagnostics and for
lower-level functions that do not support directory-relative names.
(movefile): Remove; no longer needed.
(main): Handle target-directory in the new, cp-like way.
* tests/install/basic-1.sh:
* tests/mv/diag.sh: Adjust to match new diagnostic wording.
* src/dd.c: Prefer signed to unsigned types where either will do,
as this helps improve checking with gcc -fsanitize=undefined.
Limit the signed types to their intended ranges.
(MAX_BLOCKSIZE): Don’t exceed IDX_MAX - slop either.
(input_offset_overflow): Remove; overflow now denoted by negative.
(parse_integer): Return INTMAX_MAX on overflow, instead of unspecified.
Do not falsely report overflow for ‘00x99999999999999999999999999999’.
* tests/dd/misc.sh: New test for 00xBIG.
* tests/dd/skip-seek-past-file.sh: Adjust to new diagnostic wording.
New test for BIGxBIG.
When copying to a directory, use functions like openat to access
the destination files, when such functions are available. This
should be more efficient and should avoid some race conditions.
* bootstrap.conf (gnulib_modules): Add areadlinkat-with-size,
fchmodat, fchownat, mkdirat, mkfifoat, utimensat.
* src/copy.c (lchown) [!HAVE_LCHOWN]:
* src/copy.c, src/system.h (rpl_mkfifo, mkfifo) [!HAVE_MKFIFO]:
Remove. All uses removed.
(utimens_symlink): Remove; we shouldn’t have to worry about
those obsolete systems any more. All uses replaced by utimensat.
* src/copy.c (copy_dir, set_owner, fchmod_or_lchmod, copy_reg)
(same_file_ok, writable_destination, overwrite_ok, abandon_move)
(create_hard_link, src_is_dst_backup, copy_internal, copy):
* src/cp.c (make_dir_parents_private, re_protect):
New args for directory-relative names. All uses changed.
Continue to pass full names as needed, for diagnostics and for
lower-level functions like qset_acl that do not support
directory-relative names.
* src/copy.c (copy_reg): Prefer readlinkat to lstatat for merely
checking whether a file is a symlink, to avoid EOVERFLOW issues.
(subst_suffix): New function.
(create_hard_link): Accept a null SRC_NAME as meaning that if it
is needed it needs to be constructed from SRC_RELNAME, DST_NAME,
and DST_RELNAME.
(source_is_dst_backup): Use subst_suffix instead of doing it by hand.
(copy_internal): Remember and use directory-relative names instead
of full names.
* src/cp.c (lchown) [!HAVE_LCHOWN]: Remove. All uses removed.
(must_be_working_directory): New function.
(target_directory_operand): Simply take file name as arg,
and return a file descriptor or negative number on failure;
open with O_DIRECTORY to obtain any file descriptor.
All uses changed.
(target_dirfd_valid): New function.
(do_copy): Use these new functions to obtain a file descriptor
for any target directory, and use directory-relative names
for that directory.
(main): Omit no-longer-needed stat when --target-directory,
as do_copy now does this.
* src/ln.c (O_PATHSEARCH): Move from here ...
* src/system.h: ... to here.
* tests/cp/fail-perm.sh: Adjust to change in diagnostic wording,
and add a test for --no-target-directory.
Run "make update-copyright" and then...
* gnulib: Update to latest with copyright year adjusted.
* tests/init.sh: Sync with gnulib to pick up copyright year.
* bootstrap: Likewise.
* tests/sample-test: Adjust to use the single most recent year.
Living so close to Hollywood I know that "colorize"
means adding color to something that was already monochrome,
whereas "color" means to give color to something.
Coreutils apps color text instead of colorizing it.
This fixes a bug that I introduced in
2006-12-06T19:44:08Z!eggert@cs.ucla.edu.
* src/copy.c (USE_XATTR): New macro.
(copy_reg): Use it to help the compiler. Prefer open u+w to a
later chmod u=rw; u+r isn’t needed for xattr. For the later u-r,
do only one (or zero) chmod calls instead of two (or one).
In the last chmod, respect the umask instead of ignoring it.
* tests/cp/preserve-mode.sh: Test for the bug.
* tests/misc/env-signal-handler.sh: Use retry_delay_ to
avoid a false failure under load, where env hasn't setup
the SIGINT handling before timeout(1) sends the SIGINT.
Fixes https://bugs.gnu.org/51793
New warnings are added related to the handling
of thousands grouping characters, decimal points, and sign characters.
Examples now diagnosed are:
$ printf '0,9\n1,a\n' | sort -nk1 --debug -t, -s
sort: key 1 is numeric and spans multiple fields
sort: field separator ‘,’ is treated as a group separator in numbers
1,a
_
0,9
___
$ printf '1,a\n0,9\n' | LC_ALL=fr_FR.utf8 sort -gk1 --debug -t, -s
sort: key 1 is numeric and spans multiple fields
sort: field separator ‘,’ is treated as a decimal point in numbers
0,9
___
1,a
__
$ printf '1.0\n0.9\n' | LC_ALL=fr_FR.utf8 sort -s -k1,1g --debug
sort: note numbers use ‘,’ as a decimal point in this locale
0.9
_
1.0
_
$ LC_ALL=fr_FR.utf8 sort -n --debug /dev/null
sort: text ordering performed using ‘fr_FR.utf8’ sorting rules
sort: note numbers use ‘,’ as a decimal point in this locale
sort: the multi-byte number group separator in this locale \
is not supported
$ sort --debug -t- -k1n /dev/null
sort: key 1 is numeric and spans multiple fields
sort: field separator ‘-’ is treated as a minus sign in numbers
sort: note numbers use ‘.’ as a decimal point in this locale
$ sort --debug -t+ -k1g /dev/null
sort: key 1 is numeric and spans multiple fields
sort: field separator ‘+’ is treated as a plus sign in numbers
sort: note numbers use ‘.’ as a decimal point in this locale
* src/sort.c (key_warnings): Add the warnings above.
* tests/misc/sort-debug-warn.sh: Add test cases.
Also check that all sort invocations succeed.
* NEWS: Mention the improvement.
Addresses https://bugs.gnu.org/51011
* src/timeout.c (main): Propagate the killed status from the child.
* doc/coreutils.texi (timeout invocation): Remove the
description of the --foreground specific handling of SIGKILL,
now that it's consistent with the default mode of operation.
* tests/misc/timeout.sh: Add a test case.
* NEWS: Mention the change in behavior.
Fixes https://bugs.gnu.org/51135
* init.cfg (seek_data_capable_): Add a timeout to ensure failure for
slow lseek(...SEEK_DATA) calls (even if that syscall isn't interrupted).
* tests/cp/sparse-perf.sh: Run the SEEK_DATA check on the
1TiB empty file to exclude both FreeBSD 9.1 which takes 35s,
and ZFS which requires a delay of about 5s between file creation
and use of SEEK_DATA to correctly determine it's empty (return ENXIO).
Also remove the stat size checks as they invalidate the test
due to cp never writing data due to it being always zeros,
and thus converted to holes in the output.
* src/chmod.c: Reorder enum so CH_NOT_APPLIED
can be treated as a non error.
* tests/chmod/ignore-symlink.sh: A new test.
* tests/local.mk: Reference the new test.
* NEWS: Mention the bug fix.
Fixes https://bugs.gnu.org/50784
* tests/cp/sparse-perf.sh: Avoid the case where
we saw SEEK_DATA take 35s to return a result
against a 1TB sparse file. This happened on
a FreeBSD 9.1 VM at least.
Reported by Nelson H. F. Beebe.
* src/digest.c: Allow using the --untagged option with --check,
so that `cksum -a md5 --untagged` used to emulate md5sum for example,
may be augmented with the --check option. Also support the --tag
option with cksum, to allow overriding a previous --untagged setting.
* doc/coreutils.texi: Adjust accordingly.
* tests/misc/cksum-a.sh: Likewise.
* tests/tail-2/F-vs-rename.sh: Keep stdout and stderr separate,
so that interspersion doesn't impact regex checks. Also wait
for each file's data to be printed to avoid multiple writes
to a file to be printed in a single iteration, which would
impact the regex checks. Also we refactor the check function,
rather than repeatedly redefining variations.
This is wrong fix really, as only introducing delay I think.
* tests/tail-2/F-vs-rename.sh: Avoid a rare false failure
due to a race in the test. Now wait until tail has noticed
that b is replaced before writing to a, so that the subsequent
write of "y" to b will be displayed independently from
current contents of b ("x").
* tests/ls/removed-directory.sh: On FreeBSD 9.1 at least,
one gets ENOENT when trying to traverse the current removed dir
with ../, so instead reference the parent dir directly.
* src/digest.c (digest_check): Treat empty lines like comments,
as commented checksum files very often have empty lines.
* tests/misc/md5sum.pl: Adjust accordingly.
* src/digest.c: Always set the digest_length, so that
we check the correct number of hex digits when parsing
non tagged format checksums.
* tests/misc/cksum-a.sh: Add a test case. Also fix
up this test which was ineffective due to fail=1
being set in a subshell and ignored.
Support checksum files with CRLF line endings,
which is a common gotcha for using --check on windows,
or with checksum files generated on windows.
Note we escape \r here to support the original coreutils format
(with file name at EOL), and file names with literal
\r characters as the last character of their name.
* src/digest.c (filename_unescape): Convert \\r -> \r.
(print_filename): Escape \r -> \\r.
(output_file): Detect \r chars in file names.
(digest_check): Ignore literal \r char at EOL.
* tests/misc/md5sum.pl: Add a test case.
* tests/misc/sha1sum.pl: Likewise.
* NEWS: Mention the improvement.
This only practically matters on windows.
But given there are separate text handling options in cygwin,
keep the interface simple, and avoid exposing the
confusing binary/text difference here.
* doc/coreutils.texi (md5sum invocation): Mention that
--binary and --text are not supported by the cksum command.
* src/digest.c: Set flag to use binary mode by default.
(output_file): Don't distinguish text and binary modes with
' ' and '*', and just use ' ' always.
This format is a better default, since it results in simpler usage,
as you don't need to specify --tag on generation or -a on
checking invocations. Also it's a more general format supporting
mixed and length adjusted digests.
* doc/coreutils.texi (cksum invocation): Document a new --untagged
option, to use the older coreutils format.
(md5sum invocation): Mention that cksum doesn't support --tag.
* src/digest.c: Adjust cksum(1) to default to --tag,
and accept the new --untagged option.
* tests/misc/b2sum.sh: Adjust accordingly.
* tests/misc/cksum-a.sh: Likewise.
* tests/misc/cksum-c.sh: Likewise.
* src/cksum.h: Thread DELIM through the output functions.
* src/digest.c: Likewise.
* src/sum.c: Likewise.
* src/sum.h: Likewise.
* src/cksum.c: Likewise. Also adjust check to allow -z
with traditional output modes. Also ajust the global variable
name to avoid shadowing warnings.
* tests/misc/cksum-a.sh: Adjust accordingly.
Support `cksum --check FILE` without having to specify a digest
algorithm, allowing for more generic file check instructions.
This also supports mixed digest checksum files, supporting
more robust multi digest checks.
* src/digest.c (algorithm_from_tag): A new function to
identify the digest algorithm from a tagged format line.
(split3): Set the algorithm depending on tag, and update
the expected digest length accordingly.
* tests/misc/cksum-c.sh: Add a new test.
* tests/local.mk: Reference the new test.
* tests/misc/md5sum.pl: Adjust to more generic error.
* tests/misc/sha1sum.pl: Likewise.
* doc/coreutils.texi (md5sum invocation): Mention the new -c feature.
* NEWS: Mention the new feature.
Add message digest sm3, which uses the OSCCA SM3 secure
hash (OSCCA GM/T 0004-2012 SM3) generic hash transformation.
* bootstrap.conf: Add the sm3 module.
* doc/coreutils.texi: Mention the cksum -a option.
* src/digest.c: Provide support for --algorithm='sm3'.
* tests/misc/sm3sum.pl: Add a new test (from Tianjia Zhang)
* tests/local.mk: Reference the new test.
* NEWS: Mention the new feature.
Tested-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
* src/digest.c: Organize HASH_ALGO_CKSUM to be table driven,
and amalgamate all digest algorithms.
(main): Parse all options if HASH_ALGO_CKSUM, and disallow
--tag, --zero, and --check with the traditional bsd, sysv, and crc
checksums for now.
* src/local.mk: Reorganize to include all digest modules in cksum.
* tests/misc/cksum-a.sh: Add a new test.
* tests/misc/b2sum.sh: Update to default to checking with cksum,
as b2sum's implementation diverges a bit from the others.
* tests/local.mk: Reference the new test.
* doc/coreutils.texi (cksum invocation): Adjust the summary to
identify the new mode, and document the new --algorithm option.
* man/cksum.x: Adjust description to be more general.
* man/*sum.x: Add [See Also] section referencing cksum(1).
* NEWS: Mention the new feature.
* tests/ls/stat-vs-dirent.sh: Skip the test if we can't stat(1),
as the file may have been removed, or have a malformed name
due to '\n' etc. in the file name.
Adjust to output the file name if any name parameter is passed.
This is consistent with sum -s, cksum, and sum implementations
on other platforms. This should not cause significant compat
issues, as multiple fields are already output, and so already
need to be parsed.
* src/sum.c (bsd_sum_file): Output the file name
if any name parameter is passed.
* tests/misc/sum.pl: Adjust accordingly.
* doc/coreutils.texi (sum invocation): Likewise.
* NEWS: Mention the change in behavior.
* tests/misc/help-version.sh: Test that /dev/full causes
shell printf to fail. This ports better to NetBSD 9.88.46,
where it doesn’t. Problem reported by Nelson H. F. Beebe.
* tests/misc/help-version.sh: Merge gzip-related changes
back from gzip/tests/help-version. This fixes problems
when TERM is not 'dumb', and should simplify maintenance.
Emil Lundberg <lundberg.emil@gmail.com> reports in
https://bugs.gnu.org/49741 about a 'basenc --base64 -d' decoding bug.
The input buffer length was not divisible by 3, resulting in
decoding errors.
* NEWS: Mention fix.
* src/basenc.c (DEC_BLOCKSIZE): Change from 1024*5 to 4200 (35*3*5*8)
which is divisible by 3,4,5,8 - satisfying both base32 and base64;
Use compile-time verify() macro to enforce the above.
* tests/misc/basenc.pl: Add test.
This patch modifies basenc to prefer signed integers to
unsigned, as signed are less error-prone.
This patch also updates Gnulib to to latest, which updates Gnulib’s
base32 and base64 modules to prefer signed to unsigned integers.
* src/basenc.c: Include idx.h.
(struct base2_decode_context): Use unsigned char, not unsigned
for an octet that must fit in an unsigned char.
(base_encode, struct base_decode_context)
(base64_decode_ctx_wrapper, prepare_inbuf, base64url_encode)
(base64url_decode_ctx_wrapper, base32_decode_ctx_wrapper)
(base32hex_encode, base32hex_decode_ctx_wrapper, base16_encode)
(base16_decode_ctx, z85_encode, Z85_HI_CTX_TO_32BIT_VAL)
(z85_decoding, z85_decode_ctx, base2msbf_encode)
(base2lsbf_encode, base2lsbf_decode_ctx, base2msbf_decode_ctx)
(wrap_write, do_encode, do_decode, main):
Prefer signed integers to unsigned.
(main): Treat extremely large wrap columns as if they were
infinite; that’s good enough. Since we’re now using xstrtoimax,
this allows ‘-w -0’ (same as ‘-w 0’).
* tests/misc/base64.pl (gen_tests): -w-0 is no longer an error.
This better tests the SEEK_HOLE logic which
replaced the original fiemap hole identification logic.
Also it avoids a false failure in sparse-2.sh
on reflink supporting file systems, where we
try to correlate the file sizes produced by cp and dd.
* tests/cp/sparse-2.sh: s/cp/cp --reflink=never/
* tests/cp/sparse-extents-2.sh: Likewise.
* tests/cp/sparse-extents.sh: Likewise.
* tests/cp/sparse-perf.sh: Likewise.
* tests/cp/sparse.sh: Likewise.
Fixes https://github.com/coreutils/coreutils/issues/54