* NEWS: Mention the change.
* bootstrap.conf (gnulib_modules): Add posix_spawn,
posix_spawnattr_setsigdefault, posix_spawn_file_actions_addclose,
posix_spawn_file_actions_adddup2, and posix_spawn_file_actions_init.
* src/split.c: Include spawn.h.
(create): Use posix_spawn instead of fork and execl.
Use new Gnulib streq function instead of rolling our own macro.
* bootstrap.conf (gnulib_modules): Add stringeq.
* src/rm.c (main): Don’t assume streq is a macro that expands to (...),
as it is now a function.
* src/system.h:
* tests/df/no-mtab-status.sh, tests/df/skip-duplicates.sh:
(STREQ): Remove. All uses replaced by streq.
* cfg.mk (sc_standard_outputs): Add a grep command for source files.
* src/du.c (main): Use standard input instead of stdin, standard output
instead of stdout, and standard error instead of stderr in messages.
* src/nohup.c (main): Likewise.
* src/sort.c (main): Likewise.
* src/split.c (main): Likewise.
* src/stdbuf.c (main): Likewise.
* src/wc.c (main): Likewise.
* tests/du/files0-from.pl (@Tests): Adjust test case to new messages.
* tests/sort/sort-files0-from.pl: Likewise.
* tests/wc/wc-files0-from.pl: Likewise.
I thought of a way to pacify -Wswitch-enum without much trouble.
Either add all the enums, or if that’s too verbose use ‘switch (+E)’
to indicate to the reader that there need not be a case for
every enum value. Since this approach improves static checking,
make the change everywhere and check it with -Wswitch-enum.
* configure.ac: Compile with -Wswitch-enum if it works and
--enable-gcc-warnings. No need to remove -Wswitch-default
since Gnulib no longer adds it.
* src/chmod.c (describe_change):
* src/chown-core.c (describe_change):
* src/copy.c (copy_debug_string, copy_debug_sparse_string):
* src/df.c (decode_output_arg, get_dev):
* src/du.c (main):
* src/factor.c (print_factors):
* src/head.c (diagnose_copy_fd_failure):
* src/ls.c (time_type_to_statx, calc_req_mask)
(decode_line_length, get_funky_string, parse_ls_color)
(gobble_file, print_long_format):
* src/split.c (main):
* src/sync.c (sync_arg):
* src/tr.c (is_char_class_member):
* src/wc.c (main):
Add switch cases to pacify -Wswitch-enum.
* src/copy.c (copy_debug_string, copy_debug_sparse_string):
Add unreachable () for unreachable cases.
* src/digest.c (main):
* src/od.c (decode_one_format):
* src/tr.c (get_next, get_spec_stats):
switch (E) → switch (+E).
* src/digest.c (main):
* src/tr.c (get_next):
Omit unnecessary ‘default: break;’ that merely pacified GCC,
as the new pacification style is better.
* src/ls.c (decode_line_length):
Add default unreachable case to prevent warning that function
might not return a value.
(gobble_file): Distinguish DEREF_NEVER from unreachable cases.
Update to latest gnulib with new copyright year.
Run "make update-copyright" and then...
* gnulib: Update included in this commit as copyright years
are the only change from the previous gnulib commit.
* tests/init.sh: Sync with gnulib to pick up copyright year.
* bootstrap: Likewise.
* tests/sample-test: Adjust to use the single most recent year.
Also, prepare for allowing some arguments to overflow
without that being an error.
* gl/lib/xdectoint.c: Do not include stddef.h,
since we no longer use ‘unreachable’.
(xnumtoimax, xnumtoumax, __xnumtoint):
New arg FLAGS. All callers changed.
Stop using __xdectoint_signed. All definers removed.
* gl/lib/xdectoint.h (XTOINT_MIN_QUIET, XTOINT_MAX_QUIET)
(XTOINT_MIN_RANGE, XTOINT_MAX_RANGE): New flag constants.
* src/fmt.c (main):
* src/fold.c (main):
* src/nl.c (main):
* src/pr.c (getoptnum):
* src/split.c (main):
Use XTOINT_MIN_RANGE and XTOINT_MAX_RANGE if appropriate.
* src/pr.c (getoptnum): Return int rather than returning void
and storing through int *.
* src/stty.c (apply_settings):
Use ckd_add to check for overflow instead of doing it by hand.
(integer_arg): Accept and return uintmax_t, not unsigned long.
* src/split.c (line_bytes_split): Do not shrink hold buffer.
If it’s large for this batch it’s likely to be large for the next
batch, and for ‘split’ it’s not worth the complexity/CPU hassle to
shrink it. Do not assume hold_size can be bufsize.
Update to latest gnulib with new copyright year.
Run "make update-copyright" and then...
* gnulib: Update included in this commit as copyright years
are the only change from the previous gnulib commit.
* tests/init.sh: Sync with gnulib to pick up copyright year.
* bootstrap: Manually update copyright year,
until we fully sync with gnulib at a later stage.
* tests/sample-test: Adjust to use the single most recent year.
* src/system.h: Include <stdckdint.h>, since the new
DECIMAL_DIGIT_ACCUMULATE uses it.
Do not include stdckdint.h from files that also include system.h.
(DECIMAL_DIGIT_ACCUMULATE): Omit last arg, which is no longer needed.
Reimplement by using C23-style stdckdint.h’s ckd_mul and ckd_add,
as that’s more standard and is more likely to generate better code.
Include ctype.h only in files that need it. Many of its uses
are incorrect, as they assume single-byte locales. The idea is
to remove the incorrect uses later, when there is time.
* src/chroot.c, src/csplit.c, src/dd.c, src/digest.c, src/dircolors.c:
* src/expand-common.c, src/expand.c, src/fmt.c, src/fold.c, src/ls.c:
* src/od.c, src/pinky.c, src/pr.c, src/ptx.c, src/seq.c:
* src/set-fields.c, src/split.c, src/stdbuf.c, src/test.c:
* src/tr.c, src/truncate.c, src/unexpand.c, src/wc.c:
Include ctype.h.
* src/system.h: Do not include ctype.h.
include ctype.h.o
Most of this just affects commentary and documentations. The only
significant behavior change is translating author names via
proper_name_lite rather than proper_name_utf8, or not translating
them at all. proper_name_lite is good enough for coreutils and
avoids the bloat that had coreutils not using Gnulib proper_name.
* bootstrap.conf (gnulib_modules): Use propername-lite instead
of propername.
(XGETTEXT_OPTIONS): Look for proper_name_lite instead of for
proper_name_utf8.
* cfg.mk (local-checks-to-skip): Remove
sc_proper_name_utf8_requires_ICONV, since we no longer use
proper_name_utf8.
(old_NEWS_hash): Update.
(sc_check-I18N-AUTHORS): Remove; no longer needed.
* bootstrap.conf: Depend on tmpdir rather than tmpfile,
as the standard tmpfile() doesn't honor $TMPDIR.
* src/split.c (copy_to_tmpfile): Adjust to call temp_stream() rather
than tmpfile();
* NEWS: Mention the improvement.
* cfg.mk (sc_some_programs_must_avoid_exit_failure): Adjust to
avoid false positive.
(sc_prohibit_exit_write_error): A new syntax check to prohibit
open coding error(..., "write error"); instead directing to use...
* src/system.h (write_error): ... a new function to clear stdout errors
before we explicitly diagnose a write error and exit.
* src/basenc.c: Use write_error() to ensure no repeated diagnostics.
* src/cat.c: Likewise.
* src/expand.c: Likewise.
* src/factor.c: Likewise.
* src/paste.c: Likewise.
* src/seq.c: Likewise.
* src/shuf.c: Likewise.
* src/split.c: Likewise.
* src/tail.c: Likewise.
* src/tr.c: Likewise.
* src/unexpand.c: Likewise.
* tests/misc/write-errors.sh: Remove TODOs for the fixed utilities:
expand, factor, paste, shuf, tr, unexpand.
* bootstrap.conf (gnulib_modules): Add stdckdint.
Also, in C source code, prefer C23 macros like ckd_add
to their Gnulib near-equivalents like INT_ADD_WRAPV.
Include <stdckdint.h> as needed.
Now that Gnulib’s ‘error’ module does proper static checking
for not returning, we need no longer use the ‘die’ macro.
This makes code easier to read for people that are used to ‘error’.
* cfg.mk (error_fns, exclude_file_name_regexp): Remove ‘die’.
(sc_die_EXIT_FAILURE): Remove.
* src/die.h: Remove. All includes removed. All calls to ‘die’
changed back to calls to ‘error’.
* src/install.c (get_ids): Use quoteaf (problem found with
make syntax-check).
* src/system.h: Include error.h, since some of our macros call ‘error’.
Stop including error.h elsewhere.
This modernizes the source code somewhat, to take advantage
of advances in GCC over the years, and Gnulib’s ‘assure’ module.
Include assure.h in files that now need it.
Do not include assert.h directly; it’s no longer needed.
* bootstrap.conf (gnulib_modules): Add ‘assure’.
* gl/lib/randread.c (randread_error):
* src/chmod.c (describe_change):
* src/chown-core.c (describe_change):
* src/cp.c (decode_preserve_arg):
* src/head.c (diagnose_copy_fd_failure):
* src/ls.c (parse_ls_color):
* src/od.c (decode_one_format):
* src/split.c (main):
* src/test.c (binary_operator, posixtest):
Prefer affirm to abort, since it has better diagnostics in the
normal case and better performance with -DNDEBUG.
* gl/lib/xdectoint.c, src/die.h: Include stddef.h, for unreachable.
* gl/lib/xdectoint.c: Do not include verify.h; no longer needed.
* gl/lib/xdectoint.c (__xnumtoint):
* src/die.h (die):
Prefer C23 unreachable () to assume (false).
* gl/lib/xfts.c (xfts_open):
* src/basenc.c (base32hex_encode):
* src/copy.c (abandon_move, copy_internal, valid_options):
* src/cut.c (cut_fields):
* src/df.c (alloc_field, decode_output_arg, get_dev):
* src/du.c (process_file, main):
* src/echo.c (usage):
* src/factor.c (udiv_qrnnd, mod2, gcd2_odd, factor_insert_large)
(mulredc2, factor_using_pollard_rho, isqrt2, div_smallq)
(factor_using_squfof):
* src/iopoll.c (iopoll_internal, fwrite_wait):
* src/join.c (add_field):
* src/ls.c (dev_ino_pop, main, gobble_file, sort_files):
* src/mv.c (do_move):
* src/od.c (decode_format_string, read_block, dump, main):
* src/remove.c (rm):
* src/rm.c (main):
* src/sort.c (stream_open):
* src/split.c (next_file_name, lines_chunk_split):
* src/stdbuf.c (main):
* src/stty.c (set_speed):
* src/tac-pipe.c (line_ptr_decrement, line_ptr_increment):
* src/touch.c (touch):
* src/tr.c (find_bracketed_repeat, get_next)
(validate_case_classes, get_spec_stats, string2_extend, main):
* src/tsort.c (search_item, tsort):
* src/wc.c (main):
Prefer affirm to assert, as it allows for better static
checking when compiling with -DNDEBUG.
* src/chown-core.c (change_file_owner):
* src/df.c (get_field_list):
* src/expr.c (printv, null, tostring, toarith, eval2):
* src/ls.c (time_type_to_statx, calc_req_mask, get_funky_string)
(print_long_format):
* src/numfmt.c (simple_strtod_fatal):
* src/od.c (decode_one_format):
* src/stty.c (mode_type_flag):
* src/tail.c (xlseek):
* src/tr.c (is_char_class_member, get_next, get_spec_stats)
(string2_extend):
Prefer unreachable () to abort () or assert (false) when merely
pacifying the compiler, e.g., in a switch statement on an enum
where all cases are covered.
* src/copy.c (valid_options): Now returns void; the bool was useless.
Caller no longer needs to assert.
* src/csplit.c (find_line):
* src/expand-common.c (next_file):
* src/shred.c (incname):
* src/sort.c (main):
* src/tr.c (append_normal_char, append_range, append_char_class)
(append_repeated_char, append_equiv_class):
* src/tsort.c (search_item):
Omit assert, since the hardware will check for us.
* src/df.c (header_mode): Now the enum type it should have been.
* src/du.c (process_file):
* src/ls.c (assert_matching_dev_ino):
* src/tail.c (valid_file_spec):
* src/tr.c (validate_case_classes):
Mark defns with MAYBE_UNUSED if they’re not used when -DNDEBUG.
* src/factor.c (prime_p, prime2_p, mp_prime_p): Now ATTRIBUTE_PURE.
Prefer affirm to error+abort. No need to translate this diagnostic.
* src/fmt.c (get_paragraph):
* src/stty.c (display_changed, display_all, sane_mode):
* src/who.c (idle_string):
Prefer assume to assert, since the goal is merely pacification
and assert doesn’t pacify anyway if -DNDEBUG is used.
* src/join.c (decode_field_spec):
Omit unreachable abort.
* src/ls.c (assert_matching_dev_ino, main):
* src/tr.c (get_next):
Prefer assure to assert, since the check is relatively expensive
and won’t help static analysis.
* src/ls.c (main):
Prefer static_assert to assert of a constant expression.
(format_inode): Redo to make it clear that buflen doesn’t matter,
and that buf must have a certain number of bytes. All callers changed.
This pacifies -Wformat-overflow.
* src/od.c (decode_one_format):
Omit an assert that tested for obviously undefined behavior,
as the compiler could optimize it away anyway.
* src/od.c (decode_one_format, decode_format_string):
Prefer ATTRIBUTE_NONNULL to runtime checking.
* src/stat.c: Do not include <stddef.h> since system.h does that now.
* src/sync.c (sync_arg):
Prefer unreachable () to assert (true), which was a typo.
* src/system.h: Include stddef.h, for unreachable.
* src/tail.c (xlseek): Simplify by relying on ‘error’ to exit.
Note mktemp --suffix has the same inconsistency,
but mktemp -d does support creating dirs
so probably best to leave that as is.
* src/split.c (main): Check for trailing /.
* tests/split/additional-suffix.sh: Augment the test.
Reported in https://bugs.debian.org/1036827
As split is often dealing with large files,
ensure we indicate to the kernel our sequential access pattern.
This was seen to operate 5% faster when reading from SSD,
as tested with:
dd bs=1M count=2K if=/dev/urandom of=big.in
for split in split.orig split; do
# Ensure big file is not cached
dd of=big.in oflag=nocache conv=notrunc,fdatasync count=0 status=none
# Test read efficiency
CWD=$PWD; (cd /dev/shm && time $CWD/src/$split -n2 $CWD/big.in)
done
real 0m9.039s
user 0m0.055s
sys 0m3.510s
real 0m8.568s
user 0m0.056s
sys 0m3.752s
* src/split.c (main): Use fdadvise to help the kernel
choose a more appropriate readahead buffer.
* NEWS: Mention the improvement.
* bootstrap.conf (gnulib_modules): Add free-posix, tmpfile.
* src/split.c (copy_to_tmpfile): New function.
(input_file_size): Use it to split larger files when sizes cannot
easily be determined via fstat or lseek. See Bug#61386#235.
* tests/split/l-chunk.sh: Mark tests of /dev/zero as
very expensive since they exhaust /tmp.
Problem reported by Pádraig Brady (Bug#61386#226).
* src/split.c (parse_chunk): Use die instead of error.
(main): Quote a string.
* tests/local.mk (all_root_tests): Move du/apparent.sh from here ...
(all_tests): ... to here.
* src/split.c (create): Avoid fstat + ftruncate in the usual case
where the output file does not already exist, by trying
to create it with O_EXCL first. This costs a failed open
in the unusual case where the output file already exists,
but that’s OK.
Prefer signed types to uintmax_t, as this allows for better
runtime checking with gcc -fsanitize=undefined.
Also, when an integer overflows just use the maximal value
when the code will do the right thing anyway.
* src/split.c (set_suffix_length, bytes_split, lines_split)
(line_bytes_split, lines_chunk_split, bytes_chunk_extract)
(lines_rr, parse_chunk, main):
Prefer a signed type (typically intmax_t) to uintmax_t.
(strtoint_die): New function.
(OVERFLOW_OK): New macro. Use it elsewhere, where we now allow
LONGINT_OVERFLOW because the code then does the right thing on all
practical platforms (they have int wide enough so that it cannot
be practically exhausted). We can do this now that we can safely
assume intmax_t has at least 64 bits.
(parse_n_units): New function.
(parse_chunk, main): Use it.
(main): Do not worry about integer overflow when the code
will do the right thing anyway with the extreme value.
Just use the extreme value.
* tests/split/fail.sh: Adjust to match new behavior.
* src/split.c (bytes_split, lines_chunk_split)
(bytes_chunk_extract, main): Prefer ssize_t to size_t when
representing the return value of ‘read’. Use a negative value
instead of SIZE_MAX to indicate a missing value.
* src/split.c: Include sys-limits.h, not safe-read.h.
(input_file_size, bytes_split, lines_split, line_bytes_split)
(lines_chunk_split, bytes_chunk_extract, lines_rr): Call read, not
safe_read, since safe_read no longer buys us anything.
(main): Reject outlandish buffer sizes right away,
rather than allocating huge buffers and never using them.
* src/split.c (closeout): There should be no need for a special
case for ECHILD, since we never wait for the same child twice.
Simplify with this in mind.
Ignore and default SIGPIPE, rather than blocking and unblocking it.
* src/split.c (default_SIGPIPE):
New static var, replacing oldblocked and newblocked.
(create): Use it.
(main): Set it.
* src/split.c (input_file_size): Do not bother with lseek if the
initial read probe reaches EOF, since the file size is known then.
This works better on macOS, which doesn’t allow lseek on /dev/null.
Do not special-case size-zero files, as the issue can occur
with any size file (though /proc files are the most common).
If the current position is past end of file, treat this as
size zero regardless of whether the file has a usable st_size.
Pass through lseek -1 return values rather than using ‘return -1’;
this makes the code a bit easier to analyze (and a bit faster).
Avoid undefined behavior if the size calculation overflows.
(lines_chunk_split): Do not bother with lseek if it would have
no effect if successful. This works better on macOS, which
doesn’t allow lseek on /dev/null.
* tests/split/l-chunk.sh: Adjust to match fixed behavior.
* src/split.c (bytes_split): New arg REM_BYTES.
Use this to split more evenly. All callers changed.
(lines_chunk_split, bytes_chunk_extract):
Be consistent with new byte_split.
* tests/split/b-chunk.sh, tests/split/l-chunk.sh: Test new behavior.
* src/split.c (lines_chunk_split): Simplify by having chunk_end
point to the first byte after the chunk, rather than to the last
byte of the chunk. This will reduce confusion once we allow
chunks to be empty.
* src/dd.c (parse_integer): Support Q,R suffixes.
* src/od.c (main): Likewise.
* src/split.c (main): Likewise.
* src/stdbuf.c (parse_size): Likewise.
* src/truncate.c (main): Likewise.
* src/sort.c (specify_size_size): Likewise.
Also line length syntax check fix.
* tests/misc/numfmt.pl: Adust top end large number checks
to the new largest values.
* doc/coreutils.texi (numfmt invocation): Add a numfmt example.
* NEWS: Tweak to aid searchability.
Update to latest gnulib with new copyright year.
Run "make update-copyright" and then...
* tests/init.sh: Sync with gnulib to pick up copyright year.
* bootstrap: Manually update copyright year,
until we fully sync with gnulib at a later stage.
* tests/sample-test: Adjust to use the single most recent year.
Use the new Gnulib modules alignalloc and xalignalloc
to simplify some memory allocation.
Also, fix some unlikely integer overflow problems.
* bootstrap.conf (gnulib_modules): Add alignalloc, xalignalloc.
* src/cat.c, src/copy.c, src/dd.c, src/shred.c, src/split.c:
Include alignalloc.h.
* src/cat.c (main):
* src/copy.c (copy_reg):
* src/dd.c (alloc_ibuf, alloc_obuf):
* src/shred.c (dopass):
* src/split.c (main):
Use alignalloc/xalignalloc/alignfree instead of doing page
alignment by hand.
* src/cat.c (main):
Check for integer overflow in page size calculations.
* src/dd.c (INPUT_BLOCK_SLOP, OUTPUT_BLOCK_SLOP, MAX_BLOCKSIZE):
(real_ibuf, real_obuf) [lint]:
Remove; no longer needed.
(cleanup) [lint]:
(scanargs): Simplify.
* src/ioblksize.h (io_blksize): Do not allow blocksizes largest
than the largest power of two that fits in idx_t and size_t.
* src/shred.c (PAGE_ALIGN_SLOP, PATTERNBUF_SIZE): Remove.
Run "make update-copyright" and then...
* gnulib: Update to latest with copyright year adjusted.
* tests/init.sh: Sync with gnulib to pick up copyright year.
* bootstrap: Likewise.
* tests/sample-test: Adjust to use the single most recent year.
This functionality regressed with the adjustments
in commit v8.25-4-g62e7af032
* src/split.c (bytes_chunk_extract): Account for already read data
when seeking into the file.
* tests/split/b-chunk.sh: Use the hidden ---io-blksize option,
to test this functionality.
* NEWS: Mention the bug fix.
Fixes https://bugs.gnu.org/46048