mirror of git://git.sv.gnu.org/coreutils.git synced 2026-04-16 00:36:08 +02:00

265 Commits

Author SHA1 Message Date
Collin Funk
2213542e40 split: cleanup after posix_spawn
* bootstrap.conf (gnulib_modules): Add posix_spawn_file_actions_destroy.
* src/split.c (cleanup): Call posix_spawnattr_destroy and
posix_spawn_file_actions_destroy after a successful posix_spawn.
2025-10-23 12:42:01 -07:00
Collin Funk
ca8b928665 split: prefer posix_spawn to fork and execl
* NEWS: Mention the change.
* bootstrap.conf (gnulib_modules): Add posix_spawn,
posix_spawnattr_setsigdefault, posix_spawn_file_actions_addclose,
posix_spawn_file_actions_adddup2, and posix_spawn_file_actions_init.
* src/split.c: Include spawn.h.
(create): Use posix_spawn instead of fork and execl.
2025-10-23 11:24:06 -07:00
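
The two posix_spawn commits above can be illustrated with a minimal sketch. It is not the code in src/split.c: the helper name spawn_filter and its interface are hypothetical, and it only shows the general pattern of the POSIX calls named in the commit messages (file actions for dup2/close, a spawn attribute that resets SIGPIPE, and destroying both objects after the spawn).

```c
#include <assert.h>
#include <signal.h>
#include <spawn.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

extern char **environ;

/* Hypothetical helper: run "sh -c CMD" with IN_FD as its standard input.
   Return the child's exit status, or -1 on failure.  */
static int
spawn_filter (char const *cmd, int in_fd)
{
  posix_spawn_file_actions_t actions;
  posix_spawnattr_t attr;
  sigset_t defaults;
  pid_t pid;
  int status;
  int err = 0;

  assert (cmd);
  if (posix_spawn_file_actions_init (&actions) != 0)
    return -1;
  if (posix_spawnattr_init (&attr) != 0)
    {
      posix_spawn_file_actions_destroy (&actions);
      return -1;
    }

  /* Give the child the default SIGPIPE disposition.  */
  sigemptyset (&defaults);
  sigaddset (&defaults, SIGPIPE);
  posix_spawnattr_setsigdefault (&attr, &defaults);
  posix_spawnattr_setflags (&attr, POSIX_SPAWN_SETSIGDEF);

  /* In the child, make IN_FD standard input and close the original.  */
  if (in_fd != STDIN_FILENO)
    {
      err = posix_spawn_file_actions_adddup2 (&actions, in_fd, STDIN_FILENO);
      if (!err)
        err = posix_spawn_file_actions_addclose (&actions, in_fd);
    }

  char *argv[] = { (char *) "sh", (char *) "-c", (char *) cmd, NULL };
  if (!err)
    err = posix_spawn (&pid, "/bin/sh", &actions, &attr, argv, environ);

  /* Release both objects whether or not the spawn succeeded.  */
  posix_spawnattr_destroy (&attr);
  posix_spawn_file_actions_destroy (&actions);

  if (err)
    return -1;
  if (waitpid (pid, &status, 0) != pid)
    return -1;
  return WIFEXITED (status) ? WEXITSTATUS (status) : -1;
}
```

Unlike fork followed by execl, the child-side setup here is described declaratively before the process is created, so no application code runs between the fork and the exec.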
Paul Eggert
ee367bd38d maint: STREQ → streq
Use new Gnulib streq function instead of rolling our own macro.
* bootstrap.conf (gnulib_modules): Add stringeq.
* src/rm.c (main): Don’t assume streq is a macro that expands to (...),
as it is now a function.
* src/system.h:
* tests/df/no-mtab-status.sh, tests/df/skip-duplicates.sh:
(STREQ): Remove.  All uses replaced by streq.
2025-09-17 12:20:24 -07:00
Collin Funk
c9a30d6781 maint: use consistent references to standard files in messages
* cfg.mk (sc_standard_outputs): Add a grep command for source files.
* src/du.c (main): Use standard input instead of stdin, standard output
instead of stdout, and standard error instead of stderr in messages.
* src/nohup.c (main): Likewise.
* src/sort.c (main): Likewise.
* src/split.c (main): Likewise.
* src/stdbuf.c (main): Likewise.
* src/wc.c (main): Likewise.
* tests/du/files0-from.pl (@Tests): Adjust test case to new messages.
* tests/sort/sort-files0-from.pl: Likewise.
* tests/wc/wc-files0-from.pl: Likewise.
2025-08-05 18:46:04 -07:00
Pádraig Brady
69f21c5d46 doc: use consistent references to standard files
* cfg.mk (sc_standard_outputs): A new syntax check to
enforce standard references.
* doc/coreutils.texi: s/stderr/standard error/ etc.
* src/date.c: Likewise.
* src/dd.c: Likewise.
* src/env.c: Likewise.
* src/sort.c: Likewise.
* src/split.c: Likewise.
* src/stty.c: Likewise.
* src/timeout.c: Likewise.
* src/who.c: Likewise.
2025-07-29 11:43:09 +01:00
Paul Eggert
008bb4732b maint: pacify ‘gcc -Wswitch-enum’
I thought of a way to pacify -Wswitch-enum without much trouble.
Either add all the enums, or if that’s too verbose use ‘switch (+E)’
to indicate to the reader that there need not be a case for
every enum value.  Since this approach improves static checking,
make the change everywhere and check it with -Wswitch-enum.
* configure.ac: Compile with -Wswitch-enum if it works and
--enable-gcc-warnings.  No need to remove -Wswitch-default
since Gnulib no longer adds it.
* src/chmod.c (describe_change):
* src/chown-core.c (describe_change):
* src/copy.c (copy_debug_string, copy_debug_sparse_string):
* src/df.c (decode_output_arg, get_dev):
* src/du.c (main):
* src/factor.c (print_factors):
* src/head.c (diagnose_copy_fd_failure):
* src/ls.c (time_type_to_statx, calc_req_mask)
(decode_line_length, get_funky_string, parse_ls_color)
(gobble_file, print_long_format):
* src/split.c (main):
* src/sync.c (sync_arg):
* src/tr.c (is_char_class_member):
* src/wc.c (main):
Add switch cases to pacify -Wswitch-enum.
* src/copy.c (copy_debug_string, copy_debug_sparse_string):
Add unreachable () for unreachable cases.
* src/digest.c (main):
* src/od.c (decode_one_format):
* src/tr.c (get_next, get_spec_stats):
switch (E) → switch (+E).
* src/digest.c (main):
* src/tr.c (get_next):
Omit unnecessary ‘default: break;’ that merely pacified GCC,
as the new pacification style is better.
* src/ls.c (decode_line_length):
Add default unreachable case to prevent warning that function
might not return a value.
(gobble_file): Distinguish DEREF_NEVER from unreachable cases.
2025-02-02 22:09:52 -08:00
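
The two styles described in the commit message above can be sketched as follows. The enum and function names are hypothetical and do not come from coreutils. The first switch lists every enumerator; the second uses unary plus so that the controlling expression has an integer type, not an enum type, which signals that a case for every enumerator is not expected.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical enum, for illustration only.  */
enum split_mode { MODE_BYTES, MODE_LINES, MODE_CHUNKS };

/* Style 1: list every enumerator, so that -Wswitch-enum can report
   any enumerator added later without a matching case.  */
static char const *
mode_name (enum split_mode m)
{
  switch (m)
    {
    case MODE_BYTES: return "bytes";
    case MODE_LINES: return "lines";
    case MODE_CHUNKS: return "chunks";
    }
  assert (!"invalid enum split_mode value");
  return "";
}

/* Style 2: 'switch (+m)' promotes M to an integer type, telling the
   reader and the compiler that not every enumerator needs a case.  */
static bool
counts_lines (enum split_mode m)
{
  switch (+m)
    {
    case MODE_LINES: return true;
    default: return false;
    }
}
```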
Pádraig Brady
28b176085f maint: update all copyright year number ranges
Update to latest gnulib with new copyright year.
Run "make update-copyright" and then...

* gnulib: Update included in this commit as copyright years
are the only change from the previous gnulib commit.
* tests/init.sh: Sync with gnulib to pick up copyright year.
* bootstrap: Likewise.
* tests/sample-test: Adjust to use the single most recent year.
2025-01-01 09:33:08 +00:00
Paul Eggert
495a35311c maint: distinguish EOVERFLOW vs ERANGE better
Also, prepare for allowing some arguments to overflow
without that being an error.
* gl/lib/xdectoint.c: Do not include stddef.h,
since we no longer use ‘unreachable’.
(xnumtoimax, xnumtoumax, __xnumtoint):
New arg FLAGS.  All callers changed.
Stop using __xdectoint_signed.  All definers removed.
* gl/lib/xdectoint.h (XTOINT_MIN_QUIET, XTOINT_MAX_QUIET)
(XTOINT_MIN_RANGE, XTOINT_MAX_RANGE): New flag constants.
* src/fmt.c (main):
* src/fold.c (main):
* src/nl.c (main):
* src/pr.c (getoptnum):
* src/split.c (main):
Use XTOINT_MIN_RANGE and XTOINT_MAX_RANGE if appropriate.
* src/pr.c (getoptnum): Return int rather than returning void
and storing through int *.
* src/stty.c (apply_settings):
Use ckd_add to check for overflow instead of doing it by hand.
(integer_arg): Accept and return uintmax_t, not unsigned long.
2024-08-10 19:30:01 -07:00
Paul Eggert
1116367581 split: don’t trust st_size on /proc files
* src/split.c (create): Don’t trust st_size == 0.
2024-04-06 15:18:28 -07:00
Paul Eggert
c4c5ed8f4e split: do not shrink hold buffer
* src/split.c (line_bytes_split): Do not shrink hold buffer.
If it’s large for this batch it’s likely to be large for the next
batch, and for ‘split’ it’s not worth the complexity/CPU hassle to
shrink it.  Do not assume hold_size can be bufsize.
2024-01-17 12:19:14 -08:00
Pádraig Brady
a966dcdb69 maint: update all copyright year number ranges
Update to latest gnulib with new copyright year.
Run "make update-copyright" and then...

* gnulib: Update included in this commit as copyright years
are the only change from the previous gnulib commit.
* tests/init.sh: Sync with gnulib to pick up copyright year.
* bootstrap: Manually update copyright year,
until we fully sync with gnulib at a later stage.
* tests/sample-test: Adjust to use the single most recent year.
2024-01-01 13:27:23 +00:00
Paul Eggert
1f651f4b3d split: omit bad_cast
* src/split.c (infile): Now char const *, not char *.
(main): Omit unnecessary bad_cast calls.
2023-12-31 19:49:26 -08:00
Paul Eggert
23e26ed972 maint: DECIMAL_DIGIT_ACCUMULATE uses stdckdint.h
* src/system.h: Include <stdckdint.h>, since the new
DECIMAL_DIGIT_ACCUMULATE uses it.
Do not include stdckdint.h from files that also include system.h.
(DECIMAL_DIGIT_ACCUMULATE): Omit last arg, which is no longer needed.
Reimplement by using C23-style stdckdint.h’s ckd_mul and ckd_add,
as that’s more standard and is more likely to generate better code.
2023-11-14 20:38:24 -08:00
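
As a rough illustration of the ckd_mul/ckd_add approach mentioned above, the sketch below accumulates decimal digits with overflow checking. It is not the DECIMAL_DIGIT_ACCUMULATE macro from src/system.h; the function name accumulate_digits is hypothetical, and the fallback definitions based on the GCC/Clang __builtin_*_overflow builtins are an assumption made so the sketch also compiles where <stdckdint.h> is not yet available.

```c
#include <assert.h>
#include <stdbool.h>

#if defined __has_include
# if __has_include (<stdckdint.h>)
#  include <stdckdint.h>
# endif
#endif
#ifndef ckd_add
/* Assumed fallback for pre-C23 toolchains (GCC and Clang builtins).  */
# define ckd_add(r, a, b) __builtin_add_overflow (a, b, r)
# define ckd_mul(r, a, b) __builtin_mul_overflow (a, b, r)
#endif

/* Hypothetical helper: parse the decimal digits in S into *RESULT.
   Return false if S contains a non-digit or the value overflows int.  */
static bool
accumulate_digits (char const *s, int *result)
{
  int n = 0;

  assert (s && result);
  for (; '0' <= *s && *s <= '9'; s++)
    {
      /* Each ckd_* call returns true if the exact result does not fit.  */
      if (ckd_mul (&n, n, 10) || ckd_add (&n, n, *s - '0'))
        return false;
    }
  *result = n;
  return *s == '\0';
}
```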
Paul Eggert
4edb14d20f maint: include ctype.h selectively
Include ctype.h only in files that need it.  Many of its uses
are incorrect, as they assume single-byte locales.  The idea is
to remove the incorrect uses later, when there is time.
* src/chroot.c, src/csplit.c, src/dd.c, src/digest.c, src/dircolors.c:
* src/expand-common.c, src/expand.c, src/fmt.c, src/fold.c, src/ls.c:
* src/od.c, src/pinky.c, src/pr.c, src/ptx.c, src/seq.c:
* src/set-fields.c, src/split.c, src/stdbuf.c, src/test.c:
* src/tr.c, src/truncate.c, src/unexpand.c, src/wc.c:
Include ctype.h.
* src/system.h: Do not include ctype.h.
2023-10-30 00:58:04 -07:00
Paul Eggert
68f4c238ca maint: prefer psame_inode, PSAME_INODE, STP_*
Prefer psame_inode, PSAME_INODE, STP_NBLOCKS, and STP_BLKSIZE,
which take addresses of objects, to their counterparts that
take the whole objects.  In some cases the whole objects might
not be initialized, which would be undefined behavior strictly
speaking.
* gl/lib/root-dev-ino.h (ROOT_DEV_INO_CHECK):
* src/cp-hash.c (src_to_dest_compare):
* src/ls.c (dev_ino_compare):
* src/pwd.c (robust_getcwd):
Prefer PSAME_INODE to SAME_INODE.
* src/chown-core.c (restricted_chown):
* src/copy.c (copy_reg, same_file_ok, source_is_dst_backup)
(copy_internal):
* src/ln.c (do_link):
* src/pwd.c (logical_getcwd):
* src/sort.c (avoid_trashing_input):
* src/split.c (create):
* src/stat.c (find_bind_mount):
Prefer psame_inode to SAME_INODE.
* src/copy.c (infer_scantype):
* src/du.c (process_file):
* src/ls.c (gobble_file, print_long_format)
(print_file_name_and_frills, length_of_file_name_and_frills):
* src/stat.c (print_stat):
Prefer STP_NBLOCKS to ST_NBLOCKS.
* src/copy.c (copy_reg):
* src/head.c (elide_tail_bytes_file, elide_tail_lines_file):
* src/ioblksize.h (io_blksize):
* src/od.c (skip):
* src/shred.c (do_wipefd):
* src/stat.c (print_stat):
* src/tail.c (tail_bytes):
* src/truncate.c (do_ftruncate):
* src/wc.c (wc):
Prefer STP_BLKSIZE to ST_BLKSIZE.
* src/ioblksize.h (io_blksize):
Arg is now struct stat const *, not struct stat.  All callers changed.
2023-09-04 23:12:02 -07:00
Paul Eggert
2dddc87214 maint: spelling fixes, including author names
Most of this just affects commentary and documentation.  The only
significant behavior change is translating author names via
proper_name_lite rather than proper_name_utf8, or not translating
them at all.  proper_name_lite is good enough for coreutils and
avoids the bloat that had coreutils not using Gnulib proper_name.
* bootstrap.conf (gnulib_modules): Use propername-lite instead
of propername.
(XGETTEXT_OPTIONS): Look for proper_name_lite instead of for
proper_name_utf8.
* cfg.mk (local-checks-to-skip): Remove
sc_proper_name_utf8_requires_ICONV, since we no longer use
proper_name_utf8.
(old_NEWS_hash): Update.
(sc_check-I18N-AUTHORS): Remove; no longer needed.
2023-08-28 14:06:43 -07:00
Paul Eggert
9970fac34b maint: include idx.h everywhere
* src/system.h: Include idx.h here, instead of in every file
that currently uses idx_t.  This should make it easier to use
idx_t in the future.
2023-07-31 17:51:28 -07:00
Pádraig Brady
464be62df6 split: honor $TMPDIR for temp files
* bootstrap.conf: Depend on tmpdir rather than tmpfile,
as the standard tmpfile() doesn't honor $TMPDIR.
* src/split.c (copy_to_tmpfile): Adjust to call temp_stream() rather
than tmpfile().
* NEWS: Mention the improvement.
2023-07-18 23:11:24 +01:00
Pádraig Brady
0b2ff7637f all: avoid repeated diagnostic upon write error
* cfg.mk (sc_some_programs_must_avoid_exit_failure): Adjust to
avoid false positive.
(sc_prohibit_exit_write_error): A new syntax check to prohibit
open coding error(..., "write error"); instead directing to use...
* src/system.h (write_error): ... a new function to clear stdout errors
before we explicitly diagnose a write error and exit.
* src/basenc.c: Use write_error() to ensure no repeated diagnostics.
* src/cat.c: Likewise.
* src/expand.c: Likewise.
* src/factor.c: Likewise.
* src/paste.c: Likewise.
* src/seq.c: Likewise.
* src/shuf.c: Likewise.
* src/split.c: Likewise.
* src/tail.c: Likewise.
* src/tr.c: Likewise.
* src/unexpand.c: Likewise.
* tests/misc/write-errors.sh: Remove TODOs for the fixed utilities:
expand, factor, paste, shuf, tr, unexpand.
2023-07-17 11:28:36 +01:00
Paul Eggert
d727aba601 maint: prefer ckd_add to INT_ADD_WRAPV etc
* bootstrap.conf (gnulib_modules): Add stdckdint.
Also, in C source code, prefer C23 macros like ckd_add
to their Gnulib near-equivalents like INT_ADD_WRAPV.
Include <stdckdint.h> as needed.
2023-07-01 11:51:16 -07:00
Paul Eggert
6d61667d0d maint: go back to using ‘error’
Now that Gnulib’s ‘error’ module does proper static checking
for not returning, we need no longer use the ‘die’ macro.
This makes code easier to read for people that are used to ‘error’.
* cfg.mk (error_fns, exclude_file_name_regexp): Remove ‘die’.
(sc_die_EXIT_FAILURE): Remove.
* src/die.h: Remove.  All includes removed.  All calls to ‘die’
changed back to calls to ‘error’.
* src/install.c (get_ids): Use quoteaf (problem found with
make syntax-check).
* src/system.h: Include error.h, since some of our macros call ‘error’.
Stop including error.h elsewhere.
2023-07-01 11:51:16 -07:00
Paul Eggert
478055dc30 maint: improve static and dynamic checking
This modernizes the source code somewhat, to take advantage
of advances in GCC over the years, and Gnulib’s ‘assure’ module.
Include assure.h in files that now need it.
Do not include assert.h directly; it’s no longer needed.
* bootstrap.conf (gnulib_modules): Add ‘assure’.
* gl/lib/randread.c (randread_error):
* src/chmod.c (describe_change):
* src/chown-core.c (describe_change):
* src/cp.c (decode_preserve_arg):
* src/head.c (diagnose_copy_fd_failure):
* src/ls.c (parse_ls_color):
* src/od.c (decode_one_format):
* src/split.c (main):
* src/test.c (binary_operator, posixtest):
Prefer affirm to abort, since it has better diagnostics in the
normal case and better performance with -DNDEBUG.
* gl/lib/xdectoint.c, src/die.h: Include stddef.h, for unreachable.
* gl/lib/xdectoint.c: Do not include verify.h; no longer needed.
* gl/lib/xdectoint.c (__xnumtoint):
* src/die.h (die):
Prefer C23 unreachable () to assume (false).
* gl/lib/xfts.c (xfts_open):
* src/basenc.c (base32hex_encode):
* src/copy.c (abandon_move, copy_internal, valid_options):
* src/cut.c (cut_fields):
* src/df.c (alloc_field, decode_output_arg, get_dev):
* src/du.c (process_file, main):
* src/echo.c (usage):
* src/factor.c (udiv_qrnnd, mod2, gcd2_odd, factor_insert_large)
(mulredc2, factor_using_pollard_rho, isqrt2, div_smallq)
(factor_using_squfof):
* src/iopoll.c (iopoll_internal, fwrite_wait):
* src/join.c (add_field):
* src/ls.c (dev_ino_pop, main, gobble_file, sort_files):
* src/mv.c (do_move):
* src/od.c (decode_format_string, read_block, dump, main):
* src/remove.c (rm):
* src/rm.c (main):
* src/sort.c (stream_open):
* src/split.c (next_file_name, lines_chunk_split):
* src/stdbuf.c (main):
* src/stty.c (set_speed):
* src/tac-pipe.c (line_ptr_decrement, line_ptr_increment):
* src/touch.c (touch):
* src/tr.c (find_bracketed_repeat, get_next)
(validate_case_classes, get_spec_stats, string2_extend, main):
* src/tsort.c (search_item, tsort):
* src/wc.c (main):
Prefer affirm to assert, as it allows for better static
checking when compiling with -DNDEBUG.
* src/chown-core.c (change_file_owner):
* src/df.c (get_field_list):
* src/expr.c (printv, null, tostring, toarith, eval2):
* src/ls.c (time_type_to_statx, calc_req_mask, get_funky_string)
(print_long_format):
* src/numfmt.c (simple_strtod_fatal):
* src/od.c (decode_one_format):
* src/stty.c (mode_type_flag):
* src/tail.c (xlseek):
* src/tr.c (is_char_class_member, get_next, get_spec_stats)
(string2_extend):
Prefer unreachable () to abort () or assert (false) when merely
pacifying the compiler, e.g., in a switch statement on an enum
where all cases are covered.
* src/copy.c (valid_options): Now returns void; the bool was useless.
Caller no longer needs to assert.
* src/csplit.c (find_line):
* src/expand-common.c (next_file):
* src/shred.c (incname):
* src/sort.c (main):
* src/tr.c (append_normal_char, append_range, append_char_class)
(append_repeated_char, append_equiv_class):
* src/tsort.c (search_item):
Omit assert, since the hardware will check for us.
* src/df.c (header_mode): Now the enum type it should have been.
* src/du.c (process_file):
* src/ls.c (assert_matching_dev_ino):
* src/tail.c (valid_file_spec):
* src/tr.c (validate_case_classes):
Mark defns with MAYBE_UNUSED if they’re not used when -DNDEBUG.
* src/factor.c (prime_p, prime2_p, mp_prime_p): Now ATTRIBUTE_PURE.
Prefer affirm to error+abort.  No need to translate this diagnostic.
* src/fmt.c (get_paragraph):
* src/stty.c (display_changed, display_all, sane_mode):
* src/who.c (idle_string):
Prefer assume to assert, since the goal is merely pacification
and assert doesn’t pacify anyway if -DNDEBUG is used.
* src/join.c (decode_field_spec):
Omit unreachable abort.
* src/ls.c (assert_matching_dev_ino, main):
* src/tr.c (get_next):
Prefer assure to assert, since the check is relatively expensive
and won’t help static analysis.
* src/ls.c (main):
Prefer static_assert to assert of a constant expression.
(format_inode): Redo to make it clear that buflen doesn’t matter,
and that buf must have a certain number of bytes.  All callers changed.
This pacifies -Wformat-overflow.
* src/od.c (decode_one_format):
Omit an assert that tested for obviously undefined behavior,
as the compiler could optimize it away anyway.
* src/od.c (decode_one_format, decode_format_string):
Prefer ATTRIBUTE_NONNULL to runtime checking.
* src/stat.c: Do not include <stddef.h> since system.h does that now.
* src/sync.c (sync_arg):
Prefer unreachable () to assert (true), which was a typo.
* src/system.h: Include stddef.h, for unreachable.
* src/tail.c (xlseek): Simplify by relying on ‘error’ to exit.
2023-07-01 11:51:15 -07:00
Paul Eggert
16b5ca6e0d maint: prefer C23-style nullptr
* bootstrap.conf (gnulib_modules): Add nullptr.
In code, prefer nullptr to NULL where either will do.
2023-06-29 15:29:29 -07:00
Pádraig Brady
0147288d20 split: --additional-suffix: disallow trailing '/'
Note mktemp --suffix has the same inconsistency,
but mktemp -d does support creating dirs
so probably best to leave that as is.

* src/split.c (main): Check for trailing /.
* tests/split/additional-suffix.sh: Augment the test.
Reported in https://bugs.debian.org/1036827
2023-05-31 17:26:13 +01:00
Pádraig Brady
059e53e5b4 split: advise the kernel of sequential access pattern
As split is often dealing with large files,
ensure we indicate to the kernel our sequential access pattern.
This was seen to operate 5% faster when reading from SSD,
as tested with:

dd bs=1M count=2K if=/dev/urandom of=big.in

for split in split.orig split; do
  # Ensure big file is not cached
  dd of=big.in oflag=nocache conv=notrunc,fdatasync count=0 status=none
  # Test read efficiency
  CWD=$PWD; (cd /dev/shm && time $CWD/src/$split -n2 $CWD/big.in)
done

real    0m9.039s
user    0m0.055s
sys     0m3.510s

real    0m8.568s
user    0m0.056s
sys     0m3.752s

* src/split.c (main): Use fdadvise to help the kernel
choose a more appropriate readahead buffer.
* NEWS: Mention the improvement.
2023-05-08 21:34:58 +01:00
Paul Eggert
bb9dbcbbfd split: support split -n on larger pipe input
* bootstrap.conf (gnulib_modules): Add free-posix, tmpfile.
* src/split.c (copy_to_tmpfile): New function.
(input_file_size): Use it to split larger files when sizes cannot
easily be determined via fstat or lseek.  See Bug#61386#235.
* tests/split/l-chunk.sh: Mark tests of /dev/zero as
very expensive since they exhaust /tmp.
2023-03-07 13:41:46 -08:00
Paul Eggert
a4778006c8 maint: pacify ‘make syntax-check’
Problem reported by Pádraig Brady (Bug#61386#226).
* src/split.c (parse_chunk): Use die instead of error.
(main): Quote a string.
* tests/local.mk (all_root_tests): Move du/apparent.sh from here ...
(all_tests): ... to here.
2023-03-06 15:39:07 -08:00
Paul Eggert
8022874d12 split: tune for when creating output files
* src/split.c (create): Avoid fstat + ftruncate in the usual case
where the output file does not already exist, by trying
to create it with O_EXCL first.  This costs a failed open
in the unusual case where the output file already exists,
but that’s OK.
2023-03-04 14:49:46 -08:00
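
The O_EXCL-first idea in the commit message above might look roughly like this. It is a simplified sketch, not the create function from src/split.c: the name create_output is hypothetical, and the real code's checks on the existing file (such as whether it is the same file as the input) are omitted.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: open NAME for writing as an empty file.
   Return a file descriptor, or -1 with errno set.  */
static int
create_output (char const *name)
{
  assert (name);

  /* Usual case: the file does not exist yet, so one open suffices
     and no fstat or ftruncate is needed.  */
  int fd = open (name, O_WRONLY | O_CREAT | O_EXCL, 0666);
  if (0 <= fd || errno != EEXIST)
    return fd;

  /* Unusual case: the file already exists.  This costs the failed
     open above, plus a second open and an explicit truncation.  */
  fd = open (name, O_WRONLY | O_CREAT, 0666);
  if (fd < 0)
    return fd;
  if (ftruncate (fd, 0) != 0)
    {
      int saved_errno = errno;
      close (fd);
      errno = saved_errno;
      return -1;
    }
  return fd;
}
```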
Paul Eggert
788654dd82 split: style fix
* src/split.c (ofile_open): Avoid ‘if (! (a = b))’ style.
2023-03-04 14:49:46 -08:00
Paul Eggert
40bf1591bb split: prefer signed integers to size_t
This allows for better runtime checking with gcc
-fsanitize=undefined.
* src/split.c: Include idx.h.
(open_pipes_alloc, n_open_pipes, suffix_length)
(set_suffix_length, input_file_size, sufindex, outbase_length)
(outfile_length, addsuf_length, create, cwrite, bytes_split)
(lines_split, line_bytes_split, lines_chunk_split)
(bytes_chunk_extract, ofile_open, lines_rr, main):
Prefer signed integers (typically idx_t) to size_t.
2023-03-04 14:49:46 -08:00
Paul Eggert
3434cdcec1 split: handle large numbers better
Prefer signed types to uintmax_t, as this allows for better
runtime checking with gcc -fsanitize=undefined.
Also, when an integer overflows just use the maximal value
when the code will do the right thing anyway.
* src/split.c (set_suffix_length, bytes_split, lines_split)
(line_bytes_split, lines_chunk_split, bytes_chunk_extract)
(lines_rr, parse_chunk, main):
Prefer a signed type (typically intmax_t) to uintmax_t.
(strtoint_die): New function.
(OVERFLOW_OK): New macro.  Use it elsewhere, where we now allow
LONGINT_OVERFLOW because the code then does the right thing on all
practical platforms (they have int wide enough so that it cannot
be practically exhausted).  We can do this now that we can safely
assume intmax_t has at least 64 bits.
(parse_n_units): New function.
(parse_chunk, main): Use it.
(main): Do not worry about integer overflow when the code
will do the right thing anyway with the extreme value.
Just use the extreme value.
* tests/split/fail.sh: Adjust to match new behavior.
2023-03-04 14:49:46 -08:00
Paul Eggert
1ebee5b1a3 split: prefer ssize_t for read result
* src/split.c (bytes_split, lines_chunk_split)
(bytes_chunk_extract, main): Prefer ssize_t to size_t when
representing the return value of ‘read’.  Use a negative value
instead of SIZE_MAX to indicate a missing value.
2023-03-04 14:49:46 -08:00
Paul Eggert
e19a59141b split: be more careful about buffer sizes
* src/split.c: Include sys-limits.h, not safe-read.h.
(input_file_size, bytes_split, lines_split, line_bytes_split)
(lines_chunk_split, bytes_chunk_extract, lines_rr): Call read, not
safe_read, since safe_read no longer buys us anything.
(main): Reject outlandish buffer sizes right away,
rather than allocating huge buffers and never using them.
2023-03-04 14:49:46 -08:00
Paul Eggert
0450987853 split: minor -1 / 0 refactor
* src/split.c (create, bytes_split, ofile_open):
Prefer comparing to 0 to comparing to -1.
2023-03-04 14:49:46 -08:00
Paul Eggert
a110ce4ce3 split: don’t worry about ECHILD
* src/split.c (closeout): There should be no need for a special
case for ECHILD, since we never wait for the same child twice.
Simplify with this in mind.
2023-03-04 14:49:46 -08:00
Paul Eggert
41615f0f8f split: don’t assume pid_t fits in int
* src/split.c (filter_pid): Now pid_t, not int.
(of_t): opid member is now pid_t, not int.
2023-03-04 14:49:45 -08:00
Paul Eggert
99fcde22ce split: simplify SIGPIPE handling
Ignore and default SIGPIPE, rather than blocking and unblocking it.
* src/split.c (default_SIGPIPE):
New static var, replacing oldblocked and newblocked.
(create): Use it.
(main): Set it.
2023-03-04 14:49:45 -08:00
Paul Eggert
aa266f1b3d split: port ‘split -n N /dev/null’ better to macOS
* src/split.c (input_file_size): Do not bother with lseek if the
initial read probe reaches EOF, since the file size is known then.
This works better on macOS, which doesn’t allow lseek on /dev/null.
Do not special-case size-zero files, as the issue can occur
with any size file (though /proc files are the most common).
If the current position is past end of file, treat this as
size zero regardless of whether the file has a usable st_size.
Pass through lseek -1 return values rather than using ‘return -1’;
this makes the code a bit easier to analyze (and a bit faster).
Avoid undefined behavior if the size calculation overflows.
(lines_chunk_split): Do not bother with lseek if it would have
no effect if successful.  This works better on macOS, which
doesn’t allow lseek on /dev/null.
* tests/split/l-chunk.sh: Adjust to match fixed behavior.
2023-03-04 14:49:45 -08:00
Paul Eggert
fb6fc7f3ce split: split more evenly with -n
* src/split.c (bytes_split): New arg REM_BYTES.
Use this to split more evenly.  All callers changed.
(lines_chunk_split, bytes_chunk_extract):
Be consistent with new bytes_split.
* tests/split/b-chunk.sh, tests/split/l-chunk.sh: Test new behavior.
2023-03-04 14:49:45 -08:00
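
One plausible way to split evenly with a remainder is sketched below. The commit message only says that a REM_BYTES argument is used to split more evenly; the assumption here, not confirmed by the log, is that each of the first REM_BYTES chunks takes one extra byte. The function name chunk_size is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: size of chunk K (counting from 0) when
   FILE_SIZE bytes are divided into N_CHUNKS chunks.  Assumes the
   first FILE_SIZE % N_CHUNKS chunks each get one extra byte.  */
static intmax_t
chunk_size (intmax_t file_size, intmax_t n_chunks, intmax_t k)
{
  assert (0 <= file_size && 0 < n_chunks && 0 <= k && k < n_chunks);

  intmax_t base = file_size / n_chunks;
  intmax_t rem_bytes = file_size % n_chunks;
  return base + (k < rem_bytes);
}
```

Under this assumption, 10 bytes split three ways yields chunks of 4, 3, and 3 bytes, instead of 3, 3, and 4 when the whole remainder goes to the last chunk.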
Paul Eggert
0d997e18b9 split: small -n lines simplification
* src/split.c (lines_chunk_split):
Rewrite while as if-while for clarity.
2023-03-04 14:49:45 -08:00
Paul Eggert
f749449e5c split: refactor lines_chunk_split
* src/split.c (lines_chunk_split): Simplify by having chunk_end
point to the first byte after the chunk, rather than to the last
byte of the chunk.  This will reduce confusion once we allow
chunks to be empty.
2023-03-04 14:49:45 -08:00
Pádraig Brady
f4567ed953 all: further adjustments for new Ronna, Quetta SI prefixes
* src/dd.c (parse_integer): Support Q,R suffixes.
* src/od.c (main): Likewise.
* src/split.c (main): Likewise.
* src/stdbuf.c (parse_size): Likewise.
* src/truncate.c (main): Likewise.
* src/sort.c (specify_sort_size): Likewise.
Also line length syntax check fix.
* tests/misc/numfmt.pl: Adjust top end large number checks
to the new largest values.
* doc/coreutils.texi (numfmt invocation): Add a numfmt example.
* NEWS: Tweak to aid searchability.
2023-01-06 14:26:40 +00:00
Pádraig Brady
01755d36e7 maint: update all copyright year number ranges
Update to latest gnulib with new copyright year.
Run "make update-copyright" and then...

* tests/init.sh: Sync with gnulib to pick up copyright year.
* bootstrap: Manually update copyright year,
until we fully sync with gnulib at a later stage.
* tests/sample-test: Adjust to use the single most recent year.
2023-01-01 14:50:15 +00:00
Paul Eggert
06b45ef985 split: pacify -fsanitize=leak
* src/split.c (lines_rr): New arg FILESP.  All uses changed.
(main): Use main_exit, not return.  Omit unnecessary alignfree.
2022-01-31 12:07:39 -08:00
Paul Eggert
b973d2d44a maint: simplify memory alignment
Use the new Gnulib modules alignalloc and xalignalloc
to simplify some memory allocation.
Also, fix some unlikely integer overflow problems.
* bootstrap.conf (gnulib_modules): Add alignalloc, xalignalloc.
* src/cat.c, src/copy.c, src/dd.c, src/shred.c, src/split.c:
Include alignalloc.h.
* src/cat.c (main):
* src/copy.c (copy_reg):
* src/dd.c (alloc_ibuf, alloc_obuf):
* src/shred.c (dopass):
* src/split.c (main):
Use alignalloc/xalignalloc/alignfree instead of doing page
alignment by hand.
* src/cat.c (main):
Check for integer overflow in page size calculations.
* src/dd.c (INPUT_BLOCK_SLOP, OUTPUT_BLOCK_SLOP, MAX_BLOCKSIZE):
(real_ibuf, real_obuf) [lint]:
Remove; no longer needed.
(cleanup) [lint]:
(scanargs): Simplify.
* src/ioblksize.h (io_blksize): Do not allow blocksizes larger
than the largest power of two that fits in idx_t and size_t.
* src/shred.c (PAGE_ALIGN_SLOP, PATTERNBUF_SIZE): Remove.
2022-01-27 13:04:14 -08:00
Pádraig Brady
3067a9293a maint: update all copyright year number ranges
Run "make update-copyright" and then...

* gnulib: Update to latest with copyright year adjusted.
* tests/init.sh: Sync with gnulib to pick up copyright year.
* bootstrap: Likewise.
* tests/sample-test: Adjust to use the single most recent year.
2022-01-02 16:15:55 +00:00
Paul Eggert
2715aba08a maint: prefer rawmemchr to memchr when easy
* bootstrap.conf (gnulib_modules): Add rawmemchr.
* src/csplit.c: Include idx.h.
* src/csplit.c (record_line_starts):
* src/head.c (elide_tail_lines_pipe):
* src/shuf.c (next_line):
* src/split.c (lines_split):
* src/tail.c (pipe_lines):
* src/wc.c (wc_lines):
Prefer rawmemchr to memchr when rawmemchr is easy.
* src/csplit.c (load_buffer):
* src/head.c (struct linebuffer):
Make room for a 1-byte sentinel.
2021-09-15 15:08:28 -07:00
Paul Eggert
f8dc5a6215 split: avoid NULL + 1
* src/split.c (lines_chunk_split): Don’t add to a null pointer.
It’s undefined behavior, and it’s unnecessarily confusing
regardless.
2021-09-15 15:08:28 -07:00
Pádraig Brady
ef772bf97f maint: use "char const *" rather than "const char *"
* cfg.mk (sc_prohibit-const-char): Add a new syntax-check to
enforce this style.
* *.[ch]: sed -i 's/const char \*/char const */g'
2021-04-11 18:33:45 +01:00
Pádraig Brady
bb21daa125 split: fix --number=K/N to output correct part of file
This functionality regressed with the adjustments
in commit v8.25-4-g62e7af032

* src/split.c (bytes_chunk_extract): Account for already read data
when seeking into the file.
* tests/split/b-chunk.sh: Use the hidden ---io-blksize option,
to test this functionality.
* NEWS: Mention the bug fix.
Fixes https://bugs.gnu.org/46048
2021-01-25 21:39:09 +00:00