Run "make update-copyright" and then...
* gnulib: Update to latest with copyright year adjusted.
* tests/init.sh: Sync with gnulib to pick up copyright year.
* bootstrap: Likewise.
* tests/sample-test: Adjust to use the single most recent year.
This functionality regressed with the adjustments
in commit v8.25-4-g62e7af032.
* src/split.c (bytes_chunk_extract): Account for already read data
when seeking into the file.
* tests/split/b-chunk.sh: Use the hidden ---io-blksize option,
to test this functionality.
* NEWS: Mention the bug fix.
Fixes https://bugs.gnu.org/46048
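
The seek adjustment described above can be illustrated with a small model. This is a hedged Python sketch, not the C code in src/split.c: the function name chunk_extract and its parameters are hypothetical, and it assumes the stream position equals the number of bytes already held in the initial buffer.

```python
import io


def chunk_extract(stream, buf, start, end):
    """Return bytes [start, end) of the input.

    Hypothetical model: `buf` holds the bytes already read from
    `stream`, so the stream position is len(buf), not 0.
    """
    initial_read = len(buf)
    out = bytearray()
    if start < initial_read:
        # The chunk begins inside the data we already have.
        out += buf[start:min(end, initial_read)]
    else:
        # Seek relative to the current position, skipping only the
        # bytes that have not been consumed yet.
        stream.seek(start - initial_read, io.SEEK_CUR)
    remaining = (end - start) - len(out)
    if remaining > 0:
        out += stream.read(remaining)
    return bytes(out)
```

Seeking by `start` rather than `start - initial_read` from the current position would overshoot by the amount already read, which is the kind of error a small I/O block size makes visible.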
Run "make update-copyright" and then...
* gnulib: Update to latest with copyright year adjusted.
* tests/init.sh: Sync with gnulib to pick up copyright year.
* bootstrap: Likewise.
* tests/sample-test: Adjust to use the single most recent year.
Run "make update-copyright" and then...
* gnulib: Update to latest with copyright year adjusted.
* tests/init.sh: Sync with gnulib to pick up copyright year.
* bootstrap: Likewise.
* tests/sample-test: Adjust to use the single most recent year.
* src/split.c (set_suffix_length): Use a more standard
zero based logN calculation for the number of units.
* tests/split/suffix-auto-length.sh: Add a test case.
* THANKS.in: Mention the reporter.
* NEWS: Mention the fix.
Fixes https://bugs.gnu.org/35291
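
As a rough illustration of a zero based logN calculation, the following Python sketch counts the digits needed to write the index of the last unit (n_units - 1) in the base given by the alphabet size. The function name and signature are hypothetical, and the sketch ignores any minimum suffix length that split applies separately.

```python
def suffix_length_needed(n_units, alphabet_len):
    """Digits needed to number units 0 .. n_units-1 in the given base.

    Hypothetical helper; a model of the idea, not the code in
    set_suffix_length.
    """
    if n_units < 1 or alphabet_len < 2:
        raise ValueError("need at least one unit and a base of 2 or more")
    last = n_units - 1          # zero based index of the final unit
    length = 1
    while last >= alphabet_len:
        last //= alphabet_len
        length += 1
    return length
```

Because the count is based on the last index rather than the number of units, exact powers of the base (10 units in base 10, 26 units in base 26) do not get an extra digit.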
Run "make update-copyright" and then...
* gnulib: Update to latest with copyright year adjusted.
* tests/init.sh: Sync with gnulib to pick up copyright year.
* bootstrap: Likewise.
* tests/sample-test: Adjust to use the single most recent year.
Problem reported for split by Scott Worley (Bug#33761):
* src/shred.c (do_wipefd):
Also report an error if ftruncate fails on a shared memory object.
* src/sort.c (get_outstatus): New function.
(stream_open, avoid_trashing_input): Use it.
* src/sort.c (stream_open):
* src/split.c (create):
If ftruncate fails, do not report an error
unless it is a regular file or a shared memory object.
Run "make update-copyright" and then...
* gnulib: Update to latest with copyright year adjusted.
* tests/init.sh: Sync with gnulib to pick up copyright year.
* bootstrap: Likewise.
* tests/sample-test: Adjust to use the single most recent year.
* doc/coreutils.texi (split invocation): Document the new option.
* src/split.c (usage): Likewise.
(main): Process the new option much like --numeric-suffixes,
but with an adjusted alphabet.
* tests/split/numeric.sh: Refactor to support --hex mode.
* NEWS: Mention the new feature.
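
To show what "an adjusted alphabet" means in practice, here is a hedged Python sketch that renders a file index as a fixed-width suffix over a chosen alphabet. The names make_suffix, ALPHA, DECIMAL and HEX are hypothetical, and the sketch does not model any automatic widening of suffixes.

```python
ALPHA = "abcdefghijklmnopqrstuvwxyz"
DECIMAL = "0123456789"
HEX = "0123456789abcdef"   # the only change needed for a hex mode


def make_suffix(index, length, alphabet):
    """Render `index` as `length` characters drawn from `alphabet`."""
    base = len(alphabet)
    if index < 0 or index >= base ** length:
        raise OverflowError("output file suffixes exhausted")
    digits = []
    for _ in range(length):
        index, rem = divmod(index, base)
        digits.append(alphabet[rem])
    # Digits were produced least significant first.
    return "".join(reversed(digits))
```

For example, make_suffix(10, 2, DECIMAL) gives "10" while make_suffix(10, 2, HEX) gives "0a"; the numbering logic is shared and only the alphabet differs.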
* src/split.c (bytes_split): Don't write to an existing filter
if it has exited. When filters exit early, skip input data if
possible. Refactor out 2 redundant variables.
* tests/split/filter.sh: Improve test coverage given the
new more efficient processing. Also use a 10TB file to
expand the file systems tested on.
commit v8.25-4-g62e7af0 introduced the issue as it
broke out of the processing loop irrespective of
the value of new_file_flag which was used to indicate
a finite number of filters or not.
For example, this ran forever (as it should):
$ yes | split --filter="head -c1 >/dev/null" -b 1000
However this exited immediately due to EPIPE being propagated
back through cwrite and the loop not considering new filters:
$ yes | split --filter="head -c1 >/dev/null" -b 100000
Similarly processing would exit early for a bounded number of
output files, resulting in empty data sent to all but the first:
$ truncate -s10T big.in
$ split --filter='head -c1 >$FILE' -n 2 big.in
$ echo $(stat -c%s x??)
1 0
I was alerted to this code by clang-analyzer,
which indicated dead assignments, which is often
an indication of code that hasn't considered all cases.
* src/split.c (bytes_split): Change the last condition in
the processing loop to also consider the number of files
before breaking out of the processing loop.
* tests/split/filter.sh: Add a test case.
* NEWS: Mention the bug fix.
* bootstrap.conf, src/base64.c, src/cat.c, src/cksum.c:
* src/head.c, src/md5sum.c, src/od.c, src/split.c, src/sum.c:
* src/tac.c, src/tail.c, src/tee.c, src/tr.c, src/wc.c:
Adjust to renaming of the xsetmode module to xbinary-io,
and of the xsetmode function to xset_binary_mode.
This fixes a bug noted by Eric Blake. Code was using xfreopen to
change files to binary mode, but this fails for stdout when in
append mode. Such code should use xsetmode instead. This affects
only the port on platforms like MS-Windows which distinguish text
from binary I/O.
* bootstrap.conf (gnulib_modules):
Remove xfreopen and add xsetmode. Sort.
* src/base64.c (main):
* src/cat.c (main):
* src/cksum.c (cksum):
* src/head.c (head_file, main):
* src/md5sum.c (digest_file):
* src/od.c (open_next_file):
* src/split.c (main):
* src/sum.c (bsd_sum_file, sysv_sum_file):
* src/tac.c (tac_file, main):
* src/tail.c (tail_file):
* src/tee.c (tee_files):
* src/tr.c (main):
* src/wc.c (wc_file): Use xsetmode, not xfreopen.
Run "make update-copyright" and then...
* gnulib: Update to latest with copyright year adjusted.
* tests/init.sh: Sync with gnulib to pick up copyright year.
* bootstrap: Likewise.
* tests/sample-test: Adjust to use the single most recent year.
ASAN reported this error for: split -n2/3 /dev/null
ERROR: AddressSanitizer: negative-size-param: (size=-1)
#0 0x7f0d4c36951d in __asan_memmove (/lib64/libasan.so.2+0x8d51d)
#1 0x404e06 in memmove /usr/include/bits/string3.h:59
#2 0x404e06 in bytes_chunk_extract src/split.c:988
#3 0x404e06 in main src/split.c:1626
Specifically there would be invalid memory access
and subsequent processing if the chunk to be extracted
was beyond the initial amount read from file (which is
currently capped at 128KiB). This issue is not in a
released version, having been introduced only in commit v8.25-4-g62e7af0.
* src/split.c (bytes_chunk_extract): The initial_read != SIZE_MAX
should have been combined with && rather than ||, but also this
condition is always true in this function, so remove it entirely.
* tests/split/b-chunk.sh: Add a test case.
Fixes http://bugs.gnu.org/25003
This reduces a standard coreutils install size by about 160K.
* src/cat.c: Change to proper_name() which removes about 18K text.
* src/cp.c: Likewise.
* src/df.c: Likewise.
* src/du.c: Likewise.
* src/getlimits.c: Likewise.
* src/realpath.c: Likewise.
* src/split.c: Likewise.
* src/stdbuf.c: Likewise.
* src/timeout.c: Likewise.
* src/truncate.c: Likewise.
* src/local.mk: Remove -llibiconv from the above programs.
* cfg.mk (sc_check-AUTHORS): Adjust to use factor(1).
* AUTHORS: Adjust to use ASCII to satisfy sc_check-AUTHORS.
die() has the advantage of being apparent to the compiler
that it doesn't return, which will avoid warnings in some cases,
and possibly generate better code.
* cfg.mk (sc_die_EXIT_FAILURE): A new syntax check rule to
catch any new uses of error (CONSTANT, ...);
* src/split.c (lines_rr): Reinstate the conditional
setting of the WROTE boolean, as otherwise split -n r/1 would
consume all input when all --filter commands are stopped.
There was a test in place to check for this, but
it was incorrect as detailed below.
(input_file_size): Immediately disallow --number with
non seekable inputs, as such an invocation is not currently
generally supported and will fail as the data overflows
the internal buffer.
* tests/split/l-chunk.sh: Adjust to again disallow -n /dev/zero.
Also change all '&& fail=1' checks to use the 'returns_ 1' form.
* tests/split/filter.sh: Change the no longer supported /dev/zero
case to a regular $OFF_T_MAX file (supported on XFS for example).
Also fix the timeout(1) commands so they're not subject to
pipefail issues.
Problem reported by Nelson H.F. Beebe in: http://bugs.gnu.org/22624
Other problems also fixed: basically, the code got confused because
GNU/Linux reports that /dev/zero has size zero.
* src/split.c (input_file_size): Now takes struct stat *, not just
size. Always store the first buffer. All callers changed. Treat
/dev/zero as an infinitely-large file, both on GNU/Linux where
fstat and lseek say its size is zero, and on GNU/Hurd where they
say the size is OFF_T_MAX.
(cwrite): Return true on success.
(bytes_split): Don't try to read past EOF, and stop if a write fails.
(lines_rr): Omit stray check for ignorable errno.
(main): Get file size only when n_units > 1, since that's the only
time it is needed. Defer most of the work to input_file_size.
* tests/split/l-chunk.sh: Adjust tests to match new behavior
on oddball inputs.
Run "make update-copyright" and then...
* gnulib: Update to latest with copyright year adjusted.
* tests/init.sh: Sync with gnulib to pick up copyright year.
* bootstrap: Likewise.
* tests/sample-test: Adjust to use the single most recent year.
Quote file names using the "shell-escape" or "shell-escape-always"
methods, which quote as appropriate for most shells,
and better support copy and paste of presented names.
The "always" variant is used when the file name is
embedded in an error message with surrounding spaces.
* cfg.mk (sc_error_shell_quotes): A new syntax check rule
to suggest quotef() where appropriate.
(sc_error_shell_always_quotes): Likewise for quoteaf().
* src/system.h (quotef): A new define to apply shell quoting
when needed, i.e., when a shell character or ':' is present.
(quoteaf): Likewise, but always quote.
* src/*.c: Use quotef() and quoteaf() rather than quote()
where appropriate.
* tests/: Adjust accordingly.
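
The difference between the two quoting variants can be sketched as follows. This is a simplified Python model with hypothetical names (quote_if_needed, quote_always); gnulib's shell-escape styles handle more cases (control characters, for instance) and may choose different output, so treat the exact quoting shown here as an assumption.

```python
import re

# Characters assumed safe to show unquoted. ':' is deliberately not in
# this set, so names containing it are quoted.
_SAFE = re.compile(r"[A-Za-z0-9_@%+=,./-]+")


def quote_always(name):
    """Always wrap `name` in single quotes, escaping embedded quotes."""
    return "'" + name.replace("'", "'\\''") + "'"


def quote_if_needed(name):
    """Quote `name` only if it has characters outside the safe set."""
    if _SAFE.fullmatch(name):
        return name
    return quote_always(name)
```

In this model a plain name such as file.txt passes through unchanged, a name such as "a b" or "a:b" is quoted, and the "always" variant quotes even plain names so they stand out inside a longer message.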
These strings are often file names or other user specified
parameters, which can give confusing errors in
the presence of unexpected characters for example.
* cfg.mk (sc_error_quotes): A new syntax check rule.
* src/*.c: Wrap error() string arguments with quote().
* tests/: Adjust accordingly.
* NEWS: Mention the improvement.
* src/shred.c (usage): For -u, separate the description
of the short and long option, to clarify that the short option
takes no parameter.
* src/split.c (usage): Likewise for -d.
* src/tee.c (usage): Likewise for -p.
* src/uniq.c (usage): Likewise for -D.
Suggested by Stephane Chazelas
tests/misc/wc-proc.sh fails when the page size is 64K
* src/wc.c (wc): The lseek adjustment should be based on st_blksize,
rather than on the internal buffer size. This is significant on
aarch64 where st_blksize in /proc is 64K (the page size) and
thus larger than the internal buffer.
* src/split.c (main): Even though similar processing is done
on the internal buffer size, that's based on st_blksize and
so fine in this regard. Add an assert to enforce this.
Avoid this path for the undocumented ---io-blksize option.
Supporting `split --numeric-suffixes=1 -n100` for example.
* doc/coreutils.texi (split invocation): Mention the two
use cases for the FROM parameter, and the consequences on
the suffix length determination.
* src/split.c (set_suffix_length): Use the --numeric-suffixes
FROM parameter in the suffix width calculation, when it's
less than the number of files specified in --number.
* tests/split/suffix-auto-length.sh: Add test cases.
Fixes http://bugs.gnu.org/20511
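
The effect of the FROM parameter on the suffix width can be modelled with a short Python sketch. The function name and parameters are hypothetical; the sketch only encodes the condition stated above (FROM is taken into account when it is less than the number of files) and assumes a zero based count of the last index.

```python
def numeric_suffix_width(n_units, start=0, alphabet_len=10):
    """Width needed for the highest suffix when numbering from `start`.

    Hypothetical model of the calculation described for
    set_suffix_length, not the actual C code.
    """
    if n_units < 1 or start < 0:
        raise ValueError("invalid arguments")
    last = n_units - 1
    if start < n_units:
        # Numbering begins at `start`, so the final suffix is shifted.
        last += start
    width = 1
    while last >= alphabet_len:
        last //= alphabet_len
        width += 1
    return width
```

With 100 files numbered from 0 the last suffix is 99 and two digits suffice; numbered from 1, the last suffix is 100 and three digits are needed, which is the `split --numeric-suffixes=1 -n100` case.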
* src/system.h (emit_stdin_note): A new function, refactoring
the usage note about the '-' FILE implying stdin.
* src/base64.c (usage): Use the new function to emit the
note in a standard location and with standard separation.
* src/cat.c (usage): Likewise.
* src/csplit.c (usage): Likewise.
* src/cut.c (usage): Likewise.
* src/expand.c (usage): Likewise.
* src/fmt.c (usage): Likewise.
* src/head.c (usage): Likewise.
* src/md5sum.c (usage): Likewise.
* src/nl.c (usage): Likewise.
* src/od.c (usage): Likewise.
* src/paste.c (usage): Likewise.
* src/pr.c (usage): Likewise.
* src/ptx.c (usage): Likewise.
* src/shred.c (usage): Likewise.
* src/shuf.c (usage): Likewise.
* src/sort.c (usage): Likewise.
* src/sum.c (usage): Likewise.
* src/tac.c (usage): Likewise.
* src/tail.c (usage): Likewise.
* src/tsort.c (usage): Likewise.
* src/unexpand.c (usage): Likewise.
* src/wc.c (usage): Likewise.
* src/join.c (usage): Adjust the separation used for
the message referring to FILE1 or FILE2 as stdin.
* src/comm.c (usage): Add a message using the same
wording (translation) as used in join.
* src/split.c (usage): Reword to use FILE rather than
INPUT, allowing use of emit_stdin_note(). Also remove
the mention of "fixed-size" pieces as this is no longer
always the case.
Fixes http://pad.lv/1450179
With GCC 5 and the newly added warnings from gnulib, ensure the
correct signed integer is passed for the printf format,
to avoid -Werror=format= failures.
* src/split.c (eolchar): A new variable to hold
the separator character (unibyte for now).
This is referenced throughout rather than hardcoding '\n'.
(usage): Describe the new --separator option, and
mention records along with lines so there is no ambiguity
that all options treat lines and records equivalently.
(main): Have -t update eolchar, or default to '\n'.
* tests/split/record-sep.sh: New test case.
* tests/local.mk: Reference the new test.
* doc/coreutils.texi (split invocation): Document the new option.
Adjust --lines, --line-bytes, --number=[lr]/... to mention
they pertain to records if --separator is specified.
* NEWS: Mention the new feature.
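
A minimal sketch of what replacing a hardcoded '\n' with a configurable separator looks like, written in Python for illustration. The function split_records and its parameters are hypothetical; it groups a byte string into chunks of a fixed number of records and is not the logic of lines_split itself.

```python
def split_records(data, records_per_chunk, eolchar=b"\n"):
    """Split `data` into chunks of `records_per_chunk` records.

    `eolchar` is a single-byte record separator; each record keeps
    its trailing separator.
    """
    if len(eolchar) != 1 or records_per_chunk < 1:
        raise ValueError("need a one-byte separator and a positive count")
    chunks = []
    start = 0      # where the current chunk begins
    pos = 0        # where to search for the next separator
    count = 0      # records seen in the current chunk
    while True:
        idx = data.find(eolchar, pos)
        if idx < 0:
            break
        pos = idx + 1
        count += 1
        if count == records_per_chunk:
            chunks.append(data[start:pos])
            start = pos
            count = 0
    if start < len(data):
        # Trailing records, or a final record without a separator.
        chunks.append(data[start:])
    return chunks
```

Calling split_records(b"a:b:c:", 2, b":") yields [b"a:b:", b"c:"], while the default separator treats the input as ordinary lines.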
* src/split.c (usage): Indent the info on CHUNKS so that
help2man can match it and align appropriately in its own section.
Fixes http://bugs.gnu.org/19228
Run "make update-copyright" and then...
* tests/sample-test: Adjust to use the single most recent year.
* tests/du/bind-mount-dir-cycle-v2.sh: Fix case in copyright message,
so that year is updated automatically in future.
Following on from commit v8.23-82-gaddae94, consistently diagnose
numbers that are too large, so as to distinguish from other errors,
and make the limits obvious.
* gl/modules/xdectoint: A new module implementing xdecto[iu]max(),
which handles the common case of parsing a bounded integer and
exiting with a diagnostic on error.
* gl/lib/xdectoimax.c: The signed variant.
* gl/lib/xdectoint.c: The parameterized implementation.
* gl/lib/xdectoint.h: The interface.
* gl/lib/xdectoumax.c: The unsigned variant.
* bootstrap.conf: Reference the new module.
* cfg.mk (exclude_file_name_regexp--sc_require_config_h_first):
Exclude the parameterized templates.
* src/csplit.c: Output EOVERFLOW or ERANGE errors if appropriate.
* src/fmt.c: Likewise.
* src/fold.c: Likewise.
* src/head.c: Likewise.
* src/ls.c: Likewise.
* src/nl.c: Likewise.
* src/nproc.c: Likewise.
* src/shred.c: Likewise.
* src/shuf.c: Likewise.
* src/stdbuf.c: Likewise.
* src/stty.c: Likewise.
* src/tail.c: Likewise.
* src/truncate.c: Likewise.
* src/split.c: Likewise.
* src/pr.c: Likewise.
* tests/pr/pr-tests.pl: Adjust to avoid matching errno diagnostic.
* tests/fmt/base.pl: Likewise.
* tests/split/l-chunk.sh: Likewise.
* tests/misc/shred-negative.sh: Likewise.
* tests/misc/tail.pl: Likewise. Also remove the redundant
existing ERR_SUBST from test err-6.
* tests/ls/hex-option.sh: Check HEX/OCT options.
* tests/misc/shred-size.sh: Likewise.
* tests/misc/stty-row-col.sh: Likewise.
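
The idea behind parsing a bounded integer, with out-of-range values diagnosed separately from malformed input, can be sketched in Python. The function parse_bounded is hypothetical and raises exceptions where xdecto[iu]max() would exit with a diagnostic; the message wording is an assumption, and Python's int() is somewhat more lenient about input syntax than a strict decimal parser.

```python
import errno
import os


def parse_bounded(text, lo, hi):
    """Parse a decimal integer and require lo <= value <= hi.

    Raises ValueError for malformed input and OverflowError for
    values outside the bounds, so callers can tell them apart.
    """
    try:
        value = int(text, 10)
    except ValueError:
        raise ValueError(f"invalid number: {text!r}") from None
    if value < lo or value > hi:
        # Report a range-style error rather than a generic one.
        raise OverflowError(f"{text!r}: {os.strerror(errno.ERANGE)}")
    return value
```

Keeping the two failure modes distinct is what lets a diagnostic make the limit obvious instead of just saying the number is invalid.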
"zu" was output on solaris 8 for example rather than the number,
since coreutils-8.22.
* cfg.mk: Disallow %z, since we don't currently use the gnulib
fprintf module, so any usage with it is non portable. Also
our usage with error() currently works only through an ancillary
dependency on the vfprintf gnulib module.
* src/rm.c (main): Use %PRIuMAX rather than %zu for portability.
* src/dd.c (alloc_[io]buf): Likewise for consistency.
* src/od.c (main): Likewise.
* src/split.c (set_suffix_length): Likewise.
* NEWS: Mention the rm bug fix.
Reported in http://bugs.gnu.org/19184
Fix similar problems in head, od, split, tac, and tail.
Reported by George Shuklin in: http://bugs.gnu.org/18621
* NEWS: Document this.
* src/head.c (elseek): Move up.
(elide_tail_bytes_pipe, elide_tail_lines_pipe): New arg
CURRENT_POS. All uses changed.
(elide_tail_bytes_file, elide_tail_lines_file):
New arg ST and remove arg SIZE. All uses changed.
* src/head.c (elide_tail_bytes_file):
* src/od.c (skip): Avoid optimization for /sys files, where
st_size is bogus and st_size == st_blksize.
Don't report error at EOF when not optimizing.
* src/head.c, src/od.c, src/tail.c: Include "stat-size.h".
* src/split.c (input_file_size): New function.
(bytes_split, lines_chunk_split, bytes_chunk_extract): New arg
INITIAL_READ. All uses changed. Use it to double-check st_size.
* src/tac.c (tac_seekable): New arg FILE_POS. All uses changed.
(copy_to_temp): Return size of temp file. All uses changed.
* src/tac.c (tac_seekable):
* src/tail.c (tail_bytes):
* src/wc.c (wc):
Don't trust st_size; double-check by reading.
* src/wc.c (wc): New arg CURRENT_POS. All uses changed.
* tests/local.mk (all_tests): Add tests/misc/wc-proc.sh,
tests/misc/od-j.sh, tests/tail-2/tail-c.sh.
* tests/misc/head-c.sh:
* tests/misc/tac-2-nonseekable.sh:
* tests/split/b-chunk.sh:
Add tests for problems with /proc and /sys files.
* tests/misc/od-j.sh, tests/misc/wc-proc.sh, tests/tail-2/tail-c.sh:
New files.
* src/system.h (emit_ancillary_info): Take the invariant PROGRAM_NAME
as a parameter, so that consistent references are made to online docs
and texinfo nodes, when a --program-prefix is in place. Note the
man pages don't need this fix as they're generated before the program
prefix is used.
* NEWS: Mention the improvements in references to online documentation.
Input buffering is best avoided because it introduces
delayed processing of output for intermittent input,
especially when the output size is less than that of
the input buffer. This is significant when output
is being further processed which could happen if split
is writing to precreated fifos, or through --filter.
If input is arriving quickly from a pipe then this will
already be buffered before we read it, so fast arriving
input shouldn't be a performance issue.
* src/split.c (lines_split, lines_bytes_split, bytes_split,
lines_chunk_split, bytes_chunk_extract): s/full_read/safe_read/.
* THANKS.in: Mention the reporter.
* NEWS: Mention the improvement.
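
The behavioural difference between the two read styles can be modelled in Python. The functions below are hypothetical analogues named after the gnulib routines, not bindings to them: one returns as soon as any data is available, the other keeps reading until the buffer is full or EOF is reached.

```python
import os


def safe_read(fd, size):
    """Return whatever is available, up to `size` bytes.

    os.read already retries when interrupted by a signal
    (Python 3.5 and later), so a single call is enough here.
    """
    return os.read(fd, size)


def full_read(fd, size):
    """Keep reading until `size` bytes are collected or EOF is hit."""
    chunks = []
    remaining = size
    while remaining > 0:
        chunk = os.read(fd, remaining)
        if not chunk:           # EOF
            break
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)
```

On a pipe holding 3 bytes with the writer still open, safe_read(fd, 10) returns those 3 bytes immediately, whereas full_read(fd, 10) would block waiting for 7 more; that wait is the delayed processing the change avoids.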
Run "make update-copyright", but then also run this,
perl -pi -e 's/2\d\d\d-//' tests/sample-test
to make that one script use the single most recent year number.
Also do not end option descriptions with a period, properly indent
continuation lines, and make some tiny clarifications.
* src/du.c (usage): Lowercase after semicolon.
* src/ls.c (usage): Semicolons instead of periods, small rephrasing
and two hyphens for clarity, proper indentation.
* src/mktemp.c (usage): Semicolons and lowercase.
* src/od.c (usage): Semicolons.
* src/ptx.c (usage): Use the standard phrase, clarify default option.
* src/setuidgid.c (usage): Properly indent continuation line.
* src/split.c (usage): Semicolons, lowercase, no final period.
* src/stat.c (usage): Semicolons, lowercase.
* src/tail.c (usage): Proper indentation, one shorter rephrasing,
semicolons, no final periods.
* src/timeout.c (usage): Properly indent, semicolons, no final periods.
Fixes http://bugs.gnu.org/14976
* src/split.c (line_bytes_split): Rewrite to only buffer
when necessary, i.e., only increase the buffer when we've
already output lines in a split and we encounter a line
larger than the input buffer size, in which case a hold
buffer will be increased in increments of the input buffer size.
(lines_rr): Use the more abstract xalloc_die() just like
we did in line_bytes_split(), rather than explicitly
printing the "memory exhausted" message and exiting.
* tests/split/line-bytes.sh: Add a new test for this
function which previously had no test coverage.
* tests/local.mk: Reference the new test.
* NEWS: Mention the improvement.
Fixes http://bugs.gnu.org/13537
Each program with at least one long option which is marked as
'required_argument' and which has also a short option for that
option, should print a note about mandatory arguments.
Define that well-known note centrally and use it rather than
literal printf/fputs, and add it where it was missing.
* src/system.h (emit_mandatory_arg_note): Add new function.
* src/cp.c (usage): Use it rather than literal printf/fputs.
* src/csplit.c, src/cut.c, src/date.c, src/df.c, src/du.c:
* src/expand.c, src/fmt.c, src/fold.c, src/head.c, src/install.c:
* src/kill.c, src/ln.c, src/ls.c, src/mkdir.c, src/mkfifo.c:
* src/mknod.c, src/mv.c, src/nl.c, src/od.c, src/paste.c:
* src/pr.c, src/ptx.c, src/shred.c, src/shuf.c, src/sort.c:
* src/split.c, src/stdbuf.c, src/tac.c, src/tail.c, src/timeout.c:
* src/touch.c, src/truncate.c, src/unexpand.c, src/uniq.c:
Likewise.
* src/base64.c (usage): Add call of the above new function
because at least one long option has a required argument.
* src/basename.c, src/chcon.c, src/date.c, src/env.c:
* src/nice.c, src/runcon.c, src/seq.c, src/stat.c, src/stty.c:
Likewise.
Run "make update-copyright", but then also run this,
perl -pi -e 's/2\d\d\d-//' tests/sample-test
to make that one script use the single most recent year number.
* src/split.c (lines_rr) [IF_LINT]: Plug a harmless leak.
(main) [IF_LINT]: Free a usually-small (~70KB) buffer
just before exit, mainly to take this off the radar of
leak-detecting tools.
Improved-by: Pádraig Brady.
* src/split.c (create): Check if output file is the
same inode as the input file.
* tests/split/guard-input: New test case.
* tests/Makefile.am: Reference new test case.
* NEWS: Mention the fix.
Improved-by: Jim Meyering
Reported-by: François Pinard
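
The same-inode check mentioned for create can be illustrated with a short Python sketch. The function is_same_file is hypothetical; it compares device and inode numbers of two open descriptors, which is the usual way to detect that two names refer to one file, and is offered as a model rather than the code in src/split.c.

```python
import os


def is_same_file(fd_a, fd_b):
    """True if both descriptors refer to the same file (device + inode)."""
    a = os.fstat(fd_a)
    b = os.fstat(fd_b)
    return (a.st_dev, a.st_ino) == (b.st_dev, b.st_ino)
```

A caller would open the output file, compare it with the input descriptor, and refuse to write if the two match, since writing would otherwise overwrite the data still being read.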