coreutils

mirror of git://git.sv.gnu.org/coreutils.git synced 2026-04-18 09:46:33 +02:00

Author	SHA1	Message	Date
Pádraig Brady	0ee8b03422	doc: touch: clarify --time description in man page * src/touch.c (usage): Reorganise the description to be similar to the format used for the ls --time description, which formats better when converted to a man page. Also separate the description to allow for more granular translations. Fixes https://bugs.gnu.org/67656	2023-12-06 13:07:43 +00:00
dann frazier	73d119f4f8	tail: fix tailing sysfs files where PAGE_SIZE > BUFSIZ * src/tail.c (file_lines): Ensure we use a buffer size >= PAGE_SIZE when searching backwards to avoid seeking within a file, which on sysfs files is accepted but also returns no data. * tests/tail/tail-sysfs.sh: Add a new test. * tests/local.mk: Reference the new test. * NEWS: Mention the bug fix. Fixes https://bugs.gnu.org/67490	2023-12-01 23:01:32 +00:00
Pádraig Brady	615167cc4d	numfmt: support lowercase 'k' for Kilo and Kibi For consistency with the "SI" standard, and with other coreutils which output a lowercase 'k' in "SI" mode. * src/numfmt.c (suffix_power): Treat 'k' like 'K' on input. (double_to_human): Output lowercase 'k' in SI mode. (usage): Adjust accordingly. * doc/coreutils.texi: Mention 'k' accepted, and printed in SI mode. * tests/misc/numfmt.pl: Adjust accordingly. * NEWS: Mention the change in behavior. Fixes https://bugs.gnu.org/47103	2023-11-27 19:41:49 +00:00
Paul Eggert	74b9d6a6e8	uniq: fix bug with -w in multibyte locales -w counted bytes not characters, which is wrong in multibyte locales. This bug exists even in Fedora, which is why the recently-added test cases from Fedora didn’t catch it. * src/uniq.c (find_field): New arg PLEN. All callers changed. Compute length of field correctly in multi-byte locales. (different): Don’t worry about check_chars; find_field now does that. * tests/uniq/uniq.pl: Test for this bug.	2023-11-16 11:37:25 -08:00
Paul Eggert	0ed9d1823a	tests: omit inapplicable test code * tests/misc/join.pl, tests/uniq/uniq.pl: Remove test for "invalid byte, character or field list" message that is not generated.	2023-11-16 11:37:25 -08:00
Paul Eggert	77201c506f	uniq: change macro to function * src/uniq.c (swap_lines): New static function, replacing the old SWAP_LINES macro. These days this is just as fast. All uses changed.	2023-11-16 11:37:25 -08:00
Paul Eggert	3bee7c9754	uniq: prefer static init * src/uniq.c (skip_fields, skip_chars, check_chars, count_occurrences) (output_unique, output_first_repeated, output_later_repeated) (delimit_groups): Initialize statically, rather than in ‘main’. This shrinks the executable a bit.	2023-11-16 11:37:25 -08:00
Paul Eggert	a72b7823b4	uniq: simplify and fix unlikely bug by using bool * src/uniq.c (enum countmode): Remove this type. (count_occurrences): New static var, replacing the old countmode, and of type boolean instead of a two-value enum type that was confusing (and which caused a hard-to-test bug when the count exceeded INTMAX_MAX - 1). All uses changed.	2023-11-16 11:37:25 -08:00
Paul Eggert	a257b63ce7	uniq: prefer signed integers * src/uniq.c (skip_fields, skip_chars, check_chars, size_opt) (find_field, different, writeline, check_file, main): Prefer signed to unsigned integer types, since this allows for better runtime checking with -fsanitize=undefined.	2023-11-14 23:15:18 -08:00
Paul Eggert	23e26ed972	maint: DECIMAL_DIGIT_ACCUMULATE uses stdckdint.h * src/system.h: Include <stdckdint.h>, since the new DECIMAL_DIGIT_ACCUMULATE uses it. Do not include stdckdint.h from files that also include system.h. (DECIMAL_DIGIT_ACCUMULATE): Omit last arg, which is no longer needed. Reimplement by using C23-style stdckdint.h’s ckd_mul and ckd_add, as that’s more standard and is more likely to generate better code.	2023-11-14 20:38:24 -08:00
Paul Eggert	3e0d7787e6	pinky: fix string size calculation * src/pinky.c (count_ampersands): Simplify and return idx_t. (create_fullname): Compute proper destination string size, basically, by adding (ulen - 1) * ampersands rather than ulen * (ampersands - 1). Problem found on CHERI-64.	2023-11-11 00:17:49 -08:00
Paul Eggert	4c15a1b6e6	maint: port randread to FreeBSD 14 * gl/lib/randread.c (POINTER_IS_ALIGNED): Rename from ALIGNED_POINTER to avoid a collision with <machine/param.h> on FreeBSD 14.	2023-11-11 00:17:49 -08:00
Paul Eggert	394b29aaff	build: update gnulib submodule to latest	2023-11-11 00:17:48 -08:00
Pádraig Brady	7f2c97a241	ls: fix recent regression in size alignment * src/ls.c (print_long_format): Use correct column width, introduced due to a copy/paste error in commit v9.4-2-gcbb6dfec5 * tests/ls/size-align.sh: Add a test. * tests/local.mk: Reference the new test. Fixes https://bugs.gnu.org/66919	2023-11-03 16:34:38 +00:00
Paul Eggert	56e9acb292	join: fix recently introduced NUL bug * src/join.c (xfields): Simplify and fix bug with fields that start with a NUL byte when -t is not used. * tests/misc/join-utf8.sh: Also test when -t is not used, and when a field starts with NUL.	2023-10-30 10:49:44 -07:00
Paul Eggert	bd45f0963c	maint: pacify ‘make syntax-check’ * tests/misc/join-utf8.sh: Omit fail=0. Fix framework_failure_ typo. * tests/misc/join.pl: Change ` to '.	2023-10-30 01:33:19 -07:00
Paul Eggert	ba5017b65a	maint: copy join, uniq tests from Fedora * tests/misc/join.pl, tests/uniq/uniq.pl: Copy from Fedora 39. This adds more multi-byte tests.	2023-10-30 01:24:43 -07:00
Paul Eggert	11b01fc21f	join,uniq: support multi-byte separators * NEWS: Mention this. * bootstrap.conf (gnulib_modules): Remove cu-ctype, as this module is now more trouble than it’s worth. All uses removed. Add skipchars. * gl/lib/cu-ctype.c, gl/lib/cu-ctype.h, gl/modules/cu-ctype: Remove. * gl/lib/skipchars.c, gl/lib/skipchars.h, gl/modules/skipchars: * tests/misc/join-utf8.sh: New files. * src/join.c: Include skipchars.h and mcel.h instead of cu-ctype.h. (tab): Now mcel_t, not int. All uses changed. (output_separator, output_seplen): New static vars. (eq_tab, newline_or_blank, comma_or_blank): New functions. (xfields, prfields, prjoin, add_field_list, main): Support multi-byte characters. * src/numfmt.c: Include ctype.h, skipchars.h. Do not include cu-ctype.h. (newline_or_blank): New function. (next_field): Support multi-byte characters. * src/sort.c: Include ctype.h instead of cu-ctype.h. (inittables): Open-code field_sep since it no longer exists. ‘sort’ is not multi-byte safe yet, but when it is this code will need revamping anyway. * src/uniq.c: Include mcel.h and skipchars.h instead of cu-ctype.h. (newline_or_blank): New function. (find_field): Support multi-byte characters. * tests/local.mk (all_tests): Add tests/misc/join-utf8.sh	2023-10-30 00:58:04 -07:00
Paul Eggert	2709bea0f4	test: allow non-blank white space in numbers * src/test.c (find_int): Use isspace, not isblank, for compatibility with how strtol works, which is how most other shells do this.	2023-10-30 00:58:04 -07:00
Paul Eggert	a3ce33c106	stdbuf: port to oddball toupper * src/stdbuf.c: Do not include ctype.h. (set_libstdbuf_options): Use c_toupper, not toupper, since the C locale is intended here.	2023-10-30 00:58:04 -07:00
Paul Eggert	8d60cd8ad6	dircolors: assume C-locale spaces * src/dircolors.c: Include c-ctype.h, not ctype.h. (parse_line): Use c_isspace, not isspace, as the .dircolors file format (which does not seem to be documented!) appears to be ASCII.	2023-10-30 00:58:04 -07:00
Paul Eggert	5602342a16	maint: port to oddball tolower * src/digest.c (hex_equal): Work even in oddball locales where tolower does not work as expected on ASCII letters.	2023-10-30 00:58:04 -07:00
Paul Eggert	4edb14d20f	maint: include ctype.h selectively Include ctype.h only in files that need it. Many of its uses are incorrect, as they assume single-byte locales. The idea is to remove the incorrect uses later, when there is time. * src/chroot.c, src/csplit.c, src/dd.c, src/digest.c, src/dircolors.c: * src/expand-common.c, src/expand.c, src/fmt.c, src/fold.c, src/ls.c: * src/od.c, src/pinky.c, src/pr.c, src/ptx.c, src/seq.c: * src/set-fields.c, src/split.c, src/stdbuf.c, src/test.c: * src/tr.c, src/truncate.c, src/unexpand.c, src/wc.c: Include ctype.h. * src/system.h: Do not include ctype.h.	2023-10-30 00:58:04 -07:00
Paul Eggert	684e810ae2	maint: move field_sep into separate module This is so that we don’t need to have every source file include ctype.h. * bootstrap.conf (gnulib_modules): Add cu-ctype. * gl/lib/cu-ctype.c, gl/lib/cu-ctype.h, gl/modules/cu-ctype: New files. * src/join.c, src/numfmt.c, src/sort.c, src/uniq.c: Include cu-ctype.h, for field_sep. * src/system.h (field_sep): Remove; now supplied by cu-ctype.	2023-10-30 00:58:04 -07:00
Paul Eggert	2f3d9524bb	digest: omit unnecessary b2sum includes * src/blake2/b2sum.c: Do not include string.h, errno.h, ctype.h, unistd.h, getopt.h.	2023-10-30 00:58:03 -07:00
Paul Eggert	0292a5678a	maint: prefer c_isxdigit when that is the intent * src/digest.c (valid_digits, split_3): * src/echo.c (main): * src/printf.c (print_esc): * src/ptx.c (unescape_string): * src/stat.c (print_it): When the code is supposed to support only POSIX-locale hex digits, use c_isxdigit rather than isxdigit. Include c-ctype.h as needed. This defends against oddball locales where isxdigit != c_isxdigit.	2023-10-30 00:58:03 -07:00
Pádraig Brady	f7e25d5bb5	maint: fix syntax check issue * src/basenc.c: Fix preprocessor indentation.	2023-10-28 13:13:50 +01:00
Pádraig Brady	8c735f6585	base32,base64: disallow non-canonical encodings This will make decoding more resilient to corruption whether due to transmission errors or nefarious adjustment. See https://eprint.iacr.org/2022/361.pdf * gnulib: Update to commit 3f463202bd enforcing canonical encoding. * tests/basenc/base64.pl: Add test cases, and adjust existing cases. * NEWS: Mention the change in behavior.	2023-10-28 13:13:34 +01:00
Paul Eggert	60bd7bad9d	basenc: fix unlikely locale issue; tune This sped up ‘basenc -d --base16’ by 60% on my old platform, AMD Phenom II X4 910e, Fedora 38. * src/basenc.c (struct base16_decode_context): Simplify by omitting have_nibble. ‘nibble’ is now negative if it’s missing. All uses changed. (B16): New macro, inspired by ../lib/base64.c. (base16_to_int): New static var, likewise. (isubase16): Reimplement using base16_to_int, since isxdigit is not guaranteed to succeed on the chars we want when the locale is oddball. (base16_decode_ctx): Tune by using base16_to_int and by	2023-10-25 15:09:27 -07:00
Paul Eggert	dcc1514d9a	basenc: tweak checks to use unsigned char This tends to generate better code, at least on x86-64, because callers are just as fast and callees can avoid a conversion. * src/basenc.c: The following renamings also change the arg type from char to unsigned char. All uses changed. (isubase): Rename from isbase. (isubase64url): Rename from isbase64url. (isubase32hex): Rename from isbase32hex. (isubase16): Rename from isbase16. (isuz85): Rename from isz85. (isubase2): Rename from isbase2. 2023-10-24 Paul Eggert <eggert@cs.ucla.edu> * src/basenc.c (struct base16_decode_context): Simplify by storing -1 for missing nibbles. All uses changed.	2023-10-25 15:09:27 -07:00
Paul Eggert	f4a59d453e	build: update gnulib submodule to latest	2023-10-25 15:09:27 -07:00
Pádraig Brady	5f538c27a1	basenc: --base16: also allow lower case with --ignore-garbage * src/basenc.c (isbase16): Also return true for lower case. * tests/basenc/basenc.pl: Add a test case. Reported by Paul Eggert.	2023-10-25 14:04:00 +01:00
Pádraig Brady	d733f2ec26	basenc: --base16: support lower case hex digits * src/basenc.c (base16_decode_ctx): Convert to uppercase before converting from hex. * tests/basenc/basenc.pl: Add a test case. * NEWS: Mention the change in behavior. Addresses https://bugs.gnu.org/66698	2023-10-23 14:04:38 +01:00
Pádraig Brady	2e0dcd87bf	doc: fix RFC references * doc/coreutils.texi: Adjust RFC URLs as the original now give 404 errors.	2023-10-23 12:29:03 +01:00
Pádraig Brady	caa716803a	tests: move all basenc tests to their own directory * tests/misc/base64.pl: Move to tests/basenc/base64.pl * tests/misc/basenc.pl: Move to tests/basenc/basenc.pl * tests/local.mk: Adjust accordingly	2023-10-06 18:22:35 +01:00
Pádraig Brady	378dc38f48	basenc: auto pad base32 and base64 inputs when decoding Padding of encoded data is useful in cases where base64 encoded data is concatenated / streamed. I.e. where there are padding chars _within_ the stream. In other cases padding is optional and can be inferred. Note we continue to treat partial padding as invalid, as that would be indicative of truncation. * src/basenc.c (do_decode): Auto pad the end of the input. * NEWS: Mention the change in behavior. * tests/misc/base64.pl: Adjust to not fail for missing padding. Addresses https://bugs.gnu.org/66265	2023-10-06 18:21:12 +01:00
Paul Eggert	a2434d3e58	sort: improve --help Problem reported by Jorge Stolfi (bug#66253). * src/sort.c (usage): Suggest looking at the manual for -n details.	2023-09-28 18:03:34 -07:00
Pádraig Brady	0c46704832	doc: rm --help: mention that '.' or '..' are rejected * src/rm.c (usage): State that '.' or '..' are rejected.	2023-09-25 15:26:31 +01:00
Paul Eggert	de4e704273	wc: pacify ‘make syntax-check’ * src/wc_avx2.c (wc_lines_avx2): Explicitly make it ‘extern’. Not sure why this is needed.	2023-09-23 17:20:26 -07:00
Paul Eggert	2245a95806	wc: distribute src/wc.h * src/local.mk (noinst_HEADERS): Add src/wc.h.	2023-09-23 17:20:25 -07:00
Paul Eggert	f40c6b5cf2	wc: goto considered harmful * src/wc.c: Do not include assure.h. Replace the only use of ‘assure’ with ‘unreachable’ which is good enough. (wc, main): Remove labels and gotos. This doesn’t affect performance in any way I can measure, and makes the code a bit easier to follow.	2023-09-23 17:07:52 -07:00
Paul Eggert	6b8b1f9e77	wc: prefer signed integers Prefer signed to unsigned integers, to make it easier to catch integer overflow errors. * src/wc.c: Do not include safe-read. (total_lines_overflow, total_words_overflow, total_chars_overflow) (total_bytes_overflow): Now bool, not uintmax_t. All uses changed. (max_line_length): Now intmax_t, not uintmax_t. All uses changed. The total_... vars are still uintmax_t because overflow into them is checked. (page_size): Now idx_t, not size_t. (wc_lines, wc, get_input_fstatus, compute_number_width, main): Prefer signed to unsigned ints where either should do. (wc_lines, wc): Use read rather than safe_read, since we don’t need safe_read’s checks for huge buffers. (wc): Redo call to mbrtoc32 to lessen the number of comparisons against its returned value. Do this partly by keeping a pointer to the end of the buffer rather than a count. Simplify overflow-checking code. (compute_number_width): Check for integer overflow. Don’t assume size_t fits into unsigned long. * src/wc.h (struct wc_lines): Prefer signed integers. * src/wc_avx2.c: Do not include safe-read.h. (wc_lines_avx2): Prefer signed integers. Use read, not safe_read.	2023-09-23 17:07:52 -07:00
Paul Eggert	8d41285fe4	wc: improve avx2 API * src/wc.c: Use "#include <...>" for files not in the current dir. Include "wc.h" instead of declaring wc_lines_avx2 by hand. (wc_lines): New API, with no file name (no longer needed) and with a return struct rather than arg pointers. All uses changed. Use avx2_supported directly instead of using a function pointer. Exploit C99-style declarations after statements. Multiply by 15 rather than dividing; it’s faster and more accurate and cannot overflow here. (wc): Simplify based on wc_lines API change. * src/wc.h: New file. * src/wc_avx2.c: Include it, to check API better. (wc_lines_avx2): Use new API. All uses changed. Exploit C99. Make locals more local.	2023-09-23 17:07:52 -07:00
Paul Eggert	769ace51e8	factor,tail: avoid quadratic reallocation * src/factor.c (struct mp_factors): New member nalloc. (mp_factor_init): Initialize it. * src/factor.c (mp_factor_insert): * src/tail.c (parse_options): Use xpalloc to avoid quadratic worst-case behavior on reallocation. * src/tail.c (pids_alloc): New static var.	2023-09-23 01:15:50 -07:00
Paul Eggert	9ecc4f4e44	doc: mention Unicode exceptions for wc	2023-09-23 00:28:28 -07:00
Paul Eggert	a6064bb864	wc: simplify by removing SUPPORT_OLD_MBRTOWC * src/wc.c (SUPPORT_OLD_MBRTOWC): Remove. All uses removed. (wc): Simplify by assuming C99-or-later behavior for mbrtoc32, which after all is a C11 API. Fix the !SUPPORT_OLD_MBRTOWC code, which evidently was never tested seriously.	2023-09-23 00:28:27 -07:00
Paul Eggert	17a9e79023	wc: 3× speedup in C locale The 3× speedup was measured by invoking 'wc $(find * -type f)' on the coreutils sources etc. on an Ubuntu 23.04 x86-64. These changes also speed up wc 20% in UTF-8 locales. * src/wc.c (wc_isprint, wc_isspace): New static vars. (wc): Use them for speed. (main): Initialize them if needed. (isnbspace): Remove; no longer used.	2023-09-23 00:28:27 -07:00
Paul Eggert	bee39b93f5	wc: treat encoding errors as non white space * src/wc.c (wc): Treat encoding errors like non white space characters.	2023-09-23 00:28:27 -07:00
Paul Eggert	31076e8689	wc: fix word count bug * bootstrap.conf (gnulib_modules): Remove c32isprint. * src/wc.c (wc): Consider all non-white-space characters to be word constituents, even if they are not printable. POSIX requires this, and it is what BSD does. Partly do this by simplifying the check for a word, by counting word starts rather than word ends. * tests/wc/wc.pl: Test for the bug.	2023-09-23 00:28:27 -07:00
Paul Eggert	a6648d4102	maint: omit some unused function tests * m4/jm-macros.m4: Do not check for ftruncate, iswspace, mkfifo, mbrlen, sysctl. Coreutils no longer uses the corresponding HAVE_* macros, typically because Gnulib handles them now. * src/wc.c (iswspace): Remove; unused.	2023-09-23 00:28:27 -07:00
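
The size arithmetic described in commit 3e0d7787e6 (pinky: fix string size calculation) can be sketched as follows. This is a hedged illustration, not the code in src/pinky.c: `expand_fullname` is a hypothetical helper, the body of `count_ampersands` is written from scratch here, and overflow checking is omitted. Each '&' in the GECOS name is one byte that becomes `ulen` bytes, so the result grows by `ulen - 1` per ampersand.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Count the '&' characters in S.  */
static size_t
count_ampersands (char const *s)
{
  size_t n = 0;
  for (; *s; s++)
    n += (*s == '&');
  return n;
}

/* Return a newly allocated copy of GECOS_NAME with each '&' replaced by
   USER_NAME, first letter capitalized.  Hypothetical helper for
   illustration; the caller frees the result.  Returns NULL if the
   allocation fails.  Integer overflow is not checked (an assumption
   that the inputs are short).  */
static char *
expand_fullname (char const *gecos_name, char const *user_name)
{
  assert (gecos_name && user_name);
  size_t ulen = strlen (user_name);
  size_t ampersands = count_ampersands (gecos_name);

  /* Equivalent to strlen + (ulen - 1) * ampersands + 1, written so that
     nothing is subtracted from ulen when it is zero.  */
  size_t size = strlen (gecos_name) - ampersands + ulen * ampersands + 1;

  char *result = malloc (size);
  if (!result)
    return NULL;

  char *r = result;
  for (char const *p = gecos_name; *p; p++)
    {
      if (*p != '&')
        *r++ = *p;
      else if (ulen > 0)
        {
          *r++ = toupper ((unsigned char) user_name[0]);
          memcpy (r, user_name + 1, ulen - 1);
          r += ulen - 1;
        }
    }
  *r = '\0';
  return result;
}
```

With the formula the commit message calls wrong, `ulen * (ampersands - 1)`, a name containing a single '&' would get no extra room at all, which is why the destination could be too small.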
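
Commit 31076e8689 (wc: fix word count bug) describes counting word starts rather than word ends, and treating every non-white-space character as a word constituent whether or not it is printable. A minimal sketch of that idea, assuming a single-byte locale and using the hypothetical name `count_words` (the real src/wc.c also handles multibyte input and overflow):

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/* Count the words in the LEN bytes at BUF.  A word starts at any
   non-white-space byte that follows white space or the start of the
   input.  Printability is deliberately not checked.  Single-byte
   locales only; this is an illustration, not the wc implementation.  */
static size_t
count_words (char const *buf, size_t len)
{
  assert (buf || len == 0);
  size_t words = 0;
  bool in_word = false;

  for (size_t i = 0; i < len; i++)
    {
      bool space = isspace ((unsigned char) buf[i]) != 0;
      if (!space && !in_word)
        words++;                /* Transition into a word: count its start.  */
      in_word = !space;
    }
  return words;
}
```

Counting starts needs only one flag and no special case at end of input, since a word that runs to the last byte was already counted when it began.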
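
Commits d733f2ec26 and 60bd7bad9d describe base16 decoding that accepts lower-case digits, avoids locale-dependent isxdigit, and stores a negative value when no high nibble is pending. The sketch below illustrates those three points under stated assumptions: `hex_value`, `struct hex_decoder` and `hex_decode` are hypothetical names, and range comparisons stand in for the lookup table (`base16_to_int`) that the commit message mentions.

```c
#include <assert.h>
#include <stddef.h>

/* Return the value of hex digit C, or -1 if C is not a hex digit.
   Uses explicit ranges rather than isxdigit, so the result does not
   depend on the current locale.  */
static int
hex_value (unsigned char c)
{
  if ('0' <= c && c <= '9')
    return c - '0';
  if ('A' <= c && c <= 'F')
    return c - 'A' + 10;
  if ('a' <= c && c <= 'f')
    return c - 'a' + 10;
  return -1;
}

/* Decoder state carried across calls.  NIBBLE is the pending high
   nibble, or negative if none is pending.  */
struct hex_decoder
{
  int nibble;
};

/* Decode the INLEN bytes at IN into OUT, which must have room for
   (INLEN + 1) / 2 bytes.  Return the number of bytes written, or -1
   if IN contains a byte that is not a hex digit.  */
static long
hex_decode (struct hex_decoder *ctx, char const *in, size_t inlen,
            unsigned char *out)
{
  assert (ctx && in && out);
  long n = 0;

  for (size_t i = 0; i < inlen; i++)
    {
      int v = hex_value ((unsigned char) in[i]);
      if (v < 0)
        return -1;
      if (ctx->nibble < 0)
        ctx->nibble = v;        /* First digit of a pair: remember it.  */
      else
        {
          out[n++] = ctx->nibble * 16 + v;
          ctx->nibble = -1;     /* Pair complete: nothing pending.  */
        }
    }
  return n;
}
```

Using a negative nibble as the "missing" marker removes the need for a separate boolean flag in the state, which is the simplification the commit message attributes to dropping have_nibble.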

29986 Commits