* src/sort.c (fillbuf): When enlarging the line buffer, ensure that
the new size is a multiple of "sizeof (struct line)". This avoids
alignment problems when indexing from the end of the buffer.
Problem reported by Andreas Schwab in
<http://lists.gnu.org/archive/html/bug-coreutils/2007-07/msg00158.html>.
* bootstrap.conf: Remove findprog.
* doc/coreutils.texi (sort invocation): The default is to not
compress. Don't treat "" specially.
* src/sort.c: Don't include findprog.h.
(create_temp): Compress only if the user specified --compress-program.
* tests/misc/sort-compress: Adjusts tests to match new behavior.
an environment variable.
* doc/coreutils.texi (sort invocation): Document this.
* src/sort.c (usage): Likewise.
(COMPRESS_PROGRAM_OPTION): New const.
(long_options, create_temp, main): Support new option.
* tests/misc/sort-compress: Test it.
* src/sort.c (usage): Split a diagnostic that had grown to be
longer than the C89 maximum of 509 bytes.
* .x-sc_cast_of_argument_to_free: New file. Allow a cast in sort.c.
FIXME: this is just temporary, while we wait to remove the offending
access-calling code.
* Makefile.am (EXTRA_DIST): Add .x-sc_cast_of_argument_to_free.
* Makefile.maint (sc_cast_of_argument_to_free): Use the
canonical, $$($(CVS_LIST_EXCEPT)).
* m4/.gitignore, m4/.cvsignore, lib/.gitignore, lib/.cvsignore: Update.
like it will be approved. Also add --check=quiet, --check=silent
as long aliases, and --check=diagnose-first as an alias for -c.
* doc/coreutils.texi (sort invocation): Document this.
Also, mention that sort -c can take at most one file.
* src/sort.c: Implement this.
Include argmatch.h.
(usage): Document the change.
(CHECK_OPTION): New constant.
(long_options): --check now takes an optional argument, and is now
treated differently from 'c'.
(check_args, check_types): New constant arrays.
(check): New arg CHECKONLY, which suppresses diagnostic if -C.
(main): Parse the new options.
* tests/sort/Test.pm (02d, 02d, incompat5, incompat6):
New tests for -C.
In pipe_fork callers, use these named constants, not "2" and "8".
(proctab, nprocs): Declare to be "static".
(pipe_fork) [lint]: Initialize local, pid,
to avoid unwarranted may-be-used-uninitialized warning.
(create_temp): Use the active voice. Describe parameters, too.
2007-01-21 James Youngman <jay@gnu.org>
Centralize all the uses of sigprocmask(). Don't restore an invalid
saved mask.
* src/sort.c (enter_cs, leave_cs): New functions for protecting
code sequences against signal delivery.
* (exit_cleanup): Use enter_cs and leave_cs instead of
calling sigprocmask directly.
(create_temp_file, pipe_fork, zaptemp): Likewise
2007-01-21 Dan Hipschman <dsh@linux.ucla.edu>
Add compression of temp files to sort.
* NEWS: Mention this.
* bootstrap.conf: Import findprog.
* configure.ac: Add AC_FUNC_FORK.
* doc/coreutils.texi: Document GNUSORT_COMPRESSOR environment
variable.
* src/sort.c (compress_program): New global, holds the name of the
external compression program.
(struct sortfile): New type used by mergepfs and friends instead
of filenames to hold PIDs of compressor processes.
(proctab): New global, holds compressor PIDs on which to wait.
(enum procstate, struct procnode): New types used by proctab.
(proctab_hasher, proctab_comparator): New functions for proctab.
(nprocs): New global, number of forked but unreaped children.
(reap, reap_some): New function, wait for/cleanup forked processes.
(register_proc, update_proc, wait_proc): New functions for adding,
modifying and removing proctab entries.
(create_temp_file): Change parameter type to pointer to file
descriptor, and return type to pointer to struct tempnode.
(dup2_or_die): New function used in create_temp and open_temp.
(pipe_fork): New function, creates a pipe and child process.
(create_temp): Creates a temp file and possibly a compression
program to which we filter output.
(open_temp): Opens a compressed temp file and creates a
decompression process through which to filter the input.
(mergefps): Change FILES parameter type to struct sortfile array
and update access accordingly. Use open_temp and reap_some.
(avoid_trashing_input, merge): Change FILES parameter like
mergefps and call create_temp instead of create_temp_file.
(sort): Call create_temp instead of create_temp_file.
Use reap_some.
(avoid_trashing_input, merge, sort, main): Adapt to mergefps.
* src/csplit.c (main): Also catch SIGALRM, SIGPIPE, SIGPOLL,
SIGPROF, SIGVTALRM, SIGXCPU, SIGXFSZ.
* src/ls.c (main): Likewise (except SIGPIPE was already caught).
Note that ls.c is special, as it also catches SIGTSTP.
* src/sort.c (main): Likewise. Also catch SIGQUIT. More details in
<http://thread.gmane.org/gmane.comp.gnu.coreutils.bugs/9510>.
(exit_cleanup): New function.
(main): Don't invoke atexit until we're ready.
Invoke it with exit_cleanup, not with cleanup and close_stdout,
to avoid a race condition with cleanup and signal handling. More
details: http://thread.gmane.org/gmane.comp.gnu.coreutils.bugs/9508
so that commands like "sort -k 18446744073709551616" no longer fail merely
because 18446744073709551616 doesn't fit in uintmax_t. The trick is that
these fields can all be treated as effectively infinity; their exact
values don't matter, since no internal buffer can be that long.
* src/join.c (string_to_join_field): Verify that SIZE_MAX <= ULONG_MAX
if the code assumes this. Silently truncate too-large values to SIZE_MAX,
as the remaining code will do the right thing in this case.
* src/sort.c (parse_field_count): Likewise.
* src/uniq.c (size_opt, main): Likewise.
* tests/join/Test.pm (bigfield): New test.
* tests/sort/Test.pm (bigfield): New test.
* tests/uniq/Test.pm (121): New test.
Signed-off-by: Jim Meyering <jim@meyering.net>
* src/sort.c (main): Don't allocate memory for each new key here.
(insertkey): Allocate memory for each key here, instead.
(key_init): Rename from new_key. Don't allocate.
Don't include rand-isaac.c; include randint.h and randread.h instead.
(RANDOM_SOURCE_OPTION): New enum.
(long_opts, usage, main): New option --random-source.
Include md5.h, randread.h, xmemxfrm.h.
(longopts, usage, main): Remove undocumented --seed option;
it's now replaced by --random-source.
(rand_state, get_hash): Remove.
(randread_source): New static var.
(random_state, cmp_hashes, compare_random): New functions; they guarantee
no collisions in the random hash function.
(keycompare): Use compare_random for -R; don't fall back on comparing
via memcoll, since compare_random does the right thing.
(check_ordering_compatibility, main): Use it.
(main): Check for -c and -o.
Don't bother with a usage message for
"sort -c a b", for consistency with other error diagnostics.
Don't include md5.h; it wasn't needed.
(struct keyfield): Rename random_hash to random, for consistency
with the other member names. All uses changed.
(usage): Tweak wording to mention STRING for --seed option.
(short_options): Rorder for consistency with other programs.
(rand_state): Now a struct, not a pointer to one. All uses changed.
(HASH_WORDS, HASH_SIZE): Remove.
(get_hash): Remove comments around resbuf size, since we can assume C89.
Use a "more-kosher" (but slower) approach of invoking isaac_refill.
(keycompare): Adjust to the new get_hash.
Add a FIXME.
(badfieldspec): Omit recently-introduced comment; it isn't needed.
(main): Don't set need_random simply because gkey has it set; that
doesn't necessarily mean we'll need random numbers.
Redo seeding to match new get_hash approach.
(usage): Add options --random-sort and --seed to implement a random
shuffle.
Include md5.h and rand-isaac.h.
(get_hash): New function.
(rand_state): New var.
(HASH_WORDS, HASH_SIZE): New macros.
Include stdlib--.h. Do not include unistd-safer.h.
(create_temp_file): Don't call fd_safer; no longer needed
now that we include *--.h files.
(xfopen): Don't call fopen_safer, for similar reasons.