mirror of git://git.sv.gnu.org/coreutils.git synced 2026-04-21 03:12:48 +02:00
coreutils/gl/lib/heap.c

152 lines
3.6 KiB
C

sort: parallelize internal sort This patch is by Gene Auyeung, Chris Dickens, Chen Guo, and Mike Nichols, based off of a patch by Paul Eggert, Glen Lenker, et. al., with a basic heap implementation based off of the GDSL heap, originally by Nicolas Darnis. The number of sorts done in parallel is limited to the number of available processors by default, or can be further restricted with the --parallel option. On a dual-die, 8 core Intel Xeon, results show sorting with 8 threads is almost 4 times faster than using a single thread. Timings when sorting a 96MB file: THREADS TIME (s) 1 5.10 2 2.87 4 1.75 8 1.31 Single threaded sorting has also been improved, especially for cheaper comparison operations: COMMAND BEFORE (s) AFTER (s) sort 8.822 8.716 sort -g 10.336 10.222 sort -n 3.077 2.961 LANG=C sort 2.169 2.066 * bootstrap.conf: Add heap, pthread. * coreutils.texi (sort): Describe the new --parallel option. * gl/lib/heap.c: New file. Very basic heap implementation. * gl/lib/heap.h: New file. * gl/modules/heap: New file. * src/Makefile.am: Add LIB_PTHREAD. * src/sort.c: Include heap.h, nproc.h, pthread.h. (MAX_MERGE): New macro. (SUBTHREAD_LINES_HEURISTIC, PARALLEL_OPTION): New constants. (MERGE_END, MERGE_ROOT): New constants. (struct merge_node): New struct. (struct merge_node_queue): New struct. (sortlines temp): Remove declaration. (usage, long_options, main): New option, --parallel. (specify_nthreads): New function. (mergelines): New signature, to emphasize the fact that the HI area must be part of the destination. All callers changed. (sequential_sort): New function, renamed from sortlines. Merge in the functionality of sortlines_temp. (compare_nodes): New function. (lock_node, unlock_node): New functions. (queue_destroy): New function. (queue_init): New function. (queue_insert): New function. (queue_pop): New function. (write_unique): New function. (mergelines_node): New function. (check_insert): New function. (update_parent): New function. 
(merge_loop): New function. (sortlines): Rewrite to support and use parallelism, with a new signature. All callers changed. (struct thread_args): New struct. (sortlines_thread): New function. (sortlines_temp): Remove. (sort): New argument NTHREADS. All uses changed. Output moved to mergelines_node. (main): disable threading if we are sorting at random. * tests/Makefile.am (TESTS): Add misc/sort-benchmark-random. * tests/misc/sort-benchmark-random: New file. Signed-off-by: Pádraig Brady <P@draigBrady.com>
2010-07-09 08:03:50 +01:00
/* Barebones heap implementation supporting only insert and pop.

   Copyright (C) 2010-2024 Free Software Foundation, Inc.
   This program is free software: you can redistribute it and/or modify
   it under the terms of the GNU General Public License as published by
   the Free Software Foundation; either version 3 of the License, or
   (at your option) any later version.

   This program is distributed in the hope that it will be useful,
   but WITHOUT ANY WARRANTY; without even the implied warranty of
   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
   GNU General Public License for more details.

   You should have received a copy of the GNU General Public License
   along with this program.  If not, see <https://www.gnu.org/licenses/>.  */
/* Full implementation: GDSL (http://gna.org/projects/gdsl/) by Nicolas
   Darnis <ndarnis@free.fr>. */
#include <config.h>

#include "heap.h"
#include "stdlib--.h"
#include "xalloc.h"

static int heap_default_compare (void const *, void const *);
static size_t heapify_down (void **, size_t, size_t,
                            int (*) (void const *, void const *));
static void heapify_up (void **, size_t,
                        int (*) (void const *, void const *));
struct heap
{
  void **array;     /* array[0] is not used */
  size_t capacity;  /* Array size */
  size_t count;     /* Used as index to last element. Also is num of items. */
  int (*compare) (void const *, void const *);
};
/* Allocate memory for the heap. */
struct heap *
heap_alloc (int (*compare) (void const *, void const *), size_t n_reserve)
{
  struct heap *heap = xmalloc (sizeof *heap);
  if (n_reserve == 0)
    n_reserve = 1;

  heap->array = xnmalloc (n_reserve, sizeof *(heap->array));
  heap->array[0] = nullptr;
  heap->capacity = n_reserve;
  heap->count = 0;
  heap->compare = compare ? compare : heap_default_compare;

  return heap;
}

static int
heap_default_compare (void const *a, void const *b)
{
  return 0;
}

void
heap_free (struct heap *heap)
{
  free (heap->array);
  free (heap);
}

/* Insert ITEM into HEAP, growing the array first if it is full.
   Slot 0 of the array is unused, so elements occupy indices 1..count.
   Always return 0.  */

int
heap_insert (struct heap *heap, void *item)
{
  if (heap->capacity - 1 <= heap->count)
    heap->array = x2nrealloc (heap->array, &heap->capacity,
                              sizeof *(heap->array));
  heap->array[++heap->count] = item;
  heapify_up (heap->array, heap->count, heap->compare);

  return 0;
}

/* Remove and return the top element of HEAP, or a null pointer if
   HEAP is empty.  */

void *
heap_remove_top (struct heap *heap)
{
  void *top;
  if (heap->count == 0)
    return nullptr;
  top = heap->array[1];
  heap->array[1] = heap->array[heap->count--];
  heapify_down (heap->array, heap->count, 1, heap->compare);

  return top;
}

/* Move the element at index INITIAL down to its appropriate position
   among the COUNT elements of ARRAY (indexed from 1), and return its
   final index.  */

static size_t
heapify_down (void **array, size_t count, size_t initial,
              int (*compare) (void const *, void const *))
{
  void *element = array[initial];

  size_t parent = initial;
  while (parent <= count / 2)
    {
      size_t child = 2 * parent;
      if (child < count && compare (array[child], array[child + 1]) < 0)
        child++;

      if (compare (array[child], element) <= 0)
        break;

      array[parent] = array[child];
      parent = child;
    }

  array[parent] = element;
  return parent;
}

/* Move the element at index COUNT (the most recently added one) up to
   its appropriate position in ARRAY (indexed from 1).  */

static void
heapify_up (void **array, size_t count,
            int (*compare) (void const *, void const *))
{
  size_t k = count;
  void *new_element = array[k];
  while (k != 1 && compare (array[k / 2], new_element) <= 0)
    {
      array[k] = array[k / 2];
      k /= 2;
    }

  array[k] = new_element;
}
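
The file above depends on gnulib (`x2nrealloc`, `heap.h`), so it cannot be compiled on its own. As an illustration only, here is a minimal standalone sketch of the same idea: a binary heap stored in a 1-based array, where the comparison function decides which element counts as the top. The names `mini_heap`, `mini_heap_init`, `mini_heap_insert`, `mini_heap_remove_top`, `compare_ints`, and `sort_descending` are hypothetical and not part of gnulib or coreutils. Plain `calloc`/`realloc` stand in for `x2nrealloc`, so an allocation failure is reported with -1 instead of terminating the program.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct heap.  Slot 0 of ARRAY is unused, so the
   children of index i are 2*i and 2*i + 1, and the parent of k is k / 2.  */
struct mini_heap
{
  void **array;
  size_t capacity;              /* allocated slots, including slot 0 */
  size_t count;                 /* number of stored elements */
  int (*compare) (void const *, void const *);
};

static int
compare_ints (void const *a, void const *b)
{
  int x = *(int const *) a;
  int y = *(int const *) b;
  return (x > y) - (x < y);
}

static int
mini_heap_init (struct mini_heap *h, size_t n_reserve,
                int (*compare) (void const *, void const *))
{
  if (n_reserve == 0)
    n_reserve = 1;
  h->array = calloc (n_reserve, sizeof *h->array);
  if (!h->array)
    return -1;
  h->capacity = n_reserve;
  h->count = 0;
  h->compare = compare;
  return 0;
}

/* Insert ITEM, doubling the array when full, then sift it up.  */
static int
mini_heap_insert (struct mini_heap *h, void *item)
{
  if (h->capacity - 1 <= h->count)
    {
      if (h->capacity > SIZE_MAX / (2 * sizeof *h->array))
        return -1;
      void **p = realloc (h->array, 2 * h->capacity * sizeof *p);
      if (!p)
        return -1;
      h->array = p;
      h->capacity *= 2;
    }
  size_t k = ++h->count;
  while (k != 1 && h->compare (h->array[k / 2], item) <= 0)
    {
      h->array[k] = h->array[k / 2];    /* pull the parent down */
      k /= 2;
    }
  h->array[k] = item;
  return 0;
}

/* Remove and return the top element, or NULL if the heap is empty.  */
static void *
mini_heap_remove_top (struct mini_heap *h)
{
  if (h->count == 0)
    return NULL;
  void *top = h->array[1];
  void *element = h->array[h->count--];  /* last element, to be re-placed */
  size_t parent = 1;
  while (parent <= h->count / 2)
    {
      size_t child = 2 * parent;
      if (child < h->count
          && h->compare (h->array[child], h->array[child + 1]) < 0)
        child++;                         /* pick the greater child */
      if (h->compare (h->array[child], element) <= 0)
        break;
      h->array[parent] = h->array[child];
      parent = child;
    }
  h->array[parent] = element;
  return top;
}

/* Push the N ints of IN, then pop them all into OUT (greatest first).  */
static int
sort_descending (int *in, int *out, size_t n)
{
  struct mini_heap h;
  if (mini_heap_init (&h, 1, compare_ints) != 0)
    return -1;
  int status = 0;
  for (size_t i = 0; i < n && status == 0; i++)
    status = mini_heap_insert (&h, &in[i]);
  for (size_t i = 0; status == 0 && i < n; i++)
    {
      int *p = mini_heap_remove_top (&h);
      assert (p != NULL);
      out[i] = *p;
    }
  free (h.array);
  return status;
}
```

With an ordinary three-way comparison such as `compare_ints`, the element that compares greatest is removed first, so the sketch behaves as a max-heap. Passing a comparison with the opposite sign would turn it into a min-heap without touching the sift logic.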