This significantly improves performance when a number exceeds
2**(W_TYPE_SIZE - 1) and is the product of a prime less than
FIRST_OMITTED_PRIME and another prime less than 2**(W_TYPE_SIZE - 1).
On my platform, for example, it doubled the speed of factoring
4999 * (2**128 - 159).
* src/factor.c (mp_size, mp_finish_in_single): New functions.
(mp_factor_using_division, mp_factor_using_pollard_rho):
Finish using single precision when possible.
* tests/factor/factor.pl (lt-5000-times-128-bit): New test.
* src/factor.c (factor_insert_refind):
Use idx_t for indexes into primes_diff,
for consistency with other indexes into primes_diff.
This has no practical effect unless the primes_diff
table becomes unreasonably large.
Support a multiplicity argument in the mp case, too.
This helps keep the two cases in sync, for maintenance.
* src/factor.c (mp_factor_insert, mp_factor_insert_ui):
New arg M, for multiplicity. All callers changed.
Use something other than a macro when that is easy and won’t hurt
performance.
* src/factor.c (__ll_B, __ll_lowpart, __ll_highpart) [!USE_LONGLONG_H]:
(MAX_NFACTS, highbit_to_mask, factor_insert, PRIMES_PTAB_ENTRIES):
Make these enums, or constants, or static functions instead of macros.
(highbit_to_mask): Rename from HIGHBIT_TO_MASK. All uses changed.
* src/factor.c (factor_insert_multiplicity):
Adjust to keep in sync with mp_factor_insert changes below,
by adding 1 to the index and using memmove to move.
(mp_factor_insert): Omit redundant call to mpz_cmp.
Prefer idx_t (always nonnegative) to ptrdiff_t,
by adding 1 to the indexes.
Prefer mpz_init_set to mpz_init+mpz_set.
Use memmove to move, rather than doing it by hand.
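
A minimal sketch of the insertion pattern described above, assuming a
simplified single-word factor table. The names demo_factors,
demo_factor_insert and DEMO_MAX_NFACTS are hypothetical and are not
taken from src/factor.c. The scan index is kept 1 greater than the
slot being examined, so an unsigned index type never needs to go
negative, and the tail is shifted with memmove rather than by hand.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { DEMO_MAX_NFACTS = 26 };  /* hypothetical capacity for this sketch */

struct demo_factors
{
  unsigned long p[DEMO_MAX_NFACTS];  /* distinct primes, ascending */
  unsigned char e[DEMO_MAX_NFACTS];  /* multiplicity of each prime */
  size_t nfactors;
};

/* Record that PRIME divides the number M more times.  */
static void
demo_factor_insert (struct demo_factors *f, unsigned long prime,
                    unsigned char m)
{
  /* Scan down from the top.  I is 1 greater than the index being
     examined, so it stays nonnegative even when PRIME is smallest.  */
  size_t i;
  for (i = f->nfactors; 0 < i; i--)
    if (f->p[i - 1] <= prime)
      break;

  /* Already present: just bump the multiplicity.  */
  if (0 < i && f->p[i - 1] == prime)
    {
      f->e[i - 1] += m;
      return;
    }

  /* Open a gap at slot I by moving the tail up one place.  */
  assert (f->nfactors < DEMO_MAX_NFACTS);
  memmove (&f->p[i + 1], &f->p[i], (f->nfactors - i) * sizeof f->p[0]);
  memmove (&f->e[i + 1], &f->e[i], (f->nfactors - i) * sizeof f->e[0]);
  f->p[i] = prime;
  f->e[i] = m;
  f->nfactors++;
}
```

Inserting 7, then 3 twice over, then 5 leaves the table sorted as
3, 5, 7 regardless of arrival order.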
* src/factor.c (MAX_NFACTS): Allow word size of 128 bits,
even if this is only theoretical now.
Check that struct factors’s unsigned char counts won’t overflow.
* src/factor.c (USE_LONGLONG_H):
Default to false on unusual (but standard-conforming)
platforms that lack int64_t etc.
(UWtype, UHWtype): Now typedefs, not macros.
(UQItype): Remove.
(SItype, USItype, DItype, UDItype): Use standard C types.
This simplifies things slightly by using uuint for
some two-word integers.
* src/factor.c (strtouuint): Accept uuint *, not two mp_limb_t *.
All callers changed.
(print_factors_single): Accept uuint, not two limbs.
All callers changed.
(print_factors): Use simpler test for high bit,
one that need not worry about promoting to int.
Simplify by using GMP’s word type instead of pretending to roll our own.
* src/factor.c (wide_uuint): Remove. All uses replaced by mp_limb_t.
(umul_ppmm) [!umul_ppmm]: Don’t assume unsigned long is at least half
as wide as mp_limb_t. This is simpler anyway.
(strtouuint): Rename from strto2wide_uint. All uses changed.
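
For illustration only, here is the classic schoolbook way to write a
generic double-word multiply when the half-word is derived from the
word itself rather than from unsigned long. This is a sketch under
the assumption of a 64-bit word. demo_limb and demo_umul_ppmm are
hypothetical names, and the code is not claimed to match the fallback
in src/factor.c.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t demo_limb;      /* stand-in for the word type */
enum { DEMO_HALF_BITS = 32 };    /* half the width of demo_limb */

/* Set *HI and *LO to the high and low words of U * V.  */
static void
demo_umul_ppmm (demo_limb *hi, demo_limb *lo, demo_limb u, demo_limb v)
{
  demo_limb const half_mask = ((demo_limb) 1 << DEMO_HALF_BITS) - 1;
  demo_limb ul = u & half_mask, uh = u >> DEMO_HALF_BITS;
  demo_limb vl = v & half_mask, vh = v >> DEMO_HALF_BITS;

  /* Four half-by-half partial products; each fits in one word.  */
  demo_limb x0 = ul * vl;
  demo_limb x1 = ul * vh;
  demo_limb x2 = uh * vl;
  demo_limb x3 = uh * vh;

  x1 += x0 >> DEMO_HALF_BITS;   /* this addition cannot overflow */
  x1 += x2;                     /* this one can */
  if (x1 < x2)
    x3 += (demo_limb) 1 << DEMO_HALF_BITS;  /* propagate the carry */

  *hi = x3 + (x1 >> DEMO_HALF_BITS);
  *lo = (x1 << DEMO_HALF_BITS) | (x0 & half_mask);
}
```

Because every intermediate has the word type, nothing here depends on
how wide unsigned long happens to be.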
Remove experimental code for 128-bit words as it does not work and
we lack time to figure out why. Instead, ensure that words are
the same size as with GMP.
* src/factor.c (USE_INT128): Remove. All uses removed.
(wide_uint, W_TYPE_SIZE): Define to be the same as GMP.
(MP_LIMB_MAX): New macro. Check that it matches W_TYPE_SIZE.
(USE_LONGLONG_H): Default to true.
(UHWtype) [USE_LONGLONG_H]: Define to unsigned int, same as GMP.
(prime_p): Go back to not worrying about 128-bit words,
since GMP doesn’t worry and doesn’t use them.
(lbuf_putbitcnt): New function, since we cannot assume
that bitcnt_t fits into mp_limb_t.
(print_factors): Use it.
* src/make-prime-list.c (output_primes):
Don’t assume that wide_uint’s maximum is UINTMAX_MAX.
* src/factor.c (struct mp_factors): e (multiplicity) member
is now of type mp_bitcnt_t, not unsigned long int, since
its value is at most a bit count. All uses changed.
* src/factor.c (BIG_POWER_OF_10, LOG_BIG_POWER_OF_10):
Place fewer restrictions on BIG_POWER_OF_10.
This is only for currently-theoretical hosts;
it shouldn’t affect machine code on practical platforms.
* src/factor.c (wide_int): Remove, since it gets in the
way of using mp_limb_t for words. All uses removed.
(submod2, HIGHBIT_TO_MASK, divexact_21):
Rewrite without using wide_int.
This shouldn't change the machine code these days,
as compilers are pretty smart about isolating the
top bit of an unsigned int.
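
As a rough sketch of the idea, the top bit can be turned into an
all-ones or all-zeros mask with unsigned arithmetic alone, so no
signed type (and no reliance on arithmetic right shift) is needed.
demo_word and demo_highbit_to_mask are hypothetical names, and the
sketch assumes a word type at least as wide as unsigned int with no
padding bits.

```c
#include <assert.h>
#include <limits.h>

typedef unsigned long demo_word;  /* stand-in for the word type */
enum { DEMO_WORD_BITS = sizeof (demo_word) * CHAR_BIT };

/* The sketch assumes demo_word has no padding bits.  */
static_assert ((demo_word) -1 >> (DEMO_WORD_BITS - 1) == 1,
               "demo_word must not have padding bits");

/* Return all ones if the top bit of X is set, zero otherwise.
   The shift yields 0 or 1, and negating an unsigned 1 wraps
   around to the maximum value, so no signed type is involved.  */
static demo_word
demo_highbit_to_mask (demo_word x)
{
  return - (x >> (DEMO_WORD_BITS - 1));
}
```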
In practice there’s no bug but we might as well avoid the
undefined behavior.
* src/factor.c (hi_is_set): New static function.
(factor_insert_large, prime2_p, print_factors_single): Use it.
* src/local.mk: Similarly to commit v8.22-156-g09937e9d0,
track speedlist.h with nodist_src_stty_SOURCES and DISTCLEANFILES
to ensure the make distcheck manifest comparison passes.
Addresses https://bug.gnu.org/78960
* src/local.mk: Use the coarser BUILT_SOURCES mechanism
to generate speedlist.h, rather than a specific dependency
(which did seem to work for parallel builds).
Fixes https://bugs.gnu.org/78960
* src/od.c (print_function_type): New type. Use it for convenience.
(width_bytes): Omit duplicate entries, such as ‘double’ vs ‘long
double’ on macOS. Problem reported by Bruno Haible
<https://bugs.gnu.org/78933>.
(decode_one_format): Cast null pointer to print_function_type
to pacify Apple clang-1400.0.29.202.
* src/local.mk: Adjust the dependency so that speedlist.h
is built irrespective of the object file name.
Note we could use BUILT_SOURCES for this,
but it's better to have this more accurate dependency.
Reinstate check removed in commit 56aa549a0 so that we
disallow -f2 when configured with utils_cv_ieee_16_bit_supported=no.
Otherwise the output routines will consume floats,
i.e. 4 bytes at a time. Without this extra check,
tests/od/od-endian.sh fails with this configuration.
* src/od.c (decode_one_format): Reinstate the explicit check
for this configuration edge case.
Problem reported by Pádraig Brady <https://bugs.gnu.org/78880#43>.
This patch doesn’t fix any bugs; it merely pacifies GCC.
* src/od.c (ispec_to_format): New function, replacing
the old ISPEC_TO_FORMAT macro. All uses changed.
This part of the change is just refactoring.
(decode_one_format): Pacify à la ispec_to_format.
* src/od.c (width_bytes, decode_one_format): Don’t assume a signed
type has the same size as the corresponding unsigned type.
This has no effect on practical platforms; it’s just for
consistency there.
* src/od.c (address_base, address_pad_len, format_address):
Initialize statically rather than dynamically.
(limit_bytes_to_format): Remove. All uses replaced by
checking sign of end_offset.
(max_bytes_to_format): Remove static var. Now local to ‘main’.
(end_offset): -1 now means no limit. All uses changed.
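
A small sketch of the sentinel convention described above, in which a
negative end offset means "no limit" and so replaces a separate
boolean flag. The names demo_end_offset and demo_bytes_to_read, and
the choice of intmax_t, are assumptions made for illustration and are
not taken from src/od.c.

```c
#include <assert.h>
#include <stdint.h>

/* Offset at which to stop formatting, or -1 for no limit.  */
static intmax_t demo_end_offset = -1;

/* Return how many of the WANT bytes starting at CURRENT_OFFSET
   should be formatted, honoring demo_end_offset.  */
static intmax_t
demo_bytes_to_read (intmax_t current_offset, intmax_t want)
{
  /* The sign of the end offset doubles as the "is limited" flag.  */
  if (demo_end_offset < 0)
    return want;

  intmax_t remaining = demo_end_offset - current_offset;
  if (remaining < 0)
    remaining = 0;
  return remaining < want ? remaining : want;
}
```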