exp2() Base-2 Exponential Function
exp2n() Function
exp10() Base-10 Exponential Function
exp() Base-e Exponential Function
expm1() Base-e Exponential Function
log() Base-e alias Natural Logarithm Function
log1p() Base-e alias Natural Logarithm Function
log10() Base-10 alias Common Logarithm Function
log2() Base-2 alias Binary Logarithm Function
logb() Function
ilogb() Function
cos() (Circular) Cosine Function
cot() (Circular) Cotangent Function
sin() (Circular) Sine Function
tan() (Circular) Tangent Function
acos() Arc Cosine Function
acot() Arc Cotangent Function
acot2() Arc Cotangent Function
asin() Arc Sine Function
atan() Arc Tangent Function
atan2() Arc Tangent Function
cosh() Hyperbolic Cosine Function
coth() Hyperbolic Cotangent Function
sinh() Hyperbolic Sine Function
tanh() Hyperbolic Tangent Function
acosh() Area Hyperbolic Cosine Function
acoth() Area Hyperbolic Cotangent Function
asinh() Area Hyperbolic Sine Function
atanh() Area Hyperbolic Tangent Function
fmax() Function
fmin() Function
hypot() Function
pow() Function
cbrt() Function
ceil() Function
fabs() Function
fdim() Function
floor() Function
fma() Function
fmod() Function
fpclassify() Function
frexp() Function
isfinite() Function
isinf() Function
isnan() Function
isnormal() Function
issubnormal() Function
ldexp() Function
ldexp10() Function
remainder() Function
remquo() Function
rint() Function
round() Function
roundeven() Function
signbit() Function
sqrt() Function
trunc() Function
ceil() Function
copysign() Function
floor() Function
frexp() Function
ldexp() Function
modf() Function
nextafter() Function
rint() Function
round() Function
trunc() Function
Elementary mathematical plus other functions defined by the ANSI C, ISO C and POSIX standards, using IEEE 754 floating-point arithmetic.
Positive zero (+0) is represented with sign = 0, exponent = 0 and fraction = 0; negative zero (−0) is represented with sign = 1, exponent = 0 and fraction = 0.
quiet NaN is represented with either sign, exponent = 2047 and fraction > 2^51 − 1, i.e. the most significant bit of fraction set;
signaling NaN is represented with either sign, exponent = 2047 and 0 < fraction < 2^51, i.e. the most significant bit of fraction clear.
The fraction of a non-zero finite floating-point number is a rational number from the set {½, ¼, ¾, …, 1/2^52, …, (2^52 − 1)/2^52}.
The significand = integer.fraction of a non-zero finite floating-point number is in the interval [2^−52, 2 − 2^−52], decimal
[0.0000000000000002220446049250313080847263336181640625, 1.9999999999999997779553950749686919152736663818359375];
a normalized significand is in the interval [1, 2 − 2^−52].
Representable (non-zero finite) floating-point numbers, also called machine numbers, are in the non-contiguous intervals
[2^−1074, (2 − 2^−52) × 2^1023] and [−2^−1074, −(2 − 2^−52) × 2^1023];
the full (normalized) 53-bit significand is available on the intervals
[2^−1022, (2 − 2^−52) × 2^1023] and [−2^−1022, −(2 − 2^−52) × 2^1023].
The largest representable floating-point number, (2 − 2^−52) × 2^1023 = 0.179769… × 10^309, has 309 (integral) decimal digits:
179769313486231570814527423731704356798070567525844996598917476803157260780028538760589558632766878171540458953514382464234321326889464182768467546703537516986049910576551282076245490090389328944075868508455133942304583236903222948165808559332123348274797826204144723168738177180919299881250404026184124858368
The smallest positive representable floating-point number, 2^−1074 = 0.494065… × 10^−323, has 1074 fractional decimal digits, 323 zeroes followed by 751 more digits:
0.000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000004940656458412465441765687928682213723650598026143247644255856825006755072702087518652998363616359923797965646954457177309266567103559397963987747960107818781263007131903114045278458171678489821036887186360569987307230500063874091535649843873124733972731696151400317153853980741262385655911710266585566867681870395603106249319452715914924553293054565444011274801297099995419319894090804165633245247571478690147267801593552386115501348035264934720193790268107107491703332226844753335720832431936092382893458368060106011506169809753078342277318329247904982524730776375927247874656084778203734469699533647017972677717585125660551199131504891101451037862738167250955837389733598993664809941164205702637090279242767544565229087538682506419718265533447265625
The largest representable subnormal floating-point number, (1 − 2^−52) × 2^−1022 = 0.222507… × 10^−307, has 1074 fractional decimal digits, 307 zeroes followed by 767 more digits:
0.000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000022250738585072008890245868760858598876504231122409594654935248025624400092282356951787758888037591552642309780950434312085877387158357291821993020294379224223559819827501242041788969571311791082261043971979604000454897391938079198936081525613113376149842043271751033627391549782731594143828136275113838604094249464942286316695429105080201815926642134996606517803095075913058719846423906068637102005108723282784678843631944515866135041223479014792369585208321597621066375401613736583044193603714778355306682834535634005074073040135602968046375918583163124224521599262546494300836851861719422417646455137135420132217031370496583210154654068035397417906022589503023501937519773030945763173210852507299305089761582519159720757232455434770912461317493580281734466552734375
Note: the decimal representation of 2^−n has n fractional digits!
To maintain the working precision and gain correctly
rounded results, calculations are performed with 3 extra bits beyond
the least significant bit of the fraction: a
guard bit, a round bit and a sticky
bit (alias inexact
flag).
Basic arithmetic operations, i.e. addition, subtraction,
multiplication, division, fused multiply-accumulate, plus square
root on (representable) floating-point numbers, including the
special values +∞, −∞ and
NaN, are performed
as if their mathematical exact (infinitely precise) result is
calculated, then mapped or rounded to a representable floating-point
number.
Arithmetic underflow yields +0 or −0, arithmetic overflow
yields +∞ or −∞, operations on
NaNs as well as
mathematically undefined operations yield
NaN, with the notable
exception that division of a non-zero finite floating-point number
by ±0 yields ±∞, and non-zero finite results
are rounded according to the selected rounding mode:
round to nearest: in case of a tie (tie-break) the result is rounded towards zero for even ⌊precise × 2^52⌋ and away from zero for odd ⌊precise × 2^52⌋ (i.e. even ⌈precise × 2^52⌉), known as round to nearest, ties to even!
For arithmetic operations on special values the following identities are defined:
The maximum (relative) error of a faithfully rounded result is less than 1 ULP; the maximum (relative) error of a correctly rounded result is less than ½ ULP.
Note: in all rounding modes, a faithfully rounded result is either equal to the correctly rounded result or 1 ULP off of the correctly rounded result.
The following program uses nextafter() to test whether mathematical identities hold for various elementary functions and the (correctly rounded) values of some of the constants M_* defined by the ANSI C, ISO C and POSIX standards:
// Copyright © 2004-2024, Stefan Kanthak <stefan.kanthak@nexgo.de>
// * The software is provided "as is" without any warranty, neither
// express nor implied.
// * In no event will the author be held liable for any damage(s) arising
// from the use of the software.
// * Redistribution of the software is allowed only in unmodified form.
// * Permission is granted to use the software solely for personal private
// and non-commercial purposes.
// * An individuals use of the software in his or her capacity or function
// as an agent, (independent) contractor, employee, member or officer of
// a business, corporation or organization (commercial or non-commercial)
// does not qualify as personal private and non-commercial purpose.
// * Without written approval from the author the software must not be used
// for a business, for commercial, corporate, governmental, military or
// organizational purposes of any kind, or in a commercial, corporate,
// governmental, military or organizational environment of any kind.
#include <math.h>
#include <stdio.h>
void evaluate(double (*function)(double), double input, double reference)
{
double output = (function)(input);
if (output == reference)
printf("%.17g is correctly rounded\n", output);
else if (nextafter(output, reference) == reference)
printf("%.17g is faithfully rounded\n", output);
else
printf("%.17g is …\n", output);
}
int main(void)
{
double last, next = 0.0;
printf("sqrt(-0): ");
evaluate(sqrt, -0.0, -0.0);
#ifdef INFINITY
printf("log(0): ");
evaluate(log, 0.0, -INFINITY);
#endif
#ifdef M_PI
printf("acos(0): ");
evaluate(acos, 0.0, M_PI / 2.0);
printf("acos(0.5): ");
evaluate(acos, 0.5, M_PI / 3.0);
printf("asin(0.5): ");
evaluate(asin, 0.5, M_PI / 6.0);
#endif
#ifdef M_SQRT1_2
printf("sqrt(0.5): ");
evaluate(sqrt, 0.5, M_SQRT1_2);
#endif
#ifdef M_PI_2
printf("asin(1): ");
evaluate(asin, 1.0, M_PI_2);
#endif
#ifdef M_PI_4
printf("atan(1): ");
evaluate(atan, 1.0, M_PI_4);
#endif
#ifdef M_E
printf("exp(1): ");
evaluate(exp, 1.0, M_E);
#endif
printf("log(1): ");
evaluate(log, 1.0, 0.0);
printf("log2(1): ");
evaluate(log2, 1.0, 0.0);
printf("log10(1): ");
evaluate(log10, 1.0, 0.0);
#ifdef M_SQRT2
printf("sqrt(2): ");
evaluate(sqrt, 2.0, M_SQRT2);
#endif
printf("log2(2): ");
evaluate(log2, 2.0, 1.0);
#ifdef M_LN2
printf("log(2): ");
evaluate(log, 2.0, M_LN2);
printf("exp(%.17g): ", M_LN2);
evaluate(exp, M_LN2, 2.0);
#endif
printf("log10(10): ");
evaluate(log10, 10.0, 1.0);
#ifdef M_LN10
printf("log(10): ");
evaluate(log, 10.0, M_LN10);
printf("exp(%.17g): ", M_LN10);
evaluate(exp, M_LN10, 10.0);
#endif
#ifdef M_LOG2E
printf("exp2(%.17g): ", M_LOG2E);
evaluate(exp2, M_LOG2E, M_E);
#endif
#ifdef M_LOG10E
printf("exp10(%.17g): ", M_LOG10E);
evaluate(exp10, M_LOG10E, M_E);
#endif
#ifdef M_E
printf("log(%.17g): ", M_E);
// log(2.7182818284590452) = 1.0
evaluate(log, M_E, 1.0);
#ifdef M_LOG2E
printf("log2(%.17g): ", M_E);
// log2(2.7182818284590452) = 1.0 / log(2.0)
evaluate(log2, M_E, M_LOG2E);
#endif
#ifdef M_LOG10E
printf("log10(%.17g): ", M_E);
// log10(2.7182818284590452) = 1.0 / log(10.0)
evaluate(log10, M_E, M_LOG10E);
#endif
#endif
#ifdef M_PI_4
#ifdef M_SQRT1_2
printf("cos(%.17g): ", M_PI_4);
evaluate(cos, M_PI_4, M_SQRT1_2);
printf("sin(%.17g): ", M_PI_4);
evaluate(sin, M_PI_4, M_SQRT1_2);
#endif
printf("tan(%.17g): ", M_PI_4);
evaluate(tan, M_PI_4, 1.0);
#endif
#ifdef M_PI_2
printf("cos(%.17g): ", M_PI_2);
// cos(1.5707963267948966) = 6.123233995736766e-17
evaluate(cos, M_PI_2, 0.0);
printf("sin(%.17g): ", M_PI_2);
evaluate(sin, M_PI_2, 1.0);
#ifdef INFINITY
printf("tan(%.17g): ", M_PI_2);
// tan(1.5707963267948966) = 1.633123935319537e16
evaluate(tan, M_PI_2, INFINITY);
#endif
#endif
#ifdef M_PI
printf("cos(%.17g): ", M_PI);
evaluate(cos, M_PI, -1.0);
printf("sin(%.17g): ", M_PI);
// sin(3.1415926535897932) = 1.2246467991473532e-16
evaluate(sin, M_PI, 0.0);
printf("tan(%.17g): ", M_PI);
evaluate(tan, M_PI, 0.0);
#endif
do next = cos(last = next);
while (next != last);
printf("cos(%.17g): ", last);
// cos(0.73908513321516064) = 0.73908513321516064
evaluate(cos, last, 0.73908513321516064);
printf("acos(%.17g): ", last);
// acos(0.73908513321516064) = 0.73908513321516064
evaluate(acos, last, 0.73908513321516064);
}
The fixed point of the cosine, 0.739085…, is known as the Dottie number: A003957 - OEIS
cc -lm evaluate.c
./a.out
sqrt(-0): -0 is correctly rounded
log(0): -inf is correctly rounded
acos(0): 1.5707963267948966 is correctly rounded
acos(0.5): 1.0471975511965979 is faithfully rounded
asin(0.5): 0.52359877559829893 is faithfully rounded
sqrt(0.5): 0.70710678118654757 is correctly rounded
asin(1): 1.5707963267948966 is correctly rounded
atan(1): 0.78539816339744828 is correctly rounded
exp(1): 2.7182818284590451 is correctly rounded
log(1): 0 is correctly rounded
log2(1): 0 is correctly rounded
log10(1): 0 is correctly rounded
sqrt(2): 1.4142135623730951 is correctly rounded
log2(2): 1 is correctly rounded
log(2): 0.69314718055994529 is correctly rounded
exp(0.69314718055994529): 2 is correctly rounded
log10(10): 1 is correctly rounded
log(10): 2.3025850929940459 is correctly rounded
exp(2.3025850929940459): 10.000000000000002 is faithfully rounded
exp2(1.4426950408889634): 2.7182818284590451 is correctly rounded
exp10(0.43429448190325182): 2.7182818284590451 is correctly rounded
log(2.7182818284590451): 1 is correctly rounded
log2(2.7182818284590451): 1.4426950408889634 is correctly rounded
log10(2.7182818284590451): 0.43429448190325182 is correctly rounded
cos(0.78539816339744828): 0.70710678118654757 is correctly rounded
sin(0.78539816339744828): 0.70710678118654746 is faithfully rounded
tan(0.78539816339744828): 0.99999999999999989 is faithfully rounded
cos(1.5707963267948966): 6.123233995736766e-17 is …
sin(1.5707963267948966): 1 is correctly rounded
tan(1.5707963267948966): 16331239353195370 is …
cos(3.1415926535897931): -1 is correctly rounded
sin(3.1415926535897931): 1.2246467991473532e-16 is …
tan(3.1415926535897931): -1.2246467991473532e-16 is …
cos(0.73908513321516067): 0.73908513321516067 is correctly rounded
acos(0.73908513321516067): 0.73908513321516056 is faithfully rounded
Note: the (correctly rounded) value of the constant
M_PI = 0x1.921FB54442D18p+1 = 3.1415926535897932 alias machine π is about 0x1.1A62633145C07p−53 = 1.2246467991473532e−16 less than the exact value of π, and the (correctly rounded) value of the constant
M_PI_2 = 0x1.921FB54442D18p+0 = 1.5707963267948966 is about 0x1.1A62633145C07p−54 = 6.123233995736766e−17 less than the exact value of π/2;
this is why sin(M_PI) and cos(M_PI_2) yield these small positive numbers instead of 0.
As shown by William Kahan (nearpi.c), the double-precision floating-point number that is closest to an integral multiple of π/2 is the (integral) number
6381956970095103 × 2^797 = 0x1.6AC5B262CA1FFp+849 = 5.319372648326541416707296656673541083813475…e+255,
which is about 4.68716592425462761112…e−19 less than the exact integral multiple of π/2; the maximum value of the double-precision tangent is therefore about 2.13348538575370384368…e+18.
// Copyright © 2004-2024, Stefan Kanthak <stefan.kanthak@nexgo.de>
#define FLT_RADIX 2
#define FLT_ROUNDS 1 // round to nearest, ties to even
#define FP_ILOGB0 (-2147483647 - 1) // type int; -2147483648 is parsed as -(2147483648) with a wider type
#define FP_ILOGBNAN 1024
#define FP_ZERO 0
#define FP_SUBNORMAL 1
#define FP_NORMAL 2
#define FP_INFINITE 3
#define FP_NAN 4
#ifndef INFINITY
#define INFINITY (1.0 / 0.5e-323)
#endif
#define INDEFINITE (0.0 * INFINITY)
#define MATH_ERREXCEPT 1
#define MATH_ERRNO 0
#define math_errhandling (MATH_ERREXCEPT | MATH_ERRNO)
double acos(double argument);
double acosh(double argument);
double acot(double argument);
double acot2(double y, double x);
double acoth(double argument);
double asin(double argument);
double asinh(double argument);
double atan(double argument);
double atan2(double y, double x);
double atanh(double argument);
double ceil(double argument);
double copysign(double to, double from);
double cos(double radians);
double exp(double argument);
double expm1(double argument);
double exp10(double argument);
double exp2(double argument);
double exp2n(int exponent);
double fabs(double argument);
double fdim(double left, double right);
double floor(double argument);
double fma(double multiplicand, double multiplier, double addend);
double fmax(double left, double right);
double fmin(double left, double right);
double fmod(double dividend, double divisor);
int fpclassify(double argument);
double frexp(double argument, int *exponent);
double hypot(double left, double right);
int ilogb(double argument);
int isfinite(double argument);
int isinf(double argument);
int isnan(double argument);
int isnormal(double argument);
int issubnormal(double argument);
double ldexp(double argument, int exponent);
double ldexp10(double argument, int exponent);
double log(double argument);
double log1p(double argument);
double log10(double argument);
double log2(double argument);
double logb(double argument);
double modf(double argument, double *integer);
double nextafter(double from, double to);
double remainder(double dividend, double divisor);
double remquo(double dividend, double divisor, int *quotient);
double rint(double argument);
double round(double argument);
int signbit(double argument);
double sin(double radians);
double sqrt(double radicand);
double tan(double radians);
double trunc(double argument);
Note: indicated by the value 1 of the preprocessor macro FLT_ROUNDS, the functions presented here require the default rounding mode round to nearest, ties to even!
Note: indicated by the value 0 of the preprocessor macro MATH_ERRNO, the functions presented here don’t set the (global) errno variable!
The exponential function with base r, alias antilog, exhibits the identities r^(a+b) = r^a × r^b, r^(log_r c) = c and r^(d × log_r c) = c^d.
The exponential function can be approximated by a (minimax) polynomial on any sufficiently small interval with high accuracy, for example faithfully rounded, as shown hereafter.
exp2() Base-2 Exponential Function
exp2() returns the base-2 exponential of its argument.
For −1075 < x = y + z < 1024, with z = ⌊x⌋, i.e. x rounded down towards −∞, hence 0 ≤ y ≤ 1, calculation of 2^x = 2^(y+z) = 2^y × 2^z is reduced to the (polynomial) approximation of 2^y on the interval [0, 1], followed by the (trivial) multiplication with 2^z.
// Copyright © 2004-2024, Stefan Kanthak <stefan.kanthak@nexgo.de>
double floor(double x);
double ldexp(double x, int z);
// Faithfully rounded base-2 exponential
double exp2(double x)
{
double z;
#ifdef OPTIONAL
#define INFINITY (1.0 / 0.5e-323)
#define INDEFINITE (0.0 * INFINITY)
#define M_SQRT2 1.41421356237309505
#define M_1_SQRT2 0.70710678118654752
if (x != x)
return INDEFINITE;
if (x <= -1075.0)
return 0.0;
if (x == -1.0)
return 0.5;
if (x == -0.5)
return M_1_SQRT2;
if (x == 0.0)
return 1.0;
if (x == 0.5)
return M_SQRT2;
if (x == 1.0)
return 2.0;
if (x >= 1024.0)
return INFINITY;
#endif
// for z = floor(x) and x' = x - z, 2**x = 2**(x' + z)
// = 2**x' * 2**z
z = floor(x);
x -= z;
// for 0 <= x' <= 1.0,
// a minimax polynomial of degree 11 approximates 2**x'
// with relative error 3.0545878321297965e-18 < 2**-58
return ldexp(((((((((((+6.2724342467963420e-10 * x
+6.5544572890888113e-9) * x
+1.0254457347176946e-7) * x
+1.3208193500307799e-6) * x
+1.5253190248422251e-5) * x
+1.5403511446514356e-4) * x
+1.3333558661574856e-3) * x
+9.6181290987926433e-3) * x
+5.5504108665711137e-2) * x
+2.4022650695905471e-1) * x
+6.9314718055994623e-1) * x
+1.0, (int) z);
}
Note: overflow and underflow are handled by the ldexp() alias scalbn() function!
For −1075 < x = y + z < 1024, with z = ⌊x+½⌋ for x > 0 and z = ⌈x−½⌉ for x < 0, i.e. x rounded to the nearest (even) integral number, hence −½ ≤ y ≤ ½, calculation of 2^x = 2^(y+z) = 2^y × 2^z is reduced to the (polynomial) approximation of 2^y on the interval [−½, ½], followed by the (trivial) multiplication with 2^z.
# Copyright © 2004-2024, Stefan Kanthak <stefan.kanthak@nexgo.de>
# Faithfully rounded base-2 exponential
# CAVEAT: requires default (round to nearest, ties to even) rounding mode!
# exp2(-INFINITY) = 0
# exp2(0) = 1
# exp2(1) = 2
# exp2(INFINITY) = INFINITY
# exp2(x) = 2**x
# = 2**(x - z) * 2**z, -1075 < z = rint(x) < 1024
# exp2(-x) = 1 / exp2(x)
# = 1 / 2**x
# = (1 / 2)**x
# IEEE 754 double-precision binary floating-point format:
# - 1-bit sign,
# - 11-bit characteristic is 1023 + exponent,
# - 53-bit significand is 0.fraction if 0 = characteristic,
# 1.fraction if 0 < characteristic < 2047,
# 1.anything if characteristic = 2047,
# - integer bit of significand is implied and not stored
#
# binary64 = (-1)**sign * significand * 2**(characteristic - 1023)
.arch generic64
.code64
.equiv BIAS, 1023
.intel_syntax noprefix
.text
# xmm0 = argument
exp2:
xorpd xmm1, xmm1 # xmm1 = 0.0
comisd xmm1, xmm0
jz .Lspecial # argument = ±0.0?
# argument = INDEFINITE?
.ifdef SSE4_1
roundsd xmm1, xmm0, 0 # xmm1 = argument rounded to nearest (even) integer
cvtsd2si eax, xmm1 # eax = lrint(argument)
.else
cvtsd2si eax, xmm0 # eax = lrint(argument)
.endif
# neg eax
# jo .Lrange # argument > maximum 32-bit integer?
# # argument < minimum 32-bit integer?
# neg eax
cmp eax, 1 - 52 - BIAS
jl .Lunderflow # argument < -1074.0?
# argument < minimum 32-bit integer?
# argument > maximum 32-bit integer?
cmp eax, BIAS
jg .Loverflow # argument > 1023.0?
cvtsi2sd xmm1, eax # xmm1 = rint(argument)
# = log2(scale factor)
subsd xmm0, xmm1 # xmm0 = argument - rint(argument)
# = argument' in [-0.5, 0.5]
.Lhorner:
mov rcx, 0x3DFE7AA0E43A8B3C
movq xmm1, rcx # xmm1 = 0x1.E7AA0E43A8B3Cp-32
# = 4.435280790456428e-10
mulsd xmm1, xmm0
mov rdx, 0x3E3E620FB7BAEC69
movq xmm2, rdx # xmm2 = 0x1.E620FB7BAEC69p-28
# = 7.074105630863329e-9
addsd xmm2, xmm1
mulsd xmm2, xmm0
mov rcx, 0x3E7B526788BF2851
movq xmm1, rcx # xmm1 = 0x1.B526788BF2851p-24
# = 1.0178198034320939e-7
addsd xmm1, xmm2
mulsd xmm1, xmm0
mov rdx, 0x3EB62BFC3C1C57DD
movq xmm2, rdx # xmm2 = 0x1.62BFC3C1C57DDp-20
# = 1.3215433089567188e-6
addsd xmm2, xmm1
mulsd xmm2, xmm0
mov rcx, 0x3EEFFCBFBA7B8470
movq xmm1, rcx # xmm1 = 0x1.FFCBFBA7B847p-17
# = 1.5252733489958518e-5
addsd xmm1, xmm2
mulsd xmm1, xmm0
mov rdx, 0x3F243091310BF6C4
movq xmm2, rdx # xmm2 = 0x1.43091310BF6C4p-13
# = 1.5403530462514668e-4
addsd xmm2, xmm1
mulsd xmm2, xmm0
mov rcx, 0x3F55D87FE78CF26E
movq xmm1, rcx # xmm1 = 0x1.5D87FE78CF26Ep-10
# = 1.3333558146789953e-3
addsd xmm1, xmm2
mulsd xmm1, xmm0
mov rdx, 0x3F83B2AB6FB9F413
movq xmm2, rdx # xmm2 = 0x1.3B2AB6FB9F413p-7
# = 9.618129107588335e-3
addsd xmm2, xmm1
mulsd xmm2, xmm0
mov rcx, 0x3FAC6B08D7049FD0
movq xmm1, rcx # xmm1 = 0x1.C6B08D7049FDp-5
# = 5.5504108664819921e-2
addsd xmm1, xmm2
mulsd xmm1, xmm0
mov rdx, 0x3FCEBFBDFF82C5AD
movq xmm2, rdx # xmm2 = 0x1.EBFBDFF82C5ADp-3
# = 2.4022650695910156e-1
addsd xmm2, xmm1
mulsd xmm2, xmm0
mov rcx, 0x3FE62E42FEFA39EF
movq xmm1, rcx # xmm1 = 0x1.62E42FEFA39EFp-1
# = 6.9314718055994533e-1
addsd xmm1, xmm2
mulsd xmm1, xmm0
mov rdx, 0x3FF0000000000000
movq xmm0, rdx # xmm0 = 0x1.0p+0
# = 1.0
addsd xmm0, xmm1 # xmm0 = polynomial(argument')
.Lscale:
add eax, BIAS # eax = biased exponent of scale factor
jle .Ldenormal
.Lnormal:
shl rax, 52
movq xmm1, rax # xmm1 = 2.0**unbiased exponent
# = scale factor
mulsd xmm0, xmm1 # xmm0 = polynomial(argument')
# * scale factor
# = exp2(argument)
ret
.Ldenormal:
add eax, 51 # eax = 51 + biased exponent of denormal scale factor
# = index of '1' bit in mantissa
xor edx, edx
bts rdx, rax # rdx = denormal scale factor
movq xmm1, rdx # xmm1 = denormal scale factor
mulsd xmm0, xmm1 # xmm0 = polynomial(argument')
# * denormal scale factor
# = exp2(argument)
ret
.Lunderflow:
# comisd xmm1, xmm0
# jb .Loverflow # argument > 0.0?
#
# xorpd xmm0, xmm0 # xmm0 = 0.0
# # = exp2(<=-1074.0)
# ret
.Loverflow:
# mov rax, 0x7FF0000000000000
# movq xmm0, rax # xmm0 = 0x1.0p+1024
# # = INFINITY
# # = exp2(>=1024.0)
# ret
.Lrange:
comisd xmm1, xmm0
sbb eax, eax # eax = (argument < 0.0) ? 0 : -1
shr eax, 21 # rax = (argument < 0.0) ? 0 : 0x7FF
shl rax, 52 # rax = (argument < 0.0) ? 0 : 0x7FF0000000000000
movq xmm0, rax # xmm0 = (argument < 0.0) ? 0.0 : 0x1.0p+1024
# = (argument < 0.0) ? 0.0 : INFINITY
ret
.Lspecial:
jp .Lexit # argument = INDEFINITE?
.Lzero:
mov rax, 0x3FF0000000000000
movq xmm0, rax # xmm0 = 0x1.0p+0
# = 1.0
# = exp2(±0.0)
.Lexit:
ret
.size exp2, .-exp2
.type exp2, @function
.global exp2
.end
Note: the trivial transformation of the assembler sources with directives for Unix’ or GNU’s as into assembler sources for Microsoft’s ML.EXE or ML64.EXE and vice versa is left as an exercise to the reader.
Microsoft Macro Assembler Reference
; Copyright © 2004-2024, Stefan Kanthak <stefan.kanthak@nexgo.de>
; https://msdn.microsoft.com/en-us/library/mt720713.aspx
; exp2(x) = 2**x
.686
.model flat, C
.code
exp2 proc public ; [esp+4] = argument
fld real8 ptr [esp+4] ; st(0) = exponent
if 0
fld1 ; st(0) = 1.0,
; st(1) = exponent
fld st(1) ; st(0) = exponent,
; st(1) = 1.0,
; st(2) = exponent
fprem ; st(0) = exponent modulo 1.0,
; st(1) = 1.0,
; st(2) = exponent
f2xm1 ; st(0) = 2.0**(exponent modulo 1.0) - 1.0,
; st(1) = 1.0,
; st(2) = exponent
faddp st(1), st(0) ; st(0) = 2.0**(exponent modulo 1.0),
; st(1) = exponent
fscale ; st(0) = 2.0**exponent,
; st(1) = exponent
else
fld st(0) ; st(0) = st(1) = exponent
frndint ; st(0) = integer(exponent),
; st(1) = exponent
fsub st(1), st(0) ; st(0) = integer(exponent),
; st(1) = fraction(exponent)
fxch st(1) ; st(0) = fraction(exponent),
; st(1) = integer(exponent)
f2xm1 ; st(0) = 2.0**fraction(exponent) - 1.0,
; st(1) = integer(exponent)
fld1 ; st(0) = 1.0,
; st(1) = 2.0**fraction(exponent) - 1.0,
; st(2) = integer(exponent)
faddp st(1), st(0) ; st(0) = 2.0**fraction(exponent),
; st(1) = integer(exponent)
fscale ; st(0) = 2.0**exponent,
; st(1) = integer(exponent)
endif
fstp st(1) ; st(0) = 2.0**exponent
ret
exp2 endp
end
exp2n() Function
exp2n(‹integer›) is equivalent to ldexp(1.0, ‹integer›).
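// Copyright © 2004-2024, Stefan Kanthak <stefan.kanthak@nexgo.de>
#include <string.h>
#define INFINITY (1.0 / 0.5e-323)
double exp2n(int exponent)
{
    unsigned long long ull;
    double result;
    if (exponent > 1023)
        return INFINITY;
    if (exponent < -1074)
        return 0.0;
    if (exponent < -1022) {
        // subnormal: a single bit set in the fraction
        ull = 1;
        ull <<= 1074 + exponent;
    } else {
        // normal: biased exponent, fraction 0
        ull = 1023 + exponent;
        ull <<= 52;
    }
    // copy the bit pattern: dereferencing a cast pointer
    // would violate the aliasing rules of ISO C
    memcpy(&result, &ull, sizeof result);
    return result;
}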
# Copyright © 2004-2024, Stefan Kanthak <stefan.kanthak@nexgo.de>
# Unix System V calling convention for AMD64 platform:
# - first 6 floating-point arguments (from left to right) are passed in
# registers XMM0 to XMM5;
# - first 6 integer or pointer arguments (from left to right) are passed
# in registers RDI/R7, RSI/R6, RDX/R2, RCX/R1, R8 and R9
# (R10 is used as static chain pointer in case of nested functions);
# - surplus arguments are pushed on stack in reverse order (from right to
# left), 8-byte aligned;
# - 128-bit integer arguments are passed as pair of 64-bit integer arguments,
# low part before/below high part;
# - 128-bit integer result is returned in registers RAX/R0 (low part) and
# RDX/R2 (high part);
# - 64-bit integer or pointer result is returned in register RAX/R0;
# - 32-bit integer result is returned in register EAX;
# - floating-point result is returned in register XMM0;
# - registers RBX/R3, RSP/R4, RBP/R5 and R12 to R15 must be preserved;
# - registers RAX/R0, RCX/R1, RDX/R2, RSI/R6, RDI/R7, R8, R9, R10 (in
# case of normal functions), R11 and XMM0 to XMM15 are volatile and can
# be clobbered;
# - stack is 16-byte aligned: callee must decrement RSP by 8+n*16 bytes
# before calling other functions (CALL instruction pushes 8 bytes);
# - a "red zone" of 128 bytes below the stack pointer can be clobbered.
# exp2n(<-1074) = 0
# exp2n(0) = 1
# exp2n(>1023) = INFINITY
# exp2n(n) = 2**n
# exp2n(-n) = 1 / exp2n(n)
# = 1 / 2**n
# = (1 / 2)**n
.arch generic64
.code64
.equiv BIAS, 1023
.intel_syntax noprefix
.text
# edi = exponent
exp2n:
mov eax, edi # eax = exponent
cmp eax, BIAS
jg .Loverflow # exponent > 1023?
cmp eax, 1 - 52 - BIAS
jl .Lunderflow # exponent < -1074?
add eax, BIAS # eax = biased exponent
jg .Lnormal # biased exponent > 0?
.Ldenormal:
add eax, 51 # eax = index of '1' bit in mantissa
xor edi, edi
bts rdi, rax # rdi = denormal 2.0**exponent
movq xmm0, rdi # xmm0 = denormal 2.0**exponent
ret
.Loverflow:
mov eax, 1 + 2 * BIAS
# rax = biased exponent
# = 2047
.Lnormal:
shl rax, 52
movq xmm0, rax # xmm0 = 2.0**exponent
ret
.Lunderflow:
xorpd xmm0, xmm0 # xmm0 = 0.0
# = exp2n(<-1074)
ret
.size exp2n, .-exp2n
.type exp2n, @function
.global exp2n
.end
; Copyright © 2004-2024, Stefan Kanthak <stefan.kanthak@nexgo.de>
; Microsoft calling convention for AMD64 platform:
; - first 4 arguments (from left to right) are passed in registers
; RCX/R1 or XMM0, RDX/R2 or XMM1, R8 or XMM2, and R9 or XMM3,
; depending on their type (for floating-point arguments of
; unprototyped or variadic functions, where argument type
; expected by callee is unknown, both registers are used);
; - arguments larger than 8 bytes are passed by reference;
; - surplus arguments are pushed on stack in reverse order (from
; right to left), 8-byte aligned;
; - caller allocates memory for return value larger than 8 bytes and
; passes pointer to it as (hidden) first argument, thus shifting
; all other arguments;
; - caller always allocates "home space" for 4 arguments on stack,
; even when less than 4 arguments are passed, but does not need to push
; first 4 arguments;
; - callee can spill first 4 arguments from registers to "home space";
; - callee can clobber "home space";
; - stack is 16-byte aligned: callee must decrement RSP by 8+n*16
; bytes when it calls other functions (CALL instruction pushes 8 bytes);
; - integer or pointer result is returned in register RAX/R0;
; - floating-point result is returned in register XMM0;
; - registers RAX/R0, RCX/R1, RDX/R2, R8, R9, R10, R11 and XMM0 to
; XMM5 are volatile and can be clobbered;
; - registers RBX/R3, RSP/R4, RBP/R5, RSI/R6, RDI/R7, R12, R13, R14,
; R15 and XMM6 to XMM15 must be preserved.
; exp2n(<-1074) = 0
; exp2n(0) = 1
; exp2n(>1023) = INFINITY
; exp2n(x) = 2**x
; exp2n(-x) = 1 / exp2n(x)
; = 1 / 2**x
; = (1 / 2)**x
.code
double record sign:1, exponent:11, mantissa:52
bias equ 1 shl (width exponent - 1) - 1
exp2n proc public ; ecx = exponent
mov eax, ecx ; eax = exponent
cmp eax, bias
jg Loverflow ; exponent > 1023?
cmp eax, 1 - width mantissa - bias
jl Lunderflow ; exponent < -1074?
add eax, bias ; eax = biased exponent
jg Lnormal ; biased exponent > 0?
Ldenormal:
add eax, width mantissa - 1 ; eax = index of '1' bit in mantissa
xor ecx, ecx
bts rcx, rax ; rcx = denormal 2.0**exponent
movd xmm0, rcx ; xmm0 = denormal 2.0**exponent
ret
Loverflow:
mov eax, bias * 2 + 1 ; rax = biased exponent
; = 2047
Lnormal:
shl rax, width mantissa
movd xmm0, rax ; xmm0 = 2.0**exponent
ret
Lunderflow:
xorpd xmm0, xmm0 ; xmm0 = 0.0
; = exp2n(<-1074)
ret
exp2n endp
end
; Copyright © 2004-2024, Stefan Kanthak <stefan.kanthak@nexgo.de>
; Common "cdecl" calling and naming convention for i386 platform:
; - arguments are pushed on stack in reverse order (from right to left),
; 4-byte aligned;
; - 64-bit integer arguments are passed as pair of 32-bit integer arguments,
; low part below high part;
; - 80-bit, 64-bit or 32-bit floating-point result is returned in FPU
; register ST0;
; - 64-bit integer result is returned in registers EAX (low part) and
; EDX (high part);
; - 32-bit integer or pointer result is returned in register EAX;
; - registers EAX, ECX and EDX are volatile and can be clobbered;
; - registers EBX, ESP, EBP, ESI and EDI must be preserved.
; exp2n(<-1022) = 0
; exp2n(0) = 1
; exp2n(>1023) = INFINITY
; exp2n(n) = 2**n
; exp2n(-n) = 1 / exp2n(n)
; = 1 / 2**n
; = (1 / 2)**n
.686
.model flat, C
.code
exp2n proc public ; [esp+4] = argument
fild dword ptr [esp+4] ; st(0) = exponent
fld1 ; st(0) = 1.0,
; st(1) = exponent
fscale ; st(0) = 1.0 * 2.0**exponent,
; st(1) = exponent
fstp st(1) ; st(0) = 2.0**exponent
ret
exp2n endp
end
; Copyright © 2004-2024, Stefan Kanthak <stefan.kanthak@nexgo.de>
; exp2n(<-1022) = 0
; exp2n(0) = 1
; exp2n(>1023) = INFINITY
; exp2n(n) = 2**n
; exp2n(-n) = 1 / exp2n(n)
; = 1 / 2**n
; = (1 / 2)**n
.686
.model flat, C
.code
exp2n proc public ; [esp+4] = argument
mov eax, [esp+4] ; eax = exponent
mov edx, 1024 ; edx = 1024
; = maximum exponent
cmp edx, eax
cmovl eax, edx ; eax = min(exponent, 1024)
if 0
dec edx ; edx = 1023
neg edx ; edx = -1023
; = minimum exponent
else
mov edx, -1023 ; edx = -1023
; = minimum exponent
endif
cmp edx, eax
cmovg eax, edx ; eax = max(min(exponent, 1024), -1023)
; = clamped unbiased exponent
sub eax, edx ; eax = clamped unbiased exponent + 1023
; = biased exponent
shl eax, 20
push eax
push 0 ; [esp] = 2.0**exponent
fld real8 ptr [esp] ; st(0) = 2.0**exponent
add esp, 8
ret
exp2n endp
end
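The clamp-and-shift technique of the preceding routine can be sketched in C as follows. This is a minimal illustration under stated assumptions, not part of the original sources: the name exp2n_sketch is hypothetical, and like the routine above it returns 0 instead of a denormal number for exponents below −1022.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical C counterpart of the integer-only exp2n() routine:
   clamp the exponent, add the bias, shift into the exponent field. */
double exp2n_sketch(int n)
{
    uint64_t bits;
    double   result;

    if (n > 1024)               /* biased exponent 2047 = +INFINITY */
        n = 1024;
    if (n < -1023)              /* biased exponent 0    = +0.0      */
        n = -1023;

    bits = (uint64_t) (n + 1023) << 52;
    memcpy(&result, &bits, sizeof result);
    return result;
}
```

Copying the bit pattern with memcpy() avoids the stack round trip of the assembly version and is well-defined in ISO C.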
exp10()
Base-10 Exponential Function
To avoid the subtractive cancellation in the difference y = x − z × log₁₀2, with z = ⌊x × log₂10⌋, the product z × log₁₀2 must be calculated in higher precision and the subtraction performed in 2 steps, known as Cody-Waite argument reduction: log₁₀2 is split into a (double-double) head + tail pair, with tail = log₁₀2 − head and the 11 least significant bits (matching the size of the exponent) of head’s fraction clear.
The product z′ = z × head is then exact, and according to Sterbenz’ lemma the difference y′ = x − z′ is exact too.
Subtraction of the product z″ = z × tail from y′ gives a correctly rounded y″ = y for the (polynomial) approximation of 10ʸ on the interval [0, log₁₀2 = 1/log₂10 = 0.3010299956639812], followed by the (trivial) multiplication with 2ᶻ.
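The two-step reduction can be sketched in isolation as follows. This is an illustration only: the function names exp10_scale_sketch and exp10_reduced_sketch are hypothetical, the head and tail constants are the ones quoted in the listing, and floor() is emulated with an integer cast to keep the sketch free of library calls.

```c
#include <assert.h>

static const double LOG2_10    = 3.3219280948873623;    /* log2(10)           */
static const double LOG10_2_HI = 0x1.34413509F7800p-2;  /* head of log10(2)   */
static const double LOG10_2_LO = 0x1.FEF311F12B358p-46; /* tail of log10(2)   */

/* Hypothetical helper: z = floor(x * log2(10)) */
int exp10_scale_sketch(double x)
{
    double t = x * LOG2_10;
    int    z;

    assert(t > -2048.0 && t < 2048.0);
    z = (int) t;                /* truncates towards zero */
    if ((double) z > t)         /* adjust negative non-integers to floor */
        z -= 1;
    return z;
}

/* Hypothetical helper: y = x - z * log10(2), subtracted in 2 steps */
double exp10_reduced_sketch(double x)
{
    int z = exp10_scale_sketch(x);

    x -= z * LOG10_2_HI;        /* exact product, exact difference */
    x -= z * LOG10_2_LO;        /* small correction from the tail  */
    return x;
}
```

For x = 1 the sketch yields z = 3 and y ≈ 0.0969100130080564, i.e. 10 = 10ʸ × 2³.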
// Copyright © 2004-2024, Stefan Kanthak <stefan.kanthak@nexgo.de>
double floor(double x);
double ldexp(double x, int z);
// Faithfully rounded base-10 exponential
double exp10(double x)
{
double z;
#ifdef OPTIONAL
#define INFINITY (1.0 / 0.5e-323)
#define INDEFINITE (0.0 * INFINITY)
#define M_SQRT10 3.1622776601683793
#define M_1_SQRT10 0.31622776601683793
if (x != x)
return INDEFINITE;
if (x < -323.60724533877978)
return 0.0;
if (x == -1.0)
return 0.1;
if (x == -0.5)
return M_1_SQRT10;
if (x == 0.0)
return 1.0;
if (x == 0.5)
return M_SQRT10;
if (x == 1.0)
return 10.0;
if (x > 308.25471555991674)
return INFINITY;
#endif
// for (integral) z = x * log2(10.0) = x * 3.3219280948873623
// and x" = x - z * log10(2.0), 10**x = 10**x" * 2**z
//
// for integral |z| < 2048 the double-precision product
// z * 0x0.4D104D427DE00 = z * 0x1.34413509F7800p-2
// = z * 0.30102999566395283
// is exact and lies within a binade from x, therefore the
// first subtraction yields an exact intermediate result x'
//
// subtraction of the double-precision tail product
// z * 0x0.7FBCC47C4ACD6p-44 = z * 0x1.FEF311F12B358p-46
// = z * 0.28363394551044964e-13
// yields x" within 2**(-48-52) from x - z * log10(2.0)
//
// the correctly rounded x" lies within 0.5 ULP + 2**-100
// from the exact x - z * log10(2.0)
//
// for 0 <= x" <= log10(2.0) = 0.3010299956639812,
// a minimax polynomial of degree 11 approximates 10**x"
// with relative error 3.0545878321297965e-18 < 2**-58
z = floor(x * 3.3219280948873623);
x -= z * 0.30102999566395283;
x -= z * 0.28363394551044964e-13;
return ldexp(((((((((((+3.4097977633132781e-4 * x
+1.0726030173640114e-3) * x
+5.0515508830497290e-3) * x
+1.9586879159041869e-2) * x
+6.8091402676825436e-2) * x
+2.0699559408492088e-1) * x
+5.3938295003481862e-1) * x
+1.1712551478362764) * x
+2.0346785923260857) * x
+2.6509490552386914) * x
+2.3025850929940488) * x
+1.0, (int) z);
}
Note: overflow and underflow are handled by the ldexp() alias scalbn() function!
For −1075 < x × log₂10 < 1024, with z = ⌊x × log₂10 + ½⌋ for x > 0 and z = ⌈x × log₂10 − ½⌉ for x < 0, i.e. x × log₂10 rounded to the nearest (even) integral number, hence −½ × log₁₀2 ≤ y = x − z × log₁₀2 ≤ ½ × log₁₀2, calculation of 10ˣ = 10^(y + z × log₁₀2) = 10ʸ × 10^(z × log₁₀2) = 10ʸ × 2ᶻ is reduced to the (polynomial) approximation of 10ʸ on the interval [−½ × log₁₀2, ½ × log₁₀2], followed by the (trivial) multiplication with 2ᶻ.
# Copyright © 2004-2024, Stefan Kanthak <stefan.kanthak@nexgo.de>
# Faithfully rounded base-10 exponential
# CAVEAT: requires default (round to nearest, ties to even) rounding mode!
# exp10(-INFINITY) = 0
# exp10(0) = 1
# exp10(1) = 10
# exp10(INFINITY) = INFINITY
# exp10(x) = 10**x
# = 10**(x - z * log10(2)) * 2**z, -1075 < z = rint(x / log10(2)) < 1024
# exp10(-x) = 1 / exp10(x)
# = 1 / 10**x
# = (1 / 10)**x
.arch generic64
.code64
.equiv BIAS, 1023
.intel_syntax noprefix
.text
# xmm0 = argument
exp10:
xorpd xmm1, xmm1 # xmm1 = 0.0
comisd xmm1, xmm0
jz .Lspecial # argument = ±0.0?
# argument = INDEFINITE?
mov rax, 0x400A934F0979A371
movq xmm2, rax # xmm2 = 0x1.A934F0979A371p+1
# = 3.3219280948873623
# = 1.0 / log10(2.0)
# = log2(10.0)
mulsd xmm2, xmm0 # xmm2 = log2(10.0) * argument
# = argument / log10(2.0)
.ifdef SSE4_1
roundsd xmm2, xmm2, 0 # xmm2 = argument / log10(2.0) rounded to nearest (even) integer
.endif
cvtsd2si eax, xmm2 # eax = lrint(argument / log10(2.0))
# neg eax
# jo .Lrange # argument / log10(2.0) > maximum 32-bit integer?
# # argument / log10(2.0) < minimum 32-bit integer?
# neg eax
cmp eax, 1 - 52 - BIAS
jl .Lunderflow # argument / log10(2.0) < -1074.0?
# argument / log10(2.0) < minimum 32-bit integer?
# argument / log10(2.0) > maximum 32-bit integer?
cmp eax, BIAS
jg .Loverflow # argument / log10(2.0) > 1023.0?
cvtsi2sd xmm1, eax # xmm1 = rint(argument / log10(2.0))
# = log2(scale factor)
mov rdx, 0x3FD34413509F7800
movq xmm2, rdx # xmm2 = 0x1.34413509F7800p-2
# = 0.30102999566395283
# = log10(2.0)'
mulsd xmm2, xmm1
subsd xmm0, xmm2 # xmm0 = argument
# - log10(2.0)' * rint(argument / log10(2.0))
# = argument'
mov rdx, 0x3D1FEF311F12B358
movq xmm2, rdx # xmm2 = 0x1.FEF311F12B358p-46
# = 2.8363394551044964e-14
# = log10(2.0) - log10(2.0)'
# = log10(2.0)"
mulsd xmm2, xmm1
subsd xmm0, xmm2 # xmm0 = argument'
# - log10(2.0)" * rint(argument / log10(2.0))
# = argument" in [-log10(2.0) / 2.0, log10(2.0) / 2.0]
.Lhorner:
mov rcx, 0x3F2F9A47809D481E
movq xmm1, rcx # xmm1 = 0x1.F9A47809D481Ep-13
# = 2.4110911209135413e-4
mulsd xmm1, xmm0
mov rdx, 0x3F52F77F5270A2E0
movq xmm2, rdx # xmm2 = 0x1.2F77F5270A2E0p-10
# = 1.1576407794199815e-3
addsd xmm2, xmm1
mulsd xmm2, xmm0
mov rcx, 0x3F74898B16300A8C
movq xmm1, rcx # xmm1 = 0x1.4898B16300A8Cp-8
# = 5.0139840195721038e-3
addsd xmm1, xmm2
mulsd xmm1, xmm0
mov rdx, 0x3F941165ADE1D201
movq xmm2, rdx # xmm2 = 0x1.41165ADE1D201p-6
# = 1.9597614992067139e-2
addsd xmm2, xmm1
mulsd xmm2, xmm0
mov rcx, 0x3FB16E4DF62D8622
movq xmm1, rcx # xmm1 = 0x1.16E4DF62D8622p-4
# = 6.8089363672264841e-2
addsd xmm1, xmm2
mulsd xmm1, xmm0
mov rdx, 0x3FCA7ED70A468547
movq xmm2, rdx # xmm2 = 0x1.A7ED70A468547p-3
# = 2.0699584962589253e-1
addsd xmm2, xmm1
mulsd xmm2, xmm0
mov rcx, 0x3FE1429FFD1F5001
movq xmm1, rcx # xmm1 = 0x1.1429FFD1F5001p-1
# = 5.3938292921020555e-1
addsd xmm1, xmm2
mulsd xmm1, xmm0
mov rdx, 0x3FF2BD7609FD42C5
movq xmm2, rdx # xmm2 = 0x1.2BD7609FD42C5p+0
# = 1.1712551489073786
addsd xmm2, xmm1
mulsd xmm2, xmm0
mov rcx, 0x4000470591DE2C1B
movq xmm1, rcx # xmm1 = 0x1.0470591DE2C1Bp+1
# = 2.0346785922934154
addsd xmm1, xmm2
mulsd xmm1, xmm0
mov rdx, 0x40053524C73CEA7E
movq xmm2, rdx # xmm2 = 0x1.53524C73CEA7Ep+1
# = 2.6509490552392084
addsd xmm2, xmm1
mulsd xmm2, xmm0
mov rcx, 0x40026BB1BBB55516
movq xmm1, rcx # xmm1 = 0x1.26BB1BBB55516p+1
# = 2.3025850929940458
addsd xmm1, xmm2
mulsd xmm1, xmm0
mov rdx, 0x3FF0000000000000
movq xmm0, rdx # xmm0 = 0x1.0p+0
# = 1.0
addsd xmm0, xmm1 # xmm0 = polynomial(argument")
.Lscale:
add eax, BIAS # eax = biased exponent of scale factor
jle .Ldenormal
.Lnormal:
shl rax, 52
movq xmm1, rax # xmm1 = 2.0**unbiased exponent
# = scale factor
mulsd xmm0, xmm1 # xmm0 = polynomial(argument")
# * scale factor
# = exp10(argument)
ret
.Ldenormal:
add eax, 51 # eax = 51 + biased exponent of denormal scale factor
# = index of '1' bit in mantissa
xor edx, edx
bts rdx, rax # rdx = denormal scale factor
movq xmm1, rdx # xmm1 = denormal scale factor
mulsd xmm0, xmm1 # xmm0 = polynomial(argument")
# * denormal scale factor
# = exp10(argument)
ret
.Lunderflow:
# comisd xmm1, xmm0
# jb .Loverflow # argument > 0.0?
#
# xorpd xmm0, xmm0 # xmm0 = 0.0
# # = exp10(<-0x1.439B746E36B52p+8)
# # = exp10(<-323.60724533877978)
# ret
.Loverflow:
# mov rax, 0x7FF0000000000000
# movq xmm0, rax # xmm0 = 0x1.0p+1024
# # = INFINITY
# # = exp10(>0x1.34413509F79FFp+8)
# # = exp10(>308.25471555991674)
# ret
.Lrange:
comisd xmm1, xmm0
sbb eax, eax # eax = (argument < 0.0) ? 0 : -1
shr eax, 21 # rax = (argument < 0.0) ? 0 : 0x7FF
shl rax, 52 # rax = (argument < 0.0) ? 0 : 0x7FF0000000000000
movq xmm0, rax # xmm0 = (argument < 0.0) ? 0.0 : 0x1.0p+1024
# = (argument < 0.0) ? 0.0 : INFINITY
ret
.Lspecial:
jp .Lexit # argument = INDEFINITE?
.Lzero:
mov rax, 0x3FF0000000000000
movq xmm0, rax # xmm0 = 0x1.0p+0
# = 1.0
# = exp10(±0.0)
.Lexit:
ret
.size exp10, .-exp10
.type exp10, @function
.global exp10
.end
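The construction of the scale factor at the labels .Lscale, .Lnormal and .Ldenormal of the preceding routine can be sketched in C as follows. The name scale_factor_sketch is hypothetical; the sketch assumes −1074 ≤ z ≤ 1023, which the routine guarantees through its range checks before the polynomial is evaluated.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: builds 2**z directly from its bit pattern */
double scale_factor_sketch(int z)
{
    int      biased = z + 1023;     /* biased exponent of scale factor */
    uint64_t bits;
    double   result;

    assert(z >= -1074 && z <= 1023);

    if (biased > 0)                 /* normal: exponent field only */
        bits = (uint64_t) biased << 52;
    else                            /* denormal: single '1' bit in fraction, */
        bits = (uint64_t) 1 << (biased + 51);   /* index = 51 + biased exponent  */

    memcpy(&result, &bits, sizeof result);
    return result;
}
```

For z = −1074 the single bit lands at index 0, which is the smallest positive denormal number; for z = −1023 it lands at index 51, the most significant fraction bit.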
; Copyright © 2004-2024, Stefan Kanthak <stefan.kanthak@nexgo.de>
; exp10(x) = 10**x
; = 2**(x * log2(10))
; exp10(-x) = 1 / exp10(x)
.686
.model flat, C
.code
exp10 proc public ; [esp+4] = argument
fld real8 ptr [esp+4] ; st(0) = exponent
fldl2t ; st(0) = log2(10.0),
; st(1) = exponent
fmulp st(1), st(0) ; st(0) = exponent * log2(10.0)
if 0
fld1 ; st(0) = 1.0,
; st(1) = exponent * log2(10.0)
fld st(1) ; st(0) = exponent * log2(10.0),
; st(1) = 1.0,
; st(2) = exponent * log2(10.0)
fprem ; st(0) = (exponent * log2(10.0)) modulo 1.0,
; st(1) = 1.0,
; st(2) = exponent * log2(10.0)
f2xm1 ; st(0) = 2.0**((exponent * log2(10.0)) modulo 1.0) - 1.0,
; st(1) = 1.0,
; st(2) = exponent * log2(10.0)
faddp st(1), st(0) ; st(0) = 2.0**((exponent * log2(10.0)) modulo 1.0),
; st(1) = exponent * log2(10.0)
fscale ; st(0) = 10.0**exponent,
; st(1) = exponent * log2(10.0)
else
fld st(0) ; st(0) = st(1) = exponent * log2(10.0)
frndint ; st(0) = integer(exponent * log2(10.0)),
; st(1) = exponent * log2(10.0)
fsub st(1), st(0) ; st(0) = integer(exponent * log2(10.0)),
; st(1) = fraction(exponent * log2(10.0))
fxch st(1) ; st(0) = fraction(exponent * log2(10.0)),
; st(1) = integer(exponent * log2(10.0))
f2xm1 ; st(0) = 2.0**fraction(exponent * log2(10.0)) - 1.0,
; st(1) = integer(exponent * log2(10.0))
fld1 ; st(0) = 1.0,
; st(1) = 2.0**fraction(exponent * log2(10.0)) - 1.0,
; st(2) = integer(exponent * log2(10.0))
faddp st(1), st(0) ; st(0) = 2.0**fraction(exponent * log2(10.0)),
; st(1) = integer(exponent * log2(10.0))
fscale ; st(0) = 10.0**exponent,
; st(1) = integer(exponent * log2(10.0))
endif
fstp st(1) ; st(0) = 10.0**exponent
ret
exp10 endp
end
exp()
Base-e Exponential Function
exp() returns the base-e exponential of its argument.
Calculation of the exponential function to the transcendental base e = 2.71828182845904523536028747135266249775724709369995…, known as Euler’s number and also Napier’s constant, eˣ = e^(y + z × logₑ2) = eʸ × e^(z × logₑ2) = eʸ × 2ᶻ for −1075 < x × log₂e < 1024, with z = ⌊x × log₂e⌋, i.e. x × log₂e rounded down towards −∞, hence 0 ≤ y = x − z × logₑ2 ≤ logₑ2 = 1/log₂e = 0.69314718055994531, is more difficult than calculation of 2ˣ: for z × logₑ2 close to x, calculation of the difference y = x − z × logₑ2 suffers from subtractive cancellation, i.e. complete loss of precision!
To avoid this, the product z × logₑ2 must be calculated in higher precision and the subtraction performed in 2 steps, known as Cody-Waite argument reduction: logₑ2 is split into a (double-double) head + tail pair, with tail = logₑ2 − head and the 11 least significant bits (matching the size of the exponent) of head’s fraction clear.
The product z′ = z × head is then exact, and according to Sterbenz’ lemma the difference y′ = x − z′ is exact too.
Subtraction of the product z″ = z × tail from y′ gives a correctly rounded y″ = y for the (polynomial) approximation of eʸ on the interval [0, logₑ2 = 1/log₂e = 0.69314718055994531], followed by the (trivial) multiplication with 2ᶻ.
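Why the product of an integral z with the head is exact can be checked with integer arithmetic: once its trailing zero bits are stripped, the head’s significand has at most 42 bits, and |z| < 2048 contributes at most 11 more, so the product fits into the 53 bits of a double. A minimal sketch follows; the helper names are hypothetical, and the test is only a sufficient condition that ignores underflow.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: 53-bit significand of a normal double,
   implicit leading '1' bit included */
static uint64_t significand_of(double d)
{
    uint64_t bits;

    memcpy(&bits, &d, sizeof bits);
    return (bits & 0x000FFFFFFFFFFFFFULL) | 0x0010000000000000ULL;
}

/* Hypothetical helper: returns 1 if z * head is guaranteed to need
   no more than 53 significand bits, i.e. is exact; a result of 0
   means only that this simple test can not guarantee exactness */
int product_surely_exact(double head, int z)
{
    uint64_t m = significand_of(head);
    uint64_t n = (uint64_t) (z < 0 ? -z : z);

    assert(z > -2048 && z < 2048);

    while ((m & 1) == 0)        /* strip trailing zero bits */
        m >>= 1;

    return m * n < ((uint64_t) 1 << 53);
}
```

With the head 0x1.62E42FEFA3800p-1 the test succeeds for every |z| < 2048, while the full double-precision constant 0x1.62E42FEFA39EFp-1 already fails it for z = 3.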
// Copyright © 2004-2024, Stefan Kanthak <stefan.kanthak@nexgo.de>
double floor(double x);
double ldexp(double x, int z);
// Faithfully rounded base-e exponential
double exp(double x)
{
double z;
#ifdef OPTIONAL
#define INFINITY (1.0 / 0.5e-323)
#define INDEFINITE (0.0 * INFINITY)
#define M_E 2.7182818284590452
#define M_1_E 0.36787944117144232
#define M_SQRTE 1.6487212707001281
#define M_1_SQRTE 0.60653065971263342
if (x != x)
return INDEFINITE;
if (x < -745.13321910194121)
return 0.0;
if (x == -1.0)
return M_1_E;
if (x == -0.5)
return M_1_SQRTE;
if (x == 0.0)
return 1.0;
if (x == 0.5)
return M_SQRTE;
if (x == 1.0)
return M_E;
if (x > 709.78271289338400)
return INFINITY;
#endif
// for (integral) z = x * log2(e) = x * 1.4426950408889634
// and x" = x - z * log(2.0), e**x = e**x" * 2**z
//
// for integral |z| < 2048 the double-precision product
// z * 0x0.B17217F7D1C00 = z * 0x1.62E42FEFA3800p-1
// = z * 0.69314718055989033
// is exact and lies within a binade from x, therefore the
// first subtraction yields an exact intermediate result x'
//
// subtraction of the double-precision tail product
// z * 0x0.F79ABC9E3B398p-44 = z * 0x1.EF35793C76730p-45
// = z * 0.54979230187083712e-13
// yields x" within 2**(-50-52) from x - z * log(2.0)
//
// the correctly rounded x" lies within 0.5 ULP + 2**-102
// from the exact x - z * log(2.0)
//
// for 0 <= x" <= log(2.0) = 0.69314718055994531,
// a minimax polynomial of degree 11 approximates e**x"
// with relative error 3.0545878321297965e-18 < 2**-58
z = floor(x * 1.4426950408889634);
x -= z * 0.69314718055989033;
x -= z * 0.54979230187083712e-13;
return ldexp(((((((((((+3.5347283721656128e-8 * x
+2.5602485412126367e-7) * x
+2.7764095757136529e-6) * x
+2.4787899938611698e-5) * x
+1.9841863599469418e-4) * x
+1.3888871805082296e-3) * x
+8.3333336552944127e-3) * x
+4.1666666628388979e-2) * x
+1.6666666666933781e-1) * x
+4.9999999999990426e-1) * x
+1.0000000000000013) * x
+1.0, (int) z);
}
Note: overflow and underflow are handled by the ldexp() alias scalbn() function!
For −1075 < x × log₂e < 1024, with z = ⌊x × log₂e + ½⌋ for x > 0 and z = ⌈x × log₂e − ½⌉ for x < 0, i.e. x × log₂e rounded to the nearest (even) integral number, hence −½ × logₑ2 ≤ y = x − z × logₑ2 ≤ ½ × logₑ2, calculation of eˣ = e^(y + z × logₑ2) = eʸ × e^(z × logₑ2) = eʸ × 2ᶻ is reduced to the (polynomial) approximation of eʸ on the interval [−½ × logₑ2, ½ × logₑ2], followed by the (trivial) multiplication with 2ᶻ.
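The symmetric reduction can be sketched as follows. The function names are hypothetical, and the sketch implements the floor and ceiling formulas literally through a truncating cast, so ties are rounded away from zero, whereas an instruction like cvtsd2si rounds ties to even in the default rounding mode; the two differ only when x × log₂e lies exactly halfway between two integers.

```c
#include <assert.h>

static const double LOG2_E = 1.4426950408889634;    /* log2(e)      */
static const double LN2_HI = 0x1.62E42FEFA3800p-1;  /* head of ln 2 */
static const double LN2_LO = 0x1.EF35793C76730p-45; /* tail of ln 2 */

/* Hypothetical helper: z = x * log2(e) rounded to nearest integer */
int exp_nearest_z_sketch(double x)
{
    double t = x * LOG2_E;

    assert(t > -1075.0 && t < 1024.0);
    /* the cast truncates towards zero, giving floor(t + 0.5) for
       t >= 0 and ceil(t - 0.5) for t < 0 */
    return (int) (t < 0.0 ? t - 0.5 : t + 0.5);
}

/* Hypothetical helper: y = x - z * ln 2, |y| <= ln(2) / 2 */
double exp_reduced_sketch(double x)
{
    int z = exp_nearest_z_sketch(x);

    x -= z * LN2_HI;
    x -= z * LN2_LO;
    return x;
}
```

Halving the width of the interval on either side of 0 lets a polynomial of the same degree reach a smaller approximation error than on [0, logₑ2].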
# Copyright © 2004-2024, Stefan Kanthak <stefan.kanthak@nexgo.de>
# Faithfully rounded natural exponential
# CAVEAT: requires default (round to nearest, ties to even) rounding mode!
# exp(-INFINITY) = 0
# exp(0) = 1
# exp(1) = e
# exp(INFINITY) = INFINITY
# exp(x) = e**x
# = e**(x - z * log(2)) * 2**z, -1075 < z = rint(x / log(2)) < 1024
# exp(-x) = 1 / exp(x)
# = 1 / e**x
# = (1 / e)**x
.arch generic64
.code64
.equiv BIAS, 1023
.intel_syntax noprefix
.text
# xmm0 = argument
exp:
xorpd xmm1, xmm1 # xmm1 = 0.0
comisd xmm1, xmm0
jz .Lspecial # argument = ±0.0?
# argument = INDEFINITE?
mov rax, 0x3FF71547652B82FE
movq xmm2, rax # xmm2 = 0x1.71547652B82FEp+0
# = 1.4426950408889634
# = 1.0 / log(2.0)
# = log2(e)
mulsd xmm2, xmm0 # xmm2 = log2(e) * argument
# = argument / log(2.0)
.ifdef SSE4_1
roundsd xmm2, xmm2, 0 # xmm2 = argument / log(2.0) rounded to nearest (even) integer
.endif
cvtsd2si eax, xmm2 # eax = lrint(argument / log(2.0))
# neg eax
# jo .Lrange # argument / log(2.0) > maximum 32-bit integer?
# # argument / log(2.0) < minimum 32-bit integer?
# neg eax
cmp eax, 1 - 52 - BIAS
jl .Lunderflow # argument / log(2.0) < -1074.0?
# argument / log(2.0) < minimum 32-bit integer?
# argument / log(2.0) > maximum 32-bit integer?
cmp eax, BIAS
jg .Loverflow # argument / log(2.0) > 1023.0?
cvtsi2sd xmm1, eax # xmm1 = rint(argument / log(2.0))
# = log2(scale factor)
mov rdx, 0x3FE62E42FEFA3800
movq xmm2, rdx # xmm2 = 0x1.62E42FEFA3800p-1
# = 0.69314718055989033
# = log(2.0)'
mulsd xmm2, xmm1
subsd xmm0, xmm2 # xmm0 = argument
# - log(2.0)' * rint(argument / log(2.0))
# = argument'
mov rdx, 0x3D2EF35793C76730
movq xmm2, rdx # xmm2 = 0x1.EF35793C76730p-45
# = 5.4979230187083712e-14
# = log(2.0) - log(2.0)'
# = log(2.0)"
mulsd xmm2, xmm1
subsd xmm0, xmm2 # xmm0 = argument'
# - log(2.0)" * rint(argument / log(2.0))
# = argument" in [-log(2.0) / 2.0, log(2.0) / 2.0]
.Lhorner:
mov rdx, 0x3E5AD661C903688B
movq xmm1, rdx # xmm1 = 0x1.AD661C903688Bp-26
# = 2.4994304016107913e-8
mulsd xmm1, xmm0
mov rdx, 0x3E928B311C7EB84F
movq xmm2, rdx # xmm2 = 0x1.28B311C7EB84Fp-22
# = 2.7632293297497039e-7
addsd xmm2, xmm1
mulsd xmm2, xmm0
mov rdx, 0x3EC71DF4520AAEEB
movq xmm1, rdx # xmm1 = 0x1.71DF4520AAEEBp-19
# = 2.7557622533559223e-6
addsd xmm1, xmm2
mulsd xmm1, xmm0
mov rdx, 0x3EFA01992D0FE736
movq xmm2, rdx # xmm2 = 0x1.A01992D0FE736p-16
# = 2.4801486521375964e-5
addsd xmm2, xmm1
mulsd xmm2, xmm0
mov rdx, 0x3F2A01A0110572B2
movq xmm1, rdx # xmm1 = 0x1.A01A0110572B2p-13
# = 1.9841269432676262e-4
addsd xmm1, xmm2
mulsd xmm1, xmm0
mov rdx, 0x3F56C16C1878111C
movq xmm2, rdx # xmm2 = 0x1.6C16C1878111Cp-10
# = 1.3888888951224038e-3
addsd xmm2, xmm1
mulsd xmm2, xmm0
mov rdx, 0x3F81111111130DD6
movq xmm1, rdx # xmm1 = 0x1.1111111130DD6p-7
# = 8.3333333335592727e-3
addsd xmm1, xmm2
mulsd xmm1, xmm0
mov rdx, 0x3FA555555554F370
movq xmm2, rdx # xmm2 = 0x1.555555554F370p-5
# = 4.1666666666492767e-2
addsd xmm2, xmm1
mulsd xmm2, xmm0
mov rdx, 0x3FC55555555554A2
movq xmm1, rdx # xmm1 = 0x1.55555555554A2p-3
# = 1.6666666666666169e-1
addsd xmm1, xmm2
mulsd xmm1, xmm0
mov rdx, 0x3FE0000000000010
movq xmm2, rdx # xmm2 = 0x1.0000000000010p-1
# = 5.0000000000000177e-1
addsd xmm2, xmm1
mulsd xmm2, xmm0
mov rdx, 0x3FF0000000000000
movq xmm1, rdx # xmm1 = 0x1.0p+0
# = 1.0
addsd xmm2, xmm1
mulsd xmm0, xmm2
addsd xmm0, xmm1 # xmm0 = polynomial(argument")
.Lscale:
add eax, BIAS # eax = biased exponent of scale factor
jle .Ldenormal
.Lnormal:
shl rax, 52
movq xmm1, rax # xmm1 = 2.0**unbiased exponent
# = scale factor
mulsd xmm0, xmm1 # xmm0 = polynomial(argument")
# * scale factor
# = exp(argument)
ret
.Ldenormal:
add eax, 51 # eax = 51 + biased exponent of denormal scale factor
# = index of '1' bit in mantissa
xor edx, edx
bts rdx, rax # rdx = denormal scale factor
movq xmm1, rdx # xmm1 = denormal scale factor
mulsd xmm0, xmm1 # xmm0 = polynomial(argument")
# * denormal scale factor
# = exp(argument)
ret
.Lunderflow:
# comisd xmm1, xmm0
# jb .Loverflow # argument > 0.0?
#
# xorpd xmm0, xmm0 # xmm0 = 0.0
# # = exp(<-0x1.74385446D71C3p+9)
# # = exp(<-744.44007192138126)
# ret
.Loverflow:
# mov rax, 0x7FF0000000000000
# movq xmm0, rax # xmm0 = 0x1.0p+1024
# # = INFINITY
# # = exp(>0x1.62E42FEFA39EFp+9)
# # = exp(>709.78271289338400)
# ret
.Lrange:
comisd xmm1, xmm0
sbb eax, eax # eax = (argument < 0.0) ? 0 : -1
shr eax, 21 # rax = (argument < 0.0) ? 0 : 0x7FF
shl rax, 52 # rax = (argument < 0.0) ? 0 : 0x7FF0000000000000
movq xmm0, rax # xmm0 = (argument < 0.0) ? 0.0 : 0x1.0p+1024
# = (argument < 0.0) ? 0.0 : INFINITY
ret
.Lspecial:
jp .Lexit # argument = INDEFINITE?
.Lzero:
mov rax, 0x3FF0000000000000
movq xmm0, rax # xmm0 = 0x1.0p+0
# = 1.0
# = exp(±0.0)
.Lexit:
ret
.size exp, .-exp
.type exp, @function
.global exp
.end
; Copyright © 2004-2024, Stefan Kanthak <stefan.kanthak@nexgo.de>
; https://msdn.microsoft.com/en-us/library/c850xxez.aspx
; exp(x) = e**x
; = 2**(x * log2(e))
; exp(-x) = 1 / exp(x)
.686
.model flat, C
.code
exp proc public ; [esp+4] = argument
fld real8 ptr [esp+4] ; st(0) = exponent
fldl2e ; st(0) = log2(e),
; st(1) = exponent
fmulp st(1), st(0) ; st(0) = exponent * log2(e)
if 0
fld1 ; st(0) = 1.0,
; st(1) = exponent * log2(e)
fld st(1) ; st(0) = exponent * log2(e),
; st(1) = 1.0,
; st(2) = exponent * log2(e)
fprem ; st(0) = (exponent * log2(e)) modulo 1.0,
; st(1) = 1.0,
; st(2) = exponent * log2(e)
f2xm1 ; st(0) = 2.0**((exponent * log2(e)) modulo 1.0) - 1.0,
; st(1) = 1.0,
; st(2) = exponent * log2(e)
faddp st(1), st(0) ; st(0) = 2.0**((exponent * log2(e)) modulo 1.0),
; st(1) = exponent * log2(e)
fscale ; st(0) = e**exponent,
; st(1) = exponent * log2(e)
else
fld st(0) ; st(0) = st(1) = exponent * log2(e)
frndint ; st(0) = integer(exponent * log2(e)),
; st(1) = exponent * log2(e)
fsub st(1), st(0) ; st(0) = integer(exponent * log2(e)),
; st(1) = fraction(exponent * log2(e))
fxch st(1) ; st(0) = fraction(exponent * log2(e)),
; st(1) = integer(exponent * log2(e))
f2xm1 ; st(0) = 2.0**fraction(exponent * log2(e)) - 1.0,
; st(1) = integer(exponent * log2(e))
fld1 ; st(0) = 1.0,
; st(1) = 2.0**fraction(exponent * log2(e)) - 1.0,
; st(2) = integer(exponent * log2(e))
faddp st(1), st(0) ; st(0) = 2.0**fraction(exponent * log2(e)),
; st(1) = integer(exponent * log2(e))
fscale ; st(0) = e**exponent,
; st(1) = integer(exponent * log2(e))
endif
fstp st(1) ; st(0) = e**exponent
ret
exp endp
end
expm1()
Base-e Exponential Function
expm1() returns the base-e exponential of its argument, decremented by one.
// Copyright © 2004-2024, Stefan Kanthak <stefan.kanthak@nexgo.de>
double fma(double x, double y, double z);
// Faithfully rounded base-e exponential minus 1
// for |x| < log(1.5) = 0.405465108108164382
double expm1(double x)
{
double z = 2.0884268547791305e-9;
z = fma(z, x, 2.5136640903355195e-8);
z = fma(z, x, 2.7557461207244723e-7);
z = fma(z, x, 2.7557153928447346e-6);
z = fma(z, x, 2.4801586944307795e-5);
z = fma(z, x, 1.9841269987879947e-4);
z = fma(z, x, 1.3888888889202989e-3);
z = fma(z, x, 8.3333333332766286e-3);
z = fma(z, x, 4.1666666666665637e-2);
z = fma(z, x, 1.6666666666666738e-1);
z = fma(z, x, 0.5) * x;
z = fma(z, x, x);
return z;
}
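The reason for evaluating the polynomial without its constant term 1, instead of computing the exponential first and subtracting 1 afterwards, can be illustrated with a deliberately crude sketch. Both function names are hypothetical, and the second-order Taylor truncation used here is an assumption made for brevity; it is not the minimax polynomial of the listing above and is only adequate for very small |x|.

```c
/* Hypothetical helper: forms 1 + x + x*x/2 first, then subtracts 1;
   the volatile store forces rounding to double precision */
double expm1_naive_sketch(double x)
{
    volatile double e = 1.0 + x + 0.5 * x * x;

    return e - 1.0;
}

/* Hypothetical helper: never adds the constant term 1 */
double expm1_direct_sketch(double x)
{
    return x + 0.5 * x * x;
}
```

For x = 1e-17 the naive variant returns 0, because 1 + x rounds to 1, while the direct variant returns 1e-17.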
; Copyright © 2004-2024, Stefan Kanthak <stefan.kanthak@nexgo.de>
; https://msdn.microsoft.com/en-us/library/dn353645.aspx
; expm1(x) = e**x - 1
; = 2**(x * log2(e)) - 1
; expm1(-x) = 1 / exp(x) - 1
.686
.model flat, C
.code
expm1 proc public ; [esp+4] = exponent
fld real8 ptr [esp+4] ; st(0) = exponent
fldl2e ; st(0) = log2(e),
; st(1) = exponent
fmulp st(1), st(0) ; st(0) = exponent * log2(e)
fld1 ; st(0) = 1.0,
; st(1) = exponent * log2(e)
fld st(1) ; st(0) = exponent * log2(e),
; st(1) = 1.0,
; st(2) = exponent * log2(e)
fabs ; st(0) = |exponent * log2(e)|,
; st(1) = 1.0,
; st(2) = exponent * log2(e)
fcompp ; st(0) = exponent * log2(e)
fstsw ax ; ax = FPU status word
; B C3 TOP C2 C1 C0 low byte
; . 0 ... 0 . 0 ........ st(0) > st(1)
; . 0 ... 0 . 1 ........ st(0) < st(1)
; . 1 ... 0 . 0 ........ st(0) = st(1)
; . 1 ... 1 . 1 ........ st(0) # st(1)
sahf ; SF:ZF:0:AF:0:PF:1:CF = ah
; CF (carry flag) = C0
; C1
; PF (parity flag) = C2
; ZF (zero flag) = C3
; AF (adjust flag) = .
; SF (sign flag) = B(usy)
ja Lrange ; |exponent * log2(e)| > 1.0?
;; jp Lexit ; exponent = INDEFINITE?
f2xm1 ; st(0) = 2.0**(exponent * log2(e)) - 1.0
; = e**exponent - 1.0
ret
Lrange:
fld st(0) ; st(0) = st(1) = exponent * log2(e)
frndint ; st(0) = integer(exponent * log2(e)),
; st(1) = exponent * log2(e)
fsub st(1), st(0) ; st(0) = integer(exponent * log2(e)),
; st(1) = fraction(exponent * log2(e))
fxch st(1) ; st(0) = fraction(exponent * log2(e)),
; st(1) = integer(exponent * log2(e))
f2xm1 ; st(0) = 2.0**fraction(exponent * log2(e)) - 1.0,
; st(1) = integer(exponent * log2(e))
fld1 ; st(0) = 1.0,
; st(1) = 2.0**fraction(exponent * log2(e)) - 1.0,
; st(2) = integer(exponent * log2(e))
faddp st(1), st(0) ; st(0) = 2.0**fraction(exponent * log2(e)),
; st(1) = integer(exponent * log2(e))
fscale ; st(0) = e**exponent,
; st(1) = integer(exponent * log2(e))
fstp st(1) ; st(0) = e**exponent
fld1 ; st(0) = 1.0,
; st(1) = e**exponent
fsubp st(1), st(0) ; st(0) = e**exponent - 1.0
Lexit:
ret
expm1 endp
end
The logarithm function can be approximated by a (minimax) polynomial on any sufficiently small interval with high accuracy, for example faithfully rounded, as shown hereafter.
log()
Base-e alias Natural Logarithm Function
log() returns the base-e alias natural logarithm of its argument.
logₑx = artanh((x² − 1) / (x² + 1)) = 2 × artanh((x − 1) / (x + 1)), logₑ(1 + x) = x¹/1 − x²/2 + x³/3 − x⁴/4 + … = 2 × artanh(x / (2 + x)), …
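The identity log x = 2 × artanh((x − 1) / (x + 1)) together with the series artanh(s) = s + s³/3 + s⁵/5 + … can be tried out with a short sketch. The name log_series_sketch and the fixed count of 30 terms are assumptions made for illustration; the series converges quickly only for x near 1, which is why an implementation first reduces its argument to a narrow interval around 1.

```c
#include <assert.h>

/* Hypothetical helper: natural logarithm through the truncated
   series 2 * (s + s**3/3 + s**5/5 + ... + s**59/59),
   with s = (x - 1) / (x + 1) */
double log_series_sketch(double x)
{
    double s, s2, sum = 0.0;
    int    k;

    assert(x > 0.0);

    s  = (x - 1.0) / (x + 1.0);
    s2 = s * s;

    /* Horner scheme in s**2 for 1 + s**2/3 + s**4/5 + ... */
    for (k = 59; k >= 1; k -= 2)
        sum = sum * s2 + 1.0 / k;

    return 2.0 * s * sum;
}
```

For x = 2 the sketch gives s = 1/3 and reproduces logₑ2 = 0.693147180559945… to within a few units in the last place.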
// Copyright © 2004-2024, Stefan Kanthak <stefan.kanthak@nexgo.de>
#define INFINITY (1.0 / 0.5e-323)
#define INDEFINITE (0.0 * INFINITY)
#define M_E 2.7182818284590452
#define M_LN2 0.69314718055994531
#define M_1_SQRT2 0.70710678118654752
double frexp(double x, int *z);
// Faithfully rounded natural logarithm
double log(double argument)
{
double mantissa, x, y, z;
int exponent;
if (argument != argument)
return INDEFINITE;
if (argument < 0.0)
return INDEFINITE;
if (argument == 0.0)
return -INFINITY;
#ifdef OPTIONAL
if (argument == 1.0)
return 0.0;
if (argument == M_E)
return 1.0;
#endif
if (argument == INFINITY)
return INFINITY;
// for argument > 0,
// log(argument) = log(2) * log2(argument)
//
// for argument = mantissa * 2**exponent,
// log(argument) = log(mantissa * 2**exponent)
// = log(mantissa) + log(2**exponent)
// = log(mantissa) + log(2) * log2(2**exponent)
// = log(mantissa) + log(2) * log2(2) * exponent
// = log(mantissa) + log(2) * exponent
//
// for mantissa = 1,
// log(argument) = log(2) * exponent
//
// for mantissa = 1 + fraction
// = (1 + x) / (1 - x)
// and x = (mantissa - 1) / (mantissa + 1)
// = fraction / (2 + fraction)
// = 1 - 2 / (2 + fraction),
// log(mantissa) = log(1 + fraction)
// = log((1 + x) / (1 - x))
// = log(1 + x) - log(1 - x)
//
// for x = 0,
// log(1 + x) - log(1 - x) = log(1) - log(1)
// = 0
//
// for -1 < x <= 1,
// log(1 + x) = x**1/1 - x**2/2 + x**3/3 - x**4/4 + x**5/5 - ...
// = x - x**2/2 + x**3/3 - x**4/4 + x**5/5 - ...
//
// for -1 <= x < 1,
// log(1 - x) = 0 - x**1/1 - x**2/2 - x**3/3 - x**4/4 - x**5/5 - ...
// = 0 - x - x**2/2 - x**3/3 - x**4/4 - x**5/5 - ...
// = 0 - (x + x**2/2 + x**3/3 + x**4/4 + x**5/5 + ...)
//
// for -1 < x < 1,
// log(1 + x) - log(1 - x) = x - x**2/2 + x**3/3 - x**4/4 + x**5/5 - ...
// + x + x**2/2 + x**3/3 + x**4/4 + x**5/5 + ...
// = x * 2 + x**3/3 * 2 + x**5/5 * 2 + x**7/7 * 2 + ...
// = (x + x**3/3 + x**5/5 + x**7/7 + ...) * 2
// = x * 2 * (1 + x**2/3 + x**4/5 + x**6/7 + ...)
// = x * 2 + x * polynomial(x**2)
mantissa = frexp(argument, &exponent);
#ifdef OPTIONAL
if (mantissa == 0.5)
#if 0
return (exponent - 1) * M_LN2;
#elif 0
return (exponent - 1) * 0x1.EF35793C76730p-45
+ (exponent - 1) * 0x1.62E42FEFA3800p-1;
#else
return (exponent - 1) * 0.54979230187083712e-13
+ (exponent - 1) * 0.69314718055989033;
#endif
#endif
#if 0
// for 1/2 <= mantissa = 1 + fraction < 1,
// -1/2 <= fraction < 0 and x = (mantissa - 1) / (mantissa + 1),
// -1/3 <= x < 0
x = (mantissa - 1.0) / (mantissa + 1.0);
// for 0 < x < 1/3,
// a minimax polynomial of degree 10 in x**2 approximates
// (log(1 + x) - log(1 - x)) / (2 * x) with relative error
// 1.2300066608152056e-18 ~ 2**-59.5
y = x * x;
y = (((((((((+0.17060062608429468 * y
+0.083156843071811262) * y
+0.12112248959493536) * y
+0.13300102515887726) * y
+0.15386635453768495) * y
+0.18181739787751806) * y
+0.22222224111772142) * y
+0.28571428544925157) * y
+0.40000000000190325) * y
+0.66666666666666134) * y;
#else
// for sqrt(2)/2 <= mantissa = 1 + fraction < sqrt(2)
// and x = (mantissa - 1) / (mantissa + 1),
// -0.29289321881345248 <= fraction < 0.41421356237309505,
// -0.1715728752538099 <= x < 0.1715728752538099
if (mantissa < M_1_SQRT2) {
mantissa += mantissa;
exponent -= 1;
}
x = (mantissa - 1.0) / (mantissa + 1.0);
// for -0.1715728752538099 <= x < 0.1715728752538099,
// a minimax polynomial of degree 7 in x**2 approximates
// (log(1 + x) - log(1 - x)) / (2 * x) with relative error
// 1.1354910268086278e-18 ~ 2**-59.6
y = x * x;
y = ((((((+0.14810529843106951 * y
+0.15312443753011222) * y
+0.18183635094502661) * y
+0.22222196988240322) * y
+0.28571428761346767) * y
+0.39999999999298882) * y
+0.66666666666667652) * y;
#endif
// K. C. Ng's formula yields an error below 1 ULP:
// for z = fraction * fraction / 2
// and x * 2 = fraction - fraction * x
// = fraction - z + z * x
// = fraction - (z - z * x),
// log(mantissa) = log(1 + fraction)
// = fraction - (fraction - polynomial(x * x)) * x
// = fraction - (z - (z + polynomial(x * x)) * x)
mantissa -= 1.0;
z = mantissa * mantissa * 0.5;
z = mantissa - (z - (z + y) * x);
// for integral |exponent| < 2048,
// the double-precision product exponent * 0x1.62E42FEFA3800p-1
// is exact; addition of the double-precision tail product
// exponent * 0x1.EF35793C76730p-45 yields log(2.0) * exponent
// within 2**(-50-52) from the exact product
//
// log(argument) = log(mantissa) + log(2.0) * exponent
// = log(mantissa) + exponent * 0x1.EF35793C76730p-45
// + exponent * 0x1.62E42FEFA3800p-1
#if 0
z += exponent * 0x1.EF35793C76730p-45;
z += exponent * 0x1.62E42FEFA3800p-1;
#else
z += exponent * 0.54979230187083712e-13;
z += exponent * 0.69314718055989033;
#endif
return z;
}
# Copyright © 2004-2024, Stefan Kanthak <stefan.kanthak@nexgo.de>
# Faithfully rounded natural logarithm
# log(<0) = INDEFINITE
# log(±0) = -INFINITY
# log(1) = 0
# log(e) = 1
# log(INFINITY) = INFINITY
# log(1/x) = -log(x)
# log(x) = log(significand * 2**exponent)
# = log(significand) + log(2) * exponent
# = natural logarithm (to base e)
.arch generic64
.code64
.equiv BIAS, 1023
.intel_syntax noprefix
.text
# xmm0 = argument
log:
movq rax, xmm0 # rax = argument
add rax, rax # rax = argument << 1
# = |argument| << 1
# jz .Lzero # argument = ±0.0?
# jc .Lnegative # argument < ±0.0?
jbe .Lrange # argument <= ±0.0?
.Lpositive:
mov rcx, rax
shr rcx, 53 # rcx = biased exponent
jz .Ldenormal # biased exponent = 0?
# (argument denormal?)
sub ecx, BIAS # ecx = unbiased exponent
cmp ecx, BIAS + 1
je .Lspecial # biased exponent = 2047?
# (argument = INDEFINITE?)
# (argument = INFINITY?)
.Lnormal:
shl rax, 11 # rax = fractional part of argument << 12
.Lcontinue:
mov rdx, 0x6A09E667F3BCC909 # rdx = fractional part of sqrt(2.0) << 12
cmp rdx, rax # CF = (sqrt(2.0) < significand of argument)
sbb edx, edx # edx = (sqrt(2.0) < significand of argument) ? -1 : 0
sub ecx, edx # ecx = exponent of argument
# + (sqrt(2.0) < significand of argument)
# = exponent'
add edx, BIAS # rdx = (sqrt(2.0) < significand of argument) ? BIAS - 1 : BIAS
or rax, rdx
ror rax, 12 # rax = significand of argument'
movq xmm0, rax # xmm0 = significand of argument' in [sqrt(0.5), sqrt(2.0)]
.Ltransform:
mov rax, 0x3FF0000000000000
movq xmm1, rax # xmm1 = 0x1.0p+0
# = 1.0
movsd xmm2, xmm0
subsd xmm2, xmm1 # xmm2 = significand of argument' - 1.0
# = fraction of argument'
addsd xmm1, xmm0 # xmm1 = significand of argument' + 1.0
movsd xmm0, xmm2 # xmm0 = fraction of argument'
divsd xmm2, xmm1 # xmm2 = (significand of argument' - 1.0)
# / (significand of argument' + 1.0)
# = argument"
movsd xmm1, xmm2 # xmm1 = argument" in [-0.1715728752538099, 0.1715728752538099]
mulsd xmm2, xmm2 # xmm2 = argument"**2
.Lhorner:
mov rax, 0x3FC2F51D4A901906
movq xmm3, rax # xmm3 = 0x1.2F51D4A901906p-3
# = 0.14810529843106951
mulsd xmm3, xmm2
mov rdx, 0x3FC39994E1B48251
movq xmm4, rdx # xmm4 = 0x1.39994E1B48251p-3
# = 0.15312443753011222
addsd xmm4, xmm3
mulsd xmm4, xmm2
mov rax, 0x3FC74669DE443505
movq xmm3, rax # xmm3 = 0x1.74669DE443505p-3
# = 0.18183635094502661
addsd xmm3, xmm4
mulsd xmm3, xmm2
mov rdx, 0x3FCC71C4FE8C7EC6
movq xmm4, rdx # xmm4 = 0x1.C71C4FE8C7EC6p-3
# = 0.22222196988240322
addsd xmm4, xmm3
mulsd xmm4, xmm2
mov rax, 0x3FD2492494532F9F
movq xmm3, rax # xmm3 = 0x1.2492494532F9Fp-2
# = 0.28571428761346767
addsd xmm3, xmm4
mulsd xmm3, xmm2
mov rdx, 0x3FD999999997AC3B
movq xmm4, rdx # xmm4 = 0x1.999999997AC3Bp-2
# = 0.39999999999298882
addsd xmm4, xmm3
mulsd xmm4, xmm2
mov rax, 0x3FE55555555555AE
movq xmm3, rax # xmm3 = 0x1.55555555555AEp-1
# = 0.66666666666667652
addsd xmm3, xmm4
mulsd xmm3, xmm2 # xmm3 = polynomial(argument"**2)
.Llogarithm:
mov rdx, 0x3FE0000000000000
movq xmm2, rdx # xmm2 = 0x1.0p-1
# = 0.5
mulsd xmm2, xmm0
mulsd xmm2, xmm0 # xmm2 = 0.5 * fraction of argument'**2
addsd xmm3, xmm2 # xmm3 = polynomial(argument"**2)
# + 0.5 * fraction of argument'**2
mulsd xmm3, xmm1 # xmm3 = (polynomial(argument"**2)
# + 0.5 * fraction of argument'**2)
# * argument"
subsd xmm2, xmm3
subsd xmm0, xmm2 # xmm0 = log(significand of argument')
.Lexponent:
cvtsi2sd xmm1, ecx # xmm1 = exponent'
mov rax, 0x3D2EF35793C76730
movq xmm3, rax # xmm3 = 0x1.EF35793C76730p-45
# = 0.54979230187083712e-13
# = tail of log(2.0)
mulsd xmm3, xmm1
addsd xmm0, xmm3
mov rdx, 0x3FE62E42FEFA3800
movq xmm2, rdx # xmm2 = 0x1.62E42FEFA3800p-1
# = 0.69314718055989033
# = head of log(2.0)
mulsd xmm2, xmm1
addsd xmm0, xmm2 # xmm0 = natural logarithm of argument
ret
.Ldenormal:
bsr rcx, rax # rcx = index of most significant '1' bit in argument << 1
add rax, rax
xor ecx, 63 # ecx = number of leading '0' bits in argument << 1
# = 11 - biased exponent
shl rax, cl # rax = (fractional part of) normalized argument << 12
neg ecx # ecx = biased exponent - 11
sub ecx, BIAS - 11 # ecx = unbiased exponent of normalized argument
jmp .Lcontinue
.Lrange:
jnz .Lnegative # argument <> ±0.0?
.Lzero:
mov rax, 0xFFF0000000000000
movq xmm0, rax # xmm0 = -0x1.0p+1024
# = -INFINITY
ret
.Lspecial:
shl rax, 11
jz .Lexit # argument = +INFINITY?
.Lindefinite:
.Lnegative:
mov rax, 0x7FF8000000000000
movq xmm0, rax # xmm0 = 0x1.8p+1024
# = INDEFINITE
.Lexit:
ret
.size log, .-log
.type log, @function
.global log
.end
// Copyright © 2004-2024, Stefan Kanthak <stefan.kanthak@nexgo.de>
#define INFINITY (1.0 / 0.5e-323)
#define INDEFINITE (0.0 * INFINITY)
#define M_E 0x1.5BF0A8B145769p+1
#define M_LN2 0x1.62E42FEFA39EFp-1
#define M_1_SQRT2 0x1.6A09E667F3BCDp-1
double frexp(double x, int *z);
// Faithfully rounded natural logarithm
double log(double argument)
{
double mantissa;
int exponent;
if (argument != argument)
return INDEFINITE;
if (argument < 0.0)
return INDEFINITE;
if (argument == 0.0)
return -INFINITY;
#ifdef OPTIONAL
if (argument == 1.0)
return 0.0;
if (argument == M_E)
return 1.0;
#endif
if (argument == INFINITY)
return INFINITY;
mantissa = frexp(argument, &exponent);
#ifdef OPTIONAL
if (mantissa == 0.5)
#if 0
return (exponent - 1) * M_LN2;
#else
return (exponent - 1) * 0x1.EF35793C76730p-45
+ (exponent - 1) * 0x1.62E42FEFA3800p-1;
#endif
#endif
if (mantissa < M_1_SQRT2) {
mantissa += mantissa;
exponent -= 1;
}
mantissa -= 1.0;
// for -0.29289321881345248 <= mantissa < 0.41421356237309505,
// a minimax polynomial of degree 19 approximates
// log1p(mantissa) = log(1 + mantissa)
mantissa += (((((((((((((((((((-0x1.CC4EC078138E3p-6 * mantissa
+0x1.0266CD08DB2F2p-4) * mantissa
-0x1.1654764F478ECp-4) * mantissa
+0x1.EA17E14773369p-5) * mantissa
-0x1.EED2E2BB64B2Ep-5) * mantissa
+0x1.0F23916A44515p-4) * mantissa
-0x1.25480A82633AFp-4) * mantissa
+0x1.3B4ED39194B87p-4) * mantissa
-0x1.554D5ACD502ABp-4) * mantissa
+0x1.745980F3FB889p-4) * mantissa
-0x1.9999C5BE751E3p-4) * mantissa
+0x1.C71C90DB06248p-4) * mantissa
-0x1.FFFFFFBD8606Dp-4) * mantissa
+0x1.249248DAE4B2Ap-3) * mantissa
-0x1.55555554A6A2Bp-3) * mantissa
+0x1.9999999A43E4Fp-3) * mantissa
-0x1.00000000013C7p-2) * mantissa
+0x1.5555555555103p-2) * mantissa
-0x1.FFFFFFFFFFFF2p-2) * mantissa) * mantissa;
// for integral |exponent| < 2048,
// the double-precision product exponent * 0x1.62E42FEFA3800p-1
// is exact; addition of the double-precision tail product
// exponent * 0x1.EF35793C76730p-45 yields log(2.0) * exponent
// within 2**(-50-52) from the exact product
mantissa += exponent * 0x1.EF35793C76730p-45;
mantissa += exponent * 0x1.62E42FEFA3800p-1;
return mantissa;
}
log_e(significand × 2**(exponent − 1023)) = (log_2(significand) + (exponent − 1023)) × log_e(2)
; Copyright © 2004-2024, Stefan Kanthak <stefan.kanthak@nexgo.de>
; https://msdn.microsoft.com/en-us/library/t63833dz.aspx
; log(x) = log(2) * log2(x)
; = natural logarithm (to base e)
.686
.model flat, C
.code
log proc public ; [esp+4] = argument
fldln2 ; st(0) = ln(2.0)
fld real8 ptr [esp+4] ; st(0) = argument,
; st(1) = ln(2.0)
fyl2x ; st(0) = natural logarithm of argument
ret
log endp
end
log1p()
Base-e alias Natural Logarithm Function
log1p() returns the base-e alias natural logarithm of its argument
incremented by one, i.e. log(1 + argument).
; Copyright © 2004-2024, Stefan Kanthak <stefan.kanthak@nexgo.de>
; https://msdn.microsoft.com/en-us/library/mt720722.aspx
; log1p(x) = log(2) * log2(1 + x)
; = natural logarithm (to base e) of (1 + x)
.686
.model flat, C
.code
log1p proc public ; [esp+4] = argument
fldln2 ; st(0) = ln(2.0)
fld real8 ptr [esp+4] ; st(0) = argument,
; st(1) = ln(2.0)
fld st(0) ; st(0) = argument,
; st(1) = argument,
; st(2) = ln(2.0)
fabs ; st(0) = |argument|,
; st(1) = argument,
; st(2) = ln(2.0)
ifdef DOUBLE
push 3FD2BEC3r
push 33018867r ; [esp] = 1.0 - sqrt(0.5)
; = 0.292893218813452482773840301888412795960903167724609375
fcomp real8 ptr [esp] ; st(0) = argument,
; st(1) = ln(2.0)
pop eax
else
push 3E95F61Ar ; [esp] = 1.0F - sqrtf(0.5F)
; = 0.292893230915069580078125
fcomp real4 ptr [esp] ; st(0) = argument,
; st(1) = ln(2.0)
endif
pop eax
fstsw ax ; ax = FPU status word
; B C3 TOP C2 C1 C0 low byte
; . 0 ... 0 . 0 ........ st(0) > [esp]
; . 0 ... 0 . 1 ........ st(0) < [esp]
; . 1 ... 0 . 0 ........ st(0) = [esp]
; . 1 ... 1 . 1 ........ st(0) # [esp]
sahf ; SF:ZF:0:AF:0:PF:1:CF = ah
; CF (carry flag) = C0
; C1
; PF (parity flag) = C2
; ZF (zero flag) = C3
; AF (adjust flag) = .
; SF (sign flag) = B(usy)
ja Lrange ; |argument| > 1.0 - sqrt(0.5)?
;; jp Lexit ; |argument| = INDEFINITE?
fyl2xp1 ; st(0) = natural logarithm of (argument + 1.0)
Lexit:
ret
Lrange:
fld1 ; st(0) = 1.0,
; st(1) = argument,
; st(2) = ln(2.0)
faddp st(1), st(0) ; st(0) = argument + 1.0,
; st(1) = ln(2.0)
fyl2x ; st(0) = natural logarithm of (argument + 1.0)
ret
log1p endp
end
log10()
Base-10 alias Common Logarithm Function
log10() returns the base-10 alias common logarithm of its argument.
log_10(significand × 2**(exponent − 1023)) = (log_2(significand) + (exponent − 1023)) × log_10(2)
// Copyright © 2004-2024, Stefan Kanthak <stefan.kanthak@nexgo.de>
double log(double x);
double log10(double argument)
{
return 0.43429448190325183 * log(argument);
}
# Copyright © 2004-2024, Stefan Kanthak <stefan.kanthak@nexgo.de>
# log10(<0) = INDEFINITE
# log10(±0) = -INFINITY
# log10(1) = 0
# log10(10) = 1
# log10(INFINITY) = INFINITY
# log10(1/x) = -log10(x)
# log10(x) = log10(e) * log(x)
# = common logarithm (to base 10)
.arch generic64
.code64
.intel_syntax noprefix
.extern log
.text
# xmm0 = argument
log10:
call log # xmm0 = log(argument)
mov rax, 0x3FDBCB7B1526E50E
movq xmm1, rax # xmm1 = 0x1.BCB7B1526E50Ep-2
# = 0.434294481903251828
# = log10(2.71828182845904524)
mulsd xmm0, xmm1 # xmm0 = log(argument) * log10(2.71828182845904524)
# = log10(argument)
ret
.size log10, .-log10
.type log10, @function
.weak log10
.end
; Copyright © 2004-2024, Stefan Kanthak <stefan.kanthak@nexgo.de>
; https://msdn.microsoft.com/en-us/library/t63833dz.aspx
; log10(x) = log10(2) * log2(x)
; = common logarithm (to base 10)
.686
.model flat, C
.code
log10 proc public ; [esp+4] = argument
fldlg2 ; st(0) = log10(2.0)
fld real8 ptr [esp+4] ; st(0) = argument,
; st(1) = log10(2.0)
fyl2x ; st(0) = common logarithm of argument
ret
log10 endp
end
log2()
Base-2 alias Binary Logarithm Function
log2() returns the base-2 alias binary logarithm of its argument.
log_2(significand × 2**(exponent − 1023)) = log_2(significand) + (exponent − 1023)
// Copyright © 2004-2024, Stefan Kanthak <stefan.kanthak@nexgo.de>
#define INFINITY (1.0 / 0.5e-323)
#define INDEFINITE (0.0 * INFINITY)
double frexp(double x, int *z);
double log2(double argument)
{
int exponent;
unsigned count = 5;
long long logarithm;
if (argument != argument)
return INDEFINITE;
if (argument < 0.0)
return INDEFINITE;
if (argument == 0.0)
return -INFINITY;
if (argument == INFINITY)
return INFINITY;
argument = frexp(argument, &exponent);
if (argument == 0.5)
return (double) (exponent - 1);
logarithm = exponent - 1;
do {
argument += argument;
argument *= argument;
argument *= argument;
argument *= argument;
argument *= argument;
argument *= argument;
argument *= argument;
argument *= argument;
argument *= argument;
argument *= argument;
argument *= argument;
argument = frexp(argument, &exponent);
logarithm <<= 10;
logarithm += exponent - 1;
} while (--count != 0);
return logarithm * 0x1.0p-50;
}
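The listing above squares the significand ten times per round and reads ten bits of the logarithm from the resulting exponent. The same idea with one bit per squaring can be sketched as follows; this is a simplified illustration, not the routine above, it assumes a positive, finite argument and the frexp() of <math.h>, and log2_by_squaring() is a hypothetical name.

```c
#include <assert.h>
#include <math.h>

// binary logarithm of a positive, finite x, one fraction bit per squaring
double log2_by_squaring(double x, int bits)
{
    int exponent;
    double m = 2.0 * frexp(x, &exponent);   // m in [1.0, 2.0)
    double result = exponent - 1;           // integral part of the logarithm
    double weight = 0.5;                    // value of the next fraction bit

    for (int i = 0; i < bits; i++) {
        m *= m;                             // doubles log2(m); m in [1.0, 4.0)
        if (m >= 2.0) {                     // next bit of the logarithm is 1
            result += weight;
            m *= 0.5;                       // back into [1.0, 2.0)
        }
        weight *= 0.5;
    }
    return result;
}
```

Squaring doubles the logarithm, so whether the square reaches 2.0 reveals the next fraction bit; squaring ten times at once, as in the listing, simply shifts ten such bits into the exponent field.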
// Copyright © 2004-2024, Stefan Kanthak <stefan.kanthak@nexgo.de>
#define INFINITY (1.0 / 0.5e-323)
#define INDEFINITE (0.0 * INFINITY)
double frexp(double x, int *z);
double log(double x);
double log2(double argument)
{
int exponent;
if (argument != argument)
return INDEFINITE;
if (argument < 0.0)
return INDEFINITE;
if (argument == 0.0)
return -INFINITY;
if (argument == INFINITY)
return INFINITY;
argument = frexp(argument, &exponent);
#ifdef OPTIONAL
if (argument == 0.5)
return (double) (exponent - 1);
#endif
return 1.4426950408889634 * log(argument) + exponent;
}
# Copyright © 2004-2024, Stefan Kanthak <stefan.kanthak@nexgo.de>
# log2(<0) = INDEFINITE
# log2(±0) = -INFINITY
# log2(1) = 0
# log2(2) = 1
# log2(INFINITY) = INFINITY
# log2(1/x) = -log2(x)
# log2(x) = log2(e) * log(x)
# = binary logarithm (to base 2)
.arch generic64
.code64
.intel_syntax noprefix
.extern log
.text
# xmm0 = argument
log2:
call log # xmm0 = log(argument)
mov rax, 0x3FF71547652B82FE
movq xmm1, rax # xmm1 = 0x1.71547652B82FEp+0
# = 1.44269504088896341
# = log2(2.71828182845904524)
mulsd xmm0, xmm1 # xmm0 = log(argument) * log2(2.71828182845904524)
# = log2(argument)
ret
.size log2, .-log2
.type log2, @function
.weak log2
.end
; Copyright © 2004-2024, Stefan Kanthak <stefan.kanthak@nexgo.de>
; https://msdn.microsoft.com/en-us/library/mt720721.aspx
; log2(x) = binary logarithm (to base 2)
.686
.model flat, C
.code
log2 proc public ; [esp+4] = argument
fld1 ; st(0) = 1.0
fld real8 ptr [esp+4] ; st(0) = argument,
; st(1) = 1.0
fyl2x ; st(0) = binary logarithm of argument
ret
log2 endp
end
logb()
Function
logb() returns the integral part of the base-2 logarithm of the absolute
value of its argument.
// Copyright © 2004-2024, Stefan Kanthak <stefan.kanthak@nexgo.de>
#define INFINITY (1.0 / 0.5e-323)
#define INDEFINITE (0.0 * INFINITY)
double logb(double argument)
{
int exponent;
if (argument != argument)
return INDEFINITE;
if (argument == 0.0)
return -INFINITY;
if (argument < 0.0)
argument = -argument;
if (argument == INFINITY)
return INFINITY;
exponent = *(unsigned long long *) &argument >> 52;
return (exponent & 2047) - 1023;
}
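The C listing above reads the biased exponent field directly, so a subnormal argument yields −1023, whereas the assembly listing below locates the most significant set bit. A portable way to obtain the exponent of the normalized value, sketched here under the assumption that the frexp() of <math.h> is available and the argument is finite and non-zero, avoids the pointer cast altogether; exponent_of() is a hypothetical name.

```c
#include <assert.h>
#include <math.h>

// unbiased binary exponent of a finite, non-zero x, subnormals included
int exponent_of(double x)
{
    int exponent;
    (void) frexp(x, &exponent);   // x = fraction * 2**exponent,
                                  // 0.5 <= |fraction| < 1.0
    return exponent - 1;          // rescale to 1.0 <= |significand| < 2.0
}
```

For the smallest subnormal 0x1.0p-1074 this gives −1074, the same value the standard logb() reports.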
# Copyright © 2004-2024, Stefan Kanthak <stefan.kanthak@nexgo.de>
# logb(±0) = -INFINITY
# logb(±0.5) = -1
# logb(±1) = 0
# logb(±2) = 1
# logb(±INFINITY) = INFINITY
# logb(INDEFINITE) = INDEFINITE
# logb(x) = floor(log2(fabs(x)))
.arch generic64
.code64
.equiv BIAS, 1023
.intel_syntax noprefix
.text
# xmm0 = argument
logb:
movq rcx, xmm0 # rcx = argument
add rcx, rcx # rcx = argument << 1
# = |argument| << 1
jz .Lzero # argument = ±0.0?
mov rax, rcx
shr rax, 53 # rax = biased exponent
jz .Ldenormal # biased exponent = 0?
# (argument denormal?)
cmp eax, BIAS * 2 + 1
jne .Lnormal # biased exponent <> 2047?
# (argument normal?)
shl rcx, 11 # rcx = fraction of argument << 12
jnz .Lindefinite # argument = INDEFINITE?
.Linfinity: # argument = ±INFINITY
mov rax, 0x7FF0000000000000
movq xmm0, rax # xmm0 = 0x1.0p+1024
# = INFINITY
ret
.Lnormal:
sub eax, BIAS # eax = biased exponent - 1023
# = unbiased exponent of argument
cvtsi2sd xmm0, eax # xmm0 = unbiased exponent of argument
ret
.Ldenormal:
bsr rax, rcx # rax = index of most significant '1' bit
# = biased exponent + 52
sub eax, BIAS + 52 # eax = unbiased exponent of argument
cvtsi2sd xmm0, eax # xmm0 = unbiased exponent of argument
ret
.Lzero:
mov rax, 0xFFF0000000000000
movq xmm0, rax # xmm0 = -0x1.0p+1024
# = -INFINITY
ret
.Lindefinite:
mov rax, 0x7FF8000000000000
movq xmm0, rax # xmm0 = 0x1.8p+1024
# = INDEFINITE
ret
.size logb, .-logb
.type logb, @function
.global logb
.end
; Copyright © 2004-2024, Stefan Kanthak <stefan.kanthak@nexgo.de>
; https://msdn.microsoft.com/en-us/library/e4x82d9s.aspx
.686
.model flat, C
.code
logb proc public ; [esp+4] = argument
fld real8 ptr [esp+4] ; st(0) = argument
fxtract ; st(0) = mantissa
; = argument / 2.0**exponent,
; st(1) = exponent
fstp st(0) ; st(0) = exponent
ret
logb endp
end
ilogb()
Function
ilogb() returns the integral part of the base-2 logarithm of the absolute
value of its argument as an integer.
// Copyright © 2004-2024, Stefan Kanthak <stefan.kanthak@nexgo.de>
int ilogb(double argument)
{
int exponent = *(unsigned long long *) &argument >> 52;
return argument == 0.0 ? -2147483648 : (exponent & 2047) - 1023;
}
# Copyright © 2004-2024, Stefan Kanthak <stefan.kanthak@nexgo.de>
# ilogb(±0) = -2**31
# ilogb(±0.5) = -1
# ilogb(±1) = 0
# ilogb(±2) = +1
# ilogb(±INFINITY) = +1024
# ilogb(INDEFINITE) = +1024
# ilogb(x) = floor(log2(fabs(x)))
.arch generic64
.code64
.equiv BIAS, 1023
.intel_syntax noprefix
.text
# xmm0 = argument
ilogb:
movq rcx, xmm0 # rcx = argument
add rcx, rcx # rcx = argument << 1
# = |argument| << 1
jz .Lzero # argument = ±0.0?
mov rax, rcx
shr rax, 53 # rax = biased exponent
jz .Ldenormal # biased exponent = 0?
# (argument denormal?)
.Lnormal:
sub eax, BIAS # eax = biased exponent - 1023
# = unbiased exponent of argument
ret
.Ldenormal:
bsr rax, rcx # rax = index of most significant '1' bit
# = biased exponent + 52
sub eax, BIAS + 52 # eax = unbiased exponent of argument
ret
.Lzero:
mov eax, -2147483648 # eax = -2**31
ret
.size ilogb, .-ilogb
.type ilogb, @function
.global ilogb
.end
; Copyright © 2004-2024, Stefan Kanthak <stefan.kanthak@nexgo.de>
; https://msdn.microsoft.com/en-us/library/mt720719.aspx
.686
.model flat, C
.code
ilogb proc public ; [esp+4] = argument
fld real8 ptr [esp+4] ; st(0) = argument
fxtract ; st(0) = mantissa
; = argument / 2.0**exponent,
; st(1) = exponent
fstp st(0) ; st(0) = exponent
push eax
fistp dword ptr [esp] ; [esp] = exponent,
; FPU stack is empty again
pop eax ; eax = exponent
ret
ilogb endp
end
machine e = 0x1.5BF0A8B145769p+1 = 2.7182818284590452 is 0x1.4D57EE2B1013Ap−53 = 1.4456468917292502e−16 less than the
exact value of e.
machine pi = 0x1.921FB54442D18p+1 = 3.1415926535897932 is 0x1.1A62633145C07p−53 = 1.2246467991473532e−16 less than the
exact value of π.
π = 3.14159265358979324
= 0x1.921FB54442D18p+1
remainder = 0x1'FFFFFFFFFFFFF'A61D414728C8B'C4F533
= quadrant 1, 0x0.0000000000000'59E2BEB8D7374'3B0ACD
= 0.00000000000000007796343665038750893128850032303923134435791966849159349864407171054711716273732946547170286066830158233642578125
= 7.79634366503875089e−17
= 0x1.678AFAE35CDD1p−54
reduced = 1.2246467991473532e−16
6381956970095103 × 2**797 =
0x16AC5B262CA1FF × 2**797 =
0x1.6AC5B262CA1FFp+849 =
5.319372648326541416707296656673541083813475…e+255
is the binary64 number that is closest to an integral multiple of π/2
remainder = quadrant 1, +2.983942503748065…e−19=0x1.604820E0811AA'802p−62
reduced = 4.68716592425462761112…e−19
cos(0x1.0p+120) = −0.92587902285483786730386176410741494673083320992866…
cos(2) = −0.41614683654714238699756822950076218976600077107554…
sin(22) = −0.00885130929040387592169025681577233246328920395133256644233083529808955201463… 22 = π × 7.002817496…
sin(1.0e+22) = −0.8522008497671888017727…
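These reference values are handy for testing any implementation. The following sketch compares them with the cos() and sin() of the platform's <math.h>, which is an assumption: a library with a less careful argument reduction will show a much larger deviation. The function names are hypothetical.

```c
#include <assert.h>
#include <math.h>

// absolute deviation of cos(2**120) from the reference value quoted above,
// rounded to 17 significant digits
double cos_deviation_2p120(void)
{
    return fabs(cos(0x1.0p+120) + 0.92587902285483787);
}

// absolute deviation of sin(1.0e+22) from the reference value quoted above
double sin_deviation_1e22(void)
{
    return fabs(sin(1.0e+22) + 0.85220084976718880);
}
```

Both arguments are far beyond the range in which a subtraction of multiples of a rounded π/2 works, so they exercise the reduction for huge arguments.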
// Copyright © 2004-2024, Stefan Kanthak <stefan.kanthak@nexgo.de>
#if 0
double π = 3.14159265358979323846264338327950288419716939937510582097494459230781640628620899862803482534211706798214808651320158099543173816435024903010475699058520066023404996466884655031174235631524875783834033912401662638819210406589597660857119182803827759368407141697525266952615084492003288375582792204376643164609551001015890514555563284293526749024749733507633209228515625;
#else
double π = 0x3.243F6A8885A308D313198A2E03707344A4093822299F31D0082EFA98EC4E6C89452821E638D01377BE5466CF34E9p+0;
#endif
#ifdef _MSC_VER
#pragma intrinsic(_umul128)
#else
static inline
unsigned long long _umul128(unsigned long long x, unsigned long long y, unsigned long long *z)
{
#ifdef __amd64__
unsigned long long l;
__asm__ ("mulq\t%3"
:"=a" (l), "=d" (*z)
:"%0" (y), "rm" (x)
:"cc");
return l;
#else
__uint128_t p = (__uint128_t) x * y;
*z = p >> 64;
return p;
#endif
}
#endif
int clzll(unsigned long long x);
long long llabs(long long x);
double fabs(double x);
double fma(double x, double y, double z);
double frexp(double x, int *z);
double ldexp(double x, int z);
double rint(double x);
int signbit(double x);
// reduce argument to interval [-π/4, π/4] and its quadrant
double __quarter_pi(double argument, int *quadrant)
{
double z = fabs(argument);
if (z < 262144.0) {
if (z < 0.78539816339744831) { // π/4
*quadrant = 0; // argument is already reduced
return argument;
}
// Cody-Waite argument reduction: 127 bits of π/2
#if _MSC_VER < 1914
z = rint(argument * 0.63661977236758134); // 2/π
argument -= z * 1.5707963267923332; // high part of π/2
argument -= z * 2.5633441515839558e-12; // middle part of π/2
argument -= z * 1.05629990669874275e-23; // low part of π/2
#else
z = rint(argument * 0x1.45F306DC9C883p-1); // 2/π
argument -= z * 0x1.921FB54440000p-0; // high part of π/2
argument -= z * 0x1.68C234C4C0000p-39; // middle part of π/2
argument -= z * 0x1.98A2E03707345p-77; // low part of π/2
#endif
*quadrant = (int) z;
return argument;
}
// Payne-Hanek argument reduction:
// 1216 bits of 2/π in 2.1214 fixed-point format
static const unsigned long long pi2inverse[19] = {0x28BE60DB9391054A,
0x7F09D5F47D4D3770,
0x36D8A5664F10E410,
0x7F9458EAF7AEF158,
0x6DC91B8E909374B8,
0x01924BBA82746487,
0x3F877AC72C4A69CF,
0xBA208D7D4BAED121,
0x3A671C09AD17DF90,
0x4E64758E60D4CE7D,
0x272117E2EF7E4A0E,
0xC7FE25FFF7816603,
0xFBCBC462D6829B47,
0xDB4D9FB3C9F2C26D,
0xD3D18FD9A797FA8B,
0x5D49EEB1FAF97C5E,
0xCF41CE7DE294A4BA,
0x9AFED7EC47E35742,
0x1580CC11BF1EDAEA};
unsigned long long high, mid, low, tmp, ull;
unsigned index, shift;
int exponent;
double head, tail;
// get fraction of |argument| in 0.64 fixed-point format and its
// exponent
#if 0
ull = (unsigned long long) (ldexp(frexp(z, &exponent), 64));
#else
ull = (unsigned long long) (frexp(z, &exponent) * 0x1.0p+64);
#endif
// get 192 bits of 2/π, determined by exponent of argument, in
// 2.190 fixed-point format
index = exponent >> 6;
shift = exponent & 63; // (1 << 6) - 1
high = pi2inverse[index];
mid = pi2inverse[index + 1];
low = pi2inverse[index + 2];
tmp = pi2inverse[index + 3];
if (shift != 0) {
high = (high << shift) | (mid >> (64 - shift));
mid = (mid << shift) | (low >> (64 - shift));
low = (low << shift) | (tmp >> (64 - shift));
}
// compute fraction of |argument| * 2/π in 2.190 fixed-point format
low = _umul128(ull, low, &tmp);
low = tmp;
mid = _umul128(ull, mid, &tmp) + low;
tmp += mid < low;
low = tmp; // carry into high word; mid keeps the low word
high = _umul128(ull, high, &tmp) + low;
tmp += high < low;
// convert fraction of |argument| * 2/π in 2.190 fixed-point format
// into fraction of ±argument * 2/π in 0.192 fixed-point format,
// shifting fraction of |argument| * 2/π from interval [0.0, 1.0]
// to fraction of ±argument * 2/π in interval [-0.5, 0.5],
// set quadrant to integer part of |argument| * 2/π modulo 4,
// and increment it when fraction of argument * 2/π is negative
// (sign change is equivalent to subtraction of 1.0)
*quadrant = (int) (tmp >> 62);
tmp <<= 2;
tmp |= high >> 62;
high <<= 2;
high |= mid >> 62;
mid <<= 2;
*quadrant += (long long) tmp < 0;
// if argument is negative, complement fraction and mirror quadrant
if (signbit(argument)) {
mid = 0 - mid;
high = 0 - high - (0 < mid);
tmp = 0 - tmp - (0 < high);
*quadrant = -*quadrant;
}
// convert fraction of argument * 2/π from 0.192 fixed-point format
// into (intermediate) double-double format; complement tail part when
// head part is negative, adjusting head part on overflow, i.e. when
// tail part is 0x8000000000000000
shift = clzll(llabs((long long) tmp));
if (shift > 11) {
tmp <<= shift - 11;
tmp |= high >> (64 - shift + 11);
high <<= shift - 11;
high |= mid >> (64 - shift + 11);
mid <<= shift - 11;
} else if (shift < 11) {
high >>= 11 - shift;
high |= tmp << (64 - 11 + shift);
tmp = (unsigned long long) ((long long) tmp >> (11 - shift));
}
if ((long long) tmp < 0) {
high = 0 - high;
tmp += high == 0x8000000000000000;
}
head = ldexp((double) (long long) tmp, 11 - (int) shift - 64);
tail = ldexp((double) (long long) high, 11 - (int) shift - 128);
// return remainder of argument / (π/2)
#ifdef FP_FAST_FMA
double x = tail * 1.5707963267948966;
double y = head * 6.123233995736766e-17;
z = fma(head, 6.123233995736766e-17, -y)
+ fma(tail, 1.5707963267948966, -x)
+ tail * 6.123233995736766e-17;
return fma(head, 1.5707963267948966, x + y + z);
#else
return (tail * 6.123233995736766e-17
+ (tail * 1.5707963267948966 + head * 6.123233995736766e-17))
+ head * 1.5707963267948966;
#endif
}
static inline
double __sin_cos_core(double reduced, int quadrant)
{
double square = reduced * reduced;
double result = 1 & quadrant
// polynomial approximation of cosine on [-π/4, π/4]
? ((((((-0x1.908B4EF9A7E2Ep-37 * square
+0x1.1EEB7C6903BA2p-29) * square
-0x1.27E4FA28F90C6p-22) * square
+0x1.A01A019F556D1p-16) * square
-0x1.6C16C16C16910p-10) * square
+0x1.5555555555555p-5) * square
-0.5) * square + 1.0
// polynomial approximation of sine on [-π/4, π/4]
: (((((+0x1.5E3C6B7EEB28Dp-33 * square
-0x1.AE60A561EEAB5p-26) * square
+0x1.71DE384036E7Dp-19) * square
-0x1.A01A019F1C947p-13) * square
+0x1.1111111110EB8p-7) * square
-0x1.5555555555555p-3) * square * reduced + reduced;
return 2 & quadrant ? 0.0 - result : result;
}
// NOTE: 0.0 * x + x yields NaN for x = ±INFINITY
double cos(double x)
{
int quadrant;
double reduced = __quarter_pi(0.0 * x + x, &quadrant);
return __sin_cos_core(reduced, 1 + quadrant);
}
double sin(double x)
{
int quadrant;
double reduced = __quarter_pi(0.0 * x + x, &quadrant);
return __sin_cos_core(reduced, quadrant);
}
// Copyright © 2004-2024, Stefan Kanthak <stefan.kanthak@nexgo.de>
double fma(double x, double y, double z);
static inline
double cos_poly(double x)
{
double s = x * x;
#ifdef FP_FAST_FMA
double t = -0x1.908B4EF9A7E2Ep-37;
t = fma(t, s, 0x1.1EEB7C6903BA2p-29);
t = fma(t, s, -0x1.27E4FA28F90C6p-22);
t = fma(t, s, 0x1.A01A019F556D1p-16);
t = fma(t, s, -0x1.6C16C16C16910p-10);
t = fma(t, s, 0x1.5555555555555p-5);
t = fma(t, s, -0.5);
t = fma(t, s, 1.0);
return t;
#else
#error
#endif
}
static inline
double cot_poly(double x)
{
double s = x * x;
#ifdef FP_FAST_FMA
double t = 0x1.2113ADD876256p-35;
t = fma(t, s, 0x1.D3F62BCB56407p-33);
t = fma(t, s, 0x1.3722FC10FB082p-29);
t = fma(t, s, 0x1.7D86D9F6CBA62p-26);
t = fma(t, s, 0x1.D6DC6DA8B4B97p-23);
t = fma(t, s, 0x1.228059183E28Cp-19);
t = fma(t, s, 0x1.66A8F2D1BC68Fp-16);
t = fma(t, s, 0x1.BBD7793321936p-13);
t = fma(t, s, 0x1.1566ABC011734p-9);
t = fma(t, s, 0x1.6C16C16C16C16p-6);
t = fma(t, s, 0x1.5555555555555p-2);
return t * s;
#else
#error
#endif
}
static inline
double sin_poly(double x)
{
double s = x * x;
#ifdef FP_FAST_FMA
#if 1
double t = 0x1.5E3C6B7EEB28Dp-33;
t = fma(t, s, -0x1.AE60A561EEAB5p-26);
t = fma(t, s, 0x1.71DE384036E7Dp-19);
t = fma(t, s, -0x1.A01A019F1C947p-13);
t = fma(t, s, 0x1.1111111110EB8p-7);
t = fma(t, s, -0x1.5555555555555p-3);
t = fma(t * s, x, x);
return t;
#else
double t = 0x1.5D8E4FD051E03p-33;
t = fma(t, s, -0x1.AE5E54BFD59F5p-26);
t = fma(t, s, 0x1.71DE355F53FB7p-19);
t = fma(t, s, -0x1.A01A019BF2621p-13);
t = fma(t, s, 0x1.1111111110F75p-7);
t = fma(t, s, -0x1.5555555555548p-3);
t = fma(t * s, x, x);
return t;
#endif
#else
return ((((((((-0x1.2622B22D526BEp-57 * s
+0x1.94FA618796592p-49) * s
-0x1.AE7EA531357BFp-41) * s
+0x1.6124601C23966p-33) * s
-0x1.AE64567CB5786p-26) * s
+0x1.71DE3A5568A50p-19) * s
-0x1.A01A01A019FC7p-13) * s
+0x1.111111111110Fp-7) * s
-0x1.5555555555555p-3) * s * x + x;
#endif
}
static inline
double tan_poly(double x)
{
double s = x * x;
#ifdef FP_FAST_FMA
double t = 0x1.5D99C5B37B8FBp-16;
t = fma(t, s, -0x1.778CB8106DD3Dp-15);
t = fma(t, s, 0x1.7656FC1431EF6p-14);
t = fma(t, s, -0x1.B6EA2534187CBp-16);
t = fma(t, s, 0x1.1DF93999DC111p-13);
t = fma(t, s, 0x1.D28899E55DABCp-13);
t = fma(t, s, 0x1.37F87931093E9p-11);
t = fma(t, s, 0x1.7D5B9D094E180p-10);
t = fma(t, s, 0x1.D6D92E4DC1EC9p-9);
t = fma(t, s, 0x1.226E1281741EDp-7);
t = fma(t, s, 0x1.664F49A7087A7p-6);
t = fma(t, s, 0x1.BA1BA1B472C71p-5);
t = fma(t, s, 0x1.111111111823Fp-3);
t = fma(t, s, 0x1.55555555554F2p-2);
t = fma(t, s, 1.0);
return t * x;
#else
return ((((((((+0x1.5445F555134EDp-12 * s
+0x1.269BE400DE3AFp-11) * s
+0x1.7EEF631E20B93p-10) * s
+0x1.D6C27C371C959p-9) * s
+0x1.226E7BFA35090p-7) * s
+0x1.664F4729F98E5p-6) * s
+0x1.BA1BA1BDCEC06p-5) * s
+0x1.111111110E933p-3) * s
+0x1.5555555555568p-2) * s * x + x;
#endif
}
cos()
(Circular) Cosine Function
cos() returns the (circular) cosine of its argument.
; Copyright © 2004-2024, Stefan Kanthak <stefan.kanthak@nexgo.de>
; https://msdn.microsoft.com/en-us/library/ff770589.aspx
.686
.model flat, C
.code
cos proc public ; [esp+4] = argument
fld real8 ptr [esp+4] ; st(0) = argument
fcos ; st(0) = cosine of argument
fstsw ax ; ax = FPU status word,
; ah = B:C3:T:O:P:C2:C1:C0
sahf ; SF:ZF:0:AF:0:PF:1:CF = ah
jnp Lexit ; |argument| < 2**63?
ifndef REDUCE
fsub st(0), st(0) ; st(0) = argument - argument
; = 0.0 (or INDEFINITE)
fdiv st(0), st(0) ; st(0) = INDEFINITE
else
fld1 ; st(0) = 1.0,
; st(1) = argument
fldpi ; st(0) = pi,
; st(1) = 1.0,
; st(2) = argument
fscale ; st(0) = pi * 2**1,
; st(1) = 1.0,
; st(2) = argument
fstp st(1) ; st(0) = pi * 2**1,
; st(1) = argument
fxch st(1) ; st(0) = argument,
; st(1) = pi * 2**1
Lreduce:
fprem1 ; st(0) = argument modulo (pi * 2**1)
; = argument',
; st(1) = pi * 2**1
fstsw ax ; ax = FPU status word,
; ah = B:C3:T:O:P:C2:C1:C0
sahf ; SF:ZF:0:AF:0:PF:1:CF = ah
jp Lreduce ; |argument'| > pi?
fstp st(1) ; st(0) = argument'
fcos ; st(0) = cosine of argument'
endif
Lexit:
ret
cos endp
end
Caveat: although the FSCALE instruction yields 2×π in
double-extended (80-bit) precision, and the FPREM1 instruction operates
in double-extended (80-bit) precision too, reduction of arguments
that are greater than 2**63 in magnitude to the interval
(−π, π) loses almost all precision: for example
0x1.6AC5B262CA1FFp+849, the double-precision number closest to an
integral multiple of π/2, is reduced to -0x1.E5C3B0F08A43A7B0p0 =
-1.897517260289773073471397690781259370851330459117889404296875
instead of 4.68716592425462761112…e−19!
cot()
(Circular) Cotangent Function
cot() returns the (circular) cotangent of its argument.
; Copyright © 2004-2024, Stefan Kanthak <stefan.kanthak@nexgo.de>
; cot(x) = 1 / tan(x)
; = cos(x) / sin(x)
.686
.model flat, C
.code
cot proc public ; [esp+4] = argument
fld real8 ptr [esp+4] ; st(0) = argument
fptan ; st(0) = 1.0,
; st(1) = tangent of argument
fstsw ax ; ax = FPU status word,
; ah = B:C3:T:O:P:C2:C1:C0
sahf ; SF:ZF:0:AF:0:PF:1:CF = ah
jnp Ldone ; |argument| < 2**63?
ifndef REDUCE
fsub st(0), st(0) ; st(0) = argument - argument
; = 0.0 (or INDEFINITE)
fdiv st(0), st(0) ; st(0) = INDEFINITE
ret
else
fld1 ; st(0) = 1.0,
; st(1) = argument
fldpi ; st(0) = pi,
; st(1) = 1.0,
; st(2) = argument
fscale ; st(0) = pi * 2**1,
; st(1) = 1.0,
; st(2) = argument
fstp st(1) ; st(0) = pi * 2**1,
; st(1) = argument
fxch st(1) ; st(0) = argument,
; st(1) = pi * 2**1
Lreduce:
fprem1 ; st(0) = argument modulo (pi * 2**1)
; = argument',
; st(1) = pi * 2**1
fstsw ax ; ax = FPU status word,
; ah = B:C3:T:O:P:C2:C1:C0
sahf ; SF:ZF:0:AF:0:PF:1:CF = ah
jp Lreduce ; |argument'| > pi?
fstp st(1) ; st(0) = argument'
fptan ; st(0) = 1.0,
; st(1) = tangent of argument'
endif
Ldone:
fdivrp st(1), st(0) ; st(0) = 1.0 / tangent of argument
; = cotangent of argument
ret
cot endp
end
Caveat: although the FSCALE instruction yields 2×π in
double-extended (80-bit) precision, and the FPREM1 instruction operates
in double-extended (80-bit) precision too, reduction of arguments
that are greater than 2**63 in magnitude to the interval
(−π, π) loses almost all precision: for example
0x1.6AC5B262CA1FFp+849, the double-precision number closest to an
integral multiple of π/2, is reduced to -0x1.E5C3B0F08A43A7B0p0 =
-1.897517260289773073471397690781259370851330459117889404296875
instead of 4.68716592425462761112…e−19!
sin()
(Circular) Sine Function
sin() returns the (circular) sine of its argument.
; Copyright © 2004-2024, Stefan Kanthak <stefan.kanthak@nexgo.de>
; https://msdn.microsoft.com/en-us/library/ff770597.aspx
.686
.model flat, C
.code
sin proc public ; [esp+4] = argument
fld real8 ptr [esp+4] ; st(0) = argument
fsin ; st(0) = sine of argument
fstsw ax ; ax = FPU status word,
; ah = B:C3:T:O:P:C2:C1:C0
sahf ; SF:ZF:0:AF:0:PF:1:CF = ah
jnp Lexit ; |argument| < 2**63?
ifndef REDUCE
fsub st(0), st(0) ; st(0) = argument - argument
; = 0.0 (or INDEFINITE)
fdiv st(0), st(0) ; st(0) = INDEFINITE
else
fld1 ; st(0) = 1.0,
; st(1) = argument
fldpi ; st(0) = pi,
; st(1) = 1.0,
; st(2) = argument
fscale ; st(0) = pi * 2**1,
; st(1) = 1.0,
; st(2) = argument
fstp st(1) ; st(0) = pi * 2**1,
; st(1) = argument
fxch st(1) ; st(0) = argument,
; st(1) = pi * 2**1
Lreduce:
fprem1 ; st(0) = argument modulo (pi * 2**1)
; = argument',
; st(1) = pi * 2**1
fstsw ax ; ax = FPU status word,
; ah = B:C3:T:O:P:C2:C1:C0
sahf ; SF:ZF:0:AF:0:PF:1:CF = ah
jp Lreduce ; |argument'| > pi?
fstp st(1) ; st(0) = argument'
fsin ; st(0) = sine of argument'
endif
Lexit:
ret
sin endp
end
Caveat: although the FSCALE instruction yields 2×π in
double-extended (80-bit) precision, and the FPREM1 instruction operates
in double-extended (80-bit) precision too, reduction of arguments
that are greater than 2**63 in magnitude to the interval
(−π, π) loses almost all precision: for example
0x1.6AC5B262CA1FFp+849, the double-precision number closest to an
integral multiple of π/2, is reduced to -0x1.E5C3B0F08A43A7B0p0 =
-1.897517260289773073471397690781259370851330459117889404296875
instead of 4.68716592425462761112…e−19!
tan()
(Circular) Tangent Function
tan() returns the (circular) tangent of its argument.
; Copyright © 2004-2024, Stefan Kanthak <stefan.kanthak@nexgo.de>
; https://msdn.microsoft.com/en-us/library/ff770595.aspx
.686
.model flat, C
.code
tan proc public ; [esp+4] = argument
fld real8 ptr [esp+4] ; st(0) = argument
fptan ; st(0) = 1.0,
; st(1) = tangent of argument
fstsw ax ; ax = FPU status word,
; ah = B:C3:T:O:P:C2:C1:C0
sahf ; SF:ZF:0:AF:0:PF:1:CF = ah
jnp Ldone ; |argument| < 2**63?
ifndef REDUCE
fsub st(0), st(0) ; st(0) = argument - argument
; = 0.0 (or INDEFINITE)
fdiv st(0), st(0) ; st(0) = INDEFINITE
ret
else
fld1 ; st(0) = 1.0,
; st(1) = argument
fldpi ; st(0) = pi,
; st(1) = 1.0,
; st(2) = argument
fscale ; st(0) = pi * 2**1,
; st(1) = 1.0,
; st(2) = argument
fstp st(1) ; st(0) = pi * 2**1,
; st(1) = argument
fxch st(1) ; st(0) = argument,
; st(1) = pi * 2**1
Lreduce:
fprem1 ; st(0) = argument modulo (pi * 2**1)
; = argument',
; st(1) = pi * 2**1
fstsw ax ; ax = FPU status word,
; ah = B:C3:T:O:P:C2:C1:C0
sahf ; SF:ZF:0:AF:0:PF:1:CF = ah
jp Lreduce ; |argument'| > pi?
fstp st(1) ; st(0) = argument'
fptan ; st(0) = 1.0,
; st(1) = tangent of argument'
endif
Ldone:
fstp st(0) ; st(0) = tangent of argument
ret
tan endp
end
Caveat: although the FSCALE instruction yields 2×π in
double-extended (80-bit) precision, and the FPREM1 instruction operates
in double-extended (80-bit) precision too, reduction of arguments
that are greater than 2**63 in magnitude to the interval
(−π, π) loses almost all precision: for example
0x1.6AC5B262CA1FFp+849, the double-precision number closest to an
integral multiple of π/2, is reduced to -0x1.E5C3B0F08A43A7B0p0 =
-1.897517260289773073471397690781259370851330459117889404296875
instead of 4.68716592425462761112…e−19!
acos()
Arc Cosine Function
acos() returns the (principal) arc cosine of its argument.
// Copyright © 2004-2024, Stefan Kanthak <stefan.kanthak@nexgo.de>
double copysign(double x, double y);
double fabs(double x);
double fma(double x, double y, double z);
int signbit(double x);
double sqrt(double x);
static inline
double asin_poly(double t)
{
// for -0.5 <= t <= 0.5,
// a minimax polynomial of degree 12 in t**2 approximates asin(t)
double s = t * t;
return (((((((((((+0x1.02FF4C7428A47p-5 * s
-0x1.032E75CCD4AE8p-6) * s
+0x1.3C0E0817E9742p-6) * s
+0x1.B0EF96B727E7Ep-8) * s
+0x1.8E3FD48D0FB6Fp-7) * s
+0x1.C70DDF81249FCp-7) * s
+0x1.1C6B5042EC6B2p-6) * s
+0x1.6E89F8578B64Ep-6) * s
+0x1.F1C72C5FD95BAp-6) * s
+0x1.6DB6DB407C2B3p-5) * s
+0x1.3333333375CD0p-4) * s
+0x1.55555555552F4p-3) * s * t + t;
}
double acos(double x)
{
// for -1.0 <= x < -0.5, arccos(x) = (π/2 - asin_poly(sqrt((1 + x) / 2))) * 2
// for -0.5 <= x <= 0.5, arccos(x) = π/2 - asin_poly(x)
// for 0.5 < x <= 1.0, arccos(x) = asin_poly(sqrt((1 - x) / 2)) * 2
double z = fabs(x);
int i = z > 0.5;
#ifdef OPTIONAL
#define INFINITY (1.0 / 0.5e-323)
#define INDEFINITE (0.0 * INFINITY)
if (x != x)
return INDEFINITE;
if (x == 1.0)
return 0.0;
if (x == 0.0)
return 1.57079632679489662; // π/2
if (x == -1.0)
return 3.14159265358979324; // π
if (z > 1.0)
return INDEFINITE;
#endif
if (i)
z = sqrt(0.5 - 0.5 * z);
z = asin_poly(z);
z = copysign(z, x);
if (i) { // |x| > 0.5?
if (signbit(x)) { // x < -0.5?
#ifdef FP_FAST_FMA
z = fma(1.8656436928143307, 0.8419594442630920, z);
#else
z += -0x1.5777A5CF72CECp-21; // tail of π/2
z += 0x1.921FC00000000p-0; // head of π/2
#endif
}
z += z;
} else {
#ifdef FP_FAST_FMA
z = fma(1.8656436928143307, 0.8419594442630920, -z);
#else
z = -0x1.5777A5CF72CECp-21 - z; // tail of π/2
z += 0x1.921FC00000000p-0; // head of π/2
#endif
}
return z;
}
Note: used within the fma() function, the product
1.8656436928143307 × 1.6839188885261840 = 0x1.DD9AD336A05p+0 × 0x1.AF154EEB562D6p+0 = 0x1.921FB54442D18469898CC517p+1
(courtesy of Norbert Juffa and Tor Myklebust) provides 104 bits of π, equivalent to 31 decimal places.
For floating-point numbers in the IEEE 754 32-bit binary single-precision format, the product
1.8663789 × 1.6832556 = 0x1.DDCB02p+0F × 0x1.AEE9D6p+0F = 0x1.921FB54442D6p+1
provides 45 bits of π, equivalent to 13 decimal places, when used within the fmaf() function.
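Where no fused multiply-add is available, the listings fall back to a head/tail split of π/2. The following C sketch isolates that step. The helper name pi_half_minus() and the two constant names are made up for this illustration; the tail value is the one whose bit pattern 0xBEA5777A5CF72CEC appears in the assembly listings.

```c
// Sketch only: pi_half_minus() is a hypothetical helper, not part of the library.
// The head carries the leading bits of π/2, the tail the remaining correction.
static const double PI_HALF_HEAD = 0x1.921FC00000000p+0;   //  1.5707969665527344
static const double PI_HALF_TAIL = -0x1.5777A5CF72CECp-21; // -6.3975783775576863e-7

// returns an approximation of π/2 - z
double pi_half_minus(double z)
{
    double r = PI_HALF_TAIL - z;    // combine the two small quantities first
    return r + PI_HALF_HEAD;        // then add the short head
}
```

With z = 0.0 the two additions yield 0x1.921FB54442D18p+0, the double nearest to π/2. Adding the small parts first keeps the bits of the tail from being rounded away before they can contribute.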
# Copyright © 2004-2024, Stefan Kanthak <stefan.kanthak@nexgo.de>
.arch generic64
.code64
.intel_syntax noprefix
.text
# xmm0 = argument
acos:
mov rax, 0x3FE0000000000000
movq xmm1, rax # xmm1 = 0x1.0p-1
# = 0.5
xorpd xmm2, xmm2 # xmm2 = 0.0
subsd xmm2, xmm0 # xmm2 = -argument
andpd xmm2, xmm0 # xmm2 = |argument|
xorpd xmm0, xmm2 # xmm0 = (argument & -0.0) ? -0.0 : 0.0
ucomisd xmm1, xmm2 # CF = (0.5 < |argument|)
# jp .Lindefinite # argument = INDEFINITE?
sbb eax, eax # eax = (0.5 < |argument|) ? -1 : 0
jnb .Lhorner # |argument| <= 0.5?
.Lbig:
mulsd xmm2, xmm1 # xmm2 = 0.5 * |argument|
subsd xmm1, xmm2 # xmm1 = 0.5 - 0.5 * |argument|
sqrtsd xmm2, xmm1 # xmm2 = sqrt(0.5 - 0.5 * |argument|)
# = argument'
.Lhorner:
movsd xmm1, xmm2 # xmm1 = argument'
mulsd xmm2, xmm2 # xmm2 = argument'**2
mov rcx, 0x3FA02FF4C7428A47
movq xmm3, rcx # xmm3 = 0x1.02FF4C7428A47p-5
# = 0.031615876506539346
mulsd xmm3, xmm2
mov rdx, 0xBF9032E75CCD4AE8
movq xmm4, rdx # xmm4 = -0x1.032E75CCD4AE8p-6
# = -0.015819182433299966
addsd xmm4, xmm3
mulsd xmm4, xmm2
mov rcx, 0x3F93C0E0817E9742
movq xmm3, rcx # xmm3 = 0x1.3C0E0817E9742p-6
# = 0.019290454772679107
addsd xmm3, xmm4
mulsd xmm3, xmm2
mov rdx, 0x3F7B0EF96B727E7E
movq xmm4, rdx # xmm4 = 0x1.B0EF96B727E7Ep-8
# = 0.006606077476277171
addsd xmm4, xmm3
mulsd xmm4, xmm2
mov rcx, 0x3F88E3FD48D0FB6F
movq xmm3, rcx # xmm3 = 0x1.8E3FD48D0FB6Fp-7
# = 0.012153605255773773
addsd xmm3, xmm4
mulsd xmm3, xmm2
mov rdx, 0x3F8C70DDF81249FC
movq xmm4, rdx # xmm4 = 0x1.C70DDF81249FCp-7
# = 0.013887151845016092
addsd xmm4, xmm3
mulsd xmm4, xmm2
mov rcx, 0x3F91C6B5042EC6B2
movq xmm3, rcx # xmm3 = 0x1.1C6B5042EC6B2p-6
# = 0.017359569912236146
addsd xmm3, xmm4
mulsd xmm3, xmm2
mov rdx, 0x3F96E89F8578B64E
movq xmm4, rdx # xmm4 = 0x1.6E89F8578B64Ep-6
# = 0.022371761819320483
addsd xmm4, xmm3
mulsd xmm4, xmm2
mov rcx, 0x3F9F1C72C5FD95BA
movq xmm3, rcx # xmm3 = 0x1.F1C72C5FD95BAp-6
# = 0.030381959280381322
addsd xmm3, xmm4
mulsd xmm3, xmm2
mov rdx, 0x3FA6DB6DB407C2B3
movq xmm4, rdx # xmm4 = 0x1.6DB6DB407C2B3p-5
# = 0.044642856813771024
addsd xmm4, xmm3
mulsd xmm4, xmm2
mov rcx, 0x3FB3333333375CD0
movq xmm3, rcx # xmm3 = 0x1.3333333375CD0p-4
# = 0.075000000003785816
addsd xmm3, xmm4
mulsd xmm3, xmm2
mov rdx, 0x3FC55555555552F4
movq xmm4, rdx # xmm4 = 0x1.55555555552F4p-3
# = 0.166666666666649754
addsd xmm4, xmm3
mulsd xmm4, xmm2
.if 0
mov rcx, 0x3FF0000000000000
movq xmm3, rcx # xmm3 = 0x1.0p+0
# = 1.0
addsd xmm3, xmm4
mulsd xmm1, xmm3 # xmm1 = polynomial(argument')
.else
mulsd xmm4, xmm1
addsd xmm1, xmm4 # xmm1 = polynomial(argument')
.endif
orpd xmm0, xmm1 # xmm0 = polynomial(argument)
test eax, eax
jz .Lsmall # |argument| <= 0.5?
movmskpd eax, xmm0 # eax = (argument & -0.0) ? 0b?1 : 0b?0
shr eax, 1
jnc .Lpositive # argument > 0.5?
.Lnegative:
mov rdx, 0x3FF921FC00000000
movq xmm1, rdx # xmm1 = 0x1.921FC00000000p-0
# = 1.5707969665527344
# = head of pi/2
addsd xmm1, xmm0
mov rcx, 0xBEA5777A5CF72CEC
movq xmm0, rcx # xmm0 = -0x1.5777A5CF72CECp-21
# = -6.3975783775576863e-7
# = tail of pi/2
addsd xmm0, xmm1 # xmm0 = pi/2 - polynomial(argument)
.Lpositive:
addsd xmm0, xmm0 # xmm0 = acos(argument)
ret
.Lsmall:
mov rcx, 0xBEA5777A5CF72CEC
movq xmm1, rcx # xmm1 = -0x1.5777A5CF72CECp-21
# = -6.3975783775576863e-7
# = tail of pi/2
subsd xmm1, xmm0
mov rdx, 0x3FF921FC00000000
movq xmm0, rdx # xmm0 = 0x1.921FC00000000p-0
# = 1.5707969665527344
# = head of pi/2
addsd xmm0, xmm1 # xmm0 = pi/2 - polynomial(argument)
# = acos(argument)
ret
.size acos, .-acos
.type acos, @function
.global acos
.end
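The assembly listing loads every constant as a 64-bit immediate through a general-purpose register. A small C sketch (the helper name double_bits() is hypothetical, and an IEEE 754 binary64 double is assumed) shows how such an immediate relates to the hexadecimal floating-point notation used in the comments:

```c
#include <stdint.h>
#include <string.h>

// Sketch only: returns the bit pattern of a double, assuming IEEE 754 binary64.
uint64_t double_bits(double x)
{
    uint64_t bits;
    memcpy(&bits, &x, sizeof bits);   // well-defined way to reinterpret the bytes
    return bits;
}
```

double_bits(0.5) is 0x3FE0000000000000 and double_bits(-0x1.5777A5CF72CECp-21) is 0xBEA5777A5CF72CEC, the immediates used for 0.5 and for the tail of π/2.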
The following implementation for the i387 FPU uses its FPATAN instruction and the formula
arccos(argument) = arctan2(argument, sqrt(1 − argument²))
based upon the identities cos(result) = argument, sin²(result) + cos²(result) = 1 and tan(result) = sin(result) / cos(result):
; Copyright © 2004-2024, Stefan Kanthak <stefan.kanthak@nexgo.de>
; https://msdn.microsoft.com/en-us/library/bztkwykh.aspx
; arccos(x) = arctan(sqrt((1 + x) * (1 - x)) / x)
; = arctan(sqrt(1 - x**2) / x)
; = arctan2(x, sqrt(1 - x**2))
; = arctan2(x, sqrt((1 + x) * (1 - x)))
.686
.model flat, C
.code
acos proc public ; [esp+4] = argument
fld real8 ptr [esp+4] ; st(0) = argument
if 0
fld1 ; st(0) = 1.0,
; st(1) = argument
fadd st(0), st(1) ; st(0) = 1.0 + argument,
; st(1) = argument
fld1 ; st(0) = 1.0,
; st(1) = 1.0 + argument,
; st(2) = argument
fsub st(0), st(2) ; st(0) = 1.0 - argument,
; st(1) = 1.0 + argument,
; st(2) = argument
fmulp st(1), st(0) ; st(0) = (1.0 - argument) * (1.0 + argument)
; = 1.0 - argument**2,
; st(1) = argument
else
fld st(0) ; st(0) = st(1) = argument
fmul st(0), st(0) ; st(0) = argument**2,
; st(1) = argument
fld1 ; st(0) = 1.0,
; st(1) = argument**2,
; st(2) = argument
fsubrp st(1), st(0) ; st(0) = 1.0 - argument**2,
; st(1) = argument
endif
fsqrt ; st(0) = square root of (1.0 - argument**2),
; st(1) = argument
fxch st(1) ; st(0) = argument,
; st(1) = square root of (1.0 - argument**2)
fpatan ; st(0) = inverse circular cosine of argument
ret
acos endp
end
acot() Arc Cotangent Function
acot() returns the (principal) arc cotangent of its argument.
; Copyright © 2004-2024, Stefan Kanthak <stefan.kanthak@nexgo.de>
; arccot(x) = arctan(1 / x)
; = arctan2(x, 1)
.686
.model flat, C
.code
acot proc public ; [esp+4] = argument
fld1 ; st(0) = 1.0
fld real8 ptr [esp+4] ; st(0) = argument,
; st(1) = 1.0
fpatan ; st(0) = inverse circular tangent of (1.0 / argument)
; = inverse circular cotangent of argument
ret
acot endp
end
acot2() Arc Cotangent Function
acot2() returns the (principal) arc cotangent of the quotient of its arguments.
; Copyright © 2004-2024, Stefan Kanthak <stefan.kanthak@nexgo.de>
; arccot2(y, x) = arctan2(x, y)
.686
.model flat, C
.code
acot2 proc public ; [esp+12] = denominator
; [esp+4] = numerator
fld real8 ptr [esp+12] ; st(0) = denominator
fld real8 ptr [esp+4] ; st(0) = numerator,
; st(1) = denominator
fpatan ; st(0) = inverse circular tangent of (denominator / numerator)
; = inverse circular cotangent of (numerator / denominator)
ret
acot2 endp
end
asin() Arc Sine Function
asin() returns the (principal) arc sine of its argument.
// Copyright © 2004-2024, Stefan Kanthak <stefan.kanthak@nexgo.de>
double copysign(double x, double y);
double fabs(double x);
double fma(double x, double y, double z);
double sqrt(double x);
static inline
double asin_poly(double t)
{
// for -0.5 <= t <= 0.5,
// a minimax polynomial of degree 12 in t**2 approximates asin(t)
double s = t * t;
return (((((((((((+0x1.02FF4C7428A47p-5 * s
-0x1.032E75CCD4AE8p-6) * s
+0x1.3C0E0817E9742p-6) * s
+0x1.B0EF96B727E7Ep-8) * s
+0x1.8E3FD48D0FB6Fp-7) * s
+0x1.C70DDF81249FCp-7) * s
+0x1.1C6B5042EC6B2p-6) * s
+0x1.6E89F8578B64Ep-6) * s
+0x1.F1C72C5FD95BAp-6) * s
+0x1.6DB6DB407C2B3p-5) * s
+0x1.3333333375CD0p-4) * s
+0x1.55555555552F4p-3) * s * t + t;
}
double asin(double x)
{
// for -1.0 <= x < -0.5, arcsin(x) = -π/2 + asin_poly(sqrt((1 + x) / 2)) * 2
// for -0.5 <= x <= 0.5, arcsin(x) = asin_poly(x)
// for 0.5 < x <= 1.0, arcsin(x) = π/2 - asin_poly(sqrt((1 - x) / 2)) * 2
double z = fabs(x);
int i = z > 0.5;
#ifdef OPTIONAL
#define INFINITY (1.0 / 0.5e-323)
#define INDEFINITE (0.0 * INFINITY)
if (x != x)
return INDEFINITE;
if (x == 0.0)
return x;
if (z == 1.0)
return copysign(1.57079632679489662, x); // ±π/2
if (z > 1.0)
return INDEFINITE;
#endif
if (i)
z = sqrt(0.5 - 0.5 * z);
z = asin_poly(z);
if (i) {
#ifdef FP_FAST_FMA
z = fma(1.8656436928143307, 0.8419594442630920, -2.0 * z);
#else
z = 0x1.921FC00000000p-0 - (z + z); // head of π/2
z += -0x1.5777A5CF72CECp-21; // tail of π/2
#endif
}
return copysign(z, x);
}
# Copyright © 2004-2024, Stefan Kanthak <stefan.kanthak@nexgo.de>
.arch generic64
.code64
.intel_syntax noprefix
.text
# xmm0 = argument
asin:
mov rax, 0x3FE0000000000000
movq xmm1, rax # xmm1 = 0x1.0p-1
# = 0.5
xorpd xmm2, xmm2 # xmm2 = 0.0
subsd xmm2, xmm0 # xmm2 = -argument
andpd xmm2, xmm0 # xmm2 = |argument|
xorpd xmm0, xmm2 # xmm0 = (argument & -0.0) ? -0.0 : 0.0
ucomisd xmm1, xmm2 # CF = (0.5 < |argument|)
# jp .Lindefinite # argument = INDEFINITE?
sbb eax, eax # eax = (0.5 < |argument|) ? -1 : 0
jnb .Lhorner # |argument| <= 0.5?
.Lbig:
mulsd xmm2, xmm1 # xmm2 = 0.5 * |argument|
subsd xmm1, xmm2 # xmm1 = 0.5 - 0.5 * |argument|
sqrtsd xmm2, xmm1 # xmm2 = sqrt(0.5 - 0.5 * |argument|)
# = argument'
.Lhorner:
movsd xmm1, xmm2 # xmm1 = argument'
mulsd xmm2, xmm2 # xmm2 = argument'**2
mov rcx, 0x3FA02FF4C7428A47
movq xmm3, rcx # xmm3 = 0x1.02FF4C7428A47p-5
# = 0.031615876506539346
mulsd xmm3, xmm2
mov rdx, 0xBF9032E75CCD4AE8
movq xmm4, rdx # xmm4 = -0x1.032E75CCD4AE8p-6
# = -0.015819182433299966
addsd xmm4, xmm3
mulsd xmm4, xmm2
mov rcx, 0x3F93C0E0817E9742
movq xmm3, rcx # xmm3 = 0x1.3C0E0817E9742p-6
# = 0.019290454772679107
addsd xmm3, xmm4
mulsd xmm3, xmm2
mov rdx, 0x3F7B0EF96B727E7E
movq xmm4, rdx # xmm4 = 0x1.B0EF96B727E7Ep-8
# = 0.006606077476277171
addsd xmm4, xmm3
mulsd xmm4, xmm2
mov rcx, 0x3F88E3FD48D0FB6F
movq xmm3, rcx # xmm3 = 0x1.8E3FD48D0FB6Fp-7
# = 0.012153605255773773
addsd xmm3, xmm4
mulsd xmm3, xmm2
mov rdx, 0x3F8C70DDF81249FC
movq xmm4, rdx # xmm4 = 0x1.C70DDF81249FCp-7
# = 0.013887151845016092
addsd xmm4, xmm3
mulsd xmm4, xmm2
mov rcx, 0x3F91C6B5042EC6B2
movq xmm3, rcx # xmm3 = 0x1.1C6B5042EC6B2p-6
# = 0.017359569912236146
addsd xmm3, xmm4
mulsd xmm3, xmm2
mov rdx, 0x3F96E89F8578B64E
movq xmm4, rdx # xmm4 = 0x1.6E89F8578B64Ep-6
# = 0.022371761819320483
addsd xmm4, xmm3
mulsd xmm4, xmm2
mov rcx, 0x3F9F1C72C5FD95BA
movq xmm3, rcx # xmm3 = 0x1.F1C72C5FD95BAp-6
# = 0.030381959280381322
addsd xmm3, xmm4
mulsd xmm3, xmm2
mov rdx, 0x3FA6DB6DB407C2B3
movq xmm4, rdx # xmm4 = 0x1.6DB6DB407C2B3p-5
# = 0.044642856813771024
addsd xmm4, xmm3
mulsd xmm4, xmm2
mov rcx, 0x3FB3333333375CD0
movq xmm3, rcx # xmm3 = 0x1.3333333375CD0p-4
# = 0.075000000003785816
addsd xmm3, xmm4
mulsd xmm3, xmm2
mov rdx, 0x3FC55555555552F4
movq xmm4, rdx # xmm4 = 0x1.55555555552F4p-3
# = 0.166666666666649754
addsd xmm4, xmm3
mulsd xmm4, xmm2
.if 0
mov rcx, 0x3FF0000000000000
movq xmm3, rcx # xmm3 = 0x1.0p+0
# = 1.0
addsd xmm3, xmm4
mulsd xmm1, xmm3 # xmm1 = polynomial(argument')
.else
mulsd xmm4, xmm1
addsd xmm1, xmm4 # xmm1 = polynomial(argument')
.endif
test eax, eax
jz .Lsmall # |argument| <= 0.5?
addsd xmm1, xmm1 # xmm1 = 2.0 * polynomial(argument')
mov rcx, 0x3FF921FC00000000
movq xmm2, rcx # xmm2 = 0x1.921FC00000000p-0
# = 1.5707969665527344
# = head of pi/2
subsd xmm2, xmm1
mov rdx, 0xBEA5777A5CF72CEC
movq xmm1, rdx # xmm1 = -0x1.5777A5CF72CECp-21
# = -6.3975783775576863e-7
# = tail of pi/2
addsd xmm1, xmm2 # xmm1 = pi/2 - 2.0 * polynomial(argument')
.Lsmall:
orpd xmm0, xmm1 # xmm0 = polynomial(argument)
# = asin(argument)
ret
.size asin, .-asin
.type asin, @function
.global asin
.end
The following implementation for the i387 FPU uses its FPATAN instruction and the formula
arcsin(argument) = arctan2(sqrt(1 − argument²), argument)
based upon the identities sin(result) = argument, sin²(result) + cos²(result) = 1 and tan(result) = sin(result) / cos(result):
; Copyright © 2004-2024, Stefan Kanthak <stefan.kanthak@nexgo.de>
; https://msdn.microsoft.com/en-us/library/txk32e70.aspx
; arcsin(x) = arctan(x / sqrt((1 + x) * (1 - x)))
; = arctan(x / sqrt(1 - x**2))
; = arctan2(sqrt(1 - x**2), x)
; = arctan2(sqrt((1 + x) * (1 - x)), x)
.686
.model flat, C
.code
asin proc public ; [esp+4] = argument
fld real8 ptr [esp+4] ; st(0) = argument
if 0
fld1 ; st(0) = 1.0,
; st(1) = argument
fadd st(0), st(1) ; st(0) = 1.0 + argument,
; st(1) = argument
fld1 ; st(0) = 1.0,
; st(1) = 1.0 + argument,
; st(2) = argument
fsub st(0), st(2) ; st(0) = 1.0 - argument,
; st(1) = 1.0 + argument,
; st(2) = argument
fmulp st(1), st(0) ; st(0) = (1.0 - argument) * (1.0 + argument)
; = 1.0 - argument**2,
; st(1) = argument
else
fld st(0) ; st(0) = st(1) = argument
fmul st(0), st(0) ; st(0) = argument**2,
; st(1) = argument
fld1 ; st(0) = 1.0,
; st(1) = argument**2,
; st(2) = argument
fsubrp st(1), st(0) ; st(0) = 1.0 - argument**2,
; st(1) = argument
endif
fsqrt ; st(0) = square root of (1.0 - argument**2),
; st(1) = argument
fpatan ; st(0) = inverse circular sine of argument
ret
asin endp
end
atan() Arc Tangent Function
atan() returns the (principal) arc tangent of its argument.
// Copyright © 2004-2024, Stefan Kanthak <stefan.kanthak@nexgo.de>
double copysign(double x, double y);
double fabs(double x);
double fma(double x, double y, double z);
int signbit(double x);
static inline
double atan_poly(double t)
{
// for -1.0 <= t <= 1.0,
// a minimax polynomial of degree 19 in t**2 approximates atan(t)
double s = t * t;
#ifdef FP_FAST_FMA
double r = -0x1.53E1D2A25FF34p-16;
r = fma(r, s, 0x1.D3B63DBB65AF4p-13);
r = fma(r, s, -0x1.312788DDE0801p-10);
r = fma(r, s, 0x1.F9690C82492DBp-9);
r = fma(r, s, -0x1.2CF5AABC7CEF3p-7);
r = fma(r, s, 0x1.162B0B2A3BFCEp-6);
r = fma(r, s, -0x1.A7256FEB6FC5Cp-6);
r = fma(r, s, 0x1.171560CE4A483p-5);
r = fma(r, s, -0x1.4F44D841450E1p-5);
r = fma(r, s, 0x1.7EE3D3F36BB94p-5);
r = fma(r, s, -0x1.AD32AE04A9FD1p-5);
r = fma(r, s, 0x1.E17813D66954Fp-5);
r = fma(r, s, -0x1.11089CA9A5BCDp-4);
r = fma(r, s, 0x1.3B12B2DB51738p-4);
r = fma(r, s, -0x1.745D022F8DC5Cp-4);
r = fma(r, s, 0x1.C71C709DFE927p-4);
r = fma(r, s, -0x1.2492491FA1744p-3);
r = fma(r, s, 0x1.99999999840D2p-3);
r = fma(r, s, -0x1.555555555544Cp-2);
r = fma(r, s, 1.0);
return r * t;
#else
return ((((((((((((((((((-0x1.3CBF44A88555Fp-16 * s
+0x1.B81666EB938AFp-13) * s
-0x1.21F657F3915DAp-10) * s
+0x1.E5005F4C78C20p-9) * s
-0x1.2399E74A75E56p-7) * s
+0x1.0FF6A2A0D2286p-6) * s
-0x1.A1006DE22CDACp-6) * s
+0x1.14C4D24651F2Ep-5) * s
-0x1.4DEE09915F638p-5) * s
+0x1.7E4B31D8A55AEp-5) * s
-0x1.ACFE938E04FCAp-5) * s
+0x1.E16A933B73622p-5) * s
-0x1.11074E45F93E0p-4) * s
+0x1.3B1283C0CA0B1p-4) * s
-0x1.745CFD878FEE8p-4) * s
+0x1.C71C704FB4F9Fp-4) * s
-0x1.2492491E100BBp-3) * s
+0x1.999999997B9DDp-3) * s
-0x1.55555555553C5p-2) * s * t + t;
#endif
}
double atan(double x)
{
// with arctan(-x) = -arctan(x),
// arctan(1 / x) = π/2 - arctan(x) for x > 0
// and arctan(1 / x) = -π/2 - arctan(x) for x < 0,
// for x < -1, arctan(x) = -π/2 - atan_poly(1 / x),
// for -1 <= x <= 1, arctan(x) = atan_poly(x),
// for 1 < x, arctan(x) = π/2 - atan_poly(1 / x)
double z = fabs(x);
int i = z > 1.0;
#ifdef OPTIONAL
#define INFINITY (1.0 / 0.5e-323)
#define INDEFINITE (0.0 * INFINITY)
if (x != x)
return INDEFINITE;
if (z == INFINITY)
return copysign(1.57079632679489662, x); // ±π/2
if (x == 0.0)
return x;
#endif
if (i)
z = 1.0 / z;
z = atan_poly(z);
if (i) {
#ifdef FP_FAST_FMA
z = fma(1.8656436928143307, 0.8419594442630920, -z);
#else
z = -0x1.5777A5CF72CECp-21 - z; // tail of π/2
z += 0x1.921FC00000000p-0; // head of π/2
#endif
}
return copysign(z, x);
}
double atan2(double y, double x)
{
double z;
#if 0
if (fabs(x) > fabs(y))
z = atan(y / x);
else {
z = atan(x / y);
z = copysign(1.57079632679489662, z) - z;
}
if (signbit(x))
z += copysign(3.14159265358979324, y);
#else
int i;
if (x == 0.0) {
if (y > 0.0)
return 1.57079632679489662; // π/2
if (y < 0.0)
return -1.57079632679489662; // -π/2
return signbit(x) ? copysign(3.14159265358979324, y) : y;
}
double w = y; // remember the sign of y for the result
y = fabs(y);
z = fabs(x);
i = z < y;
if (i)
z /= y;
else
z = y / z;
z = atan_poly(z); // angle of (|x|, |y|) or its complement
if (i) {
#ifdef FP_FAST_FMA
z = fma(1.8656436928143307, 0.8419594442630920, -z);
#else
z = -0x1.5777A5CF72CECp-21 - z; // tail of π/2
z += 0x1.921FC00000000p-0; // head of π/2
#endif
}
if (signbit(x)) { // x < 0: reflect into the second quadrant
#ifdef FP_FAST_FMA
z = fma(1.8656436928143307, 1.6839188885261840, -z);
#else
z = 3.14159265358979324 - z; // π - z
#endif
}
z = copysign(z, w); // lower half plane for negative y
#endif
return z;
}
# Copyright © 2004-2024, Stefan Kanthak <stefan.kanthak@nexgo.de>
.arch generic64
.code64
.intel_syntax noprefix
.text
# xmm0 = argument
atan:
mov rax, 0x3FF0000000000000
movq xmm1, rax # xmm1 = 0x1.0p+0
# = 1.0
xorpd xmm2, xmm2 # xmm2 = 0.0
subsd xmm2, xmm0 # xmm2 = -argument
andpd xmm2, xmm0 # xmm2 = |argument|
xorpd xmm0, xmm2 # xmm0 = (argument & -0.0) ? -0.0 : 0.0
ucomisd xmm1, xmm2 # CF = (1.0 < |argument|)
# jp .Lindefinite # argument = INDEFINITE?
sbb eax, eax # eax = (1.0 < |argument|) ? -1 : 0
jnb .Lhorner # |argument| <= 1.0?
.Lbig:
divsd xmm1, xmm2 # xmm1 = 1.0 / |argument|
movsd xmm2, xmm1 # xmm2 = argument'
.Lhorner:
movsd xmm1, xmm2 # xmm1 = argument'
mulsd xmm2, xmm2 # xmm2 = argument'**2
.ifdef ALTERNATE
mov rcx, 0xBEF53E1D2A25FF34
movq xmm3, rcx # xmm3 = -0x1.53E1D2A25FF34p-16
# = -2.0258553044438107e-5
mulsd xmm3, xmm2
mov rdx, 0x3F2D3B63DBB65AF4
movq xmm4, rdx # xmm4 = 0x1.D3B63DBB65AF4p-13
# = 2.2302240345758279e-4
addsd xmm4, xmm3
mulsd xmm4, xmm2
mov rcx, 0xBF5312788DDE0801
movq xmm3, rcx # xmm3 = -0x1.312788DDE0801p-10
# = -1.1640717779930478e-3
addsd xmm3, xmm4
mulsd xmm3, xmm2
mov rdx, 0x3F6F9690C82492DB
movq xmm4, rdx # xmm4 = 0x1.F9690C82492DBp-9
# = 3.8559749383629666e-3
addsd xmm4, xmm3
mulsd xmm4, xmm2
mov rcx, 0xBF82CF5AABC7CEF3
movq xmm3, rcx # xmm3 = -0x1.2CF5AABC7CEF3p-7
# = -9.1845592187165034e-3
addsd xmm3, xmm4
mulsd xmm3, xmm2
mov rdx, 0x3F9162B0B2A3BFCE
movq xmm4, rdx # xmm4 = 0x1.162B0B2A3BFCEp-6
# = 1.6978035834597276e-2
addsd xmm4, xmm3
mulsd xmm4, xmm2
mov rcx, 0xBF9A7256FEB6FC5C
movq xmm3, rcx # xmm3 = -0x1.A7256FEB6FC5Cp-6
# = -2.5826796814495942e-2
addsd xmm3, xmm4
mulsd xmm3, xmm2
mov rdx, 0x3FA171560CE4A483
movq xmm4, rdx # xmm4 = 0x1.171560CE4A483p-5
# = 3.4067811082715081e-2
addsd xmm4, xmm3
mulsd xmm4, xmm2
mov rcx, 0xBFA4F44D841450E1
movq xmm3, rcx # xmm3 = -0x1.4F44D841450E1p-5
# = -4.0926382420509951e-2
addsd xmm3, xmm4
mulsd xmm3, xmm2
mov rdx, 0x3FA7EE3D3F36BB94
movq xmm4, rdx # xmm4 = 0x1.7EE3D3F36BB94p-5
# = 4.6739496199157987e-2
addsd xmm4, xmm3
mulsd xmm4, xmm2
mov rcx, 0xBFAAD32AE04A9FD1
movq xmm3, rcx # xmm3 = -0x1.AD32AE04A9FD1p-5
# = -5.2392330054601317e-2
addsd xmm3, xmm4
mulsd xmm3, xmm2
mov rdx, 0x3FAE17813D66954F
movq xmm4, rdx # xmm4 = 0x1.E17813D66954Fp-5
# = 5.8773077721790849e-2
addsd xmm4, xmm3
mulsd xmm4, xmm2
mov rcx, 0xBFB11089CA9A5BCD
movq xmm3, rcx # xmm3 = -0x1.11089CA9A5BCDp-4
# = -6.6658603633512573e-2
addsd xmm3, xmm4
mulsd xmm3, xmm2
mov rdx, 0x3FB3B12B2DB51738
movq xmm4, rdx # xmm4 = 0x1.3B12B2DB51738p-4
# = 7.6922129305867837e-2
addsd xmm4, xmm3
mulsd xmm4, xmm2
mov rcx, 0xBFB745D022F8DC5C
movq xmm3, rcx # xmm3 = -0x1.745D022F8DC5Cp-4
# = -9.0909012354005225e-2
addsd xmm3, xmm4
mulsd xmm3, xmm2
mov rdx, 0x3FBC71C709DFE927
movq xmm4, rdx # xmm4 = 0x1.C71C709DFE927p-4
# = 0.11111110678749424
addsd xmm4, xmm3
mulsd xmm4, xmm2
mov rcx, 0xBFC2492491FA1744
movq xmm3, rcx # xmm3 = -0x1.2492491FA1744p-3
# = -0.14285714271334815
addsd xmm3, xmm4
mulsd xmm3, xmm2
mov rdx, 0x3FC99999999840D2
movq xmm4, rdx # xmm4 = 0x1.99999999840D2p-3
# = 0.19999999999755019
addsd xmm4, xmm3
mulsd xmm4, xmm2
mov rcx, 0xBFD555555555544C
movq xmm3, rcx # xmm3 = -0x1.555555555544Cp-2
# = -0.3333333333333186
addsd xmm3, xmm4
mulsd xmm3, xmm2
.else
mov rcx, 0xBEF3CBF44A88555F
movq xmm3, rcx # xmm3 = -0x1.3CBF44A88555Fp-16
# = -1.8879600846307350e-5
mulsd xmm3, xmm2
mov rdx, 0x3F2B81666EB938AF
movq xmm4, rdx # xmm4 = 0x1.B81666EB938AFp-13
# = 2.0985007664581698e-4
addsd xmm4, xmm3
mulsd xmm4, xmm2
mov rcx, 0xBF521F657F3915DA
movq xmm3, rcx # xmm3 = -0x1.21F657F3915DAp-10
# = -0.0011061183148667248
addsd xmm3, xmm4
mulsd xmm3, xmm2
mov rdx, 0x3F6E5005F4C78C20
movq xmm4, rdx # xmm4 = 0x1.E5005F4C78C20p-9
# = 0.003700267441887131
addsd xmm4, xmm3
mulsd xmm4, xmm2
mov rcx, 0xBF82399E74A75E56
movq xmm3, rcx # xmm3 = -0x1.2399E74A75E56p-7
# = -0.008898961958876555
addsd xmm3, xmm4
mulsd xmm3, xmm2
mov rdx, 0x3F90FF6A2A0D2286
movq xmm4, rdx # xmm4 = 0x1.0FF6A2A0D2286p-6
# = 0.016599329773529202
addsd xmm4, xmm3
mulsd xmm4, xmm2
mov rcx, 0xBF9A1006DE22CDAC
movq xmm3, rcx # xmm3 = -0x1.A1006DE22CDACp-6
# = -0.025451762493231264
addsd xmm3, xmm4
mulsd xmm3, xmm2
mov rdx, 0x3FA14C4D24651F2E
movq xmm4, rdx # xmm4 = 0x1.14C4D24651F2Ep-5
# = 0.033785258000135307
addsd xmm4, xmm3
mulsd xmm4, xmm2
mov rcx, 0xBFA4DEE09915F638
movq xmm3, rcx # xmm3 = -0x1.4DEE09915F638p-5
# = -0.040762919127683650
addsd xmm3, xmm4
mulsd xmm3, xmm2
mov rdx, 0x3FA7E4B31D8A55AE
movq xmm4, rdx # xmm4 = 0x1.7E4B31D8A55AEp-5
# = 0.046666715007784063
addsd xmm4, xmm3
mulsd xmm4, xmm2
mov rcx, 0xBFAACFE938E04FCA
movq xmm3, rcx # xmm3 = -0x1.ACFE938E04FCAp-5
# = -0.052367485230348246
addsd xmm3, xmm4
mulsd xmm3, xmm2
mov rdx, 0x3FAE16A933B73622
movq xmm4, rdx # xmm4 = 0x1.E16A933B73622p-5
# = 0.058766639292667358
addsd xmm4, xmm3
mulsd xmm4, xmm2
mov rcx, 0xBFB11074E45F93E0
movq xmm3, rcx # xmm3 = -0x1.11074E45F93E0p-4
# = -0.066657357936108053
addsd xmm3, xmm4
mulsd xmm3, xmm2
mov rdx, 0x3FB3B1283C0CA0B1
movq xmm4, rdx # xmm4 = 0x1.3B1283C0CA0B1p-4
# = 0.076921953831176962
addsd xmm4, xmm3
mulsd xmm4, xmm2
mov rcx, 0xBFB745CFD878FEE8
movq xmm3, rcx # xmm3 = -0x1.745CFD878FEE8p-4
# = -0.090908995008245008
addsd xmm3, xmm4
mulsd xmm3, xmm2
mov rdx, 0x3FBC71C704FB4F9F
movq xmm4, rdx # xmm4 = 0x1.C71C704FB4F9Fp-4
# = 0.111111105648261418
addsd xmm4, xmm3
mulsd xmm4, xmm2
mov rcx, 0xBFC2492491E100BB
movq xmm3, rcx # xmm3 = -0x1.2492491E100BBp-3
# = -0.142857142667713294
addsd xmm3, xmm4
mulsd xmm3, xmm2
mov rdx, 0x3FC999999997B9DD
movq xmm4, rdx # xmm4 = 0x1.999999997B9DDp-3
# = 0.199999999996591266
addsd xmm4, xmm3
mulsd xmm4, xmm2
mov rcx, 0xBFD55555555553C5
movq xmm3, rcx # xmm3 = -0x1.55555555553C5p-2
# = -0.333333333333311110
addsd xmm3, xmm4
mulsd xmm3, xmm2
.endif # ALTERNATE
.if 0
mov rdx, 0x3FF0000000000000
movq xmm4, rdx # xmm4 = 0x1.0p+0
# = 1.0
addsd xmm4, xmm3
mulsd xmm1, xmm4 # xmm1 = polynomial(argument')
.else
mulsd xmm3, xmm1
addsd xmm1, xmm3 # xmm1 = polynomial(argument')
.endif
test eax, eax
jz .Lsmall # |argument| <= 1.0?
mov rdx, 0x3FF921FC00000000
movq xmm2, rdx # xmm2 = 0x1.921FC00000000p-0
# = 1.5707969665527344
# = head of pi/2
subsd xmm2, xmm1
mov rcx, 0xBEA5777A5CF72CEC
movq xmm1, rcx # xmm1 = -0x1.5777A5CF72CECp-21
# = -6.3975783775576863e-7
# = tail of pi/2
addsd xmm1, xmm2 # xmm1 = pi/2 - polynomial(argument')
# = atan(|argument|)
.Lsmall:
orpd xmm0, xmm1 # xmm0 = atan(argument)
ret
.size atan, .-atan
.type atan, @function
.global atan
.end
; Copyright © 2004-2024, Stefan Kanthak <stefan.kanthak@nexgo.de>
; https://msdn.microsoft.com/en-us/library/88c36t42.aspx
; arctan(x) = arctan2(1, x)
.686
.model flat, C
.code
atan proc public ; [esp+4] = argument
fld real8 ptr [esp+4] ; st(0) = argument
fld1 ; st(0) = 1.0,
; st(1) = argument
fpatan ; st(0) = inverse circular tangent of (argument / 1.0)
ret
atan endp
end
atan2() Arc Tangent Function
atan2() returns the (principal) arc tangent of the quotient of its arguments.
; Copyright © 2004-2024, Stefan Kanthak <stefan.kanthak@nexgo.de>
; https://msdn.microsoft.com/en-us/library/88c36t42.aspx
.686
.model flat, C
.code
atan2 proc public ; [esp+12] = denominator
; [esp+4] = numerator
fld real8 ptr [esp+4] ; st(0) = numerator
fld real8 ptr [esp+12] ; st(0) = denominator,
; st(1) = numerator
fpatan ; st(0) = inverse circular tangent of (numerator / denominator)
ret
atan2 endp
end
cosh() Hyperbolic Cosine Function
cosh() returns the hyperbolic cosine of its argument.
coth() Hyperbolic Cotangent Function
coth() returns the hyperbolic cotangent of its argument.
sinh() Hyperbolic Sine Function
sinh() returns the hyperbolic sine of its argument.
tanh() Hyperbolic Tangent Function
tanh() returns the hyperbolic tangent of its argument.
acosh() Area Hyperbolic Cosine Function
acosh() returns the inverse hyperbolic cosine of its argument.
# Copyright © 2004-2024, Stefan Kanthak <stefan.kanthak@nexgo.de>
# arcosh(x) = log(x + sqrt((x + 1) * (x - 1)))
# = log(x + sqrt(x**2 - 1))
.arch generic64
.code64
.intel_syntax noprefix
.extern log
.text
# xmm0 = argument
acosh:
mov rax, 0x3FF0000000000000
movq xmm1, rax # xmm1 = 0x1.0p+0
# = 1.0
movsd xmm2, xmm0 # xmm2 = argument
mulsd xmm0, xmm0 # xmm0 = argument**2
subsd xmm0, xmm1 # xmm0 = argument**2 - 1.0
sqrtsd xmm0, xmm0 # xmm0 = sqrt(argument**2 - 1.0)
addsd xmm0, xmm2 # xmm0 = sqrt(argument**2 - 1.0) + argument
jmp log
.size acosh, .-acosh
.type acosh, @function
.weak acosh
.end
; Copyright © 2004-2024, Stefan Kanthak <stefan.kanthak@nexgo.de>
; https://msdn.microsoft.com/en-us/library/dn465171.aspx
; arcosh(x) = log(x + sqrt((x + 1) * (x - 1)))
; = log(x + sqrt(x**2 - 1))
.686
.model flat, C
.code
acosh proc public ; [esp+4] = argument
fldln2 ; st(0) = ln(2.0)
fld real8 ptr [esp+4] ; st(0) = argument,
; st(1) = ln(2.0)
fld st(0) ; st(0) = argument,
; st(1) = argument,
; st(2) = ln(2.0)
fmul st(0), st(0) ; st(0) = argument**2,
; st(1) = argument,
; st(2) = ln(2.0)
fld1 ; st(0) = 1.0,
; st(1) = argument**2,
; st(2) = argument,
; st(3) = ln(2.0)
fsubp st(1), st(0) ; st(0) = argument**2 - 1.0,
; st(1) = argument,
; st(2) = ln(2.0)
fsqrt ; st(0) = sqrt(argument**2 - 1.0),
; st(1) = argument,
; st(2) = ln(2.0)
faddp st(1), st(0) ; st(0) = argument + sqrt(argument**2 - 1.0),
; st(1) = ln(2.0)
fyl2x ; st(0) = natural logarithm of (argument + sqrt(argument**2 - 1.0))
; = inverse hyperbolic cosine of argument
ret
acosh endp
end
acoth() Area Hyperbolic Cotangent Function
acoth() returns the inverse hyperbolic cotangent of its argument.
# Copyright © 2004-2024, Stefan Kanthak <stefan.kanthak@nexgo.de>
# arcoth(x) = log((x + 1) / (x - 1)) / 2
# = log(1 + 2 / (x - 1)) / 2
.arch generic64
.code64
.intel_syntax noprefix
.extern log
.text
# xmm0 = argument
acoth:
mov rax, 0x3FF0000000000000
movq xmm1, rax # xmm1 = 0x1.0p+0
# = 1.0
movsd xmm2, xmm0 # xmm2 = argument
addsd xmm0, xmm1 # xmm0 = argument + 1.0
subsd xmm2, xmm1 # xmm2 = argument - 1.0
divsd xmm0, xmm2 # xmm0 = (argument + 1.0) / (argument - 1.0)
call log # xmm0 = log((argument + 1.0) / (argument - 1.0))
mov rax, 0x3FE0000000000000
movq xmm1, rax # xmm1 = 0x1.0p-1
# = 0.5
mulsd xmm0, xmm1 # xmm0 = acoth(argument)
ret
.size acoth, .-acoth
.type acoth, @function
.weak acoth
.end
; Copyright © 2004-2024, Stefan Kanthak <stefan.kanthak@nexgo.de>
; arcoth(x) = log((x + 1) / (x - 1)) / 2
; = log(1 + 2 / (x - 1)) / 2
; = log1p(2 / (x - 1)) / 2
.686
.model flat, C
.code
single record sign:1, exponent:8, mantissa:23
bias equ 1 shl (width exponent - 1) - 1
acoth proc public ; [esp+4] = argument
fldln2 ; st(0) = ln(2.0)
fld real8 ptr [esp+4] ; st(0) = argument,
; st(1) = ln(2.0)
fld st(0) ; st(0) = argument,
; st(1) = argument,
; st(2) = ln(2.0)
fld1 ; st(0) = 1.0,
; st(1) = argument,
; st(2) = argument,
; st(3) = ln(2.0)
fadd st(2), st(0) ; st(0) = 1.0,
; st(1) = argument,
; st(2) = argument + 1.0,
; st(3) = ln(2.0)
fsubp st(1), st(0) ; st(0) = argument - 1.0,
; st(1) = argument + 1.0,
; st(2) = ln(2.0)
fdivp st(1), st(0) ; st(0) = (argument + 1.0) / (argument - 1.0),
; st(1) = ln(2.0)
fyl2x ; st(0) = natural logarithm of ((argument + 1.0) / (argument - 1.0))
push (bias - 1) shl width mantissa
; [esp] = 0x3F000000
; = 0.5F
fmul real4 ptr [esp] ; st(0) = inverse hyperbolic cotangent of argument
pop eax
ret
acoth endp
end
asinh() Area Hyperbolic Sine Function
asinh() returns the inverse hyperbolic sine of its argument.
# Copyright © 2004-2024, Stefan Kanthak <stefan.kanthak@nexgo.de>
# arsinh(x) = log(x + sqrt(x**2 + 1))
.arch generic64
.code64
.intel_syntax noprefix
.extern log
.text
# xmm0 = argument
asinh:
mov rax, 0x3FF0000000000000
movq xmm1, rax # xmm1 = 0x1.0p+0
# = 1.0
movsd xmm2, xmm0 # xmm2 = argument
mulsd xmm0, xmm0 # xmm0 = argument**2
addsd xmm0, xmm1 # xmm0 = argument**2 + 1.0
sqrtsd xmm0, xmm0 # xmm0 = sqrt(argument**2 + 1.0)
addsd xmm0, xmm2 # xmm0 = sqrt(argument**2 + 1.0) + argument
jmp log
.size asinh, .-asinh
.type asinh, @function
.weak asinh
.end
; Copyright © 2004-2024, Stefan Kanthak <stefan.kanthak@nexgo.de>
; https://msdn.microsoft.com/en-us/library/dn465168.aspx
; arsinh(x) = log(x + sqrt(x**2 + 1))
; = log1p(x + sqrt(x**2 + 1) - 1)
; = log1p(x + x**2 / (sqrt(x**2 + 1) + 1))
.686
.model flat, C
.code
asinh proc public ; [esp+4] = argument
fldln2 ; st(0) = ln(2.0)
fld real8 ptr [esp+4] ; st(0) = argument,
; st(1) = ln(2.0)
fld st(0) ; st(0) = argument,
; st(1) = argument,
; st(2) = ln(2.0)
fmul st(0), st(0) ; st(0) = argument**2,
; st(1) = argument,
; st(2) = ln(2.0)
fld1 ; st(0) = 1.0,
; st(1) = argument**2,
; st(2) = argument,
; st(3) = ln(2.0)
faddp st(1), st(0) ; st(0) = argument**2 + 1.0,
; st(1) = argument,
; st(2) = ln(2.0)
fsqrt ; st(0) = sqrt(argument**2 + 1.0),
; st(1) = argument,
; st(2) = ln(2.0)
faddp st(1), st(0) ; st(0) = argument + sqrt(argument**2 + 1.0),
; st(1) = ln(2.0)
fyl2x ; st(0) = natural logarithm of (argument + sqrt(argument**2 + 1.0))
; = inverse hyperbolic sine of argument
ret
asinh endp
end
atanh() Area Hyperbolic Tangent Function
atanh() returns the inverse hyperbolic tangent of its argument.
# Copyright © 2004-2024, Stefan Kanthak <stefan.kanthak@nexgo.de>
# artanh(x) = log((1 + x) / (1 - x)) / 2
# = log(1 + 2 * x / (1 - x)) / 2
.arch generic64
.code64
.intel_syntax noprefix
.extern log
.text
# xmm0 = argument
atanh:
mov rax, 0x3FF0000000000000
movq xmm1, rax # xmm1 = 0x1.0p+0
# = 1.0
movsd xmm2, xmm0 # xmm2 = argument
addsd xmm0, xmm1 # xmm0 = 1.0 + argument
subsd xmm1, xmm2 # xmm1 = 1.0 - argument
divsd xmm0, xmm1 # xmm0 = (1.0 + argument) / (1.0 - argument)
call log # xmm0 = log((1.0 + argument) / (1.0 - argument))
mov rax, 0x3FE0000000000000
movq xmm1, rax # xmm1 = 0x1.0p-1
# = 0.5
mulsd xmm0, xmm1 # xmm0 = atanh(argument)
ret
.size atanh, .-atanh
.type atanh, @function
.weak atanh
.end
; Copyright © 2004-2024, Stefan Kanthak <stefan.kanthak@nexgo.de>
; https://msdn.microsoft.com/en-us/library/dn324930.aspx
; artanh(x) = log((1 + x) / (1 - x)) / 2
; = log(1 + 2 * x / (1 - x)) / 2
; = log1p(2 * x / (1 - x)) / 2
; artanh(x) = log((1 + x) / (1 - x)) / 2
; = (log(1 + x) - log(1 - x)) / 2
; = (log1p(x) - log1p(-x)) / 2
.686
.model flat, C
.code
single record sign:1, exponent:8, mantissa:23
bias equ 1 shl (width exponent - 1) - 1
atanh proc public ; [esp+4] = argument
fldln2 ; st(0) = ln(2.0)
fld1 ; st(0) = 1.0,
; st(1) = ln(2.0)
fld1 ; st(0) = 1.0,
; st(1) = 1.0,
; st(2) = ln(2.0)
fld real8 ptr [esp+4] ; st(0) = argument,
; st(1) = 1.0,
; st(2) = 1.0,
; st(3) = ln(2.0)
fadd st(2), st(0) ; st(0) = argument,
; st(1) = 1.0,
; st(2) = 1.0 + argument,
; st(3) = ln(2.0)
fsubp st(1), st(0) ; st(0) = 1.0 - argument,
; st(1) = 1.0 + argument,
; st(2) = ln(2.0)
fdivp st(1), st(0) ; st(0) = (1.0 + argument) / (1.0 - argument),
; st(1) = ln(2.0)
fyl2x ; st(0) = natural logarithm of ((1.0 + argument) / (1.0 - argument))
push (bias - 1) shl width mantissa
; [esp] = 0x3F000000
; = 0.5F
fmul real4 ptr [esp] ; st(0) = inverse hyperbolic tangent of argument
pop eax
ret
atanh endp
end
fmax() Function
fmax() returns its other argument if one argument is a NaN, else the larger of its arguments.
// Copyright © 2004-2024, Stefan Kanthak <stefan.kanthak@nexgo.de>
double fmax(double left, double right)
{
#ifdef QUIET
return (left > right) || (right != right) ? (left == left ? left : left + left) : right;
#else
return (left > right) || (right != right) ? left : right;
#endif
}
Note: with the preprocessor macro QUIET defined, a signaling NaN is returned as a quiet NaN.
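The NaN rule stated above can be checked with a few calls. The sketch below repeats the selection logic of the branch without QUIET under the hypothetical name fmax_sketch(), to avoid a clash with the declaration in <math.h>, which is included only for the NAN macro.

```c
#include <math.h>   // only for the NAN macro

// Sketch only: left wins if it is larger or if right is a NaN, else right.
double fmax_sketch(double left, double right)
{
    return (left > right) || (right != right) ? left : right;
}
```

fmax_sketch(NAN, 2.0) returns 2.0 and fmax_sketch(1.0, NAN) returns 1.0; only when both arguments are NaNs is the result a NaN.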
# Copyright © 2004-2024, Stefan Kanthak <stefan.kanthak@nexgo.de>
.arch generic64
.code64
.intel_syntax noprefix
.text
# xmm0 = left
# xmm1 = right
fmax:
movsd xmm2, xmm0 # xmm2 = left
maxsd xmm2, xmm1 # xmm2 = (left > right) ? left : right
# = (left # right) ? right : max(left, right)
cmpsd xmm1, xmm1, 3 # xmm1 = (right # right) ? ~0L : 0L
andpd xmm0, xmm1 # xmm0 = (right # right) ? left : 0L
andnpd xmm1, xmm2 # xmm1 = (right # right) ? 0L : (left > right) ? left : right
orpd xmm0, xmm1 # xmm0 = (right # right) ? left : (left > right) ? left : right
# = fmax(left, right)
ret
.size fmax, .-fmax
.type fmax, @function
.global fmax
.end
; Copyright © 2004-2024, Stefan Kanthak <stefan.kanthak@nexgo.de>
; https://msdn.microsoft.com/en-us/library/mt720717.aspx
.686
.model flat, C
.code
fmax proc public ; [esp+12] = right
; [esp+4] = left
fld real8 ptr [esp+4] ; st(0) = left
fld real8 ptr [esp+12] ; st(0) = right,
; st(1) = left
fucomi st(0), st(0) ; eflags = right ><=# right
fcmovu st(0), st(1) ; st(0) = (right # right) ? left : right,
; st(1) = left
if 0
fld st(1) ; st(0) = left,
; st(1) = (right # right) ? left : right,
; st(2) = left
fucomip st(0), st(1) ; eflags = left ><=# ((right # right) ? left : right),
; st(0) = (right # right) ? left : right,
; st(1) = left
fcmovnb st(0), st(1) ; st(0) = (left < right) ? right : left,
; st(1) = left
else
fxch st(1) ; st(0) = left,
; st(1) = (right # right) ? left : right
fucomi st(0), st(1) ; eflags = left ><=# ((right # right) ? left : right)
fcmovb st(0), st(1) ; st(0) = (left < right) ? right : left,
; st(1) = (right # right) ? left : right
endif
fstp st(1) ; st(0) = fmax(left, right)
ret
fmax endp
end
fmin() Function
fmin() returns its other argument if one argument is a NaN, else the smaller of its arguments.
// Copyright © 2004-2024, Stefan Kanthak <stefan.kanthak@nexgo.de>
double fmin(double left, double right)
{
#ifdef QUIET
return (left < right) || ((right != right) && (left == left)) ? left : right == right ? right : right + right;
#else
return (left < right) || (right != right) ? left : right;
#endif
}
Note: with the preprocessor macro QUIET defined, a signaling NaN is returned as a quiet NaN.
# Copyright © 2004-2024, Stefan Kanthak <stefan.kanthak@nexgo.de>
.arch generic64
.code64
.intel_syntax noprefix
.text
# xmm0 = left
# xmm1 = right
fmin:
movsd xmm2, xmm0 # xmm2 = left
minsd xmm2, xmm1 # xmm2 = (left < right) ? left : right
# = (left # right) ? right : min(left, right)
cmpsd xmm1, xmm1, 3 # xmm1 = (right # right) ? ~0L : 0L
andpd xmm0, xmm1 # xmm0 = (right # right) ? left : 0L
andnpd xmm1, xmm2 # xmm1 = (right # right) ? 0L : (left < right) ? left : right
orpd xmm0, xmm1 # xmm0 = (right # right) ? left : (left < right) ? left : right
# = fmin(left, right)
ret
.size fmin, .-fmin
.type fmin, @function
.global fmin
.end
; Copyright © 2004-2024, Stefan Kanthak <stefan.kanthak@nexgo.de>
; https://msdn.microsoft.com/en-us/library/mt720716.aspx
.686
.model flat, C
.code
fmin proc public ; [esp+12] = right
; [esp+4] = left
fld real8 ptr [esp+4] ; st(0) = left
fld real8 ptr [esp+12] ; st(0) = right,
; st(1) = left
fucomi st(0), st(0) ; eflags = right ><=# right
fcmovu st(0), st(1) ; st(0) = (right # right) ? left : right,
; st(1) = left
fucomi st(0), st(1) ; eflags = ((right # right) ? left : right) ><=# left
fcmovnb st(0), st(1) ; st(0) = (left < right) ? left : right,
; st(1) = left
fstp st(1) ; st(0) = fmin(left, right)
ret
fmin endp
end
hypot() Function
hypot() returns +∞ if one of its arguments is ±∞, even if the other argument is a NaN, else the square root of the sum of the squares of its arguments, √(a² + b²), which is occasionally called the Pythagorean sum.
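The formula √(a² + b²) cannot be evaluated literally, because the squares may overflow although the result itself is representable. The following minimal sketch, assuming IEEE 754 double arithmetic and finite, non-zero arguments, contrasts the literal radicand with the scaled form 1 + (min / max)²; all three function names are hypothetical and not part of the library.

```c
#include <assert.h>

/* hypothetical: the radicand exactly as written in the formula */
static double naive_sum_of_squares(double a, double b)
{
    return a * a + b * b;
}

/* hypothetical: max(|a|, |b|) */
static double larger_magnitude(double a, double b)
{
    if (a < 0.0)
        a = -a;
    if (b < 0.0)
        b = -b;
    return a < b ? b : a;
}

/* hypothetical: 1 + (min(|a|, |b|) / max(|a|, |b|))**2, so that
   hypot(a, b) = larger_magnitude(a, b) * sqrt(scaled_sum_of_squares(a, b));
   assumes finite, non-zero arguments */
static double scaled_sum_of_squares(double a, double b)
{
    double max = larger_magnitude(a, b);
    double min;

    if (a < 0.0)
        a = -a;
    if (b < 0.0)
        b = -b;
    min = a < b ? a : b;
    min /= max;             /* 0 < min <= 1: its square can not overflow */
    return 1.0 + min * min;
}
```

For the arguments 3 × 2⁷⁰⁰ and 4 × 2⁷⁰⁰ the literal radicand overflows to +∞, while the scaled radicand is exactly 1.5625, whose square root 1.25 times the larger magnitude gives the representable result 5 × 2⁷⁰⁰.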
// Copyright © 2004-2024, Stefan Kanthak <stefan.kanthak@nexgo.de>
#define INFINITY (1.0 / 0.5e-323)
#define INDEFINITE (0.0 * INFINITY)
double hypot(double p, double q)
{
double r, s;
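if (p < 0.0)
p = -p;
if (q < 0.0)
q = -q;
if ((p == INFINITY) || (q == INFINITY)) // an infinite argument takes precedence over a NaN
return INFINITY;
if ((p != p) || (q != q)) // with a NaN, the loop below would never terminate
return INDEFINITE;
if (p < q)
r = q, q = p, p = r;
if (q == 0.0)
return p;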
for (;;) {
r = q / p;
r *= r;
s = r + 4.0;
if (s == 4.0)
return p;
r /= s;
p += p * (r + r);
q *= r;
}
}
// Copyright © 2004-2024, Stefan Kanthak <stefan.kanthak@nexgo.de>
#define INFINITY (1.0 / 0.5e-323)
double fabs(double x);
double fma(double x, double y, double z);
double sqrt(double x);
double hypot(double left, double right)
{
double tmp;
right = fabs(right);
if ((right == INFINITY) || (left == 0.0))
return right;
left = fabs(left);
if ((left == INFINITY) || (right == 0.0))
return left;
if (left < right)
tmp = right, right = left, left = tmp;
right /= left;
#ifdef FP_FAST_FMA
right = fma(right, right, 1.0);
tmp = sqrt(right);
right = fma(-tmp, tmp, right) / (tmp + tmp);
return fma(left, tmp, left * right);
#else
return left * sqrt(1.0 + right * right);
#endif
}
// Copyright © 2004-2024, Stefan Kanthak <stefan.kanthak@nexgo.de>
#define INFINITY 0x1.0p+1024
double fabs(double x);
double sqrt(double x);
double hypot(double left, double right)
{
double tmp;
right = fabs(right);
if (right == INFINITY)
return right;
left = fabs(left);
if (left == INFINITY)
return left;
if (left > right)
tmp = right, right = left, left = tmp;
if (left < right * 0x1.6A09E667F3BCDp-27) // sqrt(0x1.0p-53)
return right;
if (left < 0x1.0p-511) { // sqrt(0x1.0p-1022)
tmp = 0x1.0p-511; // scale up to prevent underflow
left *= 0x1.0p+511;
right *= 0x1.0p+511;
} else if (right > 0x1.6A09E667F3BCCp+511) { // sqrt(0x1.0p+1023)
tmp = 0x1.0p+511; // scale down to prevent overflow
left *= 0x1.0p-511;
right *= 0x1.0p-511;
} else
tmp = 1.0;
#if 1
double delta, hypot = sqrt(left * left + right * right);
if (hypot > 2.0 * left) {
delta = hypot - right;
hypot -= (2.0 * delta * (right - 2.0 * left) + (4.0 * delta - left) * left + delta * delta) / (2.0 * hypot);
} else {
delta = hypot - left;
hypot -= ((2.0 * delta - right) * right + (delta - 2.0 * (right - left)) * delta) / (2.0 * hypot);
}
return tmp * hypot;
#else
return tmp * sqrt(left * left + right * right);
#endif
}
// Copyright © 2004-2024, Stefan Kanthak <stefan.kanthak@nexgo.de>
#define INFINITY 0x1.0p+1024
double fabs(double x);
double fma(double x, double y, double z);
double frexp(double x, int *z);
double ldexp(double x, int z);
double sqrt(double x);
double hypot(double left, double right)
{
double tmp;
int exponent;
right = fabs(right);
if (right == INFINITY)
return right;
left = fabs(left);
if (left == INFINITY)
return left;
if (left < right)
tmp = right, right = left, left = tmp;
left = frexp(left, &exponent);
right = ldexp(right, -exponent);
#ifdef FP_FAST_FMA
return ldexp(sqrt(fma(left, left, right * right)), exponent);
#else
return ldexp(sqrt(left * left + right * right), exponent);
#endif
}
# Copyright © 2004-2024, Stefan Kanthak <stefan.kanthak@nexgo.de>
# hypot(a, ±INFINITY) = +INFINITY
# hypot(a, INDEFINITE) = INDEFINITE
# hypot(a, ±0) = |a|
# hypot(a, b) = hypot(a, -b)
# = hypot(b, a)
# hypot(a, b) = sqrt(a**2 + b**2)
# = sqrt(1 + (b / a)**2) * |a|
# = sqrt(1 + (min(|a|, |b|) / max(|a|, |b|))**2) * max(|a|, |b|)
.arch generic64
.code64
.intel_syntax noprefix
.text
# xmm0 = left
# xmm1 = right
hypot:
xorpd xmm2, xmm2 # xmm2 = 0.0
ucomisd xmm2, xmm1
subsd xmm2, xmm0 # xmm2 = -left
andpd xmm0, xmm2 # xmm0 = |left|
jz .Lleft # right = ±0.0?
# right = INDEFINITE?
xorpd xmm2, xmm2 # xmm2 = 0.0
ucomisd xmm2, xmm0
subsd xmm2, xmm1 # xmm2 = -right
andpd xmm1, xmm2 # xmm1 = |right|
jz .Lright # left = ±0.0?
# left = INDEFINITE?
movsd xmm2, xmm0
minsd xmm0, xmm1 # xmm0 = min(|left|, |right|)
maxsd xmm1, xmm2 # xmm1 = max(|left|, |right|)
divsd xmm0, xmm1 # xmm0 = min(|left|, |right|)
# / max(|left|, |right|)
mov rax, 0x3FF0000000000000
movq xmm2, rax # xmm2 = 1.0
mulsd xmm0, xmm0 # xmm0 = (min(|left|, |right|)
# / max(|left|, |right|))**2
addsd xmm0, xmm2 # xmm0 = (min(|left|, |right|)
# / max(|left|, |right|))**2 + 1.0
sqrtsd xmm0, xmm0 # xmm0 = sqrt((min(|left|, |right|)
# / max(|left|, |right|))**2 + 1.0)
mulsd xmm0, xmm1 # xmm0 = hypot(left, right)
ret
.Lleft:
jnp .Lexit # right <> INDEFINITE?
# (right = ±0.0?)
mov rax, 0x7FF0000000000000
movq xmm2, rax # xmm2 = 0x1.0p+1024
# = INFINITY
ucomisd xmm2, xmm0
je .Lexit # left = ±INFINITY?
.Linfinity:
.Lcommon:
movsd xmm0, xmm1 # xmm0 = |right|
ret
.Lright:
jnp .Lcommon # left <> INDEFINITE?
# (left = ±0.0?)
mov rax, 0x7FF0000000000000
movq xmm2, rax # xmm2 = 0x1.0p+1024
# = INFINITY
ucomisd xmm2, xmm1
je .Linfinity # right = ±INFINITY?
.Lindefinite:
.Lexit:
ret
.size hypot, .-hypot
.type hypot, @function
.global hypot
.end
# Copyright © 2004-2024, Stefan Kanthak <stefan.kanthak@nexgo.de>
# hypot(a, ±INFINITY) = +INFINITY
# hypot(a, INDEFINITE) = INDEFINITE
# hypot(a, ±0) = |a|
# hypot(a, b) = hypot(a, -b)
# = hypot(b, a)
# hypot(a, b) = sqrt(a**2 + b**2)
# = sqrt((max(|a|, |b|) * 2**c)**2 + (min(|a|, |b|) * 2**c)**2) / 2**c
.arch generic64
.code64
.intel_syntax noprefix
.text
# xmm0 = left
# xmm1 = right
hypot:
xorpd xmm2, xmm2 # xmm2 = 0.0
ucomisd xmm2, xmm1
subsd xmm2, xmm0 # xmm2 = -left
andpd xmm0, xmm2 # xmm0 = |left|
jz .Lleft # right = ±0.0?
# right = INDEFINITE?
xorpd xmm2, xmm2 # xmm2 = 0.0
ucomisd xmm2, xmm0
subsd xmm2, xmm1 # xmm2 = -right
andpd xmm1, xmm2 # xmm1 = |right|
jz .Lright # left = ±0.0?
# left = INDEFINITE?
movsd xmm2, xmm0
maxsd xmm0, xmm1 # xmm0 = max(|left|, |right|)
# = left'
minsd xmm1, xmm2 # xmm1 = min(|left|, |right|)
# = right'
movq rax, xmm0
shr rax, 54
shl eax, 2 # eax = biased exponent of left'
mov ecx, BIAS * 2 - 1
sub ecx, eax # ecx = 2045
# - biased exponent of left'
# = biased exponent of (normalized) scale factor
# = {1, 5, 9, ..., 2045}
inc eax # eax = biased exponent of reciprocal scale factor
shl rcx, 52
shl rax, 52
movq xmm2, rcx # xmm2 = (normalized) scale factor
.ifdef SSE4_1
unpcklpd xmm2, xmm2
unpcklpd xmm0, xmm1 # xmm0[63:0] = left',
# xmm0[127:64] = right'
mulpd xmm0, xmm2 # xmm0[63:0] = left' * scale factor,
# xmm0[127:64] = right' * scale factor
dppd xmm0, xmm0, 0x31 # xmm0 = (left' * scale factor)**2
# + (right' * scale factor)**2
# = (left'**2 + right'**2) * scale factor**2
.else
mulsd xmm0, xmm2 # xmm0 = left' * scale factor
mulsd xmm1, xmm2 # xmm1 = right' * scale factor
mulsd xmm0, xmm0 # xmm0 = (left' * scale factor)**2
mulsd xmm1, xmm1 # xmm1 = (right' * scale factor)**2
addsd xmm0, xmm1 # xmm0 = (left' * scale factor)**2
# + (right' * scale factor)**2
# = (left'**2 + right'**2) * scale factor**2
.endif
sqrtsd xmm0, xmm0 # xmm0 = sqrt(left'**2 + right'**2) * scale factor
movq xmm1, rax # xmm1 = reciprocal scale factor
mulsd xmm0, xmm1 # xmm0 = sqrt(left'**2 + right'**2)
# = hypot(left, right)
ret
.Lleft:
jnp .Lexit # right <> INDEFINITE?
# (right = ±0.0?)
mov rax, 0x7FF0000000000000
movq xmm2, rax # xmm2 = 0x1.0p+1024
# = INFINITY
ucomisd xmm2, xmm0
je .Lexit # left = ±INFINITY?
.Linfinity:
.Lcommon:
movsd xmm0, xmm1 # xmm0 = |right|
ret
.Lright:
jnp .Lcommon # left <> INDEFINITE?
# (left = ±0.0?)
mov rax, 0x7FF0000000000000
movq xmm2, rax # xmm2 = 0x1.0p+1024
# = INFINITY
ucomisd xmm2, xmm1
je .Linfinity # right = ±INFINITY?
.Lindefinite:
.Lexit:
ret
.size hypot, .-hypot
.type hypot, @function
.global hypot
.end
; Copyright © 2004-2024, Stefan Kanthak <stefan.kanthak@nexgo.de>
; https://msdn.microsoft.com/en-us/library/a9yb3dbt.aspx
; hypot(x, ±INFINITY) = +INFINITY
; hypot(x, INDEFINITE) = INDEFINITE
; hypot(x, ±0) = |x|
; hypot(x, y) = hypot(x, -y)
; = hypot(y, x)
; hypot(x, y) = sqrt(x**2 + y**2)
; = sqrt((max(|x|, |y|) / 2**z)**2 + (min(|x|, |y|) / 2**z)**2) * 2**z
.686
.model flat, C
.code
hypot proc public ; [esp+12] = right
; [esp+4] = left
fld real8 ptr [esp+4] ; st(0) = left
ftst
fstsw ax ; ax = FPU status word
; B C3 TOP C2 C1 C0 low byte
; . 0 ... 0 . 0 ........ st(0) > 0.0
; . 0 ... 0 . 1 ........ st(0) < 0.0
; . 1 ... 0 . 0 ........ st(0) = 0.0
; . 1 ... 1 . 1 ........ st(0) # 0.0
sahf ; SF:ZF:0:AF:0:PF:1:CF = ah
; CF (carry flag) = C0
; C1
; PF (parity flag) = C2
; ZF (zero flag) = C3
; AF (adjust flag) = .
; SF (sign flag) = B(usy)
fld real8 ptr [esp+12] ; st(0) = right,
; st(1) = left
fabs ; st(0) = |right|,
; st(1) = left
jz Lspecial ; left = ±0.0?
; left = INDEFINITE?
fxch st(1) ; st(0) = left,
; st(1) = |right|
fabs ; st(0) = |left|,
; st(1) = |right|
fucom st(1)
fstsw ax ; ax = FPU status word
; B C3 TOP C2 C1 C0 low byte
; . 0 ... 0 . 0 ........ st(0) > st(1)
; . 0 ... 0 . 1 ........ st(0) < st(1)
; . 1 ... 0 . 0 ........ st(0) = st(1)
; . 1 ... 1 . 1 ........ st(0) # st(1)
sahf ; SF:ZF:0:AF:0:PF:1:CF = ah
jp Lunordered ; |right| = INDEFINITE?
jnb Lscale ; |left| >= |right|?
Lbelow:
fxch st(1) ; st(0) = max(|left|, |right|)
; = left',
; st(1) = min(|left|, |right|)
; = right'
Lscale:
fxtract ; st(0) = left' / 2**exponent,
; st(1) = exponent,
; st(2) = right'
fmul st(0), st(0) ; st(0) = (left' / 2**exponent)**2,
; st(1) = exponent,
; st(2) = right'
fxch st(2) ; st(0) = right',
; st(1) = exponent,
; st(2) = (left' / 2**exponent)**2
fld st(1) ; st(0) = exponent,
; st(1) = right',
; st(2) = exponent,
; st(3) = (left' / 2**exponent)**2
fchs ; st(0) = -exponent,
; st(1) = right',
; st(2) = exponent,
; st(3) = (left' / 2**exponent)**2
fxch st(1) ; st(0) = right',
; st(1) = -exponent,
; st(2) = exponent,
; st(3) = (left' / 2**exponent)**2
fscale ; st(0) = right' * 2**-exponent
; = right' / 2**exponent,
; st(1) = -exponent,
; st(2) = exponent,
; st(3) = (left' / 2**exponent)**2
fstp st(1) ; st(0) = right' / 2**exponent,
; st(1) = exponent,
; st(2) = (left' / 2**exponent)**2
fmul st(0), st(0) ; st(0) = (right' / 2**exponent)**2,
; st(1) = exponent,
; st(2) = (left' / 2**exponent)**2
faddp st(2), st(0) ; st(0) = exponent,
; st(1) = (left' / 2**exponent)**2
; + (right' / 2**exponent)**2
; = (left'**2 + right'**2) / (2**exponent)**2
fxch st(1) ; st(0) = (left' / 2**exponent)**2
; + (right' / 2**exponent)**2
; = (left'**2 + right'**2) / (2**exponent)**2,
; st(1) = exponent
fsqrt ; st(0) = sqrt(left'**2 + right'**2) / 2**exponent,
; st(1) = exponent
fscale ; st(0) = sqrt(left'**2 + right'**2),
; st(1) = exponent
fstp st(1) ; st(0) = hypot(left, right)
ret
;;Lunordered:
;; fxam
;; fstsw ax ; ax = FPU status word,
;; ; ah = B:C3:T:O:P:C2:C1:C0
;; and ah, 0x45
;; cmp ah, 0x05
;; jne Lindefinite ; |left| <> INFINITY?
;;Linfinity:
;; fstp st(1) ; st(0) = |left|
;; ; = INFINITY
;; ; = hypot(±INFINITY, right)
;; ret
Lspecial:
jnp Lzero ; left <> INDEFINITE?
; left = ±0.0?
Lunordered:
fxam
fstsw ax ; ax = FPU status word
; B C3 TOP C2 C1 C0 low byte
; . 0 ... 0 0 0 ........ st(0) = +unsupported
; . 0 ... 0 1 0 ........ st(0) = -unsupported
; . 0 ... 0 0 1 ........ st(0) = +indefinite
; . 0 ... 0 1 1 ........ st(0) = -indefinite
; . 0 ... 1 0 0 ........ st(0) = +finite
; . 0 ... 1 1 0 ........ st(0) = -finite
; . 0 ... 1 0 1 ........ st(0) = +infinity
; . 0 ... 1 1 1 ........ st(0) = -infinity
; . 1 ... 0 0 0 ........ st(0) = +0.0
; . 1 ... 0 1 0 ........ st(0) = -0.0
; . 1 ... 0 0 1 ........ st(0) = +empty
; . 1 ... 0 1 1 ........ st(0) = -empty
; . 1 ... 1 0 0 ........ st(0) = +denormal
; . 1 ... 1 1 0 ........ st(0) = -denormal
and ah, 0x45
cmp ah, 0x05
jne Lindefinite ; |right| <> INFINITY?
Linfinity:
Lzero:
fstp st(1) ; st(0) = |right|
; = hypot(left, ±INFINITY)
; = hypot(±0.0, right)
ret
Lindefinite:
faddp st(1), st(0) ; st(0) = INDEFINITE
ret
hypot endp
end
pow() Function
pow() returns +1 if its first argument is +1 or if its second argument is ±0, even if the other argument is a NaN, else its first argument raised to the power of its second argument.
cbrt() Function
cbrt() returns the cube root of its argument.
// Copyright © 2004-2024, Stefan Kanthak <stefan.kanthak@nexgo.de>
#define INFINITY (1.0 / 0.5e-323)
#define INDEFINITE (0.0 * INFINITY)
double fabs(double x);
double frexp(double x, int *z);
double ldexp(double x, int z);
double cbrt(double argument)
{
static const double scale[5] = {0x1.428A2F98D728Bp-1, // 2**(-2/3)
0x1.965FEA53D6E3Dp-1, // 2**(-1/3)
1.0, // 2**0
0x1.428A2F98D728Bp-0, // 2**(1/3)
0x1.965FEA53D6E3Dp-0}; // 2**(2/3)
double a, b, c;
int exponent;
if (argument != argument)
return INDEFINITE;
if (argument == 0.0)
return argument;
a = fabs(argument);
if (a == INFINITY)
return argument;
a = frexp(a, &exponent);
// for 0.5 <= a < 1.0,
// a minimax polynomial of degree 6 yields an approximation
// of the cube root, followed by a single Halley iteration
b = (((((-0x1.29801E893366Dp-3 * a
+0x1.91E2A6FE7E984p-1) * a
-0x1.D5AE6CFA20F0Cp-0) * a
+0x1.39350ADAD51ECp+1) * a
-0x1.0EB8277CD8D5Dp+1) * a
+0x1.8218DDE9028B4p-0) * a
+0x1.6B69CBA168FF2p-2;
c = b * b * b;
c = b * (2.0 * a + c) / (a + 2.0 * c);
c = argument < 0.0 ? -c : c;
return ldexp(c * scale[2 + exponent % 3], exponent / 3);
}
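The comment in the listing above mentions a single Halley iteration that refines the polynomial approximation. The following minimal sketch shows that step in isolation; it assumes a crude starting value of 1.0 instead of the minimax polynomial and therefore iterates several times. The names halley_cbrt_step, cbrt_by_halley and cube_error are hypothetical.

```c
#include <assert.h>

/* one Halley step for f(b) = b**3 - a:
   b' = b * (b**3 + 2 * a) / (2 * b**3 + a) */
static double halley_cbrt_step(double b, double a)
{
    double c = b * b * b;

    return b * (2.0 * a + c) / (a + 2.0 * c);
}

/* hypothetical driver, intended for 0.5 <= a < 1.0 as in the listing;
   the starting value 1.0 is an assumption of this sketch */
static double cbrt_by_halley(double a)
{
    double b = 1.0;
    int i;

    for (i = 0; i < 4; i++)
        b = halley_cbrt_step(b, a);
    return b;
}

/* hypothetical helper: |cbrt_by_halley(a)**3 - a| */
static double cube_error(double a)
{
    double b = cbrt_by_halley(a);
    double e = b * b * b - a;

    return e < 0.0 ? -e : e;
}
```

Each step roughly triples the number of correct digits, which is why one step suffices after a sufficiently accurate polynomial approximation, while the crude starting value used here needs a few more.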
ceil() Function
ceil() returns the smallest integral value not less than its argument.
// Copyright © 2004-2024, Stefan Kanthak <stefan.kanthak@nexgo.de>
double ceil(double argument)
{
#ifdef TRUNC
double trunc(double x);
double tmp = trunc(argument);
return (argument > tmp) ? tmp + 1.0 : tmp;
#else
double tmp;
if ((argument > 0.0) && (argument < 0x1.0p+52)) {
tmp = argument;
argument += 0x1.0p+52;
argument -= 0x1.0p+52;
if (argument < tmp)
argument += 1.0;
} else if ((argument < 0.0) && (argument > -0x1.0p+52)) {
tmp = argument;
argument -= 0x1.0p+52;
argument += 0x1.0p+52;
if (argument < tmp)
argument += 1.0;
if (argument == 0.0) // ceil() returns -0.0 for every argument in (-1.0, -0.0)
argument = -0.0;
} else if (argument != 0.0)
argument += 0.0;
return argument;
#endif
}
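The listing above rounds to an integral value by adding and then subtracting 0x1.0p+52, the smallest magnitude at which doubles have no fractional bits left. The following minimal sketch isolates that step for non-negative arguments; it assumes the default rounding mode (round to nearest, ties to even), and the names rint_by_magic and ceil_by_magic are hypothetical.

```c
#include <assert.h>

/* hypothetical: round 0.0 <= x < 0x1.0p+52 to an integral value;
   the volatile qualifier keeps the compiler from folding the addition
   and subtraction away and from keeping excess precision */
static double rint_by_magic(double x)
{
    volatile double tmp = x + 0x1.0p+52;   /* fractional bits are rounded off */

    tmp -= 0x1.0p+52;                      /* exact: restores the magnitude */
    return tmp;
}

/* hypothetical: round up by correcting a result that was rounded down */
static double ceil_by_magic(double x)
{
    double r = rint_by_magic(x);

    return (r < x) ? r + 1.0 : r;
}
```

Ties are rounded to the even neighbour, so 2.5 becomes 2.0 while 3.5 becomes 4.0; the subsequent comparison with the argument turns either result into the ceiling.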
# Copyright © 2004-2024, Stefan Kanthak <stefan.kanthak@nexgo.de>
# NOTE: requires SSE 4.1 instruction set!
.arch generic64
.code64
.intel_syntax noprefix
.text
# xmm0 = argument
ceil:
roundsd xmm0, xmm0, 2 # xmm0 = argument rounded up (towards +INFINITY)
ret
.size ceil, .-ceil
.type ceil, @function
.global ceil
.end
# Copyright © 2004-2024, Stefan Kanthak <stefan.kanthak@nexgo.de>
# NOTE: ceil() returns -0.0 for argument in (-1.0, -0.0]
.arch generic64
.code64
.intel_syntax noprefix
.text
# xmm0 = argument
ceil:
mov rax, 0x4330000000000000
movq xmm2, rax # xmm2 = 0x1.0p+52
# = 4503599627370496.0
# = minimum non-fractional number
mov rax, 0x3FF0000000000000
xorpd xmm1, xmm1 # xmm1 = 0.0
subsd xmm1, xmm0 # xmm1 = -argument
andnpd xmm1, xmm0 # xmm1 = ~(-argument) & argument
# = (argument & -0.0) ? -0.0 : +0.0
orpd xmm2, xmm1 # xmm2 = (argument & -0.0) ? -0x1.0p+52 : +0x1.0p+52
movsd xmm3, xmm0 # xmm3 = argument
addsd xmm0, xmm2 # xmm0 = argument
# + (argument & -0.0) ? -0x1.0p+52 : +0x1.0p+52
subsd xmm0, xmm2 # xmm0 = argument
# - (argument & -0.0) ? -0x1.0p+52 : +0x1.0p+52
# = rint(argument)
movq xmm2, rax # xmm2 = 0x1.0p+0
# = 1.0
cmpsd xmm3, xmm0, 6 # xmm3 = (argument > rint(argument)) ? ~0L : 0L
andpd xmm3, xmm2 # xmm3 = (argument > rint(argument)) ? 1.0 : 0.0
addsd xmm0, xmm3 # xmm0 = (argument > rint(argument)) ? 1.0 : 0.0
# + rint(argument)
# = ceil(argument)
orpd xmm0, xmm1 # xmm0 = ceil(argument)
ret
.size ceil, .-ceil
.type ceil, @function
.global ceil
.end
; Copyright © 2004-2024, Stefan Kanthak <stefan.kanthak@nexgo.de>
; https://msdn.microsoft.com/en-us/library/atdhw2dx.aspx
.686
.model flat, C
.code
ceil proc public ; [esp+4] = argument
fld real8 ptr [esp+4] ; st(0) = argument
if 0
; ceil(x) = x > trunc(x) ? trunc(x) + 1.0 : trunc(x)
ftst
fstsw ax ; ax = FPU status word,
; ah = B:C3:T:O:P:C2:C1:C0
sahf ; SF:ZF:0:AF:0:PF:1:CF = ah
jz Lexit ; argument = ±0.0?
fld1 ; st(0) = 1.0,
; st(1) = argument
fld st(1) ; st(0) = argument,
; st(1) = 1.0,
; st(2) = argument
Lmodulo:
fprem ; st(0) = argument modulo 1.0
; = argument',
; st(1) = 1.0,
; st(2) = argument
fstsw ax ; ax = FPU status word,
; ah = B:C3:T:O:P:C2:C1:C0
sahf ; SF:ZF:0:AF:0:PF:1:CF = ah
jp Lmodulo ; |argument'| >= 1.0?
fxch st(2) ; st(0) = argument,
; st(1) = 1.0,
; st(2) = argument'
fsubr st(2), st(0) ; st(0) = argument,
; st(1) = 1.0,
; st(2) = argument - argument'
; = trunc(argument)
fcomp st(2) ; st(0) = 1.0,
; st(1) = trunc(argument)
fstsw ax ; ax = FPU status word,
; ah = B:C3:T:O:P:C2:C1:C0
sahf ; SF:ZF:0:AF:0:PF:1:CF = ah
ja Labove ; argument > trunc(argument)?
fstp st(1) ; st(0) = trunc(argument)
; = ceil(argument)
ret
Labove:
faddp st(1), st(0) ; st(0) = trunc(argument) + 1.0
; = ceil(argument)
Lexit:
else
; ceil(x) = x > rint(x) ? rint(x) + 1.0 : rint(x)
fld st(0) ; st(0) = argument,
; st(1) = argument
frndint ; st(0) = rint(argument),
; st(1) = argument
fxch st(1) ; st(0) = argument,
; st(1) = rint(argument)
fucomip st(0), st(1) ; eflags = argument ><=# rint(argument),
; st(0) = rint(argument)
fld1 ; st(0) = 1.0,
; st(1) = rint(argument)
fldz ; st(0) = 0.0,
; st(1) = 1.0,
; st(2) = rint(argument)
fcmovnbe st(0), st(1) ; st(0) = (rint(argument) < argument) ? 1.0 : 0.0,
; st(1) = 1.0,
; st(2) = rint(argument)
faddp st(2), st(0) ; st(0) = 1.0,
; st(1) = ceil(argument)
fstp st(0) ; st(0) = ceil(argument)
endif
ret
ceil endp
end
fabs() Function
fabs() returns the absolute value (magnitude) of its argument.
// Copyright © 2004-2024, Stefan Kanthak <stefan.kanthak@nexgo.de>
double fabs(double argument)
{
*(unsigned long long *) &argument <<= 1;
*(unsigned long long *) &argument >>= 1;
return argument;
}
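The listing above clears the sign bit through a pointer cast, which relies on the compiler tolerating access to a double through an integer lvalue. The following minimal sketch shows an alternative that copies the bits with memcpy() instead; it assumes the IEEE 754 binary64 format with the sign in bit 63, and the name fabs_bits is hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* hypothetical name; clears the sign bit (bit 63) of an IEEE 754 double */
static double fabs_bits(double argument)
{
    uint64_t bits;

    memcpy(&bits, &argument, sizeof bits);      /* well-defined type punning */
    bits &= UINT64_C(0x7FFFFFFFFFFFFFFF);       /* keep exponent and fraction */
    memcpy(&argument, &bits, sizeof argument);
    return argument;
}
```

Since only the sign bit is touched, -0.0 becomes +0.0, which can be observed through the sign of the reciprocal.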
# Copyright © 2004-2024, Stefan Kanthak <stefan.kanthak@nexgo.de>
.arch generic64
.code64
.intel_syntax noprefix
.text
# xmm0 = argument
fabs:
.if 0
xorpd xmm1, xmm1 # xmm1 = 0.0
subsd xmm1, xmm0 # xmm1 = -argument
maxsd xmm0, xmm1 # xmm0 = |argument|
ret
.else
movq rax, xmm0 # rax = argument
btr rax, 63 # rax = |argument|
movq xmm0, rax # xmm0 = |argument|
ret
.endif
.size fabs, .-fabs
.type fabs, @function
.global fabs
.end
; Copyright © 2004-2024, Stefan Kanthak <stefan.kanthak@nexgo.de>
; https://msdn.microsoft.com/en-us/library/18z15bk0.aspx
.686
.model flat; C
.code
_fabs proc public ; [esp+4] = argument
fld real8 ptr [esp+4] ; st(0) = argument
fabs ; st(0) = |argument|
ret
_fabs endp
end
fdim() Function
fdim() returns the positive difference of its arguments.
// Copyright © 2004-2024, Stefan Kanthak <stefan.kanthak@nexgo.de>
double fdim(double left, double right)
{
return left < right ? 0.0 : left - right;
}
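The term positive difference means the difference if the first argument is greater than the second, else +0.0, with a NaN propagated. A minimal sketch with a few assertions follows; the name fdim_sketch is hypothetical and repeats the expression from the listing above, and <math.h> is included only for the NAN macro.

```c
#include <assert.h>
#include <math.h>   /* NAN macro only */

/* hypothetical name; same expression as in the listing above */
static double fdim_sketch(double left, double right)
{
    /* (left < right) is false if an argument is a NaN,
       so the NaN propagates through the subtraction */
    return left < right ? 0.0 : left - right;
}
```

The result is never negative: swapping the arguments of a positive difference yields zero, not the negated value.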
# Copyright © 2004-2024, Stefan Kanthak <stefan.kanthak@nexgo.de>
.arch generic64
.code64
.intel_syntax noprefix
.text
# xmm0 = left
# xmm1 = right
fdim:
movsd xmm2, xmm0 # xmm2 = left
cmpsd xmm0, xmm1, 1 # xmm0 = (left < right) ? ~0L : 0L
subsd xmm2, xmm1 # xmm2 = left - right
andnpd xmm0, xmm2 # xmm0 = (left < right) ? 0.0 : left - right
ret
.size fdim, .-fdim
.type fdim, @function
.global fdim
.end
; Copyright © 2004-2024, Stefan Kanthak <stefan.kanthak@nexgo.de>
; https://msdn.microsoft.com/en-us/library/mt720714.aspx
.686
.model flat, C
.code
fdim proc public ; [esp+12] = right
; [esp+4] = left
fld real8 ptr [esp+4] ; st(0) = left
fld real8 ptr [esp+12] ; st(0) = right,
; st(1) = left
fsubp st(1), st(0) ; st(0) = left - right
fldz ; st(0) = 0.0,
; st(1) = left - right
fucomi st(0), st(1) ; eflags = 0.0 ><=# left - right
fcmovb st(0), st(1) ; st(0) = (left > right) ? left - right : 0.0,
; st(1) = left - right
fcmovu st(0), st(1) ; st(0) = (left # right) ? left - right
; : (left > right) ? left - right : 0.0,
; st(1) = left - right
fstp st(1) ; st(0) = fdim(left, right)
ret
fdim endp
end
floor() Function
floor() returns the largest integral value not greater than its argument.
// Copyright © 2004-2024, Stefan Kanthak <stefan.kanthak@nexgo.de>
double floor(double argument)
{
#ifdef TRUNC
double trunc(double x);
double tmp = trunc(argument);
return (argument < tmp) ? tmp - 1.0 : tmp;
#else
double tmp;
if ((argument > 0.0) && (argument < 0x1.0p+52)) {
tmp = argument;
argument += 0x1.0p+52;
argument -= 0x1.0p+52;
if (argument > tmp)
argument -= 1.0;
} else if ((argument < 0.0) && (argument > -0x1.0p+52)) {
tmp = argument;
argument -= 0x1.0p+52;
argument += 0x1.0p+52;
if (argument > tmp)
argument -= 1.0;
} else if (argument != 0.0)
argument += 0.0;
return argument;
#endif
}
# Copyright © 2004-2024, Stefan Kanthak <stefan.kanthak@nexgo.de>
# NOTE: requires SSE 4.1 instruction set!
.arch generic64
.code64
.intel_syntax noprefix
.text
# xmm0 = argument
floor:
roundsd xmm0, xmm0, 1 # xmm0 = argument rounded down (towards -INFINITY)
ret
.size floor, .-floor
.type floor, @function
.global floor
.end
# Copyright © 2004-2024, Stefan Kanthak <stefan.kanthak@nexgo.de>
# NOTE: floor() preserves -0.0
.arch generic64
.code64
.intel_syntax noprefix
.text
# xmm0 = argument
floor:
mov rax, 0x4330000000000000
movq xmm2, rax # xmm2 = 0x1.0p+52
# = 4503599627370496.0
# = minimum non-fractional number
mov rax, 0x3FF0000000000000
xorpd xmm1, xmm1 # xmm1 = 0.0
subsd xmm1, xmm0 # xmm1 = -argument
andnpd xmm1, xmm0 # xmm1 = ~(-argument) & argument
# = (argument & -0.0) ? -0.0 : +0.0
orpd xmm2, xmm1 # xmm2 = (argument & -0.0) ? -0x1.0p+52 : +0x1.0p+52
movsd xmm3, xmm0 # xmm3 = argument
addsd xmm0, xmm2 # xmm0 = argument
# + (argument & -0.0) ? -0x1.0p+52 : +0x1.0p+52
subsd xmm0, xmm2 # xmm0 = argument
# - (argument & -0.0) ? -0x1.0p+52 : +0x1.0p+52
# = rint(argument)
movq xmm2, rax # xmm2 = 0x1.0p+0
# = 1.0
cmpsd xmm3, xmm0, 1 # xmm3 = (argument < rint(argument)) ? ~0L : 0L
andpd xmm3, xmm2 # xmm3 = (argument < rint(argument)) ? 1.0 : 0.0
subsd xmm0, xmm3 # xmm0 = (argument < rint(argument)) ? -1.0 : 0.0
# + rint(argument)
# = floor(argument)
orpd xmm0, xmm1 # xmm0 = floor(argument)
ret
.size floor, .-floor
.type floor, @function
.global floor
.end
; Copyright © 2004-2024, Stefan Kanthak <stefan.kanthak@nexgo.de>
; https://msdn.microsoft.com/en-us/library/x39715t6.aspx
.686
.model flat, C
.code
floor proc public ; [esp+4] = argument
fld real8 ptr [esp+4] ; st(0) = argument
if 0
; floor(x) = x < trunc(x) ? trunc(x) - 1.0 : trunc(x)
ftst
fstsw ax ; ax = FPU status word,
; ah = B:C3:T:O:P:C2:C1:C0
sahf ; SF:ZF:0:AF:0:PF:1:CF = ah
jz Lexit ; argument = ±0.0?
fld1 ; st(0) = 1.0,
; st(1) = argument
fld st(1) ; st(0) = argument,
; st(1) = 1.0,
; st(2) = argument
Lmodulo:
fprem ; st(0) = argument modulo 1.0
; = argument',
; st(1) = 1.0,
; st(2) = argument
fstsw ax ; ax = FPU status word,
; ah = B:C3:T:O:P:C2:C1:C0
sahf ; SF:ZF:0:AF:0:PF:1:CF = ah
jp Lmodulo ; |argument'| >= 1.0?
fxch st(2) ; st(0) = argument,
; st(1) = 1.0,
; st(2) = argument'
fsubr st(2), st(0) ; st(0) = argument,
; st(1) = 1.0,
; st(2) = argument - argument'
; = trunc(argument)
fcomp st(2) ; st(0) = 1.0,
; st(1) = trunc(argument)
fstsw ax ; ax = FPU status word,
; ah = B:C3:T:O:P:C2:C1:C0
sahf ; SF:ZF:0:AF:0:PF:1:CF = ah
jb Lbelow ; argument < trunc(argument)?
fstp st(1) ; st(0) = trunc(argument)
; = floor(argument)
ret
Lbelow:
fsubp st(1), st(0) ; st(0) = trunc(argument) - 1.0
; = floor(argument)
Lexit:
else
; floor(x) = x < rint(x) ? rint(x) - 1.0 : rint(x)
fld st(0) ; st(0) = argument,
; st(1) = argument
frndint ; st(0) = rint(argument),
; st(1) = argument
fxch st(1) ; st(0) = argument,
; st(1) = rint(argument)
fucomip st(0), st(1) ; eflags = argument ><=# rint(argument),
; st(0) = rint(argument)
fld1 ; st(0) = 1.0,
; st(1) = rint(argument)
fldz ; st(0) = 0.0,
; st(1) = 1.0,
; st(2) = rint(argument)
fcmovb st(0), st(1) ; st(0) = (rint(argument) > argument) ? 1.0 : 0.0,
; st(1) = 1.0,
; st(2) = rint(argument)
fsubp st(2), st(0) ; st(0) = 1.0,
; st(1) = floor(argument)
fstp st(0) ; st(0) = floor(argument)
endif
ret
floor endp
end
fma() Function
fma() returns the product of its first and second argument plus its third argument, calculated in full precision and without intermediate rounding of the product.
Note: this means for example that fma(2.0, nextafter(INFINITY, 0.0), -nextafter(INFINITY, 0.0)) returns nextafter(INFINITY, 0.0), and fma(0.5, nextafter(0.0, INFINITY), nextafter(0.0, INFINITY)) returns 2.0 * nextafter(0.0, INFINITY), although the (intermediate) product overflows in the first case and underflows in the second!
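For contrast, the following minimal sketch evaluates the same two examples with a separately rounded product. It assumes IEEE 754 double arithmetic in the default rounding mode, with nextafter(INFINITY, 0.0) written as DBL_MAX and nextafter(0.0, INFINITY) written as 0x1.0p-1074; the name unfused_multiply_add is hypothetical.

```c
#include <assert.h>
#include <float.h>

/* hypothetical: multiply and add with TWO roundings;
   the volatile qualifier forces the product to be rounded to double
   and keeps the compiler from contracting the expression into a
   fused multiply-add */
static double unfused_multiply_add(double x, double y, double z)
{
    volatile double product = x * y;   /* may overflow or underflow here */

    return product + z;
}
```

With two roundings the first example yields +∞ instead of DBL_MAX, because 2.0 × DBL_MAX overflows before the addend is applied, and the second yields 0x1.0p-1074 instead of twice that value, because half the smallest subnormal is rounded to zero.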
// Copyright © 2004-2024, Stefan Kanthak <stefan.kanthak@nexgo.de>
double fabs(double x);
double frexp(double x, int *z);
double ldexp(double x, int z);
static inline // Veltkamp
void _2split(double *h, double *l, double x)
{
#if 0
int e;
double f = frexp(x, &e);
double g = f * 0x1.0000002000000p+27;
#if 1
g -= g - f;
#else
g += f - g;
#endif
*l = ldexp(f - g, e);
*h = ldexp(g, e);
#else
unsigned long long ull = *(unsigned long long *) &x & (~0ULL << 26);
*h = *(double *) &ull;
*l = x - *h;
#endif
}
static inline // Dekker
void _2product(double *h, double *l, double x, double y)
{
double xl, xh, yl, yh, zl, zh = x * y;
_2split(&xh, &xl, x);
_2split(&yh, &yl, y);
zl = xl * yl + (xl * yh + (xh * yl + (xh * yh - zh)));
#ifdef DEKKER
*h = zl + zh;
#if 0
*l = zl - (*h - zh);
#else
*l = zl + (zh - *h);
#endif
#else
*l = zl;
*h = zh;
#endif
}
#if 0
static inline // Møller, Knuth
void _2sum(double *h, double *l, double x, double y)
{
double s = x + y;
double t = s - x;
#if 0
*l = (x - (s - t)) + (y - t);
#elif 0
*l = (x - (s - t)) - (t - y);
#elif 0
*l = (x + (t - s)) - (t - y);
#else
*l = (x + (t - s)) + (y - t);
#endif
*h = s;
}
#else
static inline // Boldo, Melquiond: |u| >= |v| >= |w|
double _3sum(double u, double v, double w)
{
double h = w + v;
double l = w + (v - h);
// round high part of intermediate sum to odd when
// its fraction is even and also inexact, i.e. low
// part of intermediate sum is not equal to zero
if ((l != 0.0)
&& ((*(unsigned long long *) &h & 1ull) == 0ull))
*(unsigned long long *) &h |= 1ull;
return u + h;
}
#endif
double fma(double multiplicand, double multiplier, double addend)
{
int o;
double ph, pl, qh, ql, rh, rl, sh, sl;
double product = multiplicand * multiplier;
if ((multiplicand - multiplicand != 0.0)
|| (multiplier - multiplier != 0.0)
|| (addend - addend != 0.0)) // at least one argument INFINITE?
return product + addend;
if (addend == 0.0) // when product underflows to ±0.0,
// its sign determines the sign of the result
return (product == 0.0)
&& (multiplier != 0.0)
&& (multiplicand != 0.0) ? product : product + addend;
if ((multiplicand == 0.0) || (multiplier == 0.0))
return addend;
o = product - product != 0.0;
if (o) { // product overflows?
if ((product < 0.0) == (addend < 0.0))
return product;
multiplier *= 0.5;
addend *= 0.5;
#if 0
product = 2.0 * (multiplicand * multiplier + addend);
if (product - product != 0.0)
return product;
#endif
}
_2product(&ph, &pl, multiplicand, multiplier);
#if 0
_2sum(&qh, &ql, ph, addend);
_2sum(&rh, &rl, pl, qh);
#if 0
_2sum(&sh, &sl, ql, rl);
#else
sh = rl + ql;
#endif
sh += rh;
#else
if (fabs(addend) < fabs(pl))
sh = _3sum(ph, pl, addend);
else if (fabs(addend) < fabs(ph))
sh = _3sum(ph, addend, pl);
else
sh = _3sum(addend, ph, pl);
#endif
return o ? sh + sh : sh;
}
Note: the function _2product() implements Dekker’s product, an error-free (exact) transformation which, in the absence of overflows, guarantees the properties h + l = x × y and |h| ≥ |l| × 2⁵³.
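The property h + l = x × y can be checked on a value whose square does not fit into 53 bits. The following minimal sketch assumes strict IEEE 754 double evaluation (no excess precision) and uses the multiplicative form of Veltkamp’s split with the constant 2²⁷ + 1; the names split, two_product, product_head and product_tail are hypothetical.

```c
#include <assert.h>

/* Veltkamp: split x into a high and a low part whose products are exact */
static void split(double x, double *h, double *l)
{
    double c = 0x1.0000002p+27 * x;     /* (2**27 + 1) * x */

    *h = c - (c - x);
    *l = x - *h;
}

/* Dekker: *h = x * y rounded to double, *l = rounding error of *h */
static void two_product(double x, double y, double *h, double *l)
{
    double xh, xl, yh, yl;

    *h = x * y;
    split(x, &xh, &xl);
    split(y, &yh, &yl);
    *l = xl * yl + (xl * yh + (xh * yl + (xh * yh - *h)));
}

/* hypothetical helpers that return one part each */
static double product_head(double x, double y)
{
    double h, l;

    two_product(x, y, &h, &l);
    return h;
}

static double product_tail(double x, double y)
{
    double h, l;

    two_product(x, y, &h, &l);
    return l;
}
```

For x = y = 1 + 2⁻³⁰ the exact square is 1 + 2⁻²⁹ + 2⁻⁶⁰; the head holds the rounded product 1 + 2⁻²⁹ and the tail holds the term 2⁻⁶⁰ that the rounding discarded.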
Note: the function _2sum() implements Møller’s and Knuth’s sum, an error-free (exact) transformation which, in the absence of overflows, guarantees the properties h + l = x + y, |l| ≤ |x| and |h| ≥ |l| × 2⁵³.
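The property h + l = x + y can be observed when one summand is too small to affect the rounded sum. The following minimal sketch assumes strict IEEE 754 double evaluation without value-changing optimisations; the names two_sum, sum_head and sum_tail are hypothetical.

```c
#include <assert.h>

/* Møller, Knuth: *h = x + y rounded to double, *l = rounding error of *h;
   works for either order of magnitude of x and y */
static void two_sum(double x, double y, double *h, double *l)
{
    double s = x + y;
    double t = s - x;       /* the part of y that made it into s */

    *l = (x - (s - t)) + (y - t);
    *h = s;
}

/* hypothetical helpers that return one part each */
static double sum_head(double x, double y)
{
    double h, l;

    two_sum(x, y, &h, &l);
    return h;
}

static double sum_tail(double x, double y)
{
    double h, l;

    two_sum(x, y, &h, &l);
    return l;
}
```

Adding 2⁻⁶⁰ to 1.0 leaves the rounded sum at 1.0, and the tail recovers the lost 2⁻⁶⁰ independent of the order of the summands.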
# Copyright © 2004-2024, Stefan Kanthak <stefan.kanthak@nexgo.de>
# CAVEAT: requires default (round to nearest, ties to even) rounding mode!
.arch generic64
.code64
.intel_syntax noprefix
.text
# xmm0 = multiplicand
# xmm1 = multiplier
# xmm2 = addend
fma:
movsd xmm3, xmm0 # xmm3 = multiplicand
movsd xmm4, xmm1 # xmm4 = multiplier
movsd xmm5, xmm2 # xmm5 = addend
subsd xmm3, xmm0 # xmm3 = multiplicand - multiplicand
subsd xmm4, xmm1 # xmm4 = multiplier - multiplier
subsd xmm5, xmm2 # xmm5 = addend - addend
ucomisd xmm3, xmm0
movsd xmm3, xmm0 # xmm3 = multiplicand
mulsd xmm0, xmm1 # xmm0 = multiplicand * multiplier
# = product
je .Lmultiplicand # multiplicand = ±0.0?
# multiplicand = ±INFINITY?
# multiplicand = INDEFINITE?
ucomisd xmm4, xmm1
je .Lmultiplier # multiplier = ±0.0?
# multiplier = ±INFINITY?
# multiplier = INDEFINITE?
ucomisd xmm5, xmm2
je .Laddend # addend = ±0.0?
# addend = ±INFINITY?
# addend = INDEFINITE?
movsd xmm4, xmm0
subsd xmm4, xmm0 # xmm4 = product - product
ucomisd xmm4, xmm0
jp .Loverflow # product = ±INFINITY?
.Lveltkamp:
mov eax, 0x03FFFFFF # rax = 2**26 - 1
movq xmm4, rax
movq xmm5, rax
andnpd xmm4, xmm3 # xmm4 = upper half of multiplicand
andnpd xmm5, xmm1 # xmm5 = upper half of multiplier
subsd xmm3, xmm4 # xmm3 = lower half of multiplicand
subsd xmm1, xmm5 # xmm1 = lower half of multiplier
.Ldekker:
unpcklpd xmm4, xmm3 # xmm4[63:0] = upper half of multiplicand,
# xmm4[127:64] = lower half of multiplicand
unpcklpd xmm5, xmm1 # xmm5[63:0] = upper half of multiplier,
# xmm5[127:64] = lower half of multiplier
unpcklpd xmm3, xmm4 # xmm3[63:0] = lower half of multiplicand,
# xmm3[127:64] = upper half of multiplicand
mulpd xmm4, xmm5 # xmm4[63:0] = upper half of multiplicand
# * upper half of multiplier,
# xmm4[127:64] = lower half of multiplicand
# * lower half of multiplier
mulpd xmm3, xmm5 # xmm3[63:0] = lower half of multiplicand
# * upper half of multiplier,
# xmm3[127:64] = upper half of multiplicand
# * lower half of multiplier
.Ltail:
movsd xmm1, xmm4
subsd xmm1, xmm0
addsd xmm1, xmm3
unpckhpd xmm3, xmm3
addsd xmm1, xmm3
unpckhpd xmm4, xmm4
addsd xmm1, xmm4 # xmm1 = upper half of multiplicand
# * upper half of multiplier
# - multiplicand * multiplier
# + lower half of multiplicand
# * upper half of multiplier
# + upper half of multiplicand
# * lower half of multiplier
# + lower half of multiplicand
# * lower half of multiplier
# = tail part of (intermediate) product
# xmm0 = head part of (intermediate) product
.Lmøller:
movsd xmm3, xmm0
addsd xmm0, xmm2
movsd xmm4, xmm0 # xmm4 = head part of first intermediate sum
subsd xmm0, xmm3
subsd xmm2, xmm0
subsd xmm0, xmm4
addsd xmm0, xmm3
addsd xmm0, xmm2 # xmm0 = tail part of first intermediate sum
.Lknuth:
movsd xmm3, xmm4
addsd xmm4, xmm1
movsd xmm2, xmm4 # xmm2 = head part of second intermediate sum
subsd xmm4, xmm3
subsd xmm1, xmm4
subsd xmm4, xmm2
addsd xmm4, xmm3
addsd xmm4, xmm1 # xmm4 = tail part of second intermediate sum
addsd xmm0, xmm4 # xmm0 = tail part of first intermediate sum
# + tail part of second intermediate sum
# = head part of third intermediate sum
.Lfinal:
addsd xmm0, xmm2 # xmm0 = product + addend
# = fma(multiplicand, multiplier, addend)
ret
.Lmultiplicand:
jp .Lfinal # multiplicand = INDEFINITE?
# multiplicand = ±INFINITY?
# multiplicand = ±0.0!
ucomisd xmm4, xmm1
.Lmultiplier:
jp .Lfinal # multiplier = INDEFINITE?
# multiplier = ±INFINITY?
# multiplier = ±0.0,
# multiplicand <> ±INFINITY,
# multiplicand <> INDEFINITE!
.Lindefinite:
movsd xmm0, xmm2 # xmm0 = addend
ret
.Laddend:
jp .Lindefinite # addend = INDEFINITE?
# addend = ±INFINITY?
# addend = ±0.0,
# multiplier <> ±0.0,
# multiplier <> ±INFINITY,
# multiplier <> INDEFINITE,
# multiplicand <> ±0.0,
# multiplicand <> ±INFINITY,
# multiplicand <> INDEFINITE!
ucomisd xmm0, xmm2
je .Lunderflow # product = ±0.0?
.Lproduct:
movsd xmm4, xmm0
subsd xmm4, xmm0 # xmm4 = product - product
ucomisd xmm4, xmm0
jnp .Lfinal # product <> ±INFINITY?
.Loverflow:
movq rcx, xmm0 # rcx = product
movq rdx, xmm2 # rdx = addend
xor rdx, rcx # rdx = (addend < 0.0) = (product < 0.0) ? positive : negative
jns .Linfinity # (addend < 0.0) = (product < 0.0)?
# (sign of addend = sign of product?)
mov rax, 0x3FE0000000000000
movq xmm5, rax # xmm5 = 0x1.0p-1
# = 0.5
mulsd xmm1, xmm5 # xmm1 = multiplier * 0.5
# = multiplier'
mulsd xmm2, xmm5 # xmm2 = addend * 0.5
# = addend'
.if 1
movsd xmm0, xmm3 # xmm0 = multiplicand
mulsd xmm0, xmm1 # xmm0 = multiplicand * multiplier'
# = product'
.else
movsd xmm4, xmm1
movsd xmm5, xmm2
mulsd xmm4, xmm3 # xmm4 = multiplier' * multiplicand
# = product'
addsd xmm5, xmm4 # xmm5 = product' + addend'
addsd xmm5, xmm5 # xmm5 = (product' + addend') * 2.0
subsd xmm5, xmm5 # xmm5 = (product' + addend') * 2.0
# - (product' + addend') * 2.0
ucomisd xmm5, xmm5
jp .Linfinity # (product' + addend') * 2.0 = ±INFINITY?
movsd xmm0, xmm4 # xmm0 = product'
.endif
call .Lveltkamp
addsd xmm0, xmm0 # xmm0 = (product' + addend') * 2.0
# = fma(multiplicand, multiplier, addend)
.Linfinity:
.Lunderflow:
ret
.size fma, .-fma
.type fma, @function
.global fma
.end
; Copyright © 2004-2024, Stefan Kanthak <stefan.kanthak@nexgo.de>
; https://msdn.microsoft.com/en-us/library/.aspx
.686
.model flat, C
.code
single record sign:1, exponent:8, mantissa:23
bias equ 1 shl (width exponent - 1) - 1
fma proc public ; [esp+20] = addend
; [esp+12] = multiplier
; [esp+4] = multiplicand
fld real8 ptr [esp+20] ; st(0) = addend
fld st(0) ; st(0) = addend,
; st(1) = addend
fsub st(0), st(0) ; st(0) = addend - addend,
; st(1) = addend
fcomip st(0), st(1) ; st(0) = addend
je Laddend ; addend = ±0.0?
; addend = ±INFINITY?
; addend = INDEFINITE?
fld real8 ptr [esp+12] ; st(0) = multiplier,
; st(1) = addend
fld st(0) ; st(0) = multiplier,
; st(1) = multiplier,
; st(2) = addend
fsub st(0), st(0) ; st(0) = multiplier - multiplier,
; st(1) = multiplier,
; st(2) = addend
fcomip st(0), st(1) ; st(0) = multiplier,
; st(1) = addend
je Lmultiplier ; multiplier = ±0.0?
; multiplier = ±INFINITY?
; multiplier = INDEFINITE?
fld real8 ptr [esp+4] ; st(0) = multiplicand,
; st(1) = multiplier,
; st(2) = addend
fld st(0) ; st(0) = multiplicand,
; st(1) = multiplicand,
; st(2) = multiplier,
; st(3) = addend
fsub st(0), st(0) ; st(0) = multiplicand - multiplicand,
; st(1) = multiplicand,
; st(2) = multiplier,
; st(3) = addend
fcomip st(0), st(1) ; st(0) = multiplicand,
; st(1) = multiplier,
; st(2) = addend
je Lmultiplicand ; multiplicand = ±0.0?
; multiplicand = ±INFINITY?
; multiplicand = INDEFINITE?
fld st(0) ; st(0) = multiplicand,
; st(1) = multiplicand,
; st(2) = multiplier,
; st(3) = addend
fmul st(0), st(2) ; st(0) = multiplicand * multiplier
; = product,
; st(1) = multiplicand,
; st(2) = multiplier,
; st(3) = addend
fld st(0) ; st(0) = multiplicand * multiplier
; = product,
; st(1) = multiplicand * multiplier
; = product,
; st(2) = multiplicand,
; st(3) = multiplier,
; st(4) = addend
fsub st(0), st(0) ; st(0) = product - product,
; st(1) = multiplicand * multiplier
; = product,
; st(2) = multiplicand,
; st(3) = multiplier,
; st(4) = addend
fcomip st(0), st(1) ; st(0) = multiplicand * multiplier
; = product,
; st(1) = multiplicand,
; st(2) = multiplier,
; st(3) = addend
jp Loverflow ; product = ±INFINITY?
fxch st(2) ; st(0) = multiplier,
; st(1) = multiplicand,
; st(2) = multiplicand * multiplier
; = product,
; st(3) = addend
Lveltkamp:
mov eax, not 0 shl 26
and [esp+4], eax
and [esp+12], eax
fld real8 ptr [esp+4] ; st(0) = upper half of multiplicand,
; st(1) = multiplier,
; st(2) = multiplicand,
; st(3) = multiplicand * multiplier,
; st(4) = addend
fsub st(2), st(0) ; st(0) = upper half of multiplicand,
; st(1) = multiplier,
; st(2) = lower half of multiplicand,
; st(3) = multiplicand * multiplier,
; st(4) = addend
fld real8 ptr [esp+12] ; st(0) = upper half of multiplier,
; st(1) = upper half of multiplicand,
; st(2) = multiplier,
; st(3) = lower half of multiplicand,
; st(4) = multiplicand * multiplier,
; st(5) = addend
fsub st(2), st(0) ; st(0) = upper half of multiplier,
; st(1) = upper half of multiplicand,
; st(2) = lower half of multiplier,
; st(3) = lower half of multiplicand,
; st(4) = multiplicand * multiplier,
; st(5) = addend
fld st(0) ; st(0) = upper half of multiplier,
; st(1) = upper half of multiplier,
; st(2) = upper half of multiplicand,
; st(3) = lower half of multiplier,
; st(4) = lower half of multiplicand,
; st(5) = multiplicand * multiplier,
; st(6) = addend
Ldekker:
fmul st(0), st(2) ; st(0) = upper half of multiplier
; * upper half of multiplicand,
; st(1) = upper half of multiplier,
; st(2) = upper half of multiplicand
; st(3) = lower half of multiplier,
; st(4) = lower half of multiplicand,
; st(5) = multiplicand * multiplier,
; st(6) = addend
fsub st(0), st(5) ; st(0) = upper half of multiplier
; * upper half of multiplicand
; - multiplicand * multiplier,
; st(1) = upper half of multiplier,
; st(2) = upper half of multiplicand,
; st(3) = lower half of multiplier,
; st(4) = lower half of multiplicand,
; st(5) = multiplicand * multiplier,
; st(6) = addend
fxch st(2) ; st(0) = upper half of multiplicand,
; st(1) = upper half of multiplier,
; st(2) = upper half of multiplier
; * upper half of multiplicand
; - multiplicand * multiplier,
; st(3) = lower half of multiplier,
; st(4) = lower half of multiplicand,
; st(5) = multiplicand * multiplier,
; st(6) = addend
fmul st(0), st(3) ; st(0) = upper half of multiplicand
; * lower half of multiplier,
; st(1) = upper half of multiplier,
; st(2) = upper half of multiplier
; * upper half of multiplicand
; - multiplicand * multiplier,
; st(3) = lower half of multiplier,
; st(4) = lower half of multiplicand,
; st(5) = multiplicand * multiplier,
; st(6) = addend
faddp st(2), st(0) ; st(0) = upper half of multiplier,
; st(1) = upper half of multiplier
; * upper half of multiplicand
; - multiplicand * multiplier
; + upper half of multiplicand
; * lower half of multiplier,
; st(2) = lower half of multiplier,
; st(3) = lower half of multiplicand,
; st(4) = multiplicand * multiplier,
; st(5) = addend
fmul st(0), st(3) ; st(0) = upper half of multiplier
; * lower half of multiplicand,
; st(1) = upper half of multiplier
; * upper half of multiplicand
; - multiplicand * multiplier
; + upper half of multiplicand
; * lower half of multiplier,
; st(2) = lower half of multiplier,
; st(3) = lower half of multiplicand,
; st(4) = multiplicand * multiplier,
; st(5) = addend
faddp st(1), st(0) ; st(0) = upper half of multiplier
; * upper half of multiplicand
; - multiplicand * multiplier
; + upper half of multiplicand
; * lower half of multiplier
; + upper half of multiplier
; * lower half of multiplicand,
; st(1) = lower half of multiplier,
; st(2) = lower half of multiplicand,
; st(3) = multiplicand * multiplier,
; st(4) = addend
fxch st(2) ; st(0) = lower half of multiplicand,
; st(1) = lower half of multiplier,
; st(2) = upper half of multiplier
; * upper half of multiplicand
; - multiplicand * multiplier
; + upper half of multiplicand
; * lower half of multiplier
; + upper half of multiplier
; * lower half of multiplicand,
; st(3) = multiplicand * multiplier,
; st(4) = addend
fmulp st(1), st(0) ; st(0) = lower half of multiplier
; * lower half of multiplicand,
; st(1) = upper half of multiplier
; * upper half of multiplicand
; - multiplicand * multiplier
; + upper half of multiplicand
; * lower half of multiplier
; + upper half of multiplier
; * lower half of multiplicand,
; st(2) = multiplicand * multiplier,
; st(3) = addend
faddp st(1), st(0) ; st(0) = upper half of multiplier
; * upper half of multiplicand
; - multiplicand * multiplier
; + upper half of multiplicand
; * lower half of multiplier
; + upper half of multiplier
; * lower half of multiplicand
; + lower half of multiplier
; * lower half of multiplicand
; = tail part of (intermediate) product,
; st(1) = multiplicand * multiplier
; = head part of (intermediate) product,
; st(2) = addend
; double s = x + y;
; double t = s - x;
; *l = (x + (t - s)) + (y - t);
Lmoller:
fxch st(2) ; st(0) = addend,
; st(1) = head part of (intermediate) product,
; st(2) = tail part of (intermediate) product
???
Lknuth:
???
Lfinal:
???
Laddend:
jp Lexit ; addend = INDEFINITE?
; addend = ±INFINITY?
fld real8 ptr [esp+12] ; st(0) = multiplier,
; st(1) = addend
; = ±0.0
Lmultiplier:
fld real8 ptr [esp+4] ; st(0) = multiplicand,
; st(1) = multiplier,
; st(2) = addend
Lmultiplicand:
fmulp st(1), st(0) ; st(0) = multiplier * multiplicand
faddp st(1), st(0) ; st(0) = multiplier * multiplicand + addend
Lexit:
ret
Loverflow:
mov eax, [esp+24] ; eax = high dword of addend
xor eax, [esp+16] ; eax = high dword of addend
; ^ high dword of multiplier
xor eax, [esp+8] ; eax = high dword of addend
; ^ high dword of multiplier
; ^ high dword of multiplicand
jns Linfinity ; (addend < 0.0) = (product < 0.0)?
; (sign of addend = sign of product?)
push (bias - 1) shl width mantissa
; [esp] = 0x3F000000
; = 0.5F
fld real4 ptr [esp] ; st(0) = 0.5,
; st(1) = ±INFINITY,
; st(2) = multiplicand,
; st(3) = multiplier,
; st(4) = addend
pop eax
fstp st(1) ; st(0) = 0.5,
; st(1) = multiplicand,
; st(2) = multiplier,
; st(3) = addend
fmul st(3), st(0) ; st(0) = 0.5,
; st(1) = multiplicand,
; st(2) = multiplier,
; st(3) = addend * 0.5
; = addend'
fmulp st(2), st(0) ; st(0) = multiplicand,
; st(1) = multiplier * 0.5
; = multiplier',
; st(2) = addend * 0.5
fld st(0) ; st(0) = multiplicand,
; st(1) = multiplicand,
; st(2) = multiplier * 0.5
; = multiplier',
; st(3) = addend * 0.5
; = addend'
fmul st(0), st(2) ; st(0) = multiplicand * multiplier'
; = product',
; st(1) = multiplicand,
; st(2) = multiplier * 0.5
; = multiplier',
; st(3) = addend * 0.5
; = addend'
???
Linfinity:
fstp st(1) ; st(0) = product
; = ±INFINITY,
; st(1) = multiplier,
; st(2) = addend
fstp st(1) ; st(0) = product
; = ±INFINITY,
; st(1) = addend
fstp st(1) ; st(0) = product
; = ±INFINITY
ret
fma endp
end
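The split-and-multiply steps of the listings above (labels Lveltkamp, Ldekker) can be sketched in C as follows. This is a minimal illustration under stated assumptions, not the author's routine: split() and two_product() are hypothetical names, and the split uses the textbook multiplication by 2**27 + 1 instead of the bit mask applied in the listings. It assumes round-to-nearest double arithmetic without excess precision, overflow or underflow.

```c
/* Hypothetical helpers: textbook Veltkamp split and Dekker product. */
static void split(double x, double *upper, double *lower)
{
    double c = 134217729.0 * x;   /* 134217729 = 2**27 + 1 */

    *upper = c - (c - x);         /* upper half of the significand */
    *lower = x - *upper;          /* lower half, computed exactly */
}

static void two_product(double x, double y, double *head, double *tail)
{
    double xu, xl, yu, yl;

    split(x, &xu, &xl);
    split(y, &yu, &yl);
    *head = x * y;                /* rounded product */
    /* same order of accumulation as the comments in the listings */
    *tail = (((xu * yu - *head) + xl * yu) + xu * yl) + xl * yl;
}
```

With x = y = 1 + 2**-30 the exact product is 1 + 2**-29 + 2**-60; the head holds 1 + 2**-29 and the tail the remaining 2**-60.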
fmod() Function
fmod() returns the remainder from the division of its arguments.
// Copyright © 2004-2024, Stefan Kanthak <stefan.kanthak@nexgo.de>
double fma(double x, double y, double z);
double trunc(double x);
double fmod(double dividend, double divisor)
{
#ifdef TRUNC
double quotient = trunc(dividend / divisor);
#else
double tmp, quotient = dividend / divisor;
if ((quotient > 0.0) && (quotient < 0x1.0p+52)) {
tmp = quotient;
quotient += 0x1.0p+52;
quotient -= 0x1.0p+52;
if (quotient > tmp)
quotient -= 1.0;
} else if ((quotient < 0.0) && (quotient > -0x1.0p+52)) {
tmp = quotient;
quotient -= 0x1.0p+52;
quotient += 0x1.0p+52;
if (quotient < tmp)
quotient += 1.0;
}
#endif
#if 0 // avoid subtractive cancellation
return quotient == 0.0 ? dividend : dividend - divisor * quotient;
#else
return fma(-quotient, divisor, dividend);
#endif
}
# Copyright © 2004-2024, Stefan Kanthak <stefan.kanthak@nexgo.de>
# NOTE: requires SSE 4.1 instruction set!
.arch generic64
.code64
.intel_syntax noprefix
.text
# xmm0 = dividend
# xmm1 = divisor
fmod:
movsd xmm2, xmm0 # xmm2 = dividend
divsd xmm2, xmm1 # xmm2 = dividend / divisor
# = quotient
roundsd xmm2, xmm2, 3 # xmm2 = trunc(quotient)
mulsd xmm1, xmm2 # xmm1 = divisor * trunc(quotient)
subsd xmm0, xmm1 # xmm0 = dividend - divisor * trunc(quotient)
# = remainder
ret
.size fmod, .-fmod
.type fmod, @function
.global fmod
.end
; Copyright © 2004-2024, Stefan Kanthak <stefan.kanthak@nexgo.de>
; https://msdn.microsoft.com/en-us/library/20dckbeh.aspx
; fmod(dividend, divisor) = dividend % divisor
; = dividend - divisor * trunc(dividend / divisor)
.686
.model flat, C
.code
fmod proc public ; [esp+12] = divisor
; [esp+4] = dividend
fld real8 ptr [esp+12] ; st(0) = divisor
fld real8 ptr [esp+4] ; st(0) = dividend,
; st(1) = divisor
Lreduce:
fprem ; st(0) = remainder,
; st(1) = divisor
fstsw ax ; ax = FPU status word,
; ah = B:C3:T:O:P:C2:C1:C0
sahf ; SF:ZF:0:AF:0:PF:1:CF = ah
jp Lreduce
fstp st(1) ; st(0) = remainder
ret
fmod endp
end
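As a cross-check for the listings above, the following C sketch computes the same remainder by shift-and-subtract reduction instead of a truncated quotient; it is a different technique, chosen here because every subtraction is exact, so it also works when the quotient is too large for a double. The name fmod_sketch() is hypothetical, and the handling of invalid operands (NaN result) is an assumption of this sketch.

```c
/* fmod_sketch() is a hypothetical name; the result has the sign of the dividend. */
static double fmod_sketch(double dividend, double divisor)
{
    double x = dividend < 0.0 ? -dividend : dividend;   /* |dividend| */
    double y = divisor < 0.0 ? -divisor : divisor;      /* |divisor| */

    /* dividend infinite or NaN, divisor NaN or zero: yield NaN */
    if (x - x != 0.0 || y != y || y == 0.0)
        return (x - x) / (y - y);

    while (x >= y) {
        double t = y;

        while (x - t >= t)   /* double t while 2 * t <= x */
            t += t;
        x -= t;              /* exact, since t <= x < 2 * t */
    }
    return dividend < 0.0 ? -x : x;
}
```

For example, fmod_sketch(5.5, 2.0) yields 1.5 and fmod_sketch(-5.5, 2.0) yields -1.5.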
fpclassify() Function
fpclassify() returns the implementation-defined category of its argument.
// Copyright © 2004-2024, Stefan Kanthak <stefan.kanthak@nexgo.de>
#define FP_ZERO 0
#define FP_SUBNORMAL 1
#define FP_NORMAL 2
#define FP_INFINITE 3
#define FP_NAN 4
#define INFINITY 0x1.0p+1024
#define MINIMUM 0x1.0p-1022
double fabs(double x);
int fpclassify(double argument)
{
#if 1
unsigned long long ull = *(unsigned long long *) &argument << 1;
if (ull == 0)
return FP_ZERO;
if (ull < (1ULL << 53))
return FP_SUBNORMAL;
if (ull < (2047ULL << 53))
return FP_NORMAL;
if (ull == (2047ULL << 53))
return FP_INFINITE;
#else
if (argument == 0.0)
return FP_ZERO;
argument = fabs(argument);
if (argument < MINIMUM)
return FP_SUBNORMAL;
if (argument < INFINITY)
return FP_NORMAL;
if (argument == INFINITY)
return FP_INFINITE;
#endif
return FP_NAN;
}
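The bit-pattern branch of the listing above can also be written without the pointer cast. The following sketch copies the representation with memcpy(), which avoids aliasing problems with optimising compilers; classify_sketch() and the SK_* constants are hypothetical names, and a 64-bit IEEE 754 double is assumed.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical category names, in the order used by the listing above. */
enum { SK_ZERO, SK_SUBNORMAL, SK_NORMAL, SK_INFINITE, SK_NAN };

static int classify_sketch(double argument)
{
    uint64_t bits;

    memcpy(&bits, &argument, sizeof bits);   /* copy the representation */
    bits <<= 1;                              /* shift out the sign bit */
    if (bits == 0)
        return SK_ZERO;
    if (bits < (UINT64_C(1) << 53))          /* biased exponent = 0 */
        return SK_SUBNORMAL;
    if (bits < (UINT64_C(2047) << 53))       /* biased exponent < 2047 */
        return SK_NORMAL;
    if (bits == (UINT64_C(2047) << 53))      /* exponent = 2047, fraction = 0 */
        return SK_INFINITE;
    return SK_NAN;
}
```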
frexp() Function
frexp() returns the normalized fraction and the (integral) exponent of its
first argument.
; Copyright © 2004-2024, Stefan Kanthak <stefan.kanthak@nexgo.de>
; https://msdn.microsoft.com/en-us/library/w1xfschh.aspx
.686
.model flat, C
.code
single record sign:1, exponent:8, mantissa:23
bias equ 1 shl (width exponent - 1) - 1
frexp proc public ; [esp+12] = address of exponent
; [esp+4] = argument
if 0
fld1 ; st(0) = 1.0
fchs ; st(0) = -1.0
fld real8 ptr [esp+4] ; st(0) = argument,
; st(1) = -1.0
fxtract ; st(0) = argument / 2.0**exponent
; = mantissa,
; st(1) = exponent,
; st(2) = -1.0
fxch st(1) ; st(0) = exponent,
; st(1) = mantissa,
; st(2) = -1.0
fsub st(0), st(2) ; st(0) = exponent + 1.0,
; st(1) = mantissa,
; st(2) = -1.0
mov eax, [esp+12] ; eax = address of exponent
fistp dword ptr [eax] ; [eax] = exponent + 1.0,
; st(0) = mantissa,
; st(1) = -1.0
fscale ; st(0) = mantissa / 2.0,
; st(1) = -1.0
fstp st(1) ; st(0) = mantissa / 2.0
else
fld real8 ptr [esp+4] ; st(0) = argument
fxtract ; st(0) = argument / 2.0**exponent
; = mantissa,
; st(1) = exponent
fxch st(1) ; st(0) = exponent,
; st(1) = mantissa
mov eax, [esp+12] ; eax = address of exponent
fistp dword ptr [eax] ; [eax] = exponent,
; st(0) = mantissa
inc dword ptr [eax] ; [eax] = exponent + 1
push (bias - 1) shl width mantissa
; [esp] = 0x3F000000
; = 0.5F
fmul real4 ptr [esp] ; st(0) = mantissa / 2.0
pop eax
endif
ret
frexp endp
end
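The decomposition performed with FXTRACT above can be sketched in C on the bit pattern. This is an illustration for normal numbers only: frexp_sketch() is a hypothetical name, and zero, subnormal, infinite and NaN arguments are simply returned unchanged with an exponent of 0, which is an assumption of the sketch.

```c
#include <stdint.h>
#include <string.h>

/* frexp_sketch() is a hypothetical name; argument = fraction * 2**exponent
   with 0.5 <= |fraction| < 1.0 for normal arguments. */
static double frexp_sketch(double argument, int *exponent)
{
    uint64_t bits;
    int      biased;

    memcpy(&bits, &argument, sizeof bits);
    biased = (int) ((bits >> 52) & 2047);    /* biased exponent field */
    if (biased == 0 || biased == 2047) {     /* not handled by this sketch */
        *exponent = 0;
        return argument;
    }
    *exponent = biased - 1022;               /* unbiased exponent + 1 */
    bits &= ~(UINT64_C(2047) << 52);         /* clear the exponent field */
    bits |= UINT64_C(1022) << 52;            /* exponent of 0.5 */
    memcpy(&argument, &bits, sizeof bits);
    return argument;
}
```

For 8.0 the sketch yields the fraction 0.5 and the exponent 4, since 8.0 = 0.5 * 2**4.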
isfinite() Function
isfinite() returns non-zero if its argument is a finite floating-point number.
// Copyright © 2004-2024, Stefan Kanthak <stefan.kanthak@nexgo.de>
int isfinite(double argument)
{
return (*(unsigned long long *) &argument << 1) < (2047ULL << 53);
}
# Copyright © 2004-2024, Stefan Kanthak <stefan.kanthak@nexgo.de>
.arch generic64
.code64
.intel_syntax noprefix
.text
# xmm0 = argument
isfinite:
movq rdx, xmm0 # rdx = argument
add rdx, rdx # rdx = argument << 1
# = |argument| << 1
mov rax, 0xFFE0000000000000 # rax = 0x1.0p+1024 << 1
cmp rdx, rax
setb al # eax = (|argument| < 0x1.0p+1024) ? 1 : 0
mov eax, eax # rax = (|argument| < 0x1.0p+1024) ? 1 : 0
ret
.size isfinite, .-isfinite
.type isfinite, @function
.global isfinite
.end
isinf() Function
isinf() returns non-zero if its argument is +∞ or −∞.
// Copyright © 2004-2024, Stefan Kanthak <stefan.kanthak@nexgo.de>
int isinf(double argument)
{
return (*(unsigned long long *) &argument << 1) == (2047ULL << 53);
}
# Copyright © 2004-2024, Stefan Kanthak <stefan.kanthak@nexgo.de>
.arch generic64
.code64
.intel_syntax noprefix
.text
# xmm0 = argument
isinf:
movq rdx, xmm0 # rdx = argument
add rdx, rdx # rdx = argument << 1
# = |argument| << 1
mov rax, 0xFFE0000000000000 # rax = 0x1.0p+1024 << 1
cmp rdx, rax
sete al # eax = (|argument| = 0x1.0p+1024) ? 1 : 0
mov eax, eax # rax = (|argument| = 0x1.0p+1024) ? 1 : 0
ret
.size isinf, .-isinf
.type isinf, @function
.global isinf
.end
isnan() Function
isnan() returns non-zero if its argument is a NaN.
// Copyright © 2004-2024, Stefan Kanthak <stefan.kanthak@nexgo.de>
int isnan(double argument)
{
return (*(unsigned long long *) &argument << 1) > (2047ULL << 53);
}
# Copyright © 2004-2024, Stefan Kanthak <stefan.kanthak@nexgo.de>
.arch generic64
.code64
.intel_syntax noprefix
.text
# xmm0 = argument
isnan:
movq rdx, xmm0 # rdx = argument
add rdx, rdx # rdx = argument << 1
# = |argument| << 1
mov rax, 0xFFE0000000000000 # rax = 0x1.0p+1024 << 1
cmp rdx, rax
seta al # eax = (|argument| > 0x1.0p+1024) ? 1 : 0
mov eax, eax # rax = (|argument| > 0x1.0p+1024) ? 1 : 0
ret
.size isnan, .-isnan
.type isnan, @function
.global isnan
.end
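isfinite(), isinf() and isnan() above share one idea: shift the sign bit out, then compare the remaining bits with the shifted pattern of infinity. A compact C sketch of that comparison follows; the names are hypothetical, the representation is copied with memcpy() rather than a pointer cast, and a 64-bit IEEE 754 double is assumed.

```c
#include <stdint.h>
#include <string.h>

/* Shifted bit pattern of infinity: exponent = 2047, fraction = 0. */
#define SK_INFINITY_BITS (UINT64_C(2047) << 53)

/* Hypothetical helper: representation of |argument|, shifted left by 1. */
static uint64_t magnitude_bits(double argument)
{
    uint64_t bits;

    memcpy(&bits, &argument, sizeof bits);
    return bits << 1;
}

static int is_finite_sketch(double argument)
{
    return magnitude_bits(argument) < SK_INFINITY_BITS;
}

static int is_inf_sketch(double argument)
{
    return magnitude_bits(argument) == SK_INFINITY_BITS;
}

static int is_nan_sketch(double argument)
{
    return magnitude_bits(argument) > SK_INFINITY_BITS;
}
```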
isnormal() Function
isnormal() returns non-zero if its argument is a normal floating-point
number, i.e. neither zero, subnormal, infinite nor NaN.
// Copyright © 2004-2024, Stefan Kanthak <stefan.kanthak@nexgo.de>
int isnormal(double argument)
{
return ((*(unsigned long long *) &argument << 1) < (2047ULL << 53))
&& ((*(unsigned long long *) &argument << 1) >= (1ULL << 53));
}
# Copyright © 2004-2024, Stefan Kanthak <stefan.kanthak@nexgo.de>
.arch generic64
.code64
.intel_syntax noprefix
.text
# xmm0 = argument
isnormal:
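movq rdx, xmm0 # rdx = argument
add rdx, rdx # rdx = argument << 1
# = |argument| << 1
mov rax, 0x0020000000000000 # rax = 0x1.0p-1022 << 1
sub rdx, rax # rdx = (|argument| << 1) - (0x1.0p-1022 << 1),
# wraps around for ±0.0 and subnormals
mov rax, 0xFFC0000000000000 # rax = (0x1.0p+1024 << 1) - (0x1.0p-1022 << 1)
cmp rdx, rax
setb al # eax = (0x1.0p-1022 <= |argument| < 0x1.0p+1024) ? 1 : 0
mov eax, eax # rax = (0x1.0p-1022 <= |argument| < 0x1.0p+1024) ? 1 : 0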
ret
.size isnormal, .-isnormal
.type isnormal, @function
.global isnormal
.end
issubnormal() Function
issubnormal() returns non-zero if its argument is a (non-zero) subnormal
floating-point number.
// Copyright © 2004-2024, Stefan Kanthak <stefan.kanthak@nexgo.de>
int issubnormal(double argument)
{
return ((*(unsigned long long *) &argument << 1) < (1ULL << 53))
&& ((*(unsigned long long *) &argument << 1) != 0);
}
# Copyright © 2004-2024, Stefan Kanthak <stefan.kanthak@nexgo.de>
.arch generic64
.code64
.intel_syntax noprefix
.text
# xmm0 = argument
issubnormal:
movq rdx, xmm0 # rdx = argument
add rdx, rdx # rdx = argument << 1
# = |argument| << 1
mov rax, 0x0020000000000000 # rax = 0x1.0p-1022 << 1
setne cl # cl = (|argument| <> 0.0) ? 1 : 0
cmp rdx, rax
setb al # eax = (|argument| < 0x1.0p-1022) ? 1 : 0
and eax, ecx # rax = (0.0 < |argument| < 0x1.0p-1022) ? 1 : 0
ret
.size issubnormal, .-issubnormal
.type issubnormal, @function
.global issubnormal
.end
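Both range tests can be folded into a single unsigned comparison by subtracting the lower bound first, so that values below the range wrap around to very large numbers. The sketch below follows the ISO C meaning of normal (neither zero, subnormal, infinite nor NaN); all names are hypothetical and a 64-bit IEEE 754 double is assumed.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: representation of |argument|, shifted left by 1. */
static uint64_t magnitude_bits(double argument)
{
    uint64_t bits;

    memcpy(&bits, &argument, sizeof bits);
    return bits << 1;
}

/* Biased exponent in 1 ... 2046. */
static int is_normal_sketch(double argument)
{
    return magnitude_bits(argument) - (UINT64_C(1) << 53)
         < (UINT64_C(2046) << 53);
}

/* Biased exponent 0 and non-zero fraction. */
static int is_subnormal_sketch(double argument)
{
    return magnitude_bits(argument) - 1 < (UINT64_C(1) << 53) - 1;
}
```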
ldexp() Function
ldexp() returns its first argument multiplied by 2 raised to the power of
its (integral) second argument.
; Copyright © 2004-2024, Stefan Kanthak <stefan.kanthak@nexgo.de>
; https://msdn.microsoft.com/en-us/library/zx52ds7f.aspx
; https://msdn.microsoft.com/en-us/library/dn465179.aspx
; ldexp(x, n) = x * 2**n
; scalbn(x, n) = x * 2**n
.686
.model flat, C
.code
ldexp proc public ; [esp+12] = exponent
scalbn proc public ; [esp+4] = argument
fild dword ptr [esp+12] ; st(0) = exponent
fld real8 ptr [esp+4] ; st(0) = argument,
; st(1) = exponent
fscale ; st(0) = argument * 2.0**exponent,
; st(1) = exponent
fstp st(1) ; st(0) = argument * 2.0**exponent
ret
scalbn endp
ldexp endp
end
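Without FSCALE, the scaling can be sketched in C by building 2**exponent directly from its bit pattern and multiplying once. The name ldexp_sketch() is hypothetical, and the sketch assumes -1022 <= exponent <= 1023, so that the scale factor is a normal number; larger or smaller exponents would need several steps.

```c
#include <stdint.h>
#include <string.h>

/* ldexp_sketch() is a hypothetical name;
   valid only for -1022 <= exponent <= 1023 (assumption of this sketch). */
static double ldexp_sketch(double argument, int exponent)
{
    uint64_t bits = (uint64_t) (exponent + 1023) << 52;   /* 2**exponent */
    double   scale;

    memcpy(&scale, &bits, sizeof scale);
    return argument * scale;   /* exact unless the result over- or underflows */
}
```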
ldexp10() Function
// Copyright © 2004-2024, Stefan Kanthak <stefan.kanthak@nexgo.de>
#define INFINITY (1.0 / 0.5e-323)
// powers of 5 from 5**0 up to 5**22 (less than 2**53, hence exact)
static const double powers5[] = {1.0,
#if 0
1.0e+1 * 0x1.0p-1,
1.0e+2 * 0x1.0p-2,
1.0e+3 * 0x1.0p-3,
1.0e+4 * 0x1.0p-4,
1.0e+5 * 0x1.0p-5,
1.0e+6 * 0x1.0p-6,
1.0e+7 * 0x1.0p-7,
1.0e+8 * 0x1.0p-8,
1.0e+9 * 0x1.0p-9,
1.0e+10 * 0x1.0p-10,
1.0e+11 * 0x1.0p-11,
1.0e+12 * 0x1.0p-12,
1.0e+13 * 0x1.0p-13,
1.0e+14 * 0x1.0p-14,
1.0e+15 * 0x1.0p-15,
1.0e+16 * 0x1.0p-16,
1.0e+17 * 0x1.0p-17,
1.0e+18 * 0x1.0p-18,
1.0e+19 * 0x1.0p-19,
1.0e+20 * 0x1.0p-20,
1.0e+21 * 0x1.0p-21,
1.0e+22 * 0x1.0p-22};
#else
5.0,
25.0,
125.0,
625.0,
3125.0,
15625.0,
78125.0,
390625.0,
1953125.0,
9765625.0,
48828125.0,
244140625.0,
1220703125.0,
6103515625.0,
30517578125.0,
152587890625.0,
762939453125.0,
3814697265625.0,
19073486328125.0,
95367431640625.0,
476837158203125.0,
2384185791015625.0};
#endif
// powers of 5 from 5**0 up to 5**(23*19) in steps of 5**23
static const double powers5positive[] = {1.0,
#if 0
1.0e+23 * 0x1.0p-23,
1.0e+46 * 0x1.0p-46,
1.0e+69 * 0x1.0p-69,
1.0e+92 * 0x1.0p-92,
1.0e+115 * 0x1.0p-115,
1.0e+138 * 0x1.0p-138,
1.0e+161 * 0x1.0p-161,
1.0e+184 * 0x1.0p-184,
1.0e+207 * 0x1.0p-207,
1.0e+230 * 0x1.0p-230,
1.0e+253 * 0x1.0p-253,
1.0e+276 * 0x1.0p-276,
1.0e+299 * 0x1.0p-299,
1.0e+322 * 0x1.0p-322,
1.0e+345 * 0x1.0p-345,
1.0e+368 * 0x1.0p-368,
1.0e+391 * 0x1.0p-391,
1.0e+414 * 0x1.0p-414,
1.0e+437 * 0x1.0p-437};
#elif 0
0x1.52D02C7E14AF6p+53,
0x1.C06A5EC5433C6p+106,
0x1.28BC8ABE49F64p+160,
0x1.88BA3BF284E24p+213,
0x1.03E29F5C2B18Cp+267,
0x1.57F48BB41DB7Cp+320,
0x1.C73892ECBFBF4p+373,
0x1.2D3D6F88F0B3Dp+427,
0x1.8EB0138858D0Ap+480,
0x1.07D457124123Dp+534,
0x1.5D2CE55747A18p+587,
0x1.CE2137F743382p+640,
0x1.31CFD3999F7B0p+694,
0x1.94BD136316C04p+747,
0x1.0BD561C834D28p+801,
0x1.627987065DE19p+854,
0x1.D524B49F94CA1p+907,
0x1.3673FAEB68902p+961,
0x1.9AE1957B849F0p+1014};
#else
1.1920928955078125e+16,
1.4210854715202004e+32,
1.6940658945086007e+48,
2.0194839173657902e+64,
2.4074124304840448e+80,
2.8698592549372254e+96,
3.4211388289180104e+112,
4.0783152924990778e+128,
4.8617306858290170e+144,
5.7956346104490959e+160,
6.9089348440755557e+176,
8.2360921431488463e+192,
9.8181869305954531e+208,
1.1704190886730495e+225,
1.3952482803738708e+241,
1.6632655625031839e+257,
1.9827670604028510e+273,
2.3636425261531484e+289,
2.8176814629473071e+305};
#endif
// powers of 5 from 5**-0 down to 5**(-23*19) in steps of 5**-23
static const double powers5negative[] = {1.0,
#if 0
1.0e-23 * 0x1.0p+23,
1.0e-46 * 0x1.0p+46,
1.0e-69 * 0x1.0p+69,
1.0e-92 * 0x1.0p+92,
1.0e-115 * 0x1.0p+115,
1.0e-138 * 0x1.0p+138,
1.0e-161 * 0x1.0p+161,
1.0e-184 * 0x1.0p+184,
1.0e-207 * 0x1.0p+207,
1.0e-230 * 0x1.0p+230,
1.0e-253 * 0x1.0p+253,
1.0e-276 * 0x1.0p+276,
1.0e-299 * 0x1.0p+299,
1.0e-322 * 0x1.0p+322,
1.0e-345 * 0x1.0p+345,
1.0e-368 * 0x1.0p+368,
1.0e-391 * 0x1.0p+391,
1.0e-414 * 0x1.0p+414,
1.0e-437 * 0x1.0p+437};
#elif 0
0x1.82DB34012B251p-54,
0x1.244CE242C5561p-107,
0x1.B9B6364F30304p-161,
0x1.4DBF7B3F71CB7p-214,
0x1.F8587E7083E30p-268,
0x1.7D12A4670C123p-321,
0x1.1FEE341FC585Dp-374,
0x1.B31BB5DC320D2p-428,
0x1.48C22CA71A1BDp-481,
0x1.F0CE4839198DBp-535,
0x1.77603725064A8p-588,
0x1.1BA03F5B21000p-641,
0x1.AC9A7B3B7302Fp-695,
0x1.43D7F68432923p-748,
0x1.E960ED3C8FD6Bp-802,
0x1.71C3978517DE1p-855,
0x1.1762C3F35BDA3p-908,
0x1.A63225B3E7F4Cp-962,
0x1.3F008FC1D0D46p-1015};
#else
8.388608e-17,
7.0368744177664e-33,
5.9029581035870565e-49,
4.9517601571415211e-65,
4.1538374868278621e-81,
3.4844914372704099e-97,
2.9230032746618058e-113,
2.4519928653854222e-129,
2.0568806966515076e-145,
1.7254365866976409e-161,
1.4474011154664524e-177,
1.2141680576410807e-193,
1.0185179881672430e-209,
8.5439481436836403e-226,
7.1671831749689735e-242,
6.0122690119010131e-258,
5.0434567931384933e-274,
4.2307582002575910e-290,
3.5490172084746430e-306};
#endif
enum {
count5 = sizeof(powers5) / sizeof(*powers5),
count5positive = sizeof(powers5positive) / sizeof(*powers5positive),
count5negative = sizeof(powers5negative) / sizeof(*powers5negative)
};
double fabs(double x);
double ldexp(double x, int z);
double ldexp10(double argument, int exponent)
{
if (argument != argument)
return argument + 0.0; // propagate the (quieted) NaN
if (argument == 0.0)
return argument;
if (fabs(argument) == INFINITY)
return argument;
if (exponent > 0) {
if (exponent > 324 + 308 - 1)
return argument < 0.0 ? -INFINITY : INFINITY;
if (exponent > count5 * count5positive - 1) {
argument *= 1.0e+303;
exponent -= 303;
}
return ldexp(argument, exponent)
* powers5positive[exponent / count5]
* powers5[exponent % count5];
}
if (exponent < 0) {
if (exponent < 1 - 324 - 308)
return argument < 0.0 ? -0.0 : 0.0;
if (exponent < 1 - count5 * count5negative) {
argument /= 1.0e+303;
exponent += 303;
}
return ldexp(argument, exponent)
* powers5negative[-exponent / count5]
/ powers5[-exponent % count5];
}
return argument;
}
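The listing above relies on 10**n = 2**n * 5**n, where the powers of 5 up to 5**22 are exact in a double and the multiplication by a power of 2 is exact. A reduced sketch of that idea for small exponents follows; ldexp10_sketch() is a hypothetical name, and the restriction to |exponent| <= 22 without overflow or underflow is an assumption of the sketch.

```c
/* ldexp10_sketch() is a hypothetical name;
   valid only for -22 <= exponent <= 22 (assumption of this sketch). */
static double ldexp10_sketch(double argument, int exponent)
{
    double power5 = 1.0;   /* 5**|exponent|, exact up to 5**22 */
    double power2 = 1.0;   /* 2**|exponent|, always exact */
    int    n = exponent < 0 ? -exponent : exponent;

    while (n-- > 0) {
        power5 *= 5.0;
        power2 *= 2.0;
    }
    /* one rounding for the power of 5; scaling by the power of 2 is exact */
    return exponent < 0 ? argument / power5 / power2
                        : argument * power5 * power2;
}
```

Since only the multiplication or division by the power of 5 can round, the result is the correctly rounded value of argument * 10**exponent within the stated range.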
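remainder() Function
remainder() returns the remainder from the division of its arguments, with the
quotient rounded to the nearest integral value, ties to even; the C and x64
listings below round according to the current mode and thus rely on the default
(round to nearest, ties to even) rounding mode.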
// Copyright © 2004-2024, Stefan Kanthak <stefan.kanthak@nexgo.de>
double fma(double x, double y, double z);
double rint(double x);
double remainder(double dividend, double divisor)
{
#ifdef RINT
double quotient = rint(dividend / divisor);
#else
double quotient = dividend / divisor;
if ((quotient > 0.0) && (quotient < 0x1.0p+52)) {
quotient += 0x1.0p+52;
quotient -= 0x1.0p+52;
} else if ((quotient < 0.0) && (quotient > -0x1.0p+52)) {
quotient -= 0x1.0p+52;
quotient += 0x1.0p+52;
}
#endif
#if 0 // avoid subtractive cancellation
return quotient == 0.0 ? dividend : dividend - divisor * quotient;
#else
return fma(-quotient, divisor, dividend);
#endif
}
# Copyright © 2004-2024, Stefan Kanthak <stefan.kanthak@nexgo.de>
# NOTE: requires SSE 4.1 instruction set!
.arch generic64
.code64
.intel_syntax noprefix
.text
# xmm0 = dividend
# xmm1 = divisor
remainder:
movsd xmm2, xmm0 # xmm2 = dividend
divsd xmm2, xmm1 # xmm2 = dividend / divisor
# = quotient
roundsd xmm2, xmm2, 4 # xmm2 = rint(quotient)
mulsd xmm1, xmm2 # xmm1 = divisor * rint(quotient)
subsd xmm0, xmm1 # xmm0 = dividend - divisor * rint(quotient)
# = remainder
ret
.size remainder, .-remainder
.type remainder, @function
.global remainder
.end
; Copyright © 2004-2024, Stefan Kanthak <stefan.kanthak@nexgo.de>
; https://msdn.microsoft.com/en-us/library/dn465170.aspx
.686
.model flat, C
.code
remainder proc public ; [esp+12] = divisor
; [esp+4] = dividend
fld real8 ptr [esp+12] ; st(0) = divisor
fld real8 ptr [esp+4] ; st(0) = dividend,
; st(1) = divisor
Lreduce:
fprem1 ; st(0) = remainder,
; st(1) = divisor
fstsw ax ; ax = FPU status word,
; ah = B:C3:T:O:P:C2:C1:C0
sahf ; SF:ZF:0:AF:0:PF:1:CF = ah
jp Lreduce
fstp st(1) ; st(0) = remainder
ret
remainder endp
end
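The effect of rounding the quotient to nearest, ties to even, can be seen in a small C sketch that uses the 2**52 trick of the C listing above. It is an illustration only: remainder_sketch() is a hypothetical name, the default rounding mode is assumed, and the result is reliable only for operands whose quotient is small and not so close to a tie that the rounded division lands on it, such as the small integers used here.

```c
/* remainder_sketch() is a hypothetical name; illustration for small operands,
   assuming the default (round to nearest, ties to even) rounding mode. */
static double remainder_sketch(double dividend, double divisor)
{
    volatile double n = dividend / divisor;   /* volatile: round every step */

    if (n > -0x1.0p+52 && n < 0x1.0p+52) {
        if (n >= 0.0) {
            n += 0x1.0p+52;   /* fraction is rounded off here, ties to even */
            n -= 0x1.0p+52;
        } else {
            n -= 0x1.0p+52;
            n += 0x1.0p+52;
        }
    }
    return dividend - divisor * n;
}
```

5 / 2 = 2.5 is rounded to 2 and leaves the remainder 1, while 7 / 2 = 3.5 is rounded to 4 and leaves the remainder -1; fmod() would return 1 in both cases.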
remquo() Function
remquo() returns the remainder and the (partial) integral quotient from the
division of its arguments.
// Copyright © 2004-2024, Stefan Kanthak <stefan.kanthak@nexgo.de>
double fma(double x, double y, double z);
double trunc(double x);
double remquo(double dividend, double divisor, int *quotient)
{
#ifdef TRUNC
double ratio = trunc(dividend / divisor);
#else
double tmp, ratio = dividend / divisor;
if ((ratio > 0.0) && (ratio < 0x1.0p+52)) {
tmp = ratio;
ratio += 0x1.0p+52;
ratio -= 0x1.0p+52;
if (ratio > tmp)
ratio -= 1.0;
} else if ((ratio < 0.0) && (ratio > -0x1.0p+52)) {
tmp = ratio;
ratio -= 0x1.0p+52;
ratio += 0x1.0p+52;
if (ratio < tmp)
ratio += 1.0;
}
#endif
*quotient = (int) ratio;
#if 0 // avoid subtractive cancellation
return ratio == 0.0 ? dividend : dividend - divisor * ratio;
#else
return fma(-ratio, divisor, dividend);
#endif
}
# Copyright © 2004-2024, Stefan Kanthak <stefan.kanthak@nexgo.de>
# NOTE: requires SSE 4.1 instruction set!
.arch generic64
.code64
.intel_syntax noprefix
.text
# xmm0 = dividend
# xmm1 = divisor
# rdi = address of quotient
remquo:
movsd xmm2, xmm0 # xmm2 = dividend
divsd xmm2, xmm1 # xmm2 = dividend / divisor
# = quotient
roundsd xmm2, xmm2, 3 # xmm2 = trunc(quotient)
mulsd xmm1, xmm2 # xmm1 = divisor * trunc(quotient)
subsd xmm0, xmm1 # xmm0 = dividend - divisor * trunc(quotient)
# = remainder
cvtsd2si eax, xmm2 # eax = trunc(quotient)
mov [rdi], eax # *quotient = trunc(quotient)
ret
.size remquo, .-remquo
.type remquo, @function
.global remquo
.end
; Copyright © 2004-2024, Stefan Kanthak <stefan.kanthak@nexgo.de>
; https://msdn.microsoft.com/en-us/library/dn465175.aspx
.686
.model flat, C
.code
remquo proc public ; [esp+20] = address of (partial) quotient
; [esp+12] = divisor
; [esp+4] = dividend
fld real8 ptr [esp+12] ; st(0) = divisor
fld real8 ptr [esp+4] ; st(0) = dividend,
; st(1) = divisor
mov ecx, [esp+20] ; ecx = address of quotient
mov eax, [esp+16] ; eax = high dword of divisor
xor eax, [esp+8] ; eax = high dword of divisor
; ^ high dword of dividend
cdq ; edx = (sign of dividend <> sign of divisor) ? -1 : 0
Lreduce:
fprem1 ; st(0) = dividend modulo divisor,
; st(1) = divisor,
; C0:C3:C1 = least significant bits of quotient
fstsw ax ; ax = FPU status word,
; ah = B:C3:T:O:P:C2:C1:C0
sahf ; SF:ZF:0:AF:0:PF:1:CF = ah
jp Lreduce
fstp st(1) ; st(0) = dividend modulo divisor
; = remainder
Lquotient:
and eax, 4300h ; eax = 0b0:C3:0000:C1:C0:00000000
imul eax, 910000h
shr eax, 29 ; eax = C0:C3:C1
; = (partial) quotient
Lsign:
xor eax, edx
sub eax, edx ; eax = (sign of dividend <> sign of divisor)
; ? -quotient : quotient
mov [ecx], eax
ret
remquo endp
end
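The AND, IMUL and SHR sequence at label Lquotient gathers the three condition code bits of the FPU status word into one small integer. The same arithmetic can be replayed in C to see where each bit lands; quotient_bits() is a hypothetical name, and the sketch only models the bit manipulation, not the FPU itself.

```c
#include <stdint.h>

/* quotient_bits() is a hypothetical name: extracts C0:C3:C1 from an
   FPU status word, where C0 is bit 8, C1 is bit 9 and C3 is bit 14. */
static unsigned quotient_bits(unsigned status_word)
{
    uint32_t c = (uint32_t) status_word & UINT32_C(0x4300);   /* C3, C1, C0 */

    /* 0x910000 = 2**23 + 2**20 + 2**16 moves C0 to bit 31, C3 to bit 30
       and C1 to bit 29; all other partial products fall below bit 29
       or are lost beyond bit 31 */
    c = (uint32_t) (c * UINT32_C(0x910000));
    return c >> 29;   /* C0:C3:C1 = low 3 bits of the quotient */
}
```

A status word with only C0 set yields 4, only C3 yields 2 and only C1 yields 1.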
rint() Function
rint() returns the integral value nearest to its argument, rounded according
to the current rounding mode.
// Copyright © 2004-2024, Stefan Kanthak <stefan.kanthak@nexgo.de>
double rint(double argument)
{
if ((argument > 0.0) && (argument < 0x1.0p+52)) {
argument += 0x1.0p+52;
argument -= 0x1.0p+52;
} else if ((argument < 0.0) && (argument > -0x1.0p+52)) {
argument -= 0x1.0p+52;
argument += 0x1.0p+52;
if (argument == 0.0)
argument = -0.0;
} else if (argument != 0.0)
argument += 0.0;
return argument;
}
# Copyright © 2004-2024, Stefan Kanthak <stefan.kanthak@nexgo.de>
# NOTE: requires SSE 4.1 instruction set!
.arch generic64
.code64
.intel_syntax noprefix
.text
# xmm0 = argument
rint:
roundsd xmm0, xmm0, 4 # xmm0 = argument rounded according to current mode
ret
.size rint, .-rint
.type rint, @function
.global rint
.end
# Copyright © 2004-2024, Stefan Kanthak <stefan.kanthak@nexgo.de>
# NOTE: depending on current rounding mode, rint() is equivalent to
# floor(), ceil(), roundeven() or trunc(), but differs from
# round(); while roundeven() breaks ties to the nearest even
# integer, round() breaks ties away from 0, which neither CPU
# nor FPU supports in its instruction set!
# NOTE: rint() preserves -0.0, and returns -0.0 for argument in
# [-0.5, -0.0] or (-1.0, -0.0]
.arch generic64
.code64
.intel_syntax noprefix
.text
# xmm0 = argument
rint:
mov rax, 0x4330000000000000
movq xmm2, rax # xmm2 = 0x1.0p+52
# = 4503599627370496.0
# = minimum non-fractional number
xorpd xmm1, xmm1 # xmm1 = 0.0
subsd xmm1, xmm0 # xmm1 = -argument
andpd xmm1, xmm0 # xmm1 = |argument|
xorpd xmm0, xmm1 # xmm0 = (argument & -0.0) ? -0.0 : +0.0
addsd xmm1, xmm2 # xmm1 = |argument| + 0x1.0p+52
subsd xmm1, xmm2 # xmm1 = |argument| + 0x1.0p+52 - 0x1.0p+52
# = rint(|argument|)
orpd xmm0, xmm1 # xmm0 = rint(argument)
ret
.size rint, .-rint
.type rint, @function
.global rint
.end
; Copyright © 2004-2024, Stefan Kanthak <stefan.kanthak@nexgo.de>
; https://msdn.microsoft.com/en-us/library/dn465165.aspx
; NOTE: depending on current rounding mode, rint() is equivalent to
; floor(), ceil(), roundeven() or trunc(), but differs from
; round(); while roundeven() breaks ties to the nearest even
; integer, round() breaks ties away from 0, which neither FPU
; nor CPU support in their instruction sets!
.686
.model flat, C
.code
rint proc public ; [esp+4] = argument
fld real8 ptr [esp+4] ; st(0) = argument
frndint ; st(0) = rint(argument)
ret
rint endp
end
round() Function
round() returns the nearest integral value to its argument, rounding ties
away from 0.
// Copyright © 2004-2024, Stefan Kanthak <stefan.kanthak@nexgo.de>
double round(double argument)
{
#ifdef TRUNC
double trunc(double x);
double tmp = trunc(argument);
if (argument > 0.0)
return argument - tmp < 0.5 ? tmp : tmp + 1.0;
if (argument < 0.0)
return argument - tmp > -0.5 ? tmp : tmp - 1.0;
return tmp;
#else
double tmp;
if ((argument > 0.0) && (argument < 0x1.0p+52)) {
tmp = argument;
argument += 0x1.0p+52;
argument -= 0x1.0p+52;
if (argument - tmp <= -0.5)
argument += 1.0;
} else if ((argument < 0.0) && (argument > -0x1.0p+52)) {
tmp = argument;
argument -= 0x1.0p+52;
argument += 0x1.0p+52;
if (argument - tmp >= 0.5)
argument -= 1.0;
else if (argument == 0.0)
argument = -0.0;
} else if (argument != 0.0)
argument += 0.0;
return argument;
#endif
}
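The difference between rounding ties away from 0 and the tie-to-even behaviour of rint() can be seen in a compact restatement of the truncation-based variant. This is a sketch under stated assumptions: round_half_away() is a hypothetical name, and the truncation is done with a C integer conversion instead of trunc(), which is exact here because the argument is first limited to magnitudes below 0x1.0p+52.

```c
/* Sketch only: round_half_away() is a hypothetical name. */
double round_half_away(double x)
{
    double t, d;

    if (!(x > -0x1.0p+52 && x < 0x1.0p+52) || x == 0.0)
        return x;                   /* integral, infinite, NaN or ±0.0 */
    t = (double)(long long)x;       /* C conversion truncates toward zero */
    d = x - t;                      /* fractional part: exact, carries the sign of x */
    if (d >= 0.5)
        t += 1.0;                   /* positive tie or more: away from zero */
    else if (d <= -0.5)
        t -= 1.0;                   /* negative tie or more: away from zero */
    else if (t == 0.0 && x < 0.0)
        t = -0.0;                   /* arguments in (-0.5, -0.0) give -0.0 */
    return t;
}
```

Comparing the fractional part against 0.5, instead of adding 0.5 before truncating, avoids the classic error for 0.49999999999999994, where the addition itself would round up to 1.0.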
roundeven() Function
roundeven() returns the nearest integral
value to its argument, rounding ties to even.
# Copyright © 2004-2024, Stefan Kanthak <stefan.kanthak@nexgo.de>
# NOTE: requires SSE 4.1 instruction set!
.arch generic64
.code64
.intel_syntax noprefix
.text
# xmm0 = argument
roundeven:
roundsd xmm0, xmm0, 0 # xmm0 = argument rounded to nearest (even) integer
ret
.size roundeven, .-roundeven
.type roundeven, @function
.global roundeven
.end
signbit() Function
signbit() returns 1 if the sign of its argument is negative, else 0.
// Copyright © 2004-2024, Stefan Kanthak <stefan.kanthak@nexgo.de>
int signbit(double argument)
{
return *(long long *) &argument < 0;
}
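The pointer cast in the C listing above reads a double through a long long pointer, which strict-aliasing rules do not permit in portable C. A minimal alternative sketch copies the bits with memcpy() instead; the name signbit_bits() is hypothetical, and the code assumes the 64-bit IEEE 754 double layout this document uses throughout.

```c
#include <stdint.h>
#include <string.h>

/* Sketch only: signbit_bits() is a hypothetical name. */
int signbit_bits(double x)
{
    uint64_t bits;

    memcpy(&bits, &x, sizeof bits);  /* well-defined way to inspect the encoding */
    return (int)(bits >> 63);        /* bit 63 is the sign bit */
}
```

Like the listings above, this reports the sign of -0.0 and of NaNs, which a comparison such as x < 0.0 cannot do.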
# Copyright © 2004-2024, Stefan Kanthak <stefan.kanthak@nexgo.de>
.arch generic64
.code64
.intel_syntax noprefix
.text
# xmm0 = argument
signbit:
movmskpd eax, xmm0 # rax = (argument & -0.0) ? 0b?1 : 0b?0
and eax, 1 # rax = (argument & -0.0) ? 1 : 0
# = signbit(argument)
ret
.size signbit, .-signbit
.type signbit, @function
.global signbit
.end
; Copyright © 2004-2024, Stefan Kanthak <stefan.kanthak@nexgo.de>
.686
.model flat, C
.code
signbit proc public ; [esp+4] = argument
mov eax, [esp+8] ; eax = high dword of argument
shr eax, 31 ; eax = (argument & -0.0) ? 1 : 0
ret
signbit endp
end
sqrt() Function
sqrt() returns the positive square root of its argument.
# Copyright © 2004-2024, Stefan Kanthak <stefan.kanthak@nexgo.de>
.arch generic64
.code64
.intel_syntax noprefix
.text
# xmm0 = argument
sqrt:
sqrtsd xmm0, xmm0 # xmm0 = square root of argument
ret
.size sqrt, .-sqrt
.type sqrt, @function
.global sqrt
.end
; Copyright © 2004-2024, Stefan Kanthak <stefan.kanthak@nexgo.de>
; https://msdn.microsoft.com/en-us/library/f1xa99e6.aspx
.686
.model flat, C
.code
sqrt proc public ; [esp+4] = argument
fld real8 ptr [esp+4] ; st(0) = argument
fsqrt ; st(0) = square root of argument
ret
sqrt endp
end
trunc() Function
trunc() returns the integral value nearest to, but not larger in
magnitude than, its argument.
// Copyright © 2004-2024, Stefan Kanthak <stefan.kanthak@nexgo.de>
double trunc(double argument)
{
double tmp;
if ((argument > 0.0) && (argument < 0x1.0p+52)) {
tmp = argument;
argument += 0x1.0p+52;
argument -= 0x1.0p+52;
if (argument > tmp)
argument -= 1.0;
} else if ((argument < 0.0) && (argument > -0x1.0p+52)) {
tmp = argument;
argument -= 0x1.0p+52;
argument += 0x1.0p+52;
if (argument < tmp)
argument += 1.0;
else if (argument == 0.0)
argument = -0.0;
} else if (argument != 0.0)
argument += 0.0;
return argument;
}
# Copyright © 2004-2024, Stefan Kanthak <stefan.kanthak@nexgo.de>
# NOTE: requires SSE 4.1 instruction set!
.arch generic64
.code64
.intel_syntax noprefix
.text
# xmm0 = argument
trunc:
roundsd xmm0, xmm0, 3 # xmm0 = argument rounded towards zero
ret
.size trunc, .-trunc
.type trunc, @function
.global trunc
.end
# Copyright © 2004-2024, Stefan Kanthak <stefan.kanthak@nexgo.de>
# NOTE: trunc() returns -0.0 for argument in (-1.0, -0.0]
.arch generic64
.code64
.intel_syntax noprefix
.text
# xmm0 = argument
trunc:
mov rax, 0x4330000000000000
movq xmm2, rax # xmm2 = 0x1.0p+52
# = 4503599627370496.0
# = minimum non-fractional number
mov rax, 0x3FF0000000000000
xorpd xmm1, xmm1 # xmm1 = 0.0
subsd xmm1, xmm0 # xmm1 = -argument
andpd xmm1, xmm0 # xmm1 = |argument|
xorpd xmm0, xmm1 # xmm0 = (argument & -0.0) ? -0.0 : +0.0
movsd xmm3, xmm1 # xmm3 = |argument|
addsd xmm1, xmm2 # xmm1 = |argument| + 0x1.0p+52
subsd xmm1, xmm2 # xmm1 = |argument| - 0x1.0p+52
# = rint(|argument|)
movq xmm2, rax # xmm2 = 0x1.0p+0
# = 1.0
cmpsd xmm3, xmm1, 1 # xmm3 = (|argument| < rint(|argument|)) ? ~0L : 0L
andpd xmm3, xmm2 # xmm3 = (|argument| < rint(|argument|)) ? 1.0 : 0.0
subsd xmm1, xmm3 # xmm1 = (|argument| < rint(|argument|)) ? -1.0 : 0.0
# + rint(|argument|)
# = trunc(|argument|)
orpd xmm0, xmm1 # xmm0 = trunc(argument)
ret
.size trunc, .-trunc
.type trunc, @function
.global trunc
.end
; Copyright © 2004-2024, Stefan Kanthak <stefan.kanthak@nexgo.de>
; https://msdn.microsoft.com/en-us/library/mt720727.aspx
.686
.model flat, C
.code
trunc proc public ; [esp+4] = argument
fld real8 ptr [esp+4] ; st(0) = argument
if 0
ftst
fstsw ax ; ax = FPU status word,
; ah = B:C3:T:O:P:C2:C1:C0
sahf ; SF:ZF:0:AF:0:PF:1:CF = ah
jz Lexit ; argument = ±0.0?
fld1 ; st(0) = 1.0,
; st(1) = argument
fld st(1) ; st(0) = argument,
; st(1) = 1.0,
; st(2) = argument
Lmodulo:
fprem ; st(0) = argument modulo 1.0
; = argument',
; st(1) = 1.0,
; st(2) = argument
fstsw ax ; ax = FPU status word,
; ah = B:C3:T:O:P:C2:C1:C0
sahf ; SF:ZF:0:AF:0:PF:1:CF = ah
jp Lmodulo ; |argument'| >= 1.0?
fstp st(1) ; st(0) = argument',
; st(1) = argument
fsubp st(1), st(0) ; st(0) = argument - argument'
; = trunc(argument)
Lexit:
else
fld st(0) ; st(0) = argument,
; st(1) = argument
fabs ; st(0) = |argument|,
; st(1) = argument
fld st(0) ; st(0) = |argument|,
; st(1) = |argument|,
; st(2) = argument
frndint ; st(0) = rint(|argument|),
; st(1) = |argument|,
; st(2) = argument
fxch st(1) ; st(0) = |argument|,
; st(1) = rint(|argument|),
; st(2) = argument
fucomip st(0), st(1) ; eflags = |argument| ><=# rint(|argument|),
; st(0) = rint(|argument|),
; st(1) = argument
fldz ; st(0) = 0.0,
; st(1) = rint(|argument|),
; st(2) = argument
fld1 ; st(0) = 1.0,
; st(1) = 0.0,
; st(2) = rint(|argument|),
; st(3) = argument
fcmovnb st(0), st(1) ; st(0) = (rint(|argument|) <= |argument|) ? 0.0 : 1.0,
; st(1) = 0.0,
; st(2) = rint(|argument|),
; st(3) = argument
fsubp st(2), st(0) ; st(0) = 0.0,
; st(1) = trunc(|argument|),
; st(2) = argument
fucomip st(0), st(2) ; eflags = 0.0 ><=# argument,
; st(0) = trunc(|argument|),
; st(1) = argument
fst st(1) ; st(0) = trunc(|argument|),
; st(1) = trunc(|argument|)
fchs ; st(0) = -trunc(|argument|),
; st(1) = trunc(|argument|)
fcmovbe st(0), st(1) ; st(0) = (argument >= 0.0) ? trunc(|argument|) : -trunc(|argument|)
; = trunc(argument),
; st(1) = trunc(|argument|)
fstp st(1) ; st(0) = trunc(argument)
endif
ret
trunc endp
end
ceil() Function
ceil() returns the smallest integral value not less than its argument.
# Copyright © 2004-2024, Stefan Kanthak <stefan.kanthak@nexgo.de>
# NOTE: ceil() returns -0.0 for argument in (-1.0, -0.0]
.arch generic64
.code64
.equiv BIAS, 1023
.intel_syntax noprefix
.text
# xmm0 = argument
ceil:
movq rax, xmm0 # rax = argument
mov rcx, rax # rcx = argument
add rcx, rcx
jz .Lexit # argument = ±0.0?
shr rcx, 53 # rcx = biased exponent of |argument|
sub ecx, BIAS # rcx = unbiased exponent of |argument|
jl .Lsmall # |argument| < 1.0?
cmp ecx, BIAS
jg .Lmxcsr # argument = ±INFINITY?
# argument = INDEFINITE?
sub ecx, 52
jge .Lexit # |argument| >= 0x1.0p+52?
neg ecx # ecx = number of bits in fractional part of mantissa
mov rdx, rax
shr rax, cl
shl rax, cl # rax = trunc(argument)
xor rdx, rax # rdx = fractional part of mantissa
movq xmm0, rax # xmm0 = trunc(argument)
neg rdx # CF = (fractional part of mantissa <> 0)
sbb ecx, ecx # ecx = (fractional part of mantissa <> 0) ? -1 : 0
shr ecx, 22 # ecx = (fractional part of mantissa <> 0) ? 0x3FF : 0
cqo # rdx = (trunc(argument) < 0.0) ? -1 : 0
not edx # edx = (trunc(argument) < 0.0) ? 0 : -1
and edx, ecx
shl rdx, 52 # rdx = (trunc(argument) < 0.0)
# | (fractional part of mantissa = 0)
# ? 0 : 0x3FF0000000000000
movq xmm1, rdx # xmm1 = (trunc(argument) < 0.0)
# | (fractional part of mantissa = 0)
# ? 0.0 : 1.0
addsd xmm0, xmm1 # xmm0 = ceil(argument)
ret
.Lsmall:
test rax, rax
jns .Lpositive
.Lnegative:
cqo # rdx = (argument & -0.0) ? -1 : 0
shl rdx, 63 # rdx = (argument & -0.0) ? 0x8000000000000000 : 0
movq xmm0, rdx # xmm0 = (argument & -0.0) ? -0.0 : 0.0
ret
.Lpositive:
mov rax, 0x3FF0000000000000
movq xmm0, rax # xmm0 = 0x1.0p+0
# = 1.0
ret
.Lmxcsr:
addsd xmm0, xmm0
.Lexit:
ret
.size ceil, .-ceil
.type ceil, @function
.global ceil
.end
; Copyright © 2004-2024, Stefan Kanthak <stefan.kanthak@nexgo.de>
; https://msdn.microsoft.com/en-us/library/atdhw2dx.aspx
; NOTE: ceil() returns -0.0 for argument in (-1.0, -0.0]
.code
double record sign:1, exponent:11, mantissa:52
bias equ 1 shl (width exponent - 1) - 1
ceil proc public ; xmm0 = argument
movd rax, xmm0 ; rax = argument
add rax, rax
jz Lexit ; argument = ±0.0?
shr rax, 1 + width mantissa ; rax = biased exponent of |argument|
cmp eax, bias + width mantissa
jae Lexit ; |argument| >= 0x1.0p+52?
; (argument = integer?)
; argument = INDEFINITE?
cvtsd2si rax, xmm0 ; rax = llrint(argument)
cvtsi2sd xmm1, rax ; xmm1 = rint(argument)
comisd xmm1, xmm0 ; CF = (rint(argument) < argument)
adc rax, 0 ; rax = llrint(argument)
; + (rint(argument) < argument)
; = ceil(argument)
cvtsi2sd xmm2, rax ; xmm2 = ceil(argument)
xorpd xmm1, xmm1 ; xmm1 = 0.0
subsd xmm1, xmm0 ; xmm1 = -argument
andpd xmm1, xmm0 ; xmm1 = |argument|
xorpd xmm0, xmm1 ; xmm0 = (argument & -0.0) ? -0.0 : +0.0
orpd xmm0, xmm2 ; xmm0 = ceil(argument)
Lexit:
ret
ceil endp
end
Note: returns a signaling NaN unchanged!
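The conversion-based listing above follows a convert, compare, adjust and restore-the-sign pattern. A rough C paraphrase is sketched below under one stated difference: a C cast always truncates toward zero, whereas cvtsd2si rounds in the current mode, so the adjustment step is written for truncation. The name ceil_sketch() is hypothetical.

```c
/* Sketch only: ceil_sketch() is a hypothetical name. */
double ceil_sketch(double x)
{
    double t;

    if (!(x > -0x1.0p+52 && x < 0x1.0p+52) || x == 0.0)
        return x;                   /* integral, infinite, NaN or ±0.0 */
    t = (double)(long long)x;       /* truncation toward zero, exact in this range */
    if (t < x)
        t += 1.0;                   /* truncation went down: step up by one */
    if (t == 0.0 && x < 0.0)
        t = -0.0;                   /* arguments in (-1.0, -0.0) give -0.0 */
    return t;
}
```

Only positive non-integral arguments need the increment; for negative ones truncation already moves toward the ceiling.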
copysign() Function
copysign() returns its first argument with the sign of its second argument.
// Copyright © 2004-2024, Stefan Kanthak <stefan.kanthak@nexgo.de>
double fabs(double x);
int signbit(double x);
double copysign(double to, double from)
{
#if 0
return signbit(from) ? -fabs(to) : fabs(to);
#else
return signbit(from) == signbit(to) ? to : -to;
#endif
}
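At the bit level, copysign() amounts to taking bits 0 to 62 from the first argument and bit 63 from the second, which is what the assembly listings do with shifts and rotates. The following sketch expresses this with masks; copysign_bits() is a hypothetical name, and the code assumes the 64-bit IEEE 754 double layout.

```c
#include <stdint.h>
#include <string.h>

/* Sketch only: copysign_bits() is a hypothetical name. */
double copysign_bits(double to, double from)
{
    const uint64_t sign = UINT64_C(1) << 63;   /* mask for the sign bit */
    uint64_t t, f;

    memcpy(&t, &to, sizeof t);
    memcpy(&f, &from, sizeof f);
    t = (t & ~sign) | (f & sign);              /* magnitude of to, sign of from */
    memcpy(&to, &t, sizeof to);
    return to;
}
```

Because no floating-point arithmetic is involved, the sketch also works for zeros, infinities and NaNs and raises no floating-point exceptions.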
# Copyright © 2004-2024, Stefan Kanthak <stefan.kanthak@nexgo.de>
.arch generic64
.code64
.intel_syntax noprefix
.text
# xmm0 = to
# xmm1 = from
copysign:
movq rcx, xmm0 # rcx = to
movq rdx, xmm1 # rdx = from
add rdx, rdx # CF = (from & -0.0)
adc rcx, rcx
ror rcx, 1
movq xmm0, rcx # xmm0 = copysign(to, from)
ret
.size copysign, .-copysign
.type copysign, @function
.global copysign
.end
; Copyright © 2004-2024, Stefan Kanthak <stefan.kanthak@nexgo.de>
; https://msdn.microsoft.com/en-us/library/0yafk1hc.aspx
.686
.model flat, C
.code
copysign proc public ; [esp+12] = from
; [esp+4] = to
mov eax, [esp+16] ; eax = high dword of from
if 0
mov edx, [esp+8] ; edx = high dword of to
add eax, eax ; CF = (from & -0.0)
adc edx, edx
ror edx, 1
mov [esp+8], edx
else
shld [esp+8], eax, 1
ror dword ptr [esp+8], 1
endif
fld real8 ptr [esp+4] ; st(0) = (from & -0.0) ? -|to| : |to|
ret
copysign endp
end
floor() Function
floor() returns the largest integral value not greater than its argument.
# Copyright © 2004-2024, Stefan Kanthak <stefan.kanthak@nexgo.de>
# NOTE: floor() preserves -0.0
.arch generic64
.code64
.equiv BIAS, 1023
.intel_syntax noprefix
.text
# xmm0 = argument
floor:
movq rax, xmm0 # rax = argument
mov rcx, rax # rcx = argument
add rcx, rcx
jz .Lexit # argument = ±0.0?
shr rcx, 53 # rcx = biased exponent of |argument|
sub ecx, BIAS # rcx = unbiased exponent of |argument|
jl .Lsmall # |argument| < 1.0?
cmp ecx, BIAS
jg .Lmxcsr # argument = ±INFINITY?
# argument = INDEFINITE?
sub ecx, 52
jge .Lexit # |argument| >= 0x1.0p+52?
neg ecx # ecx = number of bits in fractional part of mantissa
mov rdx, rax
shr rax, cl
shl rax, cl # rax = trunc(argument)
xor rdx, rax # rdx = fractional part of mantissa
movq xmm0, rax # xmm0 = trunc(argument)
neg rdx # CF = (fractional part of mantissa <> 0)
sbb ecx, ecx # ecx = (fractional part of mantissa <> 0) ? -1 : 0
shr ecx, 22 # ecx = (fractional part of mantissa <> 0) ? 0x3FF : 0
cqo # rdx = (trunc(argument) < 0.0) ? -1 : 0
and edx, ecx
shl rdx, 52 # rdx = (trunc(argument) < 0.0)
# & (fractional part of mantissa <> 0)
# ? 0x3FF0000000000000 : 0
movq xmm1, rdx # xmm1 = (trunc(argument) < 0.0)
# & (fractional part of mantissa <> 0)
# ? 1.0 : 0.0
subsd xmm0, xmm1 # xmm0 = floor(argument)
ret
.Lsmall:
test rax, rax
js .Lnegative
.Lpositive:
xorpd xmm0, xmm0 # xmm0 = 0.0
ret
.Lnegative:
mov rax, 0xBFF0000000000000
movq xmm0, rax # xmm0 = -0x1.0p+0
# = -1.0
ret
.Lmxcsr:
addsd xmm0, xmm0
.Lexit:
ret
.size floor, .-floor
.type floor, @function
.global floor
.end
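The integer-only listing above truncates by clearing the fraction bits of the mantissa and then subtracts 1.0 when the argument is negative and had a non-zero fraction. The steps can be sketched in C as follows; floor_bits() is a hypothetical name, and unlike the listing this sketch returns NaNs unchanged instead of passing them through an addition.

```c
#include <stdint.h>
#include <string.h>

/* Sketch only: floor_bits() is a hypothetical name. */
double floor_bits(double x)
{
    uint64_t b, frac;
    double t;
    int e;

    memcpy(&b, &x, sizeof b);
    e = (int)((b >> 52) & 2047) - 1023;    /* unbiased exponent */
    if (e >= 52)
        return x;                          /* integral, infinite or NaN */
    if (e < 0) {                           /* |x| < 1.0 */
        if ((b << 1) == 0)
            return x;                      /* ±0.0 is preserved */
        return (b >> 63) ? -1.0 : 0.0;
    }
    frac = b & (UINT64_MAX >> 12 >> e);    /* the 52 - e fraction bits */
    b ^= frac;                             /* clear them: trunc(x) */
    memcpy(&t, &b, sizeof t);
    if (frac != 0 && (b >> 63))
        t -= 1.0;                          /* negative with fraction: one step down */
    return t;
}
```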
; Copyright © 2004-2024, Stefan Kanthak <stefan.kanthak@nexgo.de>
; https://msdn.microsoft.com/en-us/library/x39715t6.aspx
; NOTE: floor() preserves -0.0
.code
double record sign:1, exponent:11, mantissa:52
bias equ 1 shl (width exponent - 1) - 1
floor proc public ; xmm0 = argument
movd rax, xmm0 ; rax = argument
add rax, rax
jz Lexit ; argument = ±0.0?
shr rax, 1 + width mantissa ; rax = biased exponent of |argument|
cmp eax, bias + width mantissa
jae Lexit ; |argument| >= 0x1.0p+52?
; (argument = integer?)
; argument = INDEFINITE?
cvtsd2si rax, xmm0 ; rax = llrint(argument)
cvtsi2sd xmm1, rax ; xmm1 = rint(argument)
comisd xmm0, xmm1 ; CF = (rint(argument) > argument)
sbb rax, 0 ; rax = llrint(argument)
; - (rint(argument) > argument)
; = floor(argument)
cvtsi2sd xmm2, rax ; xmm2 = floor(argument)
xorpd xmm1, xmm1 ; xmm1 = 0.0
subsd xmm1, xmm0 ; xmm1 = -argument
andpd xmm1, xmm0 ; xmm1 = |argument|
xorpd xmm0, xmm1 ; xmm0 = (argument & -0.0) ? -0.0 : +0.0
orpd xmm0, xmm2 ; xmm0 = floor(argument)
Lexit:
ret
floor endp
end
Note: returns a signaling NaN unchanged!
frexp() Function
frexp() returns the (normalized) fraction and the (integral) exponent of its
first argument.
// Copyright © 2004-2024, Stefan Kanthak <stefan.kanthak@nexgo.de>
double frexp(double argument, int *exponent)
{
unsigned long long sign, ull;
if (argument == 0.0)
*exponent = 0;
else {
ull = *(unsigned long long *) &argument;
*exponent = ull >> 52;
*exponent &= 2047;
if (*exponent > 0) {
ull &= ~(2047ULL << 52);
ull |= 1022ULL << 52;
} else {
sign = ull & (1ULL << 63);
*exponent = 1; // denormals are scaled like biased exponent 1
do {
*exponent -= 1;
ull += ull;
} while (ull < (1ULL << 52));
ull ^= 1023ULL << 52;
ull |= sign;
}
*exponent -= 1022;
argument = *(double *) &ull;
}
return argument;
}
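The contract of frexp() is that a finite non-zero argument equals fraction * 2^exponent with 0.5 <= |fraction| < 1.0. The sketch below illustrates that contract with a different technique than the listing above: instead of normalizing denormals with a shift loop, it pre-scales them by 2^64 and corrects the exponent afterwards. The name frexp_sketch() is hypothetical, and the exponent stored for infinities and NaNs (0) is an arbitrary choice made here.

```c
#include <stdint.h>
#include <string.h>

/* Sketch only: frexp_sketch() is a hypothetical name. */
double frexp_sketch(double x, int *e)
{
    uint64_t b;
    int be;

    memcpy(&b, &x, sizeof b);
    be = (int)((b >> 52) & 2047);          /* biased exponent */
    if (be == 0) {
        if (x == 0.0) {                    /* ±0.0: fraction ±0.0, exponent 0 */
            *e = 0;
            return x;
        }
        x = frexp_sketch(x * 0x1.0p+64, e); /* denormal: scale into normal range */
        *e -= 64;
        return x;
    }
    if (be == 2047) {                      /* infinity or NaN: returned as is */
        *e = 0;
        return x;
    }
    *e = be - 1022;                        /* exponent for a fraction in [0.5, 1.0) */
    b &= ~(UINT64_C(2047) << 52);          /* replace the exponent field ... */
    b |= UINT64_C(1022) << 52;             /* ... with that of 0.5 */
    memcpy(&x, &b, sizeof x);
    return x;
}
```

For example, 8.0 decomposes into 0.5 * 2^4, and the smallest denormal 0x1.0p-1074 into 0.5 * 2^-1073.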
# Copyright © 2004-2024, Stefan Kanthak <stefan.kanthak@nexgo.de>
.arch generic64
.code64
.equiv BIAS, 1023
.intel_syntax noprefix
.text
# xmm0 = argument
# rdi = address of exponent
frexp:
movq rax, xmm0 # rax = argument
lea rcx, [rax+rax] # rcx = argument << 1
# = |argument| << 1
shr rcx, 1 # rcx = |argument|
mov [rdi], ecx
jz .Lexit # argument = ±0.0?
shr rcx, 52 # rcx = biased exponent
jz .Ldenormal # biased exponent = 0?
# (argument denormal?)
sub ecx, BIAS - 1 # ecx = unbiased exponent + 1
cmp ecx, BIAS + 2
mov [rdi], ecx
je .Lexit # biased exponent = 2047?
# (argument = INDEFINITE?)
# (argument = ±INFINITY?)
.Lnormal:
rol rax, 1
shl rax, 11
or rax, BIAS - 1
ror rax, 12 # rax = fractional part of argument
movq xmm0, rax # xmm0 = fractional part of argument
.Lexit:
ret
.Ldenormal:
xor edx, edx
add rax, rax # rax = argument << 1
# = |argument| << 1
adc edx, edx # rdx = (argument & -0.0) ? 1 : 0
shl edx, 11
or edx, BIAS - 1
bsr rcx, rax # rcx = index of most significant '1' bit in |argument| << 1
xor ecx, 63 # ecx = number of leading '0' bits in |argument| << 1
# = 11 - biased exponent
shl rax, cl # rax = normalized significand of argument << 11
add rax, rax # rax = fractional part of argument << 12
or rax, rdx
ror rax, 12 # rax = fractional part of argument
movq xmm0, rax # xmm0 = fractional part of argument
neg ecx # ecx = biased exponent - 11
sub ecx, BIAS - 12 # ecx = unbiased exponent + 1
mov [rdi], ecx
ret
.size frexp, .-frexp
.type frexp, @function
.global frexp
.end
ldexp() Function
ldexp() returns its first argument multiplied by 2 raised to the power of
its (integral) second argument.
# Copyright © 2004-2024, Stefan Kanthak <stefan.kanthak@nexgo.de>
# NOTE: for denormal argument and negative exponent or denormal
# result, ldexp() rounds to nearest with ties to even!
.arch generic64
.code64
.equiv BIAS, 1023
.equiv JCCLESS, 1
.intel_syntax noprefix
.text
# xmm0 = argument
# edi = exponent
ldexp:
test edi, edi
jz .Lexit # exponent = 0?
movq rsi, xmm0 # rsi = argument
lea rax, [rsi+rsi] # rax = argument << 1
# = |argument| << 1
shr rax, 1 # rax = |argument|
jz .Lexit # argument = ±0.0?
mov rdx, rax # rdx = |argument|
shr rax, 52 # rax = biased exponent
jz .Ldenormal # biased exponent = 0?
# (argument denormal?)
cmp eax, BIAS * 2 + 1
je .Lexit # biased exponent = 2047?
# (argument = INDEFINITE?)
# (argument = ±INFINITY?)
.Lnormal:
add eax, edi # eax = new biased exponent
jle .Lotherflow # new biased exponent < 1?
# (possible exponent underflow?)
cmp eax, BIAS * 2
jg .Loverflow # new biased exponent > 2046?
# (exponent overflow?)
shl rdi, 52
add rdx, rdi # rdx = |argument| * 2.0**exponent
.Lcopysign:
.if 0
shld rdx, rsi, 1
ror rdx, 1 # rdx = argument * 2.0**exponent
.elseif 0
add rdx, rdx
add rsi, rsi # CF = (argument & -0.0)
rcr rdx, 1 # rdx = argument * 2.0**exponent
.else
add rsi, rsi # CF = (argument & -0.0)
adc rdx, rdx
ror rdx, 1 # rdx = argument * 2.0**exponent
.endif
movq xmm0, rdx # xmm0 = argument * 2.0**exponent
.Lexit:
ret
.Lunderflow:
xor rdx, rdx # rdx = 0.0
jmp .Lcopysign
.Loverflow:
mov rdx, 0x7FF0000000000000 # rdx = 0x1.0p+1024
# = INFINITY
jmp .Lcopysign
.Lotherflow:
cmp eax, -52
jl .Lunderflow # new (biased) exponent + 1 < -52?
# (exponent underflow, even with mantissa rounded up?)
dec eax # eax = new biased exponent
neg eax # eax = 0 - new biased exponent
mov ecx, eax # ecx = 0 - new biased exponent
# = shift count
mov rax, 0x000FFFFFFFFFFFFF
and rdx, rax # rdx = mantissa
inc rax # rax = 0x0010000000000000
# = explicit integer bit
or rdx, rax # rdx = 1.mantissa
# = significand
xor eax, eax
.Lcontinue:
shrd rax, rdx, cl # rax = excess part of significand
shr rdx, cl # rdx = significand >> -(new biased exponent)
# = |argument| * 2.0**exponent
.ifnotdef JCCLESS
add rax, rax # rax = excess part of significand << 1,
# CF = (excess part of significand >= 0x8000000000000000),
# ZF = (excess part of significand = 0x8000000000000000)
jnc .Lcopysign # excess part of significand < 0x8000000000000000?
jnz .Lround # excess part of significand > 0x8000000000000000?
.Ltie:
bt edx, 0 # CF = (significand odd) ? 1 : 0
.Lround:
adc rdx, 0 # rdx = significand rounded to nearest even
.else
xor ecx, ecx
add rax, rax # rax = excess part of significand << 1,
# CF = (excess part of significand >= 0x8000000000000000),
# ZF = (excess part of significand = 0x8000000000000000)
adc ecx, ecx # ecx = (excess part of significand < 0x8000000000000000) ? 0 : 1
neg rax # CF = (excess part of significand <> 0x8000000000000000)
sbb eax, eax # eax = (excess part of significand = 0x8000000000000000) ? 0 : -1
or eax, edx
and eax, ecx # rax = (excess part of significand > 0x8000000000000000)
# | (excess part of significand = 0x8000000000000000)
# & (significand odd) ? 1 : 0
add rdx, rax # rdx = significand rounded to nearest even
.endif # JCCLESS
jmp .Lcopysign
.Ldenormal:
bsr rcx, rdx # rcx = index of most significant '1' bit in |argument|
test edi, edi
js .Lnegative # exponent < 0?
xor ecx, 63 # ecx = number of leading '0' bits in |argument|
sub ecx, 12 # ecx = number of leading '0' bits in mantissa
cmp ecx, edi
jb .Lnormalize # exponent > number of leading '0' bits in mantissa?
mov ecx, edi
shl rdx, cl # rdx = mantissa << exponent
# = |argument| << exponent
# = |argument| * 2.0**exponent
jmp .Lcopysign
.Lnegative:
.if 0
add ecx, edi # ecx = index of most significant '1' bit in mantissa
# + exponent
# = new index of most significant '1' bit in mantissa
inc ecx # ecx = new index of most significant '1' bit in significand
.else
stc
adc ecx, edi # ecx = new index of most significant '1' bit in significand
.endif
js .Lunderflow # mantissa underflow, even with mantissa rounded up?
neg edi
mov ecx, edi # ecx = -exponent
# shrd rax, rdx, cl
# shr rdx, cl # rdx = mantissa >> exponent
# = |argument| >> exponent
# = |argument| * 2.0**exponent
jmp .Lcontinue
.Lnormalize:
inc ecx # ecx = number of leading '0' bits in mantissa + 1
# = number of leading '0' bits in significand
sub edi, ecx # edi = exponent
# - number of leading '0' bits in significand
# = new (biased) exponent
cmp edi, BIAS * 2
jge .Loverflow # new biased exponent > 2046?
# (exponent overflow?)
shl rdi, 52
shl rdx, cl # rdx = significand << number of leading '0' bits in significand
# = |argument| << number of leading '0' bits in significand
add rdx, rdi # rdx = |argument| * 2.0**exponent
jmp .Lcopysign
ret
.size ldexp, .-ldexp
.type ldexp, @function
.global ldexp
.end
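The listing above manipulates the exponent field directly and rounds denormal results itself. A different and much shorter approach, sketched here as an illustration and not as the author's method, builds the power of two as a double and lets the multiplier do the rounding; out-of-range exponents are reduced in at most two steps first. The name ldexp_sketch() is hypothetical, and the 2^-969 step size is chosen on the assumption that it keeps intermediate products normal so that only the final multiplication rounds.

```c
#include <stdint.h>
#include <string.h>

/* Sketch only: ldexp_sketch() is a hypothetical name. */
double ldexp_sketch(double x, int n)
{
    uint64_t b;
    double scale;

    if (n > 1023) {                        /* 2^n not representable: scale in steps */
        x *= 0x1.0p+1023;
        n -= 1023;
        if (n > 1023) {
            x *= 0x1.0p+1023;
            n -= 1023;
            if (n > 1023)
                n = 1023;                  /* result has overflowed anyway */
        }
    } else if (n < -1022) {
        x *= 0x1.0p-969;                   /* 2^-1022 * 2^53 */
        n += 969;
        if (n < -1022) {
            x *= 0x1.0p-969;
            n += 969;
            if (n < -1022)
                n = -1022;                 /* result has underflowed anyway */
        }
    }
    b = (uint64_t)(1023 + n) << 52;        /* encoding of 2^n, n in [-1022, 1023] */
    memcpy(&scale, &b, sizeof scale);
    return x * scale;                      /* rounds in the current rounding mode */
}
```

With this approach the rounding of denormal results follows the current rounding mode, whereas the listing above always rounds to nearest with ties to even.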
modf() Function
modf() returns the fractional and integral parts of its argument.
// Copyright © 2004-2024, Stefan Kanthak <stefan.kanthak@nexgo.de>
#define INFINITY (1.0 / 0.5e-323)
double fabs(double x);
double trunc(double x);
double modf(double argument, double *integer)
{
*integer = trunc(argument);
return fabs(argument) == INFINITY ? 1.0 / argument : argument - *integer;
}
# Copyright © 2004-2024, Stefan Kanthak <stefan.kanthak@nexgo.de>
.arch generic64
.code64
.equiv BIAS, 1023
.intel_syntax noprefix
.text
# xmm0 = argument
# rdi = address of integer part
modf:
movq rax, xmm0 # rax = argument
shr rax, 52 # rax = sign and biased exponent
mov ecx, BIAS * 2 + 1
and ecx, eax # rcx = biased exponent
sub eax, ecx
shl rax, 52 # rax = sign of argument
mov [rdi], rax # *integer = ±0.0
sub ecx, BIAS # rcx = biased exponent - 1023
# = unbiased exponent
js .Lexit # unbiased exponent < 0?
# (no integer part?)
cmp ecx, 52
jge .Linteger # no fractional part?
mov rdx, 0x000FFFFFFFFFFFFF
shr rdx, cl # rdx = mask for fractional part of mantissa
movq rcx, xmm0 # rcx = argument
test rcx, rdx
jz .Linteger # fractional part of mantissa = 0?
# (argument is integer?)
.Lfraction:
not rdx # rdx = mask for sign, biased exponent and integer part of mantissa
and rdx, rcx # rdx = sign, biased exponent and integer part of mantissa
# = integer part of argument
mov [rdi], rdx
movq xmm1, rdx # xmm1 = integer part of argument
subsd xmm0, xmm1 # xmm0 = argument - integer part of argument
# = fractional part of argument
ret
.Linteger:
movq [rdi], xmm0 # *integer = argument
movq xmm0, rax # xmm0 = ±0.0
# = fractional part of argument
.Lexit:
ret
.size modf, .-modf
.type modf, @function
.global modf
.end
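The assembly listing above splits the argument by masking: the unbiased exponent tells how many mantissa bits are fraction bits, clearing them yields the integral part, and a subtraction yields the fractional part. The following C sketch walks through the same cases; modf_bits() is a hypothetical name, and the explicit NaN check is an addition made here.

```c
#include <stdint.h>
#include <string.h>

/* Sketch only: modf_bits() is a hypothetical name. */
double modf_bits(double x, double *ip)
{
    uint64_t b, sign, mask;
    int e;

    memcpy(&b, &x, sizeof b);
    sign = b & (UINT64_C(1) << 63);
    e = (int)((b >> 52) & 2047) - 1023;    /* unbiased exponent */
    if (e < 0) {                           /* |x| < 1.0: no integral part */
        memcpy(ip, &sign, sizeof *ip);     /* *ip = ±0.0 */
        return x;
    }
    mask = (e < 52) ? UINT64_MAX >> 12 >> e : 0;  /* fraction bits of mantissa */
    if ((b & mask) == 0) {                 /* integral, infinite or NaN */
        *ip = x;
        if (x != x)
            return x;                      /* NaN stays NaN */
        memcpy(&x, &sign, sizeof x);       /* fractional part = ±0.0 */
        return x;
    }
    b &= ~mask;                            /* drop the fraction bits */
    memcpy(ip, &b, sizeof *ip);
    return x - *ip;                        /* exact */
}
```

Both results carry the sign of the argument, so -4.0 splits into -4.0 and -0.0.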
nextafter() Function
nextafter() returns the next representable double-precision floating-point
number from its first argument in direction of its second
argument.
// Copyright © 2004-2024, Stefan Kanthak <stefan.kanthak@nexgo.de>
double nextafter(double from, double to)
{
if (from == to)
return to;
if (to != to)
return to;
if (from != from)
return from;
if (from == 0.0)
return to < 0.0 ? -0x1.0p-1074 : 0x1.0p-1074;
#if 0
if ((from < to) && (from < 0.0)
|| (from > to) && (from > 0.0))
#elif 0
if ((from > to) == (from > 0.0))
#else
if ((from < to) == (from < 0.0))
#endif
--*(unsigned long long *) &from; // from -= 1 ULP
else
++*(unsigned long long *) &from; // from += 1 ULP
return from;
}
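The increment and decrement of the bit pattern in the C listing above work because, for doubles of the same sign, the encodings are ordered like the magnitudes: adding 1 to the 64-bit pattern moves one ULP away from zero, subtracting 1 moves one ULP towards zero, and the carry from mantissa into exponent does the right thing at binade boundaries. A one-directional sketch of this idea follows; next_up() is a hypothetical name for a step towards positive infinity.

```c
#include <stdint.h>
#include <string.h>

/* Sketch only: next_up() is a hypothetical name. */
double next_up(double x)
{
    uint64_t b;

    memcpy(&b, &x, sizeof b);
    if ((b << 1) > (UINT64_C(0x7FF) << 53))
        return x;                          /* NaN: nothing to step */
    if (b == (UINT64_C(0x7FF) << 52))
        return x;                          /* +infinity: already the maximum */
    if ((b << 1) == 0)
        b = 1;                             /* ±0.0: smallest positive denormal */
    else if (b >> 63)
        b -= 1;                            /* negative: one ULP towards zero */
    else
        b += 1;                            /* positive: one ULP away from zero */
    memcpy(&x, &b, sizeof x);
    return x;
}
```

Stepping up from the largest finite double yields the encoding of infinity, again without any special case.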
# Copyright © 2004-2024, Stefan Kanthak <stefan.kanthak@nexgo.de>
.arch generic64
.code64
.intel_syntax noprefix
.text
# xmm0 = from
# xmm1 = to
nextafter:
ucomisd xmm1, xmm0 # CF = (from > to)
je .Lspecial # from = to?
# from = INDEFINITE?
# to = INDEFINITE?
.Lnotequal:
sbb rdx, rdx # rdx = (from > to) ? -1 : 0
movq rcx, xmm0 # rcx = from
mov rax, rcx
add rax, rax # CF = (from & -0.0)
jz .Lzero # from = ±0.0?
.Lnext:
sbb rax, rax # rax = (from < 0.0) ? -1 : 0
xor rax, rdx # rax = (from < 0.0) ^ (from > to) ? -1 : 0
# = (from < 0.0) & (from < to)
# | (from > 0.0) & (from > to) ? -1 : 0
or rax, 1 # rax = (from < 0.0) ^ (from > to) ? -1 : 1
# = (from < 0.0) = (from < to) ? -1 : 1
# = (from > 0.0) = (from > to) ? -1 : 1
add rax, rcx
movq xmm0, rax # xmm0 = from ± 1 ULP
.ifdef COMPLIANT
xorpd xmm1, xmm1
addsd xmm1, xmm0
.endif
ret
.Lzero:
movmskpd eax, xmm1 # rax = (to & -0.0) ? 0b?1 : 0b?0
or eax, 2 # rax = (to & -0.0) ? 0b11 : 0b10
ror rax, 1 # rax = (to & -0.0) ? 0x8000000000000001 : 1
movq xmm0, rax # xmm0 = (to & -0.0) ? -0x1.0p-1074 : 0x1.0p-1074
.ifdef COMPLIANT
xorpd xmm1, xmm1
addsd xmm1, xmm0
.endif
ret
.Lspecial:
jp .Lindefinite # to = INDEFINITE?
# from = INDEFINITE?
.Lequal:
movsd xmm0, xmm1 # xmm0 = to
ret
.Lindefinite:
addsd xmm0, xmm1 # xmm0 = INDEFINITE
ret
.size nextafter, .-nextafter
.type nextafter, @function
.global nextafter
.end
; Copyright © 2004-2024, Stefan Kanthak <stefan.kanthak@nexgo.de>
; https://msdn.microsoft.com/en-us/library/h0dff77w.aspx
.code
double record sign:1, exponent:11, mantissa:52
bias equ 1 shl (width exponent - 1) - 1
nextafter proc public ; xmm0 = from
; xmm1 = to
xorpd xmm2, xmm2 ; xmm2 = 0.0
ucomisd xmm1, xmm2 ; CF = (to < 0.0)
jp Lto ; to = INDEFINITE?
sbb rax, rax ; rax = (to < 0.0) ? -1 : 0
ucomisd xmm0, xmm1 ; CF = (from < to)
;; jp Lfrom ; from = INDEFINITE?
;; je Lto ; from = to?
je Lspecial ; from = to?
; from = INDEFINITE?
Lnotequal:
sbb rcx, rcx ; rcx = (from < to) ? -1 : 0
ucomisd xmm0, xmm2 ; CF = (from < 0.0)
jz Lzero ; from = ±0.0?
Lnext:
movd rdx, xmm0 ; rdx = from
sbb rax, rax ; rax = (from < 0.0) ? -1 : 0
xor rax, rcx ; rax = (from < 0.0) = (from < to) ? 0 : -1
or rax, 1 ; rax = (from < 0.0) = (from < to) ? 1 : -1
sub rdx, rax
movd xmm0, rdx ; xmm0 = from ± 1 ULP
ifdef MXCSR
addsd xmm2, xmm0
endif
ret
Lzero:
shl rax, 63 ; rax = (to < 0.0) ? 0x8000000000000000 : 0
or rax, 1 ; rax = (to < 0.0) ? 0x8000000000000001 : 1
movd xmm0, rax ; xmm0 = (to < 0.0) ? -0x1.0p-1074 : 0x1.0p-1074
ifdef MXCSR
addsd xmm2, xmm0
endif
ret
Lspecial:
jp Lfrom ; from = INDEFINITE?
Lto:
movsd xmm0, xmm1 ; xmm0 = to
Lfrom:
ret
nextafter endp
end
Note: returns signaling NaNs unchanged!
# Copyright © 2004-2024, Stefan Kanthak <stefan.kanthak@nexgo.de>
.arch i686
.code32
.intel_syntax noprefix
.text
# [esp+12] = to
# [esp+4] = from
nextafter:
fld real8 ptr [esp+4] # st(0) = from
fld real8 ptr [esp+12] # st(0) = to,
# st(1) = from
fucomi st(0), st(1)
je .Lspecial # from = to?
# from = INDEFINITE?
# to = INDEFINITE?
sbb edx, edx # edx = (to < from) ? -1 : 0
fstp st(0) # st(0) = from
fldz # st(0) = 0.0,
# st(1) = from
fucomip st(0), st(1) # st(0) = from
jz .Lzero # from = ±0.0?
sbb eax, eax # eax = (from > 0.0) ? -1 : 0
xor eax, edx # eax = (from > 0.0) ^ (from > to) ? -1 : 0
# = (from > 0.0) & (from < to)
# | (from < 0.0) & (from > to) ? -1 : 0
or eax, 1 # eax = (from > 0.0) ^ (from > to) ? -1 : 1
# = (from > 0.0) = (from > to) ? 1 : -1
# = (from < 0.0) = (from < to) ? 1 : -1
cdq # edx:eax = (from < 0.0) = (from > to) ? -1 : 1
sub [esp+4], eax
sbb [esp+8], edx # from = from
# - (from < 0.0) = (from > to) ? -1 : 1
# = from'
fld real8 ptr [esp+4] # st(0) = from',
# st(1) = from
fstp st(1) # st(0) = nextafter(from, to)
ret
.Lzero:
and dword ptr [esp+16], 0x80000000
mov dword ptr [esp+12], 0x1 # to = (to & -0.0) ? 0x8000000000000001 : 1
fld real8 ptr [esp+12] # st(0) = (to & -0.0) ? -0x1.0p-1074 : 0x1.0p-1074
# st(1) = from
.Lequal:
fstp st(1) # st(0) = nextafter(from, to)
ret
.Lspecial:
jnp .Lequal # from = to?
.Lindefinite:
faddp st(1), st(0) # st(0) = from + to
# = INDEFINITE
ret
.size nextafter, .-nextafter
.type nextafter, @function
.global nextafter
.end
rint() Function
rint() returns the integral value nearest to its argument, rounded
according to the current rounding mode.
; Copyright © 2004-2024, Stefan Kanthak <stefan.kanthak@nexgo.de>
; https://msdn.microsoft.com/en-us/library/dn465165.aspx
; NOTE: rint() preserves -0.0, and returns -0.0 for argument in
; [-0.5, -0.0] (rounding to nearest) or (-1.0, -0.0] (rounding
; up or towards zero)
.code
double record sign:1, exponent:11, mantissa:52
bias equ 1 shl (width exponent - 1) - 1
rint proc public ; xmm0 = argument
movd rax, xmm0 ; rax = argument
add rax, rax
jz Lexit ; argument = ±0.0?
shr rax, 1 + width mantissa ; rax = biased exponent of |argument|
cmp eax, bias + width mantissa
jae Lexit ; |argument| >= 0x1.0p+52?
; (argument = integer?)
; argument = INDEFINITE?
cvtsd2si rax, xmm0 ; rax = llrint(argument)
cvtsi2sd xmm1, rax ; xmm1 = rint(argument)
xorpd xmm2, xmm2 ; xmm2 = 0.0
subsd xmm2, xmm0 ; xmm2 = -argument
andpd xmm2, xmm0 ; xmm2 = |argument|
xorpd xmm0, xmm2 ; xmm0 = (argument & -0.0) ? -0.0 : +0.0
orpd xmm0, xmm1 ; xmm0 = rint(argument)
Lexit:
ret
rint endp
end
Note: returns a signaling NaN unchanged!
round() Function
round() returns the nearest integral value to its argument, rounding ties
away from 0.
# Copyright © 2004-2024, Stefan Kanthak <stefan.kanthak@nexgo.de>
# NOTE: round() returns -0.0 for argument in (-0.5, -0.0]
.arch generic64
.code64
.equiv BIAS, 1023
.intel_syntax noprefix
.text
# xmm0 = argument
round:
movq rax, xmm0 # rax = argument
mov rcx, rax # rcx = argument
add rcx, rcx
jz .Lexit # argument = ±0.0?
shr rcx, 53 # rcx = biased exponent of |argument|
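sub ecx, BIAS - 1 # rcx = 1 + unbiased exponent of |argument|
jl .Lzero # |argument| < 0.5?
jg .Lwhole # |argument| >= 1.0?
shr rax, 63
shl rax, 63 # rax = (argument & -0.0) ? 0x8000000000000000 : 0
mov rdx, 0x3FF0000000000000
or rax, rdx # rax = (argument & -0.0) ? -1.0 : 1.0
movq xmm0, rax # xmm0 = round(argument)
ret
.Lwhole:
cmp ecx, BIAS + 1
jg .Lmxcsr # argument = ±INFINITY?
# argument = INDEFINITE?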
sub ecx, 53
jge .Lexit # |argument| >= 0x1.0p+52?
neg ecx # ecx = number of bits in fractional part of mantissa
shr rax, cl # CF = (fraction >= 0.5)
sbb edx, edx # edx = (fraction >= 0.5) ? -1 : 0
shl rax, cl # rax = trunc(argument)
movq xmm1, rax # xmm1 = trunc(argument)
movmskpd eax, xmm0 # rax = (argument & -0.0) ? 0b?1 : 0b?0
shr edx, 22 # edx = (fraction >= 0.5) ? 0x3FF : 0
shl eax, 11
or eax, edx
shl rax, 52 # rax = (fraction >= 0.5) ? 0x3FF0000000000000 : 0
# | (argument & -0.0) ? 0x8000000000000000 : 0
movq xmm0, rax # xmm0 = {-1.0, 0.0, 1.0}
addsd xmm0, xmm1 # xmm0 = round(argument)
ret
.Lzero:
cqo # rdx = (argument & -0.0) ? -1 : 0
shl rdx, 63 # rdx = (argument & -0.0) ? 0x8000000000000000 : 0
movq xmm0, rdx # xmm0 = (argument & -0.0) ? -0.0 : 0.0
ret
.Lmxcsr:
addsd xmm0, xmm0
.Lexit:
ret
.size round, .-round
.type round, @function
.global round
.end
trunc() Function
trunc() returns the integral value nearest to, but not larger in
magnitude than, its argument.
# Copyright © 2004-2024, Stefan Kanthak <stefan.kanthak@nexgo.de>
# NOTE: trunc() returns -0.0 for argument in (-1.0, -0.0]
.arch generic64
.code64
.equiv BIAS, 1023
.intel_syntax noprefix
.text
# xmm0 = argument
trunc:
movq rax, xmm0 # rax = argument
mov rcx, rax # rcx = argument
add rcx, rcx
jz .Lexit # argument = ±0.0?
shr rcx, 53 # rcx = biased exponent of |argument|
sub ecx, BIAS # rcx = unbiased exponent of |argument|
jl .Lzero # |argument| < 1.0?
cmp ecx, BIAS
jg .Lmxcsr # argument = ±INFINITY?
# argument = INDEFINITE?
sub ecx, 52
jge .Lexit # |argument| >= 0x1.0p+52?
neg ecx # ecx = number of bits in fractional part of mantissa
shr rax, cl
shl rax, cl # rax = trunc(argument)
movq xmm0, rax # xmm0 = trunc(argument)
ret
.Lzero:
cqo # rdx = (argument & -0.0) ? -1 : 0
shl rdx, 63 # rdx = (argument & -0.0) ? 0x8000000000000000 : 0
movq xmm0, rdx # xmm0 = (argument & -0.0) ? -0.0 : 0.0
ret
.Lmxcsr:
addsd xmm0, xmm0
.Lexit:
ret
.size trunc, .-trunc
.type trunc, @function
.global trunc
.end
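The shift pair in the listing above (shr followed by shl by the number of fraction bits) clears exactly those mantissa bits that encode the fraction. The same step can be sketched in C with a mask; trunc_bits() is a hypothetical name, and unlike the listing this sketch returns NaNs unchanged instead of passing them through an addition.

```c
#include <stdint.h>
#include <string.h>

/* Sketch only: trunc_bits() is a hypothetical name. */
double trunc_bits(double x)
{
    uint64_t b;
    int e;

    memcpy(&b, &x, sizeof b);
    e = (int)((b >> 52) & 2047) - 1023;    /* unbiased exponent */
    if (e < 0)
        b &= UINT64_C(1) << 63;            /* |x| < 1.0: keep only the sign, ±0.0 */
    else if (e < 52)
        b &= ~(UINT64_MAX >> 12 >> e);     /* clear the 52 - e fraction bits */
    /* else: already integral, infinite or NaN */
    memcpy(&x, &b, sizeof x);
    return x;
}
```

Since only bits are cleared, the sign is kept automatically and arguments in (-1.0, -0.0] yield -0.0, as the note in the listing states.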
; Copyright © 2004-2024, Stefan Kanthak <stefan.kanthak@nexgo.de>
; https://msdn.microsoft.com/en-us/library/mt720727.aspx
; NOTE: trunc() returns -0.0 for argument in (-1.0, -0.0]
.code
double record sign:1, exponent:11, mantissa:52
bias equ 1 shl (width exponent - 1) - 1
trunc proc public ; xmm0 = argument
movd rax, xmm0 ; rax = argument
add rax, rax
jz Lexit ; argument = ±0.0?
shr rax, 1 + width mantissa ; rax = biased exponent of |argument|
cmp eax, bias + width mantissa
jae Lexit ; |argument| >= 0x1.0p+52?
; (argument = integer?)
; argument = INDEFINITE?
cvttsd2si rax, xmm0 ; rax = trunc(argument)
cvtsi2sd xmm1, rax ; xmm1 = trunc(argument), but +0.0 for argument in (-1.0, -0.0)
xorpd xmm2, xmm2 ; xmm2 = 0.0
subsd xmm2, xmm0 ; xmm2 = -argument
xorpd xmm2, xmm0 ; xmm2 = -0.0 (sign mask)
andpd xmm0, xmm2 ; xmm0 = (argument & -0.0) ? -0.0 : +0.0
orpd xmm0, xmm1 ; xmm0 = trunc(argument) with sign of argument
Lexit:
ret
trunc endp
end
Note: returns a signaling NaN unchanged!
Ole Møller, Quasi Double-Precision in Floating Point Addition, BIT Numerical Mathematics, Volume 5(1):37-50, March 1965, ISSN 0006-3835, 1572-9125.
Theodorus J. Dekker, A Floating-Point Technique for Extending the Available Precision, Numerische Mathematik, Volume 18(3):224-242, June 1971, ISSN 0029-599X, 0945-3245.
Pat H. Sterbenz, Floating-Point Computation, Prentice-Hall, 1974, ISBN 0-13-322495-3.
William J. Cody and William Waite, Software Manual for the Elementary Functions, Prentice-Hall, 1980, ISBN 0-13-822064-6.
Seppo I. Linnainmaa, Software for Doubled-Precision Floating-Point Computations, ACM Transactions on Mathematical Software, Volume 7(3):272-283, September 1981, ISSN 0098-3500, 1557-7295.
Mary H. Payne and Robert N. Hanek, Radian Reduction for Trigonometric Functions, ACM SIGNUM Newsletter, Volume 18(1):19-24, January 1983, ISSN 0163-5778.
Mary H. Payne and Robert N. Hanek, Degree Reduction for Trigonometric Functions, ACM SIGNUM Newsletter, Volume 18(2):18-19, April 1983, ISSN 0163-5778.
Cleve B. Moler and Donald Morrison, Replacing Square Roots by Pythagorean Sums, IBM Journal of Research and Development, Volume 27(6):577-581, November 1983, ISSN 0018-8646.
Augustin A. Dubrulle, A Class of Numerical Methods for the Computation of Pythagorean Sums, IBM Journal of Research and Development, Volume 27(6):582-589, November 1983, ISSN 0018-8646.
Sylvie Boldo and Guillaume Melquiond, Emulation of a FMA and Correctly Rounded Sums: Proved Algorithms Using Rounding to Odd, IEEE Transactions on Computers, Volume 57(4):462-471, April 2008, ISSN 0018-9340, 1557-9956.
Nelson H. F. Beebe, The Mathematical-Function Computation Handbook, Springer, 2017, ISBN 978-3-319-64109-6, 978-3-319-87725-9, 978-3-319-64110-2.
Use the X.509 certificate to send S/MIME encrypted mail.
Note: email in weird format and without a proper sender name is likely to be discarded!
I dislike HTML (and even weirder formats too) in email; I prefer to receive plain text.
I also expect to see your full (real) name as sender, not your nickname.
I abhor top posts and expect inline quotes in replies.
as is without any warranty, neither express nor implied.
cookies in the web browser.
The web service is operated and provided by Telekom Deutschland GmbH.
The web service provider stores a session cookie in the web browser and
records every visit of this web site with the following data in an
access log on their server(s):