e_pow.s

The glibc library: useful not only for learning how to use library functions but also for studying how those functions are actually implemented. Good material for building skill.
.file "pow.s"

// Copyright (c) 2000 - 2005, Intel Corporation
// All rights reserved.
//
// Contributed 2000 by the Intel Numerics Group, Intel Corporation
//
// Redistribution and use in source and binary forms, with or without
// modification, are permitted provided that the following conditions are
// met:
//
// * Redistributions of source code must retain the above copyright
// notice, this list of conditions and the following disclaimer.
//
// * Redistributions in binary form must reproduce the above copyright
// notice, this list of conditions and the following disclaimer in the
// documentation and/or other materials provided with the distribution.
//
// * The name of Intel Corporation may not be used to endorse or promote
// products derived from this software without specific prior written
// permission.

// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
// "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
// LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
// A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL INTEL OR ITS
// CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
// EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
// PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
// PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY
// OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY OR TORT (INCLUDING
// NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
// SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
//
// Intel Corporation is the author of this code, and requests that all
// problem reports or change requests be submitted to it directly at
// http://www.intel.com/software/products/opensource/libraries/num.htm.
//
// History
//==============================================================
// 02/02/00 Initial version
// 02/03/00 Added p12 to definite over/under path. With odd power we did not
//          maintain the sign of x in this path.
// 04/04/00 Unwind support added
// 04/19/00 pow(+-1,inf) now returns NaN
//          pow(+-val, +-inf) returns 0 or inf, but now does not call error
//          support
//          Added s1 to fcvt.fx because invalid flag was incorrectly set.
// 08/15/00 Bundle added after call to __libm_error_support to properly
//          set [the previously overwritten] GR_Parameter_RESULT.
// 09/07/00 Improved performance by eliminating bank conflicts and other stalls,
//          and tweaking the critical path
// 09/08/00 Per c99, pow(+-1,inf) now returns 1, and pow(+1,nan) returns 1
// 09/28/00 Updated NaN**0 path
// 01/20/01 Fixed denormal flag settings.
// 02/13/01 Improved speed.
// 03/19/01 Reordered exp polynomial to improve speed and eliminate monotonicity
//          problem in round up, down, and to zero modes.  Also corrected
//          overflow result when x negative, y odd in round up, down, zero.
// 06/14/01 Added brace missing from bundle
// 12/10/01 Corrected case where x negative, 2^52 <= |y| < 2^53, y odd integer.
// 12/20/01 Fixed monotonicity problem in round to nearest.
// 02/08/02 Fixed overflow/underflow cases that were not calling error support.
// 05/20/02 Cleaned up namespace and sf0 syntax
// 08/29/02 Improved Itanium 2 performance
// 09/21/02 Added branch for |y*log(x)|<2^-11 to fix monotonicity problems.
// 02/10/03 Reordered header: .section, .global, .proc, .align
// 03/31/05 Reformatted delimiters between data tables
//
// API
//==============================================================
// double pow(double x, double y)
//
// Overview of operation
//==============================================================
//
// Three steps...
// 1. Log(x)
// 2. y Log(x)
// 3. exp(y log(x))
//
// This means we work with the absolute value of x and merge in the sign later.
//      Log(x) = G + delta + r -rsq/2 + p
// G,delta depend on the exponent of x and table entries. The table entries are
// indexed by the exponent of x, called K.
//
// The G and delta come out of the reduction; r is the reduced x.
//
// B = frcpa(x)
// xB-1 is small means that B is the approximate inverse of x.
//
//      Log(x) = Log( (1/B)(Bx) )
//             = Log(1/B) + Log(Bx)
//             = Log(1/B) + Log( 1 + (Bx-1))
//
//      x  = 2^K 1.x_1x_2.....x_52
//      B= frcpa(x) = 2^-K Cm
//      Log(1/B) = Log(1/(2^-K Cm))
//      Log(1/B) = Log((2^K/ Cm))
//      Log(1/B) = K Log(2) + Log(1/Cm)
//
//      Log(x)   = K Log(2) + Log(1/Cm) + Log( 1 + (Bx-1))
//
// If you take the significand of x, set the exponent to true 0, then Cm is
// the frcpa. We tabulate the Log(1/Cm) values. There are 256 of them.
// The frcpa table is indexed by 8 bits, the x_1 thru x_8.
// m = x_1x_2...x_8 is an 8-bit index.
//
//      Log(1/Cm) = log(1/frcpa(1+m/256)) where m goes from 0 to 255.
//
// We tabulate as two doubles, T and t, where T +t is the value itself.
//
//      Log(x)   = (K Log(2)_hi + T) + (K Log(2)_lo + t) + Log( 1 + (Bx-1))
//      Log(x)   =  G                + delta              + Log( 1 + (Bx-1))
//
// The Log( 1 + (Bx-1)) can be calculated as a series in r = Bx-1.
//
//      Log( 1 + (Bx-1)) = r - rsq/2 + p
//
// Then,
//
//      yLog(x) = yG + y delta + y(r-rsq/2) + yp
//      yLog(x) = Z1 + e3      + Z2         + Z3 + (e1 + e2)
//
//
//     exp(yLog(x)) = exp(Z1 + Z2 + Z3) exp(e1 + e2 + e3)
//
//
//       exp(Z3) is another series.
//       exp(e1 + e2 + e3) is approximated as f3 = 1 + (e1 + e2 + e3)
//
//       Z1 (128/log2) = number of log2/128 in Z1 is N1
//       Z2 (128/log2) = number of log2/128 in Z2 is N2
//
//       s1 = Z1 - N1 log2/128
//       s2 = Z2 - N2 log2/128
//
//       s = s1 + s2
//       N = N1 + N2
//
//       exp(Z1 + Z2) = exp(Z)
//       exp(Z)       = exp(s) exp(N log2/128)
//
//       exp(r)       = exp(Z - N log2/128)
//
//      r = s + d = (Z - N (log2/128)_hi) -N (log2/128)_lo
//                =  Z - N (log2/128)
//
//      Z         = s+d +N (log2/128)
//
//      exp(Z)    = exp(s) (1+d) exp(N log2/128)
//
//      N = M 128 + n
//
//      N log2/128 = M log2 + n log2/128
//
//      n is 7 binary digits = n_7n_6...n_1
//
//      n log2/128 = n_7n_6n_5 16 log2/128 + n_4n_3n_2n_1 log2/128
//      n log2/128 = n_7n_6n_5 log2/8 + n_4n_3n_2n_1 log2/128
//      n log2/128 = I2 log2/8 + I1 log2/128
//
//      N log2/128 = M log2 + I2 log2/8 + I1 log2/128
//
//      exp(Z)    = exp(s) (1+d) exp(log(2^M) + log(2^I2/8) + log(2^I1/128))
//      exp(Z)    = exp(s) (1+d1) (1+d2)(2^M) 2^I2/8 2^I1/128
//      exp(Z)    = exp(s) f1 f2 (2^M) 2^I2/8 2^I1/128
//
// I1, I2 are table indices. Use a series for exp(s).
// Then get exp(Z)
//
//     exp(yLog(x)) = exp(Z1 + Z2 + Z3) exp(e1 + e2 + e3)
//     exp(yLog(x)) = exp(Z) exp(Z3) f3
//     exp(yLog(x)) = exp(Z)f3 exp(Z3)
//     exp(yLog(x)) = A exp(Z3)
//
// We actually calculate exp(Z3) -1.
// Then,
//     exp(yLog(x)) = A + A( exp(Z3)   -1)
//
// Table Generation
//==============================================================
// The log values
// ==============
// The operation (K*log2_hi) must be exact. K is the true exponent of x.
// If we allow gradual underflow (denormals), K can be represented in 12 bits
// (as a two's complement number). We assume 13 bits as an engineering
// precaution.
//
//           +------------+----------------+-+
//           |  13 bits   | 50 bits        | |
//           +------------+----------------+-+
//           0            1                66
//                        2                34
//
// So we want the lsb(log2_hi) to be 2^-50
// We get log2 as a quad-extended (15-bit exponent, 128-bit significand)
//
//      0 fffe b17217f7d1cf79ab c9e3b39803f2f6af (4...)
//
// Consider numbering the bits left to right, starting at 0 thru 127.
// Bit 0 is the 2^-1 bit; bit 49 is the 2^-50 bit.
//
//  ...79ab
//     0111 1001 1010 1011
//     44
//     89
//
// So if we shift off the rightmost 14 bits, then (shift back only
// the top half) we get
//
//      0 fffe b17217f7d1cf4000 e6af278ece600fcb dabc000000000000
//
// Put the right 64-bit significand in an FR register, convert to double;
// it is exact. Put the next 128 bits into a quad register and round to double.
// The true exponent of the low part is -51.
//
// hi is 0 fffe b17217f7d1cf4000
// lo is 0 ffcc e6af278ece601000
//
// Convert to double memory format and get
//
// hi is 0x3fe62e42fefa39e8
// lo is 0x3cccd5e4f1d9cc02
//
// log2_hi + log2_lo is an accurate value for log2.
//
//
// The T and t values
// ==================
// A similar method is used to generate the T and t values.
//
// K * log2_hi + T  must be exact.
//
// Smallest T,t
// ----------
// The smallest T,t is
//       T                   t
// 0x3f60040155d58800, 0x3c93bce0ce3ddd81  log(1/frcpa(1+0/256))=  +1.95503e-003
//
// The exponent is 0x3f6 (biased)  or -9 (true).
// For the smallest T value, what we want is to clip the significand such that
// when it is shifted right by 9, its lsb is in the bit for 2^-51. The 9 is
// specific to the first entry. In general, it is 0xffff - (biased 15-bit
// exponent).
// Independently, what we have calculated is the table value as a quad
// precision number.
// Table entry 1 is
// 0 fff6 80200aaeac44ef38 338f77605fdf8000
//
// We store this quad precision number in a data structure that is
//    sign:            1
//    exponent:       15
//    significand_hi: 64 (includes explicit bit)
//    significand_lo: 49
// Because the explicit bit is included, the significand is 113 bits.
//
// Consider significand_hi for table entry 1.
//
//
// +-+--- ... -------+--------------------+
// | |
// +-+--- ... -------+--------------------+
// 0 1               4444444455555555556666
//                   2345678901234567890123
//
// Labeled as above, bit 0 is 2^0, bit 1 is 2^-1, etc.
// Bit 42 is 2^-42. If we shift to the right by 9, the bit in
// bit 42 goes in 51.
//
// So what we want to do is shift bits 43 thru 63 into significand_lo.
// This is shifting bit 42 into bit 63, taking care to retain shifted-off bits.
// Then shifting (just with significand_hi) back into bit 42.
//
// The shift_value is 63-42 = 21. In general, this is
//      63 - (51 -(0xffff - 0xfff6))
// For this example, it is
//      63 - (51 - 9) = 63 - 42  = 21
//
// This means we are shifting 21 bits into significand_lo. We must maintain more
// than a 128-bit significand not to lose bits. So before the shift we put the
// 128-bit significand into a 256-bit significand and then shift.
// The 256-bit significand has four parts: hh, hl, lh, and ll.
//
// Start off with
//      hh         hl         lh         ll
//      <64>       <49><15_0> <64_0>     <64_0>
//
// After shift by 21 (then return for significand_hi),
//      <43><21_0> <21><43>   <6><58_0>  <64_0>
//
// Take the hh part and convert to a double. There is no rounding here.
// The conversion is exact. The true exponent of the high part is the same as
// the true exponent of the input quad.
//
// We have some 64 plus significand bits for the low part. In this example, we
// have 70 bits. We want to round this to a double. Put them in a quad and then
// do a quad fnorm.
// For this example the true exponent of the low part is
//      true_exponent_of_high - 43 = true_exponent_of_high - (64-21)
// In general, this is
//      true_exponent_of_high - (64 - shift_value)
//
//
// Largest T,t
// ----------
// The largest T,t is
// 0x3fe62643fecf9742, 0x3c9e3147684bd37d  log(1/frcpa(1+255/256))=+6.92171e-001
//
// Table entry 256 is
// 0 fffe b1321ff67cba178c 51da12f4df5a0000
//
// The shift value is
//      63 - (51 -(0xffff - 0xfffe)) = 13
//
// The true exponent of the low part is
//      true_exponent_of_high - (64 - shift_value)
//      -1 - (64-13) = -52
// Biased as a double, this is 0x3cb
//
//
//
// So then lsb(T) must be >= 2^-51
// msb(Klog2_hi) <= 2^12
//
//              +--------+---------+
//              |       51 bits    | <== largest T
//              +--------+---------+
//              | 9 bits | 42 bits | <== smallest T
// +------------+----------------+-+
// |  13 bits   | 50 bits        | |
// +------------+----------------+-+
// Special Cases
//==============================================================
//                                   double     float
// overflow                          error 24   30
// underflow                         error 25   31
// X zero  Y zero
//  +0     +0                 +1     error 26   32
//  -0     +0                 +1     error 26   32
//  +0     -0                 +1     error 26   32
//  -0     -0                 +1     error 26   32
// X zero  Y negative
//  +0     -odd integer       +inf   error 27   33  divide-by-zero
//  -0     -odd integer       -inf   error 27   33  divide-by-zero
//  +0     !-odd integer      +inf   error 27   33  divide-by-zero
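
The log2_hi / log2_lo split described under "Table Generation" can be checked numerically. The sketch below is in Python rather than IA-64 assembly, purely so that it can be run anywhere; it is not part of the glibc source. The helper names (`bits_to_double`, `lsb_exponent`, `significand_bits`) are hypothetical. The three hex constants are copied from the comments above. The sketch checks that the lowest set bit of log2_hi is 2^-50, that hi + lo agrees with the quoted quad-precision log2 to well beyond double precision, and that K * log2_hi fits in a 64-bit significand for every 13-bit K.

```python
import struct
from fractions import Fraction


def bits_to_double(bits: int) -> float:
    """Reinterpret a 64-bit integer as an IEEE-754 double."""
    return struct.unpack(">d", struct.pack(">Q", bits))[0]


# Constants quoted in the "Table Generation" comments.
LOG2_HI = bits_to_double(0x3FE62E42FEFA39E8)
LOG2_LO = bits_to_double(0x3CCCD5E4F1D9CC02)
# Quad-extended log2: true exponent -1, 128-bit significand whose
# leading bit has weight 2^-1, so the value is significand * 2^-128.
LOG2_QUAD = Fraction(0xB17217F7D1CF79ABC9E3B39803F2F6AF, 2**128)


def lsb_exponent(x: float) -> int:
    """Return e such that the lowest set bit of nonzero x has weight 2**e."""
    num, den = x.as_integer_ratio()  # exact; den is a power of two
    num = abs(num)
    trailing_zeros = (num & -num).bit_length() - 1
    return trailing_zeros - (den.bit_length() - 1)


def significand_bits(v: Fraction) -> int:
    """Bits from the most to the least significant set bit of a dyadic
    rational with denominator > 1 (so the numerator is odd)."""
    return abs(v.numerator).bit_length()


if __name__ == "__main__":
    print("lsb(log2_hi) = 2^%d" % lsb_exponent(LOG2_HI))
    err = abs(Fraction(LOG2_HI) + Fraction(LOG2_LO) - LOG2_QUAD)
    print("hi + lo error below 2^-103:", err < Fraction(1, 2**103))
    widest = max(
        significand_bits(k * Fraction(LOG2_HI))
        for k in range(-4096, 4096)
        if k != 0
    )
    print("widest K*log2_hi significand:", widest, "bits")
```

Exact rational arithmetic (`fractions.Fraction`) is used because a Python float carries only 53 significand bits, whereas the comments rely on the 64-bit significand of an Itanium floating-point register for K * log2_hi to be exact.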
File size: 20450 K
Uploaded by: hanyuangu
Category: Linux/Unix programming