?? s_tanl.s

?? glibc 庫, 不僅可以學習使用庫函數,還可以學習函數的具體實現,是提高功力的好資料
?? S
?? 第 1 頁 / 共 5 頁
字號:
12 3 4 5 下一頁
.file "tancotl.s"// Copyright (c) 2000 - 2004, Intel Corporation// All rights reserved.//// Contributed 2000 by the Intel Numerics Group, Intel Corporation//// Redistribution and use in source and binary forms, with or without// modification, are permitted provided that the following conditions are// met://// * Redistributions of source code must retain the above copyright// notice, this list of conditions and the following disclaimer.//// * Redistributions in binary form must reproduce the above copyright// notice, this list of conditions and the following disclaimer in the// documentation and/or other materials provided with the distribution.//// * The name of Intel Corporation may not be used to endorse or promote// products derived from this software without specific prior written// permission.// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS // "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT // LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR// A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL INTEL OR ITS // CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,// EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, // PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR // PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY // OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY OR TORT (INCLUDING// NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS // SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. // // Intel Corporation is the author of this code, and requests that all// problem reports or change requests be submitted to it directly at // http://www.intel.com/software/products/opensource/libraries/num.htm.////*********************************************************************//// History: //// 02/02/00 (hand-optimized)// 04/04/00 Unwind support added// 12/28/00 Fixed false invalid flags// 02/06/02 Improved speed// 05/07/02 Changed interface to __libm_pi_by_2_reduce// 05/30/02 Added cotl// 02/10/03 Reordered header: .section, .global, .proc, .align;//          used data8 for long double table values// 05/15/03 Reformatted data tables// 10/26/04 Avoided using r14-31 as scratch so not clobbered by dynamic loader////*********************************************************************//// Functions:   tanl(x) = tangent(x), for double-extended precision x values//              cotl(x) = cotangent(x), for double-extended precision x values////*********************************************************************//// Resources Used:////    Floating-Point Registers: f8 (Input and Return Value)//                              f9-f15//                              f32-f121////    General Purpose Registers://      r32-r70////    Predicate Registers:      p6-p15////*********************************************************************//// IEEE Special Conditions for tanl:////    Denormal  fault raised on denormal inputs//    Overflow exceptions do not occur//    Underflow exceptions raised when appropriate for tan//    (No specialized error handling for this routine)//    Inexact raised when appropriate by algorithm////    tanl(SNaN) = QNaN//    tanl(QNaN) = QNaN//    tanl(inf) = QNaN//    tanl(+/-0) = +/-0////*********************************************************************//// IEEE Special Conditions for cotl:////    Denormal  fault raised on denormal inputs//    Overflow exceptions occur at zero and near zero//    Underflow exceptions do not occur//    Inexact raised when appropriate by algorithm////    cotl(SNaN) = QNaN//    cotl(QNaN) = QNaN//    cotl(inf) = QNaN//    cotl(+/-0) = +/-Inf and error handling is called////*********************************************************************////    Below are mathematical and algorithmic descriptions for tanl.//    For cotl we use next identity cot(x) = -tan(x + Pi/2).//    So, to compute cot(x) we just need to increment N (N = N + 1)//    and invert sign of the computed result.////*********************************************************************//// Mathematical Description//// We consider the computation of FPTANL of Arg. Now, given////      Arg = N pi/2  + alpha,          |alpha| <= pi/4,//// basic mathematical relationship shows that////      tan( Arg ) =  tan( alpha )     if N is even;//                 = -cot( alpha )      otherwise.//// The value of alpha is obtained by argument reduction and// represented by two working precision numbers r and c where////      alpha =  r  +  c     accurately.//// The reduction method is described in a previous write up.// The argument reduction scheme identifies 4 cases. For Cases 2// and 4, because |alpha| is small, tan(r+c) and -cot(r+c) can be// computed very easily by 2 or 3 terms of the Taylor series// expansion as follows://// Case 2:// -------////      tan(r + c) = r + c + r^3/3          ...accurately//     -cot(r + c) = -1/(r+c) + r/3          ...accurately//// Case 4:// -------////      tan(r + c) = r + c + r^3/3 + 2r^5/15     ...accurately//     -cot(r + c) = -1/(r+c) + r/3 + r^3/45     ...accurately////// The only cases left are Cases 1 and 3 of the argument reduction// procedure. These two cases will be merged since after the// argument is reduced in either cases, we have the reduced argument// represented as r + c and that the magnitude |r + c| is not small// enough to allow the usage of a very short approximation.//// The greatest challenge of this task is that the second terms of// the Taylor series for tan(r) and -cot(r)////      r + r^3/3 + 2 r^5/15 + ...//// and////      -1/r + r/3 + r^3/45 + ...//// are not very small when |r| is close to pi/4 and the rounding// errors will be a concern if simple polynomial accumulation is// used. When |r| < 2^(-2), however, the second terms will be small// enough (5 bits or so of right shift) that a normal Horner// recurrence suffices. Hence there are two cases that we consider// in the accurate computation of tan(r) and cot(r), |r| <= pi/4.//// Case small_r: |r| < 2^(-2)// --------------------------//// Since Arg = N pi/4 + r + c accurately, we have////      tan(Arg) =  tan(r+c)            for N even,//               = -cot(r+c)            otherwise.//// Here for this case, both tan(r) and -cot(r) can be approximated// by simple polynomials:////      tan(r) =    r + P1_1 r^3 + P1_2 r^5 + ... + P1_9 r^19//     -cot(r) = -1/r + Q1_1 r   + Q1_2 r^3 + ... + Q1_7 r^13//// accurately. Since |r| is relatively small, tan(r+c) and// -cot(r+c) can be accurately approximated by replacing r with// r+c only in the first two terms of the corresponding polynomials.//// Note that P1_1 (and Q1_1 for that matter) approximates 1/3 to// almost 64 sig. bits, thus////      P1_1 (r+c)^3 =  P1_1 r^3 + c * r^2     accurately.//// Hence,////      tan(r+c) =    r + P1_1 r^3 + P1_2 r^5 + ... + P1_9 r^19//                     + c*(1 + r^2)////        -cot(r+c) = -1/(r+c) + Q1_1 r   + Q1_2 r^3 + ... + Q1_7 r^13//               + Q1_1*c////// Case normal_r: 2^(-2) <= |r| <= pi/4// ------------------------------------//// This case is more likely than the previous one if one considers// r to be uniformly distributed in [-pi/4 pi/4].//// The required calculation is either////      tan(r + c)  =  tan(r)  +  correction,  or//     -cot(r + c)  = -cot(r)  +  correction.//// Specifically,////      tan(r + c) =  tan(r) + c tan'(r)  + O(c^2)//                 =  tan(r) + c sec^2(r) + O(c^2)//                 =  tan(r) + c SEC_sq     ...accurately//                as long as SEC_sq approximates sec^2(r)//                to, say, 5 bits or so.//// Similarly,////     -cot(r + c) = -cot(r) - c cot'(r)  + O(c^2)//                 = -cot(r) + c csc^2(r) + O(c^2)//                 = -cot(r) + c CSC_sq     ...accurately//                as long as CSC_sq approximates csc^2(r)//                to, say, 5 bits or so.//// We therefore concentrate on accurately calculating tan(r) and// cot(r) for a working-precision number r, |r| <= pi/4 to within// 0.1% or so.//// We will employ a table-driven approach. Let////      r = sgn_r * 2^k * 1.b_1 b_2 ... b_5 ... b_63//        = sgn_r * ( B + x )//// where////      B = 2^k * 1.b_1 b_2 ... b_5 1//      x = |r| - B//// Now,//                   tan(B)  +   tan(x)//      tan( B + x ) =  ------------------------//                   1 -  tan(B)*tan(x)////               /                         \//               |   tan(B)  +   tan(x)          |//      = tan(B) +  | ------------------------ - tan(B) |//               |     1 -  tan(B)*tan(x)          |//               \                         /////                 sec^2(B) * tan(x)//      = tan(B) + ------------------------//                 1 -  tan(B)*tan(x)////                (1/[sin(B)*cos(B)]) * tan(x)//      = tan(B) + --------------------------------//                      cot(B)  -  tan(x)////// Clearly, the values of tan(B), cot(B) and 1/(sin(B)*cos(B)) are// calculated beforehand and stored in a table. Since////      |x| <= 2^k * 2^(-6)  <= 2^(-7)  (because k = -1, -2)//// a very short polynomial will be sufficient to approximate tan(x)// accurately. The details involved in computing the last expression// will be given in the next section on algorithm description.////// Now, we turn to the case where cot( B + x ) is needed.//////                   1 - tan(B)*tan(x)//      cot( B + x ) =  ------------------------//                   tan(B)  +  tan(x)////               /                           \//               |   1 - tan(B)*tan(x)              |//      = cot(B) +  | ----------------------- - cot(B) |//               |     tan(B)  +  tan(x)            |//               \                           /////               [tan(B) + cot(B)] * tan(x)//      = cot(B) - ----------------------------//                   tan(B)  +  tan(x)////                (1/[sin(B)*cos(B)]) * tan(x)//      = cot(B) - --------------------------------//                      tan(B)  +  tan(x)////// Note that the values of tan(B), cot(B) and 1/(sin(B)*cos(B)) that// are needed are the same set of values needed in the previous// case.//// Finally, we can put all the ingredients together as follows:////      Arg = N * pi/2 +  r + c          ...accurately////      tan(Arg) =  tan(r) + correction    if N is even;//               = -cot(r) + correction    otherwise.//// For Cases 2 and 4,////     Case 2://     tan(Arg) =  tan(r + c) = r + c + r^3/3           N even//              = -cot(r + c) = -1/(r+c) + r/3           N odd//     Case 4://     tan(Arg) =  tan(r + c) = r + c + r^3/3 + 2r^5/15  N even//              = -cot(r + c) = -1/(r+c) + r/3 + r^3/45  N odd////// For Cases 1 and 3,////     Case small_r: |r| < 2^(-2)////      tan(Arg) =  r + P1_1 r^3 + P1_2 r^5 + ... + P1_9 r^19//                     + c*(1 + r^2)               N even////               = -1/(r+c) + Q1_1 r   + Q1_2 r^3 + ... + Q1_7 r^13//                     + Q1_1*c                    N odd////     Case normal_r: 2^(-2) <= |r| <= pi/4////      tan(Arg) =  tan(r) + c * sec^2(r)     N even//               = -cot(r) + c * csc^2(r)     otherwise////     For N even,////      tan(Arg) = tan(r) + c*sec^2(r)//               = tan( sgn_r * (B+x) ) + c * sec^2(|r|)//               = sgn_r * ( tan(B+x)  + sgn_r*c*sec^2(|r|) )//               = sgn_r * ( tan(B+x)  + sgn_r*c*sec^2(B) )//// since B approximates |r| to 2^(-6) in relative accuracy.////                 /            (1/[sin(B)*cos(B)]) * tan(x)//    tan(Arg) = sgn_r * | tan(B) + --------------------------------//                 \                     cot(B)  -  tan(x)//                                        \//                       + CORR  |//                                     /// where////    CORR = sgn_r*c*tan(B)*SC_inv(B);  SC_inv(B) = 1/(sin(B)*cos(B)).//// For N odd,////      tan(Arg) = -cot(r) + c*csc^2(r)//               = -cot( sgn_r * (B+x) ) + c * csc^2(|r|)//               = sgn_r * ( -cot(B+x)  + sgn_r*c*csc^2(|r|) )//               = sgn_r * ( -cot(B+x)  + sgn_r*c*csc^2(B) )//// since B approximates |r| to 2^(-6) in relative accuracy.////                 /            (1/[sin(B)*cos(B)]) * tan(x)//    tan(Arg) = sgn_r * | -cot(B) + --------------------------------//                 \                     tan(B)  +  tan(x)//                                        \//                       + CORR  |//                                     /// where////    CORR = sgn_r*c*cot(B)*SC_inv(B);  SC_inv(B) = 1/(sin(B)*cos(B)).////// The actual algorithm prescribes how all the mathematical formulas// are calculated.////// 2. Algorithmic Description// ==========================//// 2.1 Computation for Cases 2 and 4.// ----------------------------------//// For Case 2, we use two-term polynomials.////    For N even,////    rsq := r * r//    Poly := c + r * rsq * P1_1//    Result := r + Poly          ...in user-defined rounding////    For N odd,//    S_hi  := -frcpa(r)               ...8 bits//    S_hi  := S_hi + S_hi*(1 + S_hi*r)     ...16 bits//    S_hi  := S_hi + S_hi*(1 + S_hi*r)     ...32 bits//    S_hi  := S_hi + S_hi*(1 + S_hi*r)     ...64 bits//    S_lo  := S_hi*( (1 + S_hi*r) + S_hi*c )//    ...S_hi + S_lo is -1/(r+c) to extra precision//    S_lo  := S_lo + Q1_1*r////    Result := S_hi + S_lo     ...in user-defined rounding//// For Case 4, we use three-term polynomials////    For N even,////    rsq := r * r//    Poly := c + r * rsq * (P1_1 + rsq * P1_2)//    Result := r + Poly          ...in user-defined rounding////    For N odd,//    S_hi  := -frcpa(r)               ...8 bits//    S_hi  := S_hi + S_hi*(1 + S_hi*r)     ...16 bits//    S_hi  := S_hi + S_hi*(1 + S_hi*r)     ...32 bits//    S_hi  := S_hi + S_hi*(1 + S_hi*r)     ...64 bits//    S_lo  := S_hi*( (1 + S_hi*r) + S_hi*c )//    ...S_hi + S_lo is -1/(r+c) to extra precision//    rsq   := r * r//    P      := Q1_1 + rsq*Q1_2//    S_lo  := S_lo + r*P////    Result := S_hi + S_lo     ...in user-defined rounding////// Note that the coefficients P1_1, P1_2, Q1_1, and Q1_2 are// the same as those used in the small_r case of Cases 1 and 3// below.////// 2.2 Computation for Cases 1 and 3.// ----------------------------------// This is further divided into the case of small_r,// where |r| < 2^(-2), and the case of normal_r, where |r| lies between// 2^(-2) and pi/4.//// Algorithm for the case of small_r// ---------------------------------//// For N even,//      rsq   := r * r//      Poly1 := rsq*(P1_1 + rsq*(P1_2 + rsq*P1_3))//      r_to_the_8    := rsq * rsq//      r_to_the_8    := r_to_the_8 * r_to_the_8//      Poly2 := P1_4 + rsq*(P1_5 + rsq*(P1_6 + ... rsq*P1_9))//      CORR  := c * ( 1 + rsq )//      Poly  := Poly1 + r_to_the_8*Poly2//      Poly := r*Poly + CORR//      Result := r + Poly     ...in user-defined rounding//      ...note that Poly1 and r_to_the_8 can be computed in parallel//      ...with Poly2 (Poly1 is intentionally set to be much//      ...shorter than Poly2 so that r_to_the_8 and CORR can be hidden)//// For N odd,//      S_hi  := -frcpa(r)               ...8 bits//      S_hi  := S_hi + S_hi*(1 + S_hi*r)     ...16 bits//      S_hi  := S_hi + S_hi*(1 + S_hi*r)     ...32 bits//      S_hi  := S_hi + S_hi*(1 + S_hi*r)     ...64 bits//      S_lo  := S_hi*( (1 + S_hi*r) + S_hi*c )//      ...S_hi + S_lo is -1/(r+c) to extra precision//      S_lo  := S_lo + Q1_1*c////      ...S_hi and S_lo are computed in parallel with//      ...the following//      rsq := r*r//      P   := Q1_1 + rsq*(Q1_2 + rsq*(Q1_3 + ... + rsq*Q1_7))////      Poly :=  r*P + S_lo//      Result :=  S_hi  +  Poly      ...in user-defined rounding////// Algorithm for the case of normal_r// ----------------------------------//// Here, we first consider the computation of tan( r + c ). As// presented in the previous section,////      tan( r + c )  =  tan(r) + c * sec^2(r)//                 =  sgn_r * [ tan(B+x) + CORR ]//      CORR = sgn_r * c * tan(B) * 1/[sin(B)*cos(B)]//// because sec^2(r) = sec^(|r|), and B approximate |r| to 6.5 bits.////      tan( r + c ) =//           /           (1/[sin(B)*cos(B)]) * tan(x)//      sgn_r * | tan(B) + --------------------------------  +//           \                     cot(B)  -  tan(x)//                                \//                          CORR  |//                                ///// The values of tan(B), cot(B) and 1/(sin(B)*cos(B)) are// calculated beforehand and stored in a table. Specifically,// the table values are//
12 3 4 5 下一頁
?? 文件大小 20450 K
?? 上傳用戶 hanyuangu
?? 所屬分類 Linux/Unix編程
??? 相關標簽

#glibc #庫函數 #函數
?? 快捷鍵說明

復制代碼 Ctrl + C
搜索代碼 Ctrl + F
全屏模式 F11
切換主題 Ctrl + Shift + D
顯示快捷鍵 ?
增大字號 Ctrl + =
減小字號 Ctrl + -
亚洲欧美第一页_禁久久精品乱码_粉嫩av一区二区三区免费野_久草精品视频

?? s_tanl.s

?? 快捷鍵說明