?? background
字號(hào):
/*************************************************************************** ************************************************************************** Spherical Harmonic Transform Kit 2.7 Contact: Peter Kostelec geelong@cs.dartmouth.edu Copyright 1997-2003 Sean Moore, Dennis Healy, Dan Rockmore, Peter Kostelec Copyright 2004 Peter Kostelec, Dan Rockmore SpharmonicKit is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version. SpharmonicKit is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA. Commercial use is absolutely prohibited. See the accompanying LICENSE file for details. ************************************************************************ ************************************************************************/BACKGROUND----------First we state that all algorithms expect the number of samplesand coefficients to be a power of 2, and that sampling is doneat the Chebyshev nodes.The NAIVE algorithm is just the projection of the functiononto the Legendre functions, sampled at the Chebyshevnodes (zeroes of T_2n).The SEMINAIVE algorithm is an O(N^2) algorithm that performsprojections in frequency space rather than time space, i.e.,it projects the function to be transformed onto cosine series representations of the associated Legendre functions. This isa fast algorithm in practice, provided that the cosine seriesrepresentations of the Legendre functions have been precomputed.Note: in performing a forward spherical transform via theseminaive algorithm, the total size of the cosine seriesrepresentations (the "precomputed data") grows rather quicklyas the bandwidth of the problem increases: bw = 64 -> about 0.34 megabytes of precomputed data bw = 128 -> about 2.69 megabytes of precomputed data bw = 256 -> about 21.5 megabytes of precomputed data bw = 512 -> about 171 megabytes of precomputed data bw = 1024 -> about 1367 megabytes of precomputed dataDepending on memory available and disk storage capabilities,you may want to compute the cosine series as needed (``on the fly")or read the precomputed data off disk.In its original form, the basic DRISCOLL-HEALY (DH) algorithmcomputes a spherical transform of a function with harmonicsof at most order N in O(N^2 (log N)^2) time. This is an asymptoticresult, exact in exact arithmetic. The key component of thisalgorithm is a fast Legendre transform which, assuming aprecomputed data structure of size O(N log N), can be performedin O(N (log N)^2) operations. The fast Legendre transform algorithmworks as follows. The original problem of size N is reduced (bysmoothing and subsampling) to two smaller problems, each of sizeN/2. Then those two smaller problems are themselves smoothedand subsampled. So now there are four subproblems, each ofsize N/4. One continues to divide-and-conquer "all the way down."In terms of implementation, however, one would not divide-and-conquer "all the way down." Overhead costs would render theprogram computationally inefficient. This is one reason why morecomputer-friendly variations of the fast Legendre transformalgorithm were developed.Another reason for developing variations of the above Legendretransform was the unstable nature of the original algorithm.An integral part of the algorithm is its use of shifted Legendrepolynomials. As the order m of the transform increases, thesepolynomials become numerically suspect. Therefore, one needsto take care on how they are used. One can do this with the HYBRIDALGORITHM. Our experience indicates that the HYBRID ALGORITHMis the fastest of the variant algorithms.One can make the HYBRID LEGENDRE TRANSFORM algorithm numericallystable by varying a number of settings, but which settingis optimal (in terms of runtime and accuracy) depends onthe platform the code is run on. The settings provided herecan be thought of as starting points. You must tailor themto your computer.Note: in performing a forward spherical transform via theHYBRID ALGORITHM, the total size of the precomputed datagrows quickly, but not as quickly as the seminaive transform'sneeds. The exact growth is difficult to predict since itdepends on how much (and how) the hybrid algorithm is usedin a spherical transform. But to at least have some figuresto compare, here are the sizes of precomputed data as wehave used the hybrid algorithm: bw = 64 -> about 0.32 megabytes of precomputed data bw = 128 -> about 2.31 megabytes of precomputed data bw = 256 -> about 17.6 megabytes of precomputed data bw = 512 -> about 136 megabytes of precomputed data bw = 1024 -> about 869 megabytes of precomputed dataNote that at bw = 1024 the hybrid-based spherical transform usessignificantly less precomputed data than the seminaive sphericaltransform. But this is for us - your mileage may vary !!! As wesaid above, the hybrid algorithm's performance appears to berather architecture dependent. For example, let bw = 1024, andorder m = 0. The hybrid Legendre transform was faster than theseminaive Legendre transform on both a DEC Alpha and HP Exemplar.However, at order m = 512 (same bandwidth) the hybrid algorithmwas faster than the seminaive algorithm on the HP, but the reversewas true on the DEC! So again, performance depends on how the hybridalgorithm is used. But in terms of precomputed data size, theseminaive sizes may be taken as a sort of "maximum" for thehybrid algorithm (if the hybrid algorithm was used in the "worst"possible way). One final set of numbers: on the DEC and HP,the seminaive forward spherical transform and hybrid forwardspherical transform were roughly competitive at bw = 512. Onthe HP, at bw = 1024 we found the hybrid algorithm to be roughly30% faster, in terms of cpu and walltime. The code was nottested at bw = 1024 on the DEC because of hardware limitations(i.e. not enough available disk space).
?? 快捷鍵說(shuō)明
復(fù)制代碼
Ctrl + C
搜索代碼
Ctrl + F
全屏模式
F11
切換主題
Ctrl + Shift + D
顯示快捷鍵
?
增大字號(hào)
Ctrl + =
減小字號(hào)
Ctrl + -