README (05/13/00)
-----------------

This is the TriMedia optimized DCT algorithm.

API
---

The API takes an array of 64 shorts as input and returns 64 shorts
as output. The input and output are in normal order.

   void dct8x8fix(short * restrict stab, short * restrict sres)

This applies 8 row DCTs followed by 8 column DCTs.
The output is placed in sres.

Compiling with the option -DONEPASS (for example, by adding it in the
makefile) means that only the row DCTs are performed. This is useful
for testing.

How to compile DCT
------------------

   Try the following

      make clean
      make ENDIAN=el compile

   You will see the following warning, which can be safely ignored.

   dctfix.c:
   "dctfix.c", line 138: warning: Aliasing of restrict pointer sres by pointer res in function dct8x8fix

How to measure performance
--------------------------

   The DCT performance is measured by running it and reading the cycle
   count for the function from the output of tmprof -nomm. Note that
   cache stall cycles are not included.

   You should see something like:

   Treename          Executions    Total Cycles (%)    I$ Cycles  D$ Cycles
   ---------------   ----------   -----------------   ---------  ---------
   _dct8x8fix                 1      160     100.00           0          0
   -------------------------------------------------------------------------------
   total/average              1      160     100.00           0          0

How to test DCT
---------------

   The DCT is tested by comparing its output with a reference
   implementation (viz. fdct).

      make clean
      make

   You should see something like this:

      ... Input input.linear ...
      21 Differences
      ... Input input.rand ...
      22 Differences
      ... Input test2d.2 ...
      35 Differences
      ... Input blksample.1 ...
      13 Differences
      ... Input blksample.2 ...
      0 Differences
      ... Input blksample.3 ...
      5 Differences
      ... Input blksample.4 ...
      19 Differences
      ... Input blksample.5 ...
      15 Differences
      ... Input blksample.6 ...
      19 Differences
      ... Input input.2 ...
      32 Differences

   You can check the differences by hand.
   The same test is repeated for big and little endian mode.

DCT directory structure
-----------------------

   Main level

      coeffs.h   - Coefficients. Can be generated by one_d.float.c in
                   ../idct. Note that the coefficients are not word
                   swapped for little endian, unlike those of the IDCT.
      gen.c      - Simple test vector generation program
      fdct.c     - Simple forward DCT transform as per textbook
                   algorithm. Includes 2D and 2x 1D implementations of
                   an 8 x 8 DCT
      one_d.c    - 1-D implementation of DCT
      ref/       - Temporary directory for output of test procedure
                   (see makefile)
      tmp/       - Temporary directory for output of test procedure
                   (see makefile)
      main.c     - Test program. Reads an 8 x 8 test vector in ASCII
                   format and applies a single DCT
      dctfix.c   - This is the DCT algorithm

   **SEE ALSO**

      one_d.float.c in ../idct directory (coefficient generator program)

Internal Parameters
-------------------

   There are fewer parameters to adjust than for the IDCT. For most of
   these, see dctfix.c

      ENDIAN=el   Little endian
      ENDIAN=eb   Big endian
      HROUND(x)   Rounding macro for horizontal pass. Note that results
                  are represented in Q.16 format and upscaled by 2.
      VROUND(x)   Rounding macro for vertical pass. No upscaling is
                  performed.

Error analysis
--------------

Because of algorithmic and arithmetic differences, differences between
the output and the reference implementation are inevitable.

To do error analysis, use the one_d.c program provided to compare 1-D
outputs with the algorithm. In this case, the algorithm should be
compiled with -DONEPASS, and the outputs should match.

The outputs of the algorithm as compiled with -DONEPASS should also
match the outputs of fdct. The options -sep and -onepass should be used
with fdct.

Coefficients
------------

The coefficient values used are as follows.

                 Upper 16 bits                                 Lower 16 bits
#define C0  _(-(cos(3.*M_PI/16.) + sin(3.*M_PI/16.)),   cos(M_PI/16.) + sin(M_PI/16.))
#define C1  _(  cos(3.*M_PI/16.) - sin(3.*M_PI/16.),    sin(M_PI/16.) - cos(M_PI/16.))
#define C2  _(- sqrt(2.) * sin(3.*M_PI/16.),          - sqrt(2.) * cos(M_PI/16.))
#define C3  _(  sqrt(2.) * cos(3.*M_PI/16.),          - sqrt(2.) * sin(M_PI/16.))
#define C4  _(  sqrt(2.) * cos(3.*M_PI/16.),            sqrt(2.) * sin(M_PI/16.))
#define C5  _(  sqrt(2.) * sin(3.*M_PI/16.),          - sqrt(2.) * cos(M_PI/16.))
#define C6  _(  cos(3.*M_PI/16.) - sin(3.*M_PI/16.),    cos(M_PI/16.) - sin(M_PI/16.))
#define C7  _(  cos(3.*M_PI/16.) + sin(3.*M_PI/16.),    cos(M_PI/16.) + sin(M_PI/16.))
#define C8  _(1,  1)
#define C9  _(1, -1)
#define C10 _(  sqrt(2.) * cos(M_PI/8.),                sqrt(2.) * cos(3.*M_PI/8.))
#define C11 _(  sqrt(2.) * cos(3.*M_PI/8.),           - sqrt(2.) * cos(M_PI/8.))

These values can be generated by using one_d.float -dctmask (see IDCT
directory).

** ICG -- Update table in cookbook using these values **

General remarks on DCT integration
----------------------------------

The DCT algorithm is most useful in the context of a complete
multimedia application, such as an MPEG-2 decoder, or H.263
videoconferencing.

More issues must be addressed, including handling of zigzag addressing,
Variable Length Decoding (VLD), inverse quantization, block clear, and
optimization.

The TriMedia implementation has been designed for clarity and efficiency
but has not been optimized for a particular application.
Performance optimizations, such as use of a shuffled input order (see
the DCT algorithm flow), are left to the user.

Changes from previous versions
------------------------------

The main change in this version is that it takes input in the same
coefficient order and format as a normal DCT algorithm and produces
output in the same coefficient order and format. The code has been
generally cleaned up and made more robust and easier to use.

Besides this, extensive QA analysis has been done on the accuracy. The
DCT has been tested for compatibility with the inverse DCT.

The optimizations from the MPEG decoder (output in transposed and
shuffled order, output upscaled by 8) have been removed.
This allows the DCT to be used more easily.

Little and big endian modes are both supported.

Coefficient generator source program is provided.

A number of 8x8 sample vectors are provided for customer testing.

The coefficients have been corrected for errors in the cookbook (source
code was OK).