This directory contains X86_64/x86-64 SSE assembly implementations for
some of the most important Gromacs nonbonded functions.

Since we want it to work with any compiler we cannot use gcc inline
assembly (Portland and Microsoft don't support it), so they are coded
entirely in assembly. This is slightly (10-15%) faster too.

It is not as bad as it looks to edit them - search the net for a
tutorial on x86-64 assembly. The code is using the Intel syntax instead
of the old AT&T syntax employed in early versions of gcc.

Update, January 2006:

Unfortunately we need to support BOTH AT&T as well as Intel syntax
versions, because of braindead compilers and OS vendors.
For historical reasons (and sanity) I have done the editing in Intel
syntax, which is stored in the *.intel_syntax.s files. These are
then translated to AT&T syntax by the program intel2gas, and stored
in the standard *.s files, which are used by default.

Currently, the only platform that absolutely NEEDS Intel format is
Windows using the NASM assembler.

I've made pretty good use of registers, and frequently use them to
save information needed several lines further down.
Thus, if you need to add something in the code you might have to
save a couple of registers on the stack and reload them when you
are done.

I haven't rewritten SSE stuff to use the extra eight registers, simply
because it won't help much - all it gives you is a larger window to
the physical registers.
Register renaming will do most of the work for us anyway...

Normally, SSE works on 4 floats in parallel, and SSE2 on 2 doubles.
This means I've had to write special sections to take care of the
remaining odd elements when neighborlists are not a multiple of 4.
When changing this code you might have to be careful and zero the
unused elements; if the first three positions are valid but the 4th
NaN (not a number), the energy would be NaN if we add them!

The following loops have been implemented in SSE:
(There are also non-force versions in each file)

nb010_sse  (No Coul,     VdW=Lennard-Jones, no water optimization)
nb030_sse  (No Coul,     VdW=Table,         no water optimization)
nb100_sse  (Coul=Normal, No VdW,            no water optimization)
nb101_sse  (Coul=Normal, No VdW,            water=SPC/TIP3P-other atom)
nb102_sse  (Coul=Normal, No VdW,            water=SPC/TIP3P-SPC/TIP3P)
nb103_sse  (Coul=Normal, No VdW,            water=TIP4P-other atom)
nb104_sse  (Coul=Normal, No VdW,            water=TIP4P-TIP4P)
nb111_sse  (Coul=Normal, VdW=L-J,           water=SPC/TIP3P-other atom)
nb112_sse  (Coul=Normal, VdW=L-J,           water=SPC/TIP3P-SPC/TIP3P)
nb113_sse  (Coul=Normal, VdW=L-J,           water=TIP4P-other atom)
nb114_sse  (Coul=Normal, VdW=L-J,           water=TIP4P-TIP4P)
nb201_sse  (Coul=RF,     No VdW,            water=SPC/TIP3P-other atom)
nb202_sse  (Coul=RF,     No VdW,            water=SPC/TIP3P-SPC/TIP3P)
nb203_sse  (Coul=RF,     No VdW,            water=TIP4P-other atom)
nb204_sse  (Coul=RF,     No VdW,            water=TIP4P-TIP4P)
nb211_sse  (Coul=RF,     VdW=L-J,           water=SPC/TIP3P-other atom)
nb212_sse  (Coul=RF,     VdW=L-J,           water=SPC/TIP3P-SPC/TIP3P)
nb213_sse  (Coul=RF,     VdW=L-J,           water=TIP4P-other atom)
nb214_sse  (Coul=RF,     VdW=L-J,           water=TIP4P-TIP4P)
nb301_sse  (Coul=Table,  No VdW,            water=SPC/TIP3P-other atom)
nb302_sse  (Coul=Table,  No VdW,            water=SPC/TIP3P-SPC/TIP3P)
nb303_sse  (Coul=Table,  No VdW,            water=TIP4P-other atom)
nb304_sse  (Coul=Table,  No VdW,            water=TIP4P-TIP4P)
nb311_sse  (Coul=Table,  VdW=L-J,           water=SPC/TIP3P-other atom)
nb312_sse  (Coul=Table,  VdW=L-J,           water=SPC/TIP3P-SPC/TIP3P)
nb313_sse  (Coul=Table,  VdW=L-J,           water=TIP4P-other atom)
nb314_sse  (Coul=Table,  VdW=L-J,           water=TIP4P-TIP4P)
nb331_sse  (Coul=Table,  VdW=Table,         water=SPC/TIP3P-other atom)
nb332_sse  (Coul=Table,  VdW=Table,         water=SPC/TIP3P-SPC/TIP3P)
nb333_sse  (Coul=Table,  VdW=Table,         water=TIP4P-other atom)
nb334_sse  (Coul=Table,  VdW=Table,         water=TIP4P-TIP4P)
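
A note on the warning about unused elements: the sketch below is NOT the
assembly in this directory, just a small scalar model in Python of what
goes wrong when a neighborlist is not a multiple of 4. The names
(load_padded, sum_unmasked, sum_masked) and the sample energies are made
up for illustration, and the unused lane is filled with NaN on purpose
to stand in for whatever garbage might be sitting there.

```python
import math

WIDTH = 4  # SSE single precision handles 4 floats per register


def load_padded(values, start):
    """Return WIDTH lanes starting at `start`, plus a validity mask.

    Lanes past the end of `values` are set to NaN to mimic garbage
    that an unmasked load could pick up (an assumption for this demo).
    """
    lanes, mask = [], []
    for k in range(WIDTH):
        idx = start + k
        valid = idx < len(values)
        lanes.append(values[idx] if valid else math.nan)
        mask.append(valid)
    return lanes, mask


def sum_unmasked(values):
    """Accumulate 4 lanes at a time without touching the unused lanes."""
    total = [0.0] * WIDTH
    for start in range(0, len(values), WIDTH):
        lanes, _ = load_padded(values, start)
        total = [t + x for t, x in zip(total, lanes)]
    return sum(total)


def sum_masked(values):
    """Same loop, but unused lanes are replaced by 0.0 before adding."""
    total = [0.0] * WIDTH
    for start in range(0, len(values), WIDTH):
        lanes, mask = load_padded(values, start)
        # Select 0.0 for invalid lanes. Multiplying by 0.0 would not
        # help, because NaN * 0.0 is still NaN.
        lanes = [x if ok else 0.0 for x, ok in zip(lanes, mask)]
        total = [t + x for t, x in zip(total, lanes)]
    return sum(total)


# 7 pair energies: one full group of 4, then 3 left over.
energies = [1.5, -0.25, 2.0, 0.5, 1.0, -1.0, 0.25]
print(math.isnan(sum_unmasked(energies)))  # True: the 4th lane poisons the sum
print(sum_masked(energies))                # 4.0
```

The point to take away is that the unused elements have to be replaced
by zero (or never loaded at all), not scaled by zero. In SSE code one
way to do this is a bitwise AND against a mask, or loading the leftover
elements one by one into a zeroed register.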