Multiprocessor Version of SimpleScalar                       30 May 2000


OVERVIEW:

This README file is for the multiprocessor version of the SimpleScalar
simulator.

The multiprocessor extensions were conceived and implemented by
Naraig Manjikian at Queen's University, Kingston, Ontario, Canada.
The extensions are based on release 2.0 of the original uniprocessor
SimpleScalar tool set written by Todd M. Austin.

SimpleScalar software is available from http://www.simplescalar.org.
The uniprocessor version of SimpleScalar has documentation on installation
of binary utilities and compilers to generate executables for simulation.

The software in this package produces a fast functional simulator,
'sim-mpfast', that interleaves the simulated execution of multiple
threads. A runtime library is also included to support thread creation
and synchronization. Locks, barriers, and semaphores are supported.
For simplicity, the simulator does not allow a single thread to hold
more than one lock at a time. Typical multiprocessor programs adhere to
this rule, hence this should not be a significant restriction.

It should be noted that the multiprocessor simulation code does _not_
include any support for re-entrant standard library routines such as
I/O or memory allocation. Explicit locking for mutual exclusion within
the application program is necessary if parallel threads will engage
in such operations.

The software also includes 'sim-mpcache', which models a two-level cache
hierarchy in a multiprocessor system and enforces cache coherence.
Extensive statistics are maintained and reported at the end of the
simulation.

The 'sim-mpcache' simulator also has an optional graphical visualization
capability based on X windows to dynamically illustrate changes in cache
line states as the simulation progresses.
Different colors are used to represent the states in a MESI protocol:
white=untouched cache line, black=invalid, green=shared, yellow=exclusive,
red=modified, and cyan=modified-above (used only for the L2 cache to
reflect an out-of-date state with respect to the L1 cache). Furthermore,
L1 and L2 cache miss rates are computed over short intervals and displayed
with bar graphs. The overall miss rates are still reported at the end of
the simulation.

The current version of the graphical capability assumes 8 processors
with L1 caches of 8 kbytes, L2 caches of 256 kbytes, and a cache line
size of 16 bytes. Experiments may be performed with different
configurations, and the appropriate statistics will still be reported at
the end of a simulation, but the display may not look good. Modifications
would be required to make the graphics code adapt the size of the display
window to different configurations.

Note that neither simulator performs detailed timing simulation;
all instructions take one cycle. When a thread is blocked due to
synchronization, however, no instructions are fetched and interpreted
for that thread until the synchronization completes (i.e., the needed
lock is released, or all threads arrive at the barrier).

The runtime library is currently implemented only for the portable
instruction set architecture (PISA) of SimpleScalar. You will need the
tools for generating PISA executables in order to build the runtime
library and multiprocessor programs for simulation.

A file 'c.m4.ss' is included as the basis for using the PARMACS macros
(Lusk et al., "Portable Programs for Parallel Processors," Holt, Rinehart
and Winston, Inc., 1987).
The macros that are defined are sufficient for
testing purposes and have been used with several benchmark programs,
but _absolutely no guarantee_ is provided for their completeness.

For a description of the differences between this multiprocessor version
of SimpleScalar and the original uniprocessor version on which it is based,
consult the file called CHANGES.


USAGE:

The multiprocessor simulators are used in a manner similar to the
uniprocessor simulators.

The 'sim-mpfast' simulator has no special command line options. It simply
performs a functional multiprocessor simulation.

The 'sim-mpcache' simulator has several command line options.
The list below describes the options and their current default values:

    -L1 <int>                #   8     # Level 1 cache size (kbytes)
    -L1line <int>            #  16     # Level 1 line size (bytes)
    -L2 <int>                # 256     # Level 2 cache size (kbytes)
    -L2line <int>            #  16     # Level 2 line size (bytes)
    -graphics <true|false>   # false   # graphical display of coherence
    -speed <int>             #  10     # animation speed (1 to 10)

The last two options above are only valid if 'sim-mpcache' was generated
with graphics capability enabled (see notes on building simulators below).

If 'sim-mpcache' is used with graphics, the display assumes up to 8
processors. An error message will be reported and the simulation will be
terminated if more than 8 processors are used. Furthermore, the window
size is set for 8-kbyte L1 caches and 256-kbyte L2 caches using the
default line size of 16 bytes. You can try other values, but it may not
look very good. Enhancements to create the window with a size that is
determined by the configuration parameters are certainly possible.

A limited amount of refresh capability is provided for the graphics in
'sim-mpcache'. Exposure events on the display window are captured, and
the display is redrawn. If the window is iconified, then remapped, the
display is also redrawn.
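
As a worked example of the default configuration listed above, the
following Python sketch computes how many cache lines the defaults imply.
It is only an illustration and is not part of the simulator. The function
names are hypothetical, 1 kbyte is assumed to mean 1024 bytes, and one L1
plus one L2 cache per processor is assumed.

```python
# Illustrative arithmetic only; none of these names exist in the simulator.

MAX_GRAPHICS_PROCESSORS = 8  # display limit stated in this README


def cache_lines(size_kbytes, line_bytes):
    """Return the number of lines in a cache (assumes 1 kbyte = 1024 bytes)."""
    total_bytes = size_kbytes * 1024
    if line_bytes <= 0 or total_bytes % line_bytes != 0:
        raise ValueError("line size must be positive and divide the cache size")
    return total_bytes // line_bytes


def display_cells(processors, l1_kbytes=8, l1_line=16, l2_kbytes=256, l2_line=16):
    """Return the total number of cache lines a display would have to show.

    Assumes one L1 and one L2 cache per processor (an assumption made for
    this sketch, not something confirmed by the simulator source).
    """
    if not 1 <= processors <= MAX_GRAPHICS_PROCESSORS:
        raise ValueError("graphics mode assumes between 1 and 8 processors")
    per_processor = cache_lines(l1_kbytes, l1_line) + cache_lines(l2_kbytes, l2_line)
    return processors * per_processor


if __name__ == "__main__":
    print("L1 lines:", cache_lines(8, 16))
    print("L2 lines:", cache_lines(256, 16))
    print("lines shown for 8 processors:", display_cells(8))
```

Under these assumptions the defaults give 512 lines per L1 cache and 16384
lines per L2 cache. Changing -L1, -L2, or the line sizes changes these
counts, which is why a window sized for the defaults may not look good with
other values.
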
At the end of a simulation with graphics enabled,
there is a pause for a press of the return key in the shell window from
which the simulation was executed. This pause allows the user to view the
final cache states. Currently, however, no refresh is supported at the
end of a simulation if the display is obscured or if the window is
iconified.

When graphics are enabled for 'sim-mpcache' and the window is iconified,
the simulation runs much faster because a flag is used to reflect whether
or not the window is mapped. The cache simulator continues to maintain
and update the states of cache lines in all of the caches, but no drawing
is performed. Once the window is remapped, the exposure events cause the
display to be refreshed with the current cache contents, but the simulation
speed is also reduced because all cache line state changes are drawn again.

Regardless of whether graphics are enabled for 'sim-mpcache', there is a
report at the end of the simulation with hit/miss statistics and various
coherence-related statistics.

Finally, if the graphics capability is not going to be used, the
'sim-mpcache' simulator may be regenerated without the compile-time flag
for graphics (see the notes below on building the simulators).
The resulting simulator should be slightly more efficient.


SUPPORTED PLATFORMS:

This software package has been installed successfully on the following
systems:

    Architecture    Operating System    Compiler
    SPARC           Solaris 2           gcc


TO BUILD:

The following instructions describe how to build and test the
multiprocessor version of the SimpleScalar simulator:

    a)  vi Makefile

        set the SS_BIN_PATH variable with the path to your version of
        the tools used to compile/assemble SimpleScalar code; for example,
        if the C compiler you use to generate PISA code for simulated
        programs is located at

            /software/SIMPLE_SCALAR/ssbig-na-sstrix/bin/gcc

        then set SS_BIN_PATH to

            /software/SIMPLE_SCALAR/ssbig-na-sstrix/bin

        (this path is not used to build the simulator; the variable is
        passed on to another Makefile for building the runtime library
        and for building the simple multiprocessor test programs)

        make sure all compile options are set for your host; you'll likely
        not have to change anything for the supported hosts, and if you
        need to change anything, it will likely be the CC variable (which
        specifies the ANSI C compiler to use to build the simulators).
        NOTE: the simulators must be built with an ANSI-C compatible
        compiler; if you have problems with your compiler, try using
        GNU GCC as it is known to build the simulators on all the
        supported platforms.

        you may control whether or not debugging features are included in
        the compiled simulators with the -DDEBUG option in the FFLAGS
        variable, and you may control whether or not the 'sim-mpcache'
        simulator includes graphics with the -DGRAPHICS option in FFLAGS

    b)  make

        this builds the multiprocessor SimpleScalar 'sim-mpfast' and
        'sim-mpcache' simulators, and also builds the runtime library
        (by invoking a separate Makefile)

    c)  make sim-tests

        for easier reading, pipe the output of this make command to more:

            make sim-tests |& more

        this runs very simple tests to ensure that the runtime library was
        properly generated and that 'sim-mpfast' executes successfully;
        a separate Makefile in a subdirectory is invoked for this purpose
        (note that there currently is no verification of correct operation
        except by you, the human, observing whether or not multiprocessor
        simulation of the test programs completes successfully;
        automation comes later...)

        the '-d' flag is used in these multiprocessor test executions to
        show the synchronization activity; running the simulator without
        the '-d' flag will suppress the messages that are printed

        this make command does not test the 'sim-mpcache' simulator or
        its graphical capability; see below for a separate test

    d)  if you successfully compiled the 'sim-mpcache' simulator, the
        following command may be used to try one of the test programs:

            sim-mpcache -graphics -speed 1 tests/dotbar.ss -p8 -n10000

        the default speed of 10 is probably too fast to clearly see the
        data access patterns of the test program, so the slowest speed
        of 1 is recommended above


ACKNOWLEDGEMENTS:

The production of this software was supported in part by funding from
Queen's University, Communications and Information Technology Ontario
(CITO), and the Natural Sciences and Engineering Research Council of
Canada (NSERC).

......................................................................
Naraig Manjikian (nmanjiki@ee.queensu.ca)
Department of Electrical and Computer Engineering
Queen's University
Kingston, Ontario, Canada K7L 3N6
http://www.ece.queensu.ca