?? lithos.htm
字號:
<TD colSpan=4>Control flow</TD></TR>
<TR>
<TD><CODE>LABEL0</CODE></TD>
<TD></TD>
<TD></TD>
<TD>Label 0</TD></TR>
<TR>
<TD><CODE>LABEL1</CODE></TD>
<TD></TD>
<TD></TD>
<TD>Label 1</TD></TR>
<TR>
<TD><CODE>JMP</CODE></TD>
<TD></TD>
<TD></TD>
<TD>Unconditional jump</TD></TR>
<TR>
<TD><CODE>JT</CODE></TD>
<TD><CODE>x</CODE></TD>
<TD></TD>
<TD>Jump if true</TD></TR>
<TR>
<TD><CODE>JF</CODE></TD>
<TD><CODE>x</CODE></TD>
<TD></TD>
<TD>Jump if false</TD></TR>
<TR>
<TD><CODE>CALL</CODE></TD>
<TD></TD>
<TD><CODE>addr</CODE></TD>
<TD>Call subroutine</TD></TR>
<TR>
<TD><CODE>RET</CODE></TD>
<TD><CODE>addr</CODE></TD>
<TD></TD>
<TD>Return from subroutine</TD></TR>
<TR>
<TD colSpan=4>Stack manipulation</TD></TR>
<TR>
<TD><CODE>DUP</CODE></TD>
<TD><CODE>x</CODE></TD>
<TD><CODE>x, x</CODE></TD>
<TD>Duplicate topmost item</TD></TR>
<TR>
<TD><CODE>SWAP</CODE></TD>
<TD><CODE>x, y</CODE></TD>
<TD><CODE>y, x</CODE></TD>
<TD>Swap topmost items</TD></TR>
<TR>
<TD><CODE>POP</CODE></TD>
<TD><CODE>x</CODE></TD>
<TD></TD>
<TD>Discard topmost item</TD></TR>
<TR>
<TD colSpan=4>Memory access</TD></TR>
<TR>
<TD><CODE>LOAD</CODE></TD>
<TD><CODE>addr</CODE></TD>
<TD><CODE>mem[addr]</CODE></TD>
<TD>Load a word from memory</TD></TR>
<TR>
<TD><CODE>STORE</CODE></TD>
<TD><CODE>addr, x</CODE></TD>
<TD></TD>
<TD>Store a word to memory</TD></TR></TBODY></TABLE>
<P>The generator used by the <CODE>RANDOM</CODE> instruction is the same as that
used by the mutation and crossover operators, and is seeded from the system
clock when the program starts up.
<P>Division by zero, or dividing <CODE>INT_MIN</CODE> by -1, is treated as
division by 1. Other arithmetic operations wrap around on overflow as normal for
machine arithmetic.
<P>The conditional branch instructions take 0 as false and any nonzero value as
true. The comparison operators return 0 for false and -1 for true. This is
because the logical operators work on each bit of their inputs (they are like
the <CODE>&</CODE>, <CODE>|</CODE> and <CODE>~</CODE> operators in C). So
<CODE>not(1 == 1)</CODE> reduces to <CODE>not(-1)</CODE> which reduces to 0, the
correct result.
<P>Label instructions perform no operation when executed. Attempts to jump to a
null or nonexistent label also perform no operation, except that inputs if any
are still consumed, and a <CODE>CALL</CODE> instruction still pushes the return
address on the stack.
<P><CODE>LOAD</CODE>s and <CODE>STORE</CODE>s outside the bounds of memory
perform no operation, except that inputs are still consumed.
<P><A name="Control flow">
<H3>Control flow</H3></A>
<P>The idea for the control flow mechanism comes from Thomas Ray's Tierra
system. In most real microprocessors, branches have an immediate offset operand
which is added to a base address (the code segment or current instruction
pointer) to give the absolute address to jump to. This is unlikely to work well
in evolutionary computation because it is too brittle. A single insertion or
deletion would change the address of every instruction after that point, thereby
typically changing the meaning of the entire program.
<P>Instead, a template matching system is used. Consider the following code
fragment:
<P><PRE> JMP
LABEL0
LABEL1
LABEL0
LABEL1
</PRE>
<P>On encountering the <CODE>JMP</CODE>, the interpreter looks at the sequence
of label instructions after it - in this case 0101. It then scans back and forth
through the code looking for occurrences of the same sequence of labels, in
other words for the fragment
<P><PRE> LABEL0
LABEL1
LABEL0
LABEL1
</PRE>
<P>not immediately preceded or followed by any other labels. The nearest such
found (measured by distance of the starting label from the jump instruction; in
case of a tie, backward jumps are preferred) is taken as the destination, and
control is transferred to the instruction immediately following that label
sequence.
<P>(In practice, for efficiency reasons the interpreter performs the scans once
up front rather than every time a jump instruction is encountered.)
<P>A subroutine calling mechanism is also provided, where the <CODE>CALL</CODE>
instruction finds its target by the template system and pushes the return
address onto the stack (in the form of an integer from 1 to <I>Max Size</I>,
being the address of the instruction <I>following</I> the <CODE>CALL</CODE>).
The <CODE>RET</CODE> instruction takes the return address from the stack
(irrespective of whether it was put there by a <CODE>CALL</CODE>). Programs
evolved thus far have rarely used this mechanism for its envisioned purpose, but
somewhat to the author's surprise, have often used <CODE>CALL</CODE> to obtain
constant numbers and <CODE>RET</CODE> as an indirect jump.
<P><A name="Example programs">
<H2>Example programs</H2></A>
<P>Two versions of the factorial function are shown below as examples. (These
were hand coded, not evolved. Comments are included in square brackets.) The
initial instruction sequence pushes the number 5 on the stack; when the routine
finishes, this has been replaced with 120.
<P><A name="Factorial(n) recursive style">
<H3>Factorial(n) recursive style</H3></A>
<P><PRE> [n = 5]
CONST0
INC
INC
INC
INC
INC
[factorial(n)]
CALL
LABEL0
[goto end]
JMP
LABEL0
LABEL0
[separator between labels]
NOT
[factorial:]
LABEL0
SWAP
[if n <= 1]
DUP
CONST1
LE
[goto done]
JT
LABEL1
[n * factorial(n - 1)]
DUP
DEC
CALL
LABEL0
MUL
[return]
SWAP
RET
[done:]
LABEL1
[1]
POP
CONST1
[return]
SWAP
RET
[end:]
LABEL0
LABEL0
</PRE>
<P><A name="Factorial(n) iterative style">
<H3>Factorial(n) iterative style</H3></A>
<P><PRE> [n = 5]
CONST0
INC
INC
INC
INC
INC
[result = 1]
CONST0
CONST1
STORE
[loop:]
LABEL0
[if n <= 1]
DUP
CONST1
LE
[goto done]
JT
LABEL1
[result *= n]
DUP
CONST0
LOAD
MUL
CONST0
SWAP
STORE
[n--]
DEC
[goto loop]
JMP
LABEL0
[separator between labels]
NOT
[done:]
LABEL1
[return result]
CONST0
LOAD
</PRE>
<P><A name="The game of Go">
<H2>The game of Go</H2></A>
<P>The current version of Lithos evolves programs to play the game of Go. It
should be emphasized that the objective of this is not to produce something that
can defeat a skilled human player - evolved programs under current hardware play
very poorly even compared to hand-written programs. (Unsurprisingly, considering
that they consist of a few hundred to a few thousand bytes of code, versus tens
to hundreds of kilobytes for hand-written programs.) The objective is to
experiment with artificial evolution on a problem that - unlike most of those
studied, but like biological evolution - is effectively open-ended.
<P>The following points should be noted concerning the rules as applied here:
<P>
<UL>
<LI>White is given a 0.5 point handicap in compensation for playing second.
(Among human players 5.5 points is traditional, but the evolved programs are
not skilled enough to exploit the advantage of the first move very well.)
<LI>Empty squares are regarded as territory only if completely surrounded and
containing no opponent's stones. Even a single opposing stone will act as a
spoiler and must actually be surrounded and captured. (This is to relieve the
evaluation function of having to exercise judgement of live versus dead groups
in assessing the score.)
<LI>Squares occupied by one's own stones are included in one's territory (as
per Chinese rather than Japanese rules). </LI></UL>
<P>In each generation, every program plays two games against every other, one
time taking black and the other time taking white. Thus, programs co-evolve in
an environment of other evolving programs; no preexisting hand-written opponent
is required.
<P>(Note that this means computation time per generation varies as the square of
<I>Population</I>. This algorithm is used anyway because much of the advantage
of having a large population in this context is a more accurate fitness
function: each individual's fitness is better assessed in play against a more
statistically significant number of different opponents.)
<P>The procedure for playing two programs against each other is as follows:
<P>
<OL>
<LI>Allocate an empty 19 by 19 board.
<LI>Allocate a memory space of <I>Max Memory</I> words for each program,
initialized to all zeros.
<LI>Present black with the current state of the board and allow it to make its
move. An invalid move is taken as a pass.
<LI>Do the same for white.
<LI>Repeat steps 3 and 4 until both players have passed in a row or both have
made <I>Max Moves</I> moves.
<LI>Record the winner of the game. </LI></OL>
<P>After all games have been played, if two programs have scored the same number
of wins but are of different sizes then size is used as a tie breaker; the
shorter program has higher fitness. (Even a single extra win counts for more
than any advantage in compactness, though.)
<P>To allow a program to make a move:
<P>
<OL>
<LI>Copy the board state row by row into words 0 to 380 of the program's
memory space. 0 = empty square, 1 = own stone, -1 = opponent's stone.
<LI>Set word 381 to the number of prisoners the player has taken and word 382
to the negative of the number the opponent has taken.
<LI>Leave the rest of the memory space unchanged so that the program can store
information between moves if it so chooses.
<LI>Run the program until it ends or times out after <I>Max Time</I>
instructions.
<LI>Take the top two words of the stack as the row and column of the desired
move (assuming pushed in row, column order).
<LI>If either is outside the range 0 to 18 or the move is otherwise illegal,
count this as a pass. </LI></OL>
<P>(As it happens, a null program will place three stones in the top left corner
of the board by effectively echoing its input. Against another null program,
this claims the entire board as territory in the absence of any opposing stones.
The <I>Score</I> column of the <CODE>log</CODE> file typically shows scores of
381 points in the first few generations, then dropping to very low levels as
wins start being contested, and slowly climbing as skill improves.)
<P>To give visual feedback, the evaluation function displays each generation the
game between the most fit (taking black) and second most fit (taking white)
individuals of the previous generation. This is shown as two boards side by
side, the first being the final state of the board (<CODE>.</CODE> = empty
square, <CODE>*</CODE> = black stone, <CODE>O</CODE> = white stone) and the
second being the resulting territory claims.
<P>The code for all of this is in the <CODE>eval.c</CODE> file. To apply Lithos
to other tasks, simply replace the contents of this file with an appropriate
evaluation function.
<P><A name=Download>
<H2>Download</H2></A>
<P><A href="http://www.esatclear.ie/~rwallace/lithos.zip"><B>C source
code</B></A>
<P>The code was developed under Microsoft C++ version 4.2, but is written in
vanilla ANSI C so should work substantially unmodified on any compiler. The
included makefile is for GNU Make.
<P><A href="http://www.esatclear.ie/~rwallace/lithos.exe"><B>Windows
executable</B></A>
<P>To use the program, simply download it into any directory and run it. The
<CODE>data</CODE> and <CODE>log</CODE> files will be placed in the current
directory. (The <CODE>params</CODE> file should also be placed there if you want
to change the parameters.)
<P>If an existing <CODE>data</CODE> file is found, the program will load it and
continue from where it left off, otherwise it will start with a fresh
population. Therefore if you want to end the current run and begin a new one,
simply move, rename or delete the <CODE>data</CODE> file before restarting the
program. When doing this you should do the same with the <CODE>log</CODE> file
so that the program will create a fresh one.
<P>To pause the program, press the space bar; press again to restart. Any other
key will save the current population to the <CODE>data</CODE> file and exit.
(With larger populations, there may be a few seconds delay while the program
finishes the current evaluation.)
<P><A href="http://www.esatclear.ie/~rwallace/index.html"><B>Back to home
page</B></A>
<P><!-- BEGIN FASTCOUNTER CODE --><A
href="http://member.bcentral.com/cgi-bin/fc/fastcounter-login?2227955"
target=_top><IMG src="Lithos.files/digits.gif" border=0></A> <!-- END FASTCOUNTER CODE --><BR><!-- BEGIN FASTCOUNTER LINK --><FONT face=arial
size=1><A href="http://fastcounter.bcentral.com/fc-join" target=_top>FastCounter
by bCentral</A> </FONT><BR><!-- END FASTCOUNTER LINK --></P></BODY></HTML>
?? 快捷鍵說明
復制代碼
Ctrl + C
搜索代碼
Ctrl + F
全屏模式
F11
切換主題
Ctrl + Shift + D
顯示快捷鍵
?
增大字號
Ctrl + =
減小字號
Ctrl + -