.ls 2
.ta 0.5i +0.5i +0.5i +0.5i +0.5i +0.5i +0.5i +0.5i +0.5i +0.5i +0.5i
.rh "Searching."
For every group of remainders there will somewhere be a $V$ bit equal
to 1 and a $C$ bit at a non-empty location equal to 1. That is, for
every $V$ bit which is 1 there is a corresponding $C$ bit which is
also 1.
.FC "Fig. 1."
This correspondence is indicated in Fig. 1 by the dotted lines.
When searching for a key $H$ in the table the location $h(H)$ is
examined. If the $V$ bit is 0 then the search can stop immediately.
Otherwise a search is made for the corresponding $C$ bit which is 1.
To do this a search is made down (or up) the hash table until an
empty location is found. The number of $V$ bits which are 1 from
$h(H)$ to this empty location is counted. The correct $C$ bit is then
found by counting back up (or down) the array from the empty location
for the same number of $C$ bits which are 1. Details of this
algorithm follow.
.ls 1
.sp
.nf
.ta 1.5c +1.35c +1.35c +1.35c +1.35c +1.35c +1.35c +1.35c +1.35c
.sp
.ne 2
Step 1:	{initialise variables}
	$H <-~ t(K);~~j <-~ h(H);~~rem <-~ r(H);~~i <-~ j;~~count <-~ 0;$
	{check virgin bit}
	$if~ V[j]=0~then$ {search unsuccessful} $exit ;$
.sp
.ne 3
Step 2:	{find first empty location down the table}
	$while ~R[i] != -1~do~~begin~~count <-~count - V[i];~i <-~ i-1 ~end ;$
.sp
.ne 4
Step 3:	{search back to find uppermost member of relevant group}
	$while count < 0 ~do~begin~ i <-~i+1;~count <-~count +C[i]~ end ;$
	{$i$ now points at the first (lowest) member of the group associated}
	{with the original location $j$}
.sp
.ne 6
Step 4:	{search group associated with $j$}
	$while R[i+1] <= rem ~and C[i+1]=0~do i <-~i+1 ;$
	{check last location to see if key found}
	$if R[i]=rem~ mark then$ {search successful} $lineup else$ {search unsuccessful} ;
.sp 2
.ls 2
.fi
.pp
An example search is illustrated in Fig. 1 for the key 75. For this
example $h$ is computed by dividing by 10 and rounding down; $r$ is
computed by taking the remainder modulo 10.
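.pp
The four steps above can be sketched in executable form. The table
used below is a hypothetical reconstruction consistent with the
Fig. 1 walk-through; the keys other than 75 are not given in the
text, so keys 42, 61, 64, 69, 75 and 78 are assumed:

```python
# Hypothetical reconstruction of the Fig. 1 table, with h(K) = K // 10
# and r(K) = K % 10, for assumed keys 42, 61, 64, 69, 75 and 78.
EMPTY = -1
R = [EMPTY, EMPTY, EMPTY, EMPTY, 2, 1, 4, 9, 5, 8]  # stored remainders
V = [0, 0, 0, 0, 1, 0, 1, 1, 0, 0]  # 1 at each home location that owns a group
C = [0, 0, 0, 0, 1, 1, 0, 0, 1, 0]  # 1 at the first (lowest) member of each group

def search(key):
    """Return the location holding `key`, or None if it is absent."""
    j, rem = key // 10, key % 10      # Step 1: hash location and remainder
    if V[j] == 0:                     # virgin bit is 0: key cannot be present
        return None
    i, count = j, 0
    while R[i] != EMPTY:              # Step 2: scan down to an empty location,
        count -= V[i]                 # counting the V bits that are 1
        i -= 1
    while count < 0:                  # Step 3: count back up the same number
        i += 1                        # of C bits that are 1
        count += C[i]
    # i now points at the first (lowest) member of the group for location j
    while i + 1 < len(R) and R[i + 1] <= rem and C[i + 1] == 0:
        i += 1                        # Step 4: scan the group's sorted remainders
    return i if R[i] == rem else None

print(search(75))  # -> 8, as in the Fig. 1 example
print(search(35))  # -> None: V[3] = 0, so one probe suffices
```

A bounds check on $i+1$ is added in step 4, which the pseudocode
above leaves implicit.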
.br
Step 1: The initial hash location for 75 is 7 and its remainder is 5.
The $V$ bit at location 7 is 1 so the search continues.
.br
Step 2: The first empty location found by searching down the table is
at location 3. There are three $V$ bits with a value of 1 between 7
and 3, at locations 4, 6 and 7.
.br
Step 3: Counting back from location 3, three $C$ bits are 1, at
locations 4, 5 and 8. So the $C$ bit at location 8 corresponds to the
$V$ bit at the original hash location 7.
.br
Step 4: The group of remainders which share the same initial location
7 can then be found in locations 8 and 9. Thus the remainder 5 at
location 8 can be unambiguously associated with the original key 75
and so it can be concluded that the information associated with the
key 75 is present at location 8 in the memory.
.pp
It still remains to specify the update algorithm and to address some
issues of efficiency. To this end a third array will be added.
.rh "Speeding up search."
It was found during the simulations reported in Section 4 that the
most time-consuming element of this search is step 2, when the table
is scanned for an empty location. The essential role played by the
empty locations here is to provide a synchronisation between the 1
bits in the $V$ and $C$ arrays. This lengthy search could be
eliminated by maintaining two additional arrays, $#C[ LH ]$ and
$#V[ LH ]$, which count from the start of memory the number of $C$
and $V$ bits which are 1. That is:
.br
$#C[i] ~==~ sum from j=lo to i |C[j]=1~and~R[j] != -1 |$
.br
and $#V[i] ~==~ sum from j=lo to i V[j]$ .
.br
.pp
In order to find the $C$ bit corresponding to some $V[i]=1$ all that
is necessary is to compute the difference $count <-~#C[i]-#V[i]$.
If $count$ is zero then the remainder stored at $i$ was originally
hashed there and has not been moved. If $count$ is positive then it
is necessary to scan down the memory until $count$ $C$ bits equal to
1 have been found.
If $count$ is negative then it is necessary to scan up the memory
until $-count$ $C$ bits which are 1 have been found. Fig. 2 shows
some examples of the various situations which can arise.
.FC "Fig. 2."
.pp
In fact, it is not necessary to store $#C$ and $#V$ explicitly; it is
sufficient merely to store the differences $#C[i]-#V[i]$. To do this
the
.ul
At home
array, $A[ LH ]$, will be used.
.pp
At this point it might seem that all earlier gains have been lost
because in the most extreme case $#C[i]-#V[i]~=~M$. To store such a
value of $A$ will require as many bits as a memory index -- precisely
the gain made by storing remainders rather than keys!\ However, all
is not lost. The values of $A$ tend to cluster closely about 0.
Simulation shows that a hash memory which is 95% full has 99% of the
$A$ values in the range -15 to +15. Therefore the following strategy
can be adopted. Assign a fixed number of bits for storing each value
of $A$, say 5 bits. Use these bits to represent the 31 values -15 to
+15 and a 32nd value for $inf$. Then anywhere that
$#C[i]-#V[i]~<~-15~"or"~>+15$ assign $inf$ to $A[i]$; otherwise
assign the true difference.
.pp
When searching for a key a scan can now be done down (or up) the
memory until a location $i$ where $A[i] != inf$ is found. (At worst
this will occur at the first unoccupied location, where $A[i]$ will
be zero.)\ From there a count can be made up (or down) the memory
for the appropriate number of $C$ bits which are 1.
.pp
In the detailed algorithm given below some differences from the
simpler search can be noted. In step 3, $count$ can be both positive
and negative. Therefore code is included to scan both up and down the
memory as appropriate. At the end of step 3, $i$ can be pointing at
any member of the group associated with the original hash location.
(Above, $i$ was always left pointing at the lowest member of the
group.)\ Therefore code is included for scanning both up and down
the members of the group.
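.pp
The running counts and the bounded $A$ field can be illustrated with
a small sketch, using the same hypothetical table as the earlier
search sketch; the choice of code 31 for $inf$ in the 5-bit encoding
is an assumption, not taken from the text:

```python
# Same hypothetical reconstruction of the Fig. 1 table as before.
EMPTY = -1
R = [EMPTY, EMPTY, EMPTY, EMPTY, 2, 1, 4, 9, 5, 8]
V = [0, 0, 0, 0, 1, 0, 1, 1, 0, 0]
C = [0, 0, 0, 0, 1, 1, 0, 0, 1, 0]

# A[i] = #C[i] - #V[i]: running count of C bits that are 1 at
# non-empty locations, minus running count of V bits that are 1.
A = []
nC = nV = 0
for j in range(len(R)):
    if C[j] == 1 and R[j] != EMPTY:
        nC += 1
    nV += V[j]
    A.append(nC - nV)

print(A)  # A[7] = -1: from location 7, scan up for one C bit (at location 8)

# A possible 5-bit encoding: codes 0..30 for the differences -15..+15,
# and code 31 (an arbitrary choice here) reserved for "infinity".
INF = 31

def encode(d):
    return d + 15 if -15 <= d <= 15 else INF

def decode(code):
    return None if code == INF else code - 15
```

Note that $A$ is 0 at every empty location and wherever the stored
remainder still sits at its home location, which is why most values
cluster about 0.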
In order to prevent redundant checking of locations by this code a
flag $direction$ is used. It can take on three values depending on
the direction of the memory scan: $"up"$, $"down"$, and $here$ (no
further searching need be done).
.ls 1
.sp
.nf
.ta 1.5c +1.45c +1.45c +1.35c +1.35c +1.35c +1.35c +1.35c +1.35c
.sp
.ne 2
{Search using at-home count}
Step 1:	{initialise variables}
	$H <-~ t(K);~~j <-~ h(H);~~rem <-~ r(H);~~i <-~ j;~~count <-~ 0;$
	{check virgin bit}
	$if~ V[j]=0~then$ {search unsuccessful} $exit ;$
.sp
.ne 5
Step 2:	{find first well defined $A$ value down the memory}
	$while ~A[i] = inf~do~begin~count <-~count - V[i];~i <-~i-1 ~end ;$
	$count <-~count +A[i];$
.sp
.ne 16
Step 3:	{Search either up or down until a member of sought group is found}
	{Also ensure $direction$ is set for Step 4.}
	$if count < 0 ~then$
		$direction <-~"up";$
		$repeat i <-~i+1;~count <-~count +C[i]~ until count = 0 ;$
		$if R[i] ~>=~ rem ~then direction <-~here;$
	$else if count > 0 ~then$
		$direction <-~"down";$
		$repeat ~count <-~count -C[i];~i <-~i-1~ until count = 0 ;$
		$if R[i] ~<=~ rem ~then direction <-~here;$
	$else$	{$count = 0$}
		$if R[i] > rem ~then direction <-~"down"$
		$else if R[i] < rem ~then direction <-~"up"$
		$else direction <-~here ;$
.sp
.ne 16
Step 4:	{search group associated with $j$}
	$case direction~ "\fBof\fP"$
	$here: ;$	{do nothing}
	$"down": repeat if C[i] = 1 ~then direction <-~here$
		$else$ $begin$
			$i <-~i-1;$
			$if R[i] ~<=~ rem ~then direction <-~here;$
		$end$
		$until direction = here ;$
	$"up": repeat if C[i+1] = 1 ~then direction <-~here$
		$else$ $begin$
			$i <-~i+1;$
			$if R[i] ~>=~ rem ~then direction <-~here;$
		$end$
		$until direction = here ;$
	$end ;$
.sp
.ne 4
Step 5:	{check last location to see if key found}
	$if R[i]=rem~ mark then$ {search successful} $lineup else$ {search unsuccessful} ;
.sp 2
.ls 2
.fi
.FC "Fig. 3."
.pp
Fig. 3 gives an example of this searching algorithm. The same memory
and key (75) as in Fig. 1 are used. For the purposes of the example
each $A$ value is allocated one bit.
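.pp
Under the same assumptions as the earlier sketches (the table is a
hypothetical reconstruction), the one-bit case can be rendered
directly; `float("inf")` stands in for $inf$ and a bounds check is
added to the upward group scan:

```python
# One-bit A field: only 0 and "infinity" are representable, so every
# nonzero difference #C[i] - #V[i] is stored as INF.
INF = float("inf")
EMPTY = -1
R = [EMPTY, EMPTY, EMPTY, EMPTY, 2, 1, 4, 9, 5, 8]
V = [0, 0, 0, 0, 1, 0, 1, 1, 0, 0]
C = [0, 0, 0, 0, 1, 1, 0, 0, 1, 0]
A = [0, 0, 0, 0, 0, INF, 0, INF, 0, 0]

def search(key):
    j, rem = key // 10, key % 10        # Step 1
    if V[j] == 0:
        return None
    i, count = j, 0
    while A[i] == INF:                  # Step 2: find a well defined A value
        count -= V[i]
        i -= 1
    count += A[i]
    if count < 0:                       # Step 3: group lies above, scan up
        direction = "up"
        while count != 0:
            i += 1
            count += C[i]
        if R[i] >= rem:
            direction = "here"
    elif count > 0:                     # Step 3: group lies below, scan down
        direction = "down"
        while count != 0:
            count -= C[i]
            i -= 1
        if R[i] <= rem:
            direction = "here"
    else:                               # count = 0: i is already in the group
        if R[i] > rem:
            direction = "down"
        elif R[i] < rem:
            direction = "up"
        else:
            direction = "here"
    while direction == "down":          # Step 4: scan down within the group
        if C[i] == 1:
            direction = "here"
        else:
            i -= 1
            if R[i] <= rem:
                direction = "here"
    while direction == "up":            # Step 4: scan up within the group
        if i + 1 >= len(R) or C[i + 1] == 1:
            direction = "here"
        else:
            i += 1
            if R[i] >= rem:
                direction = "here"
    return i if R[i] == rem else None   # Step 5

print(search(75))  # -> 8, matching the walk-through below
```

The repeat-until loops of the pseudocode are translated to while
loops; they are equivalent here because each loop is entered only
when its condition already holds.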
This allows only two values, 0 and $inf$. The search proceeds as
follows:
.br
Step 1: The initial hash location for 75 is 7 and its remainder is 5.
The $V$ bit at location 7 is 1 so the search continues.
.br
Step 2: The first $A$ value not equal to $inf$ found by searching
down the table is at location 6. There is one $V$ bit with a value of
1 between 7 and 6, at location 7. $count$ is then set to
$A[6]-1~=~-1$. So on the next step one $C$ bit will be sought up the
memory.
.br
Step 3: Counting back up from 6, the first $C$ bit equal to 1 is at
location 8. So the $C$ bit at location 8 corresponds to the $V$ bit
at the original hash location 7.
.br
Step 4: The group of remainders which share the same initial location
7 can then be found in locations 8 and 9. The remainder 5 at location
8 can thus be unambiguously associated with the original key 75 and
it can be concluded that the information associated with the key 75
is present at location 8 in the memory.
.rh "Insertion."
Insertion of a new key into the memory requires three distinct steps:
first, locating whereabouts in the memory the key is to be placed;
second, deciding how the memory is to be rearranged to make room for
the new key; and third, moving the remainders whilst correctly
preserving the $A$ and $C$ values. (The $V$ bits remain fixed during
the move.)\ The initial search can be done as explained above with
the small addition that the correct insertion point must still be
located when the key is not present. The second and third steps
follow the algorithm in Amble and Knuth [1] with the addition that
the values of the $A$ array must be re-calculated over the shifted
memory locations and the $C$ but not the $V$ bits must be moved with
the keys. Details of this can be found in an earlier draft of this
paper [4].
.sh "4 Performance"
.pp
Now I consider how long these algorithms will take to run.
The measure of run time that I will use is the number of
.ul
probes
that each algorithm makes, that is, the number of times locations in
the hash table are examined or updated. CPU time measures were taken
as well and correlate well with the empirical counts of probes given
below.
.FC "Table I"
.FC "Table II"
.rh "Searching."
Tables I and II list the results of simulations for successful and
unsuccessful searches respectively. Results are tabulated for
ordinary BLP and for compact hashing with different memory loadings
and different sizes for the $A$ field. If the number of keys stored
in the memory is $N$ then the memory loading is measured by
$alpha ~==~ N/M$, the fraction of locations in the memory which are
full. Values of Na were chosen to correspond to $A$ field lengths of
1, 2, 3, 4, and 5 bits, that is for Na equal to 0, 1, 3, 7, and 15
respectively, and also for the case where no $A$ field was used.
Increasing the size of the $A$ field beyond 5 bits had no effect at
the memory loadings investigated, so Na equal to 15 is effectively
the same as an unbounded size for the $A$ values.
.pp
The insertion procedure is guaranteed to be optimum only for BLP, not
for compact hashing. If none of the values in $A$ is $inf$ then the
sequence of locations examined by compact hashing is the same as for
BLP and so the strategy will still be optimum. (This is easily seen
by noting that in compact hashing $A[h(t(K))]$ determines the
direction of the search depending on whether it is positive or
negative. During the subsequent search no locations past the sought
key will be probed.
This is exactly the same probing behaviour as in BLP.)\ However, if
no $A$ array is being used, or if some values of $A$ are $inf$, then
extra locations need to be probed to find an empty location or one
which is not equal to $inf$.
.pp
As expected, the figures in Table I show that for Na at 15 and using
optimum insertion the probes for a successful search are almost the
same as for BLP. (The small differences are accounted for by
statistical fluctuations in the simulation results.)
.pp
As Na is decreased the number of probes needed for searching
increases. This reflects the greater distances that must be traversed
to find a value of $A$ not equal to $inf$. It is notable, however,
that even a single bit allocated to the $A$ fields dramatically
improves the performance. Even at a memory density of 0.95 some 25%
of non-empty locations have $A$ values of 0.
.pp
The pattern for unsuccessful searches is broadly the same as sketched
above for successful searches, except that in general unsuccessful
searches are quicker than successful ones. This is a result of the
$V$ bits, which allow many unsuccessful searches to be stopped after
a single probe. For example, even at the maximum possible memory
density of 1 some 36% of $V$ bits are zero. This results in compact
hashing being faster than the reported values for ordinary BLP.
However, unsuccessful BLP searches could be improved to a similar
degree by incorporating $V$ bits.
.FC "Table III"
.rh "Insertion."
The probes to insert a new key can be broken down into three
components: those needed to locate the position where the key is to
be inserted, those to decide the direction of movement, and those to
effect the movement of the memory. The first of these will be
slightly larger than for a successful search and so the results of
Table I have not been repeated. The other two are independent of Na
as they depend only on the lengths of blocks of adjacent non-empty
locations.
The values for these Na-independent components are listed in Table
III. In most cases this Na-independent component is much larger than
the search component. The exception occurs where no $A$ values are
being used, when the two components are comparable.
.pp
Cleary [5] examines a random insertion strategy for ordinary BLP
where blocks of entries in the hash table are moved in a randomly
chosen direction to accommodate a new entry rather than in the
optimum way described by Amble and Knuth [1]. It is shown that this
strategy can improve insertion times by a factor of 4 at the expense
of small degradations (at most 15%) in retrieval times. These results
are shown by simulation to extend to compact hashing. Indeed, for
small values of Na the optimum and random strategies show no
significant differences in retrieval times.
.rh "Analytic approximation."
While analytic results are not available for the number of probes
needed for retrieval or insertion an