Started Jan 2000 by Kanoj Sarcar <kanoj@sgi.com>

Memory balancing is needed for non __GFP_WAIT as well as for non
__GFP_IO allocations.

There are two reasons to be requesting non __GFP_WAIT allocations:
the caller can not sleep (typically intr context), or does not want
to incur cost overheads of page stealing and possible swap io for
whatever reasons.

Non __GFP_IO allocation requests are made to prevent file system
deadlocks.

In the absence of non sleepable allocation requests, it seems detrimental
to be doing balancing. Page reclamation can be kicked off lazily, that
is, only when needed (aka zone free memory is 0), instead of making it
a proactive process.

That being said, the kernel should try to fulfill requests for direct
mapped pages from the direct mapped pool, instead of falling back on
the dma pool, so as to keep the dma pool filled for dma requests (atomic
or not). A similar argument applies to highmem and direct mapped pages.
OTOH, if there is a lot of free dma pages, it is preferable to satisfy
regular memory requests by allocating one from the dma pool, instead
of incurring the overhead of regular zone balancing.

In 2.2, memory balancing/page reclamation would kick off only when the
_total_ number of free pages fell below 1/64th of total memory. With the
right ratio of dma and regular memory, it is quite possible that balancing
would not be done even when the dma zone was completely empty. 2.2 has
been running production machines of varying memory sizes, and seems to be
doing fine even with the presence of this problem. In 2.3, due to
HIGHMEM, this problem is aggravated.

In 2.3, zone balancing can be done in one of two ways: depending on the
zone size (and possibly of the size of lower class zones), we can decide
at init time how many free pages we should aim for while balancing any
zone.
The good part is, while balancing, we do not need to look at sizes
of lower class zones, the bad part is, we might do too frequent balancing
due to ignoring possibly lower usage in the lower class zones. Also,
with a slight change in the allocation routine, it is possible to reduce
the memclass() macro to be a simple equality.

Another possible solution is that we balance only when the free memory
of a zone _and_ all its lower class zones falls below 1/64th of the
total memory in the zone and its lower class zones. This fixes the 2.2
balancing problem, and stays as close to 2.2 behavior as possible. Also,
the balancing algorithm works the same way on the various architectures,
which have different numbers and types of zones. If we wanted to get
fancy, we could assign different weights to free pages in different
zones in the future.

Note that if the size of the regular zone is huge compared to dma zone,
it becomes less significant to consider the free dma pages while
deciding whether to balance the regular zone. The first solution
becomes more attractive then.

The appended patch implements the second solution. It also "fixes" two
problems: first, kswapd is woken up as in 2.2 on low memory conditions
for non-sleepable allocations. Second, the HIGHMEM zone is also balanced,
so as to give a fighting chance for replace_with_highmem() to get a
HIGHMEM page, as well as to ensure that HIGHMEM allocations do not
fall back into regular zone. This also makes sure that HIGHMEM pages
are not leaked (for example, in situations where a HIGHMEM page is in
the swapcache but is not being used by anyone).

kswapd also needs to know about the zones it should balance. kswapd is
primarily needed in a situation where balancing can not be done,
probably because all allocation requests are coming from intr context
and all process contexts are sleeping. For 2.3, kswapd does not really
need to balance the highmem zone, since intr context does not request
highmem pages.
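
The second solution described above (balance only when the free memory
of a zone and all its lower class zones falls below 1/64th of their
combined size) can be sketched roughly as follows. This is not the
appended patch: struct zone_sketch and needs_balance() are hypothetical
names, and the array is assumed to be ordered from the lowest class zone
(dma) upwards.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified zone descriptor; not the kernel's zone structure. */
struct zone_sketch {
	unsigned long size;		/* total pages in the zone */
	unsigned long free_pages;	/* currently free pages in the zone */
};

/*
 * zones[] is assumed to be ordered from lowest class (dma) to highest
 * class (highmem). Returns true when zones[0..idx] together hold fewer
 * free pages than 1/64th of their combined size.
 */
static bool needs_balance(const struct zone_sketch *zones, size_t idx)
{
	unsigned long total = 0, free_total = 0;
	size_t i;

	assert(zones != NULL);

	/* Accumulate the zone itself plus every lower class zone. */
	for (i = 0; i <= idx; i++) {
		total += zones[i].size;
		free_total += zones[i].free_pages;
	}

	return free_total < total / 64;
}
```

With a 4096 page dma zone that is completely empty and a 126976 page
regular zone with 5000 pages free, the check for the dma zone alone
reports that balancing is needed, while the check for the regular zone
does not, since 5000 free pages is above 1/64th of the combined 131072
pages.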
kswapd looks at the zone_wake_kswapd field in the zone
structure to decide whether a zone needs balancing.

Page stealing from process memory and shm is done if stealing the page would
alleviate memory pressure on any zone in the page's node that has fallen below
its watermark.

pages_min/pages_low/pages_high/low_on_memory/zone_wake_kswapd: These are
per-zone fields, used to determine when a zone needs to be balanced. When
the number of free pages falls below pages_min, the hysteretic field
low_on_memory gets set. This stays set till the number of free pages becomes
pages_high. When low_on_memory is set, page allocation requests will try to
free some pages in the zone (providing __GFP_WAIT is set in the request).
Orthogonal to this, is the decision to poke kswapd to free some zone pages.
That decision is not hysteresis based, and is done when the number of free
pages is below pages_low; in which case zone_wake_kswapd is also set.


(Good) Ideas that I have heard:
1. Dynamic experience should influence balancing: number of failed requests
for a zone can be tracked and fed into the balancing scheme (jalvo@mbay.net)
2. Implement a replace_with_highmem()-like replace_with_regular() to preserve
dma pages. (lkd@tantalophile.demon.co.uk)
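
The interplay of the per-zone watermark fields described above can be
sketched as a small state update. This is an illustration only, not the
kernel code: struct zone_marks and update_zone_state() are hypothetical
names. The text does not say when zone_wake_kswapd is cleared, so
recomputing it on every update is an assumption of this sketch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-zone state, reusing the field names from the text. */
struct zone_marks {
	unsigned long free_pages;
	unsigned long pages_min, pages_low, pages_high;
	bool low_on_memory;	/* hysteretic flag */
	bool zone_wake_kswapd;	/* plain threshold flag */
};

static void update_zone_state(struct zone_marks *z)
{
	/* The watermarks are assumed to be ordered min <= low <= high. */
	assert(z->pages_min <= z->pages_low && z->pages_low <= z->pages_high);

	/*
	 * Hysteresis: set below pages_min, cleared only once free pages
	 * reach pages_high; unchanged anywhere in between.
	 */
	if (z->free_pages < z->pages_min)
		z->low_on_memory = true;
	else if (z->free_pages >= z->pages_high)
		z->low_on_memory = false;

	/*
	 * No hysteresis: kswapd is poked whenever free pages are below
	 * pages_low. Recomputing the flag each time is an assumption.
	 */
	z->zone_wake_kswapd = z->free_pages < z->pages_low;
}
```

For a zone with pages_min = 10, pages_low = 20 and pages_high = 30,
dropping to 5 free pages sets both flags. Recovering to 25 free pages
clears zone_wake_kswapd in this sketch but leaves low_on_memory set,
which is only cleared once 30 free pages are reached.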