COSS notes

Adrian Chadd <adrian@creative.net.au>

$Id: coss-notes.txt,v 1.4 2006/10/26 19:29:38 serassio Exp $


COSS is a Cyclic Object storage system originally designed by
Eric Stern <estern@logisense.com>. The idea has been extended
and worked into the current framework by myself.

In these notes I'll discuss the current implementation of COSS
and note where the implementation differs from Eric's original
idea and why the design changes were made.

COSS basics
-----------

COSS works with a single file. Eventually the file may actually be
a raw disk device, but since squid doesn't cache the disk reads
in memory the OS buffer cache will need to be employed for reasonable
performance. For the purposes of this discussion the COSS storage
device will be referred to as a file.

The file is divided into stripes. Each stripe has a fixed size and a
fixed position in the file. The stripe size is a compile-time option.

As objects are written to a COSS stripe, their place is pre-reserved
and data is copied into a memory copy of the stripe. Because of this,
the object size must be known before it can be stored in a COSS
filesystem. (Hence the max-size requirement with a coss cache_dir.)

When a stripe is filled, the stripe is written to disk, and a new
memory stripe is created.

When objects are read back from the COSS file, they can either come
from a stripe in-memory (the current one, or one being written),
or from the disk. If the object is still in a memory stripe, then
it is copied from memory rather than read off disk.

If an object is read from disk, it is re-written to the head of
the current stripe (just as if it were a new object.)
This is required for correct operation of the replacement policy,
detailed below.

When the entire COSS file is full, the current stripe again becomes the
first stripe in the file, and the objects in that stripe are released.
Since the objects on disk are kept in strict LRU order, mirroring the
replacement policy LRU list that links the StoreEntry's together, this
simply involves walking the tail of the LRU and freeing entries until
we hit an entry in the next stripe.

COSS implementation details
---------------------------

* The stripe size is fixed. In the original COSS code, Eric optimised
  this a little by allowing the stripes to be truncated to not waste
  disk space at the end of the stripe. This was removed to simplify
  the allocation code slightly and make things easier when the store
  log and checksums are combined in the stripe for faster rebuilds.

* COSS currently copies object memory around WAY too much. This needs
  to be fixed eventually.

* It would be nice if the storeRead() interface were a little smarter
  and allowed the filesystem to return as much of an object as possible.
  This would be good for COSS since the read from disk could be
  simplified to use a single OS read() call - this would work really
  well for the object types COSS is designed to cache.

* The original coss code used file_read() and file_write() for disk IO.
  The file_* routines were initially used to implement async disk IO,
  and Eric probably wrote some async disk code for Windows. I've written
  a very very simple async_io.c module which uses POSIX AIO to
  implement the async IO. POSIX AIO is well-suited to the disk IO
  COSS performs.

COSS direction
--------------

Eventually, when more of squid is rewritten, I'm going to replace
the replacement policy with something a little more flexible.
A shortcut would be to use a slab allocator and have one slab per
stripe for the StoreEntry's. When it comes time to replace a stripe,
you can just treat the stripe as an array.
This would not work well in the current squid codebase, but it would
work well in the planned rewrite. This would also allow alternate
replacement policies to be used. Oh, it'd cut down the storage
requirements per StoreEntry by two pointers (8 bytes on the i386.)

Notes by DW July 23, 2003
-------------------------

Fixed up swap_filen -> offset implementation. Now user can use a
block-size setting to determine the maximum COSS cache_dir size.

Fixed bug when cached response is larger than COSS stripe size.
Now require max-size to be less than COSS_MEMBUF_SZ.

Fixed a lockcount bug. Some aborted requests for cache hits failed
to unlock the CossMemBuf because storeCossReadDone isn't called again.
Solution is to add locked_membuf pointer to CossState structure and
always unlock it if set. This is probably more reliable than
unlocking based on diskstart/diskend offsets.

I'm worried that COSS is susceptible to a denial-of-service. If
the user can create N cache misses for responses about as large as
COSS_MEMBUF_SZ, then COSS probably allocates N membufs (stripes)
at the same time. For large enough values of N, this should cause
a malloc failure. Solution may be to refuse to allocate new stripes
(thus returning failure for cache misses and hits) after so many
have already been allocated.

Adrian's code has this comment:

	/* Since we're not supporting NOTIFY anymore, lets fail */
	assert(which != COSS_ALLOC_NOTIFY);

However, COSS_ALLOC_NOTIFY was still present in the store_dir_coss.c
rebuild routines. To avoid assertions during rebuild, I commented
out the storeCossAllocate(SD, e, COSS_ALLOC_NOTIFY) call.

-- Notes: Adrian Chadd, 9/May/2006

* The types used by COSS have been modified to support large file
  support, at least under Linux. One can compile with --with-large-files
  to make sure the right options have been enabled. No compile or
  run-time checks are currently made to ensure the code has been
  compiled to support large filesystems, at least not yet.

-- Notes: Guido Serassio, 26/October/2006

* When using a regular file as container, COSS storage must be
  initialized once using squid -z like UFS storage.
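
Illustrative sketch (not part of the Squid source)
--------------------------------------------------

The allocation scheme described under "COSS basics" can be sketched
roughly as below. This is a minimal illustration under stated
assumptions, not the code in store_dir_coss.c: the names coss_sketch,
coss_init, coss_next_stripe and coss_allocate, and the sizes STRIPE_SZ
and NUM_STRIPES, are all made up for this example. It shows space being
pre-reserved in a memory stripe, a move to the next stripe when the
current one cannot hold the object, and the wrap back to the first
stripe when the file is full. Disk IO and the release of objects in
the reused stripe are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical sizes, far smaller than a real COSS stripe. */
#define STRIPE_SZ   1024
#define NUM_STRIPES 4

/* Hypothetical state; not Squid's CossMemBuf or CossInfo. */
struct coss_sketch {
    char membuf[STRIPE_SZ]; /* memory copy of the current stripe */
    size_t stripe;          /* index of the current stripe in the file */
    size_t used;            /* bytes already reserved in this stripe */
    size_t flushes;         /* stripes handed off to be written to disk */
    size_t wraps;           /* times the allocator returned to stripe 0 */
};

void coss_init(struct coss_sketch *c)
{
    assert(c != NULL);
    memset(c, 0, sizeof(*c));
}

/*
 * Finish the current stripe and start the next one. Real code would
 * queue the full stripe for a disk write here and, after a wrap,
 * release the objects that live in the stripe about to be reused.
 */
void coss_next_stripe(struct coss_sketch *c)
{
    c->flushes++;
    c->stripe = (c->stripe + 1) % NUM_STRIPES;
    if (c->stripe == 0)
        c->wraps++;
    c->used = 0;
    memset(c->membuf, 0, sizeof(c->membuf));
}

/*
 * Reserve `size` bytes for an object whose size is known up front.
 * Returns the object's byte offset in the file, or -1 if the object
 * is not smaller than a stripe (compare the max-size requirement).
 */
long coss_allocate(struct coss_sketch *c, size_t size)
{
    long offset;

    if (size == 0 || size >= STRIPE_SZ)
        return -1;
    if (c->used + size > STRIPE_SZ)
        coss_next_stripe(c);
    offset = (long)(c->stripe * STRIPE_SZ + c->used);
    c->used += size;
    return offset;
}
```

With these sizes, two 600-byte objects cannot share a stripe, so the
second one starts the next stripe. After the fourth stripe fills, the
next allocation lands back at offset 0.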