a priority higher than normal, that is, higher than user threads.
It sleeps most of the time, and wakes up, say, every 300 milliseconds,
to check whether there is anything happening in the server which
requires intervention of the master thread. Such situations may be,
for example, when flushing of dirty blocks is needed in the buffer
pool or old versions of database rows have to be cleaned away.

The threads which we call user threads serve the queries of
the clients and input from the console of the server.
They run at normal priority. The server may have several
communications endpoints. A dedicated set of user threads waits
at each of these endpoints ready to receive a client request.
Each request is taken by a single user thread, which then starts
processing and, when the result is ready, sends it to the client
and returns to wait at the same endpoint the thread started from.

So, we do not have dedicated communication threads listening at
the endpoints and dealing out the jobs to dedicated worker threads.
Our architecture saves one thread switch per request, compared
to the solution with dedicated communication threads,
which amounts to 15 microseconds on a 100 MHz Pentium
running NT. If the client
is communicating over a network, this saving is negligible, but
if the client resides in the same machine, maybe in an SMP machine
on a different processor from the server thread, the saving
can be important as the threads can communicate over shared
memory with an overhead of a few microseconds.

We may later implement a dedicated communication thread solution
for those endpoints which communicate over a network.

Our solution with user threads has two problems: for each endpoint
there has to be a number of listening threads. If there are many
communication endpoints, it may be difficult to set the right number
of concurrent threads in the system, as many of the threads
may always be waiting at less busy endpoints.
Another problem is queuing of the messages, as the server internally
does not offer any queue for jobs.

Another group of user threads is intended for splitting the
queries and processing them in parallel. Let us call these
parallel communication threads. These threads are waiting for
parallelized tasks, suspended on event semaphores.

A single user thread waits for input from the console,
like a command to shut the database.

Utility threads are a different group of threads which take
care of the buffer pool flushing and other, mainly background,
operations in the server.
Some of these utility threads always run at a lower than normal
priority, so that they are always in background. Some of them
may dynamically boost their priority by the pri_adjust function,
even to higher than normal priority, if their task becomes urgent.
The running of utilities is controlled by high- and low-water marks
of urgency. The urgency may be measured by the number of dirty blocks
in the buffer pool, in the case of the flush thread, for example.
When the high-water mark is exceeded, a utility starts running, until
the urgency drops under the low-water mark. Then the utility thread
suspends itself to wait for an event. The master thread is
responsible for signaling this event when the utility thread is
again needed.

For each individual type of utility, some threads always remain
at lower than normal priority. This is because pri_adjust is implemented
so that the threads at normal or higher priority control their
share of running time by calling sleep. Thus, if the load of the
system suddenly drops, these threads cannot necessarily utilize
the system fully. The background priority threads make up for this,
starting to run when the load drops.

When there is no activity in the system, the master thread also
suspends itself to wait for an event, making
the server totally silent.
The responsibility to signal this
event is on the user thread which again receives a message
from a client.

There is still one complication in our server design. If a
background utility thread obtains a resource (e.g., mutex) needed by a user
thread, and there is also some other user activity in the system,
the user thread may have to wait indefinitely long for the
resource, as the OS does not schedule a background thread if
there is some other runnable user thread. This problem is called
priority inversion in real-time programming.

One solution to the priority inversion problem would be to
keep record of which thread owns which resource and
in the above case boost the priority of the background thread
so that it will be scheduled and it can release the resource.
This solution is called priority inheritance in real-time programming.
A drawback of this solution is that the overhead of acquiring a mutex
increases slightly, maybe 0.2 microseconds on a 100 MHz Pentium, because
the thread has to call os_thread_get_curr_id.
This may be compared to 0.5 microsecond overhead for a mutex lock-unlock
pair. Note that the thread
cannot store the information in the resource, say mutex, itself,
because competing threads could wipe out the information if it is
stored before acquiring the mutex, and if it is stored afterwards,
the information is outdated for the time of one machine instruction,
at least. (To be precise, the information could be stored to
lock_word in mutex if the machine supports atomic swap.)

The above solution with priority inheritance may become relevant in the
future, but at the moment we plan to implement a coarser solution,
which could be called a global priority inheritance. If a thread
has to wait for a long time, say 300 milliseconds, for a resource,
we just guess that it may be waiting for a resource owned by a background
thread, and boost the priority of all runnable background threads
to the normal level.
The background threads then themselves adjust
their fixed priority back to background after releasing all resources
they had (or, at some fixed points in their program code).

What is the performance of the global priority inheritance solution?
We may weigh the length of the wait time, 300 milliseconds, during
which the system processes some other thread,
against the cost of boosting the priority of each runnable background
thread, rescheduling it, and lowering the priority again.
On 100 MHz Pentium + NT this overhead may be of the order 100
microseconds per thread. So, if the number of runnable background
threads is not very big, say < 100, the cost is tolerable.
Utility threads probably will access resources used by
user threads not very often, so collisions of user threads
with preempted utility threads should not happen very often.

The thread table contains
information of the current status of each thread existing in the system,
and also the event semaphores used in suspending the master thread
and utility and parallel communication threads when they have nothing to do.
The thread table can be seen as an analogue to the process table
in a traditional Unix implementation.

The thread table is also used in the global priority inheritance
scheme. This brings in one additional complication: threads accessing
the thread table must have at least normal fixed priority,
because the priority inheritance solution does not work if a background
thread is preempted while possessing the mutex protecting the thread table.
So, if a thread accesses the thread table, its priority has to be
boosted at least to normal. This priority requirement can be seen similar to
the privileged mode used when processing the kernel calls in traditional
Unix.*/

/* Thread slot in the thread table */
struct srv_slot_struct{
	os_thread_id_t	id;		/* thread id */
	os_thread_t	handle;		/* thread handle */
	ulint		type;		/* thread type: user, utility etc. */
	ibool		in_use;		/* TRUE if this slot is in use */
	ibool		suspended;	/* TRUE if the thread is waiting
					for the event of this slot */
	ib_time_t	suspend_time;	/* time when the thread was
					suspended */
	os_event_t	event;		/* event used in suspending the
					thread when it has nothing to do */
	que_thr_t*	thr;		/* suspended query thread (only
					used for MySQL threads) */
};

/* Table for MySQL threads where they will be suspended to wait for locks */
srv_slot_t*	srv_mysql_table = NULL;

os_event_t	srv_lock_timeout_thread_event;

srv_sys_t*	srv_sys	= NULL;

byte		srv_pad1[64];	/* padding to prevent other memory update
				hotspots from residing on the same memory
				cache line */
mutex_t*	kernel_mutex_temp;	/* mutex protecting the server, trx
				structs, query threads, and lock table */
byte		srv_pad2[64];	/* padding to prevent other memory update
				hotspots from residing on the same memory
				cache line */

/* The following three values measure the urgency of the jobs of
buffer, version, and insert threads. They may vary from 0 - 1000.
The server mutex protects all these variables. The low-water values
tell that the server can quiesce the utility when the value
drops below this low-water mark. */

ulint	srv_meter[SRV_MASTER + 1];
ulint	srv_meter_low_water[SRV_MASTER + 1];
ulint	srv_meter_high_water[SRV_MASTER + 1];
ulint	srv_meter_high_water2[SRV_MASTER + 1];
ulint	srv_meter_foreground[SRV_MASTER + 1];

/* The following values give info about the activity going on in
the database. They are protected by the server mutex. The arrays
are indexed by the type of the thread. */

ulint	srv_n_threads_active[SRV_MASTER + 1];
ulint	srv_n_threads[SRV_MASTER + 1];

/*************************************************************************
Sets the info describing an i/o thread's current state.
*/
void
srv_set_io_thread_op_info(
/*======================*/
	ulint		i,	/* in: the 'segment' of the i/o thread */
	const char*	str)	/* in: constant char string describing the
				state */
{
	ut_a(i < SRV_MAX_N_IO_THREADS);

	srv_io_thread_op_info[i] = str;
}

/*************************************************************************
Accessor function to get pointer to n'th slot in the server thread
table. */
static
srv_slot_t*
srv_table_get_nth_slot(
/*===================*/
				/* out: pointer to the slot */
	ulint	index)		/* in: index of the slot */
{
	ut_a(index < OS_THREAD_MAX_N);

	return(srv_sys->threads + index);
}

/*************************************************************************
Gets the number of threads in the system. */
ulint
srv_get_n_threads(void)
/*===================*/
{
	ulint	i;
	ulint	n_threads	= 0;

	mutex_enter(&kernel_mutex);

	for (i = SRV_COM; i < SRV_MASTER + 1; i++) {

		n_threads += srv_n_threads[i];
	}

	mutex_exit(&kernel_mutex);

	return(n_threads);
}

/*************************************************************************
Reserves a slot in the thread table for the current thread. Also creates the
thread local storage struct for the current thread. NOTE! The server mutex
has to be reserved by the caller! */
static
ulint
srv_table_reserve_slot(
/*===================*/
			/* out: reserved slot index */
	ulint	type)	/* in: type of the thread: one of SRV_COM, ... */
{
	srv_slot_t*	slot;
	ulint		i;

	ut_a(type > 0);
	ut_a(type <= SRV_MASTER);

	i = 0;
	slot = srv_table_get_nth_slot(i);

	while (slot->in_use) {
		i++;
		slot = srv_table_get_nth_slot(i);
	}

	ut_a(slot->in_use == FALSE);

	slot->in_use = TRUE;
	slot->suspended = FALSE;
	slot->id = os_thread_get_curr_id();
	slot->handle = os_thread_get_curr();
	slot->type = type;

	thr_local_create();

	thr_local_set_slot_no(os_thread_get_curr_id(), i);

	return(i);
}

/*************************************************************************
Suspends the calling thread to wait for the event in its thread slot.
NOTE! The server mutex has to be reserved by the caller!
*/
static
os_event_t
srv_suspend_thread(void)
/*====================*/
			/* out: event for the calling thread to wait */
{
	srv_slot_t*	slot;
	os_event_t	event;
	ulint		slot_no;
	ulint		type;

#ifdef UNIV_SYNC_DEBUG
	ut_ad(mutex_own(&kernel_mutex));
#endif /* UNIV_SYNC_DEBUG */

	slot_no = thr_local_get_slot_no(os_thread_get_curr_id());

	if (srv_print_thread_releases) {
		fprintf(stderr,
			"Suspending thread %lu to slot %lu meter %lu\n",
			(ulong) os_thread_get_curr_id(),
			(ulong) slot_no,
			(ulong) srv_meter[SRV_RECOVERY]);
	}

	slot = srv_table_get_nth_slot(slot_no);

	type = slot->type;

	ut_ad(type >= SRV_WORKER);
	ut_ad(type <= SRV_MASTER);

	event = slot->event;

	slot->suspended = TRUE;

	ut_ad(srv_n_threads_active[type] > 0);

	srv_n_threads_active[type]--;

	os_event_reset(event);

	return(event);
}

/*************************************************************************
Releases threads of the type given from suspension in the thread table.
NOTE! The server mutex has to be reserved by the caller! */
ulint
srv_release_threads(
/*================*/
			/* out: number of threads released: this may be
			< n if not enough threads were suspended at the
			moment */
	ulint	type,	/* in: thread type */
	ulint	n)	/* in: number of threads to release */
{
	srv_slot_t*	slot;
	ulint		i;
	ulint		count	= 0;

	ut_ad(type >= SRV_WORKER);
	ut_ad(type <= SRV_MASTER);
	ut_ad(n > 0);
#ifdef UNIV_SYNC_DEBUG
	ut_ad(mutex_own(&kernel_mutex));
#endif /* UNIV_SYNC_DEBUG */

	for (i = 0; i < OS_THREAD_MAX_N; i++) {

		slot = srv_table_get_nth_slot(i);

		if (slot->in_use && slot->type == type && slot->suspended) {

			slot->suspended = FALSE;

			srv_n_threads_active[type]++;

			os_event_set(slot->event);

			if (srv_print_thread_releases) {
				fprintf(stderr,
					"Releasing thread %lu type %lu from slot %lu meter %lu\n",
					(ulong) slot->id,
					(ulong) type,
					(ulong) i,
					(ulong) srv_meter[SRV_RECOVERY]);
			}

			count++;

			if (count == n) {
				break;
			}
		}
	}

	return(count);
}

/*************************************************************************
Returns the calling thread type.
*/
ulint
srv_get_thread_type(void)
/*=====================*/
			/* out: SRV_COM, ... */
{
	ulint		slot_no;
	srv_slot_t*	slot;
	ulint		type;

	mutex_enter(&kernel_mutex);

	slot_no = thr_local_get_slot_no(os_thread_get_curr_id());

	slot = srv_table_get_nth_slot(slot_no);

	type = slot->type;

	ut_ad(type >= SRV_WORKER);
	ut_ad(type <= SRV_MASTER);

	mutex_exit(&kernel_mutex);

	return(type);
}

/*************************************************************************
Initializes the server. */
void
srv_init(void)
/*==========*/
{
	srv_conc_slot_t*	conc_slot;
	srv_slot_t*		slot;
	dict_table_t*		table;
	ulint			i;

	srv_sys = mem_alloc(sizeof(srv_sys_t));

	kernel_mutex_temp = mem_alloc(sizeof(mutex_t));
	mutex_create(&kernel_mutex);
	mutex_set_level(&kernel_mutex, SYNC_KERNEL);

	mutex_create(&srv_innodb_monitor_mutex);
	mutex_set_level(&srv_innodb_monitor_mutex, SYNC_NO_ORDER_CHECK);

	srv_sys->threads = mem_alloc(OS_THREAD_MAX_N * sizeof(srv_slot_t));

	for (i = 0; i < OS_THREAD_MAX_N; i++) {
		slot = srv_table_get_nth_slot(i);
		slot->in_use = FALSE;
		slot->type = 0;	/* Avoid purify errors */
		slot->event = os_event_create(NULL);
		ut_a(slot->event);
	}