Scheduler BUG 6-11-02:

The RTLinux scheduler in version 3.1 is re-entered whenever it suspends a thread. The procedure is the following. To suspend a thread other than pthread_self() (with pthread_suspend_np()), it sends a signal (RTL_SIGNAL_SUSPEND) to that thread. Because the thread now has a pending signal, the scheduler gives it the CPU. The handler for that signal marks the thread as suspended and calls the scheduler again. If this is repeated, as in the example sched_bug.c, every repetition pushes another call to the scheduler onto the thread's stack. If the thread is suspended and nobody wakes it up (blocked on a mutex, calling pthread_suspend_np, etc.), the stack is exhausted in finite time. In version 3.0 the scheduler was not re-entrant and this error does not occur. A good test is to change the stack size and observe that the number of iterations grows in proportion to the stack increment. The same problem occurs with user signal handlers.

Proposed solution:

The solution was to return to the scheduler policy of version 3.0, which is non-re-entrant.

Mutex BUG 4-12-02:

The fact is that do_signal, when it receives the signal RTL_SIGNAL_SUSPEND, sets t->abort to zero, so when it then calls do_abort(t) the call has no effect. In our example, the blocked thread was blocked on a mutex. Meanwhile, the mutex owner thread sends signals to it (pthread_suspend_np, pthread_wakeup_np, pthread_wakeup_np). What happens is the following:

1.- The blocked thread blocks on the mutex (calling rtl_wait_sleep in the pthread_mutex_lock loop) and is queued in the mutex wait queue.
2.- At this point, the mutex owner sends the following signals:
  2.1.- RTL_SIGNAL_WAKEUP: the blocked thread takes the CPU to handle it. It calls do_abort, and rtl_wait_abort takes it out of the mutex wait queue. It then loops again, calls rtl_wait_sleep and is queued on the mutex wait queue once more.
  2.2.- Next, the blocked thread receives the signal RTL_SIGNAL_SUSPEND, which sets t->abort to zero, marks the blocked thread as suspended and calls the scheduler.
  2.3.- Finally, the blocked thread receives RTL_SIGNAL_WAKEUP again. This time the call to do_abort has no effect (because RTL_SIGNAL_SUSPEND set t->abort to zero), so the blocked thread is not removed from the mutex wait queue. The code nevertheless continues (rtl_wait_sleep returns), loops in the pthread_mutex_lock loop and calls rtl_wait_sleep again. This function queues the blocked thread in the mutex wait queue, where it was already queued (not exactly the same entry, since the queued struct is local to rtl_wait_sleep). At this point the mutex wait queue contains the following: head -> thread1 -> head. And here is the bug: we get a doubly linked wait queue where the head and the tail store the same waiter (the blocked thread).
4.- When thread 0 calls pthread_mutex_unlock and walks the mutex wait queue to wake up the blocked threads, it enters an infinite loop and the user loses control of the machine (Linux never gets to run).

Proposed solution:

Possibly, setting t->abort to zero when handling RTL_SIGNAL_SUSPEND is there for future compatibility, or it is a simple mistake. At present the abort field of the thread structure is used only for mutexes and semaphores. One solution is not to set it to zero and to execute do_abort when handling the RTL_SIGNAL_SUSPEND signal.

timespec_add_ns BUG 22-11-02:

The macro timespec_add_ns available in include/rtl_time.h is implemented as:

    #define old_timespec_add_ns(t,n) do { \
        (t)->tv_nsec += n; \
        timespec_normalize(t); \
    } while (0)

and timespec_normalize is implemented as:

    #define timespec_normalize(t) {\
        if ((t)->tv_nsec >= NSECS_PER_SEC) { \
            (t)->tv_nsec -= NSECS_PER_SEC; \
            (t)->tv_sec++; \
        } else if ((t)->tv_nsec < 0) { \
            (t)->tv_nsec += NSECS_PER_SEC; \
            (t)->tv_sec--; \
        } \
    }

What happens if the result of (t)->tv_nsec += n; is bigger than two seconds? timespec_normalize adjusts tv_nsec by at most one second.
Clearly, this leads to an invalid time specification whose tv_nsec field holds a value larger than NSECS_PER_SEC (1000*1000*1000). Overflow can also occur if the result reaches 2^31 (2147483648).

The alternative solution is to implement timespec_add_ns as:

    #define TWOSECONDS (NSECS_PER_SEC*2)
    #define timespec_add_ns(t,n) do { \
        long long aux=(t)->tv_nsec+(n);\
        \
        if ((aux > TWOSECONDS) || (aux < -TWOSECONDS)) /*check overflow*/ {\
            (t)->tv_nsec +=((n) % NSECS_PER_SEC) ; \
            (t)->tv_sec += ((n) / NSECS_PER_SEC); \
        } else { (t)->tv_nsec=aux; }\
        \
        timespec_normalize(t); \
    } while (0)

The file timespec_add_ns in the directory examples/bug tests both implementations.
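
The failure of the original macro can be reproduced outside RTLinux. The following is a minimal sketch, assuming a hosted C compiler and the standard struct timespec from <time.h>. It copies old_timespec_add_ns and timespec_normalize as quoted above, and adds two helpers that are hypothetical and not part of RTLinux: add_ns_checked, which splits the sum with 64-bit division instead of relying on a single normalization step, and timespec_is_valid, which checks the tv_nsec range.

```c
#include <assert.h>
#include <time.h>

#define NSECS_PER_SEC 1000000000L

/* Single-step normalization, copied from the text above. */
#define timespec_normalize(t) { \
    if ((t)->tv_nsec >= NSECS_PER_SEC) { \
        (t)->tv_nsec -= NSECS_PER_SEC; \
        (t)->tv_sec++; \
    } else if ((t)->tv_nsec < 0) { \
        (t)->tv_nsec += NSECS_PER_SEC; \
        (t)->tv_sec--; \
    } \
}

/* Original macro: only correct while the sum stays below two seconds. */
#define old_timespec_add_ns(t, n) do { \
    (t)->tv_nsec += n; \
    timespec_normalize(t); \
} while (0)

/* Hypothetical helper (not part of RTLinux): add n nanoseconds,
 * widening to 64 bits first and carrying whole seconds by division. */
static inline void add_ns_checked(struct timespec *t, long long n)
{
    long long total = (long long)t->tv_nsec + n;
    long long secs = total / NSECS_PER_SEC;
    long long rem = total % NSECS_PER_SEC;

    /* C division truncates toward zero, so fix up a negative remainder. */
    if (rem < 0) {
        rem += NSECS_PER_SEC;
        secs--;
    }
    t->tv_sec += (time_t)secs;
    t->tv_nsec = (long)rem;
}

/* Illustrative check: tv_nsec must lie in [0, NSECS_PER_SEC). */
static inline int timespec_is_valid(const struct timespec *t)
{
    return t->tv_nsec >= 0 && t->tv_nsec < NSECS_PER_SEC;
}
```

Starting from tv_sec = 0 and tv_nsec = 100000000 and adding n = 2000000000, old_timespec_add_ns leaves tv_nsec at 1100000000, which timespec_is_valid rejects, while add_ns_checked yields tv_sec = 2 and tv_nsec = 100000000. Using division rather than a threshold test is a design choice of this sketch: it treats every magnitude of n the same way, at the cost of a 64-bit division on each call.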