Multi-threaded Ruby programs gain no real advantage from running on multi-core CPUs.
This is a study of the internals of Ruby 1.9.1-p243 concerning its threading implementation. I tried several tools for source code investigation, starting with grep, but KScope turned out to be the most helpful to my taste.
In this article, "Ruby" generally refers to the Ruby 1.9.1 C implementation. Implementation details and concurrency limitations may change in future versions, according to the developers' warnings in code comments and README files.
The things I wanted to know were:
- How does the Ruby interpreter ensure that only one thread can be in a runnable state?
- How are threads scheduled so that several threads appear to run simultaneously?
- Why does the 1.9 interpreter use two native threads for a single-threaded Ruby program?
- How can threads in C/C++ (or other languages) interact with Ruby code?
The thread implementation code is located, unsurprisingly, in the thread*.c source files. The Ruby interpreter uses a global mutex object to limit simultaneous thread execution. This technique, known as a Global Interpreter Lock (GIL), is also used in implementations of other interpreted languages (e.g. Python). In Ruby 1.9 the term is GVL: Global VM Lock or Giant VM Lock, according to source code comments.
The GVL is accessed through the th->vm->global_vm_lock expression, where th is a pointer to any valid rb_thread_t, the interpreter's internal thread structure.
At the C level of the Ruby implementation we can distinguish two types of threads:
- Regular Ruby threads, which at the higher level appear as instances of Ruby's Thread class. At the interpreter level each has an rb_thread_t structure associated with it, alongside its VALUE data.
- Low-level threads, which I will call C threads. They are not visible at the Ruby level at all.
The Ruby architecture imposes different limitations on the behavior of Ruby threads and C threads. The main limitation, exclusive execution, applies to Ruby threads only.
There is no such restriction on C threads. So an extension for Ruby written in C may actually run more than one thread at the same time, provided they are low-level C threads.
On the other hand, only Ruby threads are allowed to deal with Ruby objects. No C thread may touch, create, destroy or return any data related to the Ruby object system or the interpreter implementation. There may be some exceptions to this rule, but they require more research and deeper understanding than I can offer here.
Typical usage of C threads in a Ruby extension requires that all data on which a C thread operates be converted from Ruby format to custom C structures before it is passed to the C thread for processing. All results are then converted back to Ruby VALUEs by code executed in a Ruby thread only. C threads could, of course, be started indirectly by a third-party library, e.g. for database or network access. The same restrictions apply.
The GVL variable is of type rb_thread_lock_t, which is a typedef for the platform mutex type. It is initialized and acquired by the main thread of the Ruby interpreter, which is also the main Thread of the interpreted code. If there is more than one Ruby thread, at most one of them holds the GVL, and the other threads can be in one of the following states:
- Waiting in an acquisition call for the GVL to become free.
- Waiting for a blocking operation to finish: I/O, or a wait on a semaphore other than the GVL.
- Being initialized as a new thread. Before the GVL can be acquired, some internal structures must be set up, memory allocated, etc. Everything is done at the C level, without touching any Ruby objects.
- Executing some system-level calls after finishing a blocking operation and before attempting to acquire the GVL. This is a brief moment when more than one "Ruby" thread may technically execute in parallel, but no Ruby object is touched during such a quick intermission.
When a Ruby thread makes a potentially blocking library call, it always releases the GVL and re-acquires it after the call finishes. Note that it is perfectly normal for no Ruby thread to own the GVL at a given time; in that case every thread is simply blocked on some other resource.
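A rough way to see this from Ruby code is to let several threads block at once and time the whole run. The sketch below assumes that sleep counts as a blocking call that releases the GVL; the helper name elapsed_for_sleepers is my own and not part of Ruby.

```ruby
# Hypothetical helper: run `count` threads that each sleep for `seconds`
# and return the total wall-clock time of the whole run in seconds.
def elapsed_for_sleepers(count, seconds)
  started = Time.now
  threads = Array.new(count) { Thread.new { sleep(seconds) } }
  threads.each(&:join)   # wait until every sleeper has finished
  Time.now - started
end

elapsed = elapsed_for_sleepers(4, 0.2)
puts format("4 threads x 0.2 s of sleep took %.2f s in total", elapsed)
```

If the sleeps were serialized by the GVL, the total would be close to the sum of the individual sleeps; if the lock is released while blocking, it should stay close to a single sleep.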
Heavy use of the GVL in the Ruby interpreter makes OS-level thread preemption virtually insignificant for the scheduling of Ruby threads. At the interpreter level Ruby threads are de facto cooperative, because they give up GVL ownership voluntarily. But this cooperation is not visible at the level of Ruby code. To a Ruby program the scheduling appears preemptive; no special coding is required from the Ruby programmer to give all Ruby threads a fair chance to run. Even threads that perform heavy calculations without any I/O are regularly rescheduled by the interpreter. In other words, the GVL is transparent from the Ruby programmer's point of view. The only effect of the Ruby 1.9 threading architecture that a Ruby programmer may observe is under-utilization of multiple cores on a modern processor.
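The fairness claim can be checked with a small sketch: a few threads that only spin on the CPU, never doing I/O and never calling Thread.pass. The helper name spin_counts is hypothetical, and the exact counts depend on the interpreter version and the machine.

```ruby
# Hypothetical helper: start `count` CPU-bound threads that spin until a
# shared deadline and return how many loop iterations each one completed.
def spin_counts(count, seconds)
  deadline = Time.now + seconds
  counters = Array.new(count, 0)
  threads = Array.new(count) do |i|
    Thread.new do
      # Pure computation: no I/O, no sleep, no explicit Thread.pass.
      counters[i] += 1 while Time.now < deadline
    end
  end
  threads.each(&:join)
  counters
end

p spin_counts(2, 0.5)
```

Every counter should end up above zero, which suggests that each thread was given a share of the run even though none of them ever yielded explicitly.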
Effective thread scheduling implies regular switching between threads. In the Ruby interpreter this is achieved by calling the rb_thread_schedule function periodically.
    void
    rb_thread_schedule(void)
    {
        thread_debug("rb_thread_schedule\n");
        if (!rb_thread_alone()) {
            rb_thread_t *th = GET_THREAD();

            thread_debug("rb_thread_schedule/switch start\n");

            rb_gc_save_machine_context(th);
            native_mutex_unlock(&th->vm->global_vm_lock);
            {
                native_thread_yield();
            }
            native_mutex_lock(&th->vm->global_vm_lock);

            rb_thread_set_current(th);
            thread_debug("rb_thread_schedule/switch done\n");

            RUBY_VM_CHECK_INTS();
        }
    }
The rb_thread_schedule function is heavily used in the Ruby implementation. For example, the Thread.pass class method simply calls this function.
To create the illusion of preemption in Ruby programs, the interpreter ensures that rb_thread_schedule is called on a regular basis. This is implemented with the help of a timer thread and a kind of interrupt mechanism. Every 10 microseconds (by my reading of the Linux source) the timer thread sets the timer interrupt flag. At the end of each Ruby method call, just before the result is returned to the caller, the interpreter checks whether any interrupt flags are set (with the RUBY_VM_CHECK_INTS macro). If they are, rb_thread_execute_interrupts is called. That function remembers the state of the flags and clears them. If the timer interrupt was set, thread scheduling is performed with a call to rb_thread_schedule.
    #define RUBY_VM_CHECK_INTS_TH(th) do { \
        if (UNLIKELY(th->interrupt_flag)) { \
            rb_thread_execute_interrupts(th); \
        } \
    } while (0)

    #define RUBY_VM_CHECK_INTS() \
        RUBY_VM_CHECK_INTS_TH(GET_THREAD())
The timer thread is a background C thread.
It runs an infinite loop: it sleeps for a fixed time and then sets the timer interrupt flag for whatever Ruby thread is considered current at that moment. Although on a single-processor system a Ruby thread cannot actually run in parallel with the timer thread, there is always a current Ruby thread, namely the thread that owns the GVL.
The timer thread calls the following macro:
    RUBY_VM_SET_TIMER_INTERRUPT(vm->running_thread);
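To make the interplay between the timer thread and the flag check easier to picture, here is a toy model written in Ruby itself. It is only an illustration under my own assumptions, not the interpreter's code: the class ToyInterruptModel and its methods are hypothetical, the instance variable stands in for th->interrupt_flag, and Thread.pass stands in for rb_thread_schedule.

```ruby
# Toy model only: the names are hypothetical and merely mirror the roles
# of the timer thread, the interrupt flag and RUBY_VM_CHECK_INTS.
class ToyInterruptModel
  attr_reader :switches

  def initialize(period)
    @period = period          # timer sleep period in seconds
    @interrupt_flag = false   # stands in for th->interrupt_flag
    @switches = 0             # how many times the worker yielded
  end

  # Plays the part of RUBY_VM_CHECK_INTS: act only if the flag is set.
  def check_ints
    return unless @interrupt_flag
    @interrupt_flag = false   # clear the flag, then reschedule
    @switches += 1
    Thread.pass               # stands in for rb_thread_schedule
  end

  # Spin in the calling thread for `seconds` while a background timer
  # thread keeps raising the flag. Returns the number of yields.
  def run(seconds)
    running = true
    timer = Thread.new do
      while running
        sleep(@period)
        @interrupt_flag = true
      end
    end
    deadline = Time.now + seconds
    check_ints while Time.now < deadline   # one check per "method return"
    running = false
    timer.join
    @switches
  end
end

model = ToyInterruptModel.new(0.01)
puts "worker yielded #{model.run(0.5)} times"
```

The point of the design is that the worker never blocks waiting for the timer: it only tests a flag at convenient checkpoints, which is cheap when the flag is clear.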
As for the sleep period of the timer thread, it seems, surprisingly enough, to be platform-dependent: 10 milliseconds on Windows and 10 microseconds on Linux and other *nix-compatible platforms.
- NB: This seems a bit strange. Could I have misunderstood the source? Is there a way to verify the above statement experimentally?
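One rough way to probe this from Ruby code is to let two CPU-bound threads spin and record every unusually long pause between consecutive loop iterations. With two threads, such a pause approximates how long the other thread was allowed to run. This is only a sketch: the helper name observed_pauses and the 1 ms threshold are my own choices, the pauses also include GC and OS scheduling noise, and the result reflects the effective time slice of whatever interpreter runs the script (which may be a multiple of the timer period), so it says something about 1.9.1 only when run under 1.9.1.

```ruby
# Hypothetical helper: spin in `count` threads for `seconds` and collect
# every pause longer than `threshold` seconds between two consecutive
# loop iterations of the same thread.
def observed_pauses(count, seconds, threshold)
  deadline = Time.now + seconds
  threads = Array.new(count) do
    Thread.new do
      pauses = []
      last = Time.now
      while (now = Time.now) < deadline
        gap = now - last
        pauses << gap if gap > threshold   # probably switched out
        last = now
      end
      pauses                               # becomes the thread's value
    end
  end
  threads.map(&:value).flatten
end

pauses = observed_pauses(2, 1.0, 0.001)
if pauses.empty?
  puts "no pauses above the threshold were observed"
else
  sorted = pauses.sort
  median_ms = sorted[sorted.size / 2] * 1000
  puts format("%d pauses, median %.1f ms", sorted.size, median_ms)
end
```

A median in the millisecond range would argue against a 10 microsecond switching interval on that platform, while a much smaller one would not show up at this threshold at all and would call for a lower one.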
The timer thread is started during interpreter initialization. It is worth mentioning that the Ruby interpreter temporarily shuts the timer thread down while starting a new process (with fork, exec, etc.). After the process has started, the timer thread is re-launched.
Great post, thanks a lot. I've been learning a little about the threading model myself, but mainly because it overlaps with signal handling that I was researching. It's great to get a better understanding of it - I'll be looking into it some more soon!
Thanks for this post! I am curious if YARV will really bring any performance advantages and keep the flexibility of Ruby.
Best
Zeno