Recently, part of the test suite I was working on was getting stuck in some test. Unfortunately I wasn’t able to find out where or why from the logs. To make things harder I could not find the right spot to place the debugger either. To sum up, I was totally stuck.
Finally I read a nice article about using GDB to inspect a running Ruby process. I thought it would be a piece of cake to get it working, but it turned out to be quite an effort to make GDB and Ruby play nicely together. Here is the story…
The first attempt was quite successful:
    rspec spec/some_spec.rb -b --fail-fast &
    pgrep rspec # ( or ps | grep rspec ) to get the PID
    gdb `which ruby`
    Reading symbols from /Users/mikz/.rvm/rubies/ruby-1.9.3-p374-railsexpress/bin/ruby...done.
    (gdb) attach PID # where PID is the running process id
    call rb_p(rb_eval_string_protect("Kernel.caller",(int*)0))
    Unable to call function "rb_eval_string_protect" at 0x101e594a0: no return type information available.
    To call this function anyway, you can cast the return type explicitly (e.g. 'print (float) fabs (3.0)')
OK. So the debugger “works”. It successfully attaches to the process. Debugging symbols “are there” (sort of). So why is it not working?
My Ruby was compiled with the -O3 flag, so it is probably missing some information gdb needs to work out the return types of functions.
The next step was to find out these types so I could write them by hand. I found The Ruby Cross Reference site, which does a great job of annotating the Ruby source code with links and meanings.
You can just search for an identifier (type, function, whatever) and it will point you at the definition in the header files. For example, rb_eval_string_protect has a return type of VALUE. Hmm. Using VALUE as a type does not work in gdb, probably because of optimization. Clicking VALUE reveals that it is defined as unsigned long or uintptr_t. On OS X gdb does not know about uintptr_t so, by elimination, it is unsigned long. The final version of the gdb macro looks like this:
    define ruby_eval
      call (void)rb_p((unsigned long)rb_eval_string_protect($arg0,(int*)0))
    end
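The cast to unsigned long only makes sense if a native long is as wide as a pointer on your platform, since VALUE has to be able to hold a pointer. A minimal sketch of a sanity check you could run with the same Ruby, assuming that the pack directives "L!" (native unsigned long) and "p" (pointer) reflect the sizes used by the interpreter's C compiler; the method names are made up for this example:

```ruby
# Size in bytes of a native unsigned long, as seen by this Ruby build.
def native_long_size
  [0].pack("L!").bytesize
end

# Size in bytes of a native pointer (nil packs as a NULL pointer).
def native_pointer_size
  [nil].pack("p").bytesize
end

# True when an unsigned long is wide enough to carry a pointer-sized VALUE.
def value_fits_unsigned_long?
  native_long_size == native_pointer_size
end

puts "long: #{native_long_size} bytes, pointer: #{native_pointer_size} bytes"
puts "unsigned long cast looks safe: #{value_fits_unsigned_long?}"
```

If the two sizes differ on your platform, the cast in the macro would need a different integer type.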
Armed with this knowledge I was able to annotate some gdb_macros_for_ruby with type information and use them to get stack traces and variables.
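If you do this often, the same call can be scripted instead of typed at the gdb prompt. The following is only a sketch: gdb_eval_command is a hypothetical helper, and it relies on gdb's -p, -batch and -ex options to attach, run one command, and exit. Note that rb_p prints to the standard output of the process being inspected, not to gdb's.

```ruby
# Hypothetical helper: builds a gdb command line that attaches to a running
# Ruby process and evaluates one Ruby expression using the explicit casts
# from the macro above.
def gdb_eval_command(pid, ruby_code)
  # Escape backslashes and double quotes so the code survives as a C string
  # literal inside the gdb expression.
  c_string = ruby_code.gsub(/["\\]/) { |ch| "\\#{ch}" }
  call = %{call (void)rb_p((unsigned long)rb_eval_string_protect("#{c_string}",(int*)0))}
  # Integer() rejects anything that is not a valid process id number.
  ["gdb", "-p", Integer(pid).to_s, "-batch", "-ex", call]
end

# Example usage (12345 is a placeholder PID):
#   system(*gdb_eval_command(12345, "Kernel.caller"))
```

Passing the command as an argument list to system avoids a second round of shell quoting on top of the C string escaping.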
If you want a Ruby with debugging symbols, you have to compile it yourself. I have rvm_configure_env=(CFLAGS=-O3) in my .rvmrc, so all my rubies are compiled with a high optimization level. Unfortunately, that setting conflicts with the flags we want to pass to rvm. As a workaround you can do:
    $ unset rvm_configure_env
    # this is probably not needed, maybe if you have really old rvm
    $ export optflags="-O0 -ggdb"
    $ rvm install 1.9.3-debug --debug -j 3 -- --enable-shared optflags="-O0 -ggdb" debugflags="-ggdb3"
    # this should print that ruby was compiled with -O0 -ggdb
    $ rvm ruby-1.9.3-debug do ruby -rrbconfig -e 'puts RbConfig::CONFIG["optflags"]'
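The one-liner above only prints optflags. A slightly longer sketch that shows the other relevant entries from RbConfig::CONFIG as well might look like this. The method names are invented for this example, and the "-g" check is a rough heuristic of mine, not something rvm or Ruby provides:

```ruby
require "rbconfig"

# Collects the build flags that matter when debugging with gdb.
def build_flags(config = RbConfig::CONFIG)
  %w[optflags debugflags CFLAGS].each_with_object({}) do |key, flags|
    flags[key] = config[key].to_s # missing keys become empty strings
  end
end

# Heuristic only: treats any flag starting with -g (-g, -ggdb, -ggdb3, ...)
# as a sign that debug info was requested at build time.
def debug_info_requested?(flags)
  flags.values.any? do |value|
    value.split.any? { |flag| flag.start_with?("-g") }
  end
end

flags = build_flags
flags.each { |name, value| puts "#{name}: #{value}" }
puts "debug info requested: #{debug_info_requested?(flags)}"
```

Run it with the freshly installed Ruby, for example through rvm ruby-1.9.3-debug do ruby, to confirm the flags took effect.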
Using GDB with a Ruby compiled this way is much easier, because you don’t have to specify the return types yourself: you get them for free, along with the source code. It is still useful to know how to annotate gdb macros with return types in case you don’t want to or can’t compile your own Ruby, but it is not mandatory.
The first GDB I tried was GDB 7.5.1, installed via homebrew from homebrew-dupes. Any rb_ call I tried failed miserably with:
warning: Mach error at "darwin-nat.c:726" in function "void darwin_resume_thread(struct inferior *, darwin_thread_t *, int, int)": (os/kern) failure (0x5)
Unfortunately I wasn’t able to find out why this happened or how to work around it. If you know the cause, please leave a comment; it would be awesome to know.
In our CI (Continuous integration) build we had some processes that from time to time got stuck. But thanks to GDB we have the tools to know what is going on.
And what is really cool is that when you have pry, you can spawn a new Pry session from gdb like this:
    (gdb) ruby_eval "binding.pry"
    # or
    (gdb) ruby_eval "self.pry"
Together with pry-stack_explorer you should be able to get anywhere in the stack and inspect things. Then just exit the Pry session and continue.