March 2008

High cpu usage from eog / gnome-panel ? blame gtk’s recent-manager.

After re-organising a large number of photos using eog I started to notice spikes of cpu usage while moving from image to image. At first this was just inconvenient but as time went on (and as more photos were viewed) the cpu usage spikes became longer and longer. Is my Ubuntu install getting slower? it feels like I’m back in Windows, time to investigate.

Quick Investigation.

Under Linux we can use strace to follow all the system calls a particular process makes, its very high level as we won’t be able to trace any library code. However, it may give us some a clue as to where we should start looking.
Using the command ’strace -o eog.strace eog *.jpg’ we can run eog for a little while and then look through the output for something interesting. After browsing through the output I found the following was appearing after we move from one image to the next but before the next image was displayed.

close(15) = 0
munmap(0xb634f000, 4096) = 0
time(NULL) = 1206991162
time(NULL) = 1206991162
time(NULL) = 1206991162

[ several hundred calls to the time() system call]

time(NULL) = 1206991162
time(NULL) = 1206991163
time(NULL) = 1206991163
gettimeofday({1206991163, 403748}, NULL) = 0
gettimeofday({1206991163, 404023}, NULL) = 0
gettimeofday({1206991163, 404288}, NULL) = 0

We call time() several hundred times in what appears to be a tight loop? something certainly seems broken or at the very least not optimal. Looking through the strace output which traced loading eog, displaying one image, moving to a new image and then exiting we called time() 25267 times.

Preparing To Go Deeper

Usually I debug apps under Solaris which has the awesome power of dtrace along with bundled binaries which are not stripped, this makes it really easy to trace library calls and user land calls. By default Ubuntu packages are shipped with all the binaries stripped which has upset some people I know. The Ubuntu devs are clearly not stupid and provide some debugging aids, not just debug builds (which aren’t really *that* useful in a production environment) but by providing separate packages which include the debugging symbols.

One can add the debug symbol packages to the Gutsy release of Ubuntu by adding the following to their /etc/apt/sources.list file:

deb http://ddebs.ubuntu.com/ gutsy main universe
deb http://ddebs.ubuntu.com/ gutsy-updates main universe

Once that has been done the debugging symbols packages will show up with ‘-dbgsym’ appended to them in apt, for now I am just going to install the eog-dbgsym package. I’m also making sure that I have gdb installed, as that is what I plan to dig deeper with.

Digging Deeper.

Firstly, lets start up eog as we did before but this time attach gdb and set a break point for the time() call.

# eog *.jpg
!! note I know eog’s pid is 23700 !!
# gdb -p 23700
!! loading gdb output….
0xffffe410 in __kernel_vsyscall ()
(gdb) b time
Breakpoint 1 at 0xb73fde25
(gdb) c
Continuing.

!! move to a new image with eog !!

[Switching to Thread -1228670288 (LWP 23700)]

Breakpoint 1, 0xb73fde25 in time () from /lib/tls/i686/cmov/libc.so.6
(gdb) bt
#0 0xb73fde25 in time () from /lib/tls/i686/cmov/libc.so.6
#1 0xb74fae7b in g_bookmark_file_set_mime_type () from /usr/lib/libglib-2.0.so.0
#2 0xb7a34499 in gtk_recent_manager_add_full () from /usr/lib/libgtk-x11-2.0.so.0
#3 0×08067c42 in eog_window_display_image (window=0×8124118, image=0×827ece8) at eog-window.c:870
#4 0×080683df in eog_job_load_cb (job=0×82c6400, data=0×8124118) at eog-window.c:1286
#5 0xb75b8c09 in g_cclosure_marshal_VOID__VOID () from /usr/lib/libgobject-2.0.so.0
#6 0xb75ab772 in g_closure_invoke () from /usr/lib/libgobject-2.0.so.0
#7 0xb75bc323 in ?? () from /usr/lib/libgobject-2.0.so.0
#8 0×082d5450 in ?? ()
#9 0×00000000 in ?? ()
(gdb)

I repeated this a few times to see if the back trace changed, it did not (much). We’re calling some code in libgtk, specifically gtk_recent_manager_add_full(). What is gtk recent manager? what are we bookmarking with ‘g_bookmark_file_set_mime_type()’, all very relevant questions I feel. To dig deeper lets install the gtk and glib debugging symbols packages and look at an improved stack trace (note: here we have passed the first few ‘time()’ break points and are now stuck in the loop which calls time() an excessive number of times.

Breakpoint 1, 0xb74b6e25 in time () from /lib/tls/i686/cmov/libc.so.6
(gdb) bt
#0 0xb74b6e25 in time () from /lib/tls/i686/cmov/libc.so.6
#1 0xb7aecacf in build_recent_info (bookmarks=0×8128a88, info=0×82c1728) at /build/buildd/gtk+2.0-2.12.0/gtk/gtkrecentmanager.c:1677
#2 0xb7aece1c in IA__gtk_recent_manager_get_items (manager=0×81189b0) at /build/buildd/gtk+2.0-2.12.0/gtk/gtkrecentmanager.c:1311
#3 0×08063f2a in eog_window_update_recent_files_menu (window=0×8124118) at eog-window.c:3468
#4 0xb7671c09 in IA__g_cclosure_marshal_VOID__VOID (closure=0×8279148, return_value=0×0, n_param_values=1, param_values=0xbfb05fac, invocation_hint=0xbfb05ebc,
marshal_data=0×80641c0) at /build/buildd/glib2.0-2.14.1/gobject/gmarshal.c:77
#5 0xb7664772 in IA__g_closure_invoke (closure=0×8279148, return_value=0×0, n_param_values=1, param_values=0xbfb05fac, invocation_hint=0xbfb05ebc)
at /build/buildd/glib2.0-2.14.1/gobject/gclosure.c:490
#6 0xb7675323 in signal_emit_unlocked_R (node=0×81255d8, detail=0, instance=0×81189b0, emission_return=0×0, instance_and_params=0xbfb05fac)
at /build/buildd/glib2.0-2.14.1/gobject/gsignal.c:2440
#7 0xb7676847 in IA__g_signal_emit_valist (instance=0×81189b0, signal_id=110, detail=0, var_args=0xbfb061ec “��˷�Ǯ���˷(b��!ծ�\210\212\022\b�9+\b”)
at /build/buildd/glib2.0-2.14.1/gobject/gsignal.c:2199
#8 0xb7676a09 in IA__g_signal_emit (instance=0×81189b0, signal_id=110, detail=0) at /build/buildd/glib2.0-2.14.1/gobject/gsignal.c:2243
#9 0xb7aec7cc in gtk_recent_manager_changed (recent_manager=0×8298778) at /build/buildd/gtk+2.0-2.12.0/gtk/gtkrecentmanager.c:1380
#10 0xb7aed521 in IA__gtk_recent_manager_add_full (manager=0×81189b0, uri=0×82b39d8 “file:///home/iain/Photo-0178.jpg”, data=0×82a2ba0)
at /build/buildd/gtk+2.0-2.12.0/gtk/gtkrecentmanager.c:1001
#11 0×08067c42 in eog_window_display_image (window=0×8124118, image=0×827ed70) at eog-window.c:870
#12 0×080683df in eog_job_load_cb (job=0×84d9dc0, data=0×8124118) at eog-window.c:1286
#13 0xb7671c09 in IA__g_cclosure_marshal_VOID__VOID (closure=0×82b8430, return_value=0×0, n_param_values=1, param_values=0xbfb0656c, invocation_hint=0xbfb0647c,
marshal_data=0×80680c0) at /build/buildd/glib2.0-2.14.1/gobject/gmarshal.c:77
#14 0xb7664772 in IA__g_closure_invoke (closure=0×82b8430, return_value=0×0, n_param_values=1, param_values=0xbfb0656c, invocation_hint=0xbfb0647c)
at /build/buildd/glib2.0-2.14.1/gobject/gclosure.c:490
#15 0xb7675323 in signal_emit_unlocked_R (node=0×83234e0, detail=0, instance=0×84d9dc0, emission_return=0×0, instance_and_params=0xbfb0656c)
at /build/buildd/glib2.0-2.14.1/gobject/gsignal.c:2440
#16 0xb7676847 in IA__g_signal_emit_valist (instance=0×84d9dc0, signal_id=207, detail=0,
var_args=0xbfb067ac “\b7\v\b\b7\v\b
#17 0xb7676a09 in IA__g_signal_emit (instance=0×84d9dc0, signal_id=207, detail=0) at /build/buildd/glib2.0-2.14.1/gobject/gsignal.c:2243
#18 0×08084c97 in eog_job_finished (job=0×84d9dc0) at eog-jobs.c:120
#19 0×0808454e in notify_finished (job=0×84d9dc0) at eog-job-queue.c:65
#20 0xb75ca551 in g_idle_dispatch (source=0xb4726718, callback=0×1a, user_data=0×84d9dc0) at /build/buildd/glib2.0-2.14.1/glib/gmain.c:4132
#21 0xb75cc11c in IA__g_main_context_dispatch (context=0×80e0010) at /build/buildd/glib2.0-2.14.1/glib/gmain.c:2061
#22 0xb75cf55f in g_main_context_iterate (context=0×80e0010, block=1, dispatch=1, self=0×80b75f8) at /build/buildd/glib2.0-2.14.1/glib/gmain.c:2694
#23 0xb75cf909 in IA__g_main_loop_run (loop=0×832feb0) at /build/buildd/glib2.0-2.14.1/glib/gmain.c:2898
#24 0xb7a919e4 in IA__gtk_main () at /build/buildd/gtk+2.0-2.12.0/gtk/gtkmain.c:1144
#25 0×08060c31 in main (argc=5, argv=Cannot access memory at address 0×4
) at main.c:221

Still not much closer but now we have some source files, functions and line numbers to go and look at, perhaps the problem isn’t just with eog but with gtk instead?

Read The Source

From the above stack trace and my tracing of eog I can see that we’re spending a lot of time in build_recent_info() gtkrecentmanager.c:1090, which takes two arguments, a GBookmarkFile and GtkRecentInfo. After some hunting through the source I found that the GBookmarkFile eventually references back to the real file “~/.recently-used.xbel” this file appears to be an xml formatted file for managing “recently” used documents.

Part of the gtk_recent_manager_add_full() function is to add the recently viewed file/image/movie/song to this history file, in this case its an image. Once we’ve added it (which includes adding a time stamp) we then update the last viewed images which are displayed in the file menu of eog. Also (though I found this out some what later) we send a signal to gnome-panel to update the ‘places -> recent documents’ menu.

Once eog (or gnome-panel) receives this ‘update’ signal they run through every item stored in the .recently-used.xbel file, calling time() for each entry allowing us to find the last five items viewed in eog (or for gnome-panel its the last ten). On the surface this seems like a fine idea, though somewhat inefficient (why not just call time() the once for each run through the file, not per entry) making the current implementation slightly flawed.

The Fatal Problem

There is no default limit to the amount of entries we store in the .recently-used.xbel file the more images you view in eog (or as I later found out any gtk app which has been coded to use gtk-recent-manager) the more entries there will be, however, eog will only ever display the last five. This coupled with when running through the list to display the last five entries, time() is called against each entry - this is what leads to the gradual slowdown I described at the beginning of this post.

A single entry in the .recently-used.xbel file is 14 lines long, based on my file I currently have 1734 entries (and growing each day), thus when I move from one image to the next in eog time() is called at least 1734 times in eog followed by gnome-panel performing a second 1734 calls. Clearly this is not acceptable, something must be done - on a slow 1.3ghz laptop this leads to about 5 - 10 seconds of 100% cpu time purely to display the last five images view in eog.

Possible Solutions

  • Add a cron entry to delete the .recently-used.xbel file once a week

This is clearly a hack / workaround. It does not address the root cause, it also does not address calling time() multiple times without any need.

  • Set a default limit of 25 entries per application to be stored and only call time() once per run through of the .recently-used.

A much better solution, this would limit the size of the .recently-used.xbel file and make sorting it more efficient. gnome-panel only displays the last ten viewed documents, eog only displays the last five. Therefore holding the last 25 entries is still overkill but its at least manageable.

Conclusion

I’d love to fix this my self but I just don’t have the time to dedicate to it, hopefully this blog entry will aid people who have found the same issue and possibly help the gtk developers to improve their software.

GTK Devs please fix this!

code
Linux
hacks
gtk
debug

Comments (0)

Permalink

Which Jython?

After reading about Sun hiring some Jython developers I decided to install it myself under Debian testing to see what all the fuss is about. As a big Python fan a world without the GIL (Gloabl Interpreter Lock) of CPython sounds just lovely.

After installing it I had a little poke about, along the way I hit upon this lovely gem.

iain@hostname ~>file `which jython`
/usr/bin/jython: perl script text executable

Shocking results! luckily its just a Debian wrapper script but theres a certain amount of irony involved thats for sure.

code
python
jython

Comments (0)

Permalink