Performance and Qt 4.5

For Qt 4.5, the development team has worked hard on performance. There have been a few blogs on the topic already, but I was asked to give a small chat about it from a slightly higher-level perspective, now that the release is finally out.

It is extremely annoying for a user to sit and wait for a file dialog, so Alexis did something about it! Looking at the admittedly extreme case of showing 10,000 sub-directories, the time for local folders has dropped from 21 seconds to 360 milliseconds. The speedup in this case came from three major areas. First, the internal data model was rewritten to scale better. Second, a cache of filesystem icons was introduced to avoid fetching these from the system for every individual file or directory. Finally, he sat down with the itemview guys and tweaked the treeview and listview classes to perform a bit better, so there should be some candy in it for itemview users in general.
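To make the icon-cache idea concrete, here is a minimal sketch in plain C++. It is not the code that went into Qt: the `Icon` struct, the `IconCache` class and the loader callback are all hypothetical names, and keying the cache by a file-type string is an assumption made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical stand-in for a platform icon; not a Qt type.
struct Icon {
    std::string name;
};

// Caches one icon per key (assumed here to be a file-type string such as
// "dir" or "txt"), so the expensive lookup runs once per type instead of
// once per file or directory.
class IconCache {
public:
    using Loader = std::function<Icon(const std::string &)>;

    explicit IconCache(Loader loader) : m_loader(std::move(loader)) {
        assert(m_loader && "a loader callback is required");
    }

    const Icon &iconFor(const std::string &fileType) {
        auto it = m_icons.find(fileType);
        if (it == m_icons.end()) {
            // Cache miss: ask the (slow) loader once and remember the result.
            ++m_misses;
            it = m_icons.emplace(fileType, m_loader(fileType)).first;
        }
        return it->second;
    }

    // Number of times the loader actually had to run.
    std::size_t misses() const { return m_misses; }

private:
    Loader m_loader;
    std::unordered_map<std::string, Icon> m_icons;
    std::size_t m_misses = 0;
};
```

With a cache like this, asking for the "dir" icon 10,000 times hits the slow loader once; the other 9,999 requests are hash lookups.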

Accessing the same folder over a locally shared network, the improvement is smaller in percentage terms, but it reduces the waiting time from "hey, my application has crashed" to "I really need to do something about my directory structure". It's down from 77 seconds to 17 seconds. Again, this is largely thanks to the data structure rewrite.
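For anyone who wants to put the two file dialog cases side by side, the arithmetic is simple enough to sketch. The helper names below are made up for this post; only the timings come from the measurements above.

```cpp
#include <cassert>

// How many times faster the new timing is compared to the old one.
double speedupFactor(double beforeSeconds, double afterSeconds) {
    assert(beforeSeconds > 0.0 && afterSeconds > 0.0);
    return beforeSeconds / afterSeconds;
}

// Share of the original waiting time that was removed, in percent.
double timeSavedPercent(double beforeSeconds, double afterSeconds) {
    assert(beforeSeconds > 0.0 && afterSeconds > 0.0);
    return 100.0 * (beforeSeconds - afterSeconds) / beforeSeconds;
}
```

Feeding in 21 seconds and 0.36 seconds gives a factor of roughly 58 with about 98% of the wait removed, while 77 seconds and 17 seconds give a factor of roughly 4.5 with about 78% removed.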

[Image: filedialog]

You might recognize the chip demo in the image below. Forty thousand chips, each with an individual text, together forming a picture of parts of the Qt crew. Back when we launched Qt 4 we thought that was pretty cool. It still is, but from Qt 4.5 it will be faster. The improvements range from around 30% on Windows for most zoomed-out operations to 5-100 times better performance on Linux with the raster paint engine.

[Image: chipdemo]

Prior to calling QGraphicsItem::paint() there is a little bit of work involved in figuring out exposed areas, the item's style option, intersecting shapes, clips etc. The graphicsview guys, Andreas, Alexis, Ariya and Bjørn Erik, have spent quite a bit of time minimizing the effort spent here.

Also, when zoomed out, the chips are drawn using painter.fillRect(rect, color), which in 4.4 allocated a brush. On both the OpenGL paint engine and the software paint engine this path is now malloc-free, which again brings down its cost quite a bit. As for software outperforming X11, the use case fits the software engine very well: lots of small transformed primitives (small being important) with different colors, transformations etc. State handling in particular is a lot more complicated in the X11 paint engine.

However, the biggest improvements come with nested items. An example of this is moving groups of items by moving their parents (source here). When moving a top-level item containing fifty children that, in turn, hold ten children each, in a scene with one hundred top-level items (all in all, moving 551 items in a scene of 55,100 items), the speed improvements range from 20 to 64 times. Check out the bars below for the bigger picture.
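The item counts in that benchmark follow directly from the nesting, which a few lines of C++ can double-check. The function names are invented for this sketch and have nothing to do with the benchmark source itself.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Counts the items in one tree: the root plus every descendant.
// childrenPerLevel[i] is how many children each item at depth i has.
std::size_t itemsInTree(const std::vector<std::size_t> &childrenPerLevel) {
    std::size_t total = 1;        // the top-level item itself
    std::size_t itemsAtLevel = 1; // number of items at the current depth
    for (std::size_t fanOut : childrenPerLevel) {
        itemsAtLevel *= fanOut;
        total += itemsAtLevel;
    }
    return total;
}

// Total number of items when the scene holds several identical trees.
std::size_t itemsInScene(std::size_t topLevelItems,
                         const std::vector<std::size_t> &childrenPerLevel) {
    assert(topLevelItems > 0);
    return topLevelItems * itemsInTree(childrenPerLevel);
}
```

One top-level item with fifty children of ten children each is 1 + 50 + 500 = 551 items, and one hundred such trees make 55,100.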

[Image: graphicschildren]

Again, QGraphicsView has gotten a lot simpler in the "stuff" it does between paint() calls, like how often it goes up the parent chain to figure out the current transformation etc. Also, the example relies on clipping, which was sped up quite a bit in 4.5 for the raster and OpenGL engines, particularly where save/restore is involved.
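To give a feel for why walking the parent chain less often matters, here is a rough sketch of the general technique of caching a derived position and invalidating it on change. This is not how QGraphicsView is implemented: the `Item` and `Point` types are hypothetical, and transformations are reduced to plain translations to keep it short.

```cpp
#include <cassert>
#include <vector>

struct Point {
    double x;
    double y;
};

// Hypothetical item type; real QGraphicsItem transforms are full matrices.
// Lifetime handling is ignored: parents must outlive their children.
class Item {
public:
    explicit Item(Item *parent = nullptr, Point pos = Point{0.0, 0.0})
        : m_parent(parent), m_pos(pos) {
        if (m_parent)
            m_parent->m_children.push_back(this);
    }

    // Moving an item invalidates its cached scene position and that of
    // all its descendants.
    void setPos(Point pos) {
        m_pos = pos;
        invalidate();
    }

    // Naive version: walks the whole parent chain on every call.
    Point scenePosUncached() const {
        Point p = m_pos;
        for (const Item *it = m_parent; it; it = it->m_parent) {
            p.x += it->m_pos.x;
            p.y += it->m_pos.y;
        }
        return p;
    }

    // Cached version: recomputes only when dirty, reusing the parent's
    // cached result instead of walking all the way to the root.
    Point scenePos() const {
        if (m_dirty) {
            Point base = m_parent ? m_parent->scenePos() : Point{0.0, 0.0};
            assert(!m_parent || !m_parent->m_dirty); // parent is clean once queried
            m_cached = Point{base.x + m_pos.x, base.y + m_pos.y};
            m_dirty = false;
        }
        return m_cached;
    }

private:
    void invalidate() {
        m_dirty = true;
        for (Item *child : m_children)
            child->invalidate();
    }

    Item *m_parent;
    Point m_pos;
    std::vector<Item *> m_children;
    mutable Point m_cached{0.0, 0.0};
    mutable bool m_dirty = true;
};
```

The trade-off is that a move now costs one pass over the descendants to mark them dirty, while repeated position queries between moves become a single flag check.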

Learn more about the performance improvements introduced with Qt 4.5 by reading a new whitepaper prepared for the release. (Sorry, but it seems registration is necessary to access it ;) )
