Thoughts about graphics performance

Ah, performance. Today we had yet another discussion about how to make Qt even faster, preferably all over the place. Generally, we feel that resources spent on hunting down performance problems are resources well spent. But there are (surprisingly?) not a huge number of us working on Qt's 1,000,000 lines of code, and many of us feel that we're spread thin trying to juggle major and minor feature development, maintenance, innovation, and so on. So where do you spend your effort? I personally feel strongest about graphics.

Qt is, of course, a kick-ass toolkit written by a bunch of hard-core (yet wonderfully pleasant!) developers. Sometimes we try to make everything as fast as possible; other times we're forced to trade API quality against performance. Before he left us, former Troll Zack(r) held a seminar where he stressed how truly high-performance graphics (and we're talking tens of thousands of polygons, full-screen, with tons of impressive effects at 75fps) needs tailoring, and is typically written very close to the actual hardware in order to run as fast as possible. Convenient APIs can still perform well if they are sufficiently high-level ("I know exactly what you want to do!"), or sufficiently low-level ("I don't know squat about why you want to draw these trigons but I swear I'll do it real fast!"). Qt is in the middle. You can't ask Qt to mock together Black & White III (you would naturally use OpenGL for that!), and you can't ask Qt to upload a subroutine to the graphics card's pixel shaders either. It does that for you in the background, but it doesn't provide that API. Qt can render vector graphics for you. It does it incredibly fast, and beautifully, but pixel-perfect vector graphics is not always what you want. OTOH, Qt doesn't know that. :-) Gah!

Now Arthur's rendering model does give you many options for writing beautiful and fast graphics. The default paint engines and QPainter provide an intuitive mid-level API for drawing vector graphics, implemented with a best-effort approach (speed + quality). Depending on the platform and the complexity of your shapes, Arthur picks the approach that gives you high-quality output at high speed. Through QImage, you can also access pixels directly and work all your magic "by hand". With Qtopia Core you also have QDirectPainter, which lets you touch pixels directly on the frame buffer. QGLWidget provides two things. First, the QPainter API: basically the exact same operations you use with QWidget are translated to OpenGL calls, which I consider to be truly unique. And second, of course, you can use OpenGL directly (the context is set up by default in QGLWidget::paintEvent(), just fire away those GL calls!).

But QPainter, the heart of our graphics API, works on a non-compositional model. QPainter doesn't know enough about how you want to blend your stuff together; it has to rely on you to do the smart stuff. For example, if you ask QPainter to draw a complex path, say a QPainterPath representation of a text document, and then ask it to do it again, and again, and again, QPainter has a pretty tough time figuring out that it would be nice to cache that path. Even if it could, it doesn't necessarily know how to make it also look good in your case! It cannot know what's best for you; to a certain degree it must rely on you knowing what you're doing. You, on the other hand, do know what you're doing. You know perfectly well what could make it faster, but maybe you don't know how to make Qt do what you want. Somewhere between QPainter and you, there's some smart stuff flying around, and I've just been feeling that there's got to be an API in there somewhere ;-). The trick is to pull it out of the hat somehow.
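
To make "doing the smart stuff yourself" a bit more concrete, here is a minimal, Qt-free C++ sketch of caching an expensive drawing by hand. Everything in it is an assumption made up for illustration: Buffer stands in for an off-screen image such as a QImage or QPixmap, and CachedDrawing, get(), markDirty() and renderCount() are hypothetical names, not Qt API. The render function plays the role of the costly QPainter call.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical stand-in for an off-screen image (think QImage or QPixmap).
struct Buffer {
    int width = 0;
    int height = 0;
    std::vector<std::uint32_t> pixels;
};

// Wraps an expensive render function and only re-runs it when the
// requested size changes or the caller says the content is stale.
class CachedDrawing {
public:
    using RenderFn = std::function<Buffer(int width, int height)>;

    explicit CachedDrawing(RenderFn render) : render_(std::move(render)) {}

    // Returns the cached buffer, rendering it first if necessary.
    const Buffer &get(int width, int height) {
        assert(width > 0 && height > 0);
        if (!valid_ || cache_.width != width || cache_.height != height) {
            cache_ = render_(width, height);
            valid_ = true;
            ++renderCount_;
        }
        return cache_;
    }

    // Call this when the underlying content (the path, the text) changes.
    void markDirty() { valid_ = false; }

    // Number of times the expensive render function actually ran.
    int renderCount() const { return renderCount_; }

private:
    RenderFn render_;
    Buffer cache_;
    bool valid_ = false;
    int renderCount_ = 0;
};
```

In a real widget the render function would paint the path into an off-screen pixmap once, and the paint event would simply draw that pixmap. The point of the sketch is only that you, not QPainter, are the one who knows when the cached result is still good.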

People tend to prefer quality high-level APIs over low-level ones. By that I mean APIs that are easy to understand and use, empowering you and allowing you to quickly transform your brain vibes into shapes on the screen. Now how can you do that with a really addictive, efficient and intuitive API, while still keeping it blindingly fast? The hard problem lies in finding the right level of abstraction. Our closest thing so far to an abstraction over Arthur is the Graphics View API. I've spent some time with QGV to bring my ideas into the API somehow. Does QGV "know" enough about what you're doing, to do it more efficiently?

QGraphicsItem knows that everything going on inside of paint() is basically drawn on one surface using a single homogeneous transform, so it could render the item off-screen into a texture in logical coordinates to avoid asking QPainter to redraw and redraw and redraw and redraw. That texture could then be stored in graphics memory, like QPixmap already works with QGLWidget, and you could transform and translate the item without ever "redrawing" it. Your paint() functions wouldn't even get called at all. I just think it could do miracles for lots of graphics apps that spend a lot of time redrawing. So when the item is exposed, or even transformed, instead of retessellating, rescaling and rerendering the thing, we just blit the texture. In this model update(QRectF) just reblits the texture; whenever the item really needs to redraw parts of itself, it could call QGraphicsItem::invalidate(QRectF) instead.

The following screenshot shows an app I wrote to measure just how fast or slow a straightforward application for Qtopia Core would run. It's a phone keypad navigator:


Here's the source code, download and unpack:

The Pad Navigator Source Code

And my patch to Qt 4.3.x, download and unpack:

My Patch to Qt 4.3.x

Now, it goes like this: Download your favorite open source edition of Qt (I prefer the all-package for simplicity), preferably 4.3.1. Unpack it and apply the above patch to src/gui/graphicsview; it should apply cleanly with no conflicts. Build Qt, and build the padnavigator example. Now run the example. Play around with the keypad, press enter, bla bla. Resize it; it's resolution independent. Now, if you hit space, the whole example enables logical caching for all its items. Notice how the quality level goes down, but speed just goes skyrocketing. OK - I think I'm onto something here, now back to the drawing board.
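
For the curious, the logical caching you toggle with space boils down to the update() versus invalidate() split described earlier. What follows is not the patch itself, just a small Qt-free C++ model of the idea under my own assumptions: CachedItem, Rect, renderFrame() and the two counters are hypothetical names invented for the sketch, and "blitting" is only counted, not performed.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for QRectF, in the item's logical coordinates.
struct Rect {
    int x;
    int y;
    int width;
    int height;
};

// Models an item whose painted content lives in an off-screen cache.
class CachedItem {
public:
    CachedItem(int width, int height) {
        assert(width > 0 && height > 0);
        // Nothing is cached yet, so the whole item starts out dirty.
        dirty_.push_back(Rect{0, 0, width, height});
    }
    virtual ~CachedItem() = default;

    // The item must be shown again (exposed, moved, transformed),
    // but its content is unchanged: reuse the cache.
    void update() { needsBlit_ = true; }

    // The content inside rect changed: that part of the cache is stale.
    void invalidate(const Rect &rect) {
        dirty_.push_back(rect);
        needsBlit_ = true;
    }

    // Called once per frame by the (hypothetical) scene.
    void renderFrame() {
        if (!needsBlit_)
            return;
        // Repaint only the stale parts of the cache...
        for (const Rect &rect : dirty_) {
            paint(rect);
            ++paintCount_;
        }
        dirty_.clear();
        // ...then blit the cache to the screen.
        ++blitCount_;
        needsBlit_ = false;
    }

    int paintCount() const { return paintCount_; }
    int blitCount() const { return blitCount_; }

protected:
    // The expensive drawing code; subclasses would override this.
    virtual void paint(const Rect &) {}

private:
    std::vector<Rect> dirty_;
    bool needsBlit_ = true;
    int paintCount_ = 0;
    int blitCount_ = 0;
};
```

With this split, a frame that only follows update() never reaches paint(), which is where the speed would come from. The price is the one you see in the example: the cache is rendered in logical coordinates, so scaling it up costs quality.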

PS: Try without OpenGL and compare the performance by removing the setViewport() call in main.cpp.

Disclaimer: If you don't really notice any difference other than image quality degradation, you probably have state-of-the-art hardware and a modern graphics card. Don't blame me for Qt being fast without any tricks! ;-) Try running padnavigator over a remote X connection on Linux, or run it through a heavy profiler like valgrind, just to "emulate" slow hardware.
