50% scaling of (A)RGB32 image

In digital systems, an N-bit adder can be built from N-1 full adders and one half adder. When accumulating numbers, a carry-save adder is an interesting alternative since it is faster. As explained elsewhere already, the same technique is also useful if we want the average of 16-bit numbers packed into 32-bit numbers. Or, for that matter, 8-bit numbers packed into 32 bits. This of course fits nicely when we average two colors in ARGB color space, where each component takes 8 bits. The code is as simple as this (the 0xfefefefe mask clears the lowest bit of each byte, so that the shift cannot carry it into the neighboring 8-bit component and falsify it):

quint32 avg = (((c1 ^ c2) & 0xfefefefeUL) >> 1) + (c1 & c2);

This is faster to compute than taking the alpha, red, green, and blue of the first and second colors, averaging each component individually, and then combining them again to get the final result, like the messy lines below:

quint32 avg = qRgba((qRed(c1) + qRed(c2)) >> 1,
                    (qGreen(c1) + qGreen(c2)) >> 1,
                    (qBlue(c1) + qBlue(c2)) >> 1,
                    (qAlpha(c1) + qAlpha(c2)) >> 1);
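To see why the one-liner and the per-component version agree, here is a minimal sketch in plain C++ (no Qt, so std::uint32_t stands in for quint32, and the function names avgPacked, avgPerChannel and checkAverages are made up for this illustration). It relies on the identity a + b == (a ^ b) + 2 * (a & b), which gives (a + b) / 2 == ((a ^ b) >> 1) + (a & b) for each 8-bit component.

```cpp
#include <cassert>
#include <cstdint>

// Packed average: treats a 32-bit word as four independent 8-bit channels.
inline std::uint32_t avgPacked(std::uint32_t c1, std::uint32_t c2)
{
    // The mask clears the lowest bit of each byte, so the shift cannot
    // push it into the top bit of the byte below.
    return (((c1 ^ c2) & 0xfefefefeU) >> 1) + (c1 & c2);
}

// Reference: average each 8-bit channel separately, then repack.
inline std::uint32_t avgPerChannel(std::uint32_t c1, std::uint32_t c2)
{
    std::uint32_t result = 0;
    for (int shift = 0; shift < 32; shift += 8) {
        std::uint32_t a = (c1 >> shift) & 0xffU;
        std::uint32_t b = (c2 >> shift) & 0xffU;
        result |= ((a + b) >> 1) << shift;
    }
    return result;
}

// Small usage check: both routes must give the same packed pixel.
inline void checkAverages()
{
    // Opaque black and opaque blue average to opaque half-blue.
    assert(avgPacked(0xff000000U, 0xff0000ffU) == 0xff00007fU);
    assert(avgPacked(0x01ff01ffU, 0xff01ff01U)
           == avgPerChannel(0x01ff01ffU, 0xff01ff01U));
}
```

Both functions round each component down, which is why they produce identical results; the packed one just does all four components in a handful of operations.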

But how fast is faster? I decided to write an example that uses the above-mentioned trick to speed up downscaling an image to half its original size. Usually you do this with the QImage::scaled() function. If you pass Qt::SmoothTransformation as the transformation mode, then halfscaling the image is the same as taking every block of 2x2 pixels, averaging their color values, and using the result as the final color. Qt::FastTransformation, on the other hand, just samples one of those 4 pixels. Surely that makes it faster (as the name implies), but the lack of a box filter also means the quality is not as good as with Qt::SmoothTransformation. Here comes the trick of ARGB32 pixel averaging, which allows us to write a QImage halfSized(const QImage&) that is really, really fast compared to QImage::scaled() with Qt::SmoothTransformation, but still gives the same visual quality. Using the new benchmark feature in our beloved test framework, here is the speed comparison (longer is better).

"Normal" refers to scaling with Qt::SmoothTransformation, whereas "Optimized" is our custom halfSized() function. The numbers are the iterations completed per 10e12 CPU ticks when halfscaling a 10-megapixel image. As you can see, the improvement is about an order of magnitude. Impressed?
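For a rough idea of what such a function does, here is a sketch of 2x2 averaging on a raw pixel buffer. It is not the halfSized() implementation itself: the Image struct is a hypothetical stand-in for QImage, and halfSizedSketch and avgPacked are names invented for this example. Each output pixel is produced by three packed averages: two horizontal ones, then one vertical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a QImage: row-major ARGB32 pixels.
struct Image {
    int width;
    int height;
    std::vector<std::uint32_t> pixels; // size == width * height
};

// Average of two packed ARGB32 pixels, component by component.
inline std::uint32_t avgPacked(std::uint32_t c1, std::uint32_t c2)
{
    return (((c1 ^ c2) & 0xfefefefeU) >> 1) + (c1 & c2);
}

// Halve an image by averaging each 2x2 block.
// A trailing odd row or column is dropped by the integer division.
inline Image halfSizedSketch(const Image &src)
{
    assert(src.width >= 0 && src.height >= 0);
    assert(src.pixels.size()
           == static_cast<std::size_t>(src.width) * static_cast<std::size_t>(src.height));

    Image dst{src.width / 2, src.height / 2, {}};
    dst.pixels.resize(static_cast<std::size_t>(dst.width) * static_cast<std::size_t>(dst.height));

    for (int y = 0; y < dst.height; ++y) {
        const std::uint32_t *top =
            src.pixels.data() + static_cast<std::size_t>(2 * y) * src.width;
        const std::uint32_t *bottom = top + src.width;
        std::uint32_t *out =
            dst.pixels.data() + static_cast<std::size_t>(y) * dst.width;
        for (int x = 0; x < dst.width; ++x) {
            std::uint32_t upper = avgPacked(top[2 * x], top[2 * x + 1]);
            std::uint32_t lower = avgPacked(bottom[2 * x], bottom[2 * x + 1]);
            out[x] = avgPacked(upper, lower);
        }
    }
    return dst;
}
```

With a real QImage you would presumably walk the scan lines in the same way, but that part is left out here to keep the sketch free of Qt dependencies.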

The code is still fresh in the Graphics Dojo repository, under the subdirectory halfscale. Take a look and give it a try. Do not forget the catches, though: potential round-off error, and images whose number of columns and/or rows is not even. If you can get away with the loss of up to two bits and with cutting off one last (or middle) column and row of pixels, then this halfSized() function is your new friend.
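Both catches are easy to see on a single 8-bit component. The sketch below uses made-up helper names (nestedFloorAvg, roundedAvg, halfDimension, checkCatches) and assumes the 2x2 average is computed by chaining pairwise averages, each of which rounds down. It only shows that a difference can occur, not how large it can get in the worst case.

```cpp
#include <cassert>

// One 8-bit component, values 0..255.
// Two-stage average: each pairwise step rounds down.
inline int nestedFloorAvg(int a, int b, int c, int d)
{
    return (((a + b) >> 1) + ((c + d) >> 1)) >> 1;
}

// Single-stage average of all four values, rounded to nearest.
inline int roundedAvg(int a, int b, int c, int d)
{
    return (a + b + c + d + 2) >> 2;
}

// Number of output pixels along one axis; an odd leftover is dropped.
inline int halfDimension(int n)
{
    return n / 2;
}

// Small usage check illustrating both catches.
inline void checkCatches()
{
    // The exact mean of 1, 0, 1, 2 is 1, but chained rounding down gives 0.
    assert(roundedAvg(1, 0, 1, 2) == 1);
    assert(nestedFloorAvg(1, 0, 1, 2) == 0);
    // A 5-pixel-wide image yields 2 output columns; the fifth is lost.
    assert(halfDimension(5) == 2);
}
```

For most photos a difference of a level or so per component is invisible, which matches the observation that the result looks the same as the smooth scaling.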

Now carefully examine the following screenshot. Just like in the previous examples, you can always drag and drop an image from the file manager or web browser. For this one, I used Gianni's Urban solitude picture (Creative Commons NC ND). As you can see, when you stick with FastTransformation, there are jagged lines and some effects like Moiré patterns in the downscaled image. These problems disappear when you use SmoothTransformation. In addition, the optimized halfscaling method presented here gives a result just like the one you get with SmoothTransformation.

If you are starting to wonder why all this halfscaling matters at all, just watch this blog and see what comes next. Hint: you might have guessed it already if you were at my last DevDays talk.

