A quick explanation of the benchmark
I just realized that I didn't actually explain how to interpret the results of the benchmark I mentioned in my previous post. So, here it is...
When you run the benchmark (don't forget -comparison if you want std::string and QByteArray in the results), the first line you'll see is this:
Time to run = 1000 ms
This tells us how long each test function runs. Next, you'll see output like this:
length 10, thrA: 40201 per-ms, thrB: 44077 per-ms, thrC: 40000 per-ms
length 100, thrA: 35429 per-ms, thrB: 40167 per-ms, thrC: 40033 per-ms
length 1000, thrA: 40167 per-ms, thrB: 40033 per-ms, thrC: 43915 per-ms
The first line of each block (the header, not reproduced above) tells us which class and which functionality we are testing. In this case, we're testing AtomicString copy construction. The following 3 lines tell us the length of the string we are using, followed by the results of the run. We use 3 threads, except with SharedString, which is not reentrant: thrA runs by itself, then thrB and thrC run concurrently. This gives us an idea of whether or not the test function scales. Larger numbers are better. In this case, we can make roughly 40,000 copies of an AtomicString per millisecond, and that rate holds when two threads run at once, so it scales to multiple CPUs (the test machine is a dual-core Opteron).
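To make the thrA/thrB/thrC numbers concrete, here is a minimal sketch of how such a measurement could be taken. This is not the harness from main.cpp: the names callsPerMs, Throughput and measure are made up for illustration, and the batching of 1000 calls per clock read is my own assumption.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>
#include <thread>

// Hypothetical helper (not from the benchmark source): call fn repeatedly
// for roughly `duration` and return the number of calls per millisecond.
template <typename Fn>
std::uint64_t callsPerMs(Fn fn, std::chrono::milliseconds duration)
{
    using Clock = std::chrono::steady_clock;
    const Clock::time_point start = Clock::now();
    std::uint64_t calls = 0;
    while (Clock::now() - start < duration) {
        // Batch the calls so the clock is not read on every iteration.
        for (int i = 0; i < 1000; ++i)
            fn();
        calls += 1000;
    }
    const auto elapsedMs = std::chrono::duration_cast<std::chrono::milliseconds>(
                               Clock::now() - start).count();
    return calls / static_cast<std::uint64_t>(elapsedMs > 0 ? elapsedMs : 1);
}

// Hypothetical result record, one field per thread.
struct Throughput {
    std::uint64_t thrA;
    std::uint64_t thrB;
    std::uint64_t thrC;
};

// thrA runs by itself; thrB and thrC then run at the same time.
// callsPerMs takes fn by value, so every thread works on its own copy.
template <typename Fn>
Throughput measure(Fn fn, std::chrono::milliseconds duration)
{
    Throughput t{};
    std::thread a([&] { t.thrA = callsPerMs(fn, duration); });
    a.join();
    std::thread b([&] { t.thrB = callsPerMs(fn, duration); });
    std::thread c([&] { t.thrC = callsPerMs(fn, duration); });
    b.join();
    c.join();
    return t;
}
```

Calling measure(fn, std::chrono::milliseconds(1000)) would correspond to the "Time to run = 1000 ms" setting. If thrB and thrC each stay close to thrA, the operation scales across two cores; if they drop sharply, the threads are contending for something.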
Also, all the results for similar test functions are grouped together, so it's easy to see how AtomicString compares with, e.g., SimpleString2 or std::string.
There are several test functions:
- copyConstruction - copy construct an instance of the class with the given length.
- appendSingleCharacter - append characters to an instance of the class. The string is truncated when length reaches the given length.
- nonMutatingAccess - use operator[] to fetch a value from the string and add it to a volatile integer.
- nonMutatingAccessAfterCopy - same as above, except that a copy is made before using operator[].
- nonMutatingAccessOnCopy - same as above, except that operator[] is called on the copy (not the original).
- mutatingCopy1 - 1/3 of copies are const, 1/3 are non-mutating (operator[]), 1/3 are modified once (append a single character).
- mutatingCopy2 - 1/2 of copies are const, 1/4 are non-mutating (operator[] used 3 times), 1/4 are modified (append a single character 3 times).
- functionWithArgument - call a function with an instance of the class with the given length as the only argument (argument is passed by value).
- functionWithTemporaryArgument - same as above, except that the argument is a temporary copy.
- functionReturningCopy - an instance of the class with the given length is returned by value.
Look at testfunctions.h for the code for each test. Each test is implemented as a template function, which is called by the test harness in main.cpp. In all cases, each test function is working on its own instance of the class, never on the same instance, which would require locking, defeating the purpose of the benchmark :)
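For a rough idea of what a few of these template test functions might look like, here is a sketch. It is not copied from testfunctions.h: the signatures, the sink parameter and the choice of index 0 are assumptions, and it only requires that the String type has a copy constructor and operator[].

```cpp
#include <cassert>
#include <string>

// Sketch only; signatures are assumed, not taken from testfunctions.h.

// copyConstruction: make one copy and let it go out of scope.
template <typename String>
void copyConstruction(const String &source)
{
    String copy(source);
    (void)copy;
}

// nonMutatingAccess: read through operator[] and add to a volatile integer
// so the compiler cannot drop the read.
template <typename String>
void nonMutatingAccess(const String &source, volatile int &sink)
{
    sink = sink + source[0];
}

// nonMutatingAccessAfterCopy: a copy exists, but we read from the original.
template <typename String>
void nonMutatingAccessAfterCopy(const String &source, volatile int &sink)
{
    String copy(source);
    sink = sink + source[0];
}

// nonMutatingAccessOnCopy: read from the (non-const) copy instead. For an
// implicitly shared class this is where a detach could be triggered.
template <typename String>
void nonMutatingAccessOnCopy(const String &source, volatile int &sink)
{
    String copy(source);
    sink = sink + copy[0];
}
```

Because each function is a template, the same body can be instantiated for std::string, QByteArray, or any of the benchmark's string classes, which is what makes the grouped comparison possible.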
The results of the benchmark show that in almost all cases, SharedString and AtomicString are more efficient than SimpleString and SimpleString2. I really only care about AtomicString, because SharedString cannot be used in a threaded program (the reference count on the shared_null quickly gets corrupted). In many cases, AtomicString is faster than std::string, which is surprising, since AtomicString hasn't been optimized beyond the normal patterns used in the Qt library (shared_null, ByteRef returned from operator[], exponential growth strategy). The cases where AtomicString is slower than SimpleString and SimpleString2 are typically for length 10 strings, but AtomicString quickly recovers for longer strings (mutatingCopy1 and mutatingCopy2).
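For readers who haven't seen the pattern, here is a minimal sketch of an atomically reference counted, implicitly shared string with a shared_null. It is not the AtomicString from the benchmark: CowString, sharesDataWith and the use of std::string as the buffer are stand-ins I picked to keep the example short, and it leaves out the ByteRef and the growth strategy.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical sketch, not the benchmark's AtomicString.
class CowString {
    struct Data {
        std::atomic<int> ref{1};
        std::string chars;  // stand-in for a raw character buffer
    };
    Data *d;

    // All default-constructed strings point at one shared empty Data.
    // Its count starts at 1 and is intentionally never released, so it
    // can never reach 0 and be deleted.
    static Data *sharedNull()
    {
        static Data *const null = new Data;
        return null;
    }
    void retain() { d->ref.fetch_add(1, std::memory_order_relaxed); }
    void release()
    {
        // Only the thread that drops the last reference frees the data.
        if (d->ref.fetch_sub(1, std::memory_order_acq_rel) == 1)
            delete d;
    }
    void detach()
    {
        // Copy on write: take a private copy if anyone else shares d.
        if (d->ref.load(std::memory_order_acquire) != 1) {
            Data *copy = new Data;
            copy->chars = d->chars;
            release();
            d = copy;
        }
    }

public:
    CowString() : d(sharedNull()) { retain(); }
    explicit CowString(const std::string &s) : d(new Data) { d->chars = s; }
    CowString(const CowString &other) : d(other.d) { retain(); }
    CowString &operator=(const CowString &other)
    {
        // Retain before release so self-assignment is safe.
        other.d->ref.fetch_add(1, std::memory_order_relaxed);
        release();
        d = other.d;
        return *this;
    }
    ~CowString() { release(); }

    std::size_t size() const { return d->chars.size(); }
    char operator[](std::size_t i) const { return d->chars[i]; }
    void append(char c)
    {
        detach();
        d->chars.push_back(c);
    }
    bool sharesDataWith(const CowString &other) const { return d == other.d; }
};
```

With a plain int instead of std::atomic<int>, two threads that each create and destroy their own empty strings would still be incrementing and decrementing the same shared_null counter without synchronization, which is the kind of corruption that rules SharedString out for threaded programs.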
I have included results for a few test machines in the benchmark as well, in the results/ subdirectory:
- results/results-rayon.txt - My dual-core AMD Opteron 165 @ 1.8GHz running Kubuntu Dapper Drake (amd64), built with GCC 4.0.3
- results/results-stri.txt - My Pentium M @ 1.73GHz running Kubuntu Dapper Drake (i386), built with the Intel C++ Compiler 9.1.043
- results/results-error.txt - My colleague's dual-core AMD Opteron 165 @ 1.8GHz running Windows XP (i386), built with MSVC2005
My conclusion: Atomically reference counting all implicitly shared classes is the best thing to do in Qt. You will see benefits for both threaded and non-threaded programs.