In this article, I talk about building Clang with PGO (Profile-Guided Optimization) and what that means for Qt Creator.
Qt Creator has had a PGO build of clangd (and previously libclang.dll) on Windows x64 for quite a while now. This was a MinGW GCC build of clangd, since the MSVC compiler generated a slower PGO binary.
With the update to Clang 22, MinGW GCC 13.1.0 had issues performing the PGO build, which forced a different approach.
This approach was the multi-stage PGO build of Clang itself.
The PGO build of clangd is no longer exclusive to Windows x64; it is now available on all platforms!
The llvm.org documentation mentions:
“PGO (Profile-Guided Optimization) allows your compiler to better optimize code for how it actually runs. Users report that applying this to Clang and LLVM can decrease overall compile time by 20%.”
Test setup
I have tested Qt Creator 19.0.2 (clangd 21.1.2) and Qt Creator 20.0.0 beta1 (clangd 22.1.2).
In Qt Creator, I opened the source code of the Qt Creator 19.0 branch, opened texteditor.cpp (~11,000 lines), and scrolled to the end of the file.
Then, I enabled the Debug level for the logging category qtc.languageserverprotocol.timing and collected these two values:
16:45:10.429 qtc.languageserverprotocol.timing: received server reply to "textDocument/documentHighlight" after 1912 ms
16:45:10.443 qtc.languageserverprotocol.timing: received server reply to "textDocument/semanticTokens/full" after 639 ms
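The two log lines above can be turned into numbers with a small script. This is a minimal sketch, not the tooling used for the article: the regular expression is inferred from the two sample lines only, the function names are hypothetical, and averaging repeated runs is an assumption about how several runs would be combined.

```python
import re
from statistics import mean

# Pattern inferred from the two sample log lines; other Qt Creator versions
# may format the message differently.
TIMING_RE = re.compile(
    r'qtc\.languageserverprotocol\.timing: received server reply to '
    r'"(?P<method>[^"]+)" after (?P<ms>\d+) ms'
)


def collect_timings(log_text):
    """Return {LSP method: [milliseconds, ...]} for every timing line found."""
    timings = {}
    for match in TIMING_RE.finditer(log_text):
        timings.setdefault(match.group("method"), []).append(int(match.group("ms")))
    return timings


def average_timings(log_text):
    """Average the collected timings per LSP method (assumed aggregation)."""
    return {method: mean(values) for method, values in collect_timings(log_text).items()}


sample = (
    '16:45:10.429 qtc.languageserverprotocol.timing: received server reply to '
    '"textDocument/documentHighlight" after 1912 ms\n'
    '16:45:10.443 qtc.languageserverprotocol.timing: received server reply to '
    '"textDocument/semanticTokens/full" after 639 ms\n'
)

for method, value in average_timings(sample).items():
    print(method, value, "ms")
```

Pasting the log of several runs into `sample` would give one averaged value per request type.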
The hardware used for testing:
- Apple MacBook Pro (Apple M3 Pro) with macOS 26.5
- Bosgame M5 (AMD RYZEN AI MAX+ 395 w/ Radeon 8060S) with Windows 11 x64 25H2, and Ubuntu 24.04 in WSL2
- ThinkPad T14s Gen 6 (Snapdragon(R) X Elite - X1E78100) with Windows 11 arm64 25H2, and Ubuntu 24.04 in WSL2
clangd speed up
I have tested five times on each computer and operating system.
| Operating System | Qt Creator version | clangd version | Document Highlight | Speed Up | Semantic Tokens | Speed Up |
| --- | --- | --- | --- | --- | --- | --- |
| macOS (MacBook) | 19.0.2 | 21.1.2 | 1914.8 ms | - | 643.4 ms | - |
| macOS (MacBook) | 20.0.0 beta1 | 22.1.2 | 1428.6 ms | 25.4% | 515.8 ms | 19.8% |
| Windows x64 (Bosgame) | 19.0.2 | 21.1.2 | 3233.4 ms | - | 953.4 ms | - |
| Windows x64 (Bosgame) | 20.0.0 beta1 | 22.1.2 | 2837.8 ms | 12.2% | 807 ms | 15.4% |
| Linux x64 (Bosgame) | 19.0.2 | 21.1.2 | 3282.8 ms | - | 982.4 ms | - |
| Linux x64 (Bosgame) | 20.0.0 beta1 | 22.1.2 | 2581.2 ms | 21.4% | 784.4 ms | 20.2% |
| Windows arm64 (ThinkPad) | 19.0.2 | 21.1.2 | 3865 ms | - | 1096.6 ms | - |
| Windows arm64 (ThinkPad) | 20.0.0 beta1 | 22.1.2 | 2553.8 ms | 33.9% | 751 ms | 31.5% |
| Linux arm64 (ThinkPad) | 19.0.2 | 21.1.2 | 3020.6 ms | - | 938.8 ms | - |
| Linux arm64 (ThinkPad) | 20.0.0 beta1 | 22.1.2 | 2050.4 ms | 32.1% | 709.2 ms | 24.5% |
The Windows x64 (Bosgame) improvement is smaller because Qt Creator 19 already had a MinGW GCC PGO-optimized clangd.
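The Speed Up column is consistent with the relative reduction in response time compared to the Qt Creator 19.0.2 row. The formula below is inferred from the table values rather than stated in the article, and the function name is hypothetical.

```python
def speed_up_percent(baseline_ms, optimized_ms):
    """Relative reduction in time, as a percentage of the baseline."""
    return (baseline_ms - optimized_ms) / baseline_ms * 100.0


# macOS (MacBook), Document Highlight: 1914.8 ms -> 1428.6 ms
print(f"{speed_up_percent(1914.8, 1428.6):.1f}%")  # prints 25.4%

# Windows arm64 (ThinkPad), Semantic Tokens: 1096.6 ms -> 751 ms
print(f"{speed_up_percent(1096.6, 751):.1f}%")  # prints 31.5%
```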
Build times
Now that I had a PGO-optimized compiler, I thought to myself: why not do a PGO-optimized build of the linker and the debugger as well?
I decided to build Qt Creator 19.0 itself three times on every platform, once with the native compiler and once with the PGO-optimized Clang compiler and linker.
| Operating System | Compiler | Build Time (mm:ss) | Speed Up |
| --- | --- | --- | --- |
| macOS (MacBook) | Apple Clang 21.0.0 | 06:17 | - |
| macOS (MacBook) | Clang 22.1.2 | 05:45 | 8.5% |
| Windows x64 (Bosgame) | MSVC 2026 | 06:54 | - |
| Windows x64 (Bosgame) | Clang-cl 22.1.2 | 04:40 | 32.3% |
| Linux x64 (Bosgame) | GCC 13.3.0 | 06:30 | - |
| Linux x64 (Bosgame) | Clang 22.1.2 | 04:15 | 34.6% |
| Windows arm64 (ThinkPad) | MSVC 2026 | 18:35 | - |
| Windows arm64 (ThinkPad) | Clang-cl 22.1.2 | 10:30 | 43.5% |
| Linux arm64 (ThinkPad) | GCC 13.3.0 | 16:29 | - |
| Linux arm64 (ThinkPad) | Clang 22.1.2 | 10:32 | 36.1% |
These are significant results! The macOS improvement is smaller, likely because Apple performs a PGO build of clang but does not use Qt code as part of the training.
Notes
On Windows, I used this CMakeUserPresets.json file to compile with clang-cl. To ensure the correct configuration, I manually configured the MSVC 2026 environment in Qt Creator so that the CMake preset kit will pick it up.
{
  "version": 4,
  "cmakeMinimumRequired": {
    "major": 3,
    "minor": 23,
    "patch": 0
  },
  "configurePresets": [
    {
      "name": "MSVC-Release",
      "displayName": "MSVC Release Build",
      "binaryDir": "${sourceDir}/build/${presetName}",
      "generator": "Ninja",
      "architecture": {
        "value": "x64"
      },
      "cacheVariables": {
        "CMAKE_C_COMPILER": "c:/llvm/msvc/bin/clang-cl.exe",
        "CMAKE_CXX_COMPILER": "c:/llvm/msvc/bin/clang-cl.exe",
        "CMAKE_LINKER": "c:/llvm/msvc/bin/lld-link.exe",
        "CMAKE_CXX_FLAGS_INIT": "-Wno-clang-cl-pch",
        "CMAKE_PREFIX_PATH": "c:/Qt/6.11.1/msvc2022_64;c:/llvm/msvc",
        "CMAKE_BUILD_TYPE": "Release",
        "WITH_QMLDESIGNER": false,
        "WITH_CCACHE_SUPPORT": false
      }
    }
  ]
}
On macOS, the Clang compiler had an issue with the macwebkithelpviewer.mm Objective-C++ file, so I decided to exclude it from the build:
--- src/plugins/help/CMakeLists.txt
+++ src/plugins/help/CMakeLists.txt
@@ -34,15 +34,15 @@ extend_qtc_plugin(Help
   DEFINES QTC_DEFAULT_HELPVIEWER_BACKEND="${HELPVIEWER_DEFAULT_BACKEND}"
 )

-extend_qtc_plugin(Help
-  CONDITION FWWebKit AND FWAppKit
-  FEATURE_INFO "Native WebKit help viewer"
-  DEPENDS ${FWWebKit} ${FWAppKit}
-  DEFINES QTC_MAC_NATIVE_HELPVIEWER
-  SOURCES
-    macwebkithelpviewer.h
-    macwebkithelpviewer.mm
-)
+# extend_qtc_plugin(Help
+#   CONDITION FWWebKit AND FWAppKit
+#   FEATURE_INFO "Native WebKit help viewer"
+#   DEPENDS ${FWWebKit} ${FWAppKit}
+#   DEFINES QTC_MAC_NATIVE_HELPVIEWER
+#   SOURCES
+#     macwebkithelpviewer.h
+#     macwebkithelpviewer.mm
+# )

 option(BUILD_HELPVIEWERBACKEND_QTWEBENGINE "Build QtWebEngine based help viewer backend." YES)
 find_package(Qt6 COMPONENTS WebEngineWidgets QUIET)
I don’t think this was the reason the build was 8.5% faster, though! 😅
On the ThinkPad, I increased the WSL2 memory size from 16 GB to 24 GB, but I encountered some compilation issues with precompiled headers. I added the -k 0 argument to the build step so that the build continues despite errors.
The error was:
src/tools/qtc-askpass/qtc-askpass_autogen/mocs_compilation.cpp
error: is pie differs in precompiled file '/home/cristian/Projects/QtCreator/build/GCC-Release/src/libs/3rdparty/syntax-highlighting/CMakeFiles/QtCreatorPchGui.dir/cmake_pch.hxx.pch' vs. current file
I tried setting CMAKE_POSITION_INDEPENDENT_CODE to ON so that all CMake targets would receive the property, but it didn’t help.
I suspect it is a Clang bug on Linux arm64, as everything worked fine on Linux x64 using the same Ubuntu version, same Qt Creator code checkout, and same Clang compiler.
CMake has an open issue about this case: “PCH: PIC/PIE mismatch when using REUSE_FROM”.
The real speed-up on Linux arm64 is slightly lower than the reported 36.1%, since the build did not complete every step. The final Ninja progress looked like:
[3630/3642 5.7/sec] Elapsed time: 10:33
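To see what that status line implies, the sketch below parses it and redoes the arithmetic. The pattern is inferred from the single line quoted above, the helper names are hypothetical, and the baseline is the GCC 13.3.0 time from the build table.

```python
import re

# Pattern inferred from the one status line quoted in the article.
STATUS_RE = re.compile(
    r"\[(?P<done>\d+)/(?P<total>\d+) [\d.]+/sec\] "
    r"Elapsed time: (?P<mm>\d+):(?P<ss>\d+)"
)


def mmss_to_seconds(text):
    """Convert an mm:ss string such as '16:29' to seconds."""
    minutes, seconds = text.split(":")
    return int(minutes) * 60 + int(seconds)


def parse_status(line):
    """Return (finished steps, total steps, elapsed seconds) from a status line."""
    match = STATUS_RE.search(line)
    if match is None:
        raise ValueError(f"unrecognized status line: {line!r}")
    elapsed = int(match["mm"]) * 60 + int(match["ss"])
    return int(match["done"]), int(match["total"]), elapsed


done, total, elapsed = parse_status("[3630/3642 5.7/sec] Elapsed time: 10:33")
baseline = mmss_to_seconds("16:29")  # GCC 13.3.0 on Linux arm64

print(total - done)  # prints 12, the steps that never finished
print(f"{(baseline - elapsed) / baseline * 100:.1f}%")  # prints 36.0%
```

The twelve unfinished steps would add further time on top of the 10:33 shown, so the true figure sits somewhat below that.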
You can find my raw notes for Bosgame, MacBook Pro, and ThinkPad.
Other benefits
The build is faster, but I cannot say much about the generated code. I can only point to the clangd tests above, which show that the clangd built with Clang was faster than the ones built with GCC and MSVC.
On macOS, you get to use C++20 modules, which cppreference.com documents as “Partial” for Apple Clang, and which arewemodulesyet.org marks with ❌.
On Windows, you can use long source file paths, since Clang handles them without issue, whereas MSVC cannot.
Availability
You can download the Clang builds (7z archives) from download.qt.io.
Qt Creator ships with clangd, and on Windows, it includes the lldb debugger.
Should Qt Creator also ship the compiler and linker by default? Let us know in the comments below!