Integrating Compiler Messages

Attention to details is ok, but compiler messages has historically not received it. Here’s an example of GCC’s output:

qt/src/xml/query/expr/qcastingplatform.cpp: In member function 'bool CastingPlatform::prepareCasting():
qt/src/xml/query/expr/qcastas.cpp:117: instantiated from here
qt/src/xml/query/expr/qcastingplatform.cpp:85: error: no matching function for call to ‘locateCaster(int)’
qt/src/xml/query/expr/qcastingplatform.cpp:93: note: candidates are: locateCaster(const bool&)

Typically compiler messages have been subject to crude printf approaches and dignity has been left out: localization, translation, consistency in quoting style (for instance), adapting language to users (e.g, to not phrase things preferred by compiler engineers), good English, and just generally looking sensible.

To solve that it requires quite some work, and that’s probably the explanation to why it often is left out. To have line numbers, error codes, names of functions, and whatever available and flowing through the system requires quite some plumbing and room in the design.

Another thing is that nowadays we really should expect that compiler messages within IDEs or other graphical applications should be sanely typeset. If not, we’ve lost ourselves in all this UNIX stuff. Keywords and important phrases should be italic, emphasized, colorized depending on the GUI style.

For shuffling compiler messages around it is customary to pass a set of properties: a URI, line number, column number, a descriptive string, and possibly an error code. Apart from that it falls short reaching the goals outlined in this text, it encounters a problem which I think is illustrated in the above example with GCC. What does one do if the message involves several locations?

Even if a message involves several locations, it is still one message and should be treated so, and presented as so. The approach of using a struct with properties falls short here, and chops the message into as many parts as it has locations.

For Patternist I wanted to make an attempt at improving messages. So far it is an improvement at least. For instance, for this message that the command line tool patternist outputs:


the installed QAbstractMessageHandler was passed a QSourceLocation and a message which read:

<p>Operator <span class='XQuery-keyword'>+</span> is not available between atomic values of type <span class='XQuery-type'>xs:integer</span> and <span class='XQuery-type'>xs:string</span>.</p>

It was subsequently converted to local encoding and formatted with ECMA-48 color codes. (The format is not spec’d yet, it will probably be XHTML with specified class ids.)

While using markup for the message is a big improvement, it opens the door for formatting and all, this API still has the problem of dealing with multiple locations.

What is the solution to that?

Striking the balance between programmatic interpretation(such that for instance source document navigation is doable) and that the message reads naturally as one coherent unit is to… maybe duplicate the information, but each time tailored for a particular consumer?

<p xmlns:l="">In my <l:location href="myDocument.xml" line="57" column="3">myQuery.xq at line 57, column 3</l:location>, function <span class="XQuery-keyword">fn:doc()</span> failed with code <span class="XQuery-keyword">XPTY0004</span>: the file <l:location href="myDocument.xml" line="93" column="9">myDocument.xml failed to parse at line 93, column 9</l:location>: unexpected token <span class="XQuery-keyword">&</span>.</p>

This is complicated by that language strings cannot be concatenated together since that prevents translation. But I think the above paragraph is possible to implement. As above, the message reads coherently, but still allows programmatic extraction. A language string and formatted data sits in opposite corners of extremity, and maybe markup is the balance between them.

Would this give good compiler messages and allow slick IDE integration? If not, what would?

Blog Topics: