This is an example. An anecdote. But it illustrates a problem, and a hole in our understanding of the novice’s programming process.
The following are three snippets of Java code:
Snippet One:
public void setFoo() {
. . .
}
Snippet Two:
public int getBar() {
. . .
}
Snippet Three:
public String getGeeWhiz() {
. . .
}
The difference between these snippets has to do with colour, or the lack thereof. In the first two examples, it seems as though the return type is what gets highlighted in red; in Snippet Three, the return type String is not highlighted at all. In fact, if you start learning to program, and spend the first three weeks of your programming career using only int, void, and boolean, you never encounter Snippet Three.
When you do encounter Snippet Three, you wonder “Why isn’t that highlighted?” And clearly, something you’ve done is wrong. Or, perhaps it isn’t clear. Either way, the reason these words are highlighted is a very leaky abstraction (a la Joel Spolsky). This is a perfect example of an IDE breaking a neat nesting of abstractions like I described in a post entitled My First Programming Language. The syntax highlighting (oddity?) depicted above is differentiating between primitive types and object types in Java. The reason for there being different … well, types of types is buried deep in the implementation of the Java Virtual Machine, and has to do with choices made long ago by the Java language designers. In other words, this particular choice of syntax colouring has nothing to do with a programmer who has only been interacting with the Java programming language for three weeks.
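One plausible mechanism behind this, and it is only my assumption about how such an editor works, is that the highlighter colours reserved words and nothing else. In Java, void, int, and boolean are reserved words, while String is an ordinary class name. The sketch below uses the standard javax.lang.model.SourceVersion.isKeyword method to show that split; the class name KeywordOrClassName and the method classify are made up for illustration.

```java
import java.util.List;
import javax.lang.model.SourceVersion;

// Hypothetical sketch: shows which return types from the snippets are
// reserved words and which are plain identifiers (class names).
final class KeywordOrClassName {

    // Returns "keyword" for Java reserved words, "identifier" otherwise.
    static String classify(String word) {
        return SourceVersion.isKeyword(word) ? "keyword" : "identifier";
    }

    public static void main(String[] args) {
        for (String word : List.of("void", "int", "boolean", "String")) {
            System.out.println(word + " -> " + classify(word));
        }
    }
}
```

If the editor does work this way, the red colouring tracks the language's list of reserved words, not the role a word plays as a return type, which is exactly the detail a three-week novice has no way of knowing.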
One of my students ran into this just yesterday. They were, I believe, thrown (in whole or in part) by the fact that the word String was not highlighted in red. And the reason that the word String is not highlighted has nothing to do with its position in the code, but everything to do with decisions made by Java’s designers. In short, the distinction means nothing to a novice programmer, and perhaps has no business being made in a programming editor for beginners.
I’ve been led to believe that the book Human Factors and Typography for More Readable Programs by Ronald M. Baecker and Aaron Marcus represents some of the only research on the role of typography in program comprehension, and that it was based entirely on the printed form of programs, not the display of code in an interactive text editor (although we would hope some of the research would transfer). I haven’t been able to divine (via Google) any work regarding the role of colour and syntax highlighting in program editors, or the effect of colour on the programming process.
So a student of mine ran into this whole mess full tilt, and no-one knows anything about why we make the choices regarding highlighting that we do, except that those choices are typically made by experts. It would appear that we have little or no research to guide us in implementing syntax colouring and other types of programming support for novices.
POSTSCRIPT
As a point of comparison, I downloaded another Java programming tool for novices. In this particular editor, you can change the colours used, but not the kinds of things coloured. That’s OK. Interestingly, there’s only one category for “types,” and it produces code that looks like
public class Foo {
    private int x;
    public String getFoo() {
        ...
    }
}
The difference? Everything that is a type is highlighted in the same colour. I like this (from the perspective of instructing novices) because classes are types, and the colouring reinforces this fact. Good. Would this other tool have helped my student? Perhaps not. Why? Because he could have typed
public class Foo {
    private int x;
    public Stringg getFoo() {
        ...
    }
}
(misspelling “String”) and the editor still colours it. So, really, I would argue that (done right) the editor should actually keep track of the types that are available to the student at the particular line they are on, and highlight things accordingly. This way, type declarations and usages that are misspelled, or that depend on a missing import statement (effectively, anywhere the type doesn’t yet exist, syntactically), do not get highlighted.
public class Foo {
    private int x;
    public Stringg getFoo() {   // Stringg is left uncoloured: no such type exists here
        ...
    }
}
Of course, I’m just hypothesizing; a study would be needed to explore whether this use of colour for highlighting type declarations helps students find syntax errors, and so to verify my hypothesis that this is indeed a (better?) behaviour.
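The idea of colouring only the types that actually exist at a given line can be sketched in a few lines of Java. This is a minimal illustration under stated assumptions, not how any real editor does it: the class KnownTypeHighlighter and its methods declare and shouldHighlight are hypothetical names, and the set of "visible" types is seeded by hand where a real tool would have to parse the file and its imports.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch: none of these names come from a real editor.
final class KnownTypeHighlighter {

    // Types assumed visible before the student declares or imports anything:
    // the ones used in this post, plus String, which needs no import.
    private final Set<String> visibleTypes =
            new HashSet<>(List.of("void", "int", "boolean", "String"));

    // Called when the editor sees a class declaration or an import statement.
    void declare(String typeName) {
        visibleTypes.add(typeName);
    }

    // A word sitting in a type position is coloured only if it names a
    // type that is visible at this point in the file.
    boolean shouldHighlight(String wordInTypePosition) {
        return visibleTypes.contains(wordInTypePosition);
    }

    public static void main(String[] args) {
        KnownTypeHighlighter highlighter = new KnownTypeHighlighter();
        highlighter.declare("Foo"); // the student has written "public class Foo"

        for (String word : List.of("int", "String", "Stringg", "Foo", "Scanner")) {
            String result = highlighter.shouldHighlight(word) ? "coloured" : "plain";
            System.out.println(word + " -> " + result);
        }
    }
}
```

Here Stringg stays plain because no such type exists, and Scanner stays plain until something like declare("Scanner") stands in for the missing import. Whether that visual cue actually helps a novice is the open question.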
Emacs highlights Java like your second editor, where every type, or anything in a position where a type should be, gets the same colour.
While your idea might work well for Java, where you need to define all your classes explicitly, it would not help with other languages like ML or PROLOG, where you can define new types and classes at will, and there would be no way to know if you really meant to type Stringg or String. It seems to be a problem of having the editing tool understand your semantics based on your syntax alone, which is a hard problem for people, let alone computers.