Uniform Referents

Back in 1969, at the first “Software Engineering” conference, Doug Ross presented a paper entitled “Uniform Referents: An Essential Property for a Software Engineering Language” [Ross 1970].  Ross’s idea was that the way that you access an attribute should be independent of an implementation. In his language, he used the notation A(B) to always mean “Property A of thing B”. If A were an array, then B would be an index; if A were a component, then B would be a pointer; if A were a function, then B would be an argument.

The reason that Ross though this was an essential property for a Software Engineering Language is that the implementation of a software artifact changes as its design evolves. Indeed, when you  start the design process, there is no implementation: you are just defining an interface, keeping all implementation options open. Ross argued that one ought to be able to slide different implementations under the same interface and not have to change the code.

Bertrand Meyer, in his 1998 book Object Oriented Software Construction, called what appears to be the same idea the “Uniform Access Principle”. We have it in Grace, at least where the “thing” is an object: we write b.a for attribute a of object b, regardless of whether the implementation of b chooses to store a in memory, or compute it on demand.

Uniform Referents in Grace: accessing object attributes

When we announced this decision during the early design of Grace, someone on the mailing list complained that, when reading a client program, they wouldn’t be able to distinguish field access from method invocation. That, indeed, was the whole point: we didn’t think that you ought to be able to distinguish.

Over the last few days I’ve been augmenting the Grace parse tree to enable more accurate highlighting of errors. The idea, which is hardly novel, is that the IDE should do something like this:

In this screenshot, the user accidentally typed dow instead of do; in addition to getting the message

in the grasscatcher, I wanted to highlight the location of the error.

The parse tree records the starting position (line and column) of each construct, so this ought to have been pretty simple. The problem was that it did not record the ending position. So I set out to fix this, by adding a method end to every node in the parse tree.

It turns out that there are about 30 different kinds of parse tress nodes. For most of them, figuring the end position is pretty trivial. For something like an identifier, it’s just the start position plus the size of the identifier (- 1). For a comment node, it’s the end of the line (because comments always go to the end of the line in Grace); the source code is available, so with a small amount of grubbing around, the end method can come up with the right answer.  Many nodes just inherit from baseNode the following method:

For most composite nodes, the end of the node is the end of the final component, and so on. In many nodes, the code for the end method was one line long.

For a few kinds of node, however, the answer isn’t so simple. The end of a string is not just string.size + 2 - 1  (+ 2 for the quotes) characters away from the start, because it might contain embedded escapes like \" or \t. So, in these cases, I simply added a field end to the node, and set that field in the parser when the parse node is created. There was only one place where string nodes were created, so this wasn’t very onerous.

Because end is a public field, it overrides the inherited end method. The clients of a parse node didn’t care — indeed, they couldn’t tell — whether end was implemented by a field or a method. Voilà! The Uniform Referent Principle at work!

A fly in the ointment

Fairly quickly, I had something working:

But not quite:

Notice that the closing parenthesis is not highlighted. The problem here is that the node that represents a method request is calculating its end position as the end of its last argument list. The argument list, in turn, is calculating the position as the end of the last argument in that list. At this point, no-one knows any longer whether that argument list was parenthesized. Scanning the input for closing parentheses won’t work, because we can’t tell if they belong to this request or some enclosing expression. What to do?

I though of a number of hacks, but it became clear that the right thing to do was simply to store the end position of the request in the parse node when the request is parsed — after all, the parser knows where it is in the input. The only issue was that there are lots of places where request nodes are created (some of them because of internal rewritings, which don’t correspond to any place in the input). Did I have to find (and fix) them all? After all, my heuristic that approximated the end of a request by the end of its last argument worked well enough most of the time.

My solution relied on Grace’s ability to override not just the field end by a method end, but also the setting of that field with end := newPosition. Instead of just replacing the end method with an end field, I made the method smart:

This code overrides not just end (lines 2–12) but also end:=(_) (on line 13): in Grace, we have not just uniform access to an attribute (as an r-value), but a true uniform referent mechanism that deals with l-values too.  In the code above I declare a field, called endPos , as well as the two methods.  A request on the method  end:=(_) is redirected to set  endPos. I hope that the logic of the end method is fairly clear. If  endPos still has its initial value of noPosition , we approximate the answer using the old heuristic; however, if  endPos has been set, we return its value.

Extending the principle

I was quite pleased with the way that Grace helped me handle this problem — it really is quite graceful. The Uniform Referent Principle is so useful, though, that I started to wonder why we didn’t adopt it elsewhere. Why do we make the programmer write

but

thus forcing them to distinguish between implementing a function by calculation and implementing it with a table? Why not write both

I wonder …