Loop with exits revisited

I was recently talking to one of the programmers moving some Java code into Grace. He was refactoring Java code that uses “break” and “continue” statements, and wondered why Grace doesn’t support them. I answered that even if break and continue aren’t in the standard library, they can certainly be written by a programmer.

The only infelicity is that rather than being commands, “break” and “continue” in Grace will be blocks, supplied as arguments into loops: to take one of these exits the program must apply the appropriate block argument. Here’s a simple loop using both “break” and “continue” (not exactly the best style, but hopefully illustrative):

running this code produces:

A 1
B 1
A 2
A 3
B 3
A 4

as you might expect.

How is this implemented? Well first, here’s a simpler case: a “doWithExit” statement that takes a block and exits when the block is applied:

this will print “A”, “B”, and then “Done” because “exit.apply” exits the block. The definition of the “doWithExit” statement is quite straightforward:

we just run the block supplied to the method, passing a second block as an argument. This second block calls return and so exits from the method, consequently exiting the argument block early.
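For readers who want to experiment outside Grace, here is a minimal sketch of the same idea in Python. It is not the Grace definition: Python has no non-local return from a block, so the sketch swaps in an exception to unwind back to the call. The names `do_with_exit`, `_Exit` and `leave` are my own assumptions for illustration.

```python
class _Exit(Exception):
    """Private signal used to unwind back to the matching do_with_exit call."""


def do_with_exit(body):
    """Run body(leave); calling leave() abandons the rest of body."""
    signal = _Exit()  # a fresh signal per call, so nested calls do not interfere

    def leave():
        raise signal

    try:
        body(leave)
    except _Exit as caught:
        if caught is not signal:
            raise  # this exit belongs to an enclosing do_with_exit


out = []


def body(leave):
    out.append("A")
    out.append("B")
    leave()
    out.append("C")  # never reached


do_with_exit(body)
out.append("Done")
print(out)
```

The fresh signal object per call plays the role that the method activation plays in Grace: it ties each exit to exactly one enclosing call.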

We can then build upon “doWithExit” to build up the “loopWithExits” statement above. The trick here is we inject two blocks into the looping code. The first of these, the “break” exit, returns from the whole “loopWithExits” statement, stopping the loop (just like C or Java’s break) — the implementation here is just the same as the single exit in the “doWithExit” method. The second injected argument comes from a “doWithExit” inside the loop proper to exit from only the code being looped. When that inner code returns, the control flows out of the “doWithExit” block, into the block controlled by the “loop” statement. Since we’re at the end of the loop, the control flows around to the next iteration of the loop — just like C or Java’s continue.
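The two-level structure described above can be sketched in Python as well. Again this is an analogue under stated assumptions, not the Grace code: exceptions stand in for non-local returns, all names (`loop_with_exits`, `do_with_exit`, `brk`, `cont`) are hypothetical, and the conditions in the example loop (continue at 2, break at 4) are inferred from the printed output shown earlier.

```python
import itertools


class _Exit(Exception):
    """Private signal used to unwind back to the matching do_with_exit call."""


def do_with_exit(body):
    """Run body(leave); calling leave() abandons the rest of body."""
    signal = _Exit()

    def leave():
        raise signal

    try:
        body(leave)
    except _Exit as caught:
        if caught is not signal:
            raise  # let an outer exit keep unwinding


def loop_with_exits(body):
    """Call body(brk, cont) repeatedly until brk() is called."""

    def whole_loop(brk):
        while True:
            # The inner exit leaves only the current iteration ("continue").
            do_with_exit(lambda cont: body(brk, cont))

    # The outer exit leaves the whole loop ("break").
    do_with_exit(whole_loop)


lines = []
counter = itertools.count(1)


def step(brk, cont):
    i = next(counter)
    lines.append(f"A {i}")
    if i == 2:
        cont()
    if i == 4:
        brk()
    lines.append(f"B {i}")


loop_with_exits(step)
print("\n".join(lines))
```

When `brk()` is called, the inner `do_with_exit` sees a signal that is not its own and passes it on, which is how the break escapes both levels.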

This isn’t the simplest Grace code (frankly I wouldn’t teach it at all to first year students) but hopefully it does show the kind of things that programmers can write in a simple language based on “a small number of underlying principles that can be applied uniformly”. In Grace, “loopWithExits” is just a method, like any other. The “loopWithExits” method is implemented using nothing but objects, blocks, requests, and returns, without any macro processing or explicit metaprogramming.

Finally, it’s interesting to realize that we talked about this on the blog over a year ago and outlined what we hoped the basic Grace code for doing this would be. It’s gratifying to realize that the code above, that works on our current prototype Grace systems, is pretty much the same as last year’s sketches. And it’s even more gratifying to see last year’s sketches made flesh and running this year!

Private & Confidential

We’ve been discussing encapsulation controls in Grace. For teaching, a rather conventional “visibility system” that lets features of an object be accessible only to

(a) the object itself, or
(b) the object and the heirs of that object, or
(c) the object, heirs and clients of the object

is probably necessary and sufficient, although we’re not 100% sure we need to support each of these visibility levels. Necessary, because at least some instructors will want to teach students how to use such a system, and sufficient for our intended audience. (I still believe that an industrial-strength language in which reuse is a goal should put control of encapsulation in the hands of the client, as we did in the Schärli, Black & Ducasse OOPSLA 2004 paper. However, even with controllable encapsulation policies, there still needs to be a way to define default policies for heirs and clients.)

If one accepts that as a starting point, there are still a number of decisions to be made. We need to pick names for each kind of visibility (so we can have this conversation), and then we need to decide how to apply the different levels to the features of an object: its methods, variables and constants. Another decision is whether the visibility declarations should be advisory, or whether there should be work-arounds, as there are in almost every industrial-strength language. (Even if you really want a feature to be private, you probably still want to be able to look at it in the debugger.)

The problem with names is that all of the good ones are taken. The best name that I can think of for (a) is private, but Java uses private to mean “accessible not only by me, but also by any object that happens to share the same class as me”, and Ruby uses private to say that the method can be used only with self as the receiver, which means that it can be used by heirs as well as by the object itself. Still, private is the best name that we could think of for (a). For access level (c), public seems entirely appropriate, but level (b) is tricky. “Family access only” is the most descriptive phrase that I could come up with. “Confidential” is a possibility: less private than private, but not public. Other possibilities are “hidden”, but it’s not clear that hidden is more accessible than private, and “protected”, which has the same disadvantage, as well as being taken by Java to mean “more accessible than the default”. Can you suggest good terms?

Speaking of defaults, should we have default visibilities, or must they always be declared? The principles of clarity and readability tell us that we should tag each feature with its visibility explicitly. However, the principle of brevity tells us that the common case should require fewer characters on the screen. The common case, I believe (at least in the beginning), is that methods should be public and that constants and variables should be private, or perhaps confidential. But that gives rise to a complicated rule: the default depends on how the feature is implemented. A simpler rule, simpler to teach and simpler to remember, would be for every feature to have the same default visibility. So those of us who are enthusiasts for encapsulation would make all features default to private, and those of us who are enthusiasts for reuse would make everything default to confidential. However, in both cases, methods intended for clients would have to be declared to be public explicitly. Is this an undue burden, considering that, at least in the beginning, the only reason to declare a method is to provide public access?

Another issue is the syntax for these accessibility declarations. Should we use keywords public, private and confidential? Should we use sigils, such as UML’s +, - and #? Should we use annotations (attributes), as we are considering for “overrides”? The argument for sigils is brevity, and lack of syntactic clutter, especially if every method, constant and variable needs to have its visibility made explicit; the argument against is that we want students to learn the right vocabulary, and refer to private methods as “private methods” and not as “- methods”. This is the rationale that has led us towards using more keywords in Grace than in many other languages (for example, labeling methods with method), and generally spelling keywords out in full (integer rather than int).

A final issue — and one that is often neglected — is to define the semantics of the visibility annotations. Private methods are not really methods at all: they are early bound and have lexical scope, and thus they should not be part of an object’s type. Public and Confidential methods are both real, late bound methods; the distinction is that confidential methods can be requested only from self. We think it’s reasonable to make that a syntactic restriction, so that

self.protectedMethod

is OK, but

o := self
o.protectedMethod

is an error.
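To make “a syntactic restriction” concrete, here is a rough sketch of such a check, written in Python and applied to Python-syntax snippets rather than Grace. It only looks at the text of the receiver, exactly as the rule above suggests: a request is accepted only when the receiver is literally the name self. The function name `confidential_violations` and the way confidential names are supplied are assumptions for illustration.

```python
import ast


def confidential_violations(source, confidential):
    """Return the line numbers where a confidential name is requested
    on any receiver other than the literal name `self`."""
    violations = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Attribute) and node.attr in confidential:
            receiver = node.value
            if not (isinstance(receiver, ast.Name) and receiver.id == "self"):
                violations.append(node.lineno)
    return sorted(violations)


ok_source = "self.protectedMethod()"
bad_source = "o = self\no.protectedMethod()"

print(confidential_violations(ok_source, {"protectedMethod"}))
print(confidential_violations(bad_source, {"protectedMethod"}))
```

Because the check is purely syntactic, aliasing self through another variable is rejected even though the object is the same at run time.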

Should Confidential methods appear in an object’s type? That rather depends on the rôle of types: do they exist to describe what clients can do, or what heirs can do? It’s actually quite reasonable to have two different type systems with those two different goals, but the type system for clients will be the more widely used, and probably the one that teachers will want to emphasize. We think that all of the checks required for the correct use of inheritance can be done statically in a whole-program analysis without burdening the programmer with a second type system.

Types vs Classes

Like many object-oriented languages, Grace will have classes. Like some object-oriented languages, Grace will have types. As in a few object-oriented languages, Grace programmers have the option to ignore classes and use only objects, or to ignore static types and use only dynamic types.

The relationships between objects and classes, and static and dynamic types, are well known. So, then, what’s the relationship between classes and types in Grace? Or rather, what should be the relationship between classes and (static) types?

Here’s the problem. Let’s take a simple Grace class:

This class creates a factory object that supports the creation of new Cat instances in response to the method request “new(aColour,aName)” (cognoscenti will notice we’re trying “def x =” syntax to define constants rather than “const x :=”. Sorry Niklaus).

What type does the variable “fergus” have? As in C#, local type inference gives it whatever type the “Cat.new” method returns. What if we want to declare that type explicitly, say for a variable?

Here the name “Cat” is being used as a type, rather than a factory. The key question is: where did that type come from? There seem to be two options in the design here:

  1. The Cat class declaration implicitly creates a Cat type.
  2. The Cat type must be declared explicitly, separate from the class
    declaration:

The first option, a class implicitly creating a type, is what most typed object-oriented languages do: a class declaration also creates a type (technically the cone type rooted at the class). Implicit class-types lead to more concise programs, and allow “static typing early” courses to have students write and use their own classes without requiring an explicit concept or separate declaration of a static type.

On the other hand, the second option, explicit type declarations, makes static types much more explicit. Under this option, Grace programmers couldn’t declare an explicitly typed variable (or more likely, any method, as method arguments are not inferred) without an explicit declaration of the Cat type. But this clarity comes at a price: simple programs are longer, requiring apparently redundant type declarations that are close to class declarations but duplicate some information with some mandatory tweaks.
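The trade-off between the two options can be sketched in Python, which happens to support both styles. This is only an analogue, not Grace syntax. `Cat`, `fergus` and the `new(aColour, aName)` request come from the discussion above; `CatType`, `CatFactory`, `describe` and `amelia` are hypothetical names added for illustration.

```python
from dataclasses import dataclass
from typing import Protocol, runtime_checkable


# Option 1: the class declaration doubles as a type.
@dataclass
class Cat:
    colour: str
    name: str

    def describe(self) -> str:
        return f"{self.name} is a {self.colour} cat"


fergus: Cat = Cat("black", "Fergus")  # "Cat" used as factory and as type


# Option 2: the type is declared separately from whatever creates the objects.
@runtime_checkable
class CatType(Protocol):
    colour: str
    name: str

    def describe(self) -> str: ...


class CatFactory:
    def new(self, colour: str, name: str) -> CatType:
        return Cat(colour, name)


amelia: CatType = CatFactory().new("tortoiseshell", "Amelia")

print(fergus.describe())
print(isinstance(amelia, CatType))
```

Option 2 repeats the member list of the class almost word for word, which is the redundancy described above.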

Grace programmers can avoid the price of a separate declaration in a couple of ways. First they can use dynamic types or local type inference: types omitted from variable and constant definitions are found via local inference (if the type-checker is run), while types omitted from method arguments and results are interpreted as type dynamic. So the costs of declarations (presumably) would only be incurred whenever a type is to be written explicitly. Still, this is another case where a “better” program (with explicit types) is longer and more redundant than a “worse” program (without them). Most Grace programmers may choose to omit the declarations, so the language would fall into being dynamically typed by default.

In fact, the real situation is worse than this. There are about five or six kinds of “class-like” or “type-like” objects in Grace, and a good solution here should address all these roles:

  • Factory: the object that creates new instances of a class

  • Type: the type with which variables, arguments, and methods are declared

  • Reified Type Parameter: since Grace will have “reified”
    generics, what value or object should the reified value be?

    Note that inside the Collection, the reified type argument will have to be bound to the formal type parameter.

  • Pattern: the object used to match objects of that type in
    match/case statements

  • Mirrors: used to reflect on instances of the class

If these roles are played by different objects, how many different namespaces will Grace need to name them all? If they are accessible in a shared namespace, how are names resolved?

Finally, following C#, Grace will provide constructs to reify the declared static type of an expression (perhaps “decltype(e)”) and the exact dynamic type of an object (“o.dyntype”, or perhaps alternatively “reflect(o).dyntype” via a mirror). The aim here is to let programmers write programs that interrogate the static and dynamic types in their programs. And, whatever the relationship we end up with, do we need better names for static type and dynamic type?

Learning Edge Momentum

Like many computer science or software engineering departments, at VUW we seem cursed with a “bimodal” distribution of marks in first year programming courses. While some students do very well and collect A or A+ grades, about as many do very poorly, taking away only Ds and Es, with relatively few students in the middle. This profile is very different from most other courses at the university, and generally from our second and third year courses, which have much more normal distributions.

A number of hypotheses have been proposed for the bimodal distribution — most commonly, that a large proportion of the population are congenitally unable to learn programming, and that our advertising fails to dissuade them from enrollment.

Recently, Anthony Robins, a colleague from Otago University in NZ (the southern-most university in the world, and the oldest university in NZ), has developed a novel rationale to explain these distributions. His paper, “Learning Edge Momentum”, hypothesizes that introductory programming is unlike many other disciplines; in particular, that “success in acquiring one concept makes learning other closely linked concepts easier (whereas failure makes it harder).”

What’s this to do with Grace? Well, one of our main aims in the design of Grace is to reduce the accidental difficulties of learning to program. In Robins’ terms, I think this could be described as “uncoupling the concepts” within the language: partly removing concepts, but mostly trying to make concepts less closely linked.

Here’s a simple example. In Java 1.0:

To me, this needs a whole collection of tightly-linked concepts:

  • For loop
  • Integers, and “int” type
  • Variables and assignment
  • Length of a collection and that “col.length()” gets the length
  • Less than, and that “<” is less-than
  • ++ increment operator…
  • “col.at(i)” to get i’th member of a collection.
  • System.out.println to print things.

But the “new for loop” in Java 5.0 needs far fewer concepts:

In particular:

  • For loop
  • Variable declaration
  • System.out.println to print things.

Hopefully Grace’s “for” loop:

will be closer to the Java 5.0 loop than to the Java 1.0 one!
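The difference in concept count between the two loop styles can be sketched in Python. This is neither the Java nor the Grace code, only a stand-in; the function names `show_indexed` and `show_each` are hypothetical.

```python
def show_indexed(col):
    """Index-driven loop: needs a counter, a bound, a comparison,
    an increment and indexing, all at once."""
    shown = []
    i = 0                     # variable and assignment
    while i < len(col):       # length of the collection, less-than
        item = col[i]         # i'th member of the collection
        print(item)
        shown.append(item)
        i += 1                # increment
    return shown


def show_each(col):
    """For-each loop: needs only the loop and a variable declaration."""
    shown = []
    for item in col:
        print(item)
        shown.append(item)
    return shown


pets = ["cat", "dog", "fish"]
first = show_indexed(pets)
second = show_each(pets)
```

Both functions do the same work; the second simply asks the reader to hold fewer linked ideas in mind at once.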

I look forward to seeing how Robins’ hypothesis is developed and tested over time. I also look forward to seeing how much we can ensure a clear and loosely coupled conceptual model under Grace.

Object Independence Day

We hold these truths to be self-evident, that all objects are created equal, that they are endowed by their Creator with certain unalienable Rights, that among these are sole access to their internal representation

An object-oriented language should provide encapsulation: an object should be able to protect its representation from unwanted external access. Programming languages provide a wide range of different encapsulation mechanisms: statically and dynamically checked; encapsulating names or objects; encapsulating within objects, or within classes, or within modules…

Grace will have a module system to support separate development and compilation: to manage coupling between different development units. The design is yet to be finalised (Ok, yet to be started) – but we are thinking along similar lines to Newspeak, gBeta, J&t or Ceylon in that it will be based on classes nested within objects.

But this large-scale encapsulation, within compilation units, doesn’t really address the key independence of objects in object-orientation: that each object should be independent, having sole access to its own internal representation. Now many object-oriented languages don’t in fact support this either. C++, C#, and Java have private and protected modifiers, but these restrict access within classes, not objects: an instance of one class may access private or protected fields of any other instance of the same class (and, for protected, of any subclasses as well).
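Class-level rather than object-level privacy is easy to demonstrate. The sketch below uses Python’s double-underscore name mangling as a stand-in for the private modifier; it is an analogue, not Java or Grace, and the `Account` class with its `richer_than` method is a hypothetical example.

```python
class Account:
    def __init__(self, balance):
        self.__balance = balance  # stored under the mangled name _Account__balance

    def richer_than(self, other):
        # Code inside the class can read the "private" field of a *different*
        # instance, because the privacy is per class, not per object.
        return self.__balance > other.__balance


a = Account(10)
b = Account(5)
print(a.richer_than(b))

# Outside the class the unmangled name is not visible.
try:
    b.__balance
    leaked = True
except AttributeError:
    leaked = False
print(leaked)
```

Under object-level encapsulation, the read of `other.__balance` inside `richer_than` would be rejected as well.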

For this reason, like Smalltalk, Ruby, and other languages, Grace will provide object-level encapsulation. Encapsulated names – methods and fields – will only be accessible from within the same object — i.e. by method requests on self. Again like Smalltalk and Ruby, encapsulation in Grace can depend on types (or classes) but not on static types or classes: we don’t want programmers to e.g. remove static types from their code just to get around encapsulation!

But there are a number of different options even within this design:

  • private modifier – marks encapsulated attributes; no modifier means no encapsulation
  • public or shared modifier – marks unencapsulated attributes; no modifier means no access except through self
  • textual rules for encapsulation. In Go for example, names beginning with a Capital letter are public, names beginning with a lower-case letter are private. This convention is used throughout .Net programs – but so far we’ve been using a methods-as-lowercase convention, common in Java, Smalltalk, etc. This kind of implicit coding doesn’t work for code in non-Western alphabets that don’t distinguish between upper/lower case
  • “reverse” textual rules – lower case is public, Upper Case is Private. At least this would work better with the Java convention – but that seems to be all it has going for it.
  • sigils – use a non-alphabetic character to start all (encapsulated) identifiers. For example, any private const, var, or method must begin with an underscore (self._myStuff) — identifiers without underscores are public

These options differ in only a few ways. Should the default be accessible (easier for novices to get started) or encapsulated (easier to learn good habits)? Encoding encapsulation into the names (as in the last three options) makes their relationship with inheritance clearer, but changing visibility means changing names. On the other hand, modifiers on attributes decouple encapsulation status from the text of the name, but require modifier-consistency rules across inheritance.
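The three name-based rules are simple enough to sketch as predicates. The Python below is an illustration only: the function names are hypothetical, and the case tests use Python’s `str.isupper` and `str.islower`, which approximate rather than reproduce any particular language’s rule.

```python
def public_go_style(name):
    """Textual rule: a leading capital letter means public."""
    return name[:1].isupper()


def public_reverse_style(name):
    """Reverse textual rule: a leading lower-case letter means public."""
    return name[:1].islower()


def public_sigil_style(name):
    """Sigil rule: a leading underscore means encapsulated."""
    return not name.startswith("_")


for name in ["Size", "size", "_size", "猫"]:
    print(name,
          public_go_style(name),
          public_reverse_style(name),
          public_sigil_style(name))
```

The last name shows the problem with case-based rules: a character from a script without upper and lower case is neither, so both textual rules treat it as encapsulated with no way to say otherwise, while the sigil rule still works.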

We’re in the process of working our way through these options (or at least thinking about them every so often). We hope to have a decision by the time of the next Grace workshop (30 July, in Lancaster, after ECOOP).

If anyone has any random opinions (or even considered thoughts) on this, we’d love to hear them.