Regular readers of this blog will be aware that we are currently developing a new language named Sapphire.
Sapphire is not a replacement for Ruby, nor is it a ‘fork’ of Ruby. As a company we remain dedicated to supporting Ruby now and in the future. Sapphire is a new, highly encapsulated language which will have a Ruby-like syntax but which, in many quite fundamental respects, will be significantly different from Ruby. Our implementation of Sapphire will run on the DLR (Dynamic Language Runtime) for .NET and it will incorporate some ‘special’ capabilities which we’ll blog about at a later date.
To give you a broad flavour of Sapphire, try to imagine a language that is as thoroughly object oriented as Smalltalk and as modular as Modula-2 but with a ‘light’ easily-readable syntax similar to Ruby. As we’ve discovered, achieving any two out of these three goals is relatively straightforward. But satisfying all three is tricky.
For example: achieving strict encapsulation/modularity is no problem. But doing that without significantly increasing the verbosity of the syntax is a challenge. Similarly, having a Smalltalk-like version of OOP with a Ruby-like syntax is straightforward. But making such a language as modular as Modula-2 is more difficult. We are still haggling over the details of how we’ll achieve all these aims. I’ll have more to say on this when we have completed a first draft of the formal syntax of Sapphire.
The Problem Of Modules
There is more than syntax to worry about, however. There are also fundamental features of the language. To take one example: modules.
I have never been completely satisfied with the implementation of modules in Ruby. In principle, they offer a ‘safe’ alternative to multiple inheritance. In theory, including (‘mixing in’) modules in classes provides the good bits of multiple inheritance (reusability of code from more than one other class) without the bad bits (‘cross-over’ dependencies in ancestor classes).
Let me explain what I mean by ‘cross-over’ dependencies. Imagine that a class D inherits from both class B and class C. This is all simple enough unless both B and C inherit from a common ancestor, class A. This is called the ‘diamond problem’. If you aren’t familiar with it, Wikipedia has a useful explanation.
Multiple inheritance: the ‘diamond problem’ - D inherits from B and from C, each of which inherits from A (and so on). Pretty soon things get very complicated!
This is particularly troublesome when a new D object is created: given the fact that this object has two parents and that it must call the constructor of each, which constructor should be called first? Since each of B and C’s constructors must itself call the constructor of A, who knows what side effects could result from calling the constructors in different orders (since B and C might modify inherited data or behaviour in unpredictable ways)? In a large hierarchy with many crossed dependencies spreading back through a complex network of ancestors, this causes both a problem of implementation (how can we resolve all the potential ambiguities?) and a problem of understanding (even if we do resolve them, how on earth is the programmer going to keep a clear ‘picture’ of that network of dependencies in his or her head?)
There are all kinds of ways of ‘getting around’ the Diamond Problem. The trouble is that none of the solutions is particularly easy to understand or deal with simply and reliably. Fundamentally, we are unconvinced that there are any major programming problems to which multiple inheritance is the best solution. Moreover, a design goal of Sapphire is that the language should be clear, unambiguous and as free as possible of unintentional side-effects.
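To make the ordering issue concrete, here is a minimal sketch in present-day Ruby. Ruby has no multiple inheritance of classes, so I am using modules to stand in for A, B and C; the setup method and the D2 class are hypothetical names invented for this illustration. Ruby’s own way of ‘getting around’ the diamond is to flatten the ancestors into a single chain, and the sketch shows that the order in which B and C run depends entirely on the order of the include statements.

```ruby
# Hypothetical modules standing in for the A, B, C and D of the diagram.
module A
  def setup(log)
    log << "A"
  end
end

module B
  include A
  def setup(log)
    super          # run the next setup in the ancestor chain first
    log << "B"
  end
end

module C
  include A
  def setup(log)
    super
    log << "C"
  end
end

class D
  include B
  include C        # included last, so it is searched first

  def setup(log = [])
    super
    log << "D"
  end
end

# Same shape as D, but with the two includes swapped.
class D2
  include C
  include B

  def setup(log = [])
    super
    log << "D"
  end
end

p D.ancestors.take(4)   # the flattened chain Ruby builds for D
p D.new.setup           # order in which the setup methods ran
p D2.new.setup          # same code, different include order
```

A appears only once in each chain, which avoids the duplicated ancestor, but whether B runs before C is decided by a detail the reader has to go looking for. That is exactly the kind of thing we would like Sapphire to avoid.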
The Limitations of Mixins
Ruby’s modules and ‘mixins’ provide one solution to the problem of code re-use from multiple classes. But it is a solution with notable limitations. Put simply, while Ruby modules are, in principle, ‘just like’ regular classes they are, in fact, quite unlike regular classes. In particular, regular Ruby classes form a part of the inheritance tree. Having created class C, you are quite at liberty to create class D as its descendant. You can’t do that with modules. Modules stand alone in splendid isolation - if they are classes, they are very much ‘second class’ classes.
Mixins: Ruby provides single inheritance for ‘regular’ classes and the ability to ‘mix in’ modules such as M but not to inherit from them.
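The restriction is easy to see in Ruby itself. The following is a small sketch; the greet method and the inherit_from helper are hypothetical names used only for this illustration. A class can descend from C and mix in M, but any attempt to use M as a superclass is rejected with a TypeError.

```ruby
module M
  def greet
    "hello from M"
  end
end

class C; end

# Single inheritance from a class plus a mixed-in module: both are allowed.
class D < C
  include M
end

# Hypothetical helper: try to create an anonymous class with the given parent.
def inherit_from(parent)
  Class.new(parent)
  :ok
rescue TypeError
  :type_error      # Ruby refuses anything that is not a Class
end

p D.new.greet       # behaviour obtained by inclusion
p inherit_from(C)   # a class is a valid superclass
p inherit_from(M)   # a module is not
```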
In my view, this has led to modules being widely used just like ordinary ‘code libraries’ in procedural languages. Often they are used as repositories for lots of ‘free-standing’ functions that become available for re-use by inclusion. This means that the programmer is obliged to decide at a very early stage in the development process whether a certain class should provide behaviour by descent (in which case it is defined as a class) or by inclusion (in which case it is defined as a module). That simple decision fixes the behaviour of that class/module forever thereafter.
Consider an example. Let’s assume you want to create a TimeCalculator class which will contain all kinds of clever code to calculate leap years, the date of the Chinese New Year, the number of days since Elvis Presley died and so on. You now have to decide whether other classes will ‘use’ this functionality through inclusion or whether they will ‘extend’ the functionality through inheritance. Let’s suppose you decide to go for inclusion. This means you define TimeCalculator to be a module. Now whenever some class (say class BankBalance or class Horoscope) needs to do some calculations involving time, it just ‘mixes in’ TimeCalculator.
But later on you need to write a program to calculate times between notable historical events. You decide that the best way to do this is to create special-purpose TimeCalculator classes - specifically a JulianTimeCalculator, a GregorianTimeCalculator and a TaichuTimeCalculator. Logically, these classes should be descendants of a common ancestor, TimeCalculator: they should inherit and extend its existing features. But that is an option that is not available to you since TimeCalculator is a module and you can’t inherit from modules.
Oh well, OK, so you can just go ahead and include TimeCalculator, I guess. Well yes, so you can. But the more you include rather than inherit, the less you are using the essential features of Object Orientation. If you make a habit of this, it could be argued that you might as well be programming in a procedural language such as Pascal!
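Here is roughly what that looks like in Ruby. This is only a sketch: the leap_year? and days_between methods are hypothetical stand-ins for the ‘clever code’ I described, and the classes are stripped to the bare minimum.

```ruby
require "date"

# TimeCalculator was defined as a module, so it can only ever be mixed in.
module TimeCalculator
  # Gregorian rule: every 4th year, except centuries not divisible by 400.
  def leap_year?(year)
    (year % 4).zero? && (!(year % 100).zero? || (year % 400).zero?)
  end

  # Whole days from one Date to another.
  def days_between(from, to)
    (to - from).to_i
  end
end

class BankBalance
  include TimeCalculator
end

# We would like to write `class JulianTimeCalculator < TimeCalculator`,
# but all Ruby lets us do is include the module and override what differs.
class JulianTimeCalculator
  include TimeCalculator

  # Julian rule: every 4th year, with no century exception.
  def leap_year?(year)
    (year % 4).zero?
  end
end

p BankBalance.new.leap_year?(1900)
p JulianTimeCalculator.new.leap_year?(1900)
p JulianTimeCalculator.superclass   # not TimeCalculator, just Object
p BankBalance.new.days_between(Date.new(2000, 1, 1), Date.new(2000, 3, 1))
```

It works, but JulianTimeCalculator is not a descendant of TimeCalculator in the class hierarchy; its superclass is plain old Object.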
Inheriting Modules?
But wait a minute. Why shouldn’t you be able to inherit from a module? The diamond problem, remember, applies to crossed-links in ancestor classes, not in descendants. So, while it is perfectly reasonable to insist that a class that’s defined to be a module should have no ancestors, that does not mean that it should have no descendants. Let’s try out a new definition of a module then:
A module is a class with no ancestors. A module may be included in other classes.
But maybe you are thinking - yes, well, that’s all very well until you have descendants of modules. If M2 and M3 both descend from Module M1, we are back with the old diamond problem just as soon as M2 and M3 are included in a class. Not so. You see, by definition a module is “a class with no ancestors”. Therefore, any class which descends from a module cannot itself be a module. It is just a regular class whose line of descent happens to go back to a module.
‘Module Classes’: Sapphire proposes ‘module classes’ (here CM), which are regular classes from which other classes may be derived but which have no ancestors and can be mixed in.
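Since the Sapphire syntax is not yet fixed, I can’t show real Sapphire code for this. What follows is only a rough approximation of the idea in present-day Ruby, under my own assumptions: the names CMFeatures, UsesCM, describe and try_mixing_in are all hypothetical, and the pairing of a module with a class of the same purpose is a workaround, not the way Sapphire will do it. The point it illustrates is the rule above: CM’s features can be both inherited and mixed in, but M2 and M3, being descendants, are ordinary classes and cannot themselves be mixed in, so no diamond can form.

```ruby
# Plain Ruby, not Sapphire syntax. CMFeatures plus CM together approximate
# a single 'module class' that can be inherited from AND mixed in.
module CMFeatures
  def describe
    "features defined once in CM"
  end
end

class CM
  include CMFeatures
end

# Descendants of CM are just regular classes.
class M2 < CM; end
class M3 < CM; end

# Some unrelated class obtains the same features by inclusion.
class UsesCM
  include CMFeatures
end

# Hypothetical helper: try to mix something into a fresh anonymous class.
def try_mixing_in(candidate)
  Class.new { include candidate }
  :mixed_in
rescue TypeError
  :rejected        # Ruby only allows modules to be included
end

p M2.new.describe            # obtained by descent
p UsesCM.new.describe        # obtained by inclusion
p try_mixing_in(CMFeatures)  # the module part can be mixed in
p try_mixing_in(M2)          # a descendant class cannot
```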
We are still playing with this idea - it has not yet been ‘baked into’ the Sapphire language. If you have any opinions, please let us know! In fact, modules are not the only types of ‘special’ classes in Ruby (and some other OOP languages). Another type of extremely ‘special’ class is called a ‘metaclass’. I’ll discuss metaclasses in another blog entry soon...