The Sapphire Language - the problem of modules
Regular readers of this blog will be aware that we are currently developing a new language named Sapphire.
Sapphire is not a replacement for Ruby, nor is it a ‘fork’ of Ruby. As a company we remain dedicated to supporting Ruby now and in the future. Sapphire is a new, highly encapsulated, language which will have a Ruby-like syntax but which, in many quite fundamental respects, will be significantly different from Ruby. Our implementation of Sapphire will run on the DLR (Dynamic Language Runtime) for .NET and it will incorporate some ‘special’ capabilities which we’ll blog about at a later date.
To give you a broad flavour of Sapphire, try to imagine a language that is as thoroughly object oriented as Smalltalk and as modular as Modula-2 but with a ‘light’ easily-readable syntax similar to Ruby. As we’ve discovered, achieving any two out of these three goals is relatively straightforward. But satisfying all three is tricky.
For example: achieving strict encapsulation/modularity is no problem. But doing that without significantly increasing the verbosity of the syntax is a challenge. Similarly, having a Smalltalk-like version of OOP with a Ruby-like syntax is straightforward. But making such a language as modular as Modula-2 is more difficult. We are still haggling over the details of how we’ll achieve all these aims. I’ll have more to say on this when we have completed a first draft of the formal syntax of Sapphire.
The Problem Of Modules
There is more than syntax to worry about, however. There are also fundamental features of the language. To take one example: modules.
I have never been completely satisfied with the implementation of modules in Ruby. In principle, they offer a ‘safe’ alternative to multiple inheritance. When modules are included in (‘mixed into’) classes, they theoretically provide the good bits of multiple inheritance (reusability of code from more than one other class) without the bad bits (‘cross-over’ dependencies in ancestor classes).
Let me explain what I mean by ‘cross-over’ dependencies. Imagine that a class D inherits from both class B and class C. This is all simple enough unless both B and C inherit from a common ancestor, class A. This is called the ‘diamond problem’. (If you aren’t familiar with it, Wikipedia has a useful explanation.)
This is particularly troublesome when a new D object is created: given the fact that this object has two parents and that it must call the constructor of each, which constructor should be called first? Since each of B and C’s constructors must itself call the constructor of A, who knows what side effects could result due to calling the constructors in different orders (since B and C might modify inherited data or behaviour in unpredictable ways)? In a large hierarchy with many crossed dependencies spreading back through a complex network of ancestors, this causes both a problem of implementation (how can we resolve all the potential ambiguities?) as well as a problem of understanding (even if we do resolve them, how on earth is the programmer going to keep a clear ‘picture’ of that network of dependencies in his or her head?)
There are all kinds of ways of ‘getting around’ the Diamond Problem. The trouble is that none of the solutions is particularly easy to understand or deal with simply and reliably. Fundamentally, we are unconvinced that there are any major programming problems to which multiple inheritance is the best solution. Moreover, a design goal of Sapphire is that the language should be clear, unambiguous and as free as possible of unintentional side-effects.
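For comparison, here is a minimal sketch of how Ruby itself handles a diamond-shaped set of mixins. The names A, B, C and D simply mirror the classes described above, and the trail method is a hypothetical helper added only to make the lookup order visible. Ruby flattens the diamond into a single chain, so the shared ancestor A appears, and is called, only once.

```ruby
# A is the shared ancestor; B and C both mix it in.
module A
  def trail
    ["A"]
  end
end

module B
  include A
  def trail
    ["B"] + super   # continue along the receiver's ancestor chain
  end
end

module C
  include A
  def trail
    ["C"] + super
  end
end

class D
  include B
  include C         # the later include is searched first
  def trail
    ["D"] + super
  end
end

p D.ancestors.take(4)  # => [D, C, B, A]
p D.new.trail          # => ["D", "C", "B", "A"]
```

The order is predictable, but the programmer still has to know the rule (later includes are searched first) to read it correctly.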
The Limitations of Mixins
Ruby’s modules and ‘mixins’ provide one solution to the problem of code re-use from multiple classes. But it is a solution with notable limitations. Put simply, while Ruby modules are, in principle, ‘just like’ regular classes they are, in fact, quite unlike regular classes. In particular, regular Ruby classes form a part of the inheritance tree. Having created class C, you are quite at liberty to create class D as its descendant. You can’t do that with modules. Modules stand alone in splendid isolation - if they are classes, they are very much ‘second class’ classes.
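A short sketch of that restriction in Ruby follows. Greeter, Base and Child are hypothetical names used only for illustration. Class.new(superclass) is used here as the run-time equivalent of writing "class X < Greeter", so that the resulting error can be rescued and printed.

```ruby
module Greeter            # hypothetical module, for illustration only
  def greet
    "hello"
  end
end

class Base; end
class Child < Base; end   # subclassing a class is always allowed

begin
  # Attempt to use a module as a superclass
  Class.new(Greeter)
rescue TypeError => e
  puts "cannot inherit from a module (#{e.class})"
end
```

Ruby rejects the attempt with a TypeError; the exact wording of the message varies between Ruby versions.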
In my view, this has led to modules being widely used just like ordinary ‘code libraries’ in procedural languages. Often they are used as repositories for lots of ‘free-standing’ functions that become available for re-use by inclusion. This means that the programmer is obliged to decide at a very early stage in the development process whether a certain class should provide behaviour by descent (in which case it is defined as a class) or by inclusion (in which case it is defined as a module). That simple decision fixes the behaviour of that class/module forever thereafter.
Consider an example. Let’s assume you want to create a TimeCalculator class which will contain all kinds of clever code to calculate leap years, the date of the Chinese New Year, the number of days since Elvis Presley died and so on. You now have to decide whether other classes will ‘use’ this functionality through inclusion or whether they will ‘extend’ the functionality through inheritance. Let’s suppose you decide to go for inclusion. This means you define TimeCalculator to be a module. Now whenever some class (say class BankBalance or class Horoscope) needs to do some calculations involving time, it just ‘mixes in’ TimeCalculator.
But later on you need to write a program to calculate times between notable historical events. You decide that the best way to do this is to create special-purpose TimeCalculator classes - specifically a JulianTimeCalculator, a GregorianTimeCalculator and a TaichuTimeCalculator. Logically, these classes should be descendants of a common ancestor, TimeCalculator: they should inherit and extend its existing features. But that is an option that is not available to you since TimeCalculator is a module and you can’t inherit from modules.
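As a rough sketch, the inclusion approach might look like this in Ruby. The method names leap_year? and days_between are assumptions made for illustration; only the names TimeCalculator, BankBalance and Horoscope come from the example above.

```ruby
require "date"

# TimeCalculator defined as a module so that it can be mixed in.
module TimeCalculator
  # Hypothetical helper: true if the year is a Gregorian leap year
  def leap_year?(year)
    Date.gregorian_leap?(year)
  end

  # Hypothetical helper: whole days between two Date objects
  def days_between(from, to)
    (to - from).to_i
  end
end

class BankBalance
  include TimeCalculator
end

class Horoscope
  include TimeCalculator
end

p BankBalance.new.leap_year?(2024)  # => true
p Horoscope.new.days_between(Date.new(2024, 1, 1), Date.new(2024, 3, 1))  # => 60
```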
Oh well, OK, so you can just go ahead and include TimeCalculator, I guess. Well yes, so you can. But the more you include rather than inherit, the less you are using the essential features of Object Orientation. If you make a habit of this, it could be argued that you might as well be programming in a procedural language such as Pascal!
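Here is a minimal sketch of that workaround, assuming a TimeCalculator module with two hypothetical methods, leap_year? and describe. The specialised calculators include the module rather than inherit from it, and override what they need.

```ruby
require "date"

module TimeCalculator
  # Hypothetical default: Gregorian leap-year rule
  def leap_year?(year)
    Date.gregorian_leap?(year)
  end

  # Hypothetical helper that relies on leap_year?
  def describe(year)
    "#{year}: #{leap_year?(year) ? 'leap' : 'common'} year"
  end
end

class GregorianTimeCalculator
  include TimeCalculator
end

class JulianTimeCalculator
  include TimeCalculator

  # Override the rule: every fourth year is a leap year
  def leap_year?(year)
    Date.julian_leap?(year)
  end

  # Extend the included behaviour by calling it through super
  def describe(year)
    "Julian " + super
  end
end

puts GregorianTimeCalculator.new.describe(1900)  # => 1900: common year
puts JulianTimeCalculator.new.describe(1900)     # => Julian 1900: leap year
```

It works, but the relationship between the calculators is expressed by inclusion rather than by descent from a common ancestor class.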
But wait a minute. Why shouldn’t you be able to inherit from a module? The diamond problem, remember, applies to crossed-links in ancestor classes, not in descendants. So, while it is perfectly reasonable to insist that a class that’s defined to be a module should have no ancestors, that does not mean that it should have no descendants. Let’s try out a new definition of a module then:
A module is a class with no ancestors. A module may be included in other classes.
But maybe you are thinking - yes, well, that’s all very well until you have descendants of modules. If M2 and M3 both descend from Module M1, we are back with the old diamond problem just as soon as M2 and M3 are included in a class. Not so. You see, by definition a module is “a class with no ancestors”. Therefore, any class which descends from a module cannot itself be a module. It is just a regular class whose line of descent happens to go back to a module.
We are still playing with this idea - it has not yet been ‘baked into’ the Sapphire language. If you have any opinions, please let us know! In fact, modules are not the only types of ‘special’ classes in Ruby (and some other OOP languages). Another type of extremely ‘special’ class is called a ‘metaclass’. I’ll discuss metaclasses in another blog entry soon...
Why would you want that? New languages are a good thing. Some don’t make it mainstream and some do: a sort of selection of the fittest. But how else would the foundation of our industry evolve? Lots of people have new and great ideas, and in our field, programming languages are a major result of those ideas. Learning new languages is a great way to learn new ways to think about programming too. I think that the Sapphire project has some intriguing goals, and I will definitely be keeping my eye on it.
Thanks for the comments.
We have some very specific goals in the development of Sapphire (only some of which we have thus far spoken about). Naturally, if we don’t meet those goals or if we provide features which people don’t need, Sapphire will quite rightly fall into the dusty corner which history reserves for failed programming languages.
Suffice to say, we believe that Sapphire will offer a number of features which should preserve it from that fate. But that’s just my opinion.
To put this in perspective, I should perhaps explain that the development of Sapphire is a pretty natural progression for us due to the fact that we have already created a sophisticated parser, lexer and interpreter (the IntelliSense ‘inference engine’) for Ruby (in Ruby In Steel) and we are also supporting IronRuby for the DLR. So we already have a solid base upon which to construct a new language. If you look back to blog entries made right back in the early days of SapphireSteel Software you’ll find that we were already discussing a future ‘Sapphire project’. At that time there was no DLR and .NET was not the ideal platform on which to implement what we then had in mind for Sapphire. The release of the DLR really opened up new possibilities to us. The focus of Sapphire is very different from that of Ruby (safety, lack of ambiguity and a rigorous version of encapsulation are among its primary goals) - and, as you might imagine, we also want it to have a very nice programming environment.
I’ll explain a bit more about the design strategy of Sapphire (in particular, its approach to Object Orientation and encapsulation/modularity) in forthcoming blogs. We’ll release more specific details in the months ahead - for example, a first draft grammar and formal specification. Later on I’ll explain some of the goals which we haven’t yet discussed - though I don’t plan to say too much about those features for a little while yet ;-)
But the more you include rather than inherit, the less you are using the essential features of Object Orientation
Maybe I am missing something, but I don’t quite see what Ruby is missing in this regard. You CAN do module inheritance (as you said), so is the issue you have the syntax? For example:
module M1; end
module M2 ; include M1 ; end
puts M2.ancestors # => [M2, M1]
What would be different? If you could do:
module M2 < M1; end
I would expect the exact same ancestry. The end goal seems to be met; the differences seem to be semantic. In this instance, you make "include vs inherit" a dichotomy when it isn’t. As paraphrased by Matz, you use include to achieve inheritance (including multiple inheritance).
But wait a minute. Why shouldn’t you be able to inherit from a module?
Again, I think this is the same issue. You CAN inherit from a module:
class SomeClass ; include M2 ; end
puts SomeClass.ancestors # => [SomeClass, M2, M1, Object, Kernel] (followed by BasicObject on Ruby 1.9 and later)
Again, the desired result is met.
A module is a class with no ancestors.
If this were the definition of modules, I think your points would be correct. But in Ruby this isn’t really the case, and frankly I don’t see that as a bad thing.
But then, I think I disagree with the premise to begin with. There are always features of a language that can be used incorrectly. Multiple inheritance is one that is often cited. Yes, one can concoct a scenario to demonstrate improper use (Diamond Pattern), but I don’t think that highlights a shortcoming of the language; the shortcoming is with the developer. Ruby’s inheritance rules, I think, are pretty easy to understand. And a test (or an IRB session) to check "ClassName.ancestors" provides a really simple mechanism for assuring an expected ancestry.
Though, I do have my gripes with some aspects of Ruby’s ancestry. Assuming the M1 and M2 modules from above, I would expect this:
class SomeClass ; include M2 ; include M1 ; end
... to have the ancestry of [SomeClass, M1, M2, Object] because I explicitly declared that I wanted M1 included last. I would get that ancestry if M2 did not include M1. My intuition would say that my local include should take precedence, but that isn’t the case. But again, this is a fairly contrived problem, and I have no qualms with being expected to understand the outcome and produce code appropriately.
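The behaviour described above can be checked with a short script. This is only a sketch: M1, M2 and SomeClass are the names used in the comment, while N1, N2 and OtherClass are hypothetical counterparts added to show the case where the second module does not include the first.

```ruby
module M1; end
module M2; include M1; end

class SomeClass
  include M2
  include M1   # no effect on the order: M1 is already an ancestor via M2
end

# Hypothetical counterparts: N2 does NOT include N1
module N1; end
module N2; end

class OtherClass
  include N2
  include N1   # N1 is new to the chain, so it is searched first
end

p SomeClass.ancestors.take(3)   # => [SomeClass, M2, M1]
p OtherClass.ancestors.take(3)  # => [OtherClass, N1, N2]
```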