A small discussion about ‘encapsulation’ broke out a few months back in response to an article I wrote in my series, ‘Ruby The Smalltalk Way’. In the course of that article, I discussed the principle of encapsulation. I gave this description of encapsulation, taken from the ‘Smalltalk/V Tutorial’:
“Related data and program pieces are encapsulated within a Smalltalk object, a communicating black box. The black box can send and receive certain messages. Message passing is the only means of importing data for local manipulation within the black box.”
I pointed out that encapsulation is broken when a programmer assigns a variable (let’s call it x) and passes it to a method (e.g. someOb.someMethod( x )) inside which the value of x is modified and the programmer then uses this modified value - e.g.
x = 10
someOb.someMethod( x ) # <= let’s suppose this method changes x to 20
y = x * 2 #<= Now y is 40!
My assertion that this broke encapsulation turned out to be somewhat controversial. For example, in a thread on reddit, one writer commented: “Modifying objects passed into a method by calling methods on them does not break encapsulation, it’s the heart of what OO is about, creating side effects.”
The statement that ‘creating side effects’ is at the heart of Object Orientation surprised me (to put it mildly). However, the same writer later went on to expand upon his views, commenting: “That a programmer may implement a method outside of an object that modifies its state says nothing about the language or its ability to enforce encapsulation. The visitor pattern violates encapsulation in exactly this manner on purpose as its primary goal.” You can read the rest of this thread here: http://www.reddit.com/info/6d2p0/comments/
The mention of the ‘Visitor pattern’ in the comment above sent me scurrying away for my copy of the well-known ‘programming pattern’ reference, “Design Patterns: Elements of Reusable Object-Oriented Software” by Gamma, Helm, Johnson and Vlissides. First I looked up their definition of ‘encapsulation’ in the glossary. Here it is:
“encapsulation: The result of hiding a representation and implementation in an object. The representation is not visible and cannot be accessed directly from outside the object. Operations are the only way to access and modify an object’s representation.”
Yes, well, that seems a pretty good definition to me. So now let’s see what they have to say about the ‘Visitor pattern’. This is what I find on page 337:
“Breaking encapsulation: ... the pattern often forces you to provide public operations that access an element’s internal state, which may compromise its encapsulation.”
OK, so far so good. We have a workable definition of encapsulation. We also agree that the visitor pattern violates this. But I still disagree (profoundly) with the statement: “Modifying objects passed into a method by calling methods on them does not break encapsulation.”
Note: Some of the quotations in this article are taken from BYTE magazine, August 1981.
Others are taken from a number of classic Smalltalk books which are now available for download from:
Stéphane Ducasse’s Free Online Books
Implementation-Hiding
In my view, encapsulation necessarily means that the internal representation of an object (both its data and its methods) is hidden from the world outside. In the August 1981 ‘Smalltalk special’ issue of BYTE magazine (which I bought when it first appeared in 1981 and which still sits, tatty and much thumbed, here on the shelf next to my desk), Dan Ingalls says this:
“No component in a complex system should depend on the internal details of any other component.”
(The Dan Ingalls article is available online.)
That simple, succinct sentence tells you everything you need to know about encapsulation.
What this implies is that it should be possible to rewrite the implementation of a method substantially without having any effects (what I would call ‘side-effects’) on any code that ‘calls’ that method.
Here are a couple more quotes on the same theme, once again taken from that classic issue of BYTE:
“A message must be sent to an object to find out anything about it... This is needed because we don’t want the form of an object’s inside known outside of it.”
(‘Object-Oriented Software Systems’, David Robson, Xerox PARC, BYTE Magazine, August 1981)
“Modularity: No component in a complex system should depend on the internal details of any other component....
The message-sending metaphor provides modularity by decoupling the intent of a message (embodied in its name) from the method used by the recipient to carry out the intent. Structural information is similarly protected because all access to the internal state of an object is through this same message interface.”
(‘Design Principles Behind Smalltalk’, Daniel H. H. Ingalls, BYTE Magazine, August 1981)
In other words, the key, the central idea of what we now call ‘encapsulation’ is not merely data-hiding, but implementation-hiding. You don’t need simply to hide information (variables) from the world beyond the object - you also want to hide behaviour (methods). If the implementation details of a method have any effect of any sort on code outside of that object, then encapsulation is broken.
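As a minimal sketch of implementation-hiding in Ruby, here are two classes that answer the same messages but keep their state quite differently. The class names TallyA and TallyB, the messages add and total, and the helper tally_report are all hypothetical, invented purely for illustration. Code that only sends messages should not be able to tell the two apart:

```ruby
# Hypothetical classes for illustration only.
# TallyA keeps a running total in a single number.
class TallyA
  def initialize
    @sum = 0
  end

  def add( n )
    @sum += n
    self
  end

  def total
    @sum
  end
end

# TallyB keeps every value and adds them up on demand.
class TallyB
  def initialize
    @values = []
  end

  def add( n )
    @values << n
    self
  end

  def total
    @values.inject( 0 ) { |acc, v| acc + v }
  end
end

# The calling code depends only on the messages 'add' and 'total'.
def tally_report( tally )
  tally.add( 1 ).add( 2 ).add( 3 )
  tally.total
end

p tally_report( TallyA.new ) #<= 6
p tally_report( TallyB.new ) #<= 6
```

Swapping one class for the other changes nothing for the caller, which is the property the quotations above describe.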
Simple Ways To Break Encapsulation
Here are a few examples of ways in which you can easily break encapsulation (that is, you can make external code dependent on the internal implementation details of objects) in Ruby:
1. Modifying the value of an argument inside a method breaks encapsulation
class C
  def aMethod( aVar )
    aVar << "hello"
    return aVar.reverse
  end
end
ob1 = C.new
mystring = ["world"]
In the above, the author of class C appends “hello” to the argument, aVar, and reverses the array when returning it. So, when a C object is used as the author intended, this is the result:
p ob1.aMethod( mystring ) #<= ["hello", "world"]
But, since the argument, aVar, is modified inside the aMethod() method, it is quite possible for a programmer to ‘hang onto’ the ingoing variable and use its modified value instead of the returned value, giving this result:
p mystring #<= ["world", "hello"]
In other words, the same method produces different results according to how it is invoked. By ‘hanging onto’ the ingoing values (in the second example above), a programmer’s code becomes implementation dependent. If the author of class C reimplements aMethod, a programmer who uses the explicit return value will see no change (encapsulation seems to be working!) whereas the code of the programmer who uses the value of the ingoing argument will now behave differently (contrary to the intentions of the author of class C).
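To make that concrete, here is a sketch of one possible reimplementation. The class name CRevised is hypothetical, used only so that both versions can sit side by side, and I am assuming that the author of the class decides to build a new array rather than append to the argument. The returned value is identical, but the ingoing array is no longer changed:

```ruby
# Original version, as above: appends to the argument in place.
class C
  def aMethod( aVar )
    aVar << "hello"
    return aVar.reverse
  end
end

# Hypothetical reimplementation: builds a new array instead of
# modifying the argument.
class CRevised
  def aMethod( aVar )
    return ( aVar + ["hello"] ).reverse
  end
end

a = ["world"]
b = ["world"]

p C.new.aMethod( a )        #<= ["hello", "world"]
p CRevised.new.aMethod( b ) #<= ["hello", "world"]
p a                         #<= ["world", "hello"]
p b                         #<= ["world"]
```

Any code that relied on the modified argument would silently change its behaviour after the rewrite, even though the method’s return value is exactly what it was before.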
2. Dynamic Programming Breaks Encapsulation
class C
end
ob1 = C.new
ob1.instance_variable_set(:@a, 100 )
p ob1 #<= #<C:0x2ab0584 @a=100>
In Ruby and many other ‘dynamic’ languages you can create or modify classes and objects at runtime. In many cases, this lets you tinker with the internal details of objects ‘from the outside’. The above example in Ruby code is a case in point - it creates and initializes an instance variable, @a, which is then poked into the object, ob1. The Ruby class documentation describes this method thus: “Sets the instance variable names by symbol to object, thereby frustrating the efforts of the class’s author to attempt to provide proper encapsulation.” Getting dynamic programming and encapsulation to live together in peace poses a very tricky problem!
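The same technique also works on an object whose author has tried to guard its state. The following is only a sketch: the Account class and its deposit and balance methods are hypothetical, made up for this illustration. The class offers no public way to set @balance directly, yet instance_variable_set walks straight past the check in deposit:

```ruby
# Hypothetical class for illustration: no public way to set @balance.
class Account
  def initialize
    @balance = 0
  end

  def deposit( amount )
    raise ArgumentError, "amount must be positive" unless amount > 0
    @balance += amount
  end

  def balance
    @balance
  end
end

acc = Account.new
acc.deposit( 50 )
p acc.balance                            #<= 50

# Bypass the deposit check entirely from outside the object.
acc.instance_variable_set( :@balance, -1000 )
p acc.balance                            #<= -1000
p acc.instance_variable_get( :@balance ) #<= -1000
```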
3. Global Variables Break Encapsulation
$x = 100
class C
  def aMethod
    $x = 200
  end
end
class C2
  def anotherMethod
    return $x * 2
  end
end
ob1 = C.new
ob2 = C2.new
In the above, ob2 and ob1 are objects of different classes. Calling methods of one object should have no effect when calling methods of the other. But it does, thanks to the reference to the global variable, $x. In fact, the results of my code change according to the order in which the methods are called...
# First ordering:
puts ob1.aMethod #<= 200
puts ob2.anotherMethod #<= 400
# Second ordering, starting again with $x = 100:
$x = 100
puts ob2.anotherMethod #<= 200
puts ob1.aMethod #<= 200
That, in effect, ‘exposes’ the implementation details of each class’s methods. If I change the code inside them, the effects will ripple through to unrelated objects!
I won’t labour the point any further. Suffice it to say that in most mainstream languages there are many ways in which an object’s encapsulation (that is, the strict privacy of its internal structure - either its data or code) may be broken. For example, you can do so by extending or modifying base classes (from which other classes are derived) or by passing internal details from one object to another (via, for example, ‘lambda functions’, ‘blocks’ or ‘closures’). I’m not saying that these activities are always or necessarily bad - but, nonetheless, they do have important consequences for encapsulation.
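As a small sketch of the ‘blocks’ case, consider an object that hands its internal array to a caller’s block. The Playlist class and its with_tracks and count methods are hypothetical names chosen for illustration. Because the block receives a reference to the array itself rather than a copy, code outside the object can alter its internal state:

```ruby
# Hypothetical class for illustration.
class Playlist
  def initialize
    @tracks = ["intro", "theme"]
  end

  # Hands the internal array itself to the caller's block.
  def with_tracks
    yield @tracks
  end

  def count
    @tracks.length
  end
end

list = Playlist.new
p list.count #<= 2

# The block receives a reference to @tracks and can change it.
list.with_tracks { |t| t.clear }
p list.count #<= 0
```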
Most modern OOP languages - C++, C#, Delphi, Java et al - don’t pay much attention to the data-hiding part of encapsulation. They generally consider this to be an optional extra, something you can enforce to a greater or lesser degree by using ‘privacy’ keywords, voluntarily adhering to certain coding standards or just documenting your intentions. This may explain why many programmers regard ‘encapsulation’ as a description of the ‘wrapping up’ inside an object of locally scoped variables and the methods to act upon them but do not consider it to imply the hiding of information (data) and implementation details.
Here are a couple more quotes that address the vital area of ‘implementation hiding’:
“Encapsulation is a great bonus from the point of view of the user of an object - they do not need to know anything about the object’s implementation, only what its published protocols are.”
(Smalltalk and Object Orientation: An Introduction - John Hunt)
“The separation between the internal and external views of an object is fundamental to the programming philosophy embodied in Smalltalk. To use an object, it is necessary to understand only its protocol or external view. The fundamental advantage of this approach is that, provided the message protocol or external view is not changed, the internal view may be changed without impacting users of the object.”
(Inside Smalltalk, by Wilf R LaLonde and John R Pugh, 1990)
Since it is clearly the case that ‘encapsulation’ means different things to different people, I can’t help thinking that it might be clearer if I were to use the term ‘modularity’ instead. Indeed, in the early literature of Smalltalk and OOP, ‘modularity’ was the more commonly used term. These days, however, even the word ‘modularity’ is ambiguous. For example, a Modula-2 module bears no resemblance to a Ruby module. Moreover, ‘encapsulation’ has become one of the three great tenets of OOP: inheritance, polymorphism and encapsulation. So, for better or worse, we are stuck with the word!
In the next article in this series, I will explain exactly what I mean by modularity and why I believe it to be so important - and why it will become increasingly important during the coming decade.
Huw Collingbourne is one of the architects of Sapphire - a new OOP language for the DLR which is currently being developed by SapphireSteel Software. One of the fundamental design principles of Sapphire is rigorous encapsulation/modularity.