Encapsulation Is Not Information Hiding

See http://www.javaworld.com/javaworld/jw-05-2001/jw-0518-encapsulation.html

A quick glance through the link shows an introductory article to OOP with an eye-catching title.


See http://www.toa.com/pub/abstraction.txt

Many people, the designers of C++ and Java among them, confuse encapsulation with InformationHiding.

Encapsulation and InformationHiding are general software design principles that predate object-oriented programming. Therefore, any definition of them in terms of objects is inadequate. You can say "to an object-oriented programmer, encapsulation means ..." but that is not a complete definition of encapsulation.

Encapsulation means drawing a boundary around something. It means being able to talk about the inside and the outside of it. Object-oriented languages provide a way to do this. Some provide several ways.

InformationHiding is the idea that a design decision should be hidden from the rest of the system to prevent unintended coupling. Encapsulation is a programming language feature. InformationHiding is a design principle. InformationHiding should inform the way you encapsulate things, but of course it doesn't have to. They aren't the same thing.

If you want true InformationHiding in C++, use the BridgePattern. In Java, use interfaces rather than objects (other than at creation). Of course, method calls through interfaces are more expensive than through a concrete object.
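A minimal Java sketch of the "interfaces rather than objects (other than at creation)" advice above. The names WidthStore, ArrayWidthStore, and InterfaceDemo are hypothetical, invented only for illustration; the point is merely that the concrete class is named once, at creation.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical interface: the only thing clients are allowed to depend on.
interface WidthStore {
    void add(int width);
    int total();
}

// One possible implementation; the choice of ArrayList is the hidden decision.
class ArrayWidthStore implements WidthStore {
    private final List<Integer> widths = new ArrayList<>();

    @Override
    public void add(int width) {
        widths.add(width);
    }

    @Override
    public int total() {
        int sum = 0;
        for (int w : widths) {
            sum += w;
        }
        return sum;
    }
}

class InterfaceDemo {
    // Client code sees only the interface type.
    static int totalOf(WidthStore store) {
        return store.total();
    }

    public static void main(String[] args) {
        WidthStore store = new ArrayWidthStore(); // the only mention of the concrete type
        store.add(3);
        store.add(4);
        System.out.println(totalOf(store)); // prints 7
    }
}
```

Replacing ArrayWidthStore with some other implementation would then touch only the creation line.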


Please provide a reference for these definitions of Encapsulation and Information Hiding. The references I have found indicate that "Encapsulation" and "Information Hiding" are synonymous.

As the article linked to at the top of the page mentions, "information hiding" was introduced by DavidParnas in the early 1970s. His best-known (but I believe not earliest) paper about it is OnDecomposingSystems, where he shows two decompositions of a system for producing a KWIC index. This paper is also referenced by the CodeComplete chapter on modularity. The conclusion of the paper recommends that "one begins [to decompose a system] with a list of difficult design decisions or design decisions that are likely to change. Each module is then designed to hide such a decision from the others." This is contrasted with the "conventional" decomposition, which is driven by functional decomposition.

The two solutions to the KWIC problem are not presented at a high level of detail, but do clearly involve a program built out of "functions or procedures", a module being accessed by the use of some nominated functions that form the interface to the module. The paper is all about improving the use of modular programming.
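To make the "module accessed through nominated functions" idea concrete, here is a rough Java sketch loosely inspired by the line-storage idea in the KWIC example. The interface, its method names, and the storage choice are all assumptions made for illustration; they are not taken from the paper.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical module interface: a few nominated functions, nothing about storage layout.
interface LineStorage {
    void addLine(String line);
    int lineCount();
    int wordCount(int line);
    String word(int line, int index);
}

// The hidden decision: lines are kept as pre-split word arrays.
// Switching to, say, one packed character buffer would not affect callers.
class SplitLineStorage implements LineStorage {
    private final List<String[]> lines = new ArrayList<>();

    public void addLine(String line) { lines.add(line.trim().split("\\s+")); }
    public int lineCount() { return lines.size(); }
    public int wordCount(int line) { return lines.get(line).length; }
    public String word(int line, int index) { return lines.get(line)[index]; }
}

// A client written only against the interface: builds every circular shift of one line.
class CircularShifter {
    static List<String> shiftsOf(LineStorage storage, int line) {
        List<String> shifts = new ArrayList<>();
        int n = storage.wordCount(line);
        for (int start = 0; start < n; start++) {
            StringBuilder sb = new StringBuilder();
            for (int k = 0; k < n; k++) {
                if (k > 0) sb.append(' ');
                sb.append(storage.word(line, (start + k) % n));
            }
            shifts.add(sb.toString());
        }
        return shifts;
    }

    public static void main(String[] args) {
        LineStorage storage = new SplitLineStorage();
        storage.addLine("on decomposing systems");
        for (String shift : shiftsOf(storage, 0)) {
            System.out.println(shift);
        }
    }
}
```

The shifter never learns how lines are stored, so that decision can change without touching it.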

CodeComplete p. 118 clearly states that information hiding is a more fundamental technique than encapsulation. McConnell writes: "In object-oriented design, information hiding gives rise to the notions of encapsulation and abstraction."

But if you read a little further, things become less clear. On page 150, Mr. McConnell writes "This (encapsulation) is the information hiding described in Section 6.2 all over again. You know everything about a module that it wants you to know and nothing else." Within the context, the sentence on page 118 is merely an attempt to introduce the terms "encapsulation" and "abstraction." When he gets around to defining encapsulation, though, he defines it as information hiding.

Which is a mistake, and a shame.


In the glossary of Design Patterns, Gamma et al., encapsulation is defined as "The result of hiding a representation and implementation in an object..."

So such luminaries as Gamma et al. and the designers of C++ (and Java) are wrong!?

Clearly there are two definitions of encapsulation, both valid. It's common for the same word to have different meanings in different communities or contexts. It is not meaningful to say one definition is correct and the other wrong.

JohnWilkinson

Not at all; it simply means that there's more information to be hidden than just the representation of a type. Information hiding, or its opposite, information leakage, may span multiple objects, or it may involve other forms of coupling between client code and implementation code. For example, any call-order dependency will expose some information that normally should be hidden. Another example might be default values that are used by the client code. There are examples of perfectly encapsulated code that still exhibits information hiding problems.
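A small Java sketch of the call-order dependency mentioned above. ReportWriter is a made-up class; every field is private, so it is "perfectly encapsulated" in the language sense, yet clients still have to know a rule about its internals.

```java
// Hypothetical example: all state is private, yet callers must know a hidden rule.
class ReportWriter {
    private StringBuilder buffer; // stays null until open() is called

    public void open() {
        buffer = new StringBuilder();
    }

    public void write(String text) {
        // The leak: callers have to know that open() must come first.
        if (buffer == null) {
            throw new IllegalStateException("open() must be called before write()");
        }
        buffer.append(text);
    }

    public String contents() {
        return buffer == null ? "" : buffer.toString();
    }
}
```

A client that calls write() before open() fails at run time, so the lazily created buffer is effectively part of what the client has to know.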

I agree that information hiding does not equal encapsulation, but, by the definition of encapsulation I prefer, encapsulation does need some degree of information hiding.

You can talk about an object being strongly or weakly encapsulated depending on how much of its implementation is hidden.


By the way, OnDecomposingSystems does not mention 'encapsulation'. Does anyone know where that term/concept was first presented?


Is the confusion of these two concepts a problem? Yes, it is.

Information hiding is a technique for helping developers find good modules; encapsulation is a technique for expressing the boundaries of those modules (and not the only one). The "conventional" (i.e., not information hiding) decomposition in the Parnas paper is as well encapsulated as the information hiding driven one. It just isn't as well designed.

A developer who believes that encapsulation and information hiding are the same thing will likely also believe that, by virtue of exhibiting encapsulation, their program also exhibits information hiding "for free". -- KeithBraithwaite and others

In the same way that a developer who believes that using classes and objects is OOP will likely also believe that, by virtue of using classes, their program also gains the benefits of OOP "for free"? The problem here is incompetent/ignorant developers; clear definitions of the terms won't help. After all, you can write Fortran in any language. -- OliverChung

Is the ignorance and incompetence the disease, or the symptom? Without clear definitions of terms, reflecting the significant distinctions in the field being discussed, how is the incompetent/ignorant developer able to improve or to learn? Such definitions aren't sufficient for that, certainly, but they are necessary. --KB


Could someone provide a short toy example that demonstrates an example of encapsulation without information hiding and/or information hiding without encapsulation, or any other illustrative examples really. I'm having a hard time seeing the distinction without actual code (Java would be fine).

:) An oversimplified example:

 class NoEncapsulationOrInformationHiding {
   public ArrayList widths = new ArrayList();
 }

 class EncapsulationWithoutInformationHiding {
   private ArrayList widths = new ArrayList();
   public ArrayList getWidths() { return widths; }
 }

 class InformationHidingWithoutEncapsulation {
   public List widths = new ArrayList();
 }

 class EncapsulationAndInformationHiding {
   private ArrayList widths = new ArrayList();
   public List getWidths() { return widths; }
 }

Encapsulation allows you to control access to your own innards and to provide specific methods for such access. It does not specifically address leaking implementation details: although you can control access to that variable, you can no longer control the fact that the client knows you're using an ArrayList, even if you later decide to use a LinkedList.

Information Hiding, on the other hand, is hiding the fact that you're using an ArrayList to implement your widths. Although the client could still find out how you're implementing it (InformationHidingIsNotInformationErasing), you're no longer responsible for supporting that implementation.

-- WilliamUnderwood
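To spell out the "later decide to use LinkedList" scenario, here is a hedged sketch. The class Shape, the method addWidth, and the helper widest are names assumed for illustration only; the getter mirrors the List-returning version above.

```java
import java.util.LinkedList;
import java.util.List;

class Shape {
    // Originally an ArrayList; swapped for a LinkedList without touching any client,
    // because clients were only ever promised a List.
    private final List<Integer> widths = new LinkedList<>();

    public void addWidth(int width) {
        widths.add(width);
    }

    public List<Integer> getWidths() {
        return widths;
    }
}

class WidthsDemo {
    // Client code depends on List only, so it compiles against either implementation.
    static int widest(Shape shape) {
        int max = Integer.MIN_VALUE;
        for (int w : shape.getWidths()) {
            max = Math.max(max, w);
        }
        return max;
    }

    public static void main(String[] args) {
        Shape shape = new Shape();
        shape.addWidth(2);
        shape.addWidth(9);
        shape.addWidth(5);
        System.out.println(widest(shape)); // prints 9
    }
}
```

Had getWidths() been declared to return ArrayList, the same swap would have changed the public signature and could have broken callers.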

Another trivial example (without code) is the StrategyPattern. Introducing it appropriately can increase encapsulation, but there is no InformationHiding involved. -- SiduPonnappa

Good examples. I'm trying to come up with a short characterization of encapsulation and information hiding (for my purposes, with regard to Java). Something like this: Information hiding is when you keep users of the class from knowing too much about the innards of the class, as when you use interfaces; it helps ReduceCoupling. Encapsulation is when you keep users from messing around with the innards of the class, as when you make a field private and its accessor methods public; it helps IncreaseCohesion.

I'm pretty sure I adequately understand information hiding. It's the encapsulation I'm having trouble putting into words. Perhaps it is because encapsulation always (almost always?) uses at least some form of information hiding. For instance, making a field private and providing a getter for it still hides the fact that you do have a field (as opposed to a calculated property such as Square.getArea()). Also, I'm not sure that the claim that encapsulation helps increase cohesion does justice to the concept of cohesion.
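A tiny sketch of the Square.getArea() point: from the outside, a getter backed by a field and a getter that calculates its value look the same. The field name side and the constructor are assumptions for illustration.

```java
class Square {
    private final double side;

    Square(double side) {
        this.side = side;
    }

    // Backed directly by a field.
    public double getSide() {
        return side;
    }

    // Calculated on every call; there is no area field at all.
    public double getArea() {
        return side * side;
    }
}
```

A caller cannot tell which of the two getters reads a field, so even plain accessors hide that much.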

Does not the class EncapsulationWithoutInformationHiding actually hide some information? It can be altered considerably without breaking clients, e.g.:

 class EncapsulationWithoutInformationHiding {
   public ArrayList getWidths() {
     return global_function_get_widths();
   }
 }


Encapsulation seems to be a combination of one or more of:

Information Hiding, on the other hand, is


The truth is out there ;). My impression is that what makes parts of this page confusing is a mixing of two different levels, that of design and modeling principles, and that of OO language mechanics.

First, let's look at the topic from the lofty realm of design and modeling principles:

In a design, we are looking for ways to control complexity, and for ways to control change impact.

In order to control complexity, we compartmentalize the complexity by grouping related things together, and we reduce the number of possible connections between these groups by providing an interface for each group which only allows specific kinds of connections. These two are aspects of encapsulation.

In order to control change impact, we first analyze the system for the embedded design decisions which are most likely to change and would have the worst impact on the system if they do. Then we strive to protect the other parts of the system from change in such a design decision by making the other parts of the system oblivious of the decision. These are aspects of information hiding.

But the principle of encapsulation in and of itself is not complete, because it doesn't answer the question which things are to be considered "related", and which kinds of inter-group connections to allow. The application of the information hiding principle provides these answers: group things together which are related to a design decision to be hidden, and disallow such connections from other groups which would expose the decision.

Likewise, the principle of information hiding is dependent on encapsulation, as the "grouping" aspect of encapsulation provides the parts of the system that hide information from each other, and the "access control" aspect of encapsulation provides the means by which they do it.

In conclusion, encapsulation and information hiding are separate but inter-dependent. Insert mandatory yin-yang reference here ;).

These terms are used very differently when somebody is thinking about the level of language mechanics. Here, when people say "encapsulation", they usually mean data encapsulation, namely, making fields private and providing accessors / mutators as appropriate. Also, from this technical point of view, you could say that information hiding happens any time that there's an implementation detail in one part of a system that is not visible to other parts of the system.

In this sense, data encapsulation actually hides quite a few bits of information "for free": Does the data come from a field or is it fetched or calculated? If so, is it pre-calculated or lazily calculated? If it comes from a field, what's the field's exact type? etc. The problem is that the hidden implementation details are probably not the most important design decisions to hide, and you can easily violate both principles while still having the "mechanics" of data encapsulation correct.

I so hope all this makes sense... ;) -- FalkBruegmann
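One way to read the last point, sketched in Java: the "mechanics" (private field, public getter) are in place, yet clients can still reach in and change the internals. Playlist and its methods are hypothetical names chosen for this sketch.

```java
import java.util.ArrayList;
import java.util.List;

class Playlist {
    private final List<String> tracks = new ArrayList<>();

    public void addTrack(String title) {
        tracks.add(title);
    }

    // Mechanically correct data encapsulation, but the live internal list escapes.
    public List<String> getTracks() {
        return tracks;
    }

    public int size() {
        return tracks.size();
    }
}

class PlaylistDemo {
    public static void main(String[] args) {
        Playlist playlist = new Playlist();
        playlist.addTrack("first");
        playlist.addTrack("second");

        // The client empties the playlist without calling any Playlist mutator.
        playlist.getTracks().clear();
        System.out.println(playlist.size()); // prints 0
    }
}
```

The private modifier draws a boundary, but the getter hands the representation straight across it.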

Category MuchAdoAboutNothing?


How do WilliamUnderwood's examples of information hiding actually hide information? As far as I can tell, any object in the system can discover the widths. Consequently, they can still add and manipulate the widths without the original object knowing. I feel the following is a better example:

 class NoEncapsulationOrInformationHiding {
   public static final int STATUS_ACTIVE = 0;
   public static final int STATUS_HALTED = 1;
   public int status = STATUS_ACTIVE;
 }

 class EncapsulationWithoutInformationHiding {
   public static final int STATUS_ACTIVE = 0;
   public static final int STATUS_HALTED = 1;
   private int status = STATUS_ACTIVE;
   public int getStatus() { return status; }
   public void setStatus( int status ) { this.status = status; } // called by TightlyCoupled below
 }

 class EncapsulationAndInformationHiding {
   private static final int STATUS_ACTIVE = 0;
   private static final int STATUS_HALTED = 1;
   private int status = STATUS_ACTIVE;
   private int getStatus() { return status; }
   public boolean isActive() { return getStatus() == STATUS_ACTIVE; }
 }

In the last example class, the information about the state machine is not exposed to the rest of the system. I've seen developers stop at EncapsulationWithoutInformationHiding, which yields tightly coupled implementations:

 class TightlyCoupled {
   public void process( EncapsulationWithoutInformationHiding ewif ) {
     if( ewif.getStatus() == EncapsulationWithoutInformationHiding.STATUS_ACTIVE )
       ewif.setStatus( EncapsulationWithoutInformationHiding.STATUS_HALTED );
   }
 }

The TightlyCoupled class has no business knowing about any information within the "ewif" instance.

An approach not as tightly coupled would look as follows:

 class LooselyCoupled {
   public void process( EncapsulationAndInformationHiding eaih ) {
     eaih.process();
   }
 }

In this fashion, the "eaih" instance knows if it is able to process itself, thereby making the transition from its current status (e.g., STATUS_ACTIVE) to the new status (e.g., STATUS_HALTED). If the transition is not possible, an exception can be thrown.
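The process() method is described but not shown, so here is one possible sketch of it. The transition rule (active to halted) and the choice of IllegalStateException are assumptions based on the description, not the contributor's actual code.

```java
class EncapsulationAndInformationHiding {
    private static final int STATUS_ACTIVE = 0;
    private static final int STATUS_HALTED = 1;
    private int status = STATUS_ACTIVE;

    public boolean isActive() {
        return status == STATUS_ACTIVE;
    }

    // Assumed behavior: an active instance halts itself; anything else is an error.
    public void process() {
        if (status != STATUS_ACTIVE) {
            throw new IllegalStateException("cannot process: instance is not active");
        }
        status = STATUS_HALTED;
    }
}

class LooselyCoupled {
    // Knows nothing about status codes; it only asks the object to process itself.
    public void process(EncapsulationAndInformationHiding eaih) {
        eaih.process();
    }
}
```

If the state machine later gains a third status, only EncapsulationAndInformationHiding changes.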

Lastly, this implies that InformationHidingWithoutEncapsulation is not possible.


Don't worry about it. The distinctions aren't that important. Write good code. Enjoy life. -- StevenShaw

See Also: GateKeeper, ShieldPattern, CouplingAndCohesion


It does not matter what the precise or pedantic meaning of the word "encapsulation" is; what matters is how everybody uses that word and how everybody understands it. The great majority of authoritative authors use "encapsulation" and "information hiding" synonymously; once many people use a word in some way, it gets that meaning!

Here are some quotes from authoritative sources:

Design Patterns - Elements of Reusable Object-Oriented Software by Erich Gamma, Richard Helm, Ralph Johnson and John Vlissides, Glossary: "Encapsulation - The result of hiding a representation and implementation in an object. The representation is not visible and cannot be accessed directly from outside the object. Operations are the only way to access and modify an object's representation."

Design Patterns - Elements of Reusable Object-Oriented Software by Erich Gamma, Richard Helm, Ralph Johnson and John Vlissides, p. 19 (this quote shows that object is not encapsulated by definition): "Because inheritance exposes a subclass to details of its parent's implementation, it's often said that 'inheritance breaks encapsulation'."

Object-Oriented Analysis and Design with Applications by Grady Booch et al., p. 51: "Encapsulation is most often achieved through information hiding (not just data hiding), which is the process of hiding all the secrets of an object that do not contribute to its essential characteristics; typically, the structure of an object is hidden, as well as the implementation of its methods. "No part of a complex system should depend on the internal details of any other part" [50]. Whereas abstraction "helps people to think about what they are doing," encapsulation "allows program changes to be reliably made with limited effort" [51]."

Effective Java™ Second Edition by Joshua Bloch, p. 67: "The single most important factor that distinguishes a well-designed module from a poorly designed one is the degree to which the module hides its internal data and other implementation details from other modules. A well-designed module hides all of its implementation details, cleanly separating its API from its implementation. Modules then communicate only through their APIs and are oblivious to each others' inner workings. This concept, known as information hiding or encapsulation, is one of the fundamental tenets of software design."

Code Complete Second Edition by Steve McConnell, p. 567 (this book has a lot of information about encapsulation; this quote is chosen because it uses both notions synonymously): "Encapsulation (information hiding) is probably the strongest tool you have to make your program intellectually manageable and to minimize ripple effects of code changes. Anytime you see one class that knows more about another class than it should–including derived classes knowing too much about their parents–err on the side of stronger encapsulation rather than weaker."

Wikipedia, http://en.wikipedia.org/wiki/Encapsulation_(computer_science) "In computer science, encapsulation is the hiding of the internal mechanisms and data structures of a software component behind a defined interface, in such a way that users of the component (other pieces of software) only need to know what the component does, and cannot make themselves dependent on the details of how it does it."

I think it is clear that "encapsulation" and "information hiding" are synonymous, at least most people use them that way.

See http://electrotek.wordpress.com/2009/04/29/encapsulation-and-information-hiding/


To me it is absolutely clear that "encapsulation" is far from being a synonym for "information hiding".

First I have to address all these "it has been used synonymously in a book" arguments. Simply put, they don't count. Sometimes notions are used in a particular book, in a special context, not according to their original meaning. Why? There is no adequate reason; it just happens sometimes. Sometimes notions are used in a way that makes them seem synonymous (which is actually not the case). It would take ages to prove this statement with enough examples and convincing argumentation, so I leave it out.

The second issue is about examples. Encapsulation is something you can easily explain with a few lines of code. Information hiding must be implemented in every line of your code, but it only starts to have an effect once the code reaches several thousand lines; the crucial points of information hiding become visible in code of several tens of thousands of lines, and it pays off only if you have to make changes that hit many parts of your system while you want to keep most of the functionality untouched. This alone is sufficient proof that encapsulation is not synonymous with information hiding (assuming the previous sentence is correct).

One more word about examples. If you want to explain woodcutting to someone who has never seen a tree, you can use a knife to scratch the trunk around until the tree falls, which is formally a perfect demonstration of the main idea. However, the person would never get the correct understanding of the matter. In fact, once this person sees the woodcutting industry at work, a new notion for woodcutting will be born.

The examples provided by WilliamUnderwood and by an unnamed contributor certainly show me that they both understand the concept of information hiding (even though the unnamed contributor is obviously wrong in saying "Lastly, this implies that InformationHidingWithoutEncapsulation is not possible"). Nevertheless, these examples cannot help people grasp the concept and its power. Let me provide some code as well, which satisfies information hiding just as much and will be just as useless to people who haven't grasped the concept yet.

 class InformationHidingWithoutEncapsulation {
   public STATUS m_status;
   public boolean isActive() {
     return m_status == STATUS_ACTIVE;
   }
   public void activate() {
     m_status = STATUS_ACTIVE;
   }
   public void halt() {
     m_status = STATUS_HALTED;
   }
   public void switchToStatus(STATUS aStatus) {
     m_status = aStatus;
   }
 }

At least this example shows that information hiding is possible even without encapsulation. If you see this public section of the class, you have no idea how STATUS is implemented. You can ask every object whether it is active or not, you can activate and halt every object, and you can even put one object into the same STATUS as some other object without asking anyone which STATUS they are actually in. But do you know what STATUS is? Is it some native type or a class? And the payoff is, of course, that the implementation of STATUS can easily be changed at any time without touching InformationHidingWithoutEncapsulation. Of course, "easily" is a bit tricky, considering all the overloading of = and potential copy constructors or static calls.

Note, you can mess up the objects of this class because the encapsulation is not used.
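For readers who want something that compiles in Java, here is one possible filling-in of the example. Making STATUS an enum is purely an assumption (the whole point above is that callers need not know), and StatusDemo.synchronize is a hypothetical helper.

```java
// Assumption: STATUS happens to be an enum here; it could equally be a class or an int wrapper.
enum STATUS { ACTIVE, HALTED }

class InformationHidingWithoutEncapsulation {
    public STATUS m_status; // public, so nothing stops outside code from assigning to it

    public boolean isActive() { return m_status == STATUS.ACTIVE; }
    public void activate() { m_status = STATUS.ACTIVE; }
    public void halt() { m_status = STATUS.HALTED; }
    public void switchToStatus(STATUS aStatus) { m_status = aStatus; }
}

class StatusDemo {
    // Puts 'to' into the same status as 'from' without ever inspecting what that status is.
    static void synchronize(InformationHidingWithoutEncapsulation from,
                            InformationHidingWithoutEncapsulation to) {
        to.switchToStatus(from.m_status);
    }

    public static void main(String[] args) {
        InformationHidingWithoutEncapsulation a = new InformationHidingWithoutEncapsulation();
        InformationHidingWithoutEncapsulation b = new InformationHidingWithoutEncapsulation();
        a.activate();
        synchronize(a, b);
        System.out.println(b.isActive()); // prints true

        // Messing up the object: the missing encapsulation allows a nonsensical state.
        b.m_status = null;
        System.out.println(b.isActive()); // prints false
    }
}
```

Client code that sticks to isActive(), activate(), halt(), and switchToStatus() would not need to change if STATUS became something other than an enum.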

On a last note, I want to tell about my own experience. I learned to program in a conservative way and gathered some experience by solving tasks and implementing some little things for myself. With three years of experience, I read a book about OOP and immediately tried to practice the concepts I had learned. I invested some dozens of hours in implementing a kind of text editor applying OOP (or so I thought). I then analyzed my own code, recalling the ideas of OOP, and came to the clear conclusion that I had failed to apply OOP across the board. It was frustrating on the one hand and exciting on the other: I was obviously able to recognize my own mistakes. Since then I have produced a lot of code that applied OOP (when I had enough freedom to do so), and a lot more that was far from applying OOP (mostly because I had to implement non-OOP interfaces or to depend on something else that was not OOP).

However, after a few years I recognized that encapsulation is not enough to keep big systems maintainable and extendable. I searched desperately for an OOP concept that would help me decouple different parts of the system from each other, but there is none. No wonder: OOP covers the low-level issues (low-level in terms of system design). I also learned about design patterns and tried to use them, but it was all in vain. After a certain time I was able to develop "self-made" techniques to cover the high-level issues, which I called interfaces-in-the-broad-sense or modularization-in-the-broad-sense.

Several weeks ago I started to read Code Complete and learned the notion "information hiding", which perfectly covered my "self-made" techniques and clarified some issues. I also learned some interesting (and probably useful) design heuristics.

Why did I tell this story? Because I think one needs to gather some knowledge and experience before being able to move on. Can you teach a man from the forest to control the space shuttle? Of course, if he is mentally and physically capable of it. Would you start by teaching him space shuttle control from the very beginning? Definitely not. First he will need to learn what a button and a lever are and how to "control" them. Some playing time with any console (be it a Game Boy or a modern mobile phone) would probably be good experience as well. Which steps you go through and how many of them there are is not the point. The point is that you will go for a step-by-step education. Compare this to the usual education from school to university.

So, basically, you need to learn a lot about programming and design and gather some experience before you can get the idea of information hiding and be able to distinguish it from plain encapsulation.

-- SergejPauls (2012.07.21)

You are finding a very confused part of the field, where terms have been bandied about somewhat recklessly without settling very well. You might see if ConfusedComputerScience resonates.


See also: EncapsulationDefinition, InformationHiding


CategoryCoding


Last edited November 27, 2014