Gate Keeper

Hark, who goes there! What's in your sack?

The Gate-Keeper Argument for OO (and perhaps AbstractDataTypes)

Please, no TSA jokes.

The above title says "GateKeeper". Does anyone here believe that the safety of any set of data is better if no single entity is responsible for the correctness of that data?

Is it reasonable to think that data stored in an RDBMS and accessed from multiple places in an OOP language can ensure the correctness of that data? If data is stored in a database, then 100% of the safety of that data, must reside with the database and not be done in a OOP. Except for code in triggers, stored procedures should be used to "Gate Keep" the data in a database. If you have a "policy" that treats the database as just a "data store" and gives control to a "data layer" in the OOP, you will probably have a hard time enforcing that because the database can talk to all manner of sources. (Politics between DBA's, programmers, developers and system administrators not withstanding.)

Who should be in charge of data that isn't stored in a database? If the data isn't correct, then what code needs changing? That is where "objects" and encapsulation come in. An object should always be responsible for the data that is entrusted to it. To me, this is the most important reason to program in OOP languages. If you have to program in a "non OOP language", then create and enforce your own version of encapsulation.

No single responsibility for data means (at least some of the time) nobody is responsible.

-- DavidClarkd

One of the ideas with OOP is that data is operated on by methods. And that there are only certain methods/behaviors that should be applied to a given set of data. This is a justification for containing a given "data set", and the methods that will operate on it, in a single object. This affords the data, protection from being operated upon inappropriately. Only intended operations are allowed - the encapsulating object is the data's protector, and gate keeper relative to those trying to access the data.

Non-OOP developers might retort that this encourages duplication. These developers might think that OOP would require creating different objects, and duplicating data (heaven forbid) when one needs to operate on the data in a manner different from that of its original encapsulating object. They might argue that "free-standing" data allows more flexibility, because it can be operated upon in any context. But this is precisely what OOP is intended to prevent.

I don't think OO has a monopoly on such. Given specific specifications (that are not paradigm-specific), I could probably duplicate them in non-OO systems and/or databases.

That is, in a macrocosmic sense (different applications, etc.), there may be many contexts in which it may be appropriate to operate on a set of data, but in a microcosmic sense (the data's perspective, if you will), the allowable or appropriate set of operations should still be limited to those allowed by the containing object.

However, this does not require different objects with duplicated sets of data, for different macrocosmic contexts. The OOP approach would be to declare a new class containing an instance of the object that contained the data upon which to operate. The new class would represent the new macrocosmic context, and the contained instance the microcosmic context. This way, the data is used in a different context, without risking its integrity.

I am not quite sure what you mean here. Perhaps an example would help. How can a wrapper class get more access to the guts than the original? Further, if you have a bunch of classes that offer different views of the same state/data, then you are reinventing procedural/relational techniques and queries in a way, it seems to me.

This might be the crux of the problem a SeparationOfDataAndCode developer might have with OOP. -- BryanHoover

Um, GatedCommunityPattern might be one way of describing this. It's not really a pattern, I don't think, but it is a conceptual package. -- MartySchrader

One problem with this philosophy is how to share data with multiple languages and paradigms? Often a database is the owner and not any single class. You can designate that a single class is to access a given DB entity, but OO by itself does not guarantee it any more than functions do.

Also, who has access to what may not be hierarchical. I find ACL's (AccessControlList) more general-purpose than a bunch of keywords like "public, private, friend, final, etc."

Further, this approach ends up duplicating for each interface many features found in databases. IOW, each class may end up reinventing Insert, Delete, Find, Sort, Save, Cross-reference (join), etc. Using a database (effectively) tends to factor these out of the application, or at least out of interfaces, in my observation. Factoring interfaces is often just as important as factoring implementation (Based on OnceAndOnlyOnce, see ReinventingTheDatabaseInApplication)

InterfaceFactoring

Databases also have triggers, referential integrity, and stored procedures to enforce data rules.

If you use triggers and stored procedures, on all you're tables to enforce data integrity, you've just reimplemented OO in the database, but in a far weaker language, SQL.

That depends on how you define OO. I can never get a consensus definition to really verify such claims. As far as "weaker language", triggers and so forth do not need to be implemented in just SQL. (Comparing SQL to say Java is apples to oranges, thus "weaker" is misleading.) Ideally, any language with the proper hooks should be available. Venders seem to be widening the options of late. I'd argue it is more powerful than OOP when sets are a better way to manage access than trees, which is often the case for more complex access management, per below. -- top

Also, who has access to what may not be hierarchical. I find ACL's (AccessControlList) more general-purpose than a bunch of keywords like "public, private, friend, final, etc."

I think this is interesting... see CapabilitySecurityModel. Is this the difference? Objects are capabilities, and if you can't get a hold of the object you need to do something, then by definition you're not allowed to do what that object would allow you to do. And it's provable, in the sense that there is no way to go from this object you have here to this other object I have, unless I give you some reference to it. -- WilliamUnderwood

The bottom line is that if access is forced to be through a given "object" regardless of the language of the application that wants access, then that object/class must take on aspects of a database (unless it is trivial). This then gets into issues of whether a relational database is better than an OODBMS, and we are then back to standard HolyWars.

I'm not quite convinced.

There's nothing stopping one from implementing a relational api through an object. However, given only a relational api and AccessControlList based security, you cannot get back to a CapabilitySecurityModel. This is not to say relational theory is bunk, it is tremendously useful. The problem is that ACL's are packaged up with the relational theory as if they belong together, giving ACL's credibility they don't deserve, and at the same time shoehorning CSM's out of a service that they perform extremely well. -- WilliamUnderwood

This is getting into the complex area of security models.

Which of course, is the purpose of GateKeeper.

Out-of-the box OO provides bare-bones security at best. You are talking about specific security paradigms or methodologies, which perhaps may not be directly tied to paradigms (OO, procedural, relational, functional, etc.)

Typical pure OO gives you (1) non-spoofable object references, (2) defined standard methods of passing object references, and (3) actions which can be performed on the basis of having a valid object reference. This is the essence of a capability system.

non-spoofable object references

: You can't directly access the memory which represents an object, and therefore can't "make up" a reference to an object which you don't already have access to.

defined standard methods of passing object references

: You can share references with other entities by passing them a reference to it. The manners in which a reference can be transmitted done by protected calls to ensure that the previous item doesn't need to be weakened unnecessarily.

actions which can be performed on the basis of having a valid object reference

: Hence, you get access errors when you try to perform an activity with no authority. The object reference is the authority by which you perform an action, and by definition, you can only perform actions which that authority allows you.

Now, in reality, any product out of the box won't provide good usable security... there's always non-trivial configuration issues. If your system allows any user to access a complete list of object references, then that's a serious hole - but then that it also allow one the unrestrained ability to shoot yourself in the foot by violating encapsulation - but not because of an inherent problem with OO.

Now, to be honest, I don't really consider OO to be a specific paradigm in the same nature as procedural or functional, nor do I see relational as competitive to procedural or functional. Thinking about it, there has been OO systems written in functional styles (CommonLispObjectSystem, SmalltalkLanguage kindof) and procedural styles (JavaLanguage, CeePlusPlus), and likewise for relational systems; if I'm not mistaken, relational algebra is to relational calculus as procedural is to functional, and as imperative is to declarative.

[work in progress... writing this on break]

And please tear this to pieces... I don't get that many opportunities for useful criticism in the real-world.

-- WilliamUnderwood

I would like to explore a specific example. In my opinion, CapabilitySecurityModel more or less requires a GodLanguage approach to systems design and is less conducive to a mixed-tool environment. -t

(Moved from OoLacksConsistencyDiscussion)

Re: Databases are like a bunch of global variables. Encapsulation reins that in.

We had the discussion before and in the end it turned out to be all mental conventions. OO did not forcefully "lock" stuff away safely any more than any other paradigm. Getting data from a database is not that much different than getting it from classes with regard to "encapsulation". You have to go and ask something. Whether that something is a database or a bunch classes does not really change the "global variable" picture. Classes are just as "global" as databases from the app's perspective. The difference is that I don't have to reinvent a database in app code because Oracle or MySql did that for me. Plus, relational rules supply a consistency to data handling that OO lacks. -- top

There are two primary myths about databases with regard to encapsulation:

"Naked" (Classes provide more protection than databases) The truth is that triggers, referential integrity, etc. can do just about anything classes can with regard to enforcing data integrity. (Although specific vendor implementations may have limits.)
"Global" - Database info is no more global than data through class interfaces. Databases are interfaces to data. It is simply a choice between using a class front-end or a database front-end to the data. OO does not inherently limit where the front end is available within an app. Thus, it is just as "global". See DatabaseNoMoreGlobalThanClass?.

Encapsulation of classes is not about protection, it's about making things easy-to-use, e.g. not have to worry about whether some other piece of the structure needs to be updated simultaneously when you want to change the value of one field of an object. And it's about differentiating between interface and internal representation, such that it's possible to change the internal representation without change to the interface.

About global data. Objects do not necessarily provide any access to the data they store. Any globality that is relevant is not about the interface to an object, it's about the internals. With databases, you don't want the database file in the disk to be directly accessible either (you could break transactional consistency if you did allow that!). So I believe the globality debate is about apples and oranges.

Widest Method Access Principle

If a class offers 3 methods to a developer, for example, then the minimum access of the set is the maximum access of any of those 3. If methods A and B only read info, but method C deletes, then any limits offered by A and B are moot, other than by volunteer convention, which functions give us also. Anyone who has access to the class can then delete. If all access to state is truly only through a given class, as pure encapsulation would dictate, then somewhere there are probably going to be methods that can alter just about anything. Sure, one can rig up pass-words and so forth, but that is extra stuff beyond the OO paradigm. -- top

Why would methods have no security? Where does it say that "security" or "passwords" can't be used in OOP languages? What do passwords have to do with the "relational model" but all RDBMS implement security using passwords and tokens. Your argument is a simple "straw man"! -- DavidClarkd

Access provided by a class is not important. Access provided by individual interfaces is more important, since those are what are seen by users. Users do not know the "most-derived class" of the object. So access that a class provides for the code that just created an object is not important, since that will not be the access provided to users of the class anyway. But access control as whole is not important. Applicability of the class is much more important, since reusability is the goal of OO methodology.

This does not seem to be the primary claim of OOP these days.

Objects are not intended to be exposed out of the thread that created it, so protection provided by access control is not considered in OO methods, and "volunteer convention" is sufficient.

But real-world state often has to last beyond a given thread. For example, if you book a flight, you don't want that reservation to disappear because the thread that created it ended. The object is just a proxy to a larger "state". Putting limits on the proxy does not necessarily limit the state that remains beyond the proxy's job. For example, I may have an armored guard (proxy) transfer my money to the bank. However, if other proxies can later enter the bank and take out my money, then it does not matter how careful the original proxy was.

Private data is essential to OO methods, but it does not attempt to provide security. Encapsulation is provided only to keep other coders from *accidentally* relying on implementation details. All OO code should have many users, since it's intended to be reusable.

OOP gets messy when one is trying to control access to variations on a theme, such as "sub-classes", when those variations are complex. If the variations fit nicely into a "sub-type" hierarchy, then OOP does fairly well. However, if the variations don't fit nicely into a hierarchy (often described using some form of set-theory instead), then using OOP gets quite messy, and one has to invent layers and "manager" objects etc. MultipleInheritance is not a full solution because it's often not an improvement over set-theory-based solutions, and arguably worse. -t

So you've alleged for years. I suppose what you describe might be an issue if you're mistakenly forcing OOP into domain modelling -- say, by clinging to early-1990s notions of OO when general-purpose OO databases were briefly touted as a good idea -- but it's hardly an issue for computational modelling (i.e., modern application development) where it generally works adequately. Whilst I'd be the last to regard OOP as a universal panacea, difficulties with "sub-classes" dwarf in comparison to others like over-reliance on identity and object IDs, an encouragement toward error-prone stateful programming, and an unfortunate tendency to treat n-adic operators' first parameter differently from the others.

What you describe generally doesn't contradict my statement if one follows the conditions I laid out. Thus, we are not really in disagreement. Computational modelling tends to have cleaner classification systems than domains for whatever reason. I suspect it's largely because engineers try to develop "clean" categories; whereas those who control domain categories value many other social and political issues above "clean design". (I question the reliance on hierarchies for GUI components, but the hierarchical approach is "good enough" so far. Related: GodGuiWidget) As far as other alleged difficulties with OOP, I'm dealing with variations-on-a-theme at this spot. I'll leave the other issues to other topics or sections and won't comment on their relative problem levels here. Related: OopNotForDomainModeling.

See Also: ImplementationHidingDiscussion?, OoEmpiricalEvidence, OoIsPragmatic, BenefitsOfOo, StaticAssert