Location Transparency

[ComponentDesignPatterns | CategoryPattern]

Context

Where an Artifact is located when it is a MultiUserArtifact? beomes an issue of Reachability when for whatever reason it is relocated, renamed, reformatted, restricted, or removed

Problem

The result of an application, query, search or attempted use results in an error or malfunction when the given locale of a resource is not a reachable address

Forces

Solution

Resulting Context

Intent

Decouple the physical location of a component from the knowledge of its existence.

Motivation

When you are building a distributed, component-based system, one of the key questions that must be answered is "Where do things go?", which is closely followed by its corollary, "How do I know where things are?"

The simplest way to build a distributed system is to hard-code the knowledge of the location of the components into the components themselves. For instance, if a Stock Trading Service needs to get the latest prices from a Price Quoting Service, then hard-code that reference into the Stock Trading Service. However, this has some significant drawbacks.

If a requirement of the system was that it should be highly available, then it is very likely that each Service would have a backup service running in parallel to it. If the Price Quoting Service went down, then the Stock Trading Service should be able to switch to the backup. However, this adds complexity to the system. What if you wanted two backups? Each additional measure of safety adds more coding complexity into the Stock Trading Service, and more highly couples it to its existing environment.

Also, consider the problem of a Servlet engine that needs to be able to respond to HTTP requests from many clients. If each client could only communicate with a single server, than you would quickly reach a saturation point where the server would be overwhelmed. Obviously, the solution involves distribution of the requests, but how is that achieved?

The solution to both of these problems is to make the location of each distributed component somehow separate from the knowledge of its existence. If some sort of "broker" object can translate general requests for an object's reference into specific object references at runtime, then the system can be made both more redundant, and more decoupled.

Consequences

Less coupling between components. If a client only knows an abstract "handle" to a server, and knows that the server implements a particular interface, then the client is protected from implementation changes in the server.
SinglePointOfFailure -- In naive implementations, the "broker" component that exchanges abstract handles for actual referencs becomes a single point of failure. The entire system can fail if that component is unavailable. However, there are many ways to remove this limitation, including using multiple, replicated brokers that use synchronisation protocols to keep in synch, fail-over to standby broker servers, multicast protocols to find services, etc.

Known Uses

TransparentDistribution is the driving force behind two standard CORBA Services; the CORBA Naming Service, which allows objects to be identified by name independently of the machine the run on, and the CORBA Trader Service, which matches requestors of services with providers of services.

The IBM WebSphere Performance Pack allows users of IBM's WebSphere Application Server to gain performance by distributing the handling of HTTP service requests across multiple Application Servers. It does this by using TransparentDistribution in that the clients have only a generic handle (URL) to a service name -- the performance pack decouples that from a physical machine id and handles communication of session information between the boxes.

TransparentDistribution also happens in (D)COM via the registry database. The registry generally replaces the INI files we used to store and retrieve settings back in the old days. Under HKEY_CLASSES_ROOT, you see a bunch of programmatic IDs (Prog IDs) and one key that contains the class IDs (HKEY_CLASSES_ROOT\CLSIDS). This allows the creator of a COM component to use the ProgID or CLSID without knowing its location.

Related Patterns

Most uses of this pattern are complemented by the application of ProcessBoundary. Without careful attention to the spacial considerations, components initially assumed to be domestically located in a single computing context (e.g., process or address space) could be moved to a foreign location (e.g., on a machine over a potentially slow network). This can cause unexpected performance problems, and an increases the complexity of relability.

See Also: MultiCaster (of course)

BradAppleton: I think there is a bit more to the consequences than the above "lets on." By trying to achieve transparency in the face of possible failure along any link in the network "chain", it almost defeats itself in some respects. Failure may occur in more places than just the server. I dont think its really a SinglePointOfFailure. The failure could happen at any of the many communication layers on both the client-side as well as on the server-side.

This causes lots of extra "exception handling" code in OOPLs like Java. Whereas before you would just code it up, now you have to be aware of the fact that distribution might be happening. You might not know where or how (or even if its taking place, you might turn out to be on the same host as the server after all). So you do gain location transparency, and you also gain a uniform way of treating component service requests, regardless of where they carom about before reaching the server ("nothing but net" ;-).

But you dont quite gain transparency of distribution. You may be shielded from the details of how and where things are distributed, but now every morsel of client-code needs to be aware that the potential for distribution exists and has to add a lot of distracting exception handling code, even when the operation may be locally serviced.

In Java, this makes for very cluttered code-reading, and almost defeats one of the main purposes of exceptions; namely to allow you to handle primary cases first in a more logical flow, and defer the exception handling stuff to the end where its less distracting to the flow of the code. But using RMI and CORBA/COM, Java code tends to use exception handlers after just about each and every potentially distributed call. The handlers look no less cluttered than the DefensiveProgramming styles used in C (and C++ before it had exceptions). Its a lot harder to defer much of the exception handling code till later in the method. Which makes it harder to achieve things like SimplyUnderstoodCode (IMHO of course ;-). -- BradAppleton

Brad, you wrote this comment almost a year ago, but it is still very valid. As the language has evolved, I have noticed that this pattern is better named "LocationTransparency" or "LocationAbstraction?" because really what you're after is not being explicitly dependent upon a certain physical location for a component. It makes stuff lots harder. OTOH, so does total transparency with regards to distribution.

When distribution affects the behavior of code, because now you must do a lot more validation and exception handling as you mention above. Not only is reliability much more of an issue when dealing with foreign space, so are security and integration. Still, I think the present theme here is focused on location issues, and so I will factor in your comments as part of the tention builder section of the pattern (i.e., "hard-coding access to component binaries sucks, but so does too much transparency -- witness java code doing CORBA invocations..."). -- PhilipEskelin

Do I mis-understand what "SinglePointOfFailure" means? --EricHerman

no, you got it. if there is only one directory where proxy code or other helper objects get the specifics of component location based on a name or handle that was sent to them, then it is a SinglePointOfFailure if there is no failover mechanism or redundancy of services available as an alternative.

On the web this is already being done using DNS. "www.mycompany.com" is purely a logical name, not an actual machine (or machines). But are we finished? No, because it's still possible to defeat load balancing flexibility by using the same name for too many things. For maximum flexibility, you'd have to give each service its own domain name. There's a tradeoff between maximum flexibility in locating services and names that are easy to remember.

-- BrianSlesinsky

However, DNS lookups can resolve to addresses based on the client's geographic location, so the same name can refer to hosting sites in different parts of the world. IP masquerading can then be used to give that address to multiple machines. Large web services are implemented this way, using large server farms in hosting facilities, such as TeleHouse.

Yet another application of one of the fundamental patterns of software engineering: every problem can be solved by adding AnotherLevelOfIndirection? :-)

A terminology used in COM to describe the way a ComComponent can be configured so that a client has no (and wants no) knowledge of where the server is actually being run in regards to apartment, process, machine (with DistributedCom), or (in ComPlus) context.

The term LocationTransparency also comes from the RM-ODP (ReferenceModelForOpenDistributedProcessing) standard. It is one of the 9 transparencies identified by the RM-ODP as being desirable in a distributed system. (Whether they are all desirable or not is debatable). You can see AnoteOnDistributedComputing for some cautions about LocationTransparency and the way it's used.

I don't know much about COM and the other things mentioned above but the term LocationTransparency reminds me a lot of the term "Generica" as in the UnitedStatesOfGenerica. -- PatNotz

CategoryDiscovery