Xml Acceptor Pattern

The XmlAcceptorPattern involves combining logical IDL and XML. The pattern is best described with an example. The IDL consists of one method:

  int Accept([in] String xml_in, [out] String xml_out);

This method follows the request-response paradigm popularized by the HyperTextTransferProtocol. Clients post XML to the server (aka "acceptor") and get XML back. The server parses the xml_in string and routes it to the appropriate handler depending on what type of document it is. For example, the document may update an employee's vacation time. The acceptor responds with another XML document in xml_out -- this document contains status information, including any error messages.
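
The routing step described above can be sketched in Python. This is only an illustration, assuming the document type is identified by the root element name. The element names (updateVacation, status), the handler, and the numeric status codes are hypothetical; the page does not define a document format.

```python
import xml.etree.ElementTree as ET


def handle_update_vacation(doc):
    # Hypothetical handler: a real one would update a data store.
    employee = doc.get("employee")
    days = doc.findtext("days")
    return f"employee {employee} now has {days} vacation days"


# Routing table: root element name -> handler function.
HANDLERS = {"updateVacation": handle_update_vacation}


def _reply(code, message):
    # Build the xml_out status document.
    status = ET.Element("status", code=str(code))
    status.text = message
    return code, ET.tostring(status, encoding="unicode")


def accept(xml_in):
    """Return (code, xml_out), mirroring int Accept(in, out)."""
    try:
        doc = ET.fromstring(xml_in)
    except ET.ParseError as err:
        return _reply(1, f"malformed XML: {err}")
    handler = HANDLERS.get(doc.tag)
    if handler is None:
        return _reply(2, f"unknown document type: {doc.tag}")
    return _reply(0, handler(doc))
```

Adding an operation in this sketch means adding an entry to HANDLERS; the signature of accept stays the same.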

The benefits of this technique include:

Simplicity: There is only one method. New operations can be added to the acceptor simply by passing different XML documents to the method.
Extensibility: When IDL is used in a traditional manner, client and server programmers can work in isolation, but only after the IDL has been defined. IDL invariably changes during development, and every change must be carefully coordinated, which is a painful process. The XmlAcceptorPattern allows methods, input, and output data to evolve without changing the IDL.
Richness: XML data can be (almost) arbitrarily complex. XML can be used to pass simple datatypes or complex graphs of objects. The complex data is passed "by value" whereas in the old days, complex data would probably be passed "by reference" as objects. Passing data by value is usually faster than passing data by reference because no additional round-trips are needed to access the data.

"Richness"... XML can only define tree-structured data, which is no more rich or expressive than that which can be expressed in CORBA IDL, and a good deal less expressive than that which can be expressed in MIDL. Object graphs can be passed by value by using XmlLinks within a document and then translating the XML into objects, but this is what one would have to do in CORBA anyway.

Another counter-argument:

In 1998 I was working on a CORBA project where we decided to send generic sets of hierarchical named items in XML format. Seemed like a good idea.

Then we realized that CORBA IDL definitions can be just as generic as XML, if you want. CORBA has...

The "Any" type. (Which can store an instance of any type -- like a MicroSoft Variant.)
Vectors (arrays)
Structures.
Union.

I can use a structure to say that a given name has a given value. The value can be a scalar type (like a string) or another vector.

I can "do" practically all of XML in CORBA IDL, if I want. -- JeffGrigg

(I guess I should be talking about this in XmlAndCorba.)
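
To make the comparison concrete, here is a sketch of the name/value structure described above, written in Python rather than CORBA IDL. The Item class and both conversion functions are hypothetical names for illustration. It ignores XML attributes and mixed content, so it covers only the "hierarchical named items" case.

```python
from dataclasses import dataclass
from typing import List, Union
import xml.etree.ElementTree as ET


@dataclass
class Item:
    name: str
    # The value is either a scalar (string) or a vector of nested items.
    value: Union[str, List["Item"]]


def to_xml(item):
    # Map a named item onto an element of the same name.
    elem = ET.Element(item.name)
    if isinstance(item.value, str):
        elem.text = item.value
    else:
        for child in item.value:
            elem.append(to_xml(child))
    return elem


def from_xml(elem):
    # Elements with children become vectors; leaf elements become scalars.
    children = list(elem)
    if children:
        return Item(elem.tag, [from_xml(child) for child in children])
    return Item(elem.tag, elem.text or "")
```

For such data the two representations carry the same information, which is the point being argued; the disagreement on this page is about which one is more convenient to work with.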

XMLAcceptor is not limited to Corba. It applies to BusinessLogic in general.

Any vs XML

It is true that XML brings nothing new to the table. In fact, you can do anything with a TuringMachine, so in theory ObjectOrientedProgramming really isn't needed either. The question is: is it harder or easier to use TheRightToolForTheJob?

Interesting that you mention Any and Variant. In fact the COM interface I use accepts and returns a Variant. This allows the caller to pass in a string, a URL, a stream (which I have never actually done), or an MSXML DOM pointer (node or element). This has resulted in an extremely flexible and usable interface.
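
A rough Python analogue of accepting several input kinds through one parameter, as the Variant-based interface above does. This is a sketch, not the COM interface itself: to_element is a hypothetical helper, and URL input is deliberately left out.

```python
import io
import xml.etree.ElementTree as ET


def to_element(source):
    """Normalize several input kinds to a parsed element."""
    if isinstance(source, ET.Element):
        return source                      # already a parsed node
    if hasattr(source, "read"):
        return ET.parse(source).getroot()  # open stream or file object
    if isinstance(source, (str, bytes)):
        return ET.fromstring(source)       # XML text
    raise TypeError(f"unsupported input type: {type(source).__name__}")
```

The acceptor can call to_element once at the boundary and work with a parsed tree from then on, whatever the caller passed in.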

One substantial benefit of XML over traditional datatypes is that "commands" (XML documents) can be created in a text or XML editor and stored in the file system or on a web site. This is extremely useful for QA purposes because it eliminates a lot of hand-coding in tests. Also, since the output is an XML document, it can be converted into CanonicalXml and "diff"d against another XML file. A test harness that automates test cases is substantially easier to build with XML Acceptor. This doesn't only result in benefits for QA engineers, because, as we all know, ProgrammersShouldTestTheirOwnCode.
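
The canonicalize-and-diff step might look like this in Python, using the canonicalize function from the standard library (Python 3.8 or later). The xml_diff helper and the sample documents are hypothetical; stripping whitespace-only text is an assumption about what a test harness would want to ignore.

```python
import difflib
from xml.etree.ElementTree import canonicalize  # Python 3.8+


def xml_diff(actual, expected):
    """Return unified-diff lines; an empty list means the documents match."""
    # Canonical form sorts attributes and expands empty-element tags,
    # so purely lexical differences disappear before comparison.
    a = canonicalize(actual, strip_text=True)
    e = canonicalize(expected, strip_text=True)
    return list(difflib.unified_diff([e], [a], "expected", "actual", lineterm=""))
```

A test case then reduces to a stored input file, a stored expected output file, and a check that xml_diff returns an empty list.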

I simply no longer agree with passing traditional datatypes to BusinessLogic, local or remote. I stand behind my claim that XML allows far more richness and usability even compared to Any or Variant (including arrays). Defining classes and structures in IDL is cumbersome. Writing code to read and populate arrays is cumbersome. Using XSLT, XPath, and the DOM is actually quite comfortable.

If the data is just data, pass it as such. If the data are objects, pass them as explicit IDL objects. Notice that XML Acceptor does not pass objects (besides possibly a DOM) because XML cannot encode object references.

Although this makes the hair stand up on proponents of ObjectOrientedProgramming, in practice it actually isn't so bad. Passing data by value, rather than by reference, is what the web is based on. One can hardly argue with cgi-bin scripts, servlets, SimpleObjectAccessProtocol, and other uses of RepresentationalStateTransfer. Pass-by-value semantics eliminate excessive RoundTrips.

XMLAcceptor has proven itself by scaling up to arbitrarily complex tasks. A workflow engine has been built in which each component of the workflow is an XML Acceptor. The workflow engine itself is an XML Acceptor.

The workflow engine is similar to ApacheAnt. There is currently standards work going on in the area of orchestrating web services (see http://xml.coverpages.org/wsfl.html). My workflow engine can be considered an early implementation of these efforts, which are, after all, implementations of XML Acceptor. I regret that I am unable to distribute the engine and sample components as OpenSource.

The downside to an "emit and consume" approach is that the XML can get pretty hairy. Documentation becomes of utmost importance to make sure that components don't step on each other's toes. The nice part about IDL is that it's rigid and therefore the data can't get out of control as long as you avoid Variant and Any.

This need for external documentation is the most important weakness of an approach such as XmlAcceptorPattern, or a CORBA method that accepts an Any as an argument and returns an Any. An interface definition that says "Anything can be passed in, and anything can be returned" is not much of a definition. When multiple organizations utilize such an interface, somebody needs to take on the responsibility of documenting exactly what the valid inputs are and what outputs might be generated.

XML Acceptor is not for everything. The decision making process is based on the TransactionGrainSize. ObjectOrientedProgramming has lots of benefits. But it is my feeling that if most BusinessLogic in a system were implemented as XML Acceptors, you could do some mind-blowing stuff (as I have) that would simplify programming greatly. For example, since all messages are XML, it is trivial to pass messages bidirectionally between components via a message queue. This is a lot faster than using an application server, which introduces a great deal of overhead.
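
A small sketch of the message-queue idea, assuming two in-process queues stand in for a real message broker. The echo_acceptor component, the serve loop, and the None sentinel are hypothetical and only show that the queue needs to carry nothing but XML strings in each direction.

```python
import queue
import threading
import xml.etree.ElementTree as ET


def echo_acceptor(xml_in):
    # Hypothetical component: reports which document type it received.
    root = ET.fromstring(xml_in)
    return f'<status code="0" handled="{root.tag}"/>'


def serve(requests, replies, accept):
    """Consume XML strings until a None sentinel arrives."""
    while True:
        xml_in = requests.get()
        if xml_in is None:
            break
        replies.put(accept(xml_in))


def round_trip(xml_in):
    """Send one document through the queues and return the reply."""
    requests, replies = queue.Queue(), queue.Queue()
    worker = threading.Thread(target=serve, args=(requests, replies, echo_acceptor))
    worker.start()
    try:
        requests.put(xml_in)
        return replies.get(timeout=5)
    finally:
        requests.put(None)  # tell the worker to stop
        worker.join()
```

Because both directions carry plain strings, the same serve loop works for any component that exposes the single accept entry point.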

However, the lack of compile-time interface enforcement and the overhead of parsing and validating XML really turn some people off. By the way, passing Variants and Anys has the same compile-time enforcement problem. It is not possible for XMLAcceptor to take hold in an organization without evangelization by someone of considerable influence (see below). I hope that some day computers will be fast enough for most of the negatives of XML Acceptor to simply be irrelevant. In my applications they are. Believe me, I have spent enough time worrying about them.

Evangelization by someone of considerable influence:

Introducing new solutions to old problems usually requires approval from the managers of an organization. The managers need some convincing because they believe IfItAintBrokeDontFixIt. Someone in the organization must "evangelize" the new solution to the managers. This must be someone of considerable influence, especially if the idea is controversial.

The IDL consists of one method:

  int Accept([in] String xml_in, [out] String xml_out);

This has the feel of having only one function as the API for some module, which does everything the module can do.

  void* doWork(void* input); // in C++

  public Object doWork(Object input); // in Java

I'm not sure where you're going with this. One object has one method but there can be many objects. The set of objects comprises the API's functionality.

A system filled with classes that each have only one method, all with the same signature: isn't that scary? A typical code snippet ends up looking like this:

  boo = foo.doWork(null);
  bar = foo.doWork(boo);
  bar.doWork(foo);

and every stack trace consists of many doWork() calls. Wouldn't it be hell to debug or maintain?

In practice, it isn't. It results in highly modular and reusable code. XML Acceptor forces developers to write components that DoOneThingAndOneThingOnly. This is a goal of object oriented systems but requires discipline to achieve in practice. XMLAcceptor forces the issue.

Of course, XML Acceptors are implemented using modern ObjectOrientedProgrammingLanguages. One can "step" inside an XML Acceptor and see the implementation. XMLAcceptor follows the FacadePattern. The interface looks simple but behind the "wall" there is some complex stuff going on.

Also note that a single object can support multiple documents. Generally the "other" documents are configuration documents which are then followed by a <doWork/> document. This means objects are not free-threaded. In fact this is where JavaTwoEnterpriseEdition and ComPlus are headed. Systems generally scale better when multiple instances are used rather than using the same instance in multiple threads due to synchronization issues.
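
The configuration-then-doWork sequence could be sketched as follows. The ReportAcceptor class, the config element, and its format attribute are hypothetical; the point is only that state set by earlier documents is used by a later one, which is why one instance should not be shared between unrelated callers.

```python
import xml.etree.ElementTree as ET


class ReportAcceptor:
    """Hypothetical component: <config> documents set state, <doWork/> uses it."""

    def __init__(self):
        self.settings = {}

    def accept(self, xml_in):
        doc = ET.fromstring(xml_in)
        if doc.tag == "config":
            # Remember the settings for later documents.
            self.settings.update(doc.attrib)
            return '<status code="0"/>'
        if doc.tag == "doWork":
            reply = ET.Element("status", code="0")
            fmt = self.settings.get("format", "text")
            reply.text = f"report produced in {fmt} format"
            return ET.tostring(reply, encoding="unicode")
        return '<status code="2"/>'
```

Giving each caller its own ReportAcceptor instance avoids any locking around settings, in line with the scaling remark above.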

The top-level document node (e.g., the DOCTYPE) is analogous to a method. In XML Acceptor, methods don't matter. Only documents and objects matter.

[I've seen this pattern before. In the early 90s I wrote a bunch of Windows database code using Novell Btrieve. The primitive interface to the Btrieve API was a single entry point hooked into an interrupt. The number of variables you had to line up to tell this API what operation to perform was impressive. Nobody I knew used the low-level interface. There was a rather large C wrapper library of, I forget, 40 functions that mapped subsets of arguments to that single call. I would imagine that any halfway complex XmlAcceptorPattern implementation wouldn't be useful until someone had built a library of specializers. The advantage the Btrieve interrupt had was that it was very compact. It used a single interrupt, so you could slip it under the Windows 3.11 drivers and still have room for NDIS.SYS and maybe a TCP/IP stack. I don't think compactness is often a BigWin in these days of 12GB memory and 4GHz CPUs.]
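
A "specializer" in this sense might look like the following Python sketch: a typed wrapper that builds the XML and calls the single generic entry point. It assumes an accept callable that takes an XML string and returns a (code, xml_out) pair; that convention, the updateVacation document, and both function names are hypothetical.

```python
import xml.etree.ElementTree as ET


def build_update_vacation(employee_id, days):
    # Build the request document so callers never write XML by hand.
    doc = ET.Element("updateVacation", employee=str(employee_id))
    ET.SubElement(doc, "days").text = str(days)
    return ET.tostring(doc, encoding="unicode")


def update_vacation(accept, employee_id, days):
    """Typed wrapper over the single generic entry point."""
    code, xml_out = accept(build_update_vacation(employee_id, days))
    if code != 0:
        raise RuntimeError(xml_out)
    return xml_out
```

A library of such wrappers restores named operations and argument checking on the client side, at the cost of maintaining the wrappers alongside the documents they produce.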

ExPat, an event-based parser for XML, is available at http://expat.sourceforge.net/
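
An event parser is handy for an acceptor because the document type can be read from the first start-element event without building a tree. A sketch using Python's standard xml.parsers.expat binding to Expat; the root_element_name helper is a hypothetical name.

```python
import xml.parsers.expat


def root_element_name(xml_in):
    """Return the name of the first element without building a tree."""
    names = []

    def on_start(name, attrs):
        # Only the first start-element event is the document root.
        if not names:
            names.append(name)

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = on_start
    parser.Parse(xml_in, True)  # True marks this as the final chunk
    return names[0]
```

Malformed input raises xml.parsers.expat.ExpatError, which an acceptor would turn into an error status document.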

An important question is "What is CORBA doing for you?" when using the XmlAcceptorPattern. If you are using the capabilities of the CORBA Naming Service, Trading Service, or other CORBA-based infrastructure, then this is a useful idiom. However, if all your system does is pass XML around, it would probably be easier to use HTTP or other simple text-based protocols rather than bringing in the development and deployment overhead of CORBA.
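
For comparison, here is roughly what the same single-method interface looks like over plain HTTP, using only the Python standard library. Everything here is an assumption for illustration: the /accept path, the echoing accept function, and the post_once helper that starts a throwaway local server.

```python
import http.client
import threading
import xml.etree.ElementTree as ET
from http.server import BaseHTTPRequestHandler, HTTPServer


def accept(xml_in):
    # Hypothetical acceptor: echoes the root element name in a status document.
    root = ET.fromstring(xml_in)
    return f'<status code="0" handled="{root.tag}"/>'


class AcceptorHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # The request body is xml_in; the response body is xml_out.
        length = int(self.headers.get("Content-Length", 0))
        xml_out = accept(self.rfile.read(length)).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/xml")
        self.send_header("Content-Length", str(len(xml_out)))
        self.end_headers()
        self.wfile.write(xml_out)

    def log_message(self, *args):
        pass  # keep the example quiet


def post_once(xml_in):
    """Start a server on a free local port, POST one document, return the reply."""
    server = HTTPServer(("127.0.0.1", 0), AcceptorHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        conn = http.client.HTTPConnection("127.0.0.1", server.server_port, timeout=5)
        conn.request("POST", "/accept", body=xml_in.encode("utf-8"),
                     headers={"Content-Type": "application/xml"})
        reply = conn.getresponse().read().decode("utf-8")
        conn.close()
        return reply
    finally:
        server.shutdown()
        server.server_close()
```

No IDL compiler, ORB, or naming service is involved, which is the deployment saving the paragraph above refers to; what is lost is whatever those CORBA services were providing.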