Expression Api Complaints

(renamed from WildlyBloatedOopCode?)

Complaints about the tendency of some to turn mathematical, Boolean, and/or relational expressions (such as SQL) into wrapper API's. They allegedly make code larger and harder to read, especially in Java. Proponents claim they provide more type protection, validation, and implementation interchangeability. Maybe there are some compromises between bloat and protection. Is such an API an AbstractionInversion?

Why do otherwise simple things end up being so long at the hands of some programmers and API designers? Why is it tolerated?

Exaggerated Example:

   Normal:

   A = B + C * D

   Bloated:

   adder = new Adder(new FloatManager())
   multiplier = new Multiplier(new FloatManager())
   temp = new Float()
   result = new Float()
   temp.setValue(multipler.multiply(C, D))
   result.setValue(adder.add(B, temp))

There are API's which try to wrap or replace SQL that resemble this.

On the other hand, try to rewrite:

   A = B + C ** d CR ^M ^A t Mike - shall we go for lunch? -- Joe ^S

Anyone can build something that works when it works; ItsHowItWorksWhenItDoesntWorkThatCounts.

Huh?

The programmed interface catches more errors. In the particular case of accessing an RDBMS from a program, you can either pass in a string or use the bloated library. The library may take more code (especially if it's badly designed), but it does mean that many errors are caught by the compiler, rather than being detected only when the code is run.

[It's not hard to check SQL queries. You write the query, give it some dummy variables, and then send it to the database. The parsing engine in the DB will tell you if you have any errors. If you're dynamically generating large portions of SQL (instead of just filling in values), this may be useful, but then you'd want to use a parse tree anyway. In general, the typechecking a compiler can give you is far less useful than UnitTests written against the database itself.]

Maybe this relates to the on-going compile-versus-interpreter, AKA "static typing", HolyWar. It is generally a battle over easier to read and write versus compiler checking. Are errors caused by the difficulty of reading and managing verbose code made up for by compile-time checking? Some say yes and some say no. I personally don't like working with verbose code. The closer it is to (good) pseudo-code, the better. But it may depend on the nature of the app and it's environment. In writing a life-support system for Mars-bound astronauts I'd likely take a different approach than in writing a monthly report for a middle manager.

Such reminds me of Assembler Language. "Natural" expressions are forsaken for something more stiff and mechanical.

A lot of this comes from flaws (well, design tradeoffs) in the language rather than any inherent bloatedness of OOP. Conciseness was never a design goal of Java - it was designed to keep the core language relatively simple and easy to learn while allowing you to accomplish most of what you want to do. As a result, we get things like the primitive/reference distinction, the lack of operator overloading, and the requirement that "everything is a class", all of which often mandate more typing than would otherwise be necessary.

If you were working in CeePlusPlus or DylanLanguage (and I think SmallTalk), "A = B + C * D" would be perfectly legal code, regardless of the types involved. In fact, good OOP code in those languages can often be much smaller than equivalent P/R code because the language lets you define abstractions that can then be manipulated with normal syntax constructs. Operator+ might contain all sorts of code to keep the object's internal state consistent, but you don't have to see it.

Now, if you could provide an example of these "bloated SQL class libraries", we could see if this applies to them too...

-- JonathanTang

See middle of OoLacksMathArgument

That isn't a bloated SQL class library, it's a proof that relational math is inside the scope of Java.

Perhaps it is both. One can do assembly even using Java API's. That does not prove much from a practical perspective though.

Such also risks the same problems described in LispIsTooPowerful. A language that makes it too easy to re-shape a language is no longer a language but a meta-language, and learning every application becomes the equivalent of learning a new language AND the application. -t

There are API's which try to wrap or replace SQL that resemble this.

Those query APIs are very useful, since they can be used to programmatically modify a query. Doing this directly with SQL requires messy string processing. -- AndersBengtsson

That's basically putting the parse tree in the code. Useful, as you said, if you ever need to programmatically modify a portion of the parse tree. But unless your requirements ask for it, YouArentGoingToNeedIt. We have compilers so we can abstract away all the messy details of language syntax behind messy string processing. Otherwise I'd just use CommonLisp. -- JonathanTang

I'm using a tool (HiberNate) which both provides a string-based query API and a Criteria API. Most queries I use need to be constructed dynamically, so for me it's the string-based API that is YAGNI, not the other way around. -- AndersBengtsson

In the above example, it seems there is more than what the eyes see. If the FloatManager is really required, then its definitely not a case of simple arithmetic operation. Otherwise, I would write it as

   A = B.add(C.multiply(D));

While arithmetic operators are definitely more intuitive, well written API are not necessarily that bad.

Note: If A,B,C,D are types that support arithmetic operations, then they must be supporting within their class.

-- VhIndukumar

Further discussion moved to PrimaryNoun

Yeah, OO code is always bloated, unlike SQL code which remains elegant and simple no matter what you need to do. Case in point:

 select 
    xx.xx_rec_id promise_unit_ID, 
    xx.xx_volume aggregate_volume,
    xx_category category,
    xx_difference difference, 
    xx_initial_liability_amt initial_liability, 
    xx_out_of_pocket out_of_pocket, 
    xx_holdings_Secured_Flag holdings_UNSECURED_FLAG, 
    xx.xx_normal_liability_chart normal_liability_chart,
    xx.xx_facility_liability_chart FACILITY_liability_chart,
    yy.yy_rec_id target_REC_ID,
    yy.yy_prediction_meter chart_prediction_meter, 
    yy.yy_unique_instance_liability_chart unique_instance_liability_chart, 
    yy.yy_target_liability_chart target_liability_chart,
    xx.xx_indicator_id, xx.xx_unused_src_calc UNUSED_count_SOURCE,
    yy.yy_target_nm target_identifier, yy.yy_target_id target_ID,
    xx.xx_facility_id mqxy_promise_unit_ID,xx.xx_amt mqxy_unit_present_count,
    xx.xx_units mqxy_unit_present_units,
    xx.xx_standard_amt mqxy_standard_promise_count,
    xx.xx_units mqxy_standard_unit_units,
    to_CHAR(trunc(xx_during_contract_DT)) during_contract_DATE,
    to_CHAR(trunc(xx_termination_DT)) termination_DATE,
    to_CHAR(trunc(xx.xx_expiration_DT)) expiration_DATE,
    nvl(agreement_aggregate.agreement_aggregate,0) agreement_aggregate,
    nvl(lc_aggregate.lc_aggregate,0) LC_aggregate,
    r.r_xoynx_RTG xoynxS_RATING,
    r.r_grrrr_RTG grrrr_RATING 
 FROM 
       (select 
            lyy.xx_rec_id, sum(lyy.l_count) agreement_aggregate 
        from 
            target yy, promise_agreement zz, promise_unit aa, agreement_unit lo 
        where 
             zz.yy_rec_id = yy.yy_rec_id 
             and xx.ca_rec_id = zz.ca_rec_id 
             and lo.xx_rec_id = xx.xx_rec_id 
        group by 
            yy.yy_rec_id,lo.xx_rec_id )
    agreement_aggregate, 
       (select 
            lc.xx_rec_id, sum(lc.lc_amt) lc_aggregate 
        from 
            target yy, 
            promise_agreement zz, 
            promise_unit aa, 
            lc_unit lc 
        where 
            zz.yy_rec_id = yy.yy_rec_id 
            and xx.ca_rec_id = zz.ca_rec_id 
            and lc.xx_rec_id = xx.xx_rec_id 
        group by 
            yy.yy_rec_id, lc.xx_rec_id ) 
    lc_aggregate,
    target yy, 
    promise_agreement zz, 
    rating r,
    promise_unit cs  
 where 
    zz.yy_rec_id = yy.yy_rec_id  
    and r.yy_rec_id(+) = yy.yy_rec_id 
    AND xx.ca_rec_id = zz.ca_rec_id  
    AND lc_aggregate.xx_rec_id(+) = xx.xx_rec_id  
    AND agreement_aggregate.xx_rec_id(+) = xx.xx_rec_id 
    and xx.xx_indicator_id = 1 
    and zz.ca_indicator_id = 1  
    and (r.r_indicator_id = 1 or r.r_indicator_id is null) 
 order by 
    target_identifier

While any RelationalWeenie can easily grok this statement, the equivalent OO code would make any mind crumble.

Don't tempt me :)

Wasn't this example discussed in SqlFlaws? No need to revisit it here. I think it (and SQL) suffers from ThickBreadSmell.

Don't try this at home with any ExpressionApi?

Actually, I have an expression API for C++ that I've been using to build far more complex queries than this one. -- DanMuller

Very few will argue that SQL is the ideal query language for complex queries. But specifying the same thing as an expression API wrapper over SQL is probably not going to simplifying it. Time for a new query language.

Effectively, my expression API is a new language. It is not modelled on SQL, but rather on TutorialDee. A moderate amount of processing is done to translate to SQL.

Have you by chance considered TQL? (TqlRoadmap)

Yes, I keep half an eye on that effort. To put it succinctly, it doesn't have a solid enough theoretical basis, and the discussions tend to meander around topics of which I already have a firm understanding. TutorialDee (or more accurately, the work underlying it) provides a solid basis ready-made, allowing me to focus on implementation issues and details of convenient API syntax.

Working in C++, it's easy to come up with an API that has relatively low syntactic overhead, allowing the queries to be expressed clearly. I've thought a bit about implementing something similar for C#, and I'm at a bit of a loss as to how to achieve the same economy of expression. The need for the 'new' keyword, or qualification of static function names by their (required) containing class, really imposes unnecessary and annoying verbosity, in order to single-mindedly enforce an object-oriented flavor to everything (even when a functional style is being used via static 'methods').

To be fair, the C++ API is hobbled by some limitations, too. The API uses a dynamic approach to attribute and relation references, so their names have to be supplied as strings, which have to be 'wrapped' by calls to functions (or constructor calls) to turn them into objects representing attributes and relations. This also adds verbosity, albeit less than the analogous constructs in Java or C#. The only languages I'm familiar with that provides tools to avoid these problems are, not surprisingly, those in the Lisp family, with their emphasis on symbolic processing and language extensibility. -- DanMuller

This topic is not necessarily to imply that all OOP code is bloated, only to point out and study the patterns of bloated OOP and how it got that way. Simple, the OO got bloated because people tried to 'bend' OO to match their favourite methodology. It is not the problem of OO, but of the people using it. And for that, I am afraid, there is no solution other than to teach/replace the offenders.

[The page shouldn't be about OO, it should be about API complaints.. so what's with the OOP hatred? There are API problems in procedural code, like the windows API, and there are API problems in SQL, a bloated baroque language.. and there are API problems in COBOL, and PHP. The title of this page is ExpressionApiComplaints, so why would you "point out and study the patterns of bloated OOP", but not study problems with say SQL, Windows API, Ada, Algol, procedural programming, etc.? Sorry, but it seems biased to me, and a lot of OOP hatred is here.]

On the contrary, I find there's hardly anything more elegant than well-factored OO code. Poorly factored OO code is just like any other poorly factored programmatic expression - a mess.

That SQL statement above is clearly a mess. SQL is like that though - anything but simple queries reminds me of code written by people who like to unnaturally inline a ton of statements just because they can. -- SteveConover

As described under SqlFlaws, some of the problems with that example are due to bad coding, and some are due to flaws in SQL itself and not inherent in the relational paradigm. SQL is the COBOL of relational languages IMO. Stupid people with stupid languages can make messes in any paradigm. I don't think anybody disputed that.

Under PerniciousIngrownSql, I describe some techniques for creating "light-duty" wrappers for repeating patterns sometimes found in SQL. (Perhaps these should be moved to HelpersInsteadOfWrappers.) They are merely convenience string functions, but don't obligate one to commit to pure API's for an entire SQL command. It is a middle ground that allows one to create abstractions only when and where needed. -- top

Writing expressions that can be manipulated at runtime can be a bit of a trick in Java and related languages, consider Microsoft's LINQ (LanguageIntegratedQueryProject) solution for the .Net platform: It has some really interesting ideas on this subject.

Specifically, see "Lambda Expressions and Expression Trees" at http://msdn.microsoft.com/netframework/future/linq/default.aspx?pull=/library/en-us/dndotnet/html/linqprojectovw.asp#linqprojec_topic3 A method parameter of 'Expression<T>' receives the AbstractSyntaxTree for the function that's passed to it, rather than a pointer to the executable code. "Now there's an interesting idea!"' -- JeffGrigg ;->

It seems a main contributor to this page blames OOP when it has little to do with OOP. It's more about choosing the write syntax and the write tool for the job. Java happens to be OOP but that doesn't make it an OOP problem, that makes it a Java language problem. Algol and Ada were not OOP and they are seen as bloated baroque tools, with API's that also seem bloated and baroque. It seems someone is venting and blaming OOP since they have a bias against anything OOP. Sometimes OOP can be part of the problem when it comes to bloat (using OOP when it is not necessary). However often OOP reduces bloat and tersens certain tasks, which procedural code would be very verbose at (example: look at the Windows API and how they emulate OOP, and how scary that API is to work with. I don't know one single Windows API lover. The ones that claim to love it really don't, they just lie and pretend they are smart since they can work directly with the dangerous API; that's an ego problem, which is similar to "real men use assembly code".).

Many non OOP languages are bloated, such as SQL itself. Try for example to put a window, panel, or GUI into SQL and you will have a bloated mess which relational doesn't work as well with. It's like using an integer to store a string, when you could just use the string. OOP is better at modeling certain things, while relational is better at modeling repetitive similar structured data, while procedural is better at modeling simple tasks that need to be done. It's of course about using the right tool for the job, but blaming OOP for all programming problems is silly. OOP can actually tersen code in many situations and can reduce reinventing the wheel that would have been rolling in procedural code.

OOP does need more theory behind it, in order for it to be taken more seriously, but it does work despite it lacking theoretical background at this time.

In fact in Wirth's and many others' point of view, structures in procedural code are really just limited objects! OOP just offers more abilities to use those structures than before. OOP is procedural code with more extensions available. Some pure OOP advocates think that OOP is something completely different than procedural code with structures, but in fact there are many similarities: an object is a structure, and methods do stuff, just like procedures. Some feel that OOP is just a reinvented procedural coding system with a new name: OOP, procedural code and structures with more benefits and abilities than in C or Pascal. In Wirth's and many others' view, objects are just extended structures, and methods are just procedures).

So the point is that those who hate OOP better look in the mirror - OOP is procedural coding with more abilities. If you hate OOP, you must hate procedural code too - because they are the same thing, OOP is just procedural code with more benefits. And this is coming from someone who doesn't use OOP when it's not needed, so don't think this is a OOP lover speaking.

Perhaps rather than "OOP hate", it's more about "using the wrong tool for the job". It's hate of using technique X to achieve goal Y. Using OOP to build an expression tree is ugly and unnatural to read and/or write for a good many readers. Sure, OOP may be able to extend procedural programming, but sometimes a different kind of tool/approach is needed rather than an extension.

Re: "Many non OOP languages are bloated, such as SQL itself. Try for example to put a window, panel, or GUI into SQL and you will have a bloated mess which relational doesn't work as well with." - To me, this is almost like saying, "Adding a new employee via SQL is bloated and difficult." That is true as stated, but that is not the proper use of SQL. I'm all for storing GUI attributes and configuration in an RDBMS, if tools better supported it. However, that does not mean that SQL itself is the proper way to actually put GUI info into the database. Things like (app-language-neutral) screen designers or GuiMarkupProposal would actually be used for putting the majority of info into the GUI DB. Using gajillion OOP new's and/or set's for GUI setup is not only ugly, but app-language-specific, meaning each app language has to invent it's own GUI engine, which is the situation we find ourselves in today. This is anti-OnceAndOnlyOnce. If one sets out to remedy this by making an app-language-neutral GUI engine, most will likely find that a RDBMS-like model starts to make more sense. Related: EverythingIsCrud. -t