Scatter Sql Everywhere

Alleged AntiPattern:

An n-tier application which contains SQL syntax in all the tiers.

Child's play. Define that slice of the app with the SQL as the "data access system". The real fun starts when your project has SQL syntax in every file. -- PhlIp

And all the while IBM and many other vendors thinking that EmbeddedSql was a good thing. ;->

Your alternative is to do what? List them and we can come up with downside those approaches as well. I've always avoided ESQL like the plague, but it had its place, especially when systems were much less powerful than they are now. Anyway, stored procedures have been available for a long time, you never needed to embed SQL.

The alternative is to isolate the SQL and stored procedure calls into a "data access" layer (or set of functions). Doing this enables you to more easily find and optimize the SQL statements, or change them from/to stored procedures, or provide artificial subs during UnitTesting.

When SQL statements (or even stored procedures) are distributed throughout your application's business code, then changing the database schema, or switching from one database to another, can be very expensive and time consuming. Business code and SQL should not be mixed.

No one in their right mind has giant parameter lists, you pass around data structures, not simple types. That may be a struct, an object, a recordset, whatever, but no need for long paramater lists.

As far as changing database vendors, that does not happen often enough, and thus YagNi trumps wrapping for something that may be ten years off. Languages at a site tend to change more often than DB vendors in my observation.

Ha! What about when you have to provide support for 10 different DB vendors at once? Which by the way.. many applications do. It is quite common to support at least 2 to 3 different databases with one application.

How about something like this in the using module?:

  function queryX(...) {
    theSql = "select...";  // default
    if (dbVendor == 'oracle') {
      theSql = "select...";
    } elseif (dbVendor == 'sybase') {
      theSql = "select...";
    } elseif...
    }
    return(runQuery(theSql));
  }

However, I am sure some will complain about the grouping. Some will want to group all by vendor, others by query, others will want code in a table so that the grouping is virtual, etc. Another Grand Grouping HolyWar. Ideally something like HibernateQueryLanguage may be the way to go, if perfected.

EmbeddedSql -- a good thing to use when you're convinced that your programmers aren't smart enough to make subroutine calls or write data driven code.

Don't you have bigger problems if that's the case?

Yes, but unfortunately like defensive driving we have to operate in an imperfect world.


Abstracting SQL dialects into a PhraseBook?

There is an AnalysisPattern which can be applied to this problem, which is called the PhraseBook? pattern. This fundamentally entails creating a book containing your SQL statements. In your code, you can ask the book to return phrases (or SQL statements) by name. You can also pass parameters to replace variables. I can't remember where I saw this pattern, but it was definitely at one of the PloP proceedings online. I think it was in the same group as the SkinPattern?. -- TobinHarris

I HaveThisPattern - I tend to use ResourceBundles in Java for this purpose. -- KrisNuttycombe

The PhraseBook? pattern was used by an OO library vendor to implement transparent support for all the major DBs of its time. It made for impressive demos. "Swap Oracle for Informix? Just change this one statement here..."

In my opinion, SQL wrap proponents exaggerate the frequency of changing vendors and the benefits that wrapping gives you if it does happen. It is fairly easy to find SQL statements with text search commands. I don't know why they exaggerate it. Maybe it happens more to them than it does to me. But, I've worked at a lot of companies in my contractor days and it does not happen that often. Generally a given shop will have 2 dialects at the most and one can find commonalities between the two dialects to write SQL in the first place that does not need that much changing if you do swap between them. If you do have to change them, putting all the SQL together in one repository does not significantly improve the change speed, and makes the code harder to change for regular (non-vendor-swap) changes.

Business logic-related maintenance often happens several times a year for several years; while vendor switching is usually a one-time occurrence. I'd say it's about a 15-to-1 ratio, as a rough estimate. In other words, for any given SQL statement, the quantity of changes or visits due to biz logic changes is about 15 times for every time it's visited for DB vendor changes. Why make the 1 harder at the expense of the 15? -t

Is it wrong to want to keep my SQL PhraseBook? in the db?


I like wrapping SQL, but not primarily for the future-safety it gives me. I like it for the way it simplifies and clarifies the code by removing the specifics of the SQL query. When I ask my friend to buy me some milk, I don't care which path she chooses between the isles at the grocery store; I care about the request ("Could you buy me some milk?") and the result (the milk). Same with SQL; the details of the query detract from narrative of the code.

As a side bonus, a DatabaseAbstractionLayer makes it easier to inject a mock for unit testing. -- CarlMasak?

That's generally what is covered in HelpersInsteadOfWrappers. Nobody is really against localized abstraction. It becomes an issue when somebody decrees that ALL sql must be behind wrappers.

All SQL should be wrapped. Database schemas change for different reasons than application logic. They should therefore be decoupled.

Not if the carrying cost of the extra layer is greater than the average payoff from wrapping. You pay the extra layer-bureaucracy "tax" maintaining each part of the layer regardless of whether it actually helps down the road or not. I'd like to look at representative samples of schema changes that you think justify such. In my experience extra layers usually don't pay off.

The assumption appears to be that encapsulating SQL is expensive. There's no reason it needs to be.

With the exception of simple CRUD systems where the database is dedicated to one application and the access patterns are straightforward, it is always best practice to encapsulate the specifics of data access and manipulation. The core reason is, as noted previously, because the database schema and the application logic change for different reasons and at different rates. Encapsulating SQL means that:

The payback for a thin encapsulation layer is immediate. There is no reason not to do it (again, except in the case of a very simple CRUD application). Scattered SQL everywhere should never make it through a code review.

The schemas usually change for the same reasons that apps do, in my experience. Your characterization does not match my roughly 20 years of industry experience with RDBMS at about 3-dozen different organizations (which includes consulting). And, I've never seen a useful "thin" one. Again, I'd like to see some UseCases to evaluate their likelihood and alternatives. Making extra functions and methods that are called from one place clutters up the code with extra scaffolding and formality and makes the eye and scroll-bar have to jump around more to read and change the code. Use comments if you only need comments. It's a great socialistic jobs program for programmers, but usually makes no business sense. Most orgs I work for want more nimbleness, not code-intensive formalities. (However, domains such as accounting and life-support may have a different labor-to-risk profile.)

The schema and application logic will only change at the same time and for the same reasons in simple CRUD applications, which is why I excepted those explicitly. For systems where the database is shared by multiple applications or where the application logic is more complex than CRUD, ScatterSqlEverywhere is without doubt an antipattern, in part because of the four bulleted issues you failed to address.

To "test" your bulleted list, I would apply various scenarios. They are only "true" in situations, not universal. For example, whether "changes can be made in exactly one place" depends on the change. I can give a scenario where having wrappers requires more change spots, but I doubt that would prove anything substantial, since everyone agrees there are exceptions to most "rules". It's determining the "rules", or most common situations, that matters the most. Further, the boundary between CRUD and non-CRUD can be very blurry. Also, being "CRUD" and being simple are not necessarily the same. CRUD==simple is simply naive (or subconscious domain ego inflation). Otherwise, why would there be 100,000 lines of CRUD-centric code out there for many apps? If CRUD was simple, somebody would factor it all into a library/API(s) by now such that the 100,000 LOC apps would only be 2k LOC, or the like. Furthermore, most production apps have a large CRUD component to them, even if they do "fancy stuff" like launch rockets, predict whether, etc. If you have some fairly clear rules for what's CRUD versus non-CRUD, I'd like to see them. -t

As far as I can see, you haven't addressed the core issues I raised. First, it is obvious that, aside from simple CRUD systems, the database schema and the application logic do change for different reasons and at different rates. That alone suggests that the best practice is to decouple the two components to make the system easier to maintain and extend.

Second, restricting changes to one place, even if that place is a module rather than a single function, further increases maintainability and extensibility by reducing the risk of change.

Third, wrapping SQL, even in a simple function, significantly simplifies unit testing. You can mock out the function to test the application logic in isolation and you can test changes to the schema without pulling in application logic.

Fourth, interface functions with descriptive names do make the code easier to understand, again improving maintainability and extensibility.

Fifth, and related to the first and second points, insulating applications from changes related solely to other applications further improves maintainability and extensibility by reducing the impact and risk of changes.

A SQL wrapping layer would have to be pretty heavy to outweigh those benefits.

You have yet to provide any reasons not to decouple SQL. My point here stands.

Yes I have. It can create two places that may need updating instead of one. For example, if a new column is added to a table, one may then have to change both the SQL isolator function/method AND the app function/method that uses the new column. This is compared to one method that would have to be changed if they are not separated. One is less than two. And it creates more indirection code (layers) to have to maintain. All things being equal, less code is better and less layers is better. Layer hopping is extra busy work that costs time and money. It's not a free lunch. I've worked on layer-happy systems, and it can take almost double the time to make any given change. That's roughly $60,000 more the owner is paying each year per developer for the "privilege" of layers. Again, I am not always against it, but rather saying the cost and benefits need to be weighed.


Re "CRUD is simple" - I believe this common misconception comes about because on a high level CRUD is simple. Unlike say weather forecasting which requires years of study to know the general domain of forecasting, CRUD as a general concept comes pretty natural to most humans. However, making it work right for a given situation requires myriad interacting details to be coordinated and tuned. Weather forecasting is front-loaded (or both-loaded) with complexity, while CRUD is back-loaded. The devil is in the details such that it's a lot of little devils instead of one big one. --top


See NoStrings, HelpersInsteadOfWrappers, DatabaseVendorLock, PerniciousIngrownSql <-- should this be merged with this page? {except it's already big}


JulyEleven

CategoryAntiPattern, CategorySqlProgramming, CategoryInfoPackaging


EditText of this page (last edited June 2, 2014) or FindPage with title or text search