StoredProcedures are not inherently evil, any more than imperative code anywhere else is evil. In some cases, StoredProcedures may provide advantages over alternative approaches. Because they are tightly coupled to the database, they may provide significant performance benefits over accomplishing the same tasks outside the database. They may also represent functionality OnceAndOnlyOnce within the managed, durable, secure, transactional environment of the database -- rather than having to replicate the functionality among clients or deploy it in complex and possibly insecure or brittle tiers above the database management system.
However, StoredProcedures as typically implemented in popular database management systems use proprietary, vendor-specific syntax which hinders portability. Also, some database management systems make refactoring StoredProcedures very difficult, due to poorly implemented mechanisms for updating existing StoredProcedures. Finally, StoredProcedures may sometimes be used as an awkward means of overcoming a database product's lack of declarative constraint mechanisms. These factors can make maintaining systems and applications based on StoredProcedures quite awkward, and therefore contribute to the view that StoredProcedures are evil.
Issues to consider when deciding whether to use SPs:
Discussion regarding the evilness of StoredProcedures moved here from StoredProcedures.
Stored procedures and triggers put application logic into the BackEnd. This is not a GoodThing in a modern n-tier application. It is too easy to end up locked into one DatabaseTechnology? and lose all that flexibility that n-tier promises (and sometimes delivers).
This is a hand-waving style of argument that, if not developed further, should be deleted.
{I'd suggest simply asking, "Could you please clarify and/or provide sample scenarios?". No need to pre-suppose malice here.}
Stored procedures are perfectly good. Both reasons the creator of this page offers to damn stored procedures are completely false. These are important tools, and a smart and knowledgeable programmer knows when to use them and when not to use them.
It is false to say: "Stored procedures and triggers put application logic into the BackEnd"
For example, a great use of stored procedures is to combine a multi-step activity into a single request as an optimization. And an example of an intelligent use of triggers is the production of an audit trail upon revision -- it is not truly application logic, and moving that activity out to a trigger reduces clutter in the middle tier and enforces uniformity.
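For illustration, here is a minimal sketch of such an audit trigger, written in PostgreSQL PL/pgSQL; the table and column names (products, price, product_audit) are hypothetical and not taken from any system discussed here:

  CREATE TABLE product_audit (
      product_id  integer,
      old_price   numeric,
      new_price   numeric,
      changed_by  text      DEFAULT current_user,
      changed_at  timestamp DEFAULT now()
  );

  CREATE OR REPLACE FUNCTION audit_price_change() RETURNS trigger AS $$
  BEGIN
      -- Record only genuine price revisions; ignore other updates.
      IF NEW.price IS DISTINCT FROM OLD.price THEN
          INSERT INTO product_audit (product_id, old_price, new_price)
          VALUES (OLD.product_id, OLD.price, NEW.price);
      END IF;
      RETURN NEW;
  END;
  $$ LANGUAGE plpgsql;

  CREATE TRIGGER trg_audit_price
      AFTER UPDATE ON products
      FOR EACH ROW EXECUTE FUNCTION audit_price_change();

The middle tier never has to remember to write the audit row; every update path through the database is covered uniformly.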
It is false to say: "It is too easy to end up locked into one DatabaseTechnology? and lose all that flexibility that n-tier promises."
The ability to switch database vendor is a mere side-effect, and most emphatically not the reason we build multitier systems. Moreover, all database products worthy of consideration for significant systems provide stored procedures [why?], so using stored procedures introduces no lock-in. Stored procedures merely require a porting effort should you wish to change database -- an effort that is no smaller when limiting yourself to only queries.
Why, you ask, do all database products worthy of consideration for significant systems provide stored procedures? You'd have to ask those vendors why they made that choice, not me. But they all do provide stored procedures. (I recognize that "why" might be a nudge towards a discussion of WhichDatabasesAreWorthy?. That deserves a different page.)
I have implemented systems all the way from (1) no stored procedures at all to (2) practically all business logic was in stored procedures, and many systems in between. From experience, I have found it better to have no stored procedures or triggers. Triggers are only tolerable for maybe a few auditing functions, but they're generally not necessary and practically always more bother than they're worth.
I have used them and will use them as a performance optimization. But only as a last resort. My experience has been that people grab for "the stored procedure solution" long before trying anything else -- like thinking or doing their job, for example. -- JeffGrigg
But couldn't the same argument be made about OOP? Keep all your data in structs and arrays, not in objects, because if you keep all the data in structs and arrays you can easily map that data to a database and need no object-relational mapping. The advantage of OOP is that data and procedures are tightly bound - an object has both methods and data. A struct, on the other hand, has only data, and the procedures are kept separate. The problem with structs plus separate procedures is the syntactic disadvantage: passing structs as parameters makes the code harder to read (more function parameters - for example, Win32 API functions with 6 or 9 parameters; the documentation splits them onto separate lines because they are such a mess and so hard to follow. Writing a plain C Win32 API app is extremely difficult and verbose, whereas using OO technology like Visual C++ or Delphi is much less verbose and arcane). When you keep code separate from the database, you have to do more mapping and futzing around to connect your app to the database. I've never been a fan of stored procedures myself, but I'm reconsidering the whole idea now that languages like TutorialDee use procedures (operators). I'm still researching the Third Manifesto's "domains", trying to find out why Darwen and Date think domains = classes (if domains don't have any methods they cannot be classes, and are not very similar; domains are more like types in non-OO languages).
My company uses tons of stored procedures, and our system handles some complicated tasks quite well. This was a conscious design decision that we feel paid off. So I have a hard time seeing them as evil. Wrong for some jobs maybe, but not evil.
A good use for a stored procedure: in a distributed environment where there is a need to do something relatively simple frequently to a large dataset. Bad efficiency to pull all that information across the net (using JDBC or whatever), modify each row or item, just to push it all back into the database. In other words, where you might use a nested select for update construction.
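As a rough sketch of that idea (PostgreSQL PL/pgSQL; the sessions table and its columns are hypothetical), one server-side call replaces fetching every row across the network, modifying it, and pushing it back:

  CREATE OR REPLACE FUNCTION expire_stale_sessions(p_cutoff interval)
  RETURNS integer AS $$
  DECLARE
      n integer;
  BEGIN
      UPDATE sessions
      SET    status    = 'expired'
      WHERE  status    = 'active'
      AND    last_seen < now() - p_cutoff;
      GET DIAGNOSTICS n = ROW_COUNT;
      RETURN n;   -- rows touched, for the caller's logging
  END;
  $$ LANGUAGE plpgsql;

  -- Client side: a single round trip instead of thousands.
  -- SELECT expire_stale_sessions('30 minutes');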
This is a good example of a valid performance optimization.
But... If you regularly need to say "discount all products in this line 10%," and you find that to do this, you must update thousands of records, then I think this implies that the database is not well designed to support your business requirements.
Another solution would be to use a "3-tier" organization, which places 'host language business logic' on the server, "next to the database."
I think stored procedures should not be used except as an optimization. Putting the SQL statements in your source code is simpler by far.
(Seems to me that the issue discussed in this block is NOT one that requires StoredProcedures. You only need a more sophisticated DataManipulationLanguage?. Then you could send the DataManipulation operation to the Database, which could then compile and perform the manipulation locally (in accordance with your capability/rights), if necessary as a single transaction. Even if you need the same operations many times, you could do something similar with caching or by having the front-end forward a session-limited 'StoredProcedure' that you can then invoke as often as necessary, and the results of this compilation can be cached if that is appropriate.)
(If one needs a stored procedure, it's because it should be sitting there in everyone's namespace. It becomes a true interface to the Database - one with which users will 'couple' themselves (which ultimately reduces the flexibility of the Database as much as the schema). Session-limited procedures don't cause this sort of coupling.)
(StoredProcedures have some purposes in maintaining certain meta-data and constraints that the DBMS is incapable of providing directly (covering another weakness). They also have purpose in essentially extending the scope of the DBMS as a remote service. However, mere optimization doesn't really require the 'stored' half of 'StoredProcedures'. Just something to keep in mind.)
Aren't stored procedures a natural extension of OnceAndOnlyOnce? I really don't care where a common utility executes as long as it is easily available to multiple applications.
Are stored procedures the only way you can reuse code between applications? As an alternative, one might consider "middle-tier business logic" via CORBA, DCOM, RMI, SOAP, or other such protocol. If that's still too complicated, one could try calling a commonly used subroutine library to reuse code (...with the downside being versioning and release control procedures).
No, stored procedures are not the only way to reuse code, but sometimes they are the best way to implement the code to be reused, especially across multiple application development environments. Real-world example: we have a large database being accessed by multiple applications, including SQL*Loader-based scripts, Oracle Forms, Microsoft ADO and DCOM, Oracle Reports, and Crystal Reports. Where we don't have stored procedures we often have multiple almost-alike implementations of common functionality. Stored procedures are merely one option for implementing shared functionality, with their own set of advantages and disadvantages. I would not exclude stored procedures out of hand, nor use them as the only solution.
What's wrong with PreparedStatements? Surely a pool of these with the SQL created dynamically at server startup or on a regular refresh basis would avoid the problems of distribution and having logic 'hidden' in the back end.
Yes! Exactly!
I've been on many projects where people say "Dynamic SQL is giving us inadequate performance; we have to do everything with stored procedures." I say, "why haven't you considered 'prepare and execute?' It's standard, widely used, portable across databases, and very effective at improving performance. I've seen 'prepare & execute' improve performance by up to 10x." ("But we have to do stored procedures!" they say, and yet they can't explain why.)
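For readers unfamiliar with the technique, here is a minimal sketch of prepare-and-execute at the SQL level (PostgreSQL syntax; the orders table and its columns are hypothetical). Most client APIs expose the same idea through their own prepared-statement calls:

  -- Parsed and planned once per session...
  PREPARE get_orders (integer) AS
      SELECT order_id, order_date, total
      FROM   orders
      WHERE  customer_id = $1;

  -- ...then executed repeatedly with different parameters.
  EXECUTE get_orders(42);
  EXECUTE get_orders(97);

  DEALLOCATE get_orders;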
No! Wrong! That StrawMan argument does not refute any goodness of stored procedures.
The use of stored procedures for optimization has nothing to do with the issues of dynamic vs. static SQL. See above for some ideas. There are many fantastic optimizations available for which prepared statements offer no benefits [E.g.?]. Besides, all good middle-tier SQL programming -- aside from user-guided ad-hoc query -- already relies almost exclusively on prepared statements. Those 10x improvements are only available when fixing the code of unskilled programmers.
Not a StrawMan: I've been on several projects where the performance issues of dynamic SQL were used as the justification for why we "had to" use stored procedures. Static SQL or prepared statements were not considered an option because the people didn't know about them, or because prepared statements are more difficult to accomplish with commonly used Microsoft database APIs. -- JeffGrigg
I find that arguments for or against stored procedures often boil down to turf and control issues. DBAs like stored procedures because it prevents developers from misusing the database and lets the DBAs do all the performance tweaking. Developers hate stored procedures if they have to get a DBA's permission or assistance to change them.
My experience is that stored procedures are useful in that they create an explicit functional interface to the database, at the database. This reduces coupling to business logic, which is a good thing. Changes to a database can be tested against the stored-procedure contracts.
The alternative is to test all the SQL, potentially distributed widely across applications -- which is not OO, though it is object-centric.
[Material moved from StoredProceduresAreOk?; the pros/cons are mixed in above]
Portability
What if the StoredProcedures are written in an interpreted language that is available on many environments and operating systems? (e.g. PerlLanguage) Does this simply make it less evil?
What if it's the same interpreted language that all the user interface is written in -- i.e., the interpreter must run on the target environment anyway? (For example, PerlLanguage scripts embedded in a database that is implemented with one of Perl's DB interfaces.) -- DougKing
I find database portability to be nearly a non-issue with MIS systems. Seriously, when was the last time you had to move from one database to another? Databases usually survive SEVERAL generations of the programs that run on them.
Databases do tend to hang around in the same form while the tools that access them change. But is that a good thing? Does it really imply that the database is so good that it doesn't need to change? Maybe the database should change, but it's so rigid and inflexible that it doesn't get changed, and everything else -- which does change with changing business needs -- works around its limitations? --StevenNewton
This is a vendor-specific issue it seems to me. It is not a fundamental rule of databases that stored procedures must be in one language.
Above, I was speaking about the one significant database application that I have ever developed (it used Berkeley DB_File and perl). Early in that development, I put a few perl regular expressions in the database to adapt it to different environments for each database. I found that if there was a problem, I had to go into each DB to correct it. I figured out later that a better way to adapt it was to put some ordinary data into the database, and load different perl modules depending on the data. For example, it had to parse the output of different compilers, so instead of putting parsing code in the database, I put just the name of the compiler, and loaded an appropriate perl module to do the parsing. That way, when there is an error, just re-distribute the incorrect module. -- DougKing
Unclear layering
Regarding unclear layering, if you put it all in a specific application, then other languages and applications may not be able to "share" them. Stored procedures should normally be something that you want available to many applications and/or languages.
Different change velocities
To put it plainly, the infrastructure for managing changes to application source is much more mature than that for managing DB changes, as application source usually changes much more frequently (both during development and after production) than DB definitions (table definitions). Stored procedures have the unfortunate property of being application code stored in the DB, so changing them becomes much more problematic.
One can version-control the source code of stored procedures the same way as any other source code. Stored procedure source is plain text; the compiled version stored in the database is no harder to manage than an application executable. Stored procedures are actually more fluid than most application languages: the database server compiles SPs on the fly and handles dependencies among them.
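To illustrate the point, the source of a stored procedure is just a text file that can sit in version control and be re-applied on deployment like any other migration script (PostgreSQL syntax; the file path and names are hypothetical):

  -- migrations/0042_get_customer_balance.sql (checked into the same
  -- repository as the application code)
  CREATE OR REPLACE FUNCTION get_customer_balance(p_customer_id integer)
  RETURNS numeric AS $$
      SELECT COALESCE(SUM(amount), 0)
      FROM   ledger_entries
      WHERE  customer_id = p_customer_id;
  $$ LANGUAGE sql STABLE;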
They can require interface micromanagement. If you want to add a parameter, you have to change it in both the callee and the caller. Direct SQL often requires less work because there are fewer spots that have to be changed per change request.
One of the problems with SPs is that they usually come with crappy, outdated languages that lack things like dynamic typing, associative arrays, and named parameters. If the language and related tools were improved, then they could feel just like regular functions in an app language. Or perhaps make the choice of SP language selectable.
Some DBMSes have addressed these issues. The standard PostgreSQL distribution, for example, supports stored procedures in C, its own SQL derivative, Tcl, Perl, and Python. Other languages are available as add-ons, and new ones can be added. Oracle PL/SQL has the ANYTYPE datatype, associative arrays, and named parameters (and has had these features for years and years).
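As a small example of the PostgreSQL point, the same CREATE FUNCTION syntax accepts different procedural languages; this sketch assumes the plperl extension is installed, and the function itself is purely illustrative:

  CREATE EXTENSION IF NOT EXISTS plperl;

  CREATE OR REPLACE FUNCTION normalise_phone(p_raw text) RETURNS text AS $perl$
      my ($phone) = @_;
      $phone =~ s/[^0-9+]//g;   # strip everything except digits and '+'
      return $phone;
  $perl$ LANGUAGE plperl;

  -- SELECT normalise_phone('(555) 123-4567');  yields 5551234567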
Perhaps we need to address the issue of when and when not to use stored procedures, rather than writing them off altogether.
Opinions on this question are often based on a belief of WhereTheLightIsBetter.
I'm surprised no one has mentioned the inherent difficulty of debugging stored procedures (especially once they are in production)... unlike other procedural languages of our day, there are no source debuggers or other aids to assist when things go really wrong (and perhaps you were not the original developer of these SPs).
In medium and big shops, there is often an SQL developer who handles that and knows how to debug SQL. (I don't recommend stored procedures for small shops, except for special cases.) I personally find that SQL is a poor language for conceptual DivideAndConquer and repetition abstraction, and that's something I feel SmeQl could help with because of its smaller statement granularity and the option of named references instead of just nesting, which is what SQL relies on (SmeQl provides both). It's easier to debug by portion; in SQL, similar sub-query analysis would often require copy-and-paste-and-edit steps. See TqlDataAlterationOperators for an example. -t
{TutorialDee also has named references, finer statement granularity, and -- most importantly -- is actually implemented. See the RelProject.}
That's your opinion. For one, one has to toss or make odd fudges for existing RDBMS engines because of its purity obsession. Better to make sure one gets the language right before trying to take on SQL with option-limiting implementations. (For readers wanting to know more about the SmeQl versus TutorialDee "battle", please see HowOtherQueryLanguagesAddressSqlFlaws and HowOtherQueryLanguagesAddressSqlFlawsDiscussion.)
{"Toss or make odd fudges..."??? "Option-limiting implementations..."? If you have some legitimate, evidence-based criticisms, they'd be interesting to discuss. Otherwise, you're banging on a StrawMan with a cotton-wool hammer.}
I've given scenarios. A common one is an event log that simply saves events with a time-stamp. Having an auto-number key is nice, but often skipped because it just takes up space and processing time. (Whether or not I condone skipping an auto-number, in practice it's often done and one has to work with existing data-sets in such a state. Often a developer has to use what's already in place and is not permitted to redesign schemas. I'm not writing with the assumption one has CIO ranking here.)
{The time-stamp is the primary key, isn't it? If, for some reason, it isn't and you can legitimately have multiple identical events, then the external table link layer can trivially provide an auto-generated key for you. On average, it only takes up a handful of extra bytes overall; because all relational operators are pipelined, it only takes up space in a current tuple rather than being stored in every tuple of an entire relation. Maintaining such a key takes no measurable extra processing time, and the processing time gained by purely-relational optimisations more than makes up for it.}
No, a time-stamp cannot be used as a primary key in most cases. Two events can happen at the same time, at least within the typical internal clock's rounding range. And I'm skeptical it's trivial to add an external key unless you make some assumptions/tradeoffs that make the key nearly useless for anything beyond solving a purity obsession. In other words, the key's usefulness is inversely proportional to the triviality of the solution used.
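A sketch of the event-log scenario being described (table and column names are hypothetical): two events in the same clock tick would collide on a timestamp-only primary key, which is why such tables are often left keyless or, if a key is demanded later, given a surrogate:

  CREATE TABLE event_log (
      logged_at  timestamp NOT NULL DEFAULT now(),
      event_type text      NOT NULL,
      detail     text
      -- no PRIMARY KEY (logged_at): two rows can legitimately share a timestamp
  );

  -- If a key is later required, a surrogate is the usual compromise:
  -- ALTER TABLE event_log ADD COLUMN event_id bigserial PRIMARY KEY;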
{However, this is all moot. SmeQl only exists on paper, so it's only a viable alternative to paper SQL.}
Some may value "getting it right" over the existence of implementation. While an important factor, it's not necessarily an overriding one. I'll LetTheReaderDecide the value of implementation rather than let you be the lone voice for the ranking of that factor. Steve Jobs credited some of his success to his willingness to halt production and "back up" if a sufficient design problem is discovered near the release date. He claimed mediocre companies will just "push it through", and that's why they produce allegedly inferior products. I'm letting the world see and criticize SmeQl in the open. TutorialDee mostly skipped that process and should apologize to the tech community for that Gaffe. We have SQL already and it's "good enough" for the short term. Thus, there's no hurry. -t
That is like saying the CeeLanguage is good enough and we should not be in a hurry to create things like GoLanguage. People abused Cee for such a long time, and now it is too late since C has already gained a foothold. People are still using C to build desktop applications even though C doesn't even have a "string" (a serious flaw that should have killed C long ago). At least someone was in a hurry to develop GoLanguage recently, and even PHP is kind of better than C, so it was good someone was in a hurry to create PHP (but I think it could have been done better).
I believe that's a flawed analogy because C's niche has changed. It is becoming the "new assembler", and less often used for custom and shop-specific applications, where Dot-Net, MS-Access, and web scripting languages are more common. If performance and resource control are the bottleneck, then C makes sense, such as SystemsSoftware and embedded programming (appliance and toy chips). If the developer time and effort are the bottleneck, then C is usually not the best choice. -t
It is indeed changing nowadays, but in the 1990s people were using Cee for application development, which is ridiculous. All those applications still need to be maintained - just have a look at the CVS repository of any BSD distro. Look at the user applications, not the system kernel. It's mostly Cee. It's too late. Only new applications can be written in other languages like GoLanguage. All sorts of plain Win32 API C applications exist too. I don't think C makes sense for systems software, because systems software needs to be safe. That's what GoLanguage and other languages like OberonLanguage were intended for: safe systems languages. But it is too late; they only have a chance if people start writing new software in them. You said that TutorialDee skipped the process of open criticism. That is false. Anyone can criticize TutorialDee and the Third Manifesto - that is why they wrote the book and created that language. They admit up front that TutorialDee is for demonstration only, and IndustrialDee? would be the one created AFTER criticisms. They are not making the mistake that Pascal made (a teaching language released as the final Pascal standard). They openly admit up front that TutorialDee is a toy teaching language. Darwen even calls it a toy language. Your claim that TutorialDee is not open to criticism is wrong. You seem to have difficulty making distinctions (TutorialDee vs IndustrialDee?).
I don't know much about GoLanguage, but Ada was supposed to be the "super-safe" language of the century. It just didn't catch on. I don't know why, but it had plenty of chances. I suspect three reasons together: 1. a long learning curve, 2. expensive, big, slow compilers, 3. C allows one to easily "talk to the guts" of the hardware when needed.
Well, Unix was written in C, so it became popular before Ada could. If Unix had been written in Fortran, Fortran would be popular. Ada is more complex than it needs to be, and adds even more verbosity to the already verbose Pascal. The verbosity is too much like Cobol.
Fortran was widely used until about the mid-1990s. It was common in job ads. Pascal also had a good chance to take off because it was used heavily in teaching in the '80s, and was given a further boost by Turbo Pascal and Delphi. I'm not convinced that being early is the only reason for C's popularity.
Re: "You said that TutorialDee skipped the process of open criticism. That is false. Anyone can criticize TutorialDee and the Third Manifesto - that is why they wrote the book and created that language." - How is that sufficient vetting? I've criticized its lack of meta ability and nobody did shit to fix it. It seems they want Data-Ada when I want sometime more like a lisp/JavaScript hybrid. That's fine a guess: TotorialDee? for the TypeSafety fans and SmeQl for the scripty/dynamic fans. -t
{"I've criticized its lack of meta ability and nobody did shit to fix it." Huh??? Since 2005, there has been a publicly-accessible email-based forum -- linked from Date and Darwen's http://thethirdmanifesto.com -- for discussing D in general and TutorialDee in particular, in which Hugh Darwen actively participates. Issues and ideas for features and new syntax are often discussed there. I don't recall you posting any criticisms of "its lack of meta ability" there.}
Why email-based? That's a little out-dated and limiting, even in '05.
{Why not? It was explicitly chosen by its key participants as the preferred and most effective means to facilitate discussion and collaboration. Email lists are still very popular for niche, focused, and often highly technical or scientific subjects.}
It sounds like it was already a cabal and then kept a cabal via email.
{What a peculiar response. Would you have preferred a "click here to post your language criticisms on a billboard outside Chris Date's door" button? Date & Darwen are notably accessible and responsive to considered commentary; perhaps unusually so in the scientific community.}
Something like HowOtherQueryLanguagesAddressSqlFlaws could have been created earlier.
{Sorry, you've lost me. What I'm getting here is that Date & Darwen have courteously provided a public forum for you to comment on and even contribute to their work, but you're annoyed that they haven't... What? Created a language feature comparison table??? I don't get it.}
We can bicker forever over whether their broadcasting and decision documentation technique was "good enough". Instead, let me ask you this: what if flaws or gaps are discovered later in the game?
{No less than three editions of TTM each introducing changes and revisions, plus an additional book that is not part of the series but an adjunct -- also presenting changes and revisions, many in response to or shaped by discussion and commentary on the mailing list I mentioned above -- plus addenda, links and additional material on the http://thethirdmanifesto.com Web site, clearly answer your question.}
Trying to get a more general survey of "what do you want in a query language" seems like a logical thing to do. Even if you decide to ignore some of the suggestions for architectural or philosophical reasons, at least it helps the designer be more thorough and think wider. Something like SqlFlaws and HowOtherQueryLanguagesAddressSqlFlaws is a fairly good start.
{So... You think Date & Darwen should have started with a... Poll?}
They've taught me the advantages of OOP. They've taught me that I get OOP by placing together the data and the functions that operate on that data. Would anyone say that OOP is evil?
"Together" may not mean much in cyberland. You may also be interested in ArgumentsAgainstOop. OOP is somewhat falling out of favor and being supplanted with a multi-paradigm view these days, I would note.
{The "taught me the advantages of OOP" writer is unquestionably superficial and naive, but I'm curious about your "OOP is somewhat falling out of favor" claim. Do you have figures to back it up? Looking at the TiobeIndex, which is probably the best statistical assessment we have of ongoing language popularity, I'd be hard pressed to agree with your assessment. In particular, what language(s) do you believe exemplify a "multi-paradigm view"?}
{I believe the first paragraph of this page answers that more than adequately, and I'll only add that OOP is more than merely colocating data structure definitions with associated operators.}
Thank you.
The "evil"ness of Stored Procedures generally comes from developers who cannot write stored procedures. They want the "their" SQL buried in "their" code.
1) Announce a change to the database and ask developers to check their SQL.
2) You get no response.
3) You implement the change, which includes modifications to stored procedures.
4) Applications using stored procedures still work.
5) Some applications with embedded SQL break.
6) Developers blame the database team for "breaking" their applications.
Often their inline SQL is not close to optimal. The inline SQL affects the performance of the database. The app-DB pipeline is filled with data that should not be there.
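A sketch of the scenario in the numbered list above (PostgreSQL syntax; all names are hypothetical): the schema change renames a column, the stored procedure is revised along with it, and callers of the procedure never notice, while embedded SQL that still names the old column breaks.

  -- Schema change: orders.cust_id is renamed to orders.customer_id.
  CREATE OR REPLACE FUNCTION open_orders_for(p_customer integer)
  RETURNS TABLE (order_id integer, total numeric) AS $$
      SELECT o.order_id, o.total
      FROM   orders AS o
      WHERE  o.customer_id = p_customer   -- updated alongside the schema change
      AND    o.status = 'open';
  $$ LANGUAGE sql STABLE;

  -- Applications keep calling:  SELECT * FROM open_orders_for(42);
  -- Embedded SQL still referencing o.cust_id fails at run time instead.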
But they also go against the grain of typical maintenance patterns, creating extra "busy work" -- movements of eyes/fingers/mice -- which is extra time and money for the owner of the app or shop. Maybe if smarter IDEs existed, this could be mitigated. :(
If you are testing your SQL code somehow, why not just create the stored procedure to begin with? You can re-use the code, it's better for performance, and it's without doubt easier to maintain, test, and troubleshoot. Ah, one more thing: what about upgrades to your SQL engine? Think about it. Use Crystal for what it is good at -- creating nice output, charts, etc. -- and leave the SQL to do its job. If you cannot see this, you might just be doing joins between tables and giving headaches to the DBA.
I'm not sure why you mention upgrading the SQL engine. The biggest headache with SPs is adding columns. It's a lot of double work to add a column to the application and then go to a different area/editor to add it to the SP. If the column selection could somehow be "dynamically" specified at the app side instead of hard-coded into SPs, then you'd save a lot of grunt work. Similar may apply to criteria statements, such as QueryByExample patterns. The granularity of SPs is wrong: some parts you want hard-coded and some you want app-determinable, to avoid the double-editor-visit cost.
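A sketch of the double-edit problem being described (PostgreSQL syntax; all names are hypothetical): adding one column to the application's form means reopening the procedure to extend its parameter list, column list, and value list, and then revisiting every caller.

  CREATE OR REPLACE PROCEDURE save_order_line(
      p_order_id integer,
      p_sku      text,
      p_qty      integer
      -- a new p_discount parameter must be added here...
  ) AS $$
      INSERT INTO order_lines (order_id, sku, qty)   -- ...and here...
      VALUES (p_order_id, p_sku, p_qty);             -- ...and here,
  $$ LANGUAGE sql;                                   -- plus in every CALL site.

  -- CALL save_order_line(1001, 'ABC-1', 3);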