An alternative take on TopOnPreparedStatements (temp name) because I felt that page's introduction is poorly written and misleading, probably due to author bias. -t
[DRAFT]
NAK|WRP|PST|SPR| Protects from:
---------------------------------
.N.|.Y.|.Y.|.Y.| Quote escaping injection
.N.|.I.|.Y.|.Y.| Classic SQL injection (besides quote issues)
.N.|.I.|.I.|.I.| Tokenized, P-Code, or Machine-code Injection
.N.|.I.|.I.|.I.| Buffer overflow (value "running into" command area)
Symbol Guide:
- N = "No"
- Y = "Yes"
- I = "Implementation Dependent"
- NAK = Naked or raw SQL (no wrappers, APIs, Prepared Statements, or Stored Procedures)
- WRP = Value wrapper API[1]
- PST = Prepared Statements
- SPR = Stored Procedures
The table should probably be:
NAK|WRP|PST|SPR| Protects from:
---------------------------------
.N.|.I.|.Y.|.I.| Quote escaping injection
.N.|.I.|.Y.|.I.| Classic SQL injection (besides quote issues)
.I.|.I.|.I.|.I.| Buffer overflow (value "running into" command area)
.I.|.I.|.I.|.I.| Tokenized, P-Code, or Machine-code Injection (other than buffer overflow)
Justification for the changes.
- Some databases' stored procedures can create SQL statements dynamically from strings. This means that stored procedures can use naked SQL, wrapped SQL, or prepared statements.
- How do we know PS's are not doing the same, at least in some cases?
- [What makes you think they do? Why would a DBMS vendor deliberately risk SQL injection, and slow down their prepared statements to be equivalent in performance to dynamic queries, when implementing prepared statements is a technically-straightforward adaptation of existing stored procedure and/or query planning facilities?]
- Because in some cases it may simplify implementation. For example, if a value is blank, it may consider the case that it doesn't need to use an index that it did before. Rather than have logic to regenerate part of a query plan, it may be simpler to redo the whole thing.
- [In a DBMS that already has support for stored procedures and/or query plans -- i.e., pretty much all of them; definitely all the popular ones -- it's simpler to leverage the existing stored procedure facilities and/or query plans than implement a query re-writing system.]
- Top, can you name one that does? Any at all? You know, that empirical evidence you keep talking about.
- Nope, I'm just counter-speculating your speculation. We are both fighting with rubber swords here.
- What speculation are you countering? How does your speculation counter it?
- Whether or not a wrapper function protects against Quote escaping depends on the implementation of the wrapper. It's quite possible for the wrapper author to forget or otherwise choose not to protect against such attacks.
- Please elaborate. ANY tool or technique can be fscked up by implementation mistakes, so why single out wrappers?
- Wrappers can only protect against quote escaping if the authors of the wrappers decide to include such protection. I.e. it's dependent on the implementation.
- [Because PreparedStatements make wrappers unnecessary. To make PreparedStatements permit classic SQL injection, the way a single misstep in writing a wrapper can, would require a deliberate and conscious effort to make a PreparedStatements implementation as flawed as possible.]
- Sorry, you lost me.
- [Wrappers are inherently risky; one minor coding error and SQL injection is possible. A minor coding error in a PreparedStatement won't result in SQL injection -- most likely, it will simply cause the query to fail to execute -- because PreparedStatements are invariably implemented as parameterised functions, usually using existing stored procedure facilities. That's the easiest way to implement PreparedStatements. To make a PreparedStatement implementation capable of failing in a manner that causes SQL injection would require a deliberate effort -- and I do mean effort, given that implementing them properly is easier -- to make a flawed PreparedStatement implementation.]
- This is a strange argument. Because wrappers are simpler they are easier to "half break" than PS's? Security-through-complexity? That's a new one to me.
- [How did you arrive at that conclusion? It has nothing to do with "simpler", and everything to do with the fact that a bug in a wrapper may allow SQL injection, but a bug in PreparedStatements does not allow SQL injection. They are fundamentally different mechanisms; the former is a barrier against classic SQL injection, the latter removes classic SQL injection as a possibility. If there's a bug in a PreparedStatement, it doesn't make classic SQL injection re-appear as a possibility. If there's a bug in a filtering wrapper, classic SQL injection becomes possible.]
- Re: "a bug in PreparedStatements [implementation] does not allow SQL injection." -- Can you back this claim up solidly? -t
- [Sure. See below.]
- Naked SQL doesn't prevent buffer overflows or other forms of machine language (or p-code or whatever) injection. However, careful coding of the engine that processes the naked SQL could very well be free of such things. Not so careful coding could result in it being subject to those attacks. Therefore, whether or not naked SQL is subject to those attacks is implementation dependent.
Re: "...because I felt that page's introduction is poorly written and misleading, probably due to author bias."
What, "author bias" in favour of PreparedStatements? When it comes to avoiding SQL injection, you'd better believe it. I consider choosing to use filters with dynamic queries on a public Web site or private intranet -- when PreparedStatements are available -- to be professional malpractice.
It's not directly about "favoring", it's about providing misleading details or impressions. The biggest signs of bias are, first, conflating "naked" SQL issues with wrapped SQL issues, and second, the implication that there are no trade-offs involved. At the very least it's sloppy writing, but it appears to be caused by bias rather than just crappy text authoring ability alone. -t
From a security point of view, filters are only infinitesimally better than "naked" SQL, to the point of being effectively equivalent. The chances of failure due to errors or unconsidered conditions in the filters, or due to inadvertently omitting filters, are unacceptably high for any but throwaway purposes. There's nothing "misleading" about that, because it's true.
In general it's an IT management decision to use PS's or not. Developers and (low-level) DBA's can make recommendations, but the final call is not theirs. Whether you feel like suing the managers or beating them with a Safety Stick, that's a personal issue. (Just let me know so I can conveniently slip out and not be a witness.)
Of course it's an "IT management decision", but that doesn't mean filters should be treated as equivalent to PreparedStatements. They aren't. Any "IT management" that uses dynamic SQL with filters instead of PreparedStatements should do so fully advised that they're making a profoundly stupid decision, on the basis that using filtered dynamic SQL instead of PreparedStatements is a fool's choice.
And prepared statements (PS) do have downsides, such as vendor- and/or version-specific APIs that may make it difficult to swap DB vendors or versions. Whether all or most differences can be hidden behind wrappers is questionable, or at least an open question. They also add more layers to the mix, and the more layers, the more problems, typically.
Arguing that the negligible effort to provide generic interfaces to specific DBMSs is a "downside" to PreparedStatements is akin to arguing against having a password that secures your on-line bank account because it will "add more layers to the mix". And it's hypocritical, given that your filtering wrappers are also adding "more layers to the mix."
Given a choice between filtered dynamic queries vs PreparedStatements, the only argument in favour of filtered dynamic queries might be that your application is throwaway and it doesn't matter if it gets hacked. Of course, even that is highly questionable -- every instance of hacking due to classic SQL injection, even if on a throwaway application, calls into question the capability and security-awareness of IT professionals in general, not to mention the fact that you have opened your organisation up to potentially unknown compromises.
- Your risk assessment opinions have been noted.
You have NOT made an iron-clad case that these downsides are overwhelmed by the upsides. My approach is to present all known trade-offs and risk/probability estimates and known research studies to the decision-makers (managers & owners).
The final choice is theirs and I will not pre-bias written summaries to tilt them toward my favored selection, which appears to be what you are doing.
If you're not "biased" in favour of PreparedStatements, then you're deliberately courting SQL injection, even if you think your filters are 100% correct and used 100% of the time. I don't know how you can justify that, let alone argue that the downsides of PreparedStatements are not overwhelmed by their upsides.
Some orgs want to be wild churn-and-burn cowboys; that's their prerogative. -t
"Churn-and-burn cowboys" (?) or not, I've yet to meet an organisation that wants to be hacked.
Just because somebody likes bungee jumping does NOT mean they WANT to die. Just because you accept that your warehouse has wooden doors instead of metal doors does not mean you WANT to get burglarized/robbed. Just because you choose the $50/month house alarm instead of the $200/month house alarm does not mean you WANT to get burglarized/robbed.
Indeed, but anyone who wishes to choose wooden doors over metal doors should be made fully aware that he's being an idiot, and we're not talking about $50/month vs $200/month, we're talking about spending $19.00 once on a set of windchimes vs spending $19.01 once on a fully-alarmed vault.
- Giving a professional opinion is fine, but be careful about pulling probability numbers out of your ass without the requisite disclaimers about being a little more than a "professional guess". -t
- Where do you think I'm guessing?
- The risk being high enough to justify extra layers. I'm not necessarily disagreeing with your risk assessment, but that's only because our GUESSES roughly match. We have very little empirical data on the risks, as described and elaborated on in the other topic.
- Apparently, the risk is high enough to justify the "extra layers" of using filtering wrappers, no? Studies on the frequency of classic SQL injection attempts run at about 10% of all external attack types, with the volume obviously dependent on the popularity (and perhaps the desirability) of the target.
- They may not be using wrappers. I know of production public software where the programmer put in zero protection, not even quote escaping. That 10% does not tell us much that can be used in this case. We are comparing 3 options and the stats tell us very little about the distribution of hacks among those 3 options.
- I'm not referring to successful compromises, only attacks. The important factor is that a filter is brittle; it can be compromised by a trivial error in coding or by a trivial omission of the filter. PreparedStatements are invariably implemented using parameters to compiled code -- usually as functions based on or using stored procedures -- and so are more robust than filters. A failure in PreparedStatement code might result in a query that doesn't run, but doesn't risk SQL injection. A failure in a filter might result in a query that doesn't run, but it might also risk SQL injection. That difference makes PreparedStatements inherently more secure.
- Can you PROVE it's using "parameters to compiled code" rather than just claim it? I'm not claiming it's not, but your burden of evidence has been set pretty high by calling non-use "professional malpractice". Such a claim deserves something beyond ArgumentFromAuthority. See, here in America, strong evidence is presented first, and THEN we zap people in the electric chair. Not the other way around. Is UK different? Maybe that whole King and Queen thing made you all authoritiphiles.
- PageAnchor mysql-query-reuse
- See http://dev.mysql.com/doc/internals/en/prepared-stored.html E.g., "... prepared statements and stored routines need to reuse the same parsed tree to execute a query many times." The 'd' in "parsed" appears to be a typo; the usual term is "parse tree", which is a data structure that represents the result of parsing a sentence in a grammar. It is a common way for an interpreter to represent the result of parsing source code in order to execute it, i.e., it is the compiled code.
- I interpret that as possibly meaning, "IF the system wants to reuse the same query, in this new version it is required to use the same parse tree." It's a statement on how reuse is done WHEN the system decides to do reuse, and does not say anything about when the reuse decision itself is done. Further, it's a statement about a particular version of MySql, and says nothing about other versions nor other vendors.
- You interpret incorrectly. It's a statement of how prepared statements are implemented in MySQL version 5.0 and higher. If you don't believe me, you can examine the source code yourself. Older versions didn't support prepared statements or stored procedures. If you know of a DBMS that implements prepared statements using filters and query re-writing, please let us know. Otherwise, we can safely assume that in the absence of evidence to the contrary, popular DBMS vendors aren't being deliberately foolish or making extra work for themselves by not re-using existing mechanisms -- like stored procedures and related infrastructure -- to support parameterised, compiled code.
- It's not clear enough to back your claim as written. It's too fuzzy about WHEN it does reuse. It's describing HOW it does reuse, not WHEN. You are asking readers to trust your interpretation, i.e., ArgumentFromAuthority. (We could decide to LetTheReaderDecide how to interpret it and end the debate on it.) And assuming "DBMS vendors aren't being deliberately foolish" may be assuming too much. Larry Ellison can be a spinster prick at times. Again, I would generally agree with your GUESS, but it still is a guess, and NOT strong enough to back your "professional malpractice" claim.
- I think you must be misunderstanding something. The description at http://dev.mysql.com/doc/internals/en/prepared-stored.html describes what is always done. The DBMS has no way of knowing whether you intend to execute a prepared statement once or many times. The document makes reference to the fact that MySQL alters the parse tree during execution, so changed state needs to be restored prior to the next execution (if there is one) of a given query.
- If you know of production public software where "the programmer put in zero protection", then you know of production public software that is at risk of security compromises. If you have any conscience or sense of ethics, you will make every effort to warn the site stakeholders that their site is at risk.
If you really want to change people's opinions, then find some solid empirical evidence: for example, a good survey that shows wrappers are 20x more likely to be hacked than PS's.
Science will go a lot further than insistent paternalistic lectures based on ArgumentFromAuthority. You are just coming off as a big blowhard asshat. Blowhard asshats are a cheap commodity on the web; solid empirical research is not.
Granted, some think I also come off as a blowhard asshat, but at least I don't insist people take MY word for it. My suggested authority is empirical research; NOT me, myself, and I. -t
In terms of security verification, "empirical evidence" borders on worthless. Empirical testing that consists of nothing but showing that 100 test cases didn't violate security only suggests that test 101 might have broken it. In security, a theoretical proof of a known mechanism always trumps empirical testing of an unknown mechanism.
That's questionable, but where's your "theoretical proof" anyhow?
That's the ideal. Knowing that a mechanism like PreparedStatements does not use query rewriting comes close. Empirical testing, in comparing filters to PreparedStatements, demonstrates nothing.
Like we already debated, lack of query rewriting doesn't prevent all possible injections, and we don't know if that's how they are actually implemented. As I remember it, you quoted some Microsoft documentation that claimed it "usually" doesn't re-generate the query. That's not very reassuring.
Who said lack of query rewriting prevents all possible injections? PreparedStatements prevent all possible classic SQL injection attacks that result from query rewriting.
But we don't know the ratio (and future) between classic injection and non-classic-injection. If it's 1-to-10, then using PS's to reduce total injection only gets you a 10% reduction in injection attacks and you are asking orgs to complicate their tool stack to get that 10%. It's arguably not worth it. (I personally doubt it is 1-to-10, but will admit my estimates are not backed by solid empirical studies, and neither are yours, such that I shall not be as insistent as you are in your opinion re: "professional malpractice". THOU SHALL NOT BE INSISTENT WITHOUT SOLID EMPIRICAL FACTS. That's common sense to almost anybody except the willfully stubborn.)
I'd argue that insisting you are absolutely right without solid empirical evidence could likewise be considered "professional malpractice". If you pull conclusions from your ass based on estimated proportions not backed by solid evidence and strongly CLAIM them the OneTrueWay, then you should be rightly subject to professional malpractice lawsuits. -t
You could argue that, and security experts would laugh at you.
- ArgumentFromAuthority again. Tsk tsk tsk. And I'm not necessarily disagreeing with security experts, I'm only pointing out that you are ONLY presenting ArgumentFromAuthority and then chewing people out for not accepting it as iron-clad. Thus, you are creating a straw-man argument. Jerk.
- That's not a reference to authority, but a reference to the ludicrousness of performing security auditing by empirical testing.
- I'm not saying empirical testing is the "only" way to check security, but it should be a part of it. If security is somehow immune from empirical science, then ExtraordinaryClaimsRequireExtraordinaryEvidence and I would expect strong reasoning for such given the insistent nature of your claim. Vendors lie and spin, there is no reason we should take their word for implementation claims at face value. (Plus, their existing claims appear to be rather vague.)
- Security auditing is dependent on logic and analysis.
- Good! Then get an independent third party(s) to do a code analysis on Oracle, Microsoft, DB2, and Sybase, and get back to us, and THEN we have good enough evidence for your rather strong "professional malpractice" claim. Open-source analysis may not be representative (and it's not my job to dig into MySql to find evidence for your claim).
- Such a code analysis would be ideal. In the mean time, we have no evidence that any popular DBMS is subject to classic SQL injection with prepared statements. Compare that to dynamic SQL.
The ratio of classic injection to non-classic-injection is irrelevant. All it takes is one attack to be compromised. Given that the difference in coding difficulty between dynamic queries with filtering wrappers vs PreparedStatements actually favours PreparedStatements, and given that filtering wrappers are only secure to the extent that the filters are 100% correct and 100% used consistently and correctly -- i.e., they always represent a risk that PreparedStatements do not -- there is almost no sustainable argument in favour of using dynamic queries with filtering wrappers. I say "almost", because the only argument in favour of wrapped dynamic queries would be the rare case where a dynamic query optimises to being faster than its PreparedStatement equivalent and/or the two accesses to the DBMS required to prepare and execute a PreparedStatement both represent an unacceptable overhead. I can't think of a practical circumstance where this might occur, but I suppose it might.
Re: "The ratio of classic injection to non-classic-injection is irrelevant." -- Please elaborate. When evaluating risk, I see no reason to ignore the historical ratio of usage-to-failures between two options. As far as the "difficulty" difference, that's being debated below. Currently, I disagree with you on that. And PS's have not been proven "100% correct" either. It looks like you are creating a double standard. Would you like Clippy to help you with fairness? -t
The essential distinction is between classic SQL injection vs second order SQL injection. The former can be organised into subcategories (e.g., blind SQL injection, classic SQL injection due to incorrect type handling, classic SQL injection due to incorrect escaping, etc.) but they all stem from leveraging query re-writing to redefine a query's logic. Second order SQL injection results when a seemingly-legitimate user entry -- one not requiring quoting and of the correct type -- is stored for later use, and later results in unintended query logic being executed because the value is retrieved and used incorrectly when constructing a dynamic query string. Hence, the ratio of classic injection to non-classic-injection is irrelevant, because they're different and have different causes. Of course, proper use of PreparedStatements -- with static definitions -- will prevent either form of SQL injection. Unfortunately, second order SQL injection sometimes appears when constructing definition strings for PreparedStatements using user input. User input should never be used to construct PreparedStatement definitions.
- The above is not clear to me. But I don't want to delve into that today.
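To make the second order case concrete, here is a minimal sketch using Python's sqlite3 module. All table and column names are invented for illustration, and the payload here does contain a quote character -- the parameterised INSERT stores it verbatim without incident, which is exactly why nothing bad happens until later, when other code trusts the stored value and splices it into a dynamic query string:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, pw TEXT)")
conn.execute("INSERT INTO users VALUES ('admin', 'secret')")

# Step 1: the hostile value is stored SAFELY via a parameter.
# Nothing bad happens yet, which is what makes it "second order".
conn.execute("INSERT INTO users VALUES (?, 'x')", ("admin'--",))

# Step 2: later, other code trusts the stored value and splices it
# into a dynamic query; the retrieved value rewrites the query logic.
stored = conn.execute(
    "SELECT name FROM users WHERE name <> 'admin'").fetchone()[0]
conn.execute("UPDATE users SET pw='hacked' WHERE name='" + stored + "'")

# The trailing -- commented out the closing quote, so the WHERE
# clause matched the admin row instead of the attacker's own row.
admin_pw = conn.execute(
    "SELECT pw FROM users WHERE name='admin'").fetchone()[0]
print(admin_pw)  # 'hacked'
```

Using a parameterised UPDATE in step 2 would have prevented this; the point is that parameterising only the "front door" is not enough.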
As for claiming that PreparedStatements have not been proven "100% correct", that's true, but nothing has been proven "100% correct." Indeed, even code as simple and innocuous as a 'while (true) {...}' loop might fail, if the underlying processor has a hardware bug that causes some machine language construct to break after (say) exactly 1.24×10^34 iterations. That sort of thing happens. However, what we do know is that PreparedStatements have a field-proven reliability with no known DBMS-based failures that have resulted in SQL injection. That's why OWASP and others recommend them. We also know that filters have a field-demonstrated brittleness, due to the difficulty of writing reliable filters, and the ease with which applying a filter can be forgotten. It's awfully easy to write $userInput when you should have written sqlStringFilter($userInput). That's a reason why OWASP and others condemn them.
"100% correct" was YOUR reference metric, not mine. I'm just asking you to be consistent with its usage for comparing all horses in the race, not just horses you hate. If you can't back it, don't state it. (It's either crappy or intentionally waffling writing. What is the "100%" sentence really saying? Think about it.)
But the above is true. Filters do have to be 100% correct, and used 100% of the time, in order to avoid SQL injection. PreparedStatements don't have to be 100% correct 100% of the time. If they fail, they fail safely. I cannot find a documented case of a PreparedStatement implementation failing in a DBMS in a manner that causes SQL injection. Usually, when they fail, they either run too slowly because the optimiser has an issue, or they don't run at all.
- That's a whacky statement. Every non-trivial system probably has flaws and I'd bet money that most PS systems have security flaws of some kind in them. And you haven't found a case of filters failing like you claim either. (Although, a handful of flaws may not be a good statistical sampling anyhow.)
- Sure, every non-trivial system has flaws. But remember how PreparedStatements work -- upon execution, they do not invoke the SQL parser on SQL source. They execute the parse tree (or other compiled form) that was generated when the PreparedStatement was created. So, in order to inject SQL, the flawed PreparedStatement execution would have to somehow invoke the SQL parser -- which normally has nothing to do with PreparedStatement execution -- in a manner that creates a new query consisting of query text of the PreparedStatement at the time it was originally created, along with the query text that was passed as a parameter, and then execute it. How would that happen? It seems rather unlikely. (Speaking as a developer of DBMSs, I'd be inclined to replace "unlikely" with "bloody impossible", but that's not as scientific.) On the other hand, how likely is it that a typical programmer types $userInput as part of a dynamic SQL string creation when he should have typed sqlNumberFilter($userInput)? That seems likely to happen fairly frequently.
And there are "easy to write" scenarios that can "ruin" PS-based code also. You need to clarify and better quantify "easy to write" if you want to use it as a metric here. -t
The "easy to write" scenario that causes SQL injection with PreparedStatements is using external input to construct the PreparedStatement definition. That is a schoolboy error, usually done once (at most) by a beginner. The solution, of course, is to never use external data in the creation of a PreparedStatement. It's relatively unlikely that anyone -- once they understand the reason -- would casually forget and mistakenly include external data in PreparedStatement definition. You don't generally wind up carelessly adding things, but carelessly omitting them is easy. Dynamic SQL requires external data (otherwise it's not dynamic) and careful discipline to wrap every external data source in a filter. They are easily inadvertently omitted, even by experts, especially because their absence doesn't stop the code from working.
It's possible to attach thing A when you meant to attach thing B because of similar names or a case of the Mondays, etc. Variables/objects are not smart enough to say, "Wait, this violates the shop conventions. Spanks!" Or maybe part X used to come from a literal, but was changed to come from user input, and somebody plugged it in, but "library users" down the road won't know this. Abstraction often hides the sources of things, which is usually good -- that's what abstraction is largely for -- but it can also be bad sometimes.
Yes, it's possible to "attach thing A when you meant to attach thing B", but how many professional programmers write secure code that permits such easy blunders?
[1] Wrappers don't necessarily preclude Prepared Statements as an "under the hood" implementation option. But for this discussion, we'll assume Prepared Statements are not used to implement the running of queries through wrappers.
Re: "a bug in PreparedStatements [implementation] does not allow SQL injection." -- Can you back this claim up solidly? -t
Given a query like "SELECT * FROM t1 UNION SELECT * FROM t2 WHERE c1 = 3", popular SQL DBMSs convert the query to a parse tree (or some equivalent) like the following in order to execute it:
COMMAND: UNION
  COMMAND: SELECT
    COLUMNS: ALL
    FROM: t1
  COMMAND: SELECT
    COLUMNS: ALL
    FROM: t2
    WHERE:
      COMPARE: =
        COLUMN: c1
        NUMBER: 3
Execution is essentially performed by traversing the parse tree. (Though this is a simplification that does not take into account optimisation and/or generation of a query plan, both of which may involve restructuring the tree or using it as the basis for further compilation.)
If the query has been constructed dynamically using PHP like "SELECT * FROM t1 UNION SELECT * FROM t2 WHERE c1 = " . sqlNum($_POST['userinput']), if sqlNum fails in a manner that lets through user input like "3; delete from customers" -- or if the programmer forgets to wrap user input with sqlNum() -- the constructed query may be "SELECT * FROM t1 UNION SELECT * FROM t2 WHERE c1 = 3; delete from customers", with obvious unpleasant results.
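The difference is easy to demonstrate with a few lines of Python's sqlite3 module (table and column names invented for illustration). The payload below uses an "OR 1=1" clause rather than a second statement, because sqlite3's execute() refuses multi-statement strings, but the query-rewriting mechanism is the same:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, name TEXT)")
conn.executemany("INSERT INTO users VALUES (?, ?)",
                 [(1, "alice"), (2, "bob")])

malicious = "0 OR 1=1"  # attacker-supplied "number"

# Dynamic string construction: the input is parsed as part of the
# SQL text, rewriting the query's logic -- classic SQL injection.
leaked = conn.execute(
    "SELECT name FROM users WHERE id = " + malicious).fetchall()
print(leaked)  # every row leaks: [('alice',), ('bob',)]

# Prepared statement: the same input is bound as a value into the
# already-parsed query, so the query logic cannot be rewritten.
safe = conn.execute(
    "SELECT name FROM users WHERE id = ?", (malicious,)).fetchall()
print(safe)  # no id equals the string: []
```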
With prepared statements, the parse tree (or some equivalent) is created in one step and executed in another. Thus, creating a prepared statement like "SELECT * FROM t1 UNION SELECT * FROM t2 WHERE c1 = ?" generates a parse tree (or equivalent) like the following:
COMMAND: UNION
  COMMAND: SELECT
    COLUMNS: ALL
    FROM: t1
  COMMAND: SELECT
    COLUMNS: ALL
    FROM: t2
    WHERE:
      COMPARE: =
        COLUMN: c1
        NUMBER: ?
When the prepared statement is executed, the statement is not re-parsed to form a new tree. The already-existing parse tree is executed. Avoiding having to re-parse -- especially to gain performance inside loops, but also for protection against classic SQL injection -- is precisely the reason for having prepared statements. All that happens is that the parameter value is inserted into the parameter position in the prepared parse tree. If the parameter value is "3; delete from customers", it will be inserted into the appropriate parameter position in the parse tree to create the following:
COMMAND: UNION
  COMMAND: SELECT
    COLUMNS: ALL
    FROM: t1
  COMMAND: SELECT
    COLUMNS: ALL
    FROM: t2
    WHERE:
      COMPARE: =
        COLUMN: c1
        NUMBER: "3; delete from customers"
At worst, the query will harmlessly crash when it tries to compare the contents of numeric column c1 with the string "3; delete from customers". At best, it will compare the numeric values in c1 against the string "3; delete from customers" and find that none of them match.
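This behaviour can be observed with any DBMS that supports parameter binding. A sketch using Python's sqlite3 module (table names match the running example; SQLite happens to take the "at best" branch, comparing the string against the numeric column and finding no match, rather than raising a type error):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t1 (c1 INTEGER)")
conn.execute("CREATE TABLE customers (name TEXT)")
conn.executemany("INSERT INTO t1 VALUES (?)", [(1,), (2,), (3,)])
conn.execute("INSERT INTO customers VALUES ('carol')")

# The hostile value lands in the NUMBER slot of the prepared parse
# tree; it is compared as data, never parsed as SQL.
rows = conn.execute("SELECT * FROM t1 WHERE c1 = ?",
                    ("3; delete from customers",)).fetchall()
print(rows)  # no numeric c1 equals the string: []

# The customers table is untouched.
survivors = conn.execute(
    "SELECT count(*) FROM customers").fetchone()[0]
print(survivors)  # 1
```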
Now let's imagine the prepared statement mechanism is buggy, and (say) somehow incorrectly replaces some other tree node with the user input "delete from customers", like the following:
"delete from customers"
  COMMAND: SELECT
    COLUMNS: ALL
    FROM: t1
  COMMAND: SELECT
    COLUMNS: ALL
    FROM: t2
    WHERE:
      COMPARE: =
        COLUMN: c1
        NUMBER: ?
It's still harmless, because there is no parsing of the input after it's been turned into a parse tree. The replacement of the first node with the string "delete from customers" will probably cause the query execution to crash, but it won't cause SQL injection. That's because the string "delete from customers" won't be parsed and turned into a new parse tree, which would be necessary for it to be executed as a SQL command. At the point where a prepared statement is executed, all the parsing has already been done. No further parsing will be done, so no string can be turned into a parse tree that can be executed.
That only covers "classic SQL injection". We've been over this already.
I only claimed PreparedStatements definitively protect against classic SQL injection. I don't know anyone, myself included, who claims otherwise. Other forms of SQL injection -- second order SQL injection, for example -- are orthogonal to the issue of dynamic queries (filtered or otherwise) vs PreparedStatements. However, given the significance of classic SQL injection, it can't be ignored.
If the "nodes" in the parse-tree are delimited with say Unicode character #4321 internally, and if the value has Unicode character #4321 in it, then the parse-tree interpreter/processor can be tricked into mistaking some of the value for a node delimiter, and "execute" part of the value. Remember, in RAM your parse-tree is just a sequence(s) of bytes. Indirection may reduce the chance of such happening by making the interpreter/processor jump to different locations to get values, but that doesn't outright eliminate the risk (and slows down performance). Your original statement is still wrong as given. (There may be techniques to outright eliminate ALL possible kinds of injection, but you haven't proved those are used in production RDBMS.)
You could perhaps argue that such requires knowledge of the guts of a given RDBMS, but classic SQL injection often takes advantage of vendor-specific flaws also, and thus generally reflect specific flaws of specific implementations in both cases. -t
I'm not aware of any DBMS (or popular programming language, for that matter) that represents parse trees as strings and executes them by parsing them. (And then what? Construct another "real" parse tree for compilation or execution? Why?) They're data structures, executed by invocation of functions or methods -- for example, by storing the nodes in a container class and iterating over them. Thus, there is no delimiting nor is there anything to be limited by a delimiter. It is conceivable that buffer overflow injection of native code could be used on such (or any!) structures, but that's entirely orthogonal to dynamic queries vs PreparedStatements. If buffer overflow issues are likely at all -- and there are ways of avoiding them entirely -- they're equally likely with both PreparedStatements and dynamic queries.
Data structures and "container classes" are represented with sequences of bytes also. It doesn't change the nature of the problem to call them something different. Whether those data structures/classes/thingamabobs use delimiters or byte counters is an implementation detail that is not revealed for claim verification here. (And byte counting has its own set of problems.) If you wish to claim that "injection can't happen because there are no delimiters whatsoever", then please do. Just be clear about exactly what you are claiming.
The representation of SQL queries in DBMSs is significant in terms of how and where they are vulnerable to SQL injection. Everything in a computer is represented with sequences of bytes. However, not everything in a language <x> is subject to injection of language <x>. PreparedStatements in popular DBMSs are implemented as I've described, modulo inevitable technical (and largely irrelevant) variation that I've already mentioned. Processing the data structures that are involved does not involve detecting delimiters.
If you have any evidence to the contrary, please feel free to present it.
It's not my evidence burden to prove there are no delimiters being used internally. Without evidence for either side, the default is that the existence of delimiters is "unknown", which does not back your claim as-is. You made the claim that "a bug in PreparedStatements [implementation] does not allow [X]"; YOU are obligated to provide evidence for that, not me.
Actually, it is your evidence burden, because the notion of representing a parse tree as something other than a tree data structure is, from a language implementation point of view, bizarre. It's an extraordinary claim, requiring extraordinary evidence. It's like claiming it's my "evidence burden" to prove that SQL execution doesn't involve calculating chess moves in the background.
- Here's a representation of a tree structure that uses delimiters: "(+(x(2)(3))(74))". Being a tree-structure and being delimiter-free are independent characteristics.
- That's a string literal that can represent a tree data structure, but it's not a tree data structure. Whilst a PreparedStatement could conceivably (though pointlessly, in the case of SQL implementations) convert SQL into an intermediate language like that represented by your string, it still wouldn't be subject to classic SQL injection because any subsequent parsing step would be parsing the intermediate language, not SQL.
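A toy sketch of the distinction being argued here (not any DBMS's actual internals): the string "(+(x(2)(3))(74))" is a serialization that must be scanned for delimiters, whereas a tree built from node objects is executed by traversal, so a *value* containing "(" or ")" can never be mistaken for structure. The node shape and operators below are invented for illustration.

```python
class Node:
    """A parse-tree node: an operator plus child nodes or raw values."""
    def __init__(self, op, *children):
        self.op = op
        self.children = children

def evaluate(node, env):
    if node.op == "lit":
        return node.children[0]          # value used as-is; never scanned
    if node.op == "var":
        return env[node.children[0]]
    if node.op == "+":
        return sum(evaluate(c, env) for c in node.children)
    raise ValueError("unknown op: " + node.op)

# Loosely modeled on the delimiter-string example above: x + 2 + 3 + 74.
tree = Node("+", Node("var", "x"), Node("lit", 2), Node("lit", 3), Node("lit", 74))
assert evaluate(tree, {"x": 1}) == 80

# A string value full of "delimiters" is inert: it is returned, not parsed.
sneaky = Node("lit", "))(delete(everything)(")
assert evaluate(sneaky, {}) == "))(delete(everything)("
```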
- Your original statement didn't say "classic".
- It's common to use just "SQL injection" to refer to classic SQL injection, but it's more accurate to use "classic SQL injection" to distinguish it from (say) second order SQL injection.
- Further, most in the field don't care about verbal classification labels of attack "types" but rather the actual general risk. Hacked is hacked. Distinguishing between Bach attacks and Lady Gaga attacks doesn't matter here: you've still been smashed in the head by a musician.
- When you write "most in the field", what field are you referring to? It's true that the type of attack is of little importance to the victim, but the point here is that PreparedStatements preclude classic SQL injection but dynamic SQL with filters does not. If one possible attack vector is closed, that is indubitably preferable to leaving it open. Of course, that assumes all other factors are equal, and essentially they are. It's certainly no more difficult to use PreparedStatements than use dynamic SQL, and generally it's easier -- you don't have to create and/or remember to use filters.
Processing a parse tree is straightforward, requiring no scanning of delimiters or the like, and my claim that
PreparedStatements do not allow classic SQL injection is based (in part) on that knowledge. There *might* be string copying involved -- as there is in almost any non-trivial data processing code -- and in some languages and systems string copying is potentially subject to buffer overflows, but that's not SQL injection of any kind.
Classic SQL injection requires parsing strings to convert them into something executable, and I know of no PreparedStatement implementation that does that after the PreparedStatement has been created. Furthermore, I don't know why any implementation would do so. It would be more programming effort for no gain, and much risk.
I'm not sure how you are defining "classic", but your claims about how the internals work are only a personal assessment. That's not nearly strong enough to justify your "professional malpractice" claim. You cannot accuse somebody of "professional malpractice" for NOT taking YOUR word for it. Further, you don't have access to private-company RDBMS, so you are only guessing about the guts. If you had, say, 3 studies showing that at least 99% of all SystemsSoftware (such as RDBMS) don't use ANY delimiters, then you'd be approaching sufficient evidence. You don't.
I'm defining classic SQL injection in the usual manner. See http://en.wikipedia.org/wiki/SQL_injection
My claims on how the internals work are based on knowledge of how programming languages and database languages are implemented. They are not haphazard; there are conventional ways of implementing languages and DBMSs that are recognised throughout the industry.
- So you expect them to bow down and take your word for it? Again, you need really strong evidence to justify "professional malpractice", not just water-cooler opinions. I will agree that the "industry consensus" is that PS's are the safer approach, but that's not strong enough for "professional malpractice" to stick. Industry consensus is far from scientific and far from 100% accurate.
- I don't expect anyone to take my word for anything. However, I would expect anyone working in dynamic Web site development to recognise that PreparedStatements are universally preferred over dynamic SQL with filters. This is based on a general view that PreparedStatements are impervious to classic SQL injection. Where do you think that view comes from? There's certainly a lack of evidence of PreparedStatements (when used properly) being at risk of classic SQL injection -- we can examine the source code of various open source DBMSs to see why -- and commercial vendors (e.g., Microsoft) often cautiously imply better security for PreparedStatements, despite generally (and not surprisingly) distancing themselves from SQL injection issues. There are numerous risks with dynamic SQL and filters -- incidents of classic SQL injection are well-known. But don't take my word for it -- find out for yourself: do some googling, and/or talk to Web developers and DBMS developers.
- You are generally repeating your prior arguments, and they are still not strong enough to qualify for "professional malpractice". "Universally preferred" is about as high on the EvidenceTotemPole as a snake's belt buckle. -t
- Why do you think PreparedStatements are universally preferred? Why aren't developers rejecting them categorically, with the argument that they're just as insecure as dynamic SQL and therefore even more risky, because we think we can use them without filters?
- Re: "There are numerous risks with dynamic SQL and filters -- incidents of classic SQL injection are well-known." -- There are indeed plenty of cases where NO filters or protections of any kind failed, but like I've pointed out before, we don't have decent stats on what kinds of protections have actually failed. Your wording kind of implies you do have such info, but you don't. It's a misleading statement. The existence of failures alone tells us almost nothing about filters versus PS's because there is a third option: zero filters and zero PS's (AKA "naked SQL"). Why the hell do I have to keep explaining this? Did I not explain it clearly the first two times? Your stubbornness appears to blind you. You keep pointing out that there have been a lot of failures (in general) over and over in such a way that it appears you THINK that fact boosts your case. -t
- I have observed classic SQL injection attacks against Web sites I've been involved with in one way or another, either using custom software or off-the-shelf Web products. The ones using dynamic SQL without filters have been, unsurprisingly, compromised by SQL injection. I've never had one with comprehensive filters get compromised by SQL injection, but I've seen compromises on sites using filters where filters were inadvertently omitted. It seems to be too easy to type "$userInput" when you should have typed "inputCleaned($userInput)". I've never seen a PreparedStatement compromised by classic SQL injection. I've never heard of a PreparedStatement being compromised by classic SQL injection. Of course, that isn't surprising at all -- and this is the important point, not my anecdotal report -- if we look at the source code of MySQL and PostgreSQL to see how PreparedStatements are implemented. Given that, what reason do we have to think that SQL Server, DB2 and other popular closed-source DBMSs are implemented any differently, especially as commercial vendors have a lot more to lose if their PreparedStatement implementations are found to be vulnerable to classic SQL injection?
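The "forgotten filter" failure mode described above can be sketched as follows. inputCleaned() is the hypothetical filter named in the text; its naive quote-doubling body is an assumption for illustration only.

```python
def inputCleaned(s):
    """Hypothetical filter: naive quote-doubling, for illustration only."""
    return s.replace("'", "''")

userInput = "x' OR '1'='1"

# Both lines build dynamic SQL; only one remembers the filter, and nothing
# in the language enforces the difference.
filtered  = "SELECT * FROM t WHERE c = '%s'" % inputCleaned(userInput)
forgotten = "SELECT * FROM t WHERE c = '%s'" % userInput  # filter omitted

assert "''" in filtered                  # quotes neutralised
assert "' OR '1'='1" in forgotten        # payload survives intact, ready to run
```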
- Again, you are using anecdotal evidence. That level of evidence is FAR from strong enough to justify your "guilty of professional malpractice".
- My evidence for how PreparedStatements are implemented in MySQL and PostgreSQL is certainly not anecdotal. It's analytic, based on reading the source code. My evidence for PreparedStatements being implemented in an equivalent manner in commercial, closed-source DBMSs is circumstantial evidence (setting aside, for the moment, the fact that some vendors explicitly state that PreparedStatements are more secure than dynamic SQL), but it's strong circumstantial evidence. In particular, it's trivial to observe that the same performance pattern found in MySQL and PostgreSQL -- i.e., PreparedStatements execute faster inside a loop than the equivalent dynamic SQL, which is accounted for by the fact that PreparedStatements are parsed only once -- also appears in commercial, closed-source DBMSs. Thus, it is reasonable to conclude that the same internal mechanism -- i.e., that parameter values are inserted into appropriate locations in the parse tree or query plan in a PreparedStatement -- is employed in all popular DBMSs and thus they preclude classic SQL injection. Filters with dynamic SQL do not preclude classic SQL injection; they only block it.
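The "parsed once, executed many times" pattern referred to above can be sketched with Python's sqlite3 module (chosen for convenience). executemany() compiles the statement once and rebinds parameters, whereas the loop below re-parses a fresh SQL string on every iteration; the results are identical, but the single-parse form is the source of the performance difference being discussed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (v INTEGER)")

# Prepared-style: one parse, many executions with rebound parameters.
conn.executemany("INSERT INTO t (v) VALUES (?)", [(i,) for i in range(1000)])

# Dynamic-style: a new statement string is parsed on each iteration.
for i in range(1000):
    conn.execute("INSERT INTO t (v) VALUES (%d)" % i)

total = conn.execute("SELECT COUNT(*) FROM t").fetchone()[0]
assert total == 2000
```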
- The gist of your argument appears to be that one can forget to use per-value filters but that PS's force one to use API's such that one cannot ever forget to apply the API to values. Is this correct? I've never tested all common brands of PS's to verify there is no way to include values in dynamically generated SQL fed to PS's. If they forbid it, it could be extra code to include "values" that otherwise may be invariant (constant) for a given routine. For example, a given report page may be only for retail outlets and thus wish to hard-wire the condition "site_type = 'retail'" into the SQL for that page. If PS's don't allow hard-wiring of that, then it's extra code to process the value "retail" in the code and is a factor against PS's. (In this case it's trivial, but some apps have more involved invariants, especially old DB's with a lot of legacy baggage such as location codes from mergers.) On the other hand, if they DO allow such, then one can make the same "forget to wrap" mistake as the value wrapper approach (even for non-invariants). Thus, there is no "free lunch". And remember, I'm not promoting a given approach here, only checking to see if there is enough strength to the PS case for "professional malpractice". -t
- PreparedStatements can be created using dynamic SQL, in exactly the same manner as conventional dynamic SQL. However, auditing is simplified compared to dynamic SQL with filters. It's straightforward to identify the points where PreparedStatements are created dynamically and audit them for possible inclusion of external data.
- PageAnchor HumanDiscipline01
- Please elaborate. It appears it requires human discipline, just like wrapper usage: (remember to check X, Y and Z for A, B, and C...). The "machine" is not hard-enforcing anything.
- Using PreparedStatements always requires two steps: The first is the potentially-unsafe creation of the statement, which should never contain external data, filtered or otherwise. The second is the execution of the statement, which can be safely passed external data -- unfiltered -- as parameter arguments. This enforced separation between creation and execution means auditing is only needed at the point where a given PreparedStatement is created. At that point, any dynamic construction should be viewed with utmost suspicion. Fortunately, in typical practice most PreparedStatements are constructed from a static string, which is inherently safe. Dynamic SQL, on the other hand, must always be considered unsafe until proven otherwise, which always requires careful auditing of the construction of the SQL statement.
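The two-step separation described above can be sketched as follows (table and column names are invented; sqlite3 is used only for convenience). The SQL string is a static constant -- the only line needing audit -- while external data flows in solely as execution parameters.

```python
import sqlite3

# Step 1 -- creation: static SQL only, never built from external data.
FIND_USER_SQL = "SELECT id FROM users WHERE name = ?"

def find_user(conn, external_name):
    # Step 2 -- execution: unfiltered external data is safe as a parameter.
    return conn.execute(FIND_USER_SQL, (external_name,)).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, name TEXT)")
conn.execute("INSERT INTO users VALUES (1, 'alice')")

assert find_user(conn, "alice") == [(1,)]
assert find_user(conn, "alice' OR '1'='1") == []   # payload is inert data
```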
- But isn't it also often considered "best practice", or at least common practice, to isolate invariants (constants) into a configuration-like area? What if we have dozens of somewhat similar reports, such that we create a "configuration module" holding the usage-specific (report-specific) invariants? You are instead proposing that one hard-wire them into "main line" string constants (SQL). Like just about anything else in code-land, different parts of queries vary for different reasons, such that the grouping used to abstract away differences on aspect X will differ from the grouping used to abstract away differences on aspect Y. The difference between user-input values and "configuration" values can get murky in such a situation. And it may vary per module, such that in Module A a value is configured-in while in Module B the same value is user-input. Generally we want to abstract away the source so that it can be changed (or vary per reuse instance) without rewriting the data-using routines: they "shouldn't care" about the source agent. One loses this abstraction ability if required to inline ALL values or phrases. If we allow such source abstracting, then verifying cleanliness is still an involved manual operation that requires human discipline on the part of application coders.
- It's certainly reasonable to isolate constants into a configuration-like area, but how many invariant queries are subject to configuration? It's more appropriate -- and certainly most secure -- to isolate constants into a constant area, whilst preserving the rule that creating PreparedStatements should never involve external data, filtered or otherwise.
- It appears you missed where I pointed out that what's "constant" for one section/grouping may not be for another. Anyhow, as I read it, "preserving the rule" is a "human" rule, and still requires human discipline; thus PS's require one to "be vigilant" just like wrappers. Usage of PS's alone does not outright prevent injection arising from inadvertently forgetting to do the right thing to the right part. -t
- I didn't miss it, I thought I answered it. Constant is constant; I'm not sure what you mean by something being "constant" for one section/grouping and not another. Certainly PreparedStatements do not prevent gross misuse -- such as creating PreparedStatements using external data -- but the separation between creating a PreparedStatement (which is potentially risky) and executing a PreparedStatement (which is not risky) limits the audit surface area. The vigilance required to ensure PreparedStatements are not misused on creation is typically trivial: the SQL string is either dynamically constructed somewhere (potentially risky -- look carefully) or it isn't (and therefore safe). This is nothing compared to the discipline needed to ensure that every dynamic SQL statement is appropriately filtered and every filter is correct.
- See below at "Shared Sub-Criteria and Repeating Snippets". For a non-trivial system, the effort is roughly the same. -t
- Why do you think the effort is the same?
- I have no reason to believe it's different. Neither side has done documented mouse-stroke, keyboard, and eye-movement surveys, but I don't see how it would be less effort for PS's that use shared snippets. One has to inspect the code to make sure that all "constructed" queries use data/variables from "safe" or "cleaned" sources. The inspection path is roughly the same: trace each variable back to its source and make sure it's not changed by some unexpected operation along the way. Plus, your claim of "malpractice" requires clear and obvious evidence, not murky inspection eyeball-movement counts. Your argument is growing ever more convoluted, indirect, roundabout, and full of unproven assumptions. Admit failure instead of digging in your heels to defend your willful ego. You failed!
- A chain of static string references is trivial to follow, especially if you use features of the language to (say) modularise PreparedStatement definitions and explicitly not include the mechanisms that allow access to external data. A good developer, recognising that PreparedStatement definitions represent a point of potential security risk, will always ensure that they are as simple, provable, and audit-able as possible.
- This is rather vague. Can you create code snippets showing it's "trivial"? It takes discipline and human memory to remember "not to include mechanisms that allow...". There is no "machine cop" that spanks violators on the spot.
- Sure. A typical set of PreparedStatement definitions will look like:
 $prepStmtWhereClause015 = "WHERE p = ?";
 $prepStmtDef013 = "SELECT * FROM piffle " . $prepStmtWhereClause015;
 and so on. Using PHP as a typical example, it's trivial to observe that all string definitions refer to other string definitions in the same module, with no reference to external strings such as $_POST['externalString'] and the like. Of course, as usual PHP makes things as difficult as possible; better languages (e.g., Java, C#, etc.) can easily modularise the string definitions in a manner that allows the compiler to guarantee no external references if you don't want them.
- I'm not clear what the effort being compared. I cannot read your mind to know what kind of code inspection scenarios you are simulating in your noggin. In both cases it's reference-following. If string A is composed of string B and C, then you follow B and C and recursively see what they are composed of. You keep doing this until either the leaf is a constant or literal or a filter, or it's coming from I/O or another system, in which case you inspect further or fix.
- I've shown code you can inspect immediately above your comment, so there's no need to see what I'm "simulating" in my "noggin". My point is that following static references is trivial; it can easily be automated. The difference between PreparedStatements and dynamic SQL is that dynamic SQL requires references to dynamic external data, otherwise it's not dynamic SQL. PreparedStatement definitions should not have references to dynamic external data -- they should only be static references. The required absence of external data in PreparedStatement definitions makes them easier to audit than dynamic SQL, because dynamic SQL requires references to dynamic external data and may have static references, whilst PreparedStatement definitions may only have static references.
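A toy sketch of the automated audit claimed above: flag any SQL-string definition that references an external-input source. The module text and the name "request_param" are invented stand-ins (e.g., for PHP's $_POST); a real audit would use proper parsing rather than a regex.

```python
import re

module_source = '''
PREP_WHERE_015 = "WHERE p = ?"
PREP_DEF_013 = "SELECT * FROM piffle " + PREP_WHERE_015
BAD_DEF = "SELECT * FROM t WHERE c = '" + request_param + "'"
'''

# Flag any definition line that references the external-input variable.
external_ref = re.compile(r'^\s*\w+\s*=.*\brequest_param\b', re.MULTILINE)
violations = external_ref.findall(module_source)

assert len(violations) == 1   # only BAD_DEF references external input
```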
- As far as the motivation of commercial DBSMS vendors, we can speculate on that until the cows come home, but it's still only speculation instead of direct empirical evidence.
- That's a valid statement in general, but does not address the particular statement I made. Again, what reason do we have to think that SQL Server, DB2 and other popular closed-source DBMSs are implemented any differently from open source DBMSs, especially as commercial vendors have a lot more to lose if their PreparedStatement implementations are found to be vulnerable to classic SQL injection?
- Again, our speculation of what commercial vendors think or do is ONLY speculation. Speculation about stuff we cannot or have not directly verified is NOT enough evidence to justify "professional malpractice" (PM). If you think speculation alone can make a good-enough case for PM, then please state that case clearly and thoroughly. (Windows and MS-Office have been notoriously "leaking" in terms of security, and it hasn't killed Microsoft, I would note as a point example.)
- That doesn't mean Microsoft is cavalier about security. More importantly, do you have any evidence that the PreparedStatement implementations in popular closed-source DBMSs are any less secure than their open-source competitors?
- Given that dynamic SQL is demonstrably risky and PreparedStatements are demonstrably safe (or, at worst, demonstrably safer), how can anyone argue that choosing the former over the latter is anything but malpractice?
- Your demonstration of "demonstrably safe" is weak, per discussions.
- Do you think PreparedStatements are not as secure as dynamic SQL with filters?
- My opinion on that is not relevant to this section. (Plus, I'm pretty sure I already answered that.) -t
- I would like to know your opinion, relevant or not. If you already answered it, I can't find it.
You are guilty of professional malpractice if you insist on using filters with dynamic SQL when
PreparedStatements are available, especially as various vendors -- e.g., Microsoft, with SQL Server -- explicitly point out that
PreparedStatements are more secure than dynamic SQL. It is wilfully negligent to advocate a recognised inferior and notoriously risky mechanism over a recognised superior mechanism, regardless of how the superior mechanism is implemented.
- Skipping SQL altogether may be safer than using SQL, but that doesn't mean you "should". After all, you can't have SQL injection if you don't use SQL. Driving a double-reinforced Volvo is safer than driving a stock Chevy sedan, but that doesn't mean you "should". Risk is relative, and to be weighed against other factors. Do you wear bullet-proof underwear? If not, why not? Your nuts are at risk of being shot, so why not reduce that risk? Rough estimates of probability are not sufficient for a "professional malpractice" case (unless the weighting is so clearly lopsided as to make error ranges moot).
- Sure, skipping SQL altogether would be safer than using SQL, but that's not a reasonable option. Using PreparedStatements instead of dynamic SQL with filters is a very reasonable option. Indeed, I can't think of any reason to promote dynamic SQL with filters over PreparedStatements. Can you?
- That's not the issue at hand here. It appears to be a straw-man.
- Wasn't use of dynamic SQL with filters, vs PreparedStatements, precisely what started this discussion at the top of TopOnPreparedStatements?
- "Reasonable" is not the evidence level being examined here, but rather "professional malpractice". The evidence has to be iron-clad for malpractice-level claims, not merely "reasonable". Direct empirical evidence is still sorely lacking. Your case is based on ASSUMPTIONS of implementation techniques and ASSUMPTIONS about code auditing practices. They perhaps may be "reasonable" assumptions, but they are NOT iron-clad assumptions.
- Malpractice-level claims require a preponderance of evidence, i.e., more likely than not, which means "an impartial jury, after hearing and considering all the information discovered by respective parties will find a greater than 50% probability that professional negligence did occur." (See http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2628513/) Imagine that you've created a Web site for a client using dynamic SQL and filters instead of PreparedStatements. A coding mistake is made in a filter that causes it to fail. SQL injection occurs and your client suffers significant financial loss; he takes you to court. Aware as you are of the arguments presented here and elsewhere on the Web, what do you think an impartial jury -- assuming your defence attorney and the prosecution's attorney are equally capable -- will find?
- From your link: "Medical malpractice is defined as any act or omission by a physician during treatment of a patient that deviates from accepted norms of practice in the medical community and causes an injury..." -- The "norm" in the industry is not PS's. Or at least you have not established it as the norm. That being said, the US legal system is sometimes whacky and random. Are we talking about legal strategy or "should's"?
- I don't know where you work that PreparedStatements aren't the accepted norm, but they certainly are among the Web application developers I know. Google for "prepared statement sql injection" and you'll see it's a fairly pervasive view. I can't find anything that suggests use of dynamic SQL with filters is to be preferred over PreparedStatements. Your view, therefore, appears to be in the minority.
- If you search for Tool X you will find mostly proponents of Tool X and if you search for Tool Y, you'll find mostly proponents of Tool Y. Google searches are thus not a good indicator. Google searches alone are not rigorous enough for "professional malpractice" evidence.
- Google searches are a potent identifier of accepted norms, one of the best sources we have outside of corroborated expert testimony and peer-reviewed research. Note that "prepared statement sql injection" isn't biased in favour of anything, it merely obtains pages where the words appear together. Why don't we find stories about prepared statements resulting in SQL injection? If there were any, shouldn't such a search retrieve them?
- The jury could be shown "prepared statement bugs" results from Google to see that they are not problem-free, and thus should not automatically "just" be accepted.
- Any non-trivial software has bugs. As was pointed out above (see "Given a query like 'SELECT * FROM t1 UNION SELECT * FROM t2 WHERE c1 = 3' ..."), bugs in PreparedStatement implementations can't result in classic SQL injection. Bugs in the filters used with dynamic SQL can result in classic SQL injection.
- Re: "bugs in PreparedStatement implementations can't result in classic SQL injection." -- But they can result in OTHER types of injection.
- Other forms of injection -- such as buffer overflows -- are no more or less likely with PreparedStatements than with dynamic SQL, so they're not a distinguishing factor.
I don't know what you mean by a "private-company RDBMS".
Microsoft, Oracle, DB2, Sybase.
You mean commercial closed-source? As I noted above, commercial vendors often cautiously imply better security for PreparedStatements, despite the fact that they obviously like to project the view that SQL injection is and always was solely a client problem. But it's almost universally accepted that PreparedStatements are impervious to classic SQL injection. Why do you think that view exists?
Perhaps API venders want to make a buck by scaring people. It happens. The empirical case for PS's is weak. I'm just the messenger. I'm not saying that "industry practices" should be outright ignored, but to claim there is a very strong case for PS's is irrational. ("Universally accpeted" is probably an embellishment, I would note. You are spinning again.) -t
Do you really believe that PreparedStatements exist because "API venders [sic] want to make a buck by scaring people"? As for what is "universally accpeted [sic]", try googling for "prepared statement sql injection".
I only raised it as a possibility. I'm not choosing one of the possibilities as a favorite here. If your argument is based on vender motivation, then we need to consider all possible motivations, or at least previously-encountered motivations, and you are obligated to rule the others out before claiming there is only one, in order to get enough strength for "professional malpractice". (And "vender" is not technically wrong, just less common.) I already Googled and didn't come to your conclusion.
Re "I only raised it as a possibility", how likely a possibility do you think it is?
When you googled "prepared statement SQL injection", what conclusion did you come to?
That there is a lack of direct empirical studies and because of that the best justification people come up with is "just trust X" and few seem to be fully satisfied with that.
Where did you find that "few seem to be fully satisfied with that"?
Shared Sub-Criteria and Repeating Snippets
Re: I'm not sure what you mean by something being "constant" for one section/grouping and not another.
For example, suppose you have hundreds of on-line reports for a business-to-business web app. It's a legacy system that has a lot of baggage from mergers and re-organizations. A group of about 30 reports is for retail stores only. The to-be-included-in-WHERE-clause criteria for "retail" locations (offices or buildings) is as follows:
retailCrit = "((loc.org_code between 50 and 64 and loc.org_level < 70) "
+ " or loc.org_type in ('ST','RE','SA'))";
Keep in mind we'll have other criteria clauses for other types of "locations", but we are only looking at the "retail" reports for now. (And convoluted criteria like this is fairly common on legacy systems in my experience.)
So we have these 30 report modules for "retail", with different net SQL, that use the above criteria. Now a design decision is whether to hard-code (repeat) the above criteria 30 times into each of the 30 reports, or treat it like a "constant" for that set of 30. Most would agree it's "better abstraction" to define the criteria phrase in one shared place rather than copy and paste it 30 times. Thus, if we are using prepared statements, we'll still have something like this:
prepStat.registerSql("SELECT blah FROM locations loc WHERE x=4 AND " + retailCrit);
One cannot tell from looking at this line alone whether some parts of "retailCrit" come from user input. One has to know the code and/or remember to go check where such variables are defined/assigned.
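The shared-snippet situation above can be sketched as follows (names are taken from the example; the composition is illustrative). RETAIL_CRIT is defined once as a module-level constant and reused by many report modules; the composed SQL is still static, but, as noted, confirming that requires tracing the reference back to its definition.

```python
# Shared criteria constant, defined once for the ~30 retail reports.
RETAIL_CRIT = ("((loc.org_code BETWEEN 50 AND 64 AND loc.org_level < 70) "
               "OR loc.org_type IN ('ST','RE','SA'))")

# One of the retail report modules: a hard-wired invariant plus the shared snippet.
REPORT_SQL = "SELECT blah FROM locations loc WHERE x=4 AND " + RETAIL_CRIT

# Auditing means confirming the whole composed string is static:
assert RETAIL_CRIT in REPORT_SQL
assert "BETWEEN 50 AND 64" in REPORT_SQL
```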
Given the above code, the string passed to registerSql is static in most language implementations; it will be constructed at compile-time. If the construction of such SQL strings is confined to a module, defined in a class, constrained by type, and/or the strings are defined as immutable (say, using something like C++'s "const" or Java's "final" modifier) it's straightforward to determine whether or not they've been tainted by external data or an attempt has been made to mutate them. Of course, it's possible to subvert such mechanisms and/or obscure the origin of SQL strings behind complexity, but that increases the security risk. That is at least equally the case for dynamic SQL.
That assumes a "static" application language, and still requires manual back-tracing of some form in the code. Whether that makes it slightly better or slightly worse than wrappers is moot in the "professional malpractice" accusation, which needs something far more solid than "a little bit better".
All that "professional malpractice" requires is that "an impartial jury ... find a greater than 50% probability that professional negligence did occur" where professional negligence could be considered behaviour which "deviates from accepted norms of practice" and results in harm. We know dynamic SQL -- even with filters, if they fail or fail to be used -- can result in harm from classic SQL injection. Can you provide evidence that PreparedStatements have ever resulted in classic SQL injection? Imagine the prosecution offers these rhetorical questions as an argument. What do you think the jury will decide?
But a given set of wrappers may also have a clean record. The only thing agreed on is that "naked values" SQL is at risk, and that's not the issue. And you have not measured "norms", unless you can trick the jury into thinking Google is a scientific survey device.
Given that wrappers are a barrier to classic SQL injection, there is always a possibility that a future exploitation of flaws in the string handling code will allow a malicious string to fool the wrappers and be passed to the SQL parser, resulting in classic SQL injection. PreparedStatements do not rely on filtering, but on inserting parameter arguments into the parse tree or query plan after SQL parsing has occurred so that classic SQL injection cannot occur. Surely the jury would recognise that the latter is superior to the former?
"Better" is not necessarily enough to justify "professional malpractice". 5-inch thick bullet-proof windows on a politician's car is probably "better" than 4 inches, but 5 > 4 alone does not mean the security-window installer is guilty of "professional malpractice". If the standard or common practice was CLEARLY 5 inches and around for a while, then perhaps "yes".
In a wrongful death suit, choosing 4 inch windows instead of 5 inch, even if the improvement in security is marginal, could well be considered negligent -- especially if all other factors are equal. In this case, they aren't even equal. We're talking about 4 inch windows that provide no security unless special shutters are manually attached (and we have to be extra careful to make sure the shutters are effective and mounted to every window, but we can't be certain they'll always work) vs 5 inch windows that eliminate the chance of projectile penetration, for essentially the same cost.
PreparedStatements have been recommended practice in the J2EE and Web development worlds for over a decade. I can't say whether they're common practice or not without polling a statistically-significant sampling of development shops, but they're certainly standard practice among the developers I know.
I'm not following your 4 inch windows reply. It makes no sense to me. If you mean that jurors can be convinced via clever manipulative persuasion techniques that anything shy of the maximum known thickness is "negligence", then I agree. But this discussion shouldn't be about juror psychology, although perhaps that's unavoidable. TopsLaw again?
The "4 inch window" is like dynamic SQL. It's insecure unless you have "special shutters ... manually attached", i.e., filters, with which "we have to be extra careful to make sure the shutters are effective and mounted to every window" -- in other words, we have to make sure filters are correct and consistently used. "5 inch windows" are like PreparedStatements. They require no shutters (i.e., filters) and cost no more than the 4 inch windows. In over ten years of use, we have no evidence of them failing due to classic SQL injection, and they're consistently recommended in order to avoid classic SQL injection. Presented with that evidence, how do you think the jurors would react to your decision to use "4 inch windows" with "shutters" (dynamic SQL with filters) instead of "5 inch windows" (PreparedStatements)?
One could argue that J2EE is trying to hook development stacks to their API's to lock in business. That's always a possibility with vendor recommendations. Commercial vendors have a record of bias. I've mentioned that before, but as usual you've forgotten, so I have to reinvent the reply. And you have not provided clear evidence that "Web development" officially/clearly recommends such. You are imagining surveys that don't appear to exist. Hey, unicorns also recommend prepared statements. I just talked to one in my bathroom. -t
Isn't it more likely that Oracle -- maintainer of both the J2EE definition and the Oracle DBMS -- is fully aware that PreparedStatements are more secure than dynamic SQL with filters, and therefore reasonably advocates use of PreparedStatements?
"More likely" is speculative and weak. It's "impressionistic" evidence, for we cannot smash Larry Ellison's head open to dissect his decision-making neurons. I'm not suggesting such evidence be completely discarded, but it's not anywhere near enough to raise "professional malpractice" claims over.
Oracle Corporation's intent in promoting PreparedStatements is not the basis for my accusation of professional malpractice. Choosing to use filtered dynamic SQL instead of PreparedStatements is the basis for my accusation of professional malpractice.
That's circular reasoning. You brought up what Oracle is allegedly concerned about as some kind of evidence. Why are you jumping out of the Oracle sub-topic now?
Because it's irrelevant. You brought up Oracle's motivations; I only mentioned that PreparedStatements are recommended practice, and that isn't just coming from Oracle -- see OWASP or any other reference. I mentioned J2EE because it's a familiar example -- aside from all-too-obvious Web development -- of database-driven software that often interacts with the outside world and for which PreparedStatements are generally recommended.
Adding more vendors does not necessarily get one above ArgumentFromAuthority; it just makes the authorities plural. Further, a survey of a randomly selected set of vendors would be stronger evidence than cherry-picked quotes/vendors, because one cannot tell if bias is affecting which vendors are sampled and which are ignored. This is Evidence 101; I shouldn't have to explain this. -t
I don't see what that has to do with what I wrote. However, everything outside of your personal observations is ArgumentFromAuthority, if you're going to take that tack. Ever been to Siberia? No? Do you believe it exists? If so, you're relying on ArgumentFromAuthority. (Apparently.)
There are different "levels" of evidence. A study that demonstrates "PS's have a 0.001 failure rate while filters have a 0.05 failure rate" with citations is better evidence than "Oracle says so". It might still be wrong, but it's at least more dissect-able than "Oracle says so". It's more "science-able", for lack of a better word at the moment. -t
I've made no claims that something should be done merely because Oracle says so, though if they recommend a particular approach to be used with their DBMS, I'll generally give it greater weight than the equivalent recommendation from outside Oracle.
That's understandable, but it's still far too weak to qualify for "professional malpractice".
No, but choosing to use filtered dynamic SQL instead of PreparedStatements is professional malpractice.
Let me see if I have the essence here.
Prepared statements require a two step process. Step 1 is preparing the statement: taking a string, interpreting it as SQL (with wildcards filling in for data values to be entered later), and storing that as a prepared statement. Step 2 is running the statement, filling in the wildcarded data values on the fly.
Prepared statements promise the user that, no matter what garbage or malice is inserted into data fields at runtime (as opposed to preparation time), it can never, ever be interpreted as SQL code and will simply be really gnarly data.
On the other hand, whether they can deliver on that promise is implementation dependent.
Also, if the SQL is being dynamically generated at statement-preparation time, malicious user data can be baked into the prepared statement at that time.
Closed source RDBMS packages cannot be inspected, so we cannot assume that any closed-source vendor's implementation can deliver on the prepared statement promise based on code inspection.
If the above is agreed upon, I would offer the following:
Everything is implementation-dependent. Implementation is all that matters. Broken prepared statement code and broken wrappers will both open you up to vulnerabilities.
There is no perfectly safe way to run SQL. There is no perfectly safe way to run anything; TheKenThompsonHack proved that. Even if you fully vet the source code of an RDBMS, you can't prove that the compiler itself isn't sneaking prepared-statement-breaking code into your system. No security is perfect.
Even if your prepared statement implementation is perfect, poor programming can still lead to SQL injection attacks. PS is foolproof, not damnfoolproof. If you allow user generated data in your SQL string at statement preparation time, you open the door to SQL injection. The answer is to always hard code your SQL, or to only depend on constants set at compile-time. That is, prepareStatement("SELECT FROM " . table_name . " WHERE TIME < ?") is only trustworthy if you are certain that table_name is set at compile time. However, if it is set at compile time, and your prepared statement implementation is correct, and TheKenThompsonHack isn't attacking your system, then whatever the user enters can be used to fill in the '?' later, and the worst that can happen is that the statement fails to run because the user's data doesn't parse as a date.
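The prepare-time vs run-time distinction above can be demonstrated concretely (a sketch using Python's sqlite3 as a stand-in for a PreparedStatement API; the events table and data are invented for illustration):

```python
import sqlite3

# Minimal demonstration of the point above: data bound at execution time
# stays data, but the same data concatenated at preparation time becomes SQL.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE events (t TEXT)")
db.execute("INSERT INTO events VALUES ('2014-01-01')")

malicious = "' OR '1'='1"

# Safe: the value fills the '?' after parsing, so its quote characters are
# only ever part of a comparison value, never SQL syntax.
safe = db.execute("SELECT * FROM events WHERE t < ?", (malicious,)).fetchall()
print(safe)  # []

# Unsafe: concatenating the same value at *preparation* time bakes it into
# the statement text, and the OR clause now executes as SQL.
tainted_sql = "SELECT * FROM events WHERE t < '" + malicious + "'"
injected = db.execute(tainted_sql).fetchall()
print(injected)  # [('2014-01-01',)]
```

This is exactly the "trustworthy only if set at compile time" caveat: the parameterized form is immune regardless of input, while the concatenated form is compromised by the very same input.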
These don't mean that prepared statements are worthless. They are the industry standard for preventing SQL injection (see OWASP on that). That is no proof in and of itself, but let's apply a variant of the LasVegasTest. If a major RDBMS' implementation of prepared statements was less than ironclad, it would be more readily exploited. Even if it didn't make the news, those in the security field would respond.
I work in the credit card business. The four major US credit cards require that everybody doing business with them (from processors to mom-and-pop stores that accept credit cards) comply with a standard called PCI-DSS. PCI-DSS in turn requires fixing the top ten complaints in OWASP, starting with injection attacks.
If a major RDBMS had incorrect prepared statement code which allowed injection attacks on well-behaved client code (that is, not allowing injections at prepare time), it would be exploited. Hard. And the credit card industry, which suffers a black eye every three months or so related to credit data theft, would have published an alert (and notified OWASP) along the lines of "Never use XYZ RDBMS. If you have it, switch off it within two years or you will be disconnected." This alone could spell millions of dollars of revenue loss to the RDBMS vendor, and they would either become a laughingstock or fix their implementation right quick, quite possibly both.
This has not happened.
If such vulnerabilities exist, they are nowhere near the weak points of the system. Lack of exploits doesn't prove that the implementations are correct, but it does suggest they are correct enough that crackers aren't bothering to find those vulnerabilities; they're happy to exploit much bigger holes. There are plenty of injection attacks with unprepared SQL or poor use of prepared statements ("let's prepare this with the data we just got back on the HTTP request"), but not with SQL prepared with hardcoded strings and run with variable data. If it was vulnerable enough to matter, it would be exploited. I submit the fact that it hasn't been exploited as evidence that this isn't the case.
--RobMandeville
But like I've said before, we don't know what kind of techniques have resulted in breaches. The available empirical data is weak. I'm not proposing "naked" values such that comparing failures of PS's to failures of naked values is a moot comparison and does not answer the questions raised in these topics. Nobody's presented documented evidence of wrapper-based breaches either. Thus, if lack of known breaches is your metric, then we both "win". (There may have been PS breaches, but they may have been hushed and the vendor simply says, "Please install patch/version N because patch/version N - 1 has a known vulnerability" and the vendor gives no details, claiming doing so would give hackers too many clues.) -t
{Vendors may keep breaches quiet, but do you really think the security and developer communities would keep quiet?}
- They can get sued by the vendor for "violating the license agreement" etc.
- [The party that got breached might be taken to court for violating the license agreement, but they certainly won't be the only ones investigating. When you get a major breach, you're going to get some feds, such as the FBI and the Secret Service, investigating. You may also have banks and credit card associations investigating. They don't own the license, so they signed no license agreement with the vendor and can't breach it.]
- {Perhaps some can, but that doesn't stop security breaches from being publicised by CERT, OSVDB, etc. By the way, on 13/03/2014 I looked for "prepared statement" in the OSVDB and found 12 entries. There was one classic SQL injection breach -- http://osvdb.org/ref/67/snipsnap.txt -- in which a command-line argument was used when creating a PreparedStatement. There were 11097 entries containing "sql injection". A cursory examination shows that many of them contain text along the lines of "not properly sanitizing user-supplied input". With PreparedStatements, we don't have to be "properly sanitizing user-supplied input."}
- That doesn't change anything already stated. It still has insufficient details to distinguish between actual failures of the 3 options. We all 100% agree that naked values result in injections. No news there.
- {The failures aren't due to PreparedStatements themselves, or we'd see mention of failure due to PreparedStatements. The specifics of each vulnerability are often linked, so we can examine them. However, it doesn't matter whether the failures are due to unfiltered SQL or filtered SQL -- requiring filters is inherently riskier than PreparedStatements, which do not require filters.}
- So you claim.
- {Yes, I do. Thank you for confirming it.}
{Even in the absence of documented evidence of wrapper-based breaches, filters are inherently more risky than PreparedStatements: failure of a filter can result in classic SQL injection; failure to use a filter can result in classic SQL injection. With PreparedStatements, failure of the PreparedStatement implementation does not result in classic SQL injection, and "failure to use a filter" doesn't exist. Why, therefore, would you choose to expose yourself and your customers or clients to unnecessary risk?}
That's based on assumptions of vendor implementations, not empirical attack tests. I will agree that based on their claim of implementation, certain kinds of injection are likely reduced, but that's weak evidence that does NOT justify "professional malpractice".
{As has been pointed out before, "empirical attack tests" are not an effective way of gauging security. We know how PreparedStatements are implemented in MySQL, PostgreSQL and other OpenSource DBMSs, and we can infer from the fact that their performance is equivalent in closed source DBMSs -- along with general knowledge of how languages are implemented, plus explicit claims from some vendors of PreparedStatements being more secure -- that they are implemented the same way. We know SQL injection due to failure to sanitise inputs is a significant problem -- see above re the OSVDB and 11097 entries containing "sql injection" -- but "prepared statement" in the same vulnerability database reveals one SQL injection breach, and that was due to creating a PreparedStatement with external input rather than a flaw in a PreparedStatement implementation. We can infer from this that PreparedStatements are not a significant source of vulnerabilities or security breaches. Whilst we could also infer from this that PreparedStatements are not being used, that seems rather unlikely, given that they are the primary recommended means of avoiding classic SQL injection in database-driven applications.}
Re: "We can infer from this that PreparedStatements are not a significant source of vulnerabilities or security breaches" -- That's not news. I never claimed they were. We have three kinds of fruit we're considering: Fruits A, B, and C. We only, at best, have stats on Fruit-A and Non-Fruit-A. That won't answer the key question.
{If PreparedStatements are not a significant source of vulnerabilities or security breaches, then PreparedStatements are superior to dynamic SQL even with perfect filters, because we have to remember to use the filters. Therefore, why choose dynamic SQL with filters over PreparedStatements?}
For non-trivial designs, one has to do the same with PS's, as we already discussed. (Remember, it's what led to the "constants" discussion.)
{Even with complex queries, it is trivial to structure a set of PreparedStatement definition strings to either exclude all sources of external data, and/or to identify whether or not external data is included. It is not trivial to determine whether or not a set of dynamic query invocations have incorporated filters on every external data source. It is not trivial to prove that every filter precludes every possible dangerous input, and this can vary from DBMS to DBMS.}
I'm sorry, but I don't see how it's allegedly trivial. The code structure & inspections issues are roughly comparable as I picture such code structures in my mind.
{A set of constant strings used to define PreparedStatements will, at most, make reference to each other; there will be no references to external data sources. A set of dynamic strings used to define dynamic SQL queries will possibly make reference to each other -- same as for the PreparedStatement definitions -- but will also make reference to external data sources, along with invocations of filters to sanitise those data sources, along with the definitions of the filters. Thus, the latter is more complex than the former.}
Sorry, I don't see how that's the case. The amount of code needing inspection and the quantity of "reference hops" would not vary much between technique choices, barring some unidentified force tilting code design a certain way. And the filter names are often not much different than their PS counterparts.
{It can easily be shown. PreparedStatement definitions require...}
- String definitions, e.g., $prepStmtDef001 = "SELECT * FROM piffle WHERE x > ?";
- String definitions that reference other strings, e.g., $prepStmtWhereClause015 = "WHERE p = ?"; $prepStmtDef013 = "SELECT * FROM piffle " . $prepStmtWhereClause015;
{Dynamic SQL definitions require...}
- String definitions, e.g., $dynSQLDef001 = "SELECT * FROM piffle WHERE x > " . sqlFilterNumeric($_POST['userInput']);
- String definitions that reference other strings, e.g., $dynSQLWhereClause015 = "WHERE p = " . sqlFilterNumeric($_POST['userInput']); $dynSQLDef013 = "SELECT * FROM piffle " . $dynSQLWhereClause015;
- Invocations of filters to prevent classic SQL injection, e.g., sqlFilterNumeric($_POST['userInput'])
- Definitions of filters to prevent classic SQL injection, e.g., function sqlFilterNumeric($input) { ...etc... }
{The latter, it appears, is more complex than the former.}
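To make the extra burden concrete, here is a hypothetical sketch of what a filter like the sqlFilterNumeric() referenced above must do (Python here for runnability; the function name and regex are invented, and proving such a filter correct for a given DBMS is exactly the obligation the dynamic-SQL approach carries):

```python
import re

# Hypothetical numeric-value filter, analogous to the sqlFilterNumeric()
# mentioned above: accept only strings that are purely numeric and reject
# everything else. Its correctness -- and remembering to call it at every
# splice point -- is the dynamic-SQL approach's ongoing burden.
def sql_filter_numeric(value):
    s = str(value).strip()
    if re.fullmatch(r"-?\d+(\.\d+)?", s) is None:
        raise ValueError("non-numeric input rejected: %r" % s)
    return s

print(sql_filter_numeric("42"))  # 42
try:
    sql_filter_numeric("1; DROP TABLE piffle")
except ValueError:
    print("rejected")  # rejected
```

Note that a whitelist ("accept only known-good") filter like this is safer than a blacklist, but it still only protects the call sites where someone remembered to invoke it.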
{What do you mean by "the filter names are often not much different than their PS counterparts"? What filter names? Counterparts to what?}
I don't see where you "attach" the user input to the PS. It appears you missed a step in your PS version.
{We didn't talk about execution, only the construction of the definition strings. A typical PreparedStatement use looks like the following PHP PDO example:}
$sql = "SELECT * FROM users WHERE USERNAME = ? AND PASSWORD = ?";
$stmt = $dbc->prepare($sql);
$stmt->execute(array($username, $password));
{The equivalent dynamic SQL might be:}
$sql = "SELECT * FROM users WHERE USERNAME = " . sqlStringFilter($username) . " AND PASSWORD = " . sqlStringFilter($password);
$dbc->query($sql);
{Of course, the latter doesn't show the definition of sqlStringFilter() -- which doesn't exist in the former -- but the real problem is that the dynamic SQL must rely on the correctness of sqlStringFilter() -- which is non-trivial to prove -- and an absolutely consistent use of sqlStringFilter() and its cousins with every external input. Neither of these is required with PreparedStatements.}
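For comparison, a rough equivalent of the PHP PDO example above in a runnable form (Python's sqlite3; the users table and credentials are invented for illustration):

```python
import sqlite3

# Rough Python (sqlite3) analogue of the PHP PDO snippet above.
# The schema and credentials are invented for illustration.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE users (USERNAME TEXT, PASSWORD TEXT)")
db.execute("INSERT INTO users VALUES ('alice', 's3cret')")

def login(username, password):
    # No filter function needed: both values are bound after parsing,
    # so they can only ever be compared as data.
    sql = "SELECT * FROM users WHERE USERNAME = ? AND PASSWORD = ?"
    return db.execute(sql, (username, password)).fetchall()

print(login("alice", "s3cret"))       # [('alice', 's3cret')]
print(login("alice", "' OR '1'='1"))  # [] -- injection attempt is just data
```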
- Something has to associate the question marks with input values. That's part of the "concerns" of a developer. Those should be counted also in terms of software maintenance complexity. (Shifting back to their "security" is a change of sub-subject away from "complexity".)
- {Yes, something has to associate the question marks with input values at the point of execution. In the above, it's demonstrated in "$stmt->execute(array($username, $password))". It's certainly part of the "concerns" of a developer, but which do you think developers will find to be easier -- freely passing arguments to an 'execute' function (i.e., essentially the same effort as invoking a function with parameters), or having to construct a string with appropriate delimiters and being very careful to include invocations of filter functions with every variable or expression that might contain external data?}
- You are skipping showing the steps to fill up that array, making the equivalent appear to be less code. It's your bias showing through again, causing you to spin your presentation using misleading used-car-salesman-like techniques to shrink the code shown unfairly. Shame on you; it's intellectual dishonesty to make it appear to be less code when in fact it's not. That kind of dishonesty shouldn't be tolerated on this wiki, and I hope other WikiZens call you on it.
- {I'm not sure what you think I'm skipping. The array is "filled" by array($username, $password), which creates a keyless array containing the values in $username and $password, which is exactly what execute() requires. See http://docs.php.net/manual/en/language.types.array.php under "Example #4 Indexed arrays without key", and see http://docs.php.net/manual/en/function.array.php }
- You are comparing apples to oranges and hiding the apples in your misleading snippets. The entire code base from the app-developer's perspective will be roughly the same.
- {How am I comparing apples to oranges? How are my snippets misleading? The two snippets above, one using a PreparedStatement and the other using dynamic SQL, are otherwise identical.}
- Are you talking about the first set of snippets or the second? I already explained the issues with the first. In the second, for one, some common PS API's require telling the API what type the parameter is. Second, using positional association is often awkward from a programming standpoint. Third, even as you gave it in the second set, the code is pretty close in size.
- {I was referring to the second set. I assumed that would be obvious by the fact we're discussing array($username, $password), which appears only in the second set. The examples are using PHP PDO (see http://php.net/pdo), for which explicit parameter types are not required. Do you really think a "positional association" for parameters, which is de rigueur for modern programming languages, is worth the risks associated with filtered dynamic SQL? Do you really think code that "is pretty close in size" justifies the risks associated with filtered dynamic SQL?}
- Oh, so you admit now there might be a trade-off? It's no longer a trade-off-free no-brainer? That's a start. Positional association is fine in some places but awkward in others. There are certain kinds of functions where named parameters are preferred over positional, for example. The positional-versus-reference-versus-embedded choice ItDepends on usage patterns etc. If the associations are stretched too far apart, then positional tends to grow awkward. And PDO may not need type indicators, but other PS API's do. It's not a universal trait that you can use as a selling point.
- {Where did I "admit now there might be a trade-off"? There is no "trade-off". There is no justification for using dynamic SQL with external data when PreparedStatements are available and usable. I'm sure there are rare exceptions to this, but I've not yet seen them cited here, and if they are in the future I would expect them to be clearly labelled as rare. In the mean time, minor code authoring and maintenance issues -- whether due to type specifications, positional parameters, or whatever -- are irrelevant compared to the duty of care we owe our clients. Programmers or project managers who selfishly trade their clients' security for (perceived, rarely real) "easy" coding are professionally negligent. Furthermore, consistently using filters in dynamic SQL string construction is considerably more overhead and programming effort than the type specifications or positional parameters associated with PreparedStatements.}
- Why are you leaving the sub-topic of complexity then and slipping back into the security issue? It appears to me you tried to pull a security red-herring when your complexity claims were dashed. That's honestly what it looks like is going on. Please justify the "programming effort" claim.
- {Why am I "leaving the sub-topic of complexity"? Because even though it's self-evident that PreparedStatements are simpler than dynamic SQL with filters (look at the examples!), it doesn't matter. Even if they were an order of magnitude more complex, and for some reason we couldn't simplify them, they would still be worth it for the level of security they provide.}
- You've said that already elsewhere. Let's try not to wander off the complexity sub-topic. The examples are misleading for reasons already given. For example, here's another take:
// Example 2-PS:
$sql = "SELECT * FROM users WHERE USERNAME = ? AND PASSWORD = ?";
$stmt = $dbc->prepare($sql);
$stmt->registerPar($username, "String"); // some PS API's require associating a type
$stmt->registerPar($password, "String");
$stmt->execute();
// Example 2-Wrap:
$dbc->query("SELECT * FROM users WHERE USERNAME = " . sqlStr($username) . " AND PASSWORD = " . sqlStr($password));
.
- {Of course. "Example 2-PS" is a reasonable illustration of using a low-level DBMS API, and if you're building higher-level facilities yourself -- like you want to author a new PHP PDO -- then it's reasonable to employ them. The majority of developers producing database-driven applications use something like PHP PDO, which provides exactly the interface I demonstrated above. Various DBMS APIs require more or less information such as associated types, or means to bind programming language variables to PreparedStatement parameters, or facilities to define named parameters and so on. But PHP PDO is typical.}
- Giving a parameter type is also typical. For example, Microsoft database API's typically include a CreateParameter? method that takes a parameter type as the second argument. (I'd be curious to know how PDO "knows" the parameter type if not explicitly given. SQL dialects typically want to know that info.) And again, positional parameter matching can be difficult to manage for longer queries. While it's nice to have that option, I personally wouldn't want it as the only option.
- {Yes, parameter types are often given. For cases when positional parameter matching is difficult, some PreparedStatement APIs provide named parameters. There are a variety of facilities available.}
- And you now appear to be changing the argument away from "complexity of code" toward "complexity of checking". One has to be careful in BOTH cases as already shown.
- {It's complexity in general, for which avoiding "complexity of code" and "complexity of checking" both appear to strongly favour PreparedStatements.}
- You are being vague. Your metric(s) for calculating complexity are not transparent. More ArgumentFromAuthority? If you are not measuring "code size", then what the [bleep] are you measuring? Describe what you are counting, why, and then count it and show us all the numbers so that we can recreate your counting method for our own verification.
- {Complexity, in this case, is both code complexity and complexity of checking. I haven't bothered with metrics because the simplicity of the PreparedStatement example should be self-evident, especially if you take into account the definition of sqlStringFilter() and its cousins.}
- Well, it's not "self-evident" to me. But we can agree to stop our bickering over that and LetTheReaderDecide if you do not wish to present more formal metrics of "complexity". As far as counting the wrapper library as part of the complexity, that's addressed below.
- {Where is "counting the wrapper library as part of the complexity" addressed below? Shouldn't invoking the wrappers and the complexity of dynamically constructing a string -- required with dynamic SQL, not required with PreparedStatements -- be the most significant factors?}
- I'm not following. I'm looking at the total effort of an app coder/maintainer. I don't see that PS's reduce that overall. It's either roughly the same or more, depending on how "effort" is scored.
- {The total effort of an app coder/maintainer is what I'm considering. PreparedStatements reduce that effort because you don't need to maintain and verify the correctness of filters, or the use of filters.}
- My anecdotal experience is that such filters were NOT a significant source of maintenance effort. They rarely had to be "touched". I suspect they were written by big consultants and "organically" spread around to medium and small consultants such that the original development effort is spread over a lot of orgs and thus diluted in terms of per-org "cost". If that effort is your biggest "complexity" claim factor, then you've made a very poor "complexity" case and are just quibbling about marginal scraps now. -t
- {I suspect the greatest ongoing effort is ensuring that filters are consistently and correctly used, after the initial effort to create and (more importantly) prove the filters correct.}
- So any technique that claims to be "correct" or "good" is obligated to prove it's correct or good. Correct?
- {I'm sure there are numerous techniques that claim to be "correct" or "good" but that have no particular impact, and so incur no obligation for proof. The position of curly braces in C and C-like languages, for example, is a source of much watercooler debate -- the end-of-line people think their way is good, and the beginning-of-next-line people think their way is correct -- but it appears to have no significant impact on programming productivity. It probably doesn't warrant proof, and it certainly doesn't obligate it.}
- {On the other hand, techniques that have a direct impact on security almost invariably warrant proof, but proof can take various forms. Whilst formal proof is ideal, de facto proof is often accepted. In the case of PreparedStatements vs filtered dynamic queries, we have over a decade of de facto evidence that PreparedStatements are potently effective against classic SQL injection. It's why, for example, OWASP lists use of PreparedStatements at the top of the list of SQL injection "primary defenses" at http://www.owasp.org/index.php/SQL_Injection_Prevention_Cheat_Sheet (see also http://www.owasp.org/index.php/Query_Parameterization_Cheat_Sheet). Their ranking is based on over a decade in which hundreds of thousands of Web sites have successfully used PreparedStatements in the same popular DBMS products, plus the claims of vendors that PreparedStatements are safer than dynamic SQL, plus analysis of code for those DBMSs for which we have access to their source code.}
- I followed one of the links to get the "ESAPI" quote below.
- {The problem with filters and dynamic SQL is that your filters are unlikely to have anywhere near the same level of distribution and use as PreparedStatements in popular DBMSs. Most likely, your filters were developed in-house. Thus, they are relatively unverified -- at least, compared to PreparedStatements in popular DBMSs -- and so obligate a much greater level of verification (or proof) in order to be considered as secure as PreparedStatements. Of course, that assumes your filters are used consistently and without fail. Forgetting to use a filter in a dynamic query -- which is easily done -- has no equivalent in PreparedStatements. There isn't something you can trivially omit from a PreparedStatement that will result in classic SQL injection, but you can trivially type $userInput when you should have typed sqlString($userInput) with dynamic SQL.}
- Re: "Forgetting to use a filter in a dynamic query -- which is easily done -- has no equivalent in PreparedStatements" -- This is FALSE. We've been over this already, so why are you dragging out the same dead horse again? Empty repetition will NOT change my mind. It really frustrates me when you guys do that. That behavior resembles brain damage. Prepared statements don't (across all brands) prevent embedding values directly in statements. One still has to remember to use the proper API calls for values. In more complex apps with "libraries" of SQL snippets (to avoid repetition of query idioms), it may be easy for value usage to get lost in the shuffle, especially since some of these snippets will typically hard-wire values intentionally (such as category codes). Thus, one has to know/remember which values are intentional and which are accidental. The "machine" cannot do that.
- {I'm not clear why you claim "This is FALSE." It is true. The only way to accidentally introduce injection into a PreparedStatement is to accidentally use a global variable containing external data when creating a PreparedStatement. What sort of dire programming practices would permit that? If that can happen, your entire codebase is suspect -- SQL injection is probably the least of your concerns. As for inadvertently using the wrong API calls, that's trivial to find using grep or an equivalent utility.}
- What exactly is "suspect"? The technique of libraries or shared snippets is described above under "Shared Sub-Criteria and Repeating Snippets". Note that "shared" is not necessarily "global". It's just factoring 101: some parts are common to other parts and we try to modularize the commonalities rather than repeat patterns/idioms/code. SQL is as much subject to this as app code, especially in legacy systems with lots of historical baggage.
- {In a security-sensitive environment, if you can't immediately identify the provenance of (for example) every variable in your code -- i.e., trivially identify and isolate variables containing external input from internal variables -- you risk all manner of security compromises, including but not limited to SQL injection.}
- It's not clear to me what specifically you are suggesting or comparing. Factoring out common/shared/multi-used idioms often requires creating references, and this is true regardless of whether PS's or wrappers are used. References are references. It's just an ugly universal truth of living with linear (textual) code that we as mortals are stuck with. -t
- {For example, in easy-to-audit secure code, PreparedStatement definitions are defined as static string constants, and are confined to a region of code (say, a module or a class) that may contain only references to static string constants. This makes it trivial to determine whether references are to static string constants or dynamic variables.}
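{For illustration, here is a minimal Python sketch of that convention. The module layout, table, and names are all hypothetical; the point is that SQL text lives only in literal constants, and external input travels only through the parameter list, so an audit of the constants region alone settles the provenance question.}

```python
import sqlite3

# The ONLY region of the application where SQL text may be defined.
# Every definition is a string literal, or is composed solely of other
# literals defined here, so auditing this region alone proves that no
# external data can reach the SQL text itself.
FIND_USER_BY_ID = "SELECT id, name FROM users WHERE id = ?"
ACTIVE_ONLY = " AND active = 1"
FIND_ACTIVE_USER_BY_ID = FIND_USER_BY_ID + ACTIVE_ONLY

# Elsewhere, code references only the constants; external input is
# passed exclusively as bound parameters.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE users (id INTEGER, name TEXT, active INTEGER)")
con.execute("INSERT INTO users VALUES (1, 'alice', 1), (2, 'bob', 0)")

user_supplied_id = 1  # imagine this arrived from a web form
rows = con.execute(FIND_ACTIVE_USER_BY_ID, (user_supplied_id,)).fetchall()
print(rows)  # -> [(1, 'alice')]
```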
- As I mentioned before, if we assume humans "properly stick to shop conventions", then both approaches are "easy" and there is not a difference-maker found here.
- {It's easier to verify that PreparedStatement definitions only contain static strings, or static strings that reference other static strings, than it is to verify that dynamic SQL definitions use the correct wrappers on every external input.}
- You keep claiming this, but never CLEARLY demonstrate it. Repetition does NOT sway me. It may have swayed our mom to give you a cookie, but I am not your mom.
- {But it's trivially true. You can even automate the verification process with a simple parser. Following a graph of references and verifying the types of the nodes is easy undergraduate ComputerScience.}
- Calling it "trivial" does not get you off the hook from evidence. Sounds more like dodging. And what keeps one from making a similar gizmo for wrappers?
- {Let's take a look at it. See ValidatingPreparedStatementDefinitions.}
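{As a taster, here is a minimal sketch of such a verifier using Python's ast module. It checks the static-strings-only convention described above: every top-level binding must be a string literal, or be built solely from string literals and previously-verified names. A production verifier would also handle imports, f-strings, and nested scopes; the example sources are hypothetical.}

```python
import ast

def sql_definitions_are_static(source: str) -> bool:
    """Return True if every top-level assignment binds a string literal,
    or an expression built only from string literals and names already
    verified as static -- i.e. no dynamic data can reach the SQL text."""
    static_names = set()

    def is_static(node):
        if isinstance(node, ast.Constant):
            return isinstance(node.value, str)
        if isinstance(node, ast.Name):
            return node.id in static_names          # follow the reference graph
        if isinstance(node, ast.BinOp) and isinstance(node.op, ast.Add):
            return is_static(node.left) and is_static(node.right)
        return False                                # anything else is suspect

    for stmt in ast.parse(source).body:
        if not isinstance(stmt, ast.Assign):
            return False                            # only plain assignments allowed
        if not is_static(stmt.value):
            return False
        static_names.update(t.id for t in stmt.targets
                            if isinstance(t, ast.Name))
    return True

GOOD = 'BASE = "SELECT id FROM users WHERE id = ?"\nACTIVE = BASE + " AND active = 1"\n'
BAD = 'Q = "SELECT id FROM users WHERE name = \'" + user_name + "\'"\n'

print(sql_definitions_are_static(GOOD), sql_definitions_are_static(BAD))  # -> True False
```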
- One can accidentally "leak" values into PS sql strings also if they are not careful. The "machine" is not outright preventing it. Do you want to repeat and lose that argument again? You must be a glutton for punishment.
- {Your memory of the "outright preventing it" threadlet is certainly different from mine. Can you demonstrate how one can accidentally "leak" values into PreparedStatement SQL definition strings?}
- See sub-thread at PageAnchor HumanDiscipline01
And your complexity measurement seems to count the implementation of the filters but not the complexity of the PS libraries. Why is one library counted heavier than another? Just because you buy them and/or get your libraries from a big company and/or pay for them, does not automatically make them "less complex" to manage as part of a shop's tool stack.
{What "PS libraries"? PreparedStatements are part of the connection library (ODBC/JDBC/native) and the DBMS itself. With dynamic SQL, you need exactly the same connection library, so for PreparedStatements you need nothing extra.}
If you don't use part X of library Y, then you typically don't have the problems/tradeoffs associated with part X. Your "complexity math" is goofy.
{What "problems/tradeoffs associated with part X" exist here? How is my "complexity math" goofy?}
Almost any non-trivial API or set of API's will carry with them their own annoyances. And as described above, it's very unclear how you are measuring "complexity". I'm focusing on the efforts of the application writer/maintainer. Whether the API's used are from Oracle or the shop itself isn't going to matter much from the app maintainer's perspective. If you want to talk about "complexity" from somebody else's perspective, then name them.
{I'm using "complexity" in a loose sense, as it's obvious that the PreparedStatement examples are more readable and simpler, especially given we don't need to consider the complexity of the SQL filter functions. If you're relying on the "shop itself", you are at risk of the filter functions being unreliable. Isn't it better not to have to rely on filters at all?}
Yeah, "loose sense" indeed. And your sense of "obvious" has consistently proved deviant. And PS's may use filters under the hood. We don't know for sure until we open them. We only have anecdotal and indirect evidence about that, which again is NOT strong enough to justify "professional malpractice".
{If PreparedStatements used filters "under the hood" with (presumably) query re-writing, we'd expect their execution speed to be the same when executed repeatedly as when executed once, and we'd expect that speed to be equal to (or even slightly slower than, due to the filters) executing the equivalent dynamic SQL statement. In popular DBMSs that is not the case -- executing a PreparedStatement is normally considerably faster than the equivalent dynamic SQL. That strongly implies that executing a PreparedStatement is -- as we can see in PostgreSQL and MySQL -- executing compiled code (such as a query plan or parse tree) and not filtering and re-writing a SQL query.}
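{The shape of that effect can be shown even in an embedded DBMS, though this is illustrative only and not a substitute for measuring the products named here. In Python's sqlite3 module, executemany compiles one statement and re-binds values, while each distinct dynamic statement text must be parsed and planned from scratch. Absolute timings vary by machine, and the Python-level loop adds overhead of its own.}

```python
import sqlite3
import time

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE t (v INTEGER)")
rows = [(i,) for i in range(20000)]

# One statement, compiled once, re-executed with fresh bindings each time.
t0 = time.perf_counter()
con.executemany("INSERT INTO t (v) VALUES (?)", rows)
prepared_s = time.perf_counter() - t0

# Equivalent dynamic SQL: every row yields distinct statement text,
# so each one must be parsed and compiled anew.
t0 = time.perf_counter()
for (v,) in rows:
    con.execute("INSERT INTO t (v) VALUES (%d)" % v)
dynamic_s = time.perf_counter() - t0

print("prepared: %.3fs  dynamic: %.3fs" % (prepared_s, dynamic_s))
```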
If you've done such tests, then please present the evidence. That's what this is all about: evidence to back claims. Further, hashing or check-sums can be used to detect if the same query is being issued.
{Had I known this debate would come up in my future, I'd have recorded figures. I've used SQL Server, Oracle Database, DB2, MySQL, Sybase and PostgreSQL either in production or in-depth testing. Their PreparedStatement performance is always as I've described. If you don't believe me, try it for yourself. If the same query is being issued, using filters and query re-writing it would still have to be parsed and compiled -- there would be no performance gain even if the same query is recognised.}
That's understandable, but if presentable evidence doesn't exist then it doesn't exist. That's just the way it is, and the result is anecdotal evidence. And you haven't explained why it would have to be re-parsed (if hashing is used to detect same-ness).
{If you're using filters and query re-writing, every time a parameter value changes you have to re-write the query. Otherwise, you're just executing the same query more than once. As for anecdotal evidence, the industry has been using PreparedStatements for over a decade to avoid classic SQL injection and expedite multiple executions of parametrised queries. That's what they're for, and that's why they exist. If that's "anecdotal evidence", so be it. I can imagine performance tests were done when PreparedStatements were first introduced, but now their characteristics are treated as given and we have no evidence to suggest otherwise. If there existed a popular DBMS where PreparedStatements were always as slow to execute as corresponding dynamic SQL -- because they're implemented internally using filters and query re-writing -- there would be complaints sprayed all over the Web.}
I thought you meant the exact same query, not value changes. Anyhow, you mention alleged performance benefits and alleged security benefits. Then suggest they are primarily "for" security. It appears to be a contradiction. How do you know the actual single/primary reason of usage if they have two areas of benefits?
{I don't understand -- if you're executing the exact same query repeatedly, presumably it's not parametrised, right? Indeed I mentioned performance benefits (they're not just alleged -- try it yourself) and security benefits (they're not just alleged -- try it yourself), but where did I suggest they are primarily "for" security? PreparedStatements provide both performance improvements for multiple executions of a parametrised query and protection against classic SQL injection, though in Web development, protection against classic SQL injection is often cited as the primary reason PreparedStatements are being used.}
Re: "but where did I suggest they are primarily "for" security?" -- It appears your text has been changed to include both factors on roughly the same level. I won't rule out that I remembered it wrong, but it does not match my memory of it at all now. Anyhow, I don't dispute that PS's are one way to reduce injection problems, but they are not the only way.
{I have changed no text. Review the PageHistory and see for yourself. I don't argue that PreparedStatements are the only way to reduce injection problems either, but they're certainly the most secure and reliable way.}
Most of the common dynamic languages don't have "constants". Does this mean that if one uses shared SQL snippets they should be successfully sued for malpractice for NOT using a static language? A whole lotta IT people would be in jail or transformed into Walmart greeters for not going full-up SafetyGoldPlating on everything. I'm beginning to suspect "wallet bias" (AreWeBiasedTowardLaborIntensive). -t
In the absence of language-defined constants, a good developer uses other facilities of the language to make security-critical code -- like PreparedStatement definitions -- as secure and audit-able as possible.
That's pretty much the same for ANY technique: human procedures, human conventions, and human discipline. It's not a difference maker here.
While we're at it, can somebody point me to some filter software that claims to provide security? Maybe I'm just not familiar with what's out there, while every major RDBMS vendor can be pointed to as a vendor of PreparedStatement functionality. I feel like we're comparing apples to blue sky.
See ESAPI Quote below.
[Okay, stupid question. What sort of wrapper implementations are we talking about here? When we talk about PreparedStatements, I can point to DB2, Oracle, Sybase, etc. I haven't heard anything about securing SQL with wrappers outside of this wiki. Top, if you're recommending that we use wrappers, can you recommend a vendor or post some source if you have a home-brew one that you recommend? If there are none, then this whole argument is comparing apples to blue sky.]
I'd like to see that too. I asked him a similar question on TopOnPreparedStatements and received code that only did quote escaping. To be fair, he only claimed that it would do quote escaping, but it does make "empirical" comparisons difficult when one of the things being compared appears to be a non-existent ideal.
I am not recommending one use wrappers. The issue at hand is whether doing such is "professional malpractice", as Mr. Handlelack asserted. The ones I've used in the past were probably shop-rolled, but I never dug into the pedigree deeply. I was just told to "use them" by the project manager etc. And in some cases I rolled my own where "simple" values were being passed, such as ID numbers or category & location codes. I filtered them for say digits or digits-plus-letters such that quotes and "funny punctuation" couldn't get through. -t
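Such a shop-rolled whitelist filter might look like the following minimal Python sketch. The function names and rules are hypothetical, reconstructed from the description above: permit only digits (or digits plus letters), so quotes and "funny punctuation" cannot get through at all.

```python
import re

def clean_id(value: str) -> str:
    """Hypothetical whitelist filter: digits only, so quotes and
    punctuation cannot pass through to the SQL text."""
    if not re.fullmatch(r"[0-9]+", value):
        raise ValueError("rejected id: %r" % value)
    return value

def clean_code(value: str) -> str:
    """Hypothetical whitelist filter for category/location codes:
    digits plus letters only."""
    if not re.fullmatch(r"[A-Za-z0-9]+", value):
        raise ValueError("rejected code: %r" % value)
    return value

print(clean_id("1042"))    # passes through unchanged
print(clean_code("NW3"))   # passes through unchanged
try:
    clean_id("1 OR 1=1")   # spaces and '=' are not in the whitelist
except ValueError as e:
    print(e)
```

Note that a whitelist of this sort rejects rather than escapes, which sidesteps the escaping pitfalls entirely for simple values, but it only works where the legal value space is that narrow.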
So we are trying to compare PreparedStatements to some non-existent ideal. Thanks for clarifying that.
It's unfortunate you've never encountered them out there in the field.
I, for one, am quite happy not to work for shops which use them. Still, I find it quite telling that you can't produce an actual implementation of "wrappers done right".
If I get around to it, I'll post examples in ParameterCleaningApi. And like I already pointed out, I did not write most of the ones I used, I simply used (referenced) them and did not study most of them in detail. It's possible many of those shops have since implemented the guts of them with PS's, which is the power of abstraction at work.
How would one use a PreparedStatement in the guts of a wrapper? From what you've been describing, the wrapper wouldn't even have access to the SQL.
Sometimes a shop also wraps the actual SQL invocation. If not, one can wrap those also, although it requires combing the code.
{What does "combing the code" mean?}
I'm pretty sure he meant "combining", not "combing". Still, it leaves us comparing real implementations of PreparedStatements to some nebulous ideal that can have just about any property Top wants at any particular point in time.
No, I meant "combing". And was your repetition of your complaint about lack of API's to analyze really necessary? Go nag your spouse, not me.
Then how do these wrappers make use of prepared statements? Or is this yet another non-existent ideal?
I explained in one of these PS topics already. I'll need to comb the topics.
Whether using wrappers is "professional malpractice" depends on the definition of professional malpractice. I've been in the biz for twenty years, and for all that time, large parts of the job were discovering pieces of code that went from "stupid" to "mind-bogglingly stupid". Some of the stupidity was mine, and some was from others. It's not all true stupidity, either; most of it is ignorance, and every time one of us grabs a new tool, we are full of ignorance yet again. In a world like that, my bar for "professional malpractice" has to go beyond "mind-bogglingly stupid" and straight to "malicious", like adding a back door or lying to the client or something like that.
If your RDBMS has PreparedStatement capability (i.e. comes from a major vendor), and you choose to ignore it and to use unprepared SQL with some homebrew wrapper library instead, I would classify that "stupid", possibly adding "mind-bogglingly" if you're supposed to be a DBA or other DB expert. But it's not "professional malpractice". Presumably, Handlelack has different standards. --RobMandeville
I do have a different standard. I believe strongly that deliberately choosing weaker and/or more complex security over stronger and/or simpler security -- when all other conditions are equal -- is professional malpractice. In most cases, our clients have neither experience nor training to appreciate the implications of programming choices, so it is up to us to make the right decisions. To do otherwise is unprofessional and unethical.
{You have a different standard on just about everything. All other conditions may not be equal. For example, it may increase VendorLockIn. -t}
Ah, I get it: Better to risk having your customer credit card database compromised via classic SQL injection, than you should have to employ a database abstraction layer or spend minimal time, cost and effort to maybe have to re-code the prepared statements when migrating from one DBMS to another. (How often do real projects migrate from one DBMS to another?) Pro-tip: Building a database abstraction layer to handle various vendors' SQL syntax variations is easy, and there are numerous good off-the-shelf ones.
{One gives the shop recommendations, but ultimately it's management's or the owner's decision. I'm not God, just a rented developer; often a mere grunt in their eyes. And it's fair for owners/managers to ask for direct empirical evidence. If there is some magic Rule of Evidence that gives security issues immunity from the burden of empirical evidence, you have not identified/established the existence of such a rule. (And database abstraction layers are often NOT trivial. Details and vendor/version variations often gum things up for non-trivial queries. Nulls and date/time encoding/formatting issues alone have turned many grey.) -t}
Verification of security has already been discussed. Do you remember the answer?
- Something like, "there are very few known logged injection problems with PS's". But there are also very few known logged problems with wrappers.
- No, it's that empirical testing for security is essentially useless compared to analysis. Anyway, even if there are no known logged problems with wrappers, there are fundamental problems with wrappers that simply do not exist with PreparedStatements: With wrappers, you have to prove they are correct, and you have to prove that they are always used and used correctly.
- You've neither proved anything nor shown why empirical testing should be dismissed.
- I don't have to prove anything -- it's the developer of the wrappers who has to prove they are correct, and the user of the wrappers has to prove they are always used and used correctly. Filter functions are brittle -- a trivial mistake (either inside the filter, or outside like forgetting to use the filter) can have serious negative impact by permitting classic SQL injection. PreparedStatements are not brittle -- a trivial mistake might result in a query that doesn't run or generates incorrect results, but it doesn't expose the application to classic SQL injection. As for empirical testing for security, it's ineffective. For example, testing for buffer overflows by passing an ever-larger string to a routine and checking for (say) a crash only proves that maybe it's ok with the largest string we've tried -- string length 'n' -- but it still might fail with length n + 1 or it might crash eventually due to overflowing the buffer but didn't do so during the testing period.
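- The brittleness contrast is easy to demonstrate against an embedded DBMS (SQLite here; the table and payload are hypothetical). The bound parameter treats the payload as a value, while unfiltered concatenation lets the same payload rewrite the WHERE clause:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE users (name TEXT, secret TEXT)")
con.execute("INSERT INTO users VALUES ('alice', 's3cret')")

payload = "nobody' OR '1'='1"  # a classic injection probe

# Bound parameter: the payload is compared as data; it matches no row.
safe = con.execute("SELECT secret FROM users WHERE name = ?",
                   (payload,)).fetchall()

# Unfiltered concatenation: the payload becomes part of the statement.
unsafe = con.execute("SELECT secret FROM users WHERE name = '%s'"
                     % payload).fetchall()

print(safe)    # -> []
print(unsafe)  # -> [('s3cret',)]
```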
- Oracle hasn't publicly "proven" their PS's "are correct". Double standard. Your points are merely summaries of poorly-backed claims you've already made. I won't repeat replies here in the spirit of OnceAndOnlyOnce. Repetition of summary claims does not make them true. I do not change my stance based on your repetition. If anything, repetition from you is usually a sign that your details are weak such that you use mass repetition of summary claims instead to hide or bury your weak details.
- I've seen no published evidence anywhere, in the past ten years that PreparedStatements have been popular, to suggest any implementation of PreparedStatements is subject to classic SQL injection. Given how they're implemented -- which is well understood in the developer community -- that's hardly surprising. What you call repetition, by the way, tends to occur when you appear to misunderstand what we've written.
- And I am not saying that empirical testing is sufficient, but it's still necessary for the level of "professional malpractice". One should not rely on algorithm analysis alone to make sure big strings don't "bust stuff". Good security testing should use BOTH empirical (I/O) AND algorithm analysis.
- Empirical testing is certainly appropriate for verifying security problems once analysis has identified them. Empirical testing is notably poor at identifying security problems in the first place.
- Can you show evidence of that? A good many breaches are discovered when the hacker empirically hacks into the box. It doesn't get any more empirical than that.
- That's not empirical testing; hacking of the sort you describe is usually the result of hackers analysing source code or disassembled object code, or using distributed attacks that exploit multiple machines to run many trial-and-error attempts in parallel. As I pointed out above, empirically testing for buffer overflows by passing an ever-larger string to a routine and checking for (say) a crash only proves that maybe it's ok with the largest string we've tried -- string length 'n' -- but it still might fail with length n + 1 or it might crash eventually due to overflowing the buffer but didn't do so during the testing period.
- I don't want to get into a LaynesLaw loop over "empirical" here. Regardless, actual "big string" tests SHOULD STILL be done even if they are not 100% thorough. I never claimed it was a thorough test, but it doesn't have to be because ideally BOTH I/O tests AND algorithm analysis should be done. The fact that I/O-based tests are not 100% thorough is NOT a reason to dismiss them. They are part of a larger "testing package". "Skip them because they are not perfect" is an illogical stance.
- Hunting in the dark for security issues is a pointless waste of resources. Unless there's a reasonable hypothesis to suggest an I/O based test is likely to prove fruitful, analytical approaches -- even to the point of disassembling object code -- are a far better use of resources and far more likely to succeed.
- False dichotomy. I didn't rule that out.
- You didn't rule what out?
- You make it sound like a dichotomy. I am NOT proposing a dichotomy. Again, both approaches ideally should be done for reasons already given. An actual "big string" test should be run AND the source code should be inspected to make sure it handles big strings, for example. We shouldn't trust reading code alone. If reading code alone was sufficient, then we wouldn't need to test our software before going to production; which almost never flies in the real world. -t
- Searching for security failures by exhaustive search is grossly ineffective and proves nothing. If you've demonstrated that opaque function f() works with all inputs up to 'n', we still don't know if it will fail with 'n+1'. Exhaustive search is sometimes the only available option with closed source, but for that reason it can't be considered secure. At best, it can only be considered somewhat tested. Empirical testing is an excellent way to verify security exploits once they've been discovered analytically (or found in the wild), but terrible for identifying exploits in the first place.
- Who said anything about an "exhaustive search"? Common sense tells normal technicians that large strings should actually be tested in a live run against such a tool if you want to claim "well-tested" (in addition to code inspection). Nobody here is claiming that's "exhaustive". I'd call it "empirical sampling" as a working term. I'm not suggesting one should expect to sample every possible input value. Past history suggests AT LEAST these empirical tests:
- Embedded quotes, single and double
- Lopsided quotes, such as 4 opening quotes and 3 closing quotes
- Punctuation and symbols (!@#$%^&*_+=, etc.) - especially if not expected for a given input source
- Blank values
- Negative values (if a number)
- Letters if numbers expected
- Extremely large numbers or input strings, or large dates if a date field (1/1/9999).
- Extremely small numbers, such as 0.0000000000000000000000001
- Scientific notation
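- As a sketch, probes of the kinds listed above can be driven through a parameterized query to confirm that each value round-trips as data rather than altering the statement (SQLite is used here for illustration; the probe list is abridged and hypothetical):

```python
import sqlite3

# Hypothetical probe values drawn from the checklist above.
PROBES = [
    "it's",                          # embedded single quote
    'say "hi"',                      # embedded double quotes
    "''''lop'''",                    # lopsided quotes (4 open, 3 close)
    "!@#$%^&*_+=",                   # punctuation and symbols
    "",                              # blank value
    "-1",                            # negative value
    "abc",                           # letters where numbers expected
    "9" * 5000,                      # extremely large input string
    "0.0000000000000000000000001",   # extremely small number
    "1.5e308",                       # scientific notation
]

def round_trips(con, value):
    """A bound parameter must come back exactly as it went in, or fail
    cleanly -- never by altering the statement itself."""
    return con.execute("SELECT ?", (value,)).fetchone()[0] == value

con = sqlite3.connect(":memory:")
print(all(round_trips(con, p) for p in PROBES))  # -> True
```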
- These verify correct behaviour, and correct behaviour should certainly be verified through empirical testing. However, empirical testing is poor at finding insecure behaviour. For example, a filter might pass the byte 0x22 as-is, unquoted, to a DBMS that treats it as a double-quote, because 0x22 is the hexadecimal ASCII value for ". Did you remember to test for it?
- Again again again, such tests are not intended to be exhaustive.
- If you can't provide reasonable proof that your filters are secure, then what good are they?
- That applies to both sides. And I don't know what your reply has to do with the statement above it. Empirical testing is necessary but not sufficient to judge "good security" on. Is that not clear? --top
- Then what point are you trying to make? My original point was that "empirical testing for security is essentially useless compared to analysis", to which you appeared to disagree. If you're claiming that empirical testing is not useless because it tests if the code meets its functional requirements, that's fine, but it has nothing to do with whether it's secure or not.
- That's like comparing offense to defense as part of a ball-game strategy. It's a useless comparison. BOTH are usually needed to win big.
- Can you provide evidence of "empirical testing" that identifies security exploits?
- I cannot provide a public example. As far as anecdotal evidence, I've found flaws in others' software using tests similar to the list above. (I sometimes didn't have access to the source code, but suspect they used zilch filters since all known parameters failed.) But I'd bet money that most readers (all 0 of them) will agree that tests similar to above should be done regardless of any other additional tests and/or examinations done. I'm not proposing a mutually exclusive choice. We are going around in circles here: time to LetTheReaderDecide.
Actually, database abstraction layers are trivial.
You write, "one gives the shop recommendations, but ultimately it's management's or the owner's decision". That's fine, but it appears -- from this page and TopOnPreparedStatements -- that your recommendation would be to prefer dynamic SQL with filters over PreparedStatements. Is that true?
I'm only going to address the issue of the document-able trade-offs between techniques. I'll leave my personal preference out of this. -t
You mean the trade-offs already delineated toward the top of TopOnPreparedStatements?
That's a poorly written intro. Burn it! I see no use in it for the original question.
How so? What would improve it?
You ignored and deleted my suggestions the first time, so why the hell should I think a second try would work?
Put them at the end, rather than increasing ThreadMess by embedding them inline.
Then there would be a long stretch of long-winded misleading info without any kind of notice or disclaimer of the misleading nature.
You still haven't identified what you think is misleading, only that you think it's misleading.
I did, but you deleted that detail if I remember correctly.
I don't recall that, but if I did it's probably because you were re-introducing ThreadMode into something where I was trying to reduce it. Please put your comments at the end, instead of ruining the flow of the text by injecting comments in the middle.
Yeah right, ThreadMode is all my fault and none of it yours. And that intro was born "ruined". I was trying to improve the information value of it. I don't "inject comments" without a reason.
You've repeatedly alleged it's "ruined" or the like, with no indication as to why you think it's "ruined".
You ignored and deleted my fixing suggestions the first time, so why the hell should I think a second try would work?
Because I'm now aware of how much my (alleged) deletion irritated you, and I've given you advice on how to present your criticisms so they won't be deleted, i.e., don't inject them in-line. Stick them at the end and I won't delete them.
If the position alone was the issue then you should have moved it instead of deleted it. Duuuuuh!
Yes, that was a bit of a dick move on my part, assuming I did it. I don't remember deleting any criticisms of the allegedly "ruined" introduction.
I believe that Prepared Statements do not sit too well with TQL. -- ChaunceyGardiner
You mean SmeQl? Sure they will, if you send me a check for $250k :-)
ESAPI Quote
Quote from http://www.jtmelton.com/2009/12/01/the-owasp-top-ten-and-esapi-part-3-injection-flaws/
- "...there are a couple of things you should consider before implementing protection. First, PreparedStatements are far more prevalent in industry. They are a solid solution to the SQL injection problem, and are widely understood and available. They’ve also been very widely tested both for performance and security. Essentially, they are a known entity. However, they do not solve any of the other types of injection. The equivalent to PreparedStatements do not exist for XML or XPath injection or others. You’d have to handle them differently. As for ESAPI [filter system], it will also work well, and has been tested for security and performance, but not nearly as much. Additionally, it does not have the industry reach that PreparedStatements do. However, ESAPI has a solution for the other types of injection as well, and a clear model for implementing new encoding mechanisms for new interpreter types. You can even write your own custom encoders if you choose." [Emphasis added]
Although the author believes PS's to be superior (for SQL issues), they seem to be saying that the ESAPI filters have been adequately tested and "work well". This is a far cry from "professional malpractice". -t
You've not indicated you're using ESAPI -- which would slightly mitigate my concerns -- but in-house SQL filters. Furthermore, OWASP (at http://www.owasp.org/index.php/SQL_Injection_Prevention_Cheat_Sheet) has this to say about ESAPI (for SQL) and similar filters:
- "This third technique [filtering] is to escape user input before putting it in a query. If you are concerned that rewriting your dynamic queries as prepared statements or stored procedures might break your application or adversely affect performance, then this might be the best approach for you. However, this methodology is frail compared to using parameterized queries and we cannot guarantee it will prevent all SQL Injection in all situations. This technique should only be used, with caution, to retrofit legacy code in a cost effective way. Applications built from scratch, or applications requiring low risk tolerance should be built or re-written using parameterized queries. This technique works like this. Each DBMS supports one or more character escaping schemes specific to certain kinds of queries. If you then escape all user supplied input using the proper escaping scheme for the database you are using, the DBMS will not confuse that input with SQL code written by the developer, thus avoiding any possible SQL injection vulnerabilities." [Emphasis added]
If you wish to make a distinction and case between "organized" filters and shop-rolled filters, please do.
- ESAPI is OpenSource, and so has had the benefit of thousands of eyes of examination, which is unlikely the case with shop-rolled filters. The only distinction is that the former is marginally more likely to be secure, but we're talking in infinitesimals here. Neither is good, as strongly stated in the OWASP "SQL Injection Prevention Cheat Sheet" quote above.
- One could argue a case for SecurityThroughObscurity. I'm not claiming either way, only pointing out there are multiple competing principles here and the one that "wins" is purely conjecture at this point. We need yet more conjecture like a hole in the head.
- SecurityThroughObscurity is the poorest form of security.
- I disagree. ItDepends. I'd judge it to be more unpredictable, but that's not necessarily the same as "bad". But let's not debate general SecurityThroughObscurity here. It's off-topic.
That opinion appears to be stronger on the difference than the one I quoted, but we simply have two "professional opinions" that vary. That's hardly strong enough evidence to qualify as "professional malpractice" if somebody favors the first opinion over the second. ("My expert witness can beat up YOUR expert witness.")
Your "professional opinion" puts your clients at greater risk than mine, without any benefit to your clients and with no apparent cost to you other than a slightly different coding style. That's strong enough to qualify as professional malpractice.
- That's only a claim. Claims are easy to come by.
- True, but this is a Wiki. Claims are all we've got.
- No it's not. See below.
- See what below?
- "A fails B% of the time per...". And if claims are all we have, then let's agree that we have low-grade evidence, period.
- No, you have "low-grade evidence" -- i.e., none -- to suggest that (contrary to general industry understanding) PreparedStatements are subject to SQL injection, or that filters and dynamic SQL are generally as secure against SQL injection as PreparedStatements. I have provided various evidence to the contrary.
- Bull. You only present ArgumentFromAuthority or metrics that give both sides equal scores.
- How so? I'm quite certain my metrics don't give both sides equal scores. Can you demonstrate otherwise?
- You have not identified more failures with ESAPI than with say Oracle PS's, for example.
- Of course. ESAPI (or at least the SQL filter parts) was intended by its authors to be as secure as possible, in those circumstances where it's infeasible to employ PreparedStatements, but even its authors note that its use of filters "is frail compared to using parameterized queries and we cannot guarantee it will prevent all SQL Injection in all situations." Thus, ESAPI does not demonstrate "equal scores", security-wise, compared to PreparedStatements. This has already been covered. Go to the paragraph above starting with "You've not indicated you're using ESAPI..."
Re: "we cannot guarantee it will prevent all SQL Injection in all situations." -- Can they guarantee that with PS's? (Note that I'm not sure a strong distinction can be made between "SQL injection" and all possible injections. I haven't seen a clear-cut definition boundary so far, but don't wish to get bogged down in a LaynesLaw mess over such definitions if possible.) -t
Whilst I doubt anyone would be so foolish as to make 100% guarantees about anything that runs on a computer, OWASP certainly isn't shy about ranking PreparedStatements above use of ESAPI with dynamic SQL, and strongly recommending that applications "should be built or re-written using parameterized queries."
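One concrete reason OWASP ranks filters below parameterized queries: a quote-escaping filter only defends quoted string contexts. Here is a hypothetical shop-rolled filter (the `escape` function and the table are invented for illustration) bypassed in a numeric context, where the developer writes no quotes at all, again sketched with Python's sqlite3:

```python
import sqlite3

def escape(value):
    # A typical shop-rolled filter: double any single quotes.
    return str(value).replace("'", "''")

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (id INTEGER, balance INTEGER)")
conn.execute("INSERT INTO accounts VALUES (1, 100)")
conn.execute("INSERT INTO accounts VALUES (2, 200)")

# In a numeric context there are no quotes for the filter to escape,
# so the payload passes straight through into the SQL text.
user_id = "1 OR 1=1"
rows = conn.execute(
    "SELECT balance FROM accounts WHERE id = " + escape(user_id)
).fetchall()
print(len(rows))  # 2 -- the filter was bypassed

# The parameterized form sends the value out-of-band; the non-numeric
# input is compared as data and matches no row.
rows2 = conn.execute(
    "SELECT balance FROM accounts WHERE id = ?", (user_id,)
).fetchall()
print(len(rows2))  # 0
```

The point is not that a filter cannot be written to cover this case, but that each context (quoted, numeric, LIKE patterns, identifiers) needs its own correct handling, whereas the parameterized form is structurally indifferent to context.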
That's still ArgumentFromAuthority. There are no objectively verifiable metrics, for example.
What would be an "objectively verifiable metric" for a mechanism like PreparedStatements that structurally precludes SQL injection?
I already gave an example, something like "A fails B% of the time per installation while C fails D% of the time per installation". In a decent study, one can follow the citations back to the source to inspect the details of the study.
There is no evidence in the literature that PreparedStatements have ever failed in a DBMS in a manner that resulted in SQL injection. Thus, their apparent failure rate is 0%.
Same applies to ESAPI. Booyah!
Of course. ESAPI (or at least the SQL filter parts) was intended to be as secure as possible, in those circumstances where it's infeasible to employ PreparedStatements. This has already been covered. Go to the paragraph above starting with "You've not indicated you're using ESAPI..."
You didn't qualify your hatred of filters at the beginning.
I'm not sure what point you're trying to make. I don't "hate" filters. I regard them as representing a greater SQL injection security risk than PreparedStatements, for -- at best -- questionable gains that (if there are any gains at all) will only benefit the developer and not the client. For that reason, I reject them.