Variable Declaration Prevents Typos

Having successfully built systems in both statically typed and dynamically typed languages, I find myself still on the fence in the static vs. dynamic typing debate. At times when I'm working in Java, I find myself mildly annoyed by what seems like excessive and superfluous type-casting. At other times when I'm working in Python, I encounter bugs that would have been eliminated at compile-time in a language like Java.

Of all the bugs that are indigenous to dynamically typed languages, the one I encounter most is the misuse of an uninitialized variable due to a typo. For example:

 var = 1
 while var < 256:
         print "looping..."
         varr = var * 2
 #          ^ typo causes infinite loop

A statically typed language would have caught this typo at compile time, since varr wasn't declared, although this error has nothing to do with typing, per se. In practice I find that relatively few of my bugs in dynamically typed languages are actually caused by the misuse of one type as another.

Have others had similar experiences? Is it possible that the biggest benefit of static type checking is actually the static namespace checking that it requires? Could dynamically typed languages realize much of the protection provided by statically typed languages, simply by requiring variables to be declared upon first use? For example:

 dim var = 1 # Hypothetical variable declaration required using dim
 while var < 256:
         print "looping..."
         varr = var * 2
 #          ^ typo causes namespace violation instead of infinite loop

-- CurtisDuhn

Smalltalk 80 required variables to be declared (but not typed) for this very reason. The spelling corrector would happily add or remove these declarations too. Although this saved many typo bugs I still found the process tedious. Thankfully Java (and strict Perl) lets me declare a variable at the point of first use. This seems just as effective at finding typos. -- WardCunningham

In the PerlLanguage at least if you use 'strict' you'll get something like Global symbol "$varr" requires explicit package name at var.pl line 5., admittedly a bit obscure. If you turn warnings on, you'll get "Name "main::varr" used only once: possible typo at var.pl line 5.", which is clearer. --StevenNewton

CommonLisp compilers will issue warnings about undeclared variables being assumed special. That pretty much takes care of that problem. So, all in all, it doesn't seem to be the lack of static type declarations which is the problem, it's poor programming environments. Is Python that lame? [I've never used it].

Well, Lisp is a more interesting situation because you have both assignments and bindings, and you can't assign to a variable which isn't already bound to some value. For example:

 (let ((one 1)
       two)
      (incf 1)
      (setq two 3))

The (let) creates bindings of 'one' and 'two', which are visible in its body. It binds 'one' to 1 and 'two' to the default of nil. Both the (incf) and the (setq) perform assignments to the existing bindings.

The exception, as mentioned above, is 'special' variables, which are roughly equivalent to globals in other languages. It is highly unlikely that these would be accidentally created when using Lisp non-interactively (ie, as a compiler); when using it interactively, most implementations generate warnings to prevent that. Special variables are none too common to begin with.

So I have to say that it's not the programming environment which grants Lisp its immunity to this problem; it's the language's distinction between binding and assignment.

Well, all I meant is that if you do the parallel mistake in lisp, i.e.

  (let ((var (get-some-var)))
      ...
      (incf varr)  ; TYPO!
      ... )

You would get a warning at compile time about varr being declared special. Since all lispers I know of follow the *foo* convention for special vars, you'd know right away that you have a typo.

But a counter factor is that lack of declaration shortens the code, making it quicker to review. [Quicker to review... evidence?] I find heavy typing and declarations gum up code with formalities that slow down or hinder reading. I've heavily used both styles over the years and I know what makes *me* more productive. Overall, leanness is a good thing. (True, it may also depend on the domain.) This is a classic scripting-versus-formality HolyWar. Related: HowImportantIsLeanCode. --top

I'm not certain you could even argue for a relevant reduction in code-size, here, considering the often minimal space required to declare a variable in most languages (e.g. 'dim' or 'var'). Arguing a quicker review would be much more difficult since (1) a proper review now requires the extra step of searching extra hard for typographical errors in variable names, and (2) 'dim' would make it clearer to the reviewer that it is the first place a variable is available in a scope, precluding the need to check for earlier uses of it - which, for methods much longer than your thumb, is quite valuable. And that's ignoring potential local/global scoping issues with undeclared variables.

How is this an issue for methods? You cannot assign to them such that the "varr" example above does not apply.

It is an issue for long methods, or long scopes in general. People - and the vast majority of your code reviews will be performed by these forgetful beasts - can remember about seven things, plus or minus two. That 'var' happens to be the correct variable is one of those seven things, and they'll forget this in longer scopes because there are a lot more than seven things to remember. So, if they looked and saw a variable named 'varr', they'll need to go back and look to see how that variable is used elsewhere and whether you're accidentally overwriting it (and this is worse if there is potential for crossing into global scope). If, on the other hand, they see 'dim varr', they do not need to do so, and so the review is that much faster. You were arguing that the 'dim' makes things slower because it is three more characters to read. How do you justify that?
I'm still not clear on the problem you are trying to prevent. Perhaps a code sample may help.
I'm left baffled as to how you thought I was attempting to prevent a problem. You claimed that "lack of declaration shortens the code, making it quicker to review". I believed this to be an unreasonable conclusion, and I presented reasons that the presence of declarations can make things quicker to review - a conclusion that contradicts your own, my ultimate point being that "arguing a quicker review would be [...] difficult". I have not presented a problem and an approach to prevent it.
Then why are they doing a code review?
You tell me, Mr. "But a counter factor is that lack of declaration shortens the code, making it quicker to review." - you're the one who made code review a pertinent factor to this discussion.
I don't know what the exact problem or goal you are trying to achieve or prevent. Its just not clear. There appears to be some unstated assumptions in it. If we want to know what is outside the scope of the routine, it may be more logical to declare what is outside instead of what is inside, since the outside scope is usually less. But this is a different issue.
What part of "I have not presented a problem and an approach to prevent it" did you fail to comprehend? Pointing out that your argument isn't reasonable, and why it isn't reasonable, doesn't require I invent a problem to solve.
I give up. I find your writing style difficult. Whether the problem is you or me, I don't care right now. You seem very resistant to trying to describe whatever eye movements, finger movements, or mental steps are taking place that help or hinder maintenance. There's nothing to grab a hold of in order to evaluate or scrutinize. The handle has been deallocated when I go to reference it.

In this particular case, at least, even without declarations you can look for both unused variable names and uninitialized variable names - i.e. variables that receive an assignment and that are never used, and variables that are used prior to receiving assignment - and deliver appropriate warnings to the programmer with a simple static analysis tool. This wouldn't actually cut down all cases, but it would knock out a significant fraction of them (including all one-mistake errors).

This seems to be promoting syntax "suspicious code" checkers, not formal declarations. Perhaps such a tool can identify similar looking names for closer inspection. It could tell you that "var" and "varr" are suspiciously similar. If its legitimate, then you log it to mark it so that the tool doesn't report it again in the future. There's a topic about this somewhere around here IIRC.

Perhaps it is not only about syntax checkers, but human checkers looking over code trying to find out contract information about a function. In many languages there isn't even a contract as to whether the function returns a result - which is ridiculous. One has to sift through the actual function to see if something is returned or not, instead of the function being marked as a function that returns a result! Imagine if one adds this information into the source comments - he defeats the purpose of the lean language and starts adding the source declarations into the comments - bloating the language up with inconsistent comments that each coder reinvents on his own (rather than more of the contract being in the source). See TypeSystemThroughComments
Typical comments are suggested regardless. Example: "Returns the employee name of the employee with the most parking tickets within given department." (Zero or many matches are not addressed here, its just a quick-and-dirty example).

Not all forms of 'suspicious code' can easily be checked. But in the particular case of variable declarations, checking for unused and uninitialized variables is fairly simple. I'm not even talking about the complexity of figuring out whether two variables are "suspiciously similar" - what does that mean anyway? is 'varX' similar to 'varY'?

Agreed, but even without such checkers, a lean coding style is often not the buggaboo that strong typing fans make it. If its not your cup of tea, so be it. Just stop extrapolating to everything and everyone.

BrainFuck is very lean. Assembly can be too. As are regular expressions. Perl/Awk is lean. Bash is lean. But it doesn't help us that much. See a pattern? And yes, languages like Algol and Pascal were not lean enough (they were too baroque and didn't need to be, and IMO so is Java) - that doesn't mean that strong typed languages with static checks cannot be lean.
Being lean is not the only criteria to weigh a language. This should be understood.

If you're planning to argue that strong typing automatically prevents lean coding style, take it up with the fans of ImplicitTyping and TypeInference. Just stop spouting opinionated, indefensible nonsense.

Projection.

MayZeroEight