Variable Declaration Prevents Typos

Having successfully built systems in both statically typed and dynamically typed languages, I find myself still on the fence in the static vs. dynamic typing debate. At times when I'm working in Java, I find myself mildly annoyed by what seems like excessive and superfluous type-casting. At other times when I'm working in Python, I encounter bugs that would have been eliminated at compile-time in a language like Java.

Of all the bugs that are indigenous to dynamically typed languages, the one I encounter most is the misuse of an uninitialized variable due to a typo. For example:

 var = 1
 while var < 256:
         print "looping..."
         varr = var * 2
 #          ^ typo causes infinite loop

A statically typed language would have caught this typo at compile time, since varr wasn't declared, although this error has nothing to do with typing, per se. In practice I find that relatively few of my bugs in dynamically typed languages are actually caused by the misuse of one type as another.

Have others had similar experiences? Is it possible that the biggest benefit of static type checking is actually the static namespace checking that it requires? Could dynamically typed languages realize much of the protection provided by statically typed languages, simply by requiring variables to be declared upon first use? For example:

 dim var = 1 # Hypothetical variable declaration required using dim
 while var < 256:
         print "looping..."
         varr = var * 2
 #          ^ typo causes namespace violation instead of infinite loop

-- CurtisDuhn
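A rough way to try the "declare before first use" idea in present-day Python is sketched below. This is only an illustration: the `Declared` class and its `dim` method are hypothetical names invented here, the check happens at run time rather than compile time, and it only guards attributes of one namespace object rather than ordinary local variables.

```python
class Declared:
    """Hypothetical namespace that rejects assignment to undeclared names."""

    def __init__(self):
        # Bypass our own __setattr__ while creating the bookkeeping set.
        object.__setattr__(self, "_names", set())

    def dim(self, name, value=None):
        """Declare a name and give it an initial value."""
        self._names.add(name)
        object.__setattr__(self, name, value)

    def __setattr__(self, name, value):
        # Plain assignment is only allowed for names declared via dim().
        if name not in self._names:
            raise NameError(f"assignment to undeclared name {name!r}")
        object.__setattr__(self, name, value)


ns = Declared()
ns.dim("var", 1)
try:
    while ns.var < 256:
        ns.varr = ns.var * 2  # typo: raises NameError on the first pass
except NameError as err:
    message = str(err)
```

With this sketch the typo stops the loop on its first iteration instead of spinning forever, which is roughly the "namespace violation" behaviour asked about above, although a real language feature would presumably report it before the program runs.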


Smalltalk 80 required variables to be declared (but not typed) for this very reason. The spelling corrector would happily add or remove these declarations too. Although this saved many typo bugs I still found the process tedious. Thankfully Java (and strict Perl) lets me declare a variable at the point of first use. This seems just as effective at finding typos. -- WardCunningham


In the PerlLanguage, at least, if you use 'strict' you'll get something like 'Global symbol "$varr" requires explicit package name at var.pl line 5.', which is admittedly a bit obscure. If you turn warnings on, you'll get 'Name "main::varr" used only once: possible typo at var.pl line 5.', which is clearer. -- StevenNewton


CommonLisp compilers will issue warnings about undeclared variables being assumed special. That pretty much takes care of that problem. So, all in all, it doesn't seem to be the lack of static type declarations which is the problem; it's poor programming environments. Is Python that lame? [I've never used it].

Well, Lisp is a more interesting situation because you have both assignments and bindings, and you can't assign to a variable which isn't already bound to some value. For example:

 (let ((one 1)
       two)
      (incf one)
      (setq two 3))

The (let) creates bindings of 'one' and 'two', which are visible in its body. It binds 'one' to 1 and 'two' to the default of nil. Both the (incf) and the (setq) perform assignments to the existing bindings.

The exception, as mentioned above, is 'special' variables, which are roughly equivalent to globals in other languages. It is highly unlikely that these would be accidentally created when using Lisp non-interactively (i.e., as a compiler); when using it interactively, most implementations generate warnings to prevent that. Special variables are none too common to begin with.

So I have to say that it's not the programming environment which grants Lisp its immunity to this problem; it's the language's distinction between binding and assignment.

Well, all I meant is that if you make the parallel mistake in Lisp, i.e.

  (let ((var (get-some-var)))
      ...
      (incf varr)  ; TYPO!
      ... )

You would get a warning at compile time about varr being assumed special. Since all Lispers I know of follow the *foo* convention for special vars, you'd know right away that you have a typo.


But a counter factor is that lack of declaration shortens the code, making it quicker to review. [Quicker to review... evidence?] I find heavy typing and declarations gum up code with formalities that slow down or hinder reading. I've heavily used both styles over the years and I know what makes *me* more productive. Overall, leanness is a good thing. (True, it may also depend on the domain.) This is a classic scripting-versus-formality HolyWar. Related: HowImportantIsLeanCode. --top

I'm not certain you could even argue for a relevant reduction in code size here, considering the often minimal space required to declare a variable in most languages (e.g. 'dim' or 'var'). Arguing for a quicker review would be much more difficult, since (1) a proper review now requires the extra step of hunting for typographical errors in variable names, and (2) 'dim' would make it clear to the reviewer that this is the first place a variable is available in a scope, removing the need to check for earlier uses of it, which is quite valuable for methods much longer than your thumb. And that's ignoring potential local/global scoping issues with undeclared variables.

How is this an issue for methods? You cannot assign to them, so the "varr" example above does not apply.

In this particular case, at least, even without declarations you can look for both unused and uninitialized variable names (i.e. variables that receive an assignment but are never used, and variables that are used before receiving an assignment) and deliver appropriate warnings to the programmer with a simple static analysis tool. This wouldn't catch every case, but it would knock out a significant fraction of them (including all one-mistake errors).
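A minimal sketch of such a check, using Python's standard `ast` module, might look like the following. It is deliberately crude: it treats the whole source as one flat scope and ignores ordering, so it only reports names that are assigned but never read anywhere, and names that are read but never assigned anywhere. The function name `check_names` is made up for this example, and the sample is written with Python 3's `print()` so that `ast.parse` accepts it.

```python
import ast
import builtins


def check_names(source):
    """Return (assigned_but_never_read, read_but_never_assigned).

    Flow-insensitive and scope-insensitive: a sketch, not a real linter.
    """
    stored, loaded = set(), set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Name):
            if isinstance(node.ctx, ast.Store):
                stored.add(node.id)
            elif isinstance(node.ctx, ast.Load):
                loaded.add(node.id)
    known = set(dir(builtins))  # print, len, ... are not typos
    unused = sorted(stored - loaded)
    undefined = sorted(loaded - stored - known)
    return unused, undefined


SAMPLE = """
var = 1
while var < 256:
    print("looping...")
    varr = var * 2
"""

unused, undefined = check_names(SAMPLE)
# unused holds 'varr': it is assigned but never read, which is the typo.
```

A one-letter typo on the assignment side shows up as an unused name, and a typo on the reading side shows up as an undefined one. A real tool would also need to handle functions, imports, attributes, and augmented assignment.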

This seems to be promoting "suspicious code" checkers, not formal declarations. Perhaps such a tool could identify similar-looking names for closer inspection. It could tell you that "var" and "varr" are suspiciously similar. If it's legitimate, you log it so that the tool doesn't report it again in the future. There's a topic about this somewhere around here, IIRC.
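One possible shape for such a similar-name checker is sketched here with the standard `difflib` module. The function name `similar_names`, the 0.8 threshold, and the `approved` set (standing in for the log of pairs already judged legitimate) are all assumptions made for illustration. What counts as "suspiciously similar" depends entirely on the threshold chosen.

```python
import difflib
from itertools import combinations


def similar_names(names, threshold=0.8, approved=frozenset()):
    """Return pairs of distinct names whose similarity ratio meets the threshold.

    `approved` holds (a, b) pairs, in sorted order, that a reviewer has
    already accepted and that should not be reported again.
    """
    pairs = []
    for a, b in combinations(sorted(set(names)), 2):
        if (a, b) in approved:
            continue
        ratio = difflib.SequenceMatcher(None, a, b).ratio()
        if ratio >= threshold:
            pairs.append((a, b))
    return pairs


flagged = similar_names(["var", "varr", "total"])
# flagged contains the pair ('var', 'varr') only.

quiet = similar_names(["var", "varr"], approved={("var", "varr")})
# quiet is empty because the pair was logged as legitimate.
```

The names fed in could come from any source, for instance the `ast.Name` nodes of a parsed module. The cutoff is a judgment call: a lower threshold reports more pairs and produces more noise.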

Not all forms of 'suspicious code' can easily be checked. But in the particular case of variable declarations, checking for unused and uninitialized variables is fairly simple. I'm not even talking about the complexity of figuring out whether two variables are "suspiciously similar" - what does that mean anyway? Is 'varX' similar to 'varY'?

Agreed, but even without such checkers, a lean coding style is often not the bugaboo that strong typing fans make it out to be. If it's not your cup of tea, so be it. Just stop extrapolating to everything and everyone.

If you're planning to argue that strong typing automatically prevents lean coding style, take it up with the fans of ImplicitTyping and TypeInference. Just stop spouting opinionated, indefensible nonsense.

Projection.


MayZeroEight


Last edited May 8, 2008