Static Type Safety

Static type safety is the single most accessible form of systematic formalization commonly available in programming languages. I wish more "scripting"-type languages had it. It's the main good thing about Java, but the one thing that no-one ever mentions.

Why isn't this fact acknowledged on this site? Perhaps ...

Everyone assumes that this is a given part of an expert programmer's toolkit, so it isn't worth mentioning. Or ...
Nobody cares about it.

I'm guessing the latter... You meet a lot of SmalltalkLanguage and PythonLanguage people around here! (languages with DynamicTyping, as opposed to StaticTyping) Do you people like finding errors at run time, or what?! :) How do you modify your programs? Or does the experience of using such a system enforce a kind of disciplined approach that helps to eliminate such problems? (Serious question... I was wondering why XP is so keen on religiously applying unit tests to every single class. I've since realized that you're using Smalltalk! Explains a lot.)

ExtremeProgrammingWithTypes is a really depressing read: "If at the end of the twentieth century anyone is building any application that doesn't begin with a weakly typed language - perl, python, smalltalk, prolog or VB - then they need to get their head read."

Is there something wrong with me then? I think in types. I know a whole class of errors has been eliminated by the time my code compiles. Of course, not all of them, but an infinite subset of the infinite bugspace has been neatly and forever stamped on. Is it really better to just sling information into an associative array, with no formal definition of what is what aside from an incomplete set of examples of how to use the information "correctly"?

TypesAreRedundant is even worse in places! Generics fix every damn problem mentioned on that page and they do it with absolute compile-time type checking. The C++ implementation (templates) is perfectly adequate, for example. But that page gets onto a subject I'm vaguely interested in -- TypeInference. But C++ already has that in the case of template functions. For example:

  class A { void foo(); };

  class B { void foo(); };

  template <class T> void bar(T *obj)
{ obj->foo(); }

  A *a = new A;
  B *b = new B;

  bar(a);  // no need to say what type I'm passing
  bar(b);  // - the compiler can work it out easily enough.

(This is an example of StaticPolymorphism; partial template specialization can be used to create the illusion of reflection)

This can't be done for classes, but then, classes provide a central place to state your definition of what operations on a certain kind of object are valid, rather than spread that definition throughout the code in a lot of little incomplete examples.

I think, at the end of the day, some people never learn to love compiler errors. CompilerErrorsAreYourFriends. It's basically a very sophisticated search mechanism, telling you all (or most) of the things you need to change as a consequence of some initial change - extremely helpful in the refactoring process.

In the PlanNineFromBellLabs operating system, apparently "everything is a file server." Meaning that you can construct a file path to anything, and open it and read a stream of opaque bytes. Some part of me is screaming: "That system would be so much better if those files had types!" The interface to an opaque stream of bytes (an array of 8 bit integers) is just one of many possible interfaces to some information. Why not a more descriptive, high-level one? And a shell language that can check your scripts against the environment to find type errors whenever you want it to? Amen! -- DanielKnapp

An encouraging quote from http://www.python.org/doc/FAQ.html, admitting the advantages of C++. "The C++ compiler provides type safety and catches many bugs at compile time instead of run time (a critical consideration for many commercial applications.)" You don't say!

There was some discussion about retrofitting Python with static type safety, but it had an impossible implicit goal - not to "alienate" current users. http://mail.python.org/pipermail/types-sig/1999-December/date.html

Hmmm... http://st-www.cs.uiuc.edu/users/johnson/smalltalk/smalltalk-method. RalphJohnson apparently can't find anything wrong with Smalltalk but tellingly felt the need to "do something with TypedSmalltalk?" (according to a cryptic one-liner here: SmalltalkCompiler.)

-- DanielEarwicker

For one plausible answer to the question: "Why do people tolerate dynamically-typed languages?" see the TheoryOfLanguageEnvironmentSuckage.

Smalltalk has a mechanism to report run-time type violations: #doesNotUnderstand.

Java has a mechanism to report run-time type violations: ClassCastException.

Why does a "type-safe" language include a mechanism to report type violations?

Because it also has mechanisms for bypassing its type system. If you don't want those exceptions, make proper use of the type system.

[At this point, it is useful to remark on the difference between a typesafe down-cast operator (such as DynamicCast in CeePlusPlus, the (Type) cast in JavaLanguage, etc. - with an unsafe cast such as (Type) in C/C++; ReinterpretCast? in C++, etc. The former fails in a controlled and defined way (an exception, returning NULL, etc.) if the types do not match; the latter results in UndefinedBehavior.

Having typesafe downcast does not make a language TypeUnsafe?. Having the unsafe cast does. JavaLanguage has no unsafe cast operators; calling it type-unsafe is a fallacy. It does have the downcast, necessary due to its lack of proper generics. To be fixed in Java 1.5, of course.

See TypingQuadrant]

: So every time you do a class cast you're making improper use of Java's type system? What are you supposed to do with all the Collection classes, then?

A "type-safe" language that includes a cast operator is not type-safe. Strongly typed languages that do not include a cast operator are listed here (please add any that are missing):

MlLanguage, HaskellLanguage, CleanLanguage (maybe?)
LispLanguage (dynamically but strongly typed! ;-) In a dynamically typed language, the compiler embeds the typecasts for you. At any rate, Lisp allows type annotations if you want them.
EiffelLanguage (statically and manifestly typed) Eiffel has a typesafe downcast as well.
AdaLanguage
NiceLanguage (compiles to JavaByteCode)

There are other ways of bypassing a type system instead of explicit casting. Just use a vague data type. e.g. encode all your data structures in strings, and all operations on those values then become string parsing/manipulations. (To make it totally concrete, imagine all data structures being passed around as XML.) Name a language you can't do that in! Depending on the situation, that's either bad practice or an entirely necessary design for flexibility. Type systems are always optional, it takes some thinking to tell the system what you want it to check. Though the advantage of a well-designed system is that it automatically performs type-checking if you only declare your data structures in the native way, rather than making your own dynamic version.

"Type systems are always optional, it takes some thinking to tell the system what you want it to check." - So... what if we thought it was a good idea to use a language that did make the type verification explicitly optional, and trust that we would be able to do some thinking when the time came to tell the system what we wanted it to check ? Would that work then? (And, I think, here you've come up with a perfectly good argument to include on the bottom half of the BizarroStaticTypingDebate.)

Sounds like the DylanLanguage. -- JohnLindsey

The point is that because type systems are always "optional", the alleged benefits of systems based upon them are largely illusory. It's been my experience that the loudest advocates of strong type systems have been the youngest developers on the team with the least long-term support and development experience (based on the number of applications dealt with, not length of time coding). I therefore associate (perhaps unfairly) stridency in this regard with youthful exuberance. -- TomStambaugh.

That's at odds with my experience (which is in itself interesting.) The least experienced developers in my workplace have a terribly difficult time learning how to take advantage of type safety. All the chaotic influences of inexperience seem to direct them to work against it. However, I agree that Computer Science graduates tend to be loud advocates of anything formal, regardless of their own (lack of) experience in usefully applying such tools. (I'm a Physics graduate, so I probably have some slight prejudice in favour of formalism as well.) I think that tools always require some effort to learn how to use them. Type safety is no exception. If you don't want to invest that effort, then you won't get any benefit from it. Having made that effort, I cannot envisage a situation in which I would want to abandon it. It's part of how I design things clearly and simply.

I realize, on rereading this, that I strongly endorse the underlying discipline that results from working in a strongly typed environment. I think that my view is that it is the *disciplined thought* that is necessary, not the type system. Thus, I find that the Java and C++ type systems get in the way of the underlying rigor I bring to the problem space, where as Lisp and Smalltalk environments let me express that rigor in metastructure. I almost never write totally unstructured random code. To the contrary, I generally want my code to reflect more (or at least different) structure from that enforced by a C++/Java-style static type system. -- TomStambaugh

Serious question... I was wondering why XP is so keen on religiously applying unit tests to every single class. I've since realized that you're using Smalltalk! Explains a lot.

I learned to write UnitTests while I was still a Java fanatic. I wrote some tests; they caught some bugs. I wrote more, they caught more. By the time I had enough tests to be confident that I had NearZeroBugs?, I didn't feel like I needed the static typing anymore. I shouted, "Aha!" and went to try out Smalltalk.

Static typing catches an infinite set of bugs, but it didn't reduce my need for testing one iota. And testing reduced my need for static typing down to almost nil.

-- AdamSpitz

This page is very ranty and doesn't really explain what StaticTypeSafety is or why it might be useful, or how it fits in with UnitTests. For something more considered and (hopefully) helpful, check out UnificationOfStaticTypesAndUnitTests. -- DanielEarwicker.

I moved your comments here, at the bottom. It is unpolite to say the least to open a page with a derogatory statement. If you can't make it better, you better don't comment on it. Yes, the case is not made very clear why static type checking is good, but the opposite point isn't either. -- CostinCozianu

I was referring to my own ranting that was the original start of the page, and is still there with my name on it (i.e. I proudly take full blame for my own prejudiced ranting in favour of static typing!). -- DanielEarwicker

"Does not understand" is Smalltalk's way of spelling "Segmentation Fault"

Comments about some benefits of dynamic typing moved to, believe it or not, BenefitsOfDynamicTyping.

A "type-safe" language that includes a cast operator is not type-safe. Strongly typed languages that do not include a cast operator are listed here (please add any that are missing):

I added AdaLanguage to the list following this statement, then I started to think about what you mean by a 'cast' operator. So to clarify do you mean a language is not "type-safe" when:

It has a facility to explicitly convert from one type to any other type (for example from ASCII to Integer)
It has a facility to convert between one type of a certain type to another type of a similar type (for example from Integer (with a range of 1 to 10) to another Integer with a different range)

A final question is how do explicit developer written conversion routines fit into this? Do they invalidate "type-safety"?

As you can probably guess I'm not sure that "type-safety" is compromised by having the facilities described above. I think that "type-safety" is probably something that a particular program can have, but that a language can either support or not support. It is possible to have a "type-safe" program written in assembler, but it would be much easier and much less expensive to write a "type-safe" program in a programming language which supports "type-safety".

Cast operations, as we know them in C, C++, and Java, don't convert the data, only the data type that it expects to find at the referenced address. Casting the Integer value 23 to a String certainly doesn't yield the string "23", while a conversion operation might. [On the other hand, casting the integer value 23 to double in C or C++ yields the value 23.0, so it's not that simple. -- AndrewKoenig] In Java, the following is perfectly valid code:

Customer c = new Customer();
// We'll get a class cast exception here.
String s = (String) c;
// In C++, this would probably cause a segmentation fault
int i = s.length();

Actually it's not valid Java code... String s = (String)c will cause a compile-time exception if Customer does not extend String.

And a cast like that (struct type -> struct type) isn't valid in C++ either. Sure, you could force it with s = *(String*)&c, but BadCodeCanBeWrittenInAnyLanguage.

In Smalltalk, Python, Lisp, and any other dynamically typed language that I've encountered, there's no way to trick the VM into thinking that a Customer object is a String, for example.

One basic difference is that in Java, C++, and C, the data type is a property of the pointer or reference. In dynamically typed languages, the data type is a property of the object itself. A cast changes the type of the reference, but the object itself is unchanged.

-- RobertChurch

Well, you can't trick the Java VM either, and it is essentially uninteresting from a developer's perspective to where type information is attached (to the reference, or to the memory location), as long as it assures safety. In C++, you can force the compiler to generate code that will use a pointer in an incorrect way, but you do that at your own risk and with a serious warning being generated. It is correct to say that C++ can generate unsafe code and this was a design choice (trade-off). Otherwise C++ also has a safe dynamic cast that you can use. Therefore dynamic languages have nothing more than Java and C++ with regards to runtime type safety. -- CostinCozianu

I'm ConfusedAboutStaticTypeSafety.

There are some BenefitsOfDynamicTyping, but if you're using a statically typed language, you should UseTheStaticTyping.

Some of the protection provided by statically typed languages stems from the fact that VariableDeclarationPreventsTypos. This side-effect has nothing to do with type-safety, per se. -- CurtisDuhn

I think that many people's complaints about statically typed languages comes from the fact that "static typing" usually means "you have to know what types things are when you write your program". A few languages, however, treat the notion differently: They don't require the types of objects to be known until the program is compiled.

In many (though admittedly not all) cases, this latter interpretation offers most of the benefits of the former and avoids most of its disadvantages. C++ with templates is one example of such an interpretation; ML is another.

Here, for example, is a little ML program to compute the length of a list (ML experts please note: I am fully aware that there are more succinct ways to write this program, but I think that people who have not encountered ML before will find my version easier to understand):

fun length(x) =
    if null(x) then 0
    else length(tl(x)) + 1

Because this function applies the built-in functions "null" and "tl" to x, and these functions require a list as argument, the compiler can (and indeed is required to) figure out that the argument to "length" must be a list. On the other hand, the compiler can also figure out that nothing in this function depends on the type of the list elements. Accordingly, if you try to call this function with an argument that is not a list, the compiler will refuse to compile your program, and if you call it with a list of any type, the compiler will accept it.

I believe that it is much better for the compiler to reject a program that tries to evaluate, say, length(42) without having to wait until the unit tests run to uncover the error.

-- AndrewKoenig

Then again, I prefer my editor to yell at me when I try something illegal than waiting for the compiler to run to uncover the error, although it would be best if the computer could reach out and smack me before I even get the chance to type it in the first place ("I could see the 'goto' in his eyes") :)

I sometime relied on C++ compile-time type-checking to let me do refactoring -- make a change, such as the number of arguments to a constructor or a method, and fix all the places where the compiler complaints. That didn't work recently; I changed an argument list from (unsigned char, int) to (int, bool) and got NO complaints from the compiler, because C++ is not at all strict about those types. Where's your static typing now, C++? If I had been programming in Smalltalk, running my unit tests (or maybe using one of those refactoring or type-inferencing tools that I've heard about), I would have gotten real error messages.

I'm not sure unit tests would have caught anything either. It would depend upon whether your stub class could detect the difference between an int and a bool. If you didn't use a stub class for testing, the change would have been uncaught. It would have been better to overload the method with the new signature (or even rename the method) and use search/find/grep to change items one at a time. That way you would never have "broken" code during the refactoring.

StaticTyping is also a useful device in creating code that DoesWhatItSays? as the compiler verifies at CompileTime. DynamicTyping helps in BelievingAbstractions; I merrily type what I mean minding syntax and all works automagically -- Ideally looking a bit like natural language.

examine the following:

    typedef std::string on;
    typedef std::string from;
    typedef std::string to;
    smart_pointer<work_c::base> sp_work_c(timer_pool::instance().create<work_c>());
    sp_work_c->set_name     ("10 seconds, exclusions");
    sp_work_c->set_interval (interval(interval::seconds, 10));
    sp_work_c->add_exclusion(exclude(on("weekday"), from("00:01"), to("00:02")));
    sp_work_c->add_exclusion(exclude(on("weekend"), from("00:02"), to("00:03")));

I get an abstraction that looks a bit like natural language (or at least it reads that way to me) and the knowledge it will do as it says. (This argument feels weak, but I believe in it.)

DynamicTyping helps with BelievingAbstractions but you can do the same with a StaticallyTyped language and some SyntacticSugar. With typedefs, I can even create LispLikeForms?.

But the static typing here is partly an illusion. You could write

    sp_work_c->add_exclusion(exclude(from("00:02"), to("00:03"), on("weekend")));

and the compiler wouldn't notice anything wrong. And what if you want to call another function for which it makes sense to write

    sp_work_c->set_counter_range(range(from(123), to(234)));

? I'll make a prediction: in 3 years' time you won't be using this technique any more :-). -- GarethMcCaughan

Here's a version that actually works:

 #include <string>

 struct on_tag {};
 struct from_tag {};
 struct to_tag {};

 template<typename Tag>
 class StringParam? {
 private:
 std::string string;

 public:
 explicit StringParam?(const char *cstring) :
 string(cstring) {}
 explicit StringParam?(const std::string& string) :
 string(string) {}

 const std::string& toString() const {
 return string;
 }
 };

 typedef StringParam?<on_tag> on;
 typedef StringParam?<from_tag> from;
 typedef StringParam?<to_tag> to;

 struct exclude {
 exclude(on onWhat, from fromWhen, to toWhen);
 };

 void foo() {
 exclude firstExclude(on("weekday"), from("00:01"), to("00:02"));
 // OK
 exclude secondExclude("weekday", from("00:01"), to("00:02"));
 // ERROR: on::on(const char *) is explicit
 exclude thirdExclude(from("00:01"), to("00:02"), on("weekday"));
 // ERROR: no matching constructor in call
 // exclude(from, to, on)
 }

-- ArneVogel

Wow, this is stuff that makes wiki great, when you learn something new and insightful. Would you mind to refactor this to KeywordParametersInCpp? ? It ought to stand on its own, as it is an interesting idiom in C++. --CostinCozianu

I would say that one reason why most people on the Wiki prefer DynamicTyping over StaticTyping is most people on the Wiki use object-oriented languages. StaticTypeSafety doesn't help much when you have RuntimePolymorphism. A static type checker can detect many situations when you are using a type incorrectly, but doesn't help very much when you need to implement a type. The latter case is very common when doing OO programming, and a much more common source of mistakes. Static type checkers also don't help check the protocols by which an object may be used: it can check that I am calling the correct methods of an object but not that I am calling them the right number of times and in the right order. Again, that is a much more common cause of errors than calling methods not actually implemented by the object at all.

For OO, static typing doesn't help very much, and is often a hindrance. Perhaps, one day, there will be languages with type checkers that better support polymorphic languages and protocols.

there are a whole lot of times when you need polymorphism but not dynamic polymorphism - static polymorphism is possible today in c++ as already stated. static type check can help a lot here - for instance i develop algorithms for space hardware - we need to optimise the word lengths of all busses in the design, we use class that can statically check that bus sizes on either side of operations are consistant so we dont have to run for hourse before finding this out - JamesKeogh

I strongly suspect that Daniel's second explanation is sadly true. There have been a number of cases in the news over the last few years of potentially life-threatening problems due to confusion about units of measurement. Hwoever, AFAIK very little work has been done on checking units of measurement at compile-time - surely this is going to become more important as computers become more pervasive. Picking up a problem as your spacecraft is coming in for a landing on the Martian surface is a wee bit too late... See SmartData - I'm still waiting for some feedback, by the way. --PaulMorrison

We have classes for all units, and use automatic coercion to add, say, millimetres to inches and store the result in NauticalMiles. We don't get units conversion problems.

Interesting! How many classes do you support, and how did you get around the naming problem? I assume we are talking Java here...? --PaulMorrison

It would be ugly as all get out in Java. In C++, on the other hand, you can just use BoostUnits, which works brilliantly. -- ScottMcMurray

In the JavaLanguage, you can use the UnitsOfMeasure API (unitsofmeasure.org) assisting the UCUM standard. -- WernerKeil

See CompileTimeTypingProblem, StaticTyping, TypeSafe

CategoryLanguageTyping