A language construct is said to be a FirstClass value in that language when there are no restrictions on how it can be created and used: it can be treated as a value like any other.
FirstClass features can be stored in variables, passed as arguments to functions, created within functions and returned from functions. In dynamically typed languages, a FirstClass feature can also have its type examined at run-time.
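For instance, a minimal sketch in OcamlLanguage (any language with FirstClass functions would do) showing a function stored in a variable, passed as an argument, and created and returned at RunTime:

 (* stored in a variable *)
 let double = fun x -> 2 * x

 (* passed as an argument to another function *)
 let apply_twice f x = f (f x)
 let eight = apply_twice double 2     (* = 8 *)

 (* created inside a function and returned from it *)
 let make_adder n = fun x -> x + n
 let seven = make_adder 3 4           (* = 7 *)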
Languages vary as to what is FirstClass. Some, such as CeeLanguage, have only basic types (ints, pointers; in particular, arrays are not truly FirstClass, though the array/pointer equivalence lets you fake it in most situations). In ObjectOriented languages such as CeePlusPlus, objects are FirstClass but classes are not, while in SmalltalkLanguage or RubyLanguage, for example, all references, including references to class objects, are FirstClass. In FunctionalProgramming, functions are FirstClass.
Some authors, such as Raphael A. Finkel in AdvancedProgrammingLanguageDesign (p. 76), also define the terms SecondClass and ThirdClass?; different authors have varying definitions of exactly what properties things should have to be considered first, second, or third class, but the terms are typically broadly similar to Finkel's.
                                |     Class of value
 Manipulation                   | First   Second   Third
 ===============================+=========================
 Pass value as a parameter      |  yes     yes      no
 Return value from a procedure  |  yes     no       no
 Assign value into a variable   |  yes     no       no
I think all attempts to define classes other than FirstClass are bound to fail, because the different things you can't do depend on the language paradigm. FirstClass depicts something relatively clear-cut in every paradigm: the full-fledged ability to be a value. -- PanuKalliokoski
- I would say that that critique applies only to SecondClass, not third, and is bound to be "flawed" rather than "fail", but yes, there are problems. Still, it's mildly useful to have at least an antonym for FirstClass. Also I notice that authors often go into more detail when they use this added terminology, which can help in exploring the various issues.
- I'm not so sure about that "flawed" thing. C does not have inexact integers: so are these third-class in C? What significance does the division between second and third class have in languages that do not have procedures, functions or methods? But I agree that FirstClass should have a counterpart; I'd call it non-FirstClass.
I.e., you can do "anything" (treat it in all ways like any other value) with a FirstClass construct, nothing value-like with a ThirdClass? construct, and a little bit with a SecondClass construct (in Finkel's terminology, only pass it as a parameter).
He gives as an example that procedures (qua procedures: closures or function pointers) are ThirdClass? values in Ada, SecondClass values in Pascal, and FirstClass values in C and Modula-2.
This also shows shortcomings in this terminology; you can treat function pointers as values in C, but the function pointer isn't a closure, so FirstClass functions in C are sharply less expressive than FirstClass functions in e.g. Lisp.
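To make the contrast concrete, a small OcamlLanguage sketch (illustrative only): the returned function captures the local variable count, something a bare C function pointer cannot do:

 (* make_counter returns a closure over its local state *)
 let make_counter () =
   let count = ref 0 in
   fun () -> incr count; !count

 let c = make_counter ()
 (* c () = 1, then c () = 2; a C function pointer carries no such state *)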
If one wants a more complex classification of the NthClassness? of different programming concepts, one should take into account at least the following: the ability to pass, return, store, compare, construct, type-identify and destructure (inspect) the value at RunTime. For example, C functions can be passed, returned, stored and compared, but not constructed or destructured at RunTime. -- PanuKalliokoski
The problem is not so much with the term "FirstClass" as it is that the term "function" doesn't mean quite the same thing in every language.
The question may arise: what does it mean for something to be ThirdClass?, if you can't do "anything" with it? It means you cannot treat it as a value, but it has other uses. For instance, types in C such as int are ThirdClass?; they cannot be used as values in any context at all, but obviously are necessary for declarations.
ThirdClass? types imply no TypePolymorphism?, which historically has been the AchillesHeel of strongly typed languages. For instance, Pascal was originally crippled in regard to things like string manipulation: strings are arrays of characters, but Pascal did not allow polymorphism over the array length in user-defined functions.
C++ and Ada implement limited TypePolymorphism? by allowing modules and procedures to accept type parameters (generic modules, generic procedures), creating a general template and then specializing it (at compile time) for a particular type.
CeeLanguage PointerCastPolymorphism is a traditional way to implement TypePolymorphism? under programmer (rather than language) control, which, although error-prone, allowed manipulations in C that were literally impossible in more "strongly-typed" languages such as Pascal.
This has led to a great deal of confusion and HolyWars right up to the present. Advocates of strict static strong typing tend to view escapes from strong typing as an obvious moral wrong that leads to bugs, and therefore see C programmers as a bunch of yahoo cowboys who can't be bothered to discipline themselves enough to switch to a language like Pascal and avoid such an obvious source of bugs - while C programmers have been baffled that anyone would advocate giving up something as powerful as TypePolymorphism? (even though it has to be hand-managed via PointerCastPolymorphism) merely because it can be a source of bugs; isn't it better to be able to write a program that might have bugs than not to be able to write it at all?
- The PointerCastPolymorphism page mentions the classic example of qsort in C, with the usual sneering tone of voice - a perfect example indeed; the parameters to qsort are error-prone. And a generic qsort cannot be written at all in the original standard Pascal.
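For contrast, a hedged OcamlLanguage sketch of the same kind of generic sort, where ParametricPolymorphism makes the unsafe casts unnecessary:

 (* List.sort : ('a -> 'a -> int) -> 'a list -> 'a list, no casts needed *)
 let ints    = List.sort compare [3; 1; 2]         (* [1; 2; 3] *)
 let floats  = List.sort compare [3.0; 1.0; 2.0]   (* [1.; 2.; 3.] *)
 let strings = List.sort compare ["b"; "a"; "c"]   (* ["a"; "b"; "c"] *)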
So therefore I come to this conclusion: BondageAndDisciplineLanguages? (strict, statically typed) that do not provide FirstClass types for TypePolymorphism? are evil, when they've been telling us they were good for us all these years! :-)
- This was meant half tongue-in-cheek, but since this is a serious (and flame-warred) subject, to clarify: purely statically typed languages have lots of advantages, but I believe it's been proven beyond reasonable doubt that they still need a dynamic-type escape valve. There are those who say something similar about retrofitting static types upon nominally dynamically-typed languages such as Lisp, but as far as I know, that solution (workable or not) is still fairly ad hoc by comparison -- the mathematicians spend their time analyzing fully-statically-typed systems, and dynamic gets short shrift.
Meanwhile, dynamically typed languages like Lisp have just been watching all this with bemusement for half a century...
-- DougMerritt
There's the middle path: MlLanguage's offspring with TypeInference and ParametricPolymorphism.
Yes indeed, but that path commits a related sin. As discussed recently on OcamlTypeSafetyProblem, ML-family languages do in fact need a dynamic type for some things, yet (a) that feature is still lacking, and (b) the need for it is widely (though not universally) and vehemently denied by many ML-family fans - see e.g. recent LambdaTheUltimate discussions.
- The discussion in OcamlTypeSafetyProblem was very interesting, thank you. However, I should probably point out that the Ocaml type system provides at least two ways to make services such as Marshal.from_string safe: the first is polymorphic variants (the routine can return `Float of float, `Int of int and so on); the second is to throw different exceptions depending on the type. The choice to have a straight and unsafe routine that returns 'a is obviously a choice of the library designers. My point is that the type system of Ocaml provides enough facilities to do this cleanly; they've simply decided that total type-safety (the absence of "unsafe" functions) is not a worthwhile goal.
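A hedged sketch of the polymorphic-variant approach just described (the type tagged and the function describe are illustrative names, not part of the real Marshal library, whose from_string returns an unconstrained 'a):

 (* a hypothetical tagged result type for a safe unmarshaler *)
 type tagged = [ `Int of int | `Float of float | `String of string ]

 (* the type checker forces every consumer to handle each case *)
 let describe (v : tagged) =
   match v with
   | `Int i    -> "int: " ^ string_of_int i
   | `Float f  -> "float: " ^ string_of_float f
   | `String s -> "string: " ^ s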
- Hmm. Did you read the referenced paper proposing dynamic?
- I've now skimmed through it. Their proposal is relatively well thought out; however, as they say, a tagged union type is like a "finite" dynamic type, and polymorphic variants, which are infinite tagged unions, are by consequence actual dynamic types. The use of exceptions I found in an Ocaml tutorial discussing how to implement RunTimeTypeInformation in Ocaml. Having an explicit "dynamic" type would probably be cleaner, though. But the unsafety of Marshal.blah is not a big problem IMO. Ocaml has always had a somewhat hackish approach: this issue is similar to the reservation of one bit of every value for gc information... (It is somewhat flawed, however, that the routines in the Marshal library can't use the available type information to throw an exception on illegal unmarshaling. (Or could they?) For compiled programs, the type information would always be there, unless the value is not used at all, in which case there is no problem anyway.)
The wider point is that it has been common (well, since 1954 or so) for language designers to ban feature X in order to provide feature Y (such as an improvement, or a guarantee of the absence of a certain kind of programmer bug), claiming "Y is important, and anyone who uses X is a bad person anyway; you don't need that!". Somehow such claims have generally been extremely persuasive - yet ultimately wrong. Yes, Y is important (type safety in this case), but here it came at the price of throwing out both dynamic typing and polymorphism.
This should be a matter of common sense, but so long as major ML dialects lack a dynamic type, we've still got major problems on our hands.
The situation is much worse in other mainstream languages. Most languages outside of functional languages have been very slow to admit the need for polymorphism. C++ templates are, of course, far better than not being able to do generic programming - except the cure is almost worse than the disease. C# and Java are adding their own variants of C++ templates, promising they will be "better", but this seems dubious. The real problem is that they're desperately avoiding FirstClass types!
Lisp got that part right, but went wrong in insisting that there's no value in static typing; the BondageAndDiscipline languages are correct that type safety is desirable - compile time rather than run time errors - if sacrifices to that god aren't required.
And meanwhile scorn continues to be heaped on poor little C, because it uses an error-prone old-fashioned approach to polymorphism, rather than using the shiny modern but broken template approach.
Poor thing. C has obvious problems, but in a way it was the only language to get it right. We're still waiting for the ultimate replacement for C, and instead what we get are empty promises.
The ML camp is unlikely to deliver on their potential any time soon; they insist on calling dynamically typed languages "untyped". That reflects a very deep and unjustified prejudice that inherently prevents movement in the correct direction.
- Of course, any language that uses different syntax for floating-point and integer arithmetic deserves to be sneered at. :) Now, if OcamlLanguage did that because "integer addition" is a different operation than "floating point addition" (I could add two floats and get the sum, truncated, or I could use the "float add" on two ints and get an int back because for integral arguments, they're the same) I wouldn't chuckle as much. But we all know that this was done as a sneaky way to aid the typechecker. :)
- Since you're using a hypothetical, I'm a bit confused as to what you mean. After all, we know that floats and ints have different hardware representations, and the hardware does in fact have different add operations for the two, so in reality they really are two different operations. So I'm confused; explain your hypothetical?
- I'm referring to OcamlLanguage's use of one set of operators (+, -, *, /) for integer arithmetic, and another (+., -., *., /.) for real-number arithmetic. (Not sure if SmlLanguage also does this. It doesn't; these operators are overloaded as a special case in SML.)
- In my mind, at least, having to encode type information in operators is worse than having to specify type annotations in declarations. HaskellLanguage doesn't need to resort to that sort of stuff. Now, if the only difference between the two were whether or not the result was truncated, I wouldn't complain so much - though I'd rather write floor(a+b) in such a case and have the + operator mean what it means in math. But the distinction between the two sets of operators appears to be present in order to give hints to the TypeInference engine. A HighLevelLanguage (from the programmers' perspective) ought not worry too much about how integer and FP math are implemented in the underlying hardware.
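Concretely, a minimal OcamlLanguage sketch of the two operator sets mentioned above:

 let a = 1 + 2         (* (+)  : int -> int -> int *)
 let b = 1.0 +. 2.0    (* (+.) : float -> float -> float *)
 (* let c = 1 + 2.0       rejected: (+) demands ints on both sides *)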
- SmlLanguage has unified arithmetic operators, but requires using them in a way that makes the type fully specialized. The problem is that for parametric polymorphism like map, the code is identical for all type instantiations, but for ad-hoc polymorphism like expmod, the resulting code is very different for, say, integers and floats. HaskellLanguage's existential classes are a full-fledged mechanism for ad-hoc polymorphism, and they produce separate code for different type instantiations, like C++ templates. A similar effect with more explicit markup is achieved by ML's modules.
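For the ML-modules point, a hedged OcamlLanguage functor sketch (module names are illustrative): each instantiation is specialized to one numeric type, much as a C++ template instantiation is:

 module type NUM = sig
   type t
   val add : t -> t -> t
 end

 module IntNum   = struct type t = int   let add = ( + )  end
 module FloatNum = struct type t = float let add = ( +. ) end

 (* each application of Sum yields code specialized to one type *)
 module Sum (N : NUM) = struct
   let sum = List.fold_left N.add
 end

 module IntSum   = Sum (IntNum)     (* IntSum.sum 0 [1; 2; 3] = 6 *)
 module FloatSum = Sum (FloatNum)   (* FloatSum.sum 0.0 [1.5; 2.5] = 4.0 *)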
- Although opinions vary, I was with you up until the final sentence. Ideally we don't want to worry about implementation in general, but floating point is a well known case where it must diverge from the abstract of real numbers, and also from integers, so the question still remains as to what you mean by "worry too much", since we do have to worry somewhat.
- What I meant was: your average application programmer ought not be exposed to HW implementation details like the existence of a separate floating-point unit. I was responding to the suggestion above that, because integers and IeeeSevenFiftyFour floats use different registers/opcodes in the InstructionSetArchitecture?s of microprocessors, a high-level language is justified in having different operators for integer and floating-point arithmetic. I agree 100% that for any serious computing involving floating-point math, an understanding of the errors present in the underlying real-number implementation (whether FloatingPoint?, FixedPoint?, fraction-based, continued fractions, or some other technique) is important. Proper NumericalAnalysis is, sadly, not obsoleted by good typing systems.
C#'s reflection library does actually provide first-class types. Through the reflection library, types can be constructed, fetched, manipulated, etc. The TypeInfo? object has everything you need to do whatever dynamic dispatch you like. The catch is that to do this you'll have to do a lot of cruel and dangerous casting, fetch bizarre objects from dictionaries, and generally handle the land of dynamic typing with the kid gloves that a static language is supposed to use in such cases.
The fact is that run-time-constructable types pretty much require a Python-style "member access by dictionaries" approach, which is (a) horribly slow, and (b) completely removes compile-time checking. Otherwise, how do you provide the methods? How do they know what's in their namespaces if they are defined before the object is created?
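As a hedged illustration of the "member access by dictionaries" style, emulated here in OcamlLanguage with a Hashtbl of FirstClass functions (names are illustrative):

 let methods : (string, int -> int) Hashtbl.t = Hashtbl.create 8
 let () = Hashtbl.replace methods "double" (fun x -> 2 * x)
 let () = Hashtbl.replace methods "succ"   (fun x -> x + 1)

 (* lookup happens at RunTime; a misspelled name is a run-time error,
    which is exactly the compile-time checking said to be lost above *)
 let call name arg = (Hashtbl.find methods name) arg
 (* call "double" 21 = 42;  call "nope" 0 raises Not_found *)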
Personally, I still think that the best approach for statically-typed languages is a hyper-powerful metaprogramming mechanism. At some point you have to suck it up, admit that code generators are necessary for compile-time programming, and settle into the nitty-gritty of making them better than the C preprocessor or C++ templates.
Just say yes to FirstClass types!
Just because you fly FirstClass doesn't mean you will get VeryGoodSeats.
- Someone had to say this eventually.
- And one would predict that "someone" would be "Pete". :-)