Language Gotchas

Every language has "gotchas". These are oddities that stump you for a while when you first encounter them. This is not meant to be a HolyWar about which language is better, but rather a catalog that helps experienced developers transition to different languages more quickly, by pointing out things which don't behave like one would expect compared to other languages. It is kind of the DeltaIsolation method of learning. Generally, languages have their own pages if you want to complain about the reasons for given features. (Feel free to add language-specific links.)


PhpLanguage


JavaScript


PythonLanguage

 >>> def f(default=[]):
 ...  default.append(10)
 ...  print(default)
 ...
 >>> f()
 [10]
 >>> f()
 [10, 10]
 >>> f()
 [10, 10, 10]
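The list keeps growing because a default argument value is evaluated once, when the def statement runs, and every call that omits the argument shares that one list. A common workaround is to use None as a sentinel and build a fresh list inside the function. The following is only a sketch; the function and parameter names are made up for illustration.

```python
def shared_default(item, bucket=[]):
    # The default list is created once, at definition time,
    # and reused by every call that omits `bucket`.
    bucket.append(item)
    return bucket


def fresh_default(item, bucket=None):
    # None is a sentinel: a new list is built on each call
    # that does not supply its own `bucket`.
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket


first = shared_default(10)
second = shared_default(10)
print(first is second)    # both calls returned the same list object
print(second)

print(fresh_default(10))  # each call starts from an empty list
print(fresh_default(10))
```
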
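 foo=`bar`
means "capture the output of the command bar" in other languages (the shell and Perl, for example); in Python 2 it is equivalent to
 foo=repr(bar)
Python 3 removed the backtick syntax, so there it is simply a syntax error.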
 >>> a=10
 >>> id(a.__xor__)
 10263568
 >>> id(a.__xor__)
 10263504
 >>> id(a.__xor__)
 10262672
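The ids differ because each access to a.__xor__ builds a new wrapper object around the method. Once a temporary wrapper is discarded its address may or may not be reused, so id() values can change or repeat between accesses. A minimal sketch, assuming CPython behaviour, that keeps two wrappers alive and compares them with == rather than by identity (m1 and m2 are illustrative names):

```python
a = 10

m1 = a.__xor__
m2 = a.__xor__

# Two separate wrapper objects are alive at once, so they are distinct...
print(m1 is m2)
# ...but they compare equal because they wrap the same method of the same object.
print(m1 == m2)
# Both behave identically when called.
print(m1(3), m2(3))
```
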


CeePlusPlus/CeeLanguage

 char *int_to_string (int i)
 {
   char buffer[256];
   sprintf (buffer, "%d", i);
   return buffer; /* Evil and nasty */
 }

This often works at first, until the memory that held buffer is reused by some other function's stack frame: buffer is a local array, so the returned pointer dangles as soon as int_to_string returns. This should be an obvious DoNotDoThis "(ThingsYouShouldNeverDo --MarkLaBarbara)", but I find this sort of stuff in code reviews far more often than I should. A common newbie mistake.

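 /* On failure realloc returns NULL and leaves the old block allocated, so assign to a temporary first */
 void *pointer2 = realloc( pointer, some_num );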
 if (pointer2 == NULL)
   do_something_about_the_failure();  /* usually try to crash gracefully */
 else
   pointer = pointer2; /* and don't forget to update other pointers that are supposed to point to the same data... */

This code really, REALLY looks like it should compare zero to certain bits from a:

 if (a & MASK == 0) ...

but instead it parses as horribly broken nonsense, because == has higher precedence than & and so binds first:

 if (a & (MASK == 0)) ...

So you must explicitly add parentheses to get the desired effect:

 if ((a & MASK) == 0) ...
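To see that the two groupings really give different answers, here is a small sketch. It is written in Python rather than C, with explicit parentheses standing in for the C parse, and the values of a and MASK are made up for illustration. Python's own precedence differs: there, an unparenthesized a & MASK == 0 groups as (a & MASK) == 0, which is one more thing to watch when moving between the two languages.

```python
MASK = 0x0F
a = 0x10  # shares no bits with MASK

# Grouping the C compiler applies to `a & MASK == 0`:
# compare first, then AND with the 0-or-1 result.
c_grouping = a & int(MASK == 0)

# Grouping the programmer intended: mask first, then compare.
intended = (a & MASK) == 0

# Python's own parse of the unparenthesized expression.
python_parse = a & MASK == 0

print(c_grouping)    # 0, which C would treat as false
print(intended)      # True
print(python_parse)  # True, same as the intended grouping
```
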

(I'm curious whether anyone can explain why these operators parse this way. Is there ever a situation where it's helpful?)

''Dunno about C++, but in C# besides the bitwise usage you can use the & with logical operations as a non-shortcircuiting "and" for those cases where the side-effects are significant. Should work the same in C/C++, although I'd be wary of using it on non-boolean objects. In that case their late position in evaluation order becomes useful.''

Related: CppGotchas (book)


JavaLanguage

 public void fn()
 {
   StringBuffer sb = new StringBuffer("abc");
   f1(sb);
   System.out.println(sb.toString()); // will print abcd
   f2(sb);
   System.out.println(sb.toString()); // will still print abcd
 }
 private void f1(StringBuffer s)
 {
   s.append("d");
 }
 private void f2(StringBuffer s)
 {
   s = new StringBuffer("xyz");
 }
Pass in an array (or some other holder object) if you need to change the object reference. I find it sad that over 2/3 of the fresh graduates that I interviewed said the second println would print "xyz".
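Python passes object references by value in the same way, so the behaviour and the holder workaround can be sketched there too. This is an illustration only: a list stands in for the StringBuffer, and f3 and box are made-up names for the holder approach.

```python
def f1(s):
    # Mutates the object the caller also refers to.
    s.append("d")


def f2(s):
    # Rebinds only the local name; the caller's variable is untouched.
    s = ["x", "y", "z"]


def f3(holder):
    # Workaround: the caller passes a one-element list and
    # reads the replacement back out of slot 0.
    holder[0] = ["x", "y", "z"]


sb = ["a", "b", "c"]
f1(sb)
after_f1 = list(sb)   # snapshot: the append is visible to the caller
f2(sb)
after_f2 = list(sb)   # snapshot: the reassignment inside f2 is not

box = [sb]
f3(box)
after_f3 = box[0]     # the holder now refers to the new list

print(after_f1)
print(after_f2)
print(after_f3)
```
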

This isn't a Java thing... it's an OO thing in general... other languages share this same behavior...

Or it could be a result of JavaPassesByValue...

Again.... Java isn't the only language to do this... CeeSharp does exactly the same thing by default... that's just how OO works... passing a reference object, will always have the effect of allowing changes to the object to show to all callers; that's what it means to be a reference object. This isn't a language gotcha... it's a knowing the difference between a reference object and a value object gotcha. -- AnonymousDonor

PhpHypertextProcessor (PHP 4, at least) doesn't work like this. By default, it passes objects - whole objects, not references to objects - by value. This is a frequent source of bugs in PHP, as modifications to an object don't show up in the caller unless it's explicitly passed and stored by reference. This has led some people to complain that PHP is not object-oriented. (PHP 5 and later pass object handles instead, so they behave like the Java example.)

Then PHP is treating its objects as value objects, much as an int or string would be treated.

CeePlusPlus also doesn't work like this. Unless explicitly marked, parameters are passed by value and don't behave polymorphically. Unfortunately, this has not led people to complain that C++ is not object-oriented. -- JonathanTang

Then C++ is treating its objects as value objects, much as an int or string would be treated. None of that changes the point that this isn't just a JavaLanguage gotcha, it's a value vs reference thing, as I said before, CeeSharp and even VisualBasic share this same behavior. The difference between a value object and a reference object is conceptual; the language is irrelevant to understanding the behavior difference between them. One must of course know how the language passes parameters... but that doesn't make it a Java problem.

Java doesn't differentiate between objects and references-to-objects, and because of this it has a ConceptualMismatch between variable access and variable assignment. In the above code, some operations on what appears to be a StringBuffer (but which we all know is a reference-to-StringBuffer) act on the object (such as the append function), and some operations (such as assignment) act on the reference. More confusing still are Java's compound-assignment operators (such as String's +=) which act on both the object and the reference.

CeePlusPlus deals with this properly by not confusing reference objects (pointers and smart pointers) and the objects they refer to. Consequently, you need to distinguish between . and ->, and you need the & and * referencing and dereferencing operators in order to avoid this Java inconsistency. Note that what CeePlusPlus unfortunately calls references are actually not references in this sense, and are just aliases.

I'm still failing to understand what anyone finds confusing about this, seems perfectly logical to me. Here's the gist of my understanding, just to put it on the table for analysis. Objects exist somewhere in memory, variables hold pointers to those objects. I'm going to use the term pointer in the abstract sense, not in the CeeLanguage sense. Variables can either be values themselves, or be pointers to objects located in memory. When passing variables, you can pass by ref or by val (language-dependent of course), so naturally if you pass a variable that points to an object by value, then that variable is copied to create a new variable in the function, but it still points to the same object. If you pass a pointer variable by ref, then it isn't copied, it's just handed in to the function as is. Thus if you change the pointer variable to point to a new object and it was passed by ref, you change it for the function's caller too, whereas if you change a by-val pointer variable it has no effect on the caller since you received a copy of the caller's pointer. But all of this only applies to the variable, not to the object it points to. Regardless of passing by ref or by val, all changes to the object will be reflected to anyone holding a pointer variable to it. Thus passing a reference object by val appears to still pass by ref, because a reference object is a reference object, regardless of how a pointer to it was passed. Passing a pointer to a reference object by val, just makes a copy of a pointer, but it still points to the original object.

Wow, that was way more complicated to explain than I thought it'd be; it seems so simple when I visualize it in my head. I think what makes it easy for me is that I don't see variables as objects, merely pointers to objects. I intuitively understand that passing by val is just copying the pointer and not the object, because I know I'm not passing objects around, only pointers to them. To me, the important thing is value vs reference semantics. If I have a customer object, that's a reference object: there should only ever be one of any particular customer, though many other objects may have pointers to it, and any changes to it reflect to all pointers to it. But if I have a money object, that's a value object: it doesn't have identity, so I don't pass around pointers to it, I pass around copies of it. Every variable points to a different object, thus no aliasing issues. You can fake this by making the object immutable. I feel passing all values by val as the default is the correct thing to do, and I think JavaLanguage does it right. In fact, even though in CeeSharp I can pass variables by ref if I so choose, I've never found a reason to.
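The "fake a value object by making it immutable" idea might look like the following Python sketch. The Money class, its fields, and the plus method are invented for illustration, and a real money type would also need to check that currencies match.

```python
from dataclasses import dataclass, FrozenInstanceError


@dataclass(frozen=True)
class Money:
    amount: int      # minor units, e.g. cents
    currency: str

    def plus(self, other):
        # Returns a new value instead of modifying this one
        # (currency check omitted to keep the sketch short).
        return Money(self.amount + other.amount, self.currency)


price = Money(500, "USD")
alias = price                       # a second pointer to the same object
total = price.plus(Money(250, "USD"))

print(price)                        # unchanged by plus()
print(total)
print(alias == Money(500, "USD"))   # equality is by value, not identity

try:
    price.amount = 0                # any attempt to mutate is rejected
except FrozenInstanceError:
    print("Money is immutable")
```

Because nothing can change a Money after construction, it no longer matters whether two variables share one object or hold separate copies.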

(Digression can be unindented, since it's really a digression on a digression that's now on WhyIsTheFirstArgSpecial :)

[Discussion on why it makes sense to pass the first, 'this' parameter by reference moved to WhyIsTheFirstArgSpecial]

Keep in mind that this topic is about gotchas, and not really about whether something is "logical". It is about being tripped up regardless of whether that tripping is from bad language design or one's own ignorance.

No, this topic is about LanguageGotchas; tripping on one's own ignorance is not a LanguageGotchas. LanguageGotchas are where the language doesn't do what you'd logically, based on experience, think it would do. Like scope in JavaScript, which has C's syntax but not C's scoping rules, or VB.Net's changing the way scope is handled from VB 6, which only scoped at the method level, not the block level. Programmer ignorance, however, can't be blamed on the language.

I don't see why Java arguments aren't all const anyway. You shouldn't be editing the argument in value types, and you shouldn't be editing the reference itself in objects. The compiler should tell you so - it would eliminate this whole class of gotchas, as they'd be compile-time errors. If you need to conditionally substitute an argument, then make it a new variable. I expect this kind of "enough rope to hang yourself" mentality in C++, but not in Java.


VbClassic

This is just the ordinary meaning of CallByValue and CallByReference. It's not exactly a gotcha. It would be a gotcha if VB *didn't* act this way.


CommonLisp

See http://wiki.alu.org/Lisp_Gotchas


SmalltalkLanguage

There are 3 other widespread gotchas in ST and there's one esoteric gotcha involving execution order with side effects, because compilers are single-pass but the language spec is triple pass. So in the example:

the jump: executes before the next. I should check out whether it actually works this way though.


PerlLanguage

Too many to mention? Maybe; take a look at the official list at http://perldoc.perl.org/perltrap.html

The first traps mentioned are not using the 'warnings' and 'strict' pragmas. These prevent a whole host of miserable problems in exchange for a little extra discipline.


Oracle SQL


MySql

(Not really a "language". Perhaps topic should be renamed to something more general, like "Tool and Language Gotchas" or the like.)


SQL (General)


http://stackoverflow.com/questions/1995113/strangest-language-feature


CategoryPitfall, CategoryLanguageDesign (learn from experience)


(last edited September 5, 2014)