Python Problems

Describe ways in which the PythonLanguage doesn't fit your idea of an IdealScriptingLanguage or IdealProgrammingLanguage.


1. Problems which make Python substantially unsuitable for particular tasks, or which cause widespread lack-of-adoption

{WetWare is important too. Languages are for humans first and computers second.}


Speed

Python is much slower than some other languages. Typically it is about 100 times slower than C in low-level number-crunching benchmarks (e.g. tight inner loops processing arrays of data.)

Some people think Python is too slow: I've never found it a problem. JPython fixes this to some extent by taking advantage of the VM's JIT, and CPython makes it fairly easy to rewrite performance-critical things in C. -- MartinPool

Speed is not an issue. Python is a language waiting for its niche. To focus on speed would be a premature optimization. --MarkJanssen

If you're hitting a hard drive, or a database, or the network, or the file system, these operations are generally thousands or millions of times slower than anything you'll be doing on the CPU. Most software in the world is IO bound, and changing programming language cannot make one iota of difference to its execution speed. Some people care nonetheless, because they need very high scalability, but for most programs, it simply doesn't matter.

Additionally, the choice of algorithm matters far more than the speed of a particular language. Choice of language generally provides a linear factor of speed-up or slowdown, such as the factor of 100 mentioned above. Choosing an algorithm, in contrast, can result in switching from O(n) to O(log n) or O(n^2), etc. For large n, these latter changes turn out to be much more significant. In circumstances where the correct algorithm is obvious (eg. very simple low-level algorithms typical of number-crunching benchmarks), then the constant factor due to language choice dominates. However, in complex, real-world problems, the choice of algorithm becomes much more significant, and the choice of language much less so. Languages like Python that value succinctness and expressivity can even aid in choosing an appropriate algorithm, in which case they can start to substantially claw back their traditional performance handicap. I've been working in Python for over 3 years now, and in practice have not found performance to be any more of a problem than it used to be in C# or C++. --JonathanHartley
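To make the algorithm-versus-language point concrete, here is a rough sketch comparing a linear scan with a binary search over sorted data. The names linear_contains and binary_contains are made up for this illustration; bisect_left comes from the standard library. What matters for the argument is the number of steps, not any wall-clock figure.

```python
from bisect import bisect_left

def linear_contains(sorted_items, target):
    """O(n): look at elements one by one until the target turns up."""
    for item in sorted_items:
        if item == target:
            return True
    return False

def binary_contains(sorted_items, target):
    """O(log n): halve the search space on each step (input must be sorted)."""
    i = bisect_left(sorted_items, target)
    return i < len(sorted_items) and sorted_items[i] == target

data = list(range(0, 2000000, 2))  # one million even numbers, already sorted

# Both functions agree on the answer; the binary search needs roughly 20
# comparisons where the linear scan may need up to a million.
print(linear_contains(data, 1999998), binary_contains(data, 1999998))
print(linear_contains(data, 7), binary_contains(data, 7))
```

A 100x constant factor from the language is real, but it is the same order of magnitude as the gap between those two functions on a list of only a few thousand items.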

So far, JPython's speed has been ok for us. My main additional requirement for the IdealScriptingLanguage would be very fast access to large numbers of objects with transparent persistence, for example for analysis of large amounts of read-only data. -- RichardDrake

IronPython is much faster than Jython or the reference implementation.

IronPython doesn't do everything the reference implementation does. (That said, it may still be faster once it does).

Oh, and people looking for better performance might want to take a look at PsycoPython; it's a normal python module that JITs to machine language without any structural changes necessary in the code (there's a two line addition to make in the main script), in a fashion that (if I understood correctly) simply blows java's socks off.

Vastly exaggerated. I've seen a speedup of really slow code by a factor of 50 once. At all other times, it was a factor of 2-4, so Java would typically still be a lot faster than psyco-ized Python.


Lack of compile-time type checking

Because Python is a dynamically but strongly typed language, it can do no compile-time checks for type consistency. Many programmers who are used to static languages find this distinctly unnerving, imagining the proliferation of hard-to-detect errors which could otherwise have been caught easily at compile time. In practice, for small and medium-sized projects at least, this does not seem to be a problem. There is a substantial debate about whether it would become a problem at large scales. Experienced dynamic language programmers believe that the opposite is the case. But at present there are very few data-points from which to empirically demonstrate this one way or the other.

Again, not an issue. Every programming language has an implicit DataModel. Most every language is sub-optimal in this regard. Until a UnifiedObjectModel is available, it doesn't necessarily make sense to make types so rigidly strict to conform to some de facto one dictated simply by computer history. See ObjectOrientedRefactored. --MarkJanssen

[I think you're misunderstanding the purpose of static TypeChecking. It has nothing to do with something "dictated simply by computer history", and everything to do with eliminating a possible category of programming errors at compile-time that could otherwise occur at run-time. Also, few programming languages have a DataModel, implicit or otherwise. A DataModel is a collection of data structures and rules for using them. Other than whatever primitive or built-in types a language provides, languages with a DataModel are typically database languages like TutorialDee and SQL, and list-based languages like Lisp.]

I understand static type checking (I came from C and Pascal). What I'm saying is that the requirement to check for types is simply a point of rigidness that is a by-product of computer history, more than design. The history of computers was for calculation, but in the modern era, it is for moving data. This puts constraints in the wrong place. To explain beyond that will require a more lengthy dialog. --MarkJanssen

[What do you mean by "this puts constraints in the wrong place"?]


Lack of tools like intellisense

Because it is dynamically typed, it is difficult for an IDE or editor to provide tools like intellisense and autocomplete. Programmers used to a static language IDE will definitely miss these features at first.

In theory, it is not possible for an editor to know what attributes will be present on a dynamic language object until runtime, because the type of any variable could change, or any particular instance could have members or methods added or removed. In practice, this isn't quite as bad as it seems, because:

a) An educated guess based on static analysis of the code will yield the correct attributes for each variable in 99% of cases. Modern versions of Visual Studio demonstrate this technique with IronPython, as do grass-roots projects such as the 'pysmell' plugin for Vim, Emacs and others.

b) As a last-ditch fallback, having an autocomplete which simply offers all previously detected tokens (such as Vim's omnicomplete) tends to be sufficient.

This seems to be a specific example of the interesting observation that more powerful languages make it harder to write useful tools for them, but on the other hand they make up for that by making powerful tools less necessary.

This is no longer true. Even the comes-with-Python IDLE (Integrated Development and Learning Environment) has autocomplete. Advanced tools like ipython, PyCharm, EclipseIde and Komodo all do a great job at autocomplete and 'intellisense'.

I don't understand how autocomplete can ever be 100% accurate in a dynamic language, to the same extent as in static languages, where people routinely use autocomplete as documentation while exploring an API. The tools mentioned above use all sorts of interesting techniques, like type inference using static analysis, to infer that object A is the same type as object B, and then when they see usage of A.c, they also infer that B has an attribute called 'c' too. But in dynamic languages the available members of any object can change at runtime, so how is any tool going to be able to discover this at edit time with 100% reliability?

Answer: They aren't, but much to everyone's surprise it turns out that it doesn't matter. 95% of the way there is good enough to churn out code very happily. If you have dynamic aspects to your design, they are an exception rather than the rule, and you are aware that the autocomplete's not going to spot them.

More information on Python IDEs is at PythonIde.


The significant whitespace issue

Many programmers have an aversion to Python because they dislike the way indenting is used to denote control blocks, rather than delimiters such as braces or 'begin/end'.

However, it can be argued that this approach has significant advantages. See the PythonWhiteSpaceDiscussion.

Regardless of whether this is a real problem or simply a perception of one, it does unfortunately seem to be a significant hindrance to Python adoption.

-- Good, I wouldn't want to use any Python library created by people who can't even handle how to use whitespace.

- GvR keeps telling me that the more I use Python the more I'll come to appreciate semantic whitespace, but I still don't like it.


2. Problems with localised impact


As a person who hacked the interpreter for a bit, here's my personal list of gripes:

-- MartinZarate


Commenting issues

Python has no way to easily comment out multiple lines of code, other than the hack of wrapping such code in an extended string delimited by triple single quotes. Of course, the hack fails when the lines being "commented out" have extended strings in them. -- AndyPierce

Often it is sufficiently non-ugly to indent the sequence of lines in question and insert a single "if 0:" above it. -- PeterHansen

I seem to have standardized on triple-quoted strings with single-quote (apostrophe) for regular usage, which means I have the triple-quoted strings with actual quotation marks (") available for commenting out huge blocks of code, even the entire file. Seems to work quite well. -- PeterHansen

You can also use the standard line comment # in front of every line. My text editor does this with a macro. The only caveat is to make sure you aren't cutting a triple quote in the middle. Syntax colorizers usually do a good job checking for this. -- SeanOleary

Plus, Python comes with its own IDE, "IDLE", which handles Python's indentation...


Private members only enforced using name-mangling

By convention, class attributes and methods are public unless they start with a single underscore, in which case they are intended to be private, but this is not enforced in any way. Those starting with a double underscore are hidden from easy access by name-mangling, but a determined developer can still work around this to access the 'private' members.
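A small sketch of the three naming levels just described. The Account class and its attributes are invented for illustration only.

```python
class Account:
    def __init__(self):
        self.owner = "alice"      # public by convention
        self._notes = []          # single underscore: "please don't touch"
        self.__balance = 100      # double underscore: name-mangled

acct = Account()
print(acct._notes)                 # works; nothing enforces the convention
print(hasattr(acct, "__balance"))  # False: the plain name is not there
print(acct._Account__balance)      # 100: the mangled name is still reachable
```

So the double underscore hides the attribute from casual access, but anyone who knows the _ClassName prefix can still get at it.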

Using no underscores (which is common) or single underscores doesn't take advantage of valuable checking that encapsulation is not being violated. (is mangleList meant for internal use or external?). You could always use a comment, but such comments always fall out of sync with the code unless the compiler can be enlisted to enforce them.

The philosophy of Python on this issue is that it is usually better if the language does not enforce privacy. If you want to be a good user of a library and not abuse any of its internal members, then you can easily do that - simply don't use its underscore members. If, on the other hand, there is a bug or limitation of an existing library which you'd like to work around, having access to its privates in a pinch can be very handy. This of course means your code may not work with future versions of the library - the responsibility for that lies upon you. The idea is that the language should not prevent you from doing things, it should empower you to make the right choice under all possible circumstances. Not enforcing 'private' allows you to make an educated decision about whether to violate it or not. In practice, I have not seen this cause substantial, systemic or common problems, but I have seen it save the day on a few occasions. --JonathanHartley

Think about it the SmallTalk way. Most OOP Pythoners don't use double underbar often. Instead, they use single underbar. The interpreter won't check it, but when you are typing it you know that it's private. (Some Java programmers prepend "my" to private members, like myAge, myHeight, etc. anyway.) It's very obvious from their names that those are private members. And if you really want to make it private or die, make a setter/getter that will protect it from outside, which is quite easy in Python. (I don't see how setter/getters help. How can they prevent external access to private members any more than double-underscores can?) (Presumably a getter without a setter could be used to allow read-only access to internal data.)
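For what it's worth, the "getter without a setter" idea might look something like this using the built-in property decorator. Person and age are hypothetical names, and as the last lines show, this only guards the public name, not the underscore attribute behind it.

```python
class Person:
    def __init__(self, age):
        self._age = age           # internal by convention

    @property
    def age(self):
        """Read-only view of the internal value."""
        return self._age

p = Person(30)
print(p.age)                      # 30
try:
    p.age = 31                    # no setter defined, so this is refused
except AttributeError:
    print("refused")
p._age = 31                       # still possible: the convention is not enforced
print(p.age)                      # 31
```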

There are also problems because of name collisions. If you want to derive from some existing class, you need to know what its semi-privates are so that you won't accidentally create your own semi-private member with the same name in the subclass and screw up everything. The name mangling when using __ prevents such accidental collisions. Since test driven development is highly encouraged in RAD languages like Python, bugs of this kind will get reported quickly, but they can still be tricky to track down.

While single underscores prefixing a member are "enforced" by the importer as more or less a "de facto" convention that can be overridden (I don't think the last sentence is meaningful or true. Can we delete it?) Double underscores are enforced by undergoing a mysterious mangling process, prepending the name of the current class (determined lexically) to the name at each reference. This sort of sleight-of-hand in a language with a high degree of introspection tends to violate the PrincipleOfLeastAstonishment.


SelfDotSyndrome

"self" (the Python equivalent of C++ and Java's "this") is never implicit in Python: You always have to name it:

  self.func1(self.var1, self.var2, self.var3)
While this greatly simplifies the life of the compiler, and may make it easier for a naive reader to read the code, it adds a lot of syntactic noise to lines such as the one above.

By the way, the example given is a bit unusual - normally you don't need to pass arguments with 'self' in front of them to methods, because the method already has access to these instance variables. Although there are circumstances where passing them like this might be required, it is fairly unusual. Generally you'd simply call the method without these parameters:

    self.func1()
Oh yeah? Well what about this?
  self.myData[self.x][self.y]
That's true - this does happen more frequently, and it is a little verbose. If "self." is too much, you could go with "s.". Of course, that has nothing to do with the aesthetic value, only the pragmatic one. I do happen to like the "s." being explicit, even if you seldom get local variables and instance variables mixed up.

Yes - I'm pretty sure explicit self was done deliberately for the benefit of the reader, not the compiler, to always make it obvious where variables and functions are coming from.

It is very nice to directly see if a call is to a global function or a method. (Yessirree, I do use global functions, and I don't think they are evil. A class is a kind of global function, anyway - Python does not even have special syntax for it.)

"me." is another short possibility. Even "_." works, and it looks just like the "_" prefix I like to put on instance variables when doing Java, but "_." is probably not the most idiomatic choice, so I only use it for "private" code. -- fb

More often than not I find "self" very helpful, whether you say I'm a naive reader or not. Not only when reading: when I'm writing, too, I can be sure if I'm talking about object members or local variables. Some java programmers use a coding convention that will differentiate the two anyway, such as aNumber, theNumber, myNumber, etc.

I mostly agree, but I don't like the asymmetry between defining a method "def f(self, arg1): ..." and calling it "self.f(arg1)". Normal functions keep the call the same as the definition. -- jJ

It's a good idea to use "self" rather than "s" or "_" or "this" or "me" - it makes it easier for others to read your code if you comply with the convention, and although Python doesn't care what name you use, using something other than 'self' will confuse tools like PyChecker, which expect that convention.

There is also one good reason for explicit self - it gives you the option, in situations where you know what you are doing, to call a class method on an object which isn't an instance of that class:

  class A(object):
    def f(self):
      pass
    def g(self, other):
      A.f(other)
Here we are calling method 'f', passing in 'other' as the value to use for 'self'. 'other' might not be an instance of A, in which case it would not be possible to simply call 'other.f()'. This will make static typing devotees uncomfortable, but this sort of trickery - used in moderation - is what makes dynamic languages so darn handy, and it wouldn't be possible without explicit self.
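A runnable sketch of the same trick, assuming Python 3, where A.describe is just a plain function looked up on the class. (Python 2's unbound methods were stricter and insisted on an instance of A as the first argument.) The classes and the name attribute are made up for the example.

```python
class A:
    def describe(self):
        # Only relies on the object having a 'name' attribute.
        return "I am " + self.name

class Unrelated:
    name = "not an A"

other = Unrelated()
print(A.describe(other))   # passes 'other' explicitly in the 'self' slot
```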

Does any other language let you bind "self" to a method reference?

Yes, most functional languages (FunctionalProgramming). If you can decide how to represent objects.

 ref = self.method
 ref() # calls self.method()
This is cool because in UI callbacks you can piggy back your 'self' in and use it in the called-back method.
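A sketch of the callback case, with a toy Button class standing in for a real UI toolkit (it is not any actual library's API):

```python
class Button:
    """Stand-in for a UI widget; not a real toolkit class."""
    def __init__(self):
        self._callbacks = []

    def on_click(self, callback):
        self._callbacks.append(callback)

    def click(self):
        for callback in self._callbacks:
            callback()

class Counter:
    def __init__(self):
        self.count = 0

    def increment(self):
        self.count += 1

counter = Counter()
button = Button()
button.on_click(counter.increment)  # bound method: 'counter' travels along as self
button.click()
button.click()
print(counter.count)  # 2
```

The button knows nothing about Counter; the bound method carries its own object with it.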

SuneidoLanguage lets you do this in two ways:

 ref = .Method('method') // 'self' is optional
or with a block:
 ref = { |arg| .method(arg) } // assuming one argument
-- AndrewMcKinlay

So does RubyLanguage:

 ref = method('method') # 'self' is optional
 ref.call(arg)# calls self.method(arg)
Ruby also supports a block form roughly identical to the SuneidoLanguage example above. -- AvdiGrimm

The requirement of explicit 'self' references also destroys the concept of objects as static closures, and draws an artificial and unnecessary distinction between functions and methods, giving them different arity based merely on their scope. This distinction will always exist in some manner, however, since while Python does capture bindings from enclosing scopes, it remains unable to modify any of those bindings (i.e. closures are effectively read-only.) This behavior is distinct from Perl and Ruby, which have fully enabled closures, though real-world usage of both these languages tends to rely exclusively on the built-in object system for objects, and not on function-based static closures.

I believe the Python 3 keyword 'nonlocal' allows one to rebind names from enclosing scopes. Is that what you're after?


I wish lambdas supported statements as well as expressions

Python is so close to being able to implement Smalltalk/Ruby style collections. It's got NestedScopes now, but they're crippled by being unable to rebind variables in enclosing scopes. It's got lambdas, but they're limited to a single expression (and nobody wants to keep typing the keyword "lambda" all the time anyway :).

I believe the Python 3 keyword 'nonlocal' allows one to rebind names from enclosing scopes. Is that what you're after?
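For reference, a minimal sketch of what that looks like in Python 3 (make_counter and bump are just illustrative names):

```python
def make_counter():
    count = 0
    def bump():
        nonlocal count   # rebind the name in the enclosing function's scope
        count += 1
        return count
    return bump

bump = make_counter()
print(bump(), bump(), bump())  # 1 2 3
```

Without the nonlocal line, the assignment would make count local to bump and the first call would fail with UnboundLocalError.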

You can rebind; you just can't do it without warning.

 x[5]=y.attr1=new_val 
works. -- jJ

If Python's NestedScopes are "crippled" because of one specific rebinding restriction, then what are functional languages such as ML, Haskell or Erlang, where you never can rebind anything whatsoever...? In Python, when you want to wrap up some mutable state and some behavior, you typically write and instantiate a class - a pretty typical approach to OO, after all. Some important sets of cases are best handled by simpler means, such as closures and generators. When you want to pass some behavior (possibly wrapping state) as an argument, you pass a callable - a function, closure, bound-method, callable-instance, whatever. Pretty simple, really.

And what's so crucial about keeping your functions _anonymous_ , anyway? That's all that lambda buys you (and it _is_ next to nothing, and Python would be better without it - pity that backwards compatibility means it will stay). Use def, give your functions a durned _name_, and off you go.

An example from JavaScript can illustrate why it's good to have anonymous functions. I'll attempt to show how giving functions a durned name can be a PITA sometimes. Suppose you have a (common in JS) function that accepts a callback. You need your callback to do something really trivial. In JS, you can write this:

 var enclosed_variable;
 someObject.callMeBack(param, function(result) { enclosed_variable = result; });

In Python, you have to give your function a durned name. Since the function is trivial, its name will be something stupid:

 enclosed_variable=None;
 def whatever(arg):
      enclosed_variable = arg

 some_object.call_me_back(param, whatever)

Later, the definition of whatever will inevitably get separated from the place where it's used:

 enclosed_variable=None;
 def whatever(arg):
      enclosed_variable = arg

 # About half a screenful of code right here.

 some_object.call_me_back(param, whatever)

It would be much better in that case if whatever could simply be written inline.

Your last step is unreasonable - separating the definition from the use wouldn't pass code review on any project I've worked on. So what you're really comparing is your first shown Python against the Javascript. The Javascript does use one less line, (12 characters less) so that's nice. Of course, if the javascript anonymous function was even *slightly* longer, it would get broken across lines for readability:

  var enclosed_variable;
  someObject.callMeBack(param, function(result) {
      enclosed_variable = result;
      somethingElse();
  });

Which of course is barely any different from:

  enclosed_variable=None;
  def whatever(arg):
      enclosed_variable = arg
      somethingElse()
  some_object.call_me_back(param, whatever)

Except that in giving it a durned name, we open up the possibility for 'whatever' to be used elsewhere, called from tests, etc.
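One caveat on the Python snippets in this exchange: as written, the assignment inside whatever binds a new local name rather than rebinding the outer enclosed_variable, so they are not quite equivalent to the JavaScript. A sketch of a version that does rebind, assuming the code sits inside a function (at module level the keyword would be global instead). call_me_back here is a hypothetical stand-in for the callback-taking API, and the doubling is arbitrary.

```python
def call_me_back(param, callback):
    """Hypothetical callback-taking API: hands a result to the callback."""
    callback(param * 2)

def run():
    enclosed_variable = None
    def whatever(arg):
        nonlocal enclosed_variable   # without this, the assignment is local
        enclosed_variable = arg
    call_me_back(21, whatever)
    return enclosed_variable

print(run())  # 42
```

That is one more line than the snippets above, which rather supports the point that the JavaScript form is the more compact of the two.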

lambda can be used as an expression, rather than a full statement. Personally, I just define it on the line above, but I can understand the desire to define it _right_ at use. -- jJ

Python scope sucks, but if you write a lambda that refers to locals you can build a closure by passing them into a named function. Still a hack, though, but the above paragraph was correct in spirit but not in the exact details. lambdas create closures with a single expression, but the closure itself can be big.

Why is Python going to all the trouble of introducing ListComprehensions and iterators and generators, when it could achieve all that functionality much more elegantly by simply removing the restrictions on its almost-blocks?

What do you mean, _all that functionality_? Take the typical Haskell example of list comprehensions, Pythonized:

 def quick(xs):
     if not xs: return []
     return quick([x for x in xs if x<xs[0]])+[xs[0]]+quick([x for x in xs[1:] if x>=xs[0]])
How do you write a "three-line quicksort" `by simply removing restrictions on blocks`?!

Here's how I'd do it in Ruby:

  class Array
    def quicksort
      return [] if self.empty?
      return select { |x| x < self[0] }.quicksort + [self[0]] + self[1..-1].select { |x| x >= self[0] }.quicksort
    end
  end
A few things to note:

I'm not sure which version is cleaner. I'm sure the gentle reader can make up his or her own mind. But what I was trying to say is that Ruby gets by just fine without ListComprehensions, simply because its anonymous functions (BlocksInRuby) are very, very slightly nicer to use than Python's anonymous functions (BlocksInPython).

-- AdamSpitz

I think what python is missing is first-class suites - a block that can share locals() with whichever environment it gets called in. I doubt it will happen soon, though, because the good use cases may not be enough to justify the extra complication. -- jJ

you could use Enumerable#partition() (we all love blocks don't we?)

 def qsort(list)
   return [] if list.empty?
   x, *xs = *list
   less, more = xs.partition{|y| y < x}
   qsort(less) + [x] + qsort(more)
 end
Cleaner, faster, longer. Golf is for PerlGolf, ain't it?

You're right, that's better than mine. And shorter, too. I like it. -- as

Of course it's not quite as simple as that, because of backwards compatibility and all that. I just wish that they'd done it the Smalltalk/Ruby way in the first place. Thanks for listening to me vent. And if some Python person out there can console me by telling me why it's not so bad after all, I'd really appreciate it. :)

No. It is so bad. Without scope you are not even a "structural" language.

''What does "without scope" MEAN? Nothing, I think. What's "unstructural" (unstructured?) about:

 def makeAdder(augend):
     def add(addend): return augend+addend
     return add
for example?''

Can we get a Smalltalk/Ruby person to offer up a specific algorithm that's more cumbersome to do in Python? -- SteveHowell

I doubt it. I'm sure that the new iterator/generator thing is easily powerful enough to do whatever I want it to do. I'm just irritated that they introduced such a complex mechanism to solve a problem that was already almost solved. As it is, we're left with our crippled lambdas, and we're stuck having to explain to newbies what a generator is and why it looks like a function definition but behaves completely differently.

''Probably things involving callcc and continuations? How do you implement amb() in python? (one of the ruby impls is here: http://www.snowplow.org/martin/ambenv.txt) I believe StacklessPython has them?''

What's "complex" about iterators? That you call next to get the next item, or that next raises StopIteration when it has no more items? Puh-LEEze. Simple generators are one handy way to create and return iterators, of course, but they're far from being the only one. I'd still like to see SOME example of what would be solved allegedly "better" with unnamed blocks than with Python's mechanisms.
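For anyone following along, here is the whole protocol in a few lines, sketched for Python 3 (where the method is spelled __next__ and you call the built-in next()). Countdown and countdown are illustrative names only.

```python
class Countdown:
    """Hand-written iterator: yields n, n-1, ..., 1."""
    def __init__(self, n):
        self.n = n

    def __iter__(self):
        return self

    def __next__(self):
        if self.n <= 0:
            raise StopIteration
        value = self.n
        self.n -= 1
        return value

def countdown(n):
    """Generator version: the same sequence, written as a function."""
    while n > 0:
        yield n
        n -= 1

print(list(Countdown(3)))   # [3, 2, 1]
print(list(countdown(3)))   # [3, 2, 1]

it = countdown(1)
print(next(it))             # 1
try:
    next(it)
except StopIteration:
    print("exhausted")
```

The class and the generator are interchangeable from the caller's side, which is the sense in which generators are just one handy way to build an iterator.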

I still like Python very much, of course. But this page asked me whether Python fit my idea of the ideal language or not, and this is one way in which I think it screwed up.

If you take a look at http://www.python.org/peps/ you'll see that an annoying number of the proposals seem hell-bent on adding unnecessary syntax (e.g. ternary operators or the various spawn of generators). The really annoying part is that if lambdas could somehow be extended to support statements and behave like blocks in their interaction with the local scope then we could throw away large amounts of the existing syntactic sugar. ListComprehensions, generators, generator comprehensions, dictionary comprehensions could all be subsumed.

err ... a lambda with statements is just a regular function definition. You can define a new function inside another function. There is some overhead to a function call, but you pay that anyhow with a lambda. If something is long enough to need a statement, then keeping it anonymous doesn't buy you anything except confusingly different syntax. -- jJ

Both list and dictionary comprehensions will probably be subsumed by generator expressions/comprehensions (PEP 289) in 2.4, although the syntactic sugar for ListComprehensions will remain for backwards compatibility.


Everything's a Boolean

Personally I think it's misguided to allow code like this:

    if some_list:
        ...
Why not replace that with
    if len(some_list) > 0:
        ...
The difference in writeability is really negligible compared to the difference in readability. Why have a boolean type at all if you're not going to make people use it? Silly. Most objects have no real implicit correspondence to either True or False... why even give them a default one?

''The Python philosophy is that the former example is the more readable of the two. The latter is cluttered and easy to get wrong (e.g. by using '<' instead of '>'), whereas the former is concise and hard to mistype or misread. Don't think of the predicate as a boolean expression - after all it isn't one! Instead, think of the 'if' as accepting any type, and each type has its own rules for determining whether a particular value should be considered true or not.''
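A quick sketch of those per-type rules. Basket is a hypothetical container invented for the example; the point is that a class with a __len__ and no __bool__ is treated as false when it is empty, exactly like the built-in collections.

```python
class Basket:
    """Hypothetical container used only for this illustration."""
    def __init__(self, items):
        self.items = list(items)

    def __len__(self):
        # With no __bool__ defined, truth testing falls back to __len__.
        return len(self.items)

print(bool([]), bool([0]), bool(""), bool("x"))    # False True False True
print(bool(Basket([])), bool(Basket(["apple"])))   # False True
```

Note that [0] is true: for containers the rule is about emptiness, not about the values inside.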


3. Minor or unsubstantiated problems, fanciful suggestions, personal preferences


Function composition is missing.

I assume that you mean partial composition. Function composition is used everywhere:

 x = range
 y = 1
 z = 4
 print(x(y, z))
Nor is partial composition exactly missing - it's just not in the standard library, nor is it used widely. This may change - see <http://www.python.org/peps/pep-0309.html>

:The above example doesn't exemplify function composition. Function composition would be something like this:

 c = compose(len,range)
 print(c(1,4)) # Equivalent to print(len(range(1,4)))

Output: 3

The above compose could be defined as a function in Python. Some people would like composition to be a syntactic operator, like it is in Haskell and mathematics.
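One possible definition, as a sketch only (this compose is not a built-in or standard library function):

```python
def compose(f, g):
    """Return a function h such that h(*args) == f(g(*args))."""
    def composed(*args, **kwargs):
        return f(g(*args, **kwargs))
    return composed

c = compose(len, range)
print(c(1, 4))  # 3, same as len(range(1, 4))
```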

This is not the same thing as partial function application (currying), where you could call range(1) and get another function that will complete the call to range when you pass it one argument, which will be treated as the second argument.
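Partial application did eventually land in the standard library as functools.partial (which, as far as I know, is what came out of PEP 309). A sketch of the range case described above; range_from_one is just an illustrative name:

```python
from functools import partial

range_from_one = partial(range, 1)   # fixes the first argument of range
print(list(range_from_one(4)))       # [1, 2, 3]
```

Note that this is explicit: calling range(1) on its own still returns an ordinary range rather than a function waiting for its second argument.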


Confusing generators

Who knows what this does?:

 rangedimension = lambda x, bound : range(max(x-1, 1), 1+min(x+1, bound))
 [x for x in rangedimension(4, 9)]
Yeah - it's hardly pretty. Given a number x, it'll return [x-1, x, x+1], clipped to the inclusive interval from 1 to bound. It would be cleaner if self-documented using a named function and some named intermediate variables:
 def rangedimension(x, bound):
     lower = max(x-1, 1)
     upper = 1+min(x+1, bound)
     return range(lower, upper)
 print(rangedimension(4, 9))
Basically, if you're using a lambda for anything other than simple callbacks or for ObfuscatedPython fun, don't.

Is it bad to write little functions like 'rangedimension = lambda x,bound : range(max(x-1,1),1+min(x+1,bound))' using lambda expressions?

It's bad because it's unreadably congested. The function version is cleaner, more readable, and better documented [CodeIsDocumentation?]. If you were provided with the function version, the question about what it meant would probably not have arisen.


Dynamic Typing

Python's type system is strong but dynamic. This means that values have types, names don't. In practical terms this means that:

  x = 4
  x = 'a'
... does not generate an error. However, if you try to do this:

 >>> 'x' + 4
 Traceback (most recent call last):
   File "<stdin>", line 1, in ?
 TypeError: cannot add type "int" to string
... it will raise an error, as it should, because you don't want to be adding ints to strings anyway.

(but you can do

 >>> x * 4
 'aaaa'
very memory consuming if x is supposed to be an integer but it is really a very long string)

BruceEckel posted an interesting article about typing at http://mindview.net/WebLog/log-0052.

In response to Python being "untyped" there is an underlying reason this is so. Python names are not the same as variables in a language like C. Python names do not actually contain values, they merely refer to objects that contain values (like a C pointer). Therefore the following is legal in Python (and troubling to C programmers):

 >>> a = 10
 >>> somefunc(a)
 >>> a = "Some String"
 >>> somefunc(a)
This makes perfect sense if you think of "a" as a pointer to an object. The second assignment merely points "a" toward another object. In this case, the two objects are different types.

Objects themselves do have types though. You can determine the type of object a name refers to using the "type()" function.

-- Caseman

This issue simply seems to be part of the ongoing HolyWar between dynamic and static typing. It is not specific to Python. Those who don't get along with dynamic languages are not going to be happy with Python or any other dynamic language.

Dynamic Typing in itself is not a problem. It has strengths as well as weaknesses. It would make more sense to split out the specific weaknesses of dynamic typing (eg. runtime speed, lack of compile time checks for type consistency), and list those as problems, to the extent that each one is manifest in Python. Because of this, I'm going to create stubs for those problems on this page, and move this dynamic typing entry, which is interesting & valuable, into 'minor and unsubstantiated' -- JonathanHartley

For an alternative dynamic typing model, consider a "tag-free" model. See ColdFusionLanguageTypeSystem.


Surprising Namespace rules

People coming to Python from Algol-derived (C or Java-like) languages tend to be surprised by the namespace rules. The following code:

 x = 2
 def foo():
     x = 1
 foo()
 print x
... prints '2', not, as some people would expect, '1'. As the example above shows, if you assign to a name inside a function, Python assumes it is local unless you have declared it global. It's natural that you can hide a global name with a local, and it's also a good thing to be able to get at global data, so there needs to be some mechanism to keep these things apart even though declarations aren't normally used.

(There is a mechanism - declare it global. Each variable gets bound in the scope the assignment appears in, be it a module, class, or function. That is regular and predictable. Override this behaviour when you need to by declaring variables as global.)
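
Both behaviours can be seen in one short script. This is a sketch assuming Python 3; the function names assign_local and assign_global are invented for the example.

```python
x = 2

def assign_local():
    x = 1          # binds a new local name; the module-level x is untouched
    return x

def assign_global():
    global x       # opt in to rebinding the module-level name
    x = 1

assign_local()
after_local = x    # still the original value

assign_global()
after_global = x   # rebound by assign_global

print(after_local, after_global)
```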


Memory Management

The reference implementation uses reference counting, the worst-performing memory management scheme there is (compared to explicit malloc, conservative or precise garbage collection). Search python-list (comp.lang.python) and python-dev for "bohm", "mark", "sweep", "reference"... RC is considered a reasonable tradeoff for the portability, and the speed hit may not be as significant as you think owing to Python's object usage.

"RC is considered a reasonable tradeoff for the portability": I don't buy this. There's no reason you can't implement a garbage collector in a portable manner.

Python is slow - with or without garbage collection. If the speed hit of GC concerns you, you're using the wrong language to begin with. What you're proposing is to put airfins on a Ford Model T.

If the problem is 'performance', that is covered elsewhere on this page, so for the moment I'm moving this 'memory management' issue into 'minor or unsubstantiated'. --JonathanHartley


Tuples

Python's oddly-named fixed-length/constant-reference arrays are a frequent source of confusion for new Python coders. The problem is that, to a new user, they look like arrays, and so are used as such... but within most Python code, they're used to model heterogeneous data (where lists model homogeneous data)... which, in retrospect, makes the actual meaning of tuples obvious: they're C#-style structs. They're constant as a means to enforce value-like semantics. Which raises the obvious question: why are they indexed with a number instead of access with named members? Therein comes the problem - PythonLanguage is dynamically typed and uses hashtables for member access. MyTuple(1) is an array lookup, whereas MyObject.Bar is a hashtable lookup. So, tuples are really an efficiency thing - a workaround for the fact that people need fast performance when they pass throw-away value objects around. This becomes more obvious when you look at the Python interpreter - arguments that are set using the (arg1,arg2,arg3) syntax (as opposed to optional argkey=arg1, argkey2=arg2, etc. format) are constructed using a tuple. Every Python function takes two parameters - a Tuple and a Dictionary. You don't notice, because within that python function the tuple's values are mapped into local variables... which, not coincidentally, are the only Python named objects that aren't accessed through its slow hashtables. Unfortunately, this mechanism cannot be exposed to other functionality because Python doesn't do any type inferencing. So, if we want fast access, we have to use indexed lookup instead of named lookup - hence the need for the damned tuples.

This is just plain wrong. Never, ever are you forced to use tuples. The built-in dictionaries are very fast, actually. There are also lists, which are much more frequently used than tuples and also are internally implemented as arrays. Lists are mutable, whereas tuples aren't. Tuples are more memory-efficient than lists. They also are hashable, i.e. you can use them as a key in a dictionary, which can be very handy indeed. Also, it's MyTuple[1], not MyTuple(1) - I suspect the writer of the above has spent about 15 min on Python.
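
For anyone following along, here is a sketch of the properties mentioned in the reply, assuming Python 2.6 or later (it was written against Python 3). The Point type and the sample data are invented; collections.namedtuple is the standard-library way to get named access to a tuple's members.

```python
from collections import namedtuple

point = (3, 4)
second_item = point[1]          # square brackets, not parentheses

# Tuples are hashable, so they can serve as dictionary keys.
labels = {(0, 0): "origin", point: "corner"}

# Lists are mutable and therefore unhashable.
try:
    {[3, 4]: "corner"}
    list_key_ok = True
except TypeError:
    list_key_ok = False

# A namedtuple (hypothetical Point type) offers both positional and
# named access to the same immutable tuple.
Point = namedtuple("Point", ["x", "y"])
p = Point(3, 4)
print(second_item, labels[(3, 4)], list_key_ok, p.x, p[1], p == point)
```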

I'm hazy on specifically what the problem is here. Many of the sentences are interesting and/or true, but none of them describe a problem, as far as I can see (apart from the reference to 'slow hashtables', which unfortunately is one of the few false statements, so doesn't count.) Can we clarify what this problem is, or break it down? For the time being, I have demoted this problem to 'minor or unsubstantiated' --JonathanHartley


Underscores for private members are ugly

Python's mechanism for designating members as being private - prepending the name with __ - results in ugly code (I don't like the underscores):

  self.__func1(self.__var1, self.__var2, self.__var3) # ugly

The example is a bit contrived. Methods already have access to member variables, so in most circumstances this actually looks like:

  self.__func1()
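
For reference, what the leading double underscore actually does is rename the attribute (name mangling). A minimal sketch, with a made-up Account class:

```python
class Account(object):          # hypothetical example class
    def __init__(self):
        self.__balance = 100    # stored on the instance as _Account__balance

    def balance(self):
        return self.__balance   # inside the class the short name works

acct = Account()
inside = acct.balance()

# Outside the class the unmangled name does not exist...
hidden = not hasattr(acct, "__balance")

# ...but the mangled name is still reachable, so this is a convention
# rather than an access-control mechanism.
mangled = acct._Account__balance
print(inside, hidden, mangled)
```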


double-underscore magic methods

I don't like that Python uses __xx__ to make magic happen. I don't like having to implement __getitem__ to override the [] syntax. It makes the language look patchy. There is too much magic in it. I think it is a sign that Python, the language itself, is not designed to be as extensible as Ruby or Lisp. Every new functionality that needs uncommon syntax will require another magic __xxx__. While in Ruby or Lisp I'll use Block or Macro. -- AnonymousDonor

Don't worry, everyone hates the __underscores__ at first. They don't mean "magic", they mean "method without a name" -- operators, constructors, string-representation, etc. It's a useful convention. -- MatthewBennett

It is true though, that Python is not as extensible as Lisp, but then again, what is? :-) --JonathanHartley
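
To make the discussion concrete, this is roughly what overriding the [] syntax looks like. A sketch only: the Deck class and its contents are invented for the example.

```python
class Deck(object):                 # hypothetical container
    def __init__(self, cards):
        self._cards = list(cards)

    def __getitem__(self, index):   # called for deck[index]
        return self._cards[index]

    def __len__(self):              # called for len(deck)
        return len(self._cards)

deck = Deck(["ace", "king", "queen"])
first = deck[0]
size = len(deck)

# With __getitem__ defined, iteration and 'in' also work, because Python
# falls back to indexing from 0 until IndexError is raised.
as_list = [card for card in deck]
has_king = "king" in deck
print(first, size, as_list, has_king)
```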


lack of a do-while loop

Some people wish that Python had a do-while loop, i.e. one that checks the condition at the end of the loop, rather than at the beginning:

 # not valid Python
 while:
     doSomething()
     until condition
But instead, they have to use something like:
 while True:
     doSomething()
     if condition: break
or
 doSomething()
 while not condition:
     doSomething()
The final alternative is criticised for violating OnceAndOnlyOnce, but if your project was failing due to lack of this feature, then you might find the workaround above that to be just about workable.
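
A runnable version of the 'while True' workaround, as a sketch. The function drain and its queue argument are made up for illustration; the point is only that the body runs once before the condition is first tested.

```python
def drain(queue):
    """Pop items until the queue is empty, always popping at least once.

    'queue' is assumed to be a non-empty list.
    """
    popped = []
    while True:
        popped.append(queue.pop())   # loop body runs before any test
        if not queue:                # the 'until' condition
            break
    return popped

print(drain([1, 2, 3]))
```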


I wish colons and indenting could be used to create lambdas

Myself, I'd just like to see the colon become a general syntax item. Want a lambda?

 mylambda = (foo, bar):
     foo.baz(bar)
     return bar.debaz()
Maybe even for lists, tuples, etc:
 aList = (:
     "hello"
     "world"
(Hey. I don't understand what the colon is doing here. Can anyone clarify this example?)

On second thought, in order to make those useful at all, one needs to have the ability to continue the first line after the block, kinda like:

 (foo, bar):
     foo.baz(bar)
     return bar.debaz()
 (bing, bong)
or
 (foo, bar):
     foo.baz(bar)
     return bar.debaz()
 (bing, bong)
being equivalent to the lispy:
 ((lambda (foo bar)
    (baz foo bar)
    (debaz bar)
  ) bing bong)
Of course, you would use spaces instead of tabs, so that the indentation doesn't get so huge. The lisp style of function invocation might even be good in general for this (keeping the ':'):
 lambda (foo, bar):
     baz foo, bar
     return (debaz bar)
leading to
 (lambda (foo, bar):
     baz foo, bar
     return (debaz bar)
  bing, bong)
for inline application. I realize that at this point I'm basically a different language which happens to have a scent of call-with-cc-berry-py, but hey, that's what I do :p

-- WilliamUnderwood


A general-purpose proxy class should be simpler

I won't say I don't like Python, but implementing object proxies with new-style classes is a BITCH. http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/252151

It's not mega-important, but I hope they fix this as of Python 3.x.

It's not really that bad, is it? If you just need to proxy regular attributes and methods, then your proxy class is simply:

 class Proxy(object):
     def __init__(self, obj):
         self._obj = obj
     def __getattr__(self, attrib):
         return getattr(self._obj, attrib)
If you also need behind-the-scenes magic methods, then it increases to 17 lines plus comments. These lines are entirely reusable across projects - you only need to ever type it out once. How big would it be in C#?
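
The distinction between regular attributes and behind-the-scenes magic methods can be seen in a few lines. This is a sketch assuming Python 3 (or new-style classes in Python 2); LenProxy is a hypothetical subclass that forwards just one special method, not the cookbook recipe linked above.

```python
class Proxy(object):
    def __init__(self, obj):
        self._obj = obj
    def __getattr__(self, attrib):
        return getattr(self._obj, attrib)

class LenProxy(Proxy):
    # Special methods must be defined on the class to be found by len().
    def __len__(self):
        return len(self._obj)

plain = Proxy([1, 2, 2])
count_of_two = plain.count(2)      # ordinary method: forwarded

try:
    len(plain)
    len_forwarded = True
except TypeError:
    len_forwarded = False          # implicit __len__ lookup skips __getattr__

wrapped_len = len(LenProxy([1, 2, 2]))
print(count_of_two, len_forwarded, wrapped_len)
```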


4. Entries that seem, on examination, not to be problems at all.


No pre- or post- increment and decrement operators

Although Python expressions are like C or Java, it has no increment or decrement operators (++/--). This is by design: the reasoning is that it's not that bad to write it out longhand, and that adding them would make the namespace rules confusing. [Note that given the bugs caused by confusion in C's [pre|post]-[inc|dec]rement operators the lack of these is probably a Good Thing]
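
For C programmers, a short sketch of what the longhand looks like and what happens if you type the C operators anyway. The variable names are arbitrary.

```python
count = 0
count += 1          # augmented assignment is the longhand for count++

# '++count' is not an error: it parses as two unary plus operators and
# leaves the value unchanged, which can surprise C programmers.
same = ++count

# 'count++' on the other hand is rejected by the parser.
try:
    compile("count++", "<example>", "eval")
    postfix_ok = True
except SyntaxError:
    postfix_ok = False

print(count, same, postfix_ok)
```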


First-time dictionary entry increment workaround

When using a dictionary to keep a running tally:

 results = {}
 for each in somedata:
     if each not in results:
         results[each] = 1
     else:
         results[each] += 1
 print results
It is irritating to have to perform the 'in' check and initialise each entry in the dictionary.

Solution (as of Python 2.5) is:

 import collections
 results = collections.defaultdict(int)
 for each in somedata:
     results[each] += 1
 print results
Initialise the defaultdict with the type of entries it is expected to hold (eg. the 'int' above); this type will be called with no arguments to initialise new entries when they are first referenced, so 'int' gives 0.

-- jJ summarised by JonathanHartley

defaultdict is best, but if you can't use it for some reason, another option is to use the dictionary's get method:

 results = {}
 for each in somedata:
    results[each] = results.get(each, 0) + 1
 print results
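
If you are on Python 2.7 or later, collections.Counter is one more option for exactly this tallying job. A sketch, with made-up sample data standing in for somedata:

```python
from collections import Counter

somedata = ["a", "b", "a", "c", "a"]   # made-up sample data

# Counter tallies an iterable in one step; absent keys read as 0
# without being added to the mapping.
results = Counter(somedata)
print(results["a"], results["b"], results["missing"])
```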


Built-in 'range' should use generators

 range(1,3) # returns a list. It should return a generator.
As of Python 3, 'range' no longer builds a list. Specifically, 'range' returns a lazy object which is a lot like an iterator, in that values are produced on demand, except that it can be reused. To get the same behaviour in older versions of Python, use 'xrange' instead.
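
A quick sketch of how the Python 3 'range' object behaves (this assumes Python 3; under Python 2 the same lines would build real lists):

```python
r = range(1, 3)

first_pass = list(r)
second_pass = list(r)        # a range can be iterated again

# It is not itself an iterator: next() needs iter(r) first.
try:
    next(r)
    is_iterator = True
except TypeError:
    is_iterator = False

# Values are computed on demand, so a very large range is cheap to
# create, index, and test for membership.
big = range(10**12)
last = big[-1]
contains = 10**11 in big
print(first_pass, second_pass, is_iterator, last, contains)
```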


It would be nice to have a filter mechanism for generators.

Generators can easily filter their own output with arbitrary conditional clauses - see http://www.python.org/peps/pep-0289.html

Alternatively, to filter an existing or third-party generator, use existing filters such as takewhile, ifilter, etc., from the itertools module. Or you can easily write your own. e.g. rangei below is implemented by using takewhile to filter the count generator, only returning values less than end:

 # This function duplicates xrange() - for demo purposes only.
 from itertools import takewhile, count
 def rangei(start, end):
     pred = lambda x: x < end
     return takewhile(pred, count(start))
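
Both approaches can be combined. The following is a sketch assuming Python 3 (where ifilter is gone because the built-in filter is already lazy); evens is a hypothetical stand-in for a third-party generator.

```python
from itertools import count, takewhile

def evens():
    """Hypothetical third-party generator: 0, 2, 4, ... without end."""
    for n in count():
        if n % 2 == 0:
            yield n

# A generator expression filters lazily with an 'if' clause (PEP 289).
multiples_of_six = (n for n in evens() if n % 3 == 0)

# takewhile stops the otherwise infinite stream at the first failure.
below_twenty = list(takewhile(lambda n: n < 20, multiples_of_six))
print(below_twenty)
```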


I was confused about the order of nested for..in clauses in ListComprehensions

 l = []
 for y in x:
     for z in gen(y):
         if pred(z):
             l.append( f(z) )
is equivalent to:

 l = [f(z) for y in x for z in gen(y) if pred(z)]
whereas it seems more consistent to me to reverse the precedence of the for..in clauses since you are putting the inner loop body at the front:

 l = [f(z) if pred(z) for z in gen(y) for y in x]
This led to a lot of head-scratching the first time I tried to use complex comprehensions. -- IanOsgood

This isn't really a problem with the language. At worst, it is an arbitrarily chosen convention. Just remember which way round they go. There are only two possible answers, and a nice mnemonic to help remember the way it works is that the code, stripped of newlines and indents (eg read out loud), reads *exactly* the same in the list comprehension as it does in the explicit loops. Try it with the example above.
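
To try it, here is the example above filled in with concrete pieces. The input x and the functions gen, pred and f are all invented so that the two forms can be compared.

```python
x = [1, 2, 3]                     # made-up input

def gen(y):                       # hypothetical inner sequence
    return range(y)

def pred(z):                      # hypothetical filter
    return z % 2 == 0

def f(z):                         # hypothetical transform
    return z * 10

looped = []
for y in x:
    for z in gen(y):
        if pred(z):
            looped.append(f(z))

# Same clauses, in the same order, on one line.
comprehended = [f(z) for y in x for z in gen(y) if pred(z)]
print(looped, comprehended)
```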


I like py2exe

There's also a stand-alone package creator written in python which basically wraps up the application and any detectable dependencies into a single self-contained folder; exe containing the main script, zip file containing the python dependencies (the compiled pyd's), and any dll's required. This includes the python dll, and as such, no python installation is necessary or used. With compression (recompressing the zip file with 7z, packing the dll's and exe with upx) the whole package can get quite small, into the 1-5MB range. Taking into account that a java app needs to package the jvm (15MB), and still doesn't gain the ability to be independent from other installations if necessary, that's not all that bad. In my case that includes wxpython, win32all, and a few other odds and ends, all packaged up automatically with no changes necessary to my source (with the exception of 'import psyco; psyco.profile()').

Take this with a grain of salt, as there are always bugs and corner cases, etc, but I'm happy when I do some simple thing the straightforward and obvious way, and everything works as expected. I'm ecstatic when I do something complicated and tricky, and everything JustWorks.

-- WilliamUnderwood


Order of Declaration

Maybe they fixed this since, but Python used to be picky about the order of function declarations. In textual writing, you put general first and details below. Python has this backward. Is it done for efficiency? I usually want human readability over efficiency.

There is no such thing as 'declaring' functions in Python (you only define them.) Python has never had any ordering requirements for functions. There's nothing to stop you defining functions in any order, even if they call each other:

  # works fine
  def a():
      b()

  def b(): pass

You cannot call a function before you have defined it:

  # doesn't work
  myfunc()
  def myfunc():
      pass

But there's no need to ever do that. Just define all your functions first, then call them, either at the bottom, or (better) from within a 'main' function. There is no ordering requirement.
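
The 'main' arrangement might look like the sketch below, with general code first and details underneath. The function names are arbitrary, and not_yet_defined exists only to show the one ordering that does fail.

```python
def main():
    return a()      # a and b are looked up when main runs, not when it is defined

def a():
    return b()

def b():
    return "done"

result = main()

# Calling a name before its def statement has executed fails at run time.
try:
    not_yet_defined()
    early_call_ok = True
except NameError:
    early_call_ok = False

def not_yet_defined():
    pass

print(result, early_call_ok)
```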


Too Tool Dependent

When working on Python code in multiple environments, it's far too easy to introduce errors (both of syntax and logic) in your code if they aren't set up exactly the same. The first time I had to spend time extracting spaces from files because I dared to work on them in some other editor was enough to turn me off the language, no matter how pretty and elegantly designed it is.

I don't understand this. Python itself isn't tool dependent at all. Use any editor you like. Don't mix tabs and spaces. If you use tabs, then configure all your tools to display tabs the same size. That's about it. Doesn't seem hard nor unusual.


Break Need for Switch/Case Statements

Modernized switch/case lists don't need a "break" statement. Why the hell did Python keep it? See IsBreakStatementArchaic.

It didn't. There is no switch/case statement in Python.


See also PythonRubyAttrComparison, PythonRubyInitializer, PythonWhiteSpaceDiscussion, InterfacesInPython, PythonThreeThousand.


CategoryPython


Last edited January 19, 2014