Productivity Rant

There's a pretty definite link between the number of statements in a program and the time required to create the program, regardless of language. This is a rant, not a reasoned argument, so I'm not going to bother to dig up references. But check out Capers Jones. And if I'm totally wrong and off-base, well, hopefully the rest of this rant will be entertaining, if not particularly insightful.

So, here's my rant: Given this (so-called) fact, that finished program size correlates directly to effort, and that more expressive languages like Smalltalk are more productive than less expressive languages like C, why are we still using low-efficiency languages?

I have an answer. A crackpot answer, actually: It's because we worship at the altar of Turing. If a language isn't TuringComplete, it's not a "real" language. All the Real Computer Scientists look down their noses at it, as do the Real Programmers, and the rest of the poor schmucks never even hear about it.

Well, what's so great about Turing-completeness? Okay, yeah, sure, if we aren't Turing-complete then we can't solve every problem that has a computable answer.

So what.

I gotta tell you, in the work I'm doing, there's not a whole lot of computation. There's database interface stuff, user interface stuff, a bit of glue, some obscure accounting rules, but not much in the way of fancy computation. Hell, before I came along, my client did just fine with a dumb front-end to the database. Last I checked, relational databases weren't Turing-complete.

What's the most popular programming language currently in existence? I'll give you a hint: it's not VisualBasic, and it sure ain't PerlLanguage. It's MicrosoftExcel. Yeah, the spreadsheet. Real people use Excel every day to solve their computing problems. They type in some numbers and formulas and get results back. They create one-off programs in a matter of minutes. They don't bother testing, revision controlling, designing, or (most of the time) debugging because their programs are too simple. Some of those programs get saved and reused, some get tossed.

(Oh, and most of the time they don't hire us expensive programmers, either. Hmm. Maybe the root of our resistance is simpler and older than Turing-equivalence.)

So how is this possible? The language is focused and simple. (Simplicity) The customer can express his requirements directly in the language without going through a team of programmers. (Communication) The results are immediately visible after he expresses them. (Feedback) The customer knows that the language works, works well, and is the correct tool to use for most computing needs. (Courage)

Spreadsheets aren't just a fluke, either. HTML 1.0 and SQL are other good, popular examples of programming languages that allow users to get results quickly and easily.

If we wish to improve the current sad state of programming affairs, we should focus our efforts in two directions:

Simplify our programming languages, make them domain specific, and get our expensive asses out of the way of our customers.
Make the simple languages' power scalable. Make it possible for the spreadsheet calculation to be put on the web or integrated into the accounting system, without changing the language it's written in or its accessibility. (After all, the original writer of the calculation will want to change it.)

Turing-completeness is a RedHerring with minimal value in the real world. Let's focus on the real goal, which is delivering more software that meets requirements more accurately with fewer defects. To achieve that goal we need smaller, more powerful programs that can be written by the customer. Languages are the key.

Gerald Weinberg made a similar point in 'Quality Software Management Vol 1 - Systems Thinking'. He described his own version of the 5 CMM levels, and added a level 0 for people who did not even know they were programming, i.e. users of spreadsheets etc. He made the point that these people are programming quite successfully without any process whatsoever, and imposing process would probably get in their way (as far as I recall - it has been a while since I read it) -- DaveKirby

I think the allure of TuringCompleteness is that one language can (theoretically) do it all. If you know C, or Java, or C++, or even PHP, you can use it for all your problems and never have to learn another language.

Personally I think this is a silly idea. Even if you can program an OS in Java or a compiler in PHP or a database-backed webapp in C, that doesn't mean you should. In most cases, the investment needed to learn a new language is dwarfed by the domain knowledge required by the problem, so you might as well have the language capture some of the domain knowledge and save on the big learning curve. -- JonathanTang

''If we wish to improve the current sad state of programming affairs, we should focus our efforts in two directions:

Simplify our programming languages, make them domain specific, and get our expensive asses out of the way of our customers.
Make the simple languages' power scalable. Make it possible for the spreadsheet calculation to be put on the web or integrated into the accounting system, without changing the language it's written in or its accessibility. (After all, the original writer of the calculation will want to change it.)''

Hmmm. The problem I see in this is that what unschooled programmers (= customers) do is often the worst of the solutions, and allowing it into a production system is tantamount to product suicide. Since they don't test it, don't consider big O issues, don't normalize tables, don't check for buffer overflow, etc, etc, etc, they are going to destroy a lot of data, chew up a lot of CPU and waste a lot of time. You don't let just anyone work on your car, why would you let just anyone program your business-critical systems? -- PeteHardie

Those problems don't always occur. Our jobs will even stay secure; they'll call in the programmers if those problems happen. Many people change their own oil, but if they strip a bolt or something while changing the oil, they take it to the mechanic.

Let's remember that "no programming necessary" is an AlarmBellPhrase.

When scalability is a concern, sometimes you can use more powerful languages to look inside the less-powerful format and pull it out. HTML is a good example; it's basically just plain text and can be parsed easily. That's how search engines work.
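To make "look inside the less-powerful format and pull it out" concrete, here is a minimal sketch in Python using the standard library's html.parser. It is not how any real search engine works; the class name TextAndLinks and the sample page are made up for illustration. It only shows that the words and links in a page can be recovered with a few lines of a general-purpose language.

```python
from html.parser import HTMLParser


class TextAndLinks(HTMLParser):
    """Collect the visible words and the link targets from an HTML page.

    Hypothetical helper for illustration only; a real indexer would also
    skip <script>/<style> content, resolve relative URLs, and so on.
    """

    def __init__(self):
        super().__init__()
        self.words = []
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the opening tag.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

    def handle_data(self, data):
        # Text between tags arrives here; split it into words.
        self.words.extend(data.split())


# Made-up sample page.
PAGE = (
    "<html><body><h1>Price List</h1>"
    '<p>See the <a href="/widgets.html">widgets</a> page.</p>'
    "</body></html>"
)

parser = TextAndLinks()
parser.feed(PAGE)
parser.close()

print(parser.words)
print(parser.links)
```

The author of the page never had to think about any of this, which is the point: the simple format stays simple, and the heavier language does the digging.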

Something like MicrosoftExcel is a little more difficult. If you just need to move data from an Excel table to a DB, you can export it to tab-delimited text and then write a script to insert each line as a new row. But if that Excel spreadsheet contains complex functions, what does the interface to that look like? I'm sure someone out there has figured out how to use VBA to dig that stuff out, but it's harder.
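The easy half of that, exporting to tab-delimited text and inserting each line as a new row, might look roughly like the sketch below. Assumptions: Python with its built-in csv and sqlite3 modules, an in-memory SQLite database standing in for the real DB, a first line that holds the column names, and a made-up table name (ledger) and sample data. The function load_tsv is hypothetical, not part of any library.

```python
import csv
import io
import sqlite3


def load_tsv(conn, table, tsv_file):
    """Insert each tab-delimited line as a new row; return the row count.

    Assumes the first line holds column names and every later line has
    the same number of fields. Values are stored as text, untyped.
    """
    reader = csv.reader(tsv_file, delimiter="\t")
    header = next(reader)

    # Quote identifiers so odd column names from the spreadsheet survive.
    columns = ", ".join('"%s"' % name.replace('"', '""') for name in header)
    placeholders = ", ".join("?" for _ in header)

    conn.execute('CREATE TABLE IF NOT EXISTS "%s" (%s)' % (table, columns))
    rows = [row for row in reader if row]  # skip blank lines
    conn.executemany(
        'INSERT INTO "%s" VALUES (%s)' % (table, placeholders), rows
    )
    conn.commit()
    return len(rows)


# Made-up export; a real one would be opened with open(path, newline="").
SAMPLE = "account\tamount\nrent\t1200\nutilities\t150\n"

conn = sqlite3.connect(":memory:")
inserted = load_tsv(conn, "ledger", io.StringIO(SAMPLE))
print(inserted, conn.execute("SELECT * FROM ledger").fetchall())
```

Note what this does not do: it moves the values, not the formulas. Anything the spreadsheet computed arrives as a dead number, which is exactly the harder problem described above.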

Then again, sometimes you never need to scale up. Sometimes those Excel functions written by that guy in accounts will never need to be used by anybody else. But sometimes those functions need to be put on the intranet, and it's hard to guess these things in advance.

Most people know how to write, and make good use of the ability in their daily lives; this ranges from shopping lists to complex work schedules.

To produce a written work which is useful to a large number of people, on the other hand, requires much more effort, significant practice beforehand, and formalized skill.

To suggest that people might, in practice and on a regular basis, write their own software is akin to suggesting that they might write their own treatises on philosophy or epic novels. It just doesn't seem to work that way.

No, you are comparing creating a trivial Excel spreadsheet to a monumental task. Yes, people are not going to write operating systems on a regular basis, but they will (and do) regularly use their computer to solve small problems, using a general purpose program like Excel. Rather, compare writing this small once-off to writing a letter to your mother. -- GavinVanLelyveld

I'm not so sure there is a correlation between number of lines of code and amount of effort. A trivial example is removing redundant code. There is a positive amount of effort and a negative number of added lines. Another example is that it is often trivially easy to write hundreds of lines of straightforward code and very difficult to write tens of lines for a difficult procedure. I would venture to guess that two programmers each proficient in a different language could solve the same problem in about the same amount of time, regardless of the actual number of lines of code.

Bottom line, I am not sure if there are a lot of gains to be made from "simplifying" languages. Rather we need improved methods of program design. -- WayneMack

There is a statistical correlation between SLOC in the finished program and the amount of effort required to build the program. These statistics were gathered from large projects involving multiple programmers and certainly do break down when you look at individuals making day-to-day changes. But the overall correlation is there. -- JimLittle

Is it really true that we're worshipping at the altar of Turing? It seems to me that if Excel really is the most-used programming language, we're halfway to casting out the idols. End-users are doing their thing with Excel and Access and whatever else comes to hand, and we professional programmers are (or should be) writing just the programs that require our skills, that can't be done adequately with those end-user tools.

Excel is Turing Complete. See http://esoteric.sange.fi/brainfuck/impl/interp/EXCELBF.XLS for one example.

Sometimes I'm given a simple MicrosoftAccess program and asked to add loads of functionality that, well, really can't be done well with MicrosoftAccess. I usually end up rewriting the application using a more sophisticated language and toolset, which is fine because the user was just following WriteOneToThrowAway.

''Me too! But when I get the Access Database, the additional User requirements bend the design of the application so far away from the original "prototype" that the Users' expected time to "convert" it to a "real application" differs greatly from the actual time to "create" it "from scratch".''

Isn't that a good thing? The end-user got a handy prototyping tool, I got my UserStories in a fairly direct form, and the (heh) "bare" minimum of expensive ass-time is involved.

Now that I think of it, isn't slapdash programming with Excel a radical implementation of YouArentGonnaNeedIt? But when they do need "it", they come to us. Which is as it should be. --MarkSchumann

PaulGraham suggests the opposite approach where productivity is improved by using more powerful/expressive/sophisticated languages in this article http://paulgraham.com/avg.html.

What is said is true, but I draw a different moral from it.

It has long been known that if you can get your product integrated into a customer's spreadsheets, you have them hooked. (Would-be purveyors of web services take note!) It takes something extraordinary to get them to replace the spreadsheets they depend on, since nobody - including the authors - quite knows how they work. In short, spreadsheets really are programming - with all the associated traps. They are widely used because they hit a sweet spot in the tradeoff between productivity and needed knowledge. But the approach doesn't scale.

-- BenTilly

Perhaps the answer is that productivity covers far more than the initial time it takes an experienced programmer to type in code and get it working.

Someone experienced in C and not experienced in Smalltalk will be more productive writing in C.

When program size and execution time are important, it may be better to choose a language like C and take more time to initially write the code, rather than trying to performance tune Smalltalk.

Finally, it is really unclear whether lines of code is a meaningful measure when a program is in maintenance (where most programs spend 80%+ of their lives). Lines of code per hour may not be a valid measurement criterion at all.

Excel and Access seem to be WYSIWYG Programming environments. You do something and you see the result immediately as such. Maybe some of the secret is there... -- GavinVanLelyveld

Insightful comment. Reminds me of the work being done in SubtextLanguage on example-centric programming. -- RyanHoegg

''There's a pretty definite link between the number of statements in a program and the time required to create the program, regardless of language.''

I would challenge this statement with the following thought experiment. If you have 10 programmers write the same program, does it seem likely that the shortest program will have taken the least amount of effort to create? --AnonymousDonor

It might be fun to collect some data from universities. (Students and schoolwork aren't really representative of real programmers and real projects, but at least in schools you do get lots and lots of people all writing the same program. There's probably useful data to be had.) I don't have any statistics to give you; only anecdotes. But it certainly seemed to me that the people whose assignments ended up the shortest were also the people who spent the least time on the assignment. (I would chalk that up to differing skill levels, but I also remember times when people of approximately equal skill were writing the same program in different languages, and the people using the better languages tended to end up with shorter programs and be done faster.) -- AdamSpitz

My experience is that shorter programs often take longer to write but are more reliable, better tested, and often pack in more functionality. (And yes, I'm still talking students.) My professors have horror stories about students who wrote a 10,000 line Monopoly game as a final project. It didn't really need to be more than 1000 lines, but the students involved hadn't really discovered loops or functions. -- JonathanTang

See: WorseIsBetter, WhyCorporationsLikeStaticTyping

CategoryRant