Faster Than Cee

Primary discussion is in 'AsFastAsCee'. But, if 'AsFastAsCee' is a powerful benchmark for performance, then 'FasterThanCee' is the true holy grail: any computer language that can boast 'FasterThanCee' for a wide spectra of problems has much greater potential to penetrate the wide array of programming markets that are (despite many very reasonable arguments contrariwise) still obsessed with PrematureOptimization and 3% coefficient differences.

"FasterThanCee", referring to the C programming language (CeeLanguage), is primarily a claim of optimization (as opposed to, say, productivity). Although the C language isn't well designed for high-level or program-wide optimizations, it is already low-level and many thousands of man-hours have been placed into the modern C compilers, such that it is quite difficult to beat a C program for speed... especially hand-optimized C.


FasterThanCee is inevitable

C and similar languages (Fortran, for example) today have a performance advantage for many programs due to runtime models that closely match the hardware architecture. However, it is inevitable that CeeLanguage will lose this advantage. And it's already happening:

These changes in the architecture eat steadily away at the grasp C and brethren (C++, Fortran, etc.) have on performance. Unfortunately, language benchmarks will often fail to demonstrate performance scalability issues, or scalability issues of any kind (reliability, for example). It's ArgumentByLabToy - intra-language benchmarks are kept small because they're economical to write, easier to compare for rough equivalence, and reduce variability due to design choice.

The places where C offers a "small-scale program" advantage (uniform memory model, explicit GC, etc.) will forever continue to offer enhanced perception of C/C++'s performance fitness. Higher-performing languages that lose some small-scale performance advantages will have a challenge on their hands for proving viable performance. They'll need to sell themselves on other properties then prove 'adequate' performance, and forever suffer the reputation of being a bit slow. That C/C++ may have failed to achieve adequate performance for the same problem, it is likely that nobody will never know.

Makes me wonder how many languages already qualify as FasterThanCee on the large scale.


Despite the wall presented by the powerful C optimizers and the wizards who come in and further hand-optimize code to show how 'fast' C really is, there are some languages that have proven to be FasterThanCee, sometimes much faster, in limited domains - especially in domains that benefit a great deal from significant code-specialization (VHDL, simulators) and immutable or interned values and garbage collection (e.g. backtracking searches, planning in weak AI, inference systems). Hand-written C code must be written generally to the problem domain (a set of models), whereas code-specialization techniques essentially write the code for the specific model being examined at the time (and will rewrite it as often as necessary for new problems). And, for immutable and/or interned values with garbage collection, as are often useful for large, backtracking searches, C is simply a major hassle to use... even wizards of the language will often avoid building the massive structures necessary to make them work efficiently. As a consequence, C programs will often lack optimizations made easy by other languages.

There are also some languages that are, generally, AsFastAsCee, or approximately so, plus or minus based on the problem. This is most quantifiably evidenced by the ComputerLanguageBenchmarksGame. Examples include OcamlLanguage (a mostly functional language with type-inference), DelphiLanguage (Borland's Object Pascal), and MercuryLanguage (a functional & logic language based on PrologLanguage but purer and better designed for optimization).


In my opinion, nothing is FasterThanCee yet. My benchmark: chess programs. None of the top programs are implemented in anything but CeeLanguage, CeePlusPlus, or AssemblyLanguage. Chess programs are in a class which must run AsFastAsPossible?, so they eschew dynamic memory allocation (excluding all FunctionalProgrammingLanguages, sadly, and most modern programming convenience libraries) and require direct compilation (excluding interpreted languages like Java and Python) and the best optimization (gcc or intel compilers). Chess programs tend to implement domain-specific versions of data structures provided by dynamic languages (strings, bit vectors, hash tables, lists, multithreading primitives) in order to squeeze out the last iota of performance, and only low-level languages allow them to do so.

I don't think Cee is a fast language because a language isn't fast or slow per se. Its implementation (compiler) makes it fast or slow. That being said, what about a chess game or Go game that uses a relational database with statistical voting (this move won the game how many times, versus that other move)? I wonder if chess games are reinventing databases by using hash tables, lists, etc. In order to build the best Go or Chess game, I believe the computer just has to have a large database of all the possible moves and vote which moves won the game the most, and vote which moves won the game the least. The top chess players' moves in competitions can be recorded and stored in a database. It takes up more hard drive space to have thousands (millions?) of move combinations pre stored though. I believe some Go games use relational database backends.

There are multiple databases in use by competitive game programs, each customized to the domain:

Of these, only the openings have any use for ad-hoc relational searches, and then only when used by humans to do their own opening preparation (i.e. what has a particular GM been playing lately?). Chessbase has rolled their own, for no particular reason.

Your note about Go is interesting. It turns out that Monte Carlo Tree Search has made great strides lately. The hand-coded yet brittle Go evaluation functions have been superseded by integrating the results of thousands of randomly played games. The more randomly played games, the better the results, so speed and parallelization is again of the essence, and the best Go programs are all written in C.

They are not all written in C; there is a program called Moyo Go Studio which was written in DelphiLanguage, but it wasn't a Go game it was just a Go analysis tool which used databases as the backend. I don't know much about it since I haven't used it, but I read about it. A lot of time has been spent making Cee compilers produce tight, fast code - many compiled languages could be just as fast as cee if people spent the time on those compilers (i.e. modula or oberon: why would cee be faster than modula or oberon?). You could make a slow Cee language interpreter (or you could make it fast if you spent more time).

Also, are the Go programs and Chess programs really written in plain C? Or is it C with preprocessor macros, or is it actually C++? Why wouldn't C++ be used instead of plain C? Plain C is so primitive. PlainCeeProgrammersAreLuddites; even if better technology is available, some guy will still use plain C just to boost his ego. Why, for example, would someone use plain C in a game, when games often have lots of strings in them.. wouldn't you use a mixture of C++ (with managed strings as classes) in addition to using some more plain C for other stuff?

[The amount of game states might be larger than the number of atoms in the visible universe. But feel free to go for it. I understand that many begin-games and end-games are handled this way. I do agree we could take better advantage of data resources, at the very least to mine for strategies and heuristics.]

You could just store the most common moves in the database, like the top chess player moves in certain game positions, and for other game positions, you could resort to calculation on the fly method (brute force). I think maybe this is how some Go systems work like Moyo Go studio, but I have to admit ignorance here since I don't know enough about it. Apparently what Moyo Go did was record a database of many top Go players common moves in certain game positions. Microsoft apparently reverse engineered Moyo and used the technology in a product of their own.


See also: CeeLanguage, SufficientlySmartCompiler, DoMicroprocessorsLoveCee, AsFastAsCee

AprilTwelve


EditText of this page (last edited April 10, 2012) or FindPage with title or text search