I am skeptical of need for most developers to perfect their parallel programming skills to take avantage of multi-core processors. There seems to be hype pushing in this direction. I would like to explore actual scenarios where it is necessary. Does anyone want to suggest any?
I will agree there are some domains/situations where its needed. So let's list the exceptions up front:
I agree, but adding app language gimmicks is not necessarily the way to achieve it. For batch reports, for example, even if you have 100 cores, its easier to manage say 4 processes or streams at a time than 100. Its best if the parallelism is used to speed those 4 streams instead by using the 100 cores to speed up queries in terms of indexes, sorts, sequential searches, etc., which is something managed mostly by the DB engine, not the app developer.
Further, CPU is often not the bottleneck in such cases, but rather the managing of shared resources, such as shared tables. Parallelism can't help as much for that. OnceAndOnlyOnce for a given fact flies in the face of parallelism. If the facts are mostly read-only, then copying/caching can help to some extent, but at a point there is still a bottleneck of multiple caching engines trying to make a copy of that one thing at the same time. And, RAM usage goes up if one makes multiple copies in order to feed all the cores, thus RAM may become the bottleneck before CPU does.
Huh?
Moved rest of discussion to AcidCompromisedForPerformance.
I, speaking as both a language designer and person who works often with distributed and parallel computations and communications, am largely of the opinion that it is the languages that should improve to this task. Using mutexes with all their pitfalls for parallel programming (deadlock, livelock, priority inversion, and the issue that the sub-programs aren't safely combinable: i.e. you can't call procedure A from a synchronized section B without knowing all the mutexes A accesses, lest you risk deadlock or livelock). Unmanaged memory also has its pitfalls for parallel programming (mutex-destructor issues, figuring out when memory being used by many threads can be freed, etc.). Avoiding these pitfalls is what takes a lot of 'skill'... in addition to foresight and careful planning.
Rather than requiring "developers to perfect their parallel programming skills", I believe efforts can and should be made to lower the bar for the developers such that perfection isn't required to avoid problems and getting parallel programming right doesn't require any 'special' degree of skill. My favored approach would be to embrace SoftwareTransactionalMemory and perhaps to support an atomic keyword for blocks of state-manipulator actions (which works so long as no action within the block demands input from something other than an STM-managed mutable-state source). This has been done in several languages (including ConcurrentHaskell? and ClojureLanguage) but these features obviously haven't hit the mainstream.
I'd also suggest embracing a more ActorModel or CommunicatingSequentialProcesses based concurrency model, either of which would reduce program error a great deal by limiting most operations to message passing (immutable messages, though possibly passed via shared memory) and reduce shared state to where it is absolutely necessary (shared state = shared process). There is still motivation even here for cross-process and sub-process transactions (as per motivating examples in TransactionalActorModel).
There is a topic around wiki somewhere that debates how GUI status/progress messages are handled. I'll link to it when I find it.