From SynchronizationStrategies:
Where CodeLocking provides locks for algorithms, ParTition provides locks for instances of data structures. Programs that have many instances of independent data structures (an example of what many would consider an EmbarrassinglyParallel program) can achieve arbitrarily large SpeedUp with relatively small DevMaintCost.
However, many algorithms require some rework to cause their data structures to be more independent of each other. Attempts to ParTition such algorithms can result in exorbidant DevMaintCost for unimpressive SpeedUp.
A special case of ParTition that avoids CriticalSection altogether is DataOwnership.
The following papers describe partitioned algorithms:
``Efficient Kernel Memory Allocation on Shared-Memory Multiprocessors'', Paul E. McKenney and Jack Slingwine, Winter'93 USENIX.
``Efficient Demultiplexing of Incoming TCP Packets'', Paul E. McKenney and Ken Dove, Spring 1992 Computing Systems and SIGCOMM'92.
``Efficient Buffer Allocation on Shared-Memory Multiprocessors'', Paul E. McKenney and Gary Graunke, IEEE HPCS'92.