Kill Mutable State

[This brain dump is my NewYears? 2005 rant. Enjoy! -- LucasAckerman]

Machine abstractions are inappropriate for people to work with directly. MutableState and linear storage are the two big culprits here.

What if we eliminate mutable state with destructive update as an optimization altogether? Files, processes, other dangerous abstractions, etc., all become obsolete.

Storage is big, and cheap. Magnetic storage is by far the fastest growing modern computer element.

Identity trumps value if storage resources are scarce and must be allocated very carefully. This is no longer true in general, so value should now trump identity.

This suggests a value-centric system organization is possible. Here is my outline.

Values are immutable, and globally interned by the system. All resource allocation is OS level, mostly of the 'allocate permanently' variety. The system maintains and persists the value store, and runs programs by computing new state values per their logic. Program logic and state are just values. The entire value of computational history is retained.

Users browse the value store, and extend it with new value computations on occasion, as well as de-allocate values they no longer care about. The system can analyze the value-graph and access intensity (heat?), to suggest which values may be discarded by the user. The value store is like one big write-only TupleSpace graph.

No process protection is necessary since values are immutable. Programs access the value store instead of raw memory. Program logic can only operate on values it has access to and can only add to them (sandbox), like in a CapabilitySecurityModel. If you make a mistake, or your code crashes, you can backup and retry, or avoid the issue, or find the bugs. It becomes trivial to organize new computations, because you can alter logic and state safely, since it's just adding new values to the system.

Value equivalence becomes trivial, even for huge values, since value 'identity' is known by the system. A value is guaranteed to be equal to itself, so it's just a reference comparison. Hooray for interned values!

Names and other MetaData (dates, value 'sizes', originators, other SideEffects, etc.) just become more values in the store. Thus we want partial-value PatternMatching. E.g.: what is the name of value x? What is the size of the value with the name "foo", etc. MetaData is no longer special, since serialized files go away. Everything is just values, period. It would be like having GoogleSearch for your OperatingSystem.

"Why is there a growing interest in search on the desktop? Many attribute this interest to the 'googlization of the desktop'," says EricFreeman (http://www.artima.com/weblogs/viewpost.jsp?thread=84331).

Normal people don't have to care about data structures or algorithms, just composing the values they're interested in. All user work and program state are implicitly versioned [OrthogonalVersioning], so the system is mistake-proof. You can always just rollback to an earlier point in history (state, logic, value, etc). Everything becomes inherently branchable and composable. Input history is retained too. It becomes trivial to record and replay programs, since everything is already retained!

All resource management becomes system-wise, and MemoryFragmentation? goes away. The system would continually massage the persisted value store while otherwise idle. Programs can extend the system's internal value 'formats' by providing compressors, or just operating on compressed values themselves. Everything in general can become efficient in the incremental, LazyEvaluation, on-demand sense. Duplicate values are never stored.

Types become kind of irrelevant, a by-product or side concern, descriptive rather than prescriptive. Is value X in type Y? Or type can just be another part of the value. Type values would be easy to tag things with.

Temporary stuff and weak-cache can be used for fast mutation-heavy processing (graphics, games, etc). Call it "unsafe" mode. Important values are retained (initial state, configs, few last good used states). Anything the user does is important, as are expensive processing results.

Coders can move beyond the text-centric-program limitations. Text yields logic, but logic values can then be used with DirectManipulation! Program logic is inherently backtrackable. Crash, backtrack, fix, wohoo! Also, crash-states and transient bugs are never lost, so every bug can be found.

Programs should be short, and largely compositional. Compilation as a necessary optimization can be discarded, since you can glue logic and state together at will. Or logic can be compiled as appropriate for speed reasons, by the system, on-demand. Compilation is treated as a compression-style optimization though, as you can always retrieve the original logic (excepting sourceless distribution). Programs no longer need to manage their own environments, except for keeping the working set as small as possible so temporary values can be garbage collected.

Linking, dependencies, packaging, etc, all become trivial. The value store is like a tactical nuke on all this old inefficient crap. Dynamic vs static issues basically go away.

The edit-compile-link-debug etc. dies for good. Instead, your code/state editor can build the values as you edit them. This should then feedback to the editor in realtime, to show you errors as you type, etc. Then running a program just is executing with those new values. Code building can become lazily decoupled, incremental per-statement. Separate compilation is de facto standard, all the time, for free.

[Please note that this is all system-level. It has nothing to do with languages, OOP vs FP, etc, aside from issuing a strong judgement on the current state of computing affairs.]

It does certainly have a lot to do with FunctionalProgramming. There have been various proposals and research projects on PurelyFunctionalOperatingSystems with similar goals, for example <http://citeseer.ist.psu.edu/669829.html> and <http://citeseer.ist.psu.edu/context/89769/0>. -- DavidSarahHopwood

DistributedComputing becomes easy, since you just have a distributed value store and some distribution logic. Online systems would persist everything naturally [OrthogonalPersistence]. No more connection/connectionless issues, just distributed value store replication. Everything could be p2p by default. You would mark values as temporary, permanent, or public. It would be very interesting to browse "what new values were created in the public store today?" to find games, art, news, etc.

Traditional OperatingSystems become obsolete. The only relevant questions are what values and resources your platform can manage efficiently. Platform native logic and resources will of course be the fastest, but everything else is viable.

Media piracy becomes trivial and irrelevant. Either you have a legitimate local copy, or you don't.

"Content providers" might not see it that way. Some of them want to be able to control what you do with your local copy (see DigitalRightsManagement). This is silly and shouldn't be allowed, but they want to do it anyway.

The distributed value store can track access rights to manage distribution. Anything copyrighted being offered publicly to the distributed store is a big red flag. Online distribution both ubiquitous and safe.

Formats, protocols, and so forth, all become global public goods instead of proprietary private goods.

The world of computing evolves into a transparent, safe, general, flexible state, and everyone is happy. The end.

-- LucasAckerman

This sounds like "LetsBlowUpTheUniverse!" :-> Take a look there for similar thoughts.

It's a nice idea, and certainly one that's come up several times before, starting with JohnBackus's TuringAward lecture. But there're some problems that have so far kept it from becoming a reality.

State is an inherent part of a lot of problems. Consider a really simple GUI problem - you have a buddylist where selecting one screen name means that you have one set of operations available ("Get info", "Send IM", "Send file", "Direct connect", for example), while selecting multiple screen names enables another set ("Invite to chat", "Remove from buddy list"). These actions may appear in multiple places - a popup menu, toolbar buttons, keyboard shortcuts, menu bar, preferences, etc. When the user changes the selection state, you want all of those places to change their appearance. You don't want to have to manually send messages or do checks within all those spots, because that's a massive violation of OnceAndOnlyOnce (and is unfortunately what you have to do with many OO frameworks; I've entered setEnabled() hell a few times, and am very glad that things like Java Actions and MozillaXul Commands let you automatically propagate that state.

This is an inherent part of the state problem. Mutable state is bad because of aliasing; you don't know whether a change will impact some distant part of the program. Without aliasing, mutable and immutable state are semantically equivalent: (set! var value) is just like ((lambda (var) (rest-of-body-including-var)) value). But aliasing is precisely the effect you're trying to achieve with state: you want long-distant changes, because certain attributes cut across the program to impact many program areas.

Once you have aliasing, you get many of the same problems as state, whether you call it mutable state or not. Imagine a TupleSpace database. Each time you insert a tuple, that tuple might be read by hundreds of processes. You have no way of knowing whether this will cause unintended consequences until you do it. It's like assignment with PatternMatching, which is a cool feature (ErlangLanguage can do it, as can CommonLisp), but doesn't really make state management any better.

Then there're also concurrency concerns. TupleSpaces were made for concurrency, so they take care of some of the issues, but they're also non-deterministic. If there are multiple matches on a tuple pattern - which seems all too easy if you make a global, system-wide store, you have no way of knowing which tuple will be consumed. I think that's one main reason why TupleSpaces haven't caught on for major programming.

-- JonathanTang

Aliasing is not the 'effect you're trying to achieve' by modeling state. It's a side-effect (pun not intended). You're absolutely right that the disadvantage of aliasing is that 'you don't know whether a change will impact some distant part of the program'. One of the main motivations of purely functional approaches to modeling state is that they do allow you to know what changes can impact which parts of the program.

Consider your buddylist GUI example. The dependency of the operations available on the screen name can certainly be specified in a more declarative way than by explicitly changing every relevant GUI element in response to a change in the screen name. They can be a (pure) function of the screen name that is automatically recalculated when its inputs change. Perhaps the emphasis should be on DeclarativeProgramming and minimization of non-determinism rather than on purely functional programming in the strictest sense, but there are strong arguments (e.g. see WhyFunctionalProgrammingMatters) for starting with a purely functional model and relaxing it, rather than starting from an imperative model and trying to reduce the use of explicit state only as a matter of style. -- DavidSarahHopwood

I don't see how this is workable. Say a dual processor can generate 5 billion 64 bit integer results a second. without *any* mutable state I would have to allocate 40 gigabytes per second to store the results. Losing a reference to an object by either returning from a function or by tail call optimization or via thunk replacement are all mutations, and it is only by behind the scenes mutation that a purely functional language's runtime doesn't grow space without bound.

re: "Users browse the value store, and extend it with new value computations on occasion, as well as de-allocate values they no longer care about." I'm not sure how you safely deallocate without mutation. Either an object is reachable or it is not. To make an object unreachable, its reference or something that refers to it has to be mutated to something else, or it has to go out of scope which itself is a mutation by the runtime (e.g. decrementing a stack pointer).

"No process protection is necessary since values are immutable." If users can deallocate objects then how can process protection be unnecessary? If you allow runtimes to do behind the scenes mutations to prevent unbounded space allocation, then you need protection. At least your allocator will need locks, else how will you decide which process gets the next chunk of physical memory? How would you write a memory allocator without mutation (of the free space pointer)? -- JamesMcCartney

JamesMcCartney is unintentionally trolling here by completely misunderstanding or ignoring the nature of the proposed system. For homework, tell us why.

By analogy with GotoConsideredHarmful: I don't see how this is workable. without *any* GOTO I would have to follow every single control pathway unconditionally. Changing the flow of control with an if/then/while/try/catch are all GOTOs, and it is only by behind the scenes GOTO that a structured language can conditionally execure code. ... etc.

Yes, JamesMcCartney probably misunderstood the proposal. After all, it's about abstracting mutable state away, not eliminating it. Some of his points are still valid, though. But what bugs me is this: if a system doesn't have any mutable state in principle, how does the user fit into it? How can he interact with the system, if it resists any change? Or put differently: from the perspective of the system, WHAT IS the user?

I can at least list some things which he cannot be:

A function, as he doesn't necessarily react to the same input in the same way all the time.
A data structure, as he has got mutable state, which isn't possible for data structures in such a system.

In an object oriented environment, you can treat the user as just another object, because everything the user does is sending and receiving messages to and from other objects. But in a system without mutable state, this isn't possible. So, what exactly do you propose to solve this dilemma? -- MatthiasBenkard?

Indeed, Matthias. It's easy to for FunctionalWeenies to forget that computers must do two things: communicate and calculate. A FunctionalLanguage (in the 'pure' sense) is defined to only describe the latter. Any signals delivered from a system to a user inherently affect state, and time rolls forward. Attempting to KillMutableState entirely is ridiculous, as it simply puts the entire onus of state on the users.

A FunctionalWeenie's standard action at this point would be to stand on a soap-box and proudly claim that Monads will solve the problems. After all, any description of a computation is a value (as all descriptions are) and monads describe computations. The problem: the moment you stop merely performing calculations over these Monad descriptions, and start attaching Monads describing computations to things like 'main' and CPUs and such, you suddenly have an imperative language in the form of standard monads with relatively standard state descriptors all compiled to a computational environment. The soap-box dissolves. The FunctionalWeenie was really an ImperativeWeenie? in disguise.

It's funny to watch... for a little while.

In any case, the moment communications become a procedural part of the system, you have mutable state. If nothing else, you have state in the form of signals propagating over time, and capturing state can be performed by having two processes repeatedly bounce signals off one another. It's wiser to just give in and create a mutable state primitive - a 'cell' - to capture and ground such signals... a special service that cooperates with the hardware and handles the above conceptual bouncy-signal-thing in a far more efficient manner, such as physical encoding.

However, a more moderate (and far more realistic) form of KillMutableState is to make MutableState into something acquired explicitly from the Operating System and treated as a service. This has the advantages of allowing a great deal more control regarding the use of MutableState, limiting it to where it is needed and allowing such things as realistic SoftwareTransactionalMemory, VersionControl, OrthogonalSecurity, OrthogonalDistribution?, OrthogonalReplication?, etc. MutableState simply won't be used except where it is fundamentally needed. Don't KillMutableState; just end its ReignOfTerror?.

I just stumbled upon this page after playing with the idea of a lazy imperative language in one of those idle conversations. Call it "Whatever", because whenever a statement is reached in the control flow, the runtime just goes "yeah, whatever", and drops it on a things-to-do pile. The entry in the pile contains references to the then-current values of each variable cited in the statement, while assignments result in the current value of the variable being assigned to becoming "whatever this statement makes it".

The runtime goes racing through each basic block, pulling up only at branch points and volatile stuff like I/O; only then does it start to go through its to-do pile, and even then only enough in order to determine which branch to take or handle input/output ("so I have to output the value of Foo, whatever that is...").

From userland the variables are still mutable since different entries in the to-do pile define its value at different times, but any older values would persist at least as long as there is still something that refers to it.

Okay, it's a pretty vague idea; I see it's basically a rehash of conservative TransactionSemantics. Really - "lazy imperative" programming? Whatever.

CategoryDistributed CategoryFunctionalProgramming CategoryRant