Eventual Side Effects

Eventually, programs will produce side-effects. This is my biggest gripe with the pure functional weenies...

On this page we discuss large systems, the no-side-effect hype, whether functional programming is focusing on the wrong area, imperative and database alternatives to the side-effect problem, locking, connections, etc.

Pure FunctionalWeenie folks make a big deal about protecting programmers from side-effects - but have they thought about the way large systems work in practice, such as the Google File System, which is apparently written to with procedural/imperative languages (SawzallLanguage) rather than functional ones? Have the functional weenies thought about data storage and the real side-effect protection we need - i.e. a system that manages the side-effects, not one that bans us from causing them?

I was recently listening to an audio interview with the inventor of Haskell about side-effects (http://www.dotnetrocks.com/default.aspx?showNum=310). It really struck me as useful and got me interested in researching Monads more, especially since I am a MopMind and the audio discusses modularity. The interview was about how imperative languages cause side-effects and how we cannot easily predict which procedures and functions are causing which side-effects. I understand Monads and I JustGetIt. It's simple. However, my critical mind started querying whether his side-effect protection in the language is really focusing on the right protection in the first place. Can we protect a program at a macroscopic level, so we do not have to care about the individual functions? Can we use a program as our modularity level? Yes and yes. Must we focus at the algorithm (individual function) level, rather than the program or storage level? Not necessarily - we can still use functional ideas at a higher level and not microscopically treat every minor thing as a function.
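
For readers who have not seen it, here is a minimal sketch (my own, not from the audio) of the property being discussed: in Haskell, the type signature alone tells you whether a function can perform side-effects, so "which functions cause what" is statically visible.

  -- Pure: the type guarantees this touches nothing outside itself.
  double :: Int -> Int
  double x = x * 2

  -- Effectful: the IO in the type advertises the side-effect.
  logResult :: Int -> IO ()
  logResult n = putStrLn ("result: " ++ show n)

  main :: IO ()
  main = logResult (double 21)  -- effects can only happen inside IO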

Do the side-effects (the way he discusses them as a problem) matter as much as the inventor of Haskell thinks they do? Is he protecting programmers the right way - by forcing them to code in a functional manner? Why can't we protect against side-effects at the DB level and at the storage level, instead of within each and every algorithm in our programs? In a large system with many computer nodes, simply have all the nodes write data to a protected storage system - forget forcing programmers to write code in non-side-effect ways. It's the storage system that is the problem, is it not? At the node level, who cares if the node has memory side-effects and maybe temporary swap/filesystem side-effects? Until that node writes to the main storage/db/file system, the side-effects aren't a worry (this is all in theory - please feel free to shoot down my theory, but do some reading on Sawzall and the Google file system first).

Consider about 305 computers placed together in clusters on a custom-written file system (like the Google File System). Consider that these 305 computers (nodes) all write to the file system and process data. Better yet, consider that they write to a database, since file systems suck for storing data.

Could not the side-effects be controlled at the database or file level - so that no matter what programs are writing simultaneous crap a million times a second, the data is still controlled appropriately? Think about 305 computers all trying to ram data into the same database. If this database manages the connections at its little gateway, and if it has table-level or row-level locking, etc., then what, pray tell, do we need to worry about side-effects at the algorithm level for - like the functional weenies so convincingly try to make us believe? Why not worry about the side-effects at the storage level - the storage is our side-effect! We need to control the storage and how it manages the node connections, more than we do the individual nodes, which can have as many local side-effects as they want in their own node space, so to speak.
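
To make the proposal concrete, here is a hedged Haskell sketch using software transactional memory as a stand-in for the database's gateway and row/table locking (the 305-node setup comes from the example above; nothing here is a real database). The nodes race freely; the shared store arbitrates every write transactionally, so the nodes need no side-effect discipline of their own:

  import Control.Concurrent (forkIO)
  import Control.Concurrent.STM
  import Control.Monad (forM_)

  -- The "storage gateway": writes are serialized transactionally here.
  writeResult :: TVar [String] -> String -> IO ()
  writeResult store row = atomically (modifyTVar' store (row :))

  main :: IO ()
  main = do
    store <- newTVarIO []
    forM_ [1 .. 305 :: Int] $ \node ->
      forkIO (writeResult store ("result from node " ++ show node))
    rows <- atomically $ do          -- block until all writes land
      rs <- readTVar store
      check (length rs == 305)
      return rs
    print (length rows)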

Consider 14256 computers all filtering data and processing text, eventually writing their results to a file system or database. Why should the programmer be writing in odd functional languages? Why not just let all 14256 nodes be private on their own (screwing with their own memory, causing side-effects in their own little node space, doing whatever they please)? Then, when these nodes all have to submit their results to some global gigantic database or file system - fine - that is the time to worry about side-effects; we worry when we are writing results to our storage. We don't worry within each fiddly program and algorithm on the local node itself, as the functional people seem to think we must (but feel free to shoot down my argument with evidence that worrying at the algorithm level is still needed).

Because this very nearly reinvents Monads. It's such a good idea that you just independently discovered it. The problem with side-effects isn't that they occur, but that they usually occur in global space, which you are ruling out by saying the side-effects should happen at a certain time on a transactional, externally managed basis - which is what Monads are all about.
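
A rough sketch of the parallel being drawn, in Haskell (the Write type and commit function are invented for illustration): the node computes purely, accumulating its intended writes as plain data, and a separate step performs them all at a managed time.

  data Write = Write String String   -- key, value

  -- Pure: the node's only externally visible output is pending writes.
  nodeWork :: [String] -> [Write]
  nodeWork inputs = [ Write w (show (length w)) | w <- inputs ]

  -- The managed, eventual side-effect: commit every pending write.
  commit :: [Write] -> IO ()
  commit = mapM_ (\(Write k v) -> putStrLn (k ++ " := " ++ v))

  main :: IO ()
  main = commit (nodeWork ["alpha", "beta"])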

Which node is going to write data first? Which node will write data last? None of this is a worry at the functional-algorithm level! It is all in the architecture of the nodes (at the node level), in the data storage, and in the connection handling (simultaneous nodes writing to the same file or database).

In other words, I think the focus on banning side-effects in functional programming is aimed at the wrong area. I think the nodes and the servers they connect to need to be managed at that level. The side-effects the computers cause need to be managed, not the individual functional algorithms! If we try to manage the side-effects in each algorithm, we are wasting time - it is the bigger picture that matters. Each computer is a cell and can internally have any side-effects it wants; it's when it connects to the bigger system that it has to worry about side-effects, and these side-effects are essential to the system doing anything useful. And the side-effects should be something the storage system can control (at least partially, as with database connections, locking, handling several nodes, etc.).


Pure FunctionalWeenie folks make a big deal about protecting programmers from side-effects - but have they thought about the way large systems work in practice, such as the Google File System, which is apparently written to with procedural/imperative languages (SawzallLanguage) rather than functional ones? Have the functional weenies thought about data storage and the real side-effect protection we need - i.e. a system that manages the side-effects, not one that bans us from causing them?

I'm afraid you completely misunderstand the intent of pure functional languages.

The whole idea is precisely to manage the side-effects, just as you request. We all know that there are EventualSideEffects. This is such a non-issue that it totally didn't need to be raised. However, for any computational process, you have a set of inputs, and a set of desired outputs. The challenge is, "How do I implement a system that takes these inputs, and produces those outputs, without affecting anything else in the system?"

Just as static typing is an automation of checks and balances between the kinds of inputs, it also can serve as a check against accidental side-effects.

As you probably know, the earlier in time you catch an error, the easier and cheaper it is to address it. Sure, you can catch unintentional side-effects by declaring quite explicitly who can touch what fields of what data structure instances. But, for the love of all that is holy, why? The syntactic noise would send even Java programmers reeling off to the madhouse, jibbering something about friend classes, friend members, and so forth. It's simply impractical.

The next level at which we can catch side-effect issues is the level at which we combine things. In a normal imperative language, there are myriad different ways of combining things, since not everything is clearly expressed as a function. A functional language, on the other hand, is so expressed, right down to the core primitives of the language, and thus we can impose a modicum of sanity on what plugs into what. This plugging action - the act of calling a function - uses a uniform syntax, which makes it trivial for the compiler to detect it, typecheck it, and, if necessary, complain to the programmer.
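
A small illustration of that plugging action (my own example, not from the text): map expects a pure Int -> Int, so handing it an effectful action is rejected by the compiler before the program ever runs.

  pureStep :: Int -> Int
  pureStep = (+ 1)

  effectfulStep :: Int -> IO Int
  effectfulStep x = do putStrLn "side-effect!"; return (x + 1)

  ok :: [Int]
  ok = map pureStep [1, 2, 3]

  -- bad :: [Int]
  -- bad = map effectfulStep [1, 2, 3]  -- type error: IO Int is not Int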

To do any better than this would necessitate managing side-effects at the editor level. This is the approach advocated by ChuckMoore and in ForthLanguage/ColorForth. History shows that this approach works, but it clearly isn't practical, precisely because humans are fallible. Thus, an IDE would be necessary to perform the appropriate checks on behalf of the coder. But, again, this can only happen with a suitable language with suitably defined semantics. I encourage you to try to invent something yourself. I'd be interested in comparing your results against, e.g., HaskellLanguage's solutions.

I was recently listening to an audio interview with the inventor of Haskell about side-effects. It really struck me as useful and got me interested in researching Monads more. Of course, now I understand Monads and I JustGetIt. It's simple. However, my critical mind started querying whether his side-effect protection in the language is really focusing on the right protection in the first place. At the algorithm level, rather than the storage level? Why?

For the simple reason that catching errors early is cheaper than catching errors late. There are certain invariants that all programs generally considered "working" must uphold. A pure functional language enforces these invariants. Imperative languages allow them either to be broken or to go undetected. If it is fundamentally impossible to express a broken program (and provably so), then entire classes of obscure bugs are eliminated from the resulting binary. The only bugs that remain tend to be logical bugs, such as (most often) misinterpretation of the customer's requirements.

Do the side-effects (the way he discusses them as a problem) matter as much as the inventor of Haskell thinks they do? Is he protecting programmers the right way - by forcing them to code in a functional manner? Why can't we protect against side-effects at the DB level and at the storage level, instead of within each and every algorithm in our programs?

What you're proposing is called an access control list; it's already been done, and yes, it has proven useful. However, it's relatively slow, it's extremely heavyweight compared to a simple load or store instruction on the microprocessor, and, realistically speaking, it has zero bearing on what you're complaining about. Pure functional programs are primarily concerned with what happens on the local machine, and particularly in the program's local address space.

To give you an idea of how secure these systems are, it is possible to take a pure functional program, compile it, and run it alongside other similarly compiled programs in a multitasking, single-address-space operating environment, with no fear of cross-contamination of side-effects. In other words, for the program to be able to run, it must be proven (via the compiler) that it won't harm other running programs. With a language like CeeLanguage or PascalLanguage, you just cannot make that assumption across the board. Even OberonLanguage allows you to "IMPORT SYSTEM;", for example, necessitating that the user exercise caution when introducing new code which imports from SYSTEM. With a pure functional language environment, even that concern goes away.

Could not the side-effects be controlled at the database or file level - so that no matter what programs are writing simultaneous crap a million times a second, the data is still controlled appropriately?

No. The side-effects that pure functional programming is concerned with are things like stray pointers, uninitialized variables, or, worse, variables altered by module A behind module B's back. You're looking at the problem at the macroscopic scale - we already have systems in place for that scale. Look more microscopically. Why, for example, does nearly every language on the planet allow this to happen:

  #include <stdlib.h>   // for malloc

  typedef struct SoftString SoftString;
  struct SoftString {
    int length;
    int bufferSize;
    char *buffer;
  };

  SoftString *allocateSoftString(void) {
    return malloc(sizeof(SoftString));
  }

The result of allocateSoftString is "a soft-string." But is it really? The length is undefined. The buffer is likely pointing to something in Scotland, and the bufferSize probably exceeds the process's maximum memory capacity. That clearly isn't a soft-string to me.

So, you abstract the creation of such an object behind a module interface.

  #include <string.h>   // for strlen

  // make allocateSoftString static, private, or whatever.

  SoftString *newSoftStringFrom_(char *aCString) {
    SoftString *myString = allocateSoftString();
    if (myString) {
      myString->length = myString->bufferSize = strlen(aCString);
      myString->buffer = aCString;
    }
    return myString;
  }

Here we have a function which can never return an uninitialized myString. But, alas, it can return NULL in the event something goes wrong. And NULL is not a SoftString - it's NULL! That, technically, is a type violation. Fortunately, it's usually easily dealt with, but now you have to remember to deal with it each and every time you invoke newSoftStringFrom_(). The software's complexity grows phenomenally.
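
As an aside, before the fully functional version below: one way to make that "remember every time" burden unforgettable is to surface the failure in the type, as in this hedged Haskell sketch (the empty-string failure case is invented for illustration). The compiler then forces every caller to handle the no-string case:

  data SoftString = SoftString { buffer     :: String
                               , bufSize    :: Int
                               , stringSize :: Int }

  -- Nothing plays the role of NULL, but the type makes it impossible
  -- to mistake a missing SoftString for a real one.
  newSoftStringFrom :: String -> Maybe SoftString
  newSoftStringFrom s
    | null s    = Nothing        -- invented failure case, for illustration
    | otherwise = Just (SoftString s (length s) (length s))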

With a pure functional language, none of this code has to be written, because the compiler just knows how to instantiate an object.

  data SoftString = SoftString { buffer     :: BufferAddress,
                                 bufSize    :: Int,
                                 stringSize :: Int }

That's all you need. Now, to create a new SoftString, you just instantiate it via the constructor:

  newSoftStringFrom_ aCString = SoftString aCString l l
    where l = length aCString

No nulls here -- what you see is what you get. Period. If an out of memory condition happens, it throws an exception for you, so you don't have to. And, as with the fully encapsulated module interface, you never, ever, ever get an uninitialized (or worse, partially-initialized) object back.

Think about 305 computers all trying to ram data into the same database. But if this database manages the connections at its little gateway, and if it has table-level or row-level locking, etc. - then what, pray tell, do we need to worry about side-effects at the algorithm level for - like the functional weenies so convincingly try to make us believe?

Something has to write the data to those 305 computers. The reliability of those multitudes of somethings is what functional programming is concerned with. DB-level security is a totally orthogonal issue.

In other words, I think the focus on banning side-effects in functional programming is aimed at the wrong area.

Actually, your core understanding of why it exists in the first place is fundamentally flawed, leading to otherwise intelligently argued, but nonetheless incorrect, conclusions.

Think of each node as a function, then. Each node runs one program, and at the end of the program it returns a result. No matter what side-effects this node has on its local memory and swap space during its computations, the end result of the program will be the same - just a stream of dumb text that is spit out over the network into the distributed file system or distributed database.
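
A hedged Haskell sketch of that picture (the names are invented): the node's whole job is a pure function from input text to output text, and only the driver at the edge performs the one eventual side-effect of shipping the result off.

  -- Pure: same input, same output, no matter what happens locally.
  nodeProgram :: String -> String
  nodeProgram input = unlines (filter (not . null) (lines input))

  -- The eventual side-effect: a stand-in for the network write to storage.
  shipToStorage :: String -> IO ()
  shipToStorage = putStr

  main :: IO ()
  main = shipToStorage (nodeProgram "alpha\n\nbeta\n")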


If the problem is simple enough, such as merging repetitive data from several sources (which is what Google does: parsing web pages and merging it all into one place in a distributed file system), then I think it can much more easily be dissected across several nodes. We can more easily make use of parallelism if the problem is dissectable. This really seems to be the key difference between easy parallelism and hard parallelism. Easy parallelism involves dissecting the problem. Hard parallelism involves something more complex than just throwing a bunch of programs at each node to do slave work (rather, individual computations must be carefully dedicated to different nodes if the problem is more complex).
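
A toy sketch of "dissecting the problem" (no real networking or parallel machinery, just the shape of it, with invented names): split the inputs into per-node chunks, let each "node" be a pure function, and merge the independent results.

  chunksOf :: Int -> [a] -> [[a]]
  chunksOf _ [] = []
  chunksOf n xs = take n xs : chunksOf n (drop n xs)

  -- One node's dumb slave work, as a pure function over its chunk.
  nodeWork :: [String] -> [(String, Int)]
  nodeWork = map (\site -> (site, length site))

  main :: IO ()
  main = do
    let sites  = ["site" ++ show i | i <- [1 .. 12 :: Int]]
        chunks = chunksOf 4 sites           -- 3 "nodes", 4 sites each
        merged = concatMap nodeWork chunks  -- chunks are independent
    mapM_ print merged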

A serious problem in parallelism is obviously network bandwidth. Does the bandwidth absorb more time than the CPU computations? If so, we're in trouble - there is no sense in parallelizing unless we figure the timing out carefully. But of course the solution is just to have each computer do more computation before sending results off over the network, and to use less bandwidth (gzip or 7zip compression, etc.).

One visual thought I had: each node in a grid of computers could be a function. Each node returns a result. Each node (computer) can use whatever local side-effects it needs, but at the end of the program it just returns a result! That's why I don't think microscopic functional programming is necessarily the key to massively parallel solutions - rather, it's functional ideas that are useful, even at a macroscopic level. Consider a human cell, which has all sorts of side-effects inside it. But the cell is just a macroscopic view. A cell is just a node or module, or a program, or a server. Can the program itself be a function (macroscopic), instead of each individual component in the program source being a function? It is disputed above that my thinking is too macroscopic - but I'm not so sure, as I think I can put together a server farm and make use of my macroscopic idea, even if it seems too macroscopic to you folks (the criticism is appreciated, by the way; I don't expect my idea to be accepted without criticism, and actually prefer criticism).

If I can build a dumb node-based/program-based server farm at a macroscopic level - instead of microscopically analyzing each little function inside the program - it means I can care less about the details and work at a higher level (getting work done more quickly). Remember that an important point is that if a solution works, then it works - even if not perfectly at the microscopic level. I still think macroscopic ideas have validity and potential real-world use (consider also TopDown and BottomUp).

All we could choose to care about (ignoring in order to focus) is the end result of the program - what the program returns: the stream of text it spits out. This is disputed above, though, as people on this page seem to be arguing that we already know and have macroscopic solutions. But do we really? We typically use one desktop computer, not 14 or 23. There are many times I've had to parse 34,000 websites, and I hate waiting minutes or hours for the parsing to complete - if I had a server farm designed at a macroscopic level so that 34 computers each parsed 1000 websites, totalling 34,000, this would help me significantly. Google has taken advantage of this idea, I think, by using thousands of commodity computers... without using a functional language (just using functional ideas at a macroscopic level; note the difference).

Imagine - on the cheap - with today's technology - what we can do with the simple macroscopic idea that each node or program just returns some dumb resultset. I think this is indeed what SawzallLanguage/MapReduce has taken into consideration with their cheap server farms built on typical commodity hardware (but since I haven't studied it enough, and since I haven't worked at Google, I must do more research on it).

I may suggest ideas at a macroscopic level - but shouldn't we also consider what we can do today with our cheap commodity hardware? Macroscopic ideas for each node or program are still useful - if you consider each node just a big cell of the entire system.

A benefit of thinking at the macroscopic level is that you do not have to worry about nitpicking at the microscopic level. Think for a minute about what side-effects we should really care about - minor ones, or major ones? Think about human cells - think of each human cell as a horrible Cee program or Pascal program that has all sorts of internal side-effects inside the cell (such as mitochondria burning sugar). But from the outside, each cell is just a dumb single cell. Put together, hundreds of dumb nodes can become very powerful.

Interesting food for thought and an analogy: does each cell in our body get dedicated CPU time from our brain? And is each cell just a little program that produces internal side-effects - while at the outer level the cell doesn't produce so many side-effects; it just ultimately exists as a cell that can take inputs through osmosis, blood, and water (as a carrier of nutrients, i.e. the stream of text), and other means...


I have this BigDumbIdea. I presume it's already been discussed and analysed to death, or else that it's not useful, because it's so dumb it seems trivially obvious, and the core of it seems to exist in paradigms like DataFlow and FunctionalReactiveProgramming... and yet, I don't generally see it talked about in the extremely simple form I think of it.

Here it is: What if all our programs are PurelyFunctional, but they are pure functions over *streams of events* (or messages) rather than literal data?

By which I mean: okay, 'we all know' that programs must produce side-effects if they are to communicate with the real physical world. But what if we change our viewpoint and consider those side-effects themselves to be the 'output' of a PureFunction?

In other words, say a program wants to 'open a file and write a line to it'. That's a mutation of system state, so it's obviously not pure-functional in itself (unless we write it as a Unix-type filter; but say we don't). However, we can describe this as a pure function from:

('start program') --> ('open file frob' 'write line bamf' 'close file frob')

So the program is neither a pure function from raw 'input' to 'output' (text, for instance) nor a procedure with side-effects; it's a pure function from one *event message* sent by the operating system to a sequence of such messages.
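
A minimal sketch of the idea exactly as stated (the types are invented for illustration): the program is a pure function from incoming events to outgoing commands, and a separate interpreter is the only place anything actually happens.

  data Event   = StartProgram
  data Command = OpenFile String | WriteLine String | CloseFile String

  -- Pure: the commands are just data describing the side-effects.
  program :: Event -> [Command]
  program StartProgram =
    [OpenFile "frob", WriteLine "bamf", CloseFile "frob"]

  -- The eventual side-effects live only in the interpreter.
  interpret :: Command -> IO ()
  interpret (OpenFile f)  = putStrLn ("open "  ++ f)
  interpret (WriteLine s) = putStrLn ("write " ++ s)
  interpret (CloseFile f) = putStrLn ("close " ++ f)

  main :: IO ()
  main = mapM_ interpret (program StartProgram)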

To me, it seems as though you are just doing this for the sake of satisfying some FunctionalWeenie. In fact you will have EventualSideEffects anyway, and it serves no purpose other than making your program much more complex, with more layers of useless indirection. Abstraction and indirection are good when useful, but you have to prove why this is even useful. I mean, the way you word it, it's almost as if you are doing this just to satisfy someone, without an actual reason other than to say "this is kind of more functional and therefore better". It's like following a fad just because so-and-so says the fad is better - but without any scientific reason. Why couldn't you just write an OpenFile() procedure, which is in effect just an OpenFile message you are calling? Names of procedures can be considered your messages. Create alternate OpenFile2 and OpenFile3 procedures and pick the best one, testing them out individually as separate messages. A procedure is a separate block of code that you can change and modify without affecting other blocks of code, which is the same as modifying your message-response code. Your message scheme requires more complex code, which means more errors, because it's not as simple as just calling OpenFile(); you now have potential message-parser bugs and code to deal with (for no justified reason).

Your message eventually just calls a procedure anyway (the side-effect). When you parse the string and see what the message is asking to do, you are just calling a procedure in a sillier way. Really, you are making the code more complex for the sake of it, and you aren't solving any problem other than satisfying your favorite pet theory that is somehow better, just because it is. But you have to justify why it is better and how it causes fewer errors. Prove that it does make things better, and why. Prove that it is simpler and more correct than having an OpenFile() procedure, which can be modified separately anyway. This is actually exactly what ticks me off about some functional programmers - they think that somehow they are doing things better just because they are conforming to some idea or paradigm, but they never seem to fully explain, scientifically and with logical reasoning, why their way is better. It really reminds me of those who overuse OOP (or table-oriented programming) in situations just because they can, not because they've actually justified that they need it. Or how about the people who only wear a certain brand of white t-shirts just because the TV advertisement says so (Can't Get No Satisfaction, The Rolling Stones).

It seems a trivial change, but it strikes me as very useful because, as with monads and lazy evaluation, we're not *immediately* executing those mutate-the-system commands. What we're doing is creating a data structure describing those commands, then queuing it up for execution by the system. But the thing is, between queuing and execution we can insert other functions which transform those pending events. It's a level of indirection/reification up from 'side-effects', and that allows the program to introspect on its own actions. It also seems to me to be one step beyond mere laziness, because we're not just creating opaque closures or objects which only the core language kernel can do anything with; we're creating fully open, transparent, first-class data structures which any user-level program can manipulate. A little like Joy quotations, but messages, not functions.
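
Continuing the sketch after the 'open file frob' example above, here is what such an inserted transformation might look like: a pure pass over the pending commands, e.g. cancelling an open immediately followed by a matching close (a toy rule, for illustration).

  data Command = OpenFile String | WriteLine String | CloseFile String

  -- Because pending commands are plain data, any user-level function
  -- can rewrite the queue before the system executes it.
  optimize :: [Command] -> [Command]
  optimize (OpenFile f : CloseFile g : rest)
    | f == g    = optimize rest
  optimize (c : rest) = c : optimize rest
  optimize []         = []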

This seems important to me, but I can't find much in the literature describing this technique. FRP comes close, but adds more complexity (a notion of real-valued 'time' which seems a bit overkill to me - I'm not saying time isn't useful, but I would have thought it could be implemented as a second-order function given a discrete 'clock' signal). Is this the same idea as Monads? It seems like it might be, but the literature on Monads seems bound up in opaque strict type formalisms, while a 'list of events' is just a simple list, able to be ripped apart and recreated later down the line; type doesn't come into it at all.

This idea excites me as a way of closing the PureFunctional / SideEffect loop, but it seems so obvious that there must be huge problems with it somewhere. I just can't see where.

Anyone know what this technique is called and how I can learn more about it? -- NateCull

Look up ComplexEventProcessing (CEP) and Event Stream Processing (ESP, http://en.wikipedia.org/wiki/Event_Stream_Processing). Your idea is essentially to "confine" SideEffects to be the intelligent production of new events in response to observed events, sometimes involving observations across more than one event, or more than one event channel. The idea is to focus on the "plumbing" rather than on the final consumption or initial production of events, which distinguishes it from EventDrivenArchitecture/EventDrivenProgramming. (Related: AlanKayOnMessaging; focus on the 'ma' or interstitial architecture.)
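
For flavor, a tiny hedged sketch of event-stream processing in that sense: a pure pass over an event stream that emits a derived event when a pattern is observed (three consecutive failures, an invented rule).

  data Event = Ok | Fail deriving Eq

  alerts :: [Event] -> [String]
  alerts = go (0 :: Int)
    where
      go _ [] = []
      go n (e : es)
        | e == Fail && n == 2 = "ALERT: 3 consecutive failures" : go 0 es
        | e == Fail           = go (n + 1) es
        | otherwise           = go 0 es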


See also SawzallLanguage, OnMonads

