Programming Without Ram Disk Dichotomy

To ponder how programming would change if the distinction between RAM and disk did not exist, or was made transparent by internal caching techniques.

For reference: ErosOs does just this.

Does OO Want BubbleMemory?

I have noticed what appears to be a pattern in OO proponents' view of "persistence". They don't really want a database, as it violates pure encapsulation. They just want instantiated objects to stay instantiated forever, until explicitly deleted, collected, or archived. They really want BubbleMemory: the technology that once promised to make RAM state stay even if the power is turned off. Or, at least something like it.

In other paradigms, in which data and behavior are often viewed as more or less separate things, the volatility of RAM is less of a problem.

Tools like a PrevalenceLayer mostly seem to be an attempt to mimic BubbleMemory. Operating system features such as OrthogonalPersistence/TransparentPersistence would be useful to achieve the same results.

Without a MMU that can provide memory protection on an object basis, BubbleMemory scared me.

For the sake of argument, assume they "fix" that, using double-commit write-backs or something. (AcId)

Don't all programmers want BubbleMemory? I've wanted it since my BASIC days. I don't think this is a peculiar trait of "OO fans".

I think it is a bigger issue with OO fans. The RAM-Disk dichotomy seems more of an issue in OO. Perhaps an OO language/engine can be built to hide the difference more.

What does OO have to do with it? The distinction between RAM and disk is important, no matter what language or technique I use. I haven't wanted BubbleMemory more since I learned OO languages.

For example, when you do an Update command to a RDBMS, you don't care whether it is using RAM caching or disk to store it. You are assured that its "state" will be there the next time you look (assuming no power failure before next internal cache write). But traditionally a SetX OOP operation does not provide this guarantee. If it did, things like a PrevalenceLayer or OODBMS would probably be in less demand. (Note that there is much more to a DataBase that just persistence though.)

RDBMS is not a programming language. If you're writing an RDBMS, you do care whether you are using RAM or disk. It doesn't matter if you write the RDBMS in an OO language or not. You still care about what is in memory and what is not.

I agree that "base tools" will need to care, as base tools are closer to the equipment in order to get performance.

What is a "base tool"? Do you mean general purpose programming languages?

No, I mean development and software solution tools like language interpreters, databases, components, GUI engines, report writers, etc.

So why is this page about "OO fans"?

Because some people seem incapable of making a statement without disparaging some group that exists only in their imaginations. I think it may be a defence mechanism for those situations when they feel their argument isn't strong enough to stand on it's own.

{I honestly did not mean it to be disparaging. Alternative title suggestions are welcomed. If I *had* meant to be disparaging, I would have called this OoWeeniesReallyWant? BubbleMemory.}

Methinks thou dost protest too much. The original article had nothing disparaging in it. All it says is that the ideal persistence solution for OO resembles BubbleMemory. If anything, drawing an analogy like this might give someone a great idea for how to implement it. For example, I immediately thought of a memory-mapped file as a simple (although naive) construct that simulates this: let your language's memory manager party on a memory-mapped file instead of an in-core arena, combine with some safety and global persistence, and you pretty much have the same effect as BubbleMemory.

But the implication is that "OO fans" want BubbleMemory more than programmers using other languages. I don't see any argument here that explains that position. Is there something about OO programming languages that makes BubbleMemory more attractive than other programming languages? If so, what is it?

I think it's the idea that to an extent, in many OO environments, effects occur implicitly. Examples are garbage collection, selection of conditions via polymorphism rather than conditionals, etc. In the case of BubbleMemory, you didn't have to think about explicitly persisting the contents; it just worked. Similarly, well-written serialization of objects just works, in the same way BubbleMemory did. OO fans want implicit behavior like BubbleMemory provided; OO phobes want to see every line of procedural code where something happens. There... I've made sweeping generalizations about both camps now.

I still don't understand. When I wrote procedural code, I still wanted state to persist. Loading and storing state is no more or less difficult in OO. The desire to persist state is no more or less strong in OO.

But in OO you generally don't want to save attribute-by-attribute, but instead either tell the entire object to save itself, or automatically remember itself without a save/load step. In the non-OO world, you sort of act as if the objects are the database entities or records and talk to them, but do not implement them in application code. In the OO world, you more or less write and manage your own little database in the application (for good or bad).

No I don't. Non-OO code had structs and records. Most non-OO code doesn't use a database. There's nothing about object oriented languages that makes people more prone to writing little databases in their applications.

Well, most of the non-OO code I see does use a database. Look at it another way. If BubbleMemory worked well, I think non-OO designs would still use databases and database calls because of the other perceived benefits of databases beyond persistence. However, I think OO designs would change by not having formal persistence steps: The object exists until you tell it not to exist (or perhaps declare it temporary). If you disagree, I am curious about how and if you would manage persistence.

Most of the projects I've worked on didn't use a database. Of the projects that did use a database, most of them used object oriented languages and techniques. I don't understand - why would procedural code use a database when OO wouldn't? The benefits of databases apply just as much to OO code as procedural code.

Well, I think they are somewhat orthogonal, but that is another existing topic.

It's precisely because they're orthogonal that the benefits of databases apply equally to OO and procedural code.

Procedural designs can take advantage of BubbleMemory, just like OO designs. I always wanted structures/records that would exist until I told them not to exist.

Isn't that what databases are for? If I wanted that trait, I would use a database (including NimbleDatabases).

Databases are a compromise. We use them because RAM isn't infinitely large and doesn't maintain state forever.

How are they a compromise? I would still use them if the distinction between RAM and disk disappeared.

That was one of the first things I wanted when I learned to program. I want infinite RAM that lasts forever. Everything else is a compromise. The money is in learning when to apply which compromise.

If RAM were infinite and non-volatile, it would grow into a mess over time and techniques would have to be put into place to keep it clean. I believe those techniques would closely resemble DBMS technology and features.

How about this question: How many OO proponents would still use an OODBMS or a PrevalenceLayer if BubbleMemory worked well?

I don't think any would use prevalence layers. That's a compromise we could forget if all RAM was static. OODBMSs (and other DBMSs) would still be used because RAM isn't infinite and we need good ways to search it.

Suppose the underlying OS or hardware blurred the distinction between disk and RAM (or BubbleMemory was as cheap as disk). Would you still use an OODBMS? And, could you elaborate on "good ways to search it", please?

If BubbleMemory was as cheap as disk I doubt anyone would use an OODBMS. I don't use them today. By "good ways to search it" I mean good ways to search RAM, such as relational queries.

Relational queries? In my opinion, OO and relational do not get along well because TablesAndObjectsAreTooDifferent.

So you think if data were stored in RAM instead of on disk people who currently use OO programming languages and relational queries would stop using relational queries? I don't. The choice to use relational tables and queries doesn't depend on which programming language I use or where the data is stored. It depends on the need for fast ad hoc queries.

So you are not of the opinion that OODBMS or querying of a dynamic OO environment can provide ad-hoc queries?

I didn't say that. If a piece of software needs fast ad hoc queries of large amounts of data, relational tables are often the best solution. The programming language used and location of the data don't change that fact. Some OODBMSs provide fast ad hoc queries because they are built on relational tables or maintain relation table indexes.

Then why not skip the middle man and talk more directly to the RDBMS? It seems that you like some features of a relational view of info for ad hoc querying, but for others you like an OO view. The open question is if this would change if the RAM/disk dichotomy were blurred. Some say that RDBMS tables are not app-specific enough. However, others counter that "nimble tables" (local or app-specific) are possible and practical if your tools support them. Is it that tables traditionally lack methods? If methods were added to tables, would this change things? And further, if ad-queries were not an issue, would you then prefer an OODBMS? Or would "perm RAM" be sufficient?

That's misunderstanding the issue. Binding a new value to instance variable x of some object is likely only one step in a larger (transactional) response by a system to a stimulus. I want to be assured that the in-memory changes I've made in that response will be durable when I commit the transaction, not when I re-bind variables. In that regard it doesn't matter whether I'm binding an instance variable of an object, or binding a global variable in a FORTRAN common block. -- RandyStafford

Could you please clarify this? I am not following. Thanks.

Sure, I'll try. The author in italics opined that "the RAM-Disk dichotomy seems more of an issue in OO", intimating that OO aficianados want durability semantics (as traditionally defined) to be associated with typical SetterMethod?s (on DomainObjects, I assume). I am disagreeing on grounds that transactions have larger scope than individual DomainObject SetterMethod? invocations - in fact, transactions have larger scope than individual DomainObjects (in my experience, they tend to involve graphs or sets of DomainObjects). We want ACID semantics at transaction scope, not finer. I don't see how the programming paradigm used to encode the logic of the transaction magnifies or reduces the differences between volatile and non-volatile storage. Did that make it worse, or better? :) -- RS

You aren't assured that the state will be there the next time you look. Something, like the next statement or another thread/process/whatever can modify it before you look. Can we agree that the RDBMS user is assured that the state will persist until modified?

All the OOP languages I've used keep that promise (unless you ignore their threading and/or caching demands, like Java's need for the "volatile" keyword.) When I call object.setX() I'm assured that the state will persist until modified. But since memory doesn't last forever, neither does that object. In Java, when a thread dies it throws away all the references it held. When there are no more threads, either through peaceful exit or unforeseen termination, there are no more objects. The next time the program runs it has to configure the state of every object it uses. That's the compromise we have to make. If RAM lasted forever we would only have to do that once. The same is true (minus the setX() call) for structures or records in procedural languages. We are assured that after struct.member is set that the state will persist until modified. We just have to set the data in the structs every time the program runs. This is true of all fields, no matter if they are primitive data types, complex structures, unions, pointers or objects. This assurance is language neutral.

I don't see a "RAM-Disk dichotomy". I see an axis of memory access speed (the access axis!), with RAM near the fast end and disk a little slower. The slow end point is off line storage. If the fastest of all possible worlds, everything would be in registers all the time. Disk is faster than a manually loaded tape. RAM is faster than disk. The axis is populated between those points with various virtual memory and disk caching schemes. RAM and disk aren't contradictory, which is part of the definition of "dichotomy" I use. They are just points popular hardware occupy on an axis, and not even the end points.

I would argue that different things are not simply different points along some continuum. Things like processor caches and virtual memory operate virtually identically to RAM; it is rare that a programmer has to write explicit code to move data between them. Similarly, the differences between disk caches and physical disk I/O are mostly invisible. However, moving data between volatile and non-volatile storages is a significant operation in most OS's/architectures - one is not simply faster than the other.

Do you think this is inevitable, or a historical habit? For performance reasons, do we need to have systems that make an explicit distinction between volatile and non-volatile?

Note: We should rename this page. I had to repeatedly look through the backlinks of my homepage to find it because the only significant part of the name is "dichotomy" and I don't seem to be able to remember that. But alas I have no better name proposal.

Discussion factored out to TransparencyAndUniformity.

EditHint: I wish to protest the move. TransparencyAndUniformity is not a sufficiently specific name in my opinion. It begs the question: "transparency and uniformity of what?". Some appear to not like the name of this topic, but at least it is descriptive as-is, in spite of any memory difficulties it may induce in some people. Let's think about this.

I remember working on a personal computer made by DEC in 1982-3. It was called, if I recall, the Professional 350. It had a BASIC interpreter, with persistent arrays - these were written to floppy when the programme ended. -- NissimHadar

Sorry. I didn't intend it as a renaming. I had two objectives: a) renaming and b) extracting the TransparencyAndUniformity part. I just moved the discussion that centered about that - indeed more abstract - sub topic. -- .gz

How about "TransparencyAndUniformityOfState?" instead of TransparencyAndUniformity? I find the latter description too broad. Other suggestions?

CategoryPersistence