Persistent Language

In the 1980s some research was done on persistent languages. It was a good idea, and I am still not sure why it didn't catch on.

More recently, PersistenceEngines (e.g. ThePrevayler for Java) have taken a similar approach.

A great divide in computing that remains to this day is between programming languages and databases. Persistent languages eliminate the divide by putting the database inside the language.

Database systems tried to eliminate it by putting languages inside the database, e.g. StoredProcedures.

Personally I believe in this second approach and I plan to realise it in my Mneson system.

--MariusAmadoAlves

What exactly is a "persistent language"? How does it differ from CollectionOrientedProgramming and ProgrammingWithoutRamDiskDichotomy?--AnonymousCoward

And PrevalenceLayer? No difference in essence, I think. Persistent languages were a purer approach. They tried to extend programming languages minimally to achieve the essence. Prevalence layers are more complicated.--MariusAmadoAlves


RE: What exactly is a "persistent language"?

"Persistent language" implies TransparentPersistence for any state within an execution of the language (objects and data are recoverable), but also implies more than that, because prior to disruption any program is more than a static collection of objects and data. For example, active threads or continuations must also be restored and scheduled so that processing can pick up where it left off.
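As a rough illustration of "picking up where it left off", here is a minimal Python sketch that does by hand what a persistent language runtime would do implicitly: it saves the implicit state of a loop (position and accumulator) after every step and restores it on the next run. The names `resumable_sum`, `Disruption`, and the checkpoint file are hypothetical, invented for this example; nothing here describes how any particular persistent language is implemented.

```python
import json
import os
import tempfile


class Disruption(Exception):
    """Stands in for power loss or a crash (hypothetical, demo only)."""


def resumable_sum(items, checkpoint_path, fail_at=None):
    """Sum items, saving progress after each step so a rerun can resume."""
    # Restore the loop's implicit state (position and accumulator), if any.
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as f:
            state = json.load(f)
    else:
        state = {"index": 0, "total": 0}

    while state["index"] < len(items):
        if fail_at is not None and state["index"] == fail_at:
            raise Disruption("simulated power loss")
        state["total"] += items[state["index"]]
        state["index"] += 1
        # Write to a temp file and rename it over the old checkpoint, so a
        # crash mid-write cannot leave a half-written checkpoint behind.
        tmp = checkpoint_path + ".tmp"
        with open(tmp, "w") as f:
            json.dump(state, f)
        os.replace(tmp, checkpoint_path)
    return state["total"]


path = os.path.join(tempfile.mkdtemp(), "sum.ckpt")
try:
    resumable_sum([1, 2, 3, 4], path, fail_at=2)   # disrupted after two items
except Disruption:
    pass
result = resumable_sum([1, 2, 3, 4], path)         # resumes at the third item
```

The point of the sketch is the contrast: here the programmer had to decide what the "continuation" consists of and save it explicitly, whereas a persistent language would be expected to capture it without such effort.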

"Persistent language" could reasonably apply to any programming language where:

"Disruption" would typically include loss of power or loss of communications. For a distributed programming language, one might extend that to temporary loss of nodes (CPU cores and data segments). Disruption might be characterized as predicted vs. unpredicted; ideally, a persistent language could handle arbitrary unpredicted disruption, yet also take advantage of 'predicted' disruptions (such as voluntary shutdown of a laptop, or a notice that power is low) when said predictions are available in order to improve performance.

Realistically, a persistent language can only handle certain unpredicted causes for disruption, and with some limits to integrity that may require a trade-off between performance and robustness. Thus, "persistent language" is by no means a binary feature of the language.
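The predicted/unpredicted distinction and the performance/robustness trade-off can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `CheckpointedCounter`, its `every` parameter, and `on_predicted_disruption` are hypothetical names, and a real runtime would apply the same idea to all program state rather than to one counter.

```python
import json
import os
import tempfile


class CheckpointedCounter:
    """Counter whose value survives restarts via periodic checkpoints."""

    def __init__(self, path, every=100):
        self.path = path
        self.every = every   # larger = faster, but more lost on a crash
        self.value = 0
        self.dirty = 0       # updates not yet written to disk
        if os.path.exists(path):
            with open(path) as f:
                self.value = json.load(f)["value"]

    def increment(self):
        self.value += 1
        self.dirty += 1
        if self.dirty >= self.every:
            self.checkpoint()

    def checkpoint(self):
        tmp = self.path + ".tmp"
        with open(tmp, "w") as f:
            json.dump({"value": self.value}, f)
            f.flush()
            os.fsync(f.fileno())   # ask the OS to push the data to disk
        os.replace(tmp, self.path)
        self.dirty = 0

    def on_predicted_disruption(self):
        """Call on a shutdown or low-power notice: flush everything now."""
        if self.dirty:
            self.checkpoint()


path = os.path.join(tempfile.mkdtemp(), "counter.json")
counter = CheckpointedCounter(path, every=3)
for _ in range(5):
    counter.increment()

# Unpredicted disruption here: a restart would see only the last checkpoint.
after_crash = CheckpointedCounter(path, every=3).value

# Predicted disruption: the notice lets us flush first, so nothing is lost.
counter.on_predicted_disruption()
after_shutdown = CheckpointedCounter(path, every=3).value
```

With `every=3` and five increments, an unpredicted stop loses the last two updates, while the predicted path loses none. Raising `every` buys speed at the cost of a larger loss window, which is the sense in which persistence is not a binary feature.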

Also, persistent does not imply 'disruption tolerant'. That is, there is no implied GracefulDegradation. The program is free to stop entirely while disrupted, then continue again when the cause of disruption is alleviated.


Bonus features: not implied by PersistentLanguages, but enhancing them


RE: How does it differ from CollectionOrientedProgramming and ProgrammingWithoutRamDiskDichotomy?

PersistentLanguage in no way implies support for CollectionOrientedProgramming, and CollectionOrientedProgramming in no way implies PersistentLanguage. The concepts are orthogonal. They complement one another very well for database applications, though.

PersistentLanguage also does not imply ProgrammingWithoutRamDiskDichotomy. That is, the "persistence" state might still be distinct from the "active" state (or RAM state) of the objects in question. Programmers who add extensions or plugins to the runtime itself might be quite aware of a RAM-disk dichotomy, needing to 'regenerate' certain RAM-only resources for communications, open files, and so on. The modular use of plugins and AbstractFactory would tend to imply a dichotomy between what-is-stored and how-it's-interpreted, which naturally becomes a RAM-disk dichotomy.
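A small Python sketch of that dichotomy, as seen by someone writing a runtime extension: the stored form carries only the durable fields, and the RAM-only resource is regenerated when the object is brought back. The class `Channel` and its methods `to_stored` and `from_stored` are hypothetical names chosen for illustration, and a lock stands in for things like sockets or open files.

```python
import json
import threading


class Channel:
    """Object with durable data plus a RAM-only resource (a lock)."""

    def __init__(self, name, pending=None):
        self.name = name                      # durable: stored
        self.pending = list(pending or [])    # durable: stored
        self._lock = threading.Lock()         # RAM-only: never stored

    def send(self, msg):
        with self._lock:
            self.pending.append(msg)

    def to_stored(self):
        # What-is-stored: the durable fields only.
        return json.dumps({"name": self.name, "pending": self.pending})

    @classmethod
    def from_stored(cls, text):
        # How-it's-interpreted: __init__ regenerates the RAM-only lock.
        record = json.loads(text)
        return cls(record["name"], record["pending"])


ch = Channel("audio")
ch.send("hello")
stored = ch.to_stored()                  # only durable fields reach storage
restored = Channel.from_stored(stored)   # simulates reloading after a restart
restored.send("again")                   # works: the lock was regenerated
```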

But PersistentLanguage does imply TransparentPersistence; that is, a programmer within the language needn't take special efforts to bring objects into 'active' state or to push them back to inactive states on disk. That can all occur behind the scenes by the PersistenceEngine of the language runtime.
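To make "needn't take special efforts" concrete, here is a toy Python approximation in which ordinary attribute assignment is written through to disk, with no explicit save or load calls in the using code. It is a sketch under heavy assumptions: `PersistentRecord` is a hypothetical class, it handles only JSON-serializable values assigned at the top level (mutating a stored list in place would not be noticed), and a real PersistenceEngine would cover all program state rather than one object.

```python
import json
import os
import tempfile


class PersistentRecord:
    """Attributes assigned on this object are written through to a file."""

    def __init__(self, path):
        # Bypass our own __setattr__ for the bookkeeping attributes.
        object.__setattr__(self, "_path", path)
        data = {}
        if os.path.exists(path):
            with open(path) as f:
                data = json.load(f)
        object.__setattr__(self, "_data", data)

    def __getattr__(self, name):
        # Reached only when normal lookup fails, i.e. for stored fields.
        if name.startswith("_"):
            raise AttributeError(name)
        try:
            return self._data[name]
        except KeyError:
            raise AttributeError(name) from None

    def __setattr__(self, name, value):
        self._data[name] = value
        with open(self._path, "w") as f:
            json.dump(self._data, f)


path = os.path.join(tempfile.mkdtemp(), "record.json")
record = PersistentRecord(path)
record.visits = 1
record.visits += 1                    # plain assignment, no save() call
reloaded = PersistentRecord(path)     # simulates a restart
```

After the simulated restart, `reloaded.visits` is 2, and the code that updated it never mentioned the disk.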


But when you have that ability, then you have to start worrying about issues that database handlers have to worry about, such as where the one-and-true "official" copy of a given fact is, what happens when you "share" and/or change it simultaneously, what if other apps/systems want access to that same info, etc. The more valuable a given fact is, the more it needs to be accessed by different concerns. Volatile (traditional) variables allow one to sweep most of those concerns under the rug. You don't get power without corresponding responsibility (unless you like chaos). I invoke GreencoddsTenthRuleOfProgramming. -t

RE: "that" ability - which one? Do you refer to TransparentPersistence, or the possibility that the language-runtime distinguishes between RAM and disk?

RE: you have to start worrying... - Of whom does the word "you" speak? Do you refer to the programmers implementing/extending the language, or to the programmers using it? I would imagine the concerns you named are implementation issues for the language.

RE: You don't get power without corresponding responsibility - what a ridiculous notion, TopMind. It's easy to obtain or grant power without responsibility. Give a gun to a child. Use a persistent language that someone else implements and maintains. The effort to produce the device that contains power is generally much higher than the effort to transfer that device to someone. If you mean to say that introducing persistence, concurrency, etc. will raise the bar for implementation, or introduce language-design concerns that a 'weaker' language might not possess, that's a reasonable argument. But that burden doesn't fall nearly so heavily on the programmer using the language.

RE: "share" and/or change [a given fact] simultaneously - You assume both unrestricted concurrency and shared mutable state. Persistent languages don't need to include either of those features, much less both of them at the same time. Restrict either feature in various ways and you can readily avoid the race-conditions and synchronization-challenges of determining what the 'official' value for state might be at any given moment. Even without shared mutable state, "persistent language" is still applicable, as continuations and activities have implicit state that must be recovered after disruption.

If your language does need both unrestricted concurrency and shared mutable state, then there are techniques for taming it. For example, transactions, which are useful for persistence (as they allow partial volatility in the implementation), may readily be extended to support isolation.
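For readers who want to see what "transactions" buys here, a minimal Python sketch using the standard `sqlite3` module follows. It shows only the all-or-nothing part: intermediate state inside the transaction is volatile and disappears on failure. Isolation between concurrent connections is not demonstrated. The `account` table and `transfer` function are hypothetical, and an in-memory database is used for brevity; a file path would be needed for the data to actually persist.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER NOT NULL)"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [("a", 100), ("b", 0)])
conn.commit()


def balance_of(conn, name):
    row = conn.execute(
        "SELECT balance FROM account WHERE name = ?", (name,)
    ).fetchone()
    return row[0]


def transfer(conn, src, dst, amount):
    """Move amount between accounts: both updates happen or neither does."""
    # 'with conn' commits on success and rolls back if an exception escapes.
    with conn:
        conn.execute(
            "UPDATE account SET balance = balance - ? WHERE name = ?",
            (amount, src),
        )
        if balance_of(conn, src) < 0:
            raise ValueError("insufficient funds")
        conn.execute(
            "UPDATE account SET balance = balance + ? WHERE name = ?",
            (amount, dst),
        )


transfer(conn, "a", "b", 30)          # succeeds: a=70, b=30
try:
    transfer(conn, "a", "b", 500)     # fails midway and is rolled back
except ValueError:
    pass
```

After the failed transfer the balances are unchanged, even though the first UPDATE had already run; that discarded intermediate state is the "partial volatility" a transaction permits.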

RE: what if other apps/systems want access to that same info - If what you have is a "persistent program" written in a persistent language, these other apps/systems would certainly need to talk to the persistent program through whichever open communication channels it offers. And that is exactly how it should be.

As to your "GreencoddsTenthRuleOfProgramming" - it is not unreasonable to suggest that a "persistent language" is slightly more database-like than a volatile one, and that a "concurrent, persistent language" is even more so, and that a "concurrent, persistent language with shared mutable state" is just a few steps away from being a (not necessarily relational) database. I'm not sure what you mean by 'invoking' it, but to suggest GreencoddsTenthRuleOfProgramming applies (in some capacity) to the circumstances of this PersistentLanguage page is reasonable.


RE: "Persistent languages eliminate the divide by putting the database inside the language."

This isn't true. A PersistenceLayer is not a database, and not all databases are persistent (http://en.wikipedia.org/wiki/In-memory_database). PersistentLanguages are not necessarily suitable, without some API, for storing or querying data. A persistent ObjectCapabilityLanguage would almost certainly forbid ad-hoc queries unless they occurred across a special capability provided by the runtime (i.e. for debugging).

That said, it would certainly be easier to implement a persistent database (with its own data-manipulation language and such) within a persistent language. This is especially true if the language also has good support for distribution, transactions, and disruption tolerance - features that would achieve some of the more complicated database aspects of load-balancing, mirroring, and concurrency.

The performance achievable within the database would be limited by the language runtime if achieved this way, but the theoretical performance limit is better because one can eliminate much serialization and translation overhead. The better theoretical limit for performance only extends to use of the database within the persistent language, of course, but that may be sufficient for most applications, especially given the ability to push more of the program to the data.


RE: transactions, which are useful for persistence (as they allow partial volatility in the implementation), may readily be extended to support isolation

Again, adding transactions is a case of GreencoddsTenthRuleOfProgramming.

Indeed, it is. That wasn't a useful statement to make, though, since it repeats a point I already made.

I'd rather let languages focus on being good languages and databases focus on being good databases rather than a language try to be a half-ass database, risking scaling and sharing problems.

Now you contradict yourself. GreencoddsTenthRuleOfProgramming essentially suggests that being a good language requires being (with respect to some concerns) a good database. That said, it doesn't mean the language IS a database. Sharing concerns like robustness/reliability/concurrency/performance/scalability/etc. doesn't make a system a database. It's the data-manipulation language that makes a database. It'd be perfectly reasonable to write a database in a persistent language, leveraging the free support for persistence/transactions/garbage-collection/etc. and giving you time to focus on a data-manipulation language, dataflow to track how views change over time, support for querying streaming data in real-time, and other neat stuff.

I'm not sure how you conclude that about Greencodds.

I conclude it by use of logic, TopMind. But perhaps your axioms are different.

Do you believe good languages should force programmers to reinvent things the hard way and have features that are hacked on and require lots of work to use? No? Well, then you agree with the above axioms.

GreencoddsTenthRuleOfProgramming states: Every sufficiently complex application/language/tool will either have to use a database or reinvent one the hard way.

Note 1: It can further be noted from your comments on this page (that 'Greencodds applies') that by "database" you really mean database-like features such as persistence or transactions, as opposed to any strict or formal concept like DatabaseIsRepresenterOfFacts. Since you coined 'Greencodds', this can be taken as canon. (In my eyes, persistence, concurrency, and transactions are completely orthogonal to databases, but are related to 'management systems' - the 'MS' in 'DBMS'. But since you coined GreencoddsTenthRuleOfProgramming, I'll let your opinion rule how Greencodds is interpreted.)

(Conclusion 1) Therefore, a good language will include database-like features (from Greencodds, Note 1, and Axiom A).

(Conclusion 2) Therefore, a good language will have tightly integrated, symmetric, and first-class database-like features (from Conclusion 1 and Axiom B).

Thus, if you agree with Axiom A and B and Greencodds, you are logically inconsistent if you further say that a good language shouldn't concern itself with being a good database. After all, your GreencoddsTenthRuleOfProgramming says (under the above logic) that being a good language and a good database are not separable concerns.

[... risking scaling and sharing problems.] (Or perhaps have a partial implementation of a database and/or query system using standards so that we can use mostly the same code when we do need to migrate to a full-on DB.)

One doesn't need ad-hoc queries and data-manipulation to implement a programming language. When debugging or doing LiveProgramming, I can see weaving database operations into code to help 'view' or 'tweak' its activities, but other than that it seems to be a violation of PrincipleOfLeastPower, an enormous back-orifice for security violations, and a rather inefficient AbstractionInversion.

Perhaps I should have said lite-duty database-like API's for things that are in-between arrays/lists and dedicated databases. At times it's nice to have transient work-tables whose scope is tied to the programming language's units, yet use the same query language(s) as dedicated DBs so that we can upgrade when needed with little or no rewrite.
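One way to picture such a transient work-table, sketched in Python with the standard `sqlite3` module: the table exists only for the duration of one function call, yet it is queried with ordinary SQL. The function `top_spenders` and the `orders` table are hypothetical examples, and SQLite merely stands in for "the same query language as a dedicated DB".

```python
import sqlite3
from contextlib import closing


def top_spenders(orders, limit=2):
    """Rank customers by total spend using a throwaway in-memory table."""
    # The database lives only for the duration of this call.
    with closing(sqlite3.connect(":memory:")) as db:
        db.execute("CREATE TABLE orders (customer TEXT, amount INTEGER)")
        db.executemany("INSERT INTO orders VALUES (?, ?)", orders)
        rows = db.execute(
            "SELECT customer, SUM(amount) AS total FROM orders "
            "GROUP BY customer ORDER BY total DESC, customer LIMIT ?",
            (limit,),
        ).fetchall()
    return rows


ranking = top_spenders([("ann", 5), ("bob", 7), ("ann", 4), ("cy", 1)])
```

The same SELECT could later be pointed at a server-side table with little change, which is the upgrade path being argued for.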

Sure. Good support for relations and queries - as found in LogicProgramming - is high on my list of desirable features. And support for FirstClass relations or databases, such that they can be composed logically and maintained separately (as per 'rulebooks' in the InformLanguage) is also nice.

As far as using "the same query languages as dedicated DB's", I think that's terribly short-sighted and stupid. The query language should be well integrated - types and all - with the rest of the language.

But all that has nothing to do with PersistentLanguages, or even with persistence. The features of 'persistence' and 'tables/relations/db-structure-of-choice' are totally orthogonal.

Any useful data tends to be shared by multiple apps and outlives specific app languages. If the relationship is very tight and will stay very tight with a given app and/or app language, then tight integration makes sense. The problem is that the "natural" partitioning lines of "app" and "data" tend to be different or drift apart over time. It's somewhat analogous to political boundaries versus postal boundaries in the geographical sense. Perhaps they started out one and the same, or similar, but have drifted apart over time. Based on past observation, app languages tend to fall in and out of favor. But the data is often necessary for the organization, and must remain. And different tools, such as report writers, also are used with existing data. Things like CSV, ODBC, and SQL were designed to facilitate relatively easy sharing and standardization of data access and extraction among systems and tools. (Perhaps merge/link into SharingDataIsImportant. I forgot it existed until after.) -t

Your complaints are not specific to PersistentLanguage, and thus seem inconsistent. Why must using a language with persistence features for storage hurt sharing of stored data more than does Oracle implementing its own private language and API for accessing a flat-file or HDD directly? You complain about the lifetimes for languages, but why aren't you complaining equally about the possibility of the language in which Oracle DBMS was written falling out of favor? You shouldn't be pointing fingers at PersistentLanguage for problems that are independent of the persistence feature.

Generally when one needs "persistence", they will also likely eventually need "Typical Services Provided by Database Management Systems" as found in DatabaseDefinition. Needing *only* persistence is rather unlikely for anything beyond say app-specific stuff that one might find in an appFoo.INI file.

A PersistentLanguage doesn't need to be *only* persistent (what would that mean, anyway?). And those other features common to enterprise DBMS systems (good support for concurrency, support for automatic distribution and load-balancing, support for automated failover redundancy and self-healing, and a built-in security model) aren't 'DBMS' features. They are features any sufficiently complex system will desire, even if it isn't a DBMS, and thus could be considered KeyLanguageFeatures for a GeneralPurposeProgrammingLanguage. If I were going to implement any shared service - be it a DBMS, a WebServer, a WikiWiki, a FileSystem, an IntegratedDevelopmentEnvironment, an OperatingSystem, etc. - I certainly would prefer to implement it in a language with all these useful features.

That said, even persistence by itself (minus those other CrossCuttingConcerns) is powerful. Persistence is the entire basis of hibernate modes for computers (especially laptops), after all. It would be perfectly reasonable to want even higher quality persistence than is typically achieved by hibernate - i.e. automatically re-establishing network connections (e.g. audio and video streaming, restoring downloads and uploads, ComplexEventProcessing, MultiCaster, PipesAndFilters) along with GUIs (dialogs, windows, etc.). Even better: good support for persistence allows one to never 'close' any applications... that is, all applications are open all the time, but they're simply not in main memory all the time. (Virtual memory is only half a solution, lacking network-layer persistence.) This allows considerably more scalability at the OS layer; persistence with publish/subscribe can support millions of active applications and services instead of a few hundred max, and also support much faster startup times.

Configuration files are not used for just persistence (they also serve as a sort of PolicyInjection/DependencyInjection and support AbstractFactory at the OperatingSystem layer) but could be replaced given better support for sharing persistent resources among different services.

And using Oracle-specific features such as PL/SQL is indeed a product-lockin risk that one has to be careful about. But at least the data itself is still sharable through semi-standard SQL.

You totally missed my point. Oracle DBMS is a service that uses non-standard storage (flat-files and block storage were mentioned explicitly) yet provides a standard interface/API. Services in any PersistentLanguage can do exactly the same. Your complaining that a PersistentLanguage doesn't directly support sharing in a standard way is very much analogous to complaining that Oracle doesn't directly support sharing of the flat-files in which it stores data. That's a silly and invalid complaint, and completely separate from the issue that Oracle's SQL API doesn't strictly adhere to every standard.

Further, Oracle scales better than most app languages, and thus one can be fairly confident that they won't run into an unfixable scalability wall. They just have to open their wallet wider than before. -t

Call me pessimistic, but rather than seeing this as a strength of Oracle, I see it as a weakness of most app languages.

I'd rather see an app language that is designed to work well with existing RDBMS rather than replace them. It could also provide its own local SQL engine for local processing (or perhaps borrow say SqLite's engine). That way if you need to migrate or scale up that portion to a server-based RDB, one mostly just changes the data source configuration and makes minor SQL tweaks rather than re-coding everything. (SQL indeed has some annoyances, but being a de-facto standard still overpowers those warts.) Related: EmbraceSql. -t
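A minimal Python sketch of the "change the data source configuration" idea, assuming a local SQLite engine during development. The function names, the `ticket` table, and the configuration keys (`driver`, `database`) are hypothetical; only the `sqlite3` calls are real. Application code receives a DB-API connection and issues plain SQL, so it does not care where the data lives.

```python
import sqlite3


def open_data_source(config):
    """Return a DB-API connection chosen by configuration (made-up keys)."""
    if config["driver"] == "sqlite":
        return sqlite3.connect(config["database"])
    # A server-based RDBMS would be added here with its own DB-API driver.
    raise ValueError("unsupported driver: " + config["driver"])


def count_open_tickets(conn):
    """Application code: plain SQL, no knowledge of where the data lives."""
    cur = conn.cursor()
    cur.execute("SELECT COUNT(*) FROM ticket WHERE status = 'open'")
    return cur.fetchone()[0]


# Local processing with the embedded engine.
local = open_data_source({"driver": "sqlite", "database": ":memory:"})
local.execute("CREATE TABLE ticket (id INTEGER PRIMARY KEY, status TEXT)")
local.executemany(
    "INSERT INTO ticket (status) VALUES (?)",
    [("open",), ("closed",), ("open",)],
)
open_count = count_open_tickets(local)
```

The "minor SQL tweaks" caveat is real even in this small case: DB-API drivers differ in parameter placeholder style (SQLite uses `?`, some server drivers use `%s`), and SQL dialects differ in details.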

[It's called Java, C#, Python, Ruby, etc. The unnatural divide between application and database is unlikely to endure for long, however, once more sophisticated languages make it possible to define applications as abstractions, with the physical location of database definitions, presentation functionality, business logic, etc., being a matter for automated optimisation rather than programmer-defined directives. Programmers should no more have to worry about these low-level trivialities in the future than concern themselves with the location of file nodes on the average disk today. When that is achieved, SQL will, at best, be reduced to the role of a protocol for communicating with legacy DBMSes.]

Well, paint me skeptical. The base idiom and syntax differences between those languages and SQL are too wide and complex to hide behind dot-path API's. The tight marriage between app language and query language of ExBase taught me of possibilities that are hard to forget. But, I imagine the future will be more like MS-Access-Done-Right rather than be language/code-intensive. -t

[I've no idea what you mean by "[t]he base idioms and syntax differences between those languages and SQL is too wide and complex to hide behind dot-path API's". I don't even know what languages "those languages" refers to.]

[Whether language is presented graphically or textually is of negligible consideration compared to language capability. Graphical language vs text-based language is like worrying about the colour of the paint on the wheels, when the significant consideration is whether to use a car or an airplane.]

[As for the ExBase, see TutorialDee for an example of application language and database integration done properly.]

Given a choice, I'd make a SMEQL-friendly (TopsQueryLanguage) app language if we are going to bypass SQL as the root query language. But for now we'll have to live with SQL.

[There are at least two TutorialDee implementations, a working one in the RelProject and a work-in-progress one from Ingres.]

I meant as far as established or accepted standards. TutorialDee is still a lab toy.


Does KayLanguage belong to this category, since it has KDB built in?

KDB provides something of a "persistent language" (ApiIsLanguage), but does require explicit storage (KDB is an in-memory, volatile database by default, though it makes persistence easy). KDB gets bonus points for supporting redundancy and distribution.

KayLanguage itself... I don't believe it qualifies. The language is designed for interactive programming, and its implementation is unlikely to 'continue' an action started by a user/programmer across, say, a laptop reset (i.e. picking up where it left off). That said, I've never used the language.


See also: PersistenceEngine, ProgrammingWithoutRamDiskDichotomy, InfiniteAmountOfTransactionalMemory, GreencoddsTenthRuleOfProgramming


CategoryPersistence


Last edited July 8, 2010