Some RelationalWeenies still deprecate all things ObjectOriented. This typically appears to derive from some fundamental misunderstandings about ObjectOriented approaches:
- Myth 1: ObjectOriented approaches are strictly about creating DomainObjects, which implicitly or explicitly define ObjectOriented (i.e., neo-hierarchical) databases.
- ObjectOriented approaches may be used in a variety of ways, but are most effective when used in the computational domain (i.e., to create computational abstractions) or to define cross-domain types such as DATE, MONEY, COMPLEX_NUMBER, GEOGRAPHICAL_COORDINATE, TEMPERATURE, etc. They are rarely effective when used to create strict hierarchies of DomainObjects, except when developing simulations.
- Myth 2: ObjectOriented-ness is defined primarily by domain-independent ObjectIdentity.
- This myth encourages the view that ObjectOriented approaches are irreconcilable with the RelationalModel, typically based on the mistaken notion that "pointers are required when you use objects." Obviously, ObjectIdentity, in and of itself, is not that significant. Nor are "pointers", as such, required -- ObjectOriented systems can be created using RelationalModel-friendly value-and-variable semantics.
- Myth 3: ObjectOriented-ness is defined primarily by mutable ObjectState.
- Myth 4: Inheritance and polymorphism are too inflexible.
- Inheritance and polymorphism are typically (and properly) used to represent invariants in the computational domain, not "real world" hierarchies, except in highly-questionable introductory textbook examples based on geometric shape or animal taxonomies.
- Myth 5: "ObjectOriented" is an unknown to be feared.
- This isn't aided by a plethora of dire textbook examples (non-ObjectOriented programmers are rarely exposed to "real" ObjectOriented programming, which tends to deal more with computational objects than DomainObjects), and the somewhat amorphous nature of terminology relating to ObjectOriented concepts. E.g., the definitions of object, class, instance, reference, variable, value, etc., are often blurred. However, "ObjectOriented" can be understood in terms of specific analytical approaches and programming language features, which -- taken as a whole -- are not irreconcilable with the RelationalModel. Specific features may be, but this does not deprecate ObjectOriented programming as a whole.
The RelationalModel and ObjectOriented programming can be highly complementary. The RelationalModel provides an effective way to manage collections of instances -- i.e., it replaces the usual container classes & instances with a clean, powerful, provable, optimisable, composable model -- whilst ObjectOriented techniques can effectively define attribute types (especially those instantiated as immutable ValueObjects) and manage imperative (e.g., application) code.
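A minimal sketch of that division of labour, written in Python purely for illustration. The Money class, the accounts relation, and the restrict helper are hypothetical names, not taken from any particular DBMS or from DateAndDarwensTypeSystem. The attribute type is an immutable ValueObject compared by value, and the collection is a plain set of tuples queried with a relational-style restriction rather than a custom container class:

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class Money:
    """Hypothetical immutable attribute type; equality is by value, not identity."""
    amount: Decimal
    currency: str

    def __add__(self, other: "Money") -> "Money":
        if self.currency != other.currency:
            raise ValueError("cannot add different currencies")
        return Money(self.amount + other.amount, self.currency)


# A relation modelled as a set of (customer, balance) tuples, standing in for
# a hand-written container class.
accounts = {
    ("alice", Money(Decimal("10.00"), "USD")),
    ("bob", Money(Decimal("2.50"), "USD")),
    ("carol", Money(Decimal("10.00"), "USD")),
}


def restrict(relation, predicate):
    """Relational restriction (WHERE): keep the tuples satisfying predicate."""
    return {row for row in relation if predicate(row)}


ten_dollars = Money(Decimal("10.00"), "USD")
# Only value equality on Money is needed; no pointers or object identity.
ten_dollar_accounts = restrict(accounts, lambda row: row[1] == ten_dollars)
```

Note that the restriction relies on nothing but value equality of the attribute type, which is why no ObjectIdentity is involved.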
Perhaps those who categorically reject OO would have less of an objection if they better understood the differences between using ObjectOriented approaches to:
- (a) Define database schemas, i.e., implement ObjectOriented databases for general-purpose data management via OO database tools or ObjectRelationalMapping layers,
- (b) Structure and modularise imperative code, and
- (c) Define attribute types.
These days, the typical RelationalWeenie who accepts OO:
- Deprecates (a), for all the reasons (problems with optimisation, ad-hoc queries, etc.) that ObjectOriented databases have largely not succeeded for general-purpose data management,
- Embraces (b), for all the reasons that ObjectOriented programming has succeeded in the marketplace, and
- Accepts (c) or recommends models deemed particularly suited to the RelationalModel, such as DateAndDarwensTypeSystem.
-- DaveVoorhis
One problem is that user-defined types are more difficult to use and share across multiple languages. Existing RDBMS tend to offer a limited set of types, but at least most apps can use them without too much cajoling and complex adapters. Types don't cross language boundaries very easily, and until somebody "solves" this, composing complex types with simpler built-in types has generally been smoother. In fact, I think RDBMS should not support a direct Boolean type, just use 1 and 0 integers, because it's fairly difficult to share with different languages. For example, some languages do not have a "null" Boolean. (Related: CrossToolTypeAndObjectSharing.) I'd welcome experiments with complex user-defined types in RDBMS, but am skeptical they will succeed at this point. -top
I remember when similar scepticism was expressed about computers, "high level" (i.e., not assembly) languages, structured programming, PCs in business, GUIs, object orientation, C++, the Web, Java, DotNet, and so on...
From a sharing stand-point? Problems with X and problems with X sharing may be apples and oranges. Regardless, ideas should be adopted because they prove themselves in sufficient pilot projects of a relevant/comparable niche, not merely because "new equals good".
Indeed. However, a number of us who have implemented complex applications which require complex types and complex code -- and have struggled to shoe-horn these into existing tools including DBMSes, current application languages, and the mechanisms that painfully integrate these -- can easily envision better ways of doing things. Implementing them is merely (!) a matter of time and effort.
I agree that existing RDBMS are limited, but the solution is not necessarily judicious use of "types". There may be many different kinds of solutions (including extending relational). Without specifics, I cannot make recommendations here.
I don't recall asking you to make recommendations. It's notable that DateAndDarwen explicitly do not propose extending the RelationalModel, but they do go into considerable detail on what form a RelationalModel-friendly type system should take. Codd was clear that attributes must have types (i.e., attribute values belong to domains), but not explicit about what form the type system should take.
As argued in DoesRelationalRequireTypes, I suggest that the type system ("domain math") be considered a logically separate system (as far as efficiency allows). Thus, the "types" would not be something the relational engine has to concern itself with as long as the domain math has the necessary hooks (interface requirements). But an unfinished issue is when "cell values" have expandable structures in them, not mere scalars or fixed-quantity sub-components/sub-slots, like phone numbers. Such structures would gum up what relational is about, especially if one has to use domain-specific accessors to manipulate such structures. This is the "encapsulation problem" again. -t
I'm curious: What do you think "relational is about", such that it would "gum up" with complex user-defined types? Also, I'm not clear what you intend to gain by separating the type system from the "relational engine", other than introducing the apparent, uh, "encapsulation problem" you've noted above. However, sharing complex types is highly attainable using ideas suggested by mechanisms already in existence, such as JSON, YAML, and the techniques employed by CORBA, WebServices, et al.
- Relational is mostly about operations on sets of tuples that return sets of tuples. The scalars, or simpler types ("fixed types") are the elements of the tuples. Collections are managed via relational, not custom functions with ADT wrappers (encapsulation). Encapsulation neuters relational. It is very difficult to have in a given collection both encapsulation and "access" to the relational system. It forces a hard decision that is difficult to just back out of when requirements for the collection change. In our common example, a "stack" is no longer a "stack" if the nodes within are available to relational operators. In relational the relationship between operators and operands tends to be viewed as complex (potentially many-to-many), not tightly bound (one-to-one or hierarchical). This reflects the set theory philosophy in relational and nested little state-machines in ADT/objects.
- I see almost nothing in the quoted paragraph that reflects the RelationalModel as defined by Codd and maintained by DateAndDarwen, or indeed any other writer on the subject. Perhaps it would be more accurate to replace "[R|r]elational" in the text above with "table-oriented programming", so that outright inaccuracies like "[t]he scalars ... are the elements of the tuples" -- as if tuple attributes must be scalar (untrue) -- can be given an appropriate context. Most of the rest of the paragraph is meaningless ramble, such as "[i]n relational the relationship between operators and operands tends to be viewed as complex." Huh?
- I was describing a "should", not a definition. Relational does not, or at least should not, dictate what is "in the cells" beyond a minimum set of requirements needed to interface with relational. I was not clear on the transition in my paragraph, and I apologize.
- Hmmm... I must have misunderstood your use of English, because I interpreted "is" in your "Relational is mostly about ..." to mean what the RelationalModel is, as opposed to what it should (a word notably missing from the above) be. As a "should", then, I again suggest that you should incorporate your "shoulds" into Top's TableOrientedProgramming model, rather than tilting at the RelationalModel windmill. We can then effectively compare and contrast the two models, without pointless diversions into what the RelationalModel is or is not, or what it should or should not be.
- You asked two different questions and I munged the answers together when I probably should have separated them. You basically want to use the type system at the structure level, whereas I'd prefer to use relational to manage the structure-level. It gives better reuse and more flexibility because the boundary between one "type" of structure and another is thin-walled and easily morphed. I seek flexibility in my designs/tools while you seek compile-time-checking. Adaptability with minimal effort is more important than having the machine prevent as many "errors" as possible for the projects I work on. -t
- Nice diversion. I again suggest that you refer to -- and expand -- your TableOrientedProgramming model rather than attempt to "use relational to manage the structure-level." Note that DateAndDarwen have already defined an effective TypeSystem (i.e., DateAndDarwensTypeSystem) that appears not to require any reference to tables or RelVars in order to provide user-defined types. However, whether or not these are a sufficient (and effective) way to implement the richness of types encountered and/or desired in the real world is a matter for further analysis and research, as is analysis and research on the equivalent facilities in your TableOrientedProgramming model.
- That's ArgumentFromAuthority, plain and simple, bub. Plus, I see no evidence that they addressed basic issues about the relationship and convertibility/shareability between structure-like types and relations.
- How could it be an ArgumentFromAuthority, when it's not an "argument" at all? I'm not saying your TableOrientedProgramming approach is wrong because DateAndDarwen say so. I'm simply pointing out the benefits of distinguishing your TableOrientedProgramming approach from the RelationalModel, such that they can be compared and contrasted in terms of their respective abilities to model complex real-world and computational situations -- including types (or not) -- without the distraction of pointless quibbles over what you feel the RelationalModel should or shouldn't be. Why not refine your TableOrientedProgramming model, implement it, and show us its benefits?
- Are you suggesting that "relational" is defined by DateAndDarwen? (I smell another Nygaard-vs-AlanKay kind of LaynesLaw problem here.)
- No, but I am suggesting that your use of the term "relational" -- which I assume is intended to refer to the RelationalModel -- differs enough from the generally-accepted definition of it (to which DateAndDarwen and others adhere, in keeping with the principles of DrCodd's original paper) that it simply seems reasonable to treat your approach as a distinct model.
[The idea that 'domain math' can be separated from relational is well accepted. In an RDBMS, of course, efficiency is an issue (especially for cluster-based queries - likeness, relative distance, etc.) so an RDBMS may need a lot more 'knowledge' of the domains it is expected to index. The idea that variable-sized structures or domain accessors "gum up" relational seems to be Top's belief alone, and seems to be based on little more than his distaste for types.]
Indeed, Codd's original paper shows that in terms of the abstract RelationalModel, domains (types) exist and need to support (minimally) tests for value equality, but are otherwise irrelevant. Practical DBMS implementations, however, are a different case. I note that Top's apparent distaste for types is a peculiar one; his writings suggest that despite his protestations, he actually likes types (or he'd be an assembly language or FORTH programmer) -- he just likes using them the hard way.
I agree that implementation practicalities complicate clean logical separation. But over time more powerful hardware and practice tuning open-source component interfaces often allows us to move incrementally closer to the ideal. As far as the benefits of heavy type usage, until I see it helping for my domain, I'll remain a type-lite fan. (It may indeed help in other domains. I won't argue that. Best tool for the job.) -top
Top: Relational is mostly about operations on sets of tuples that return sets of tuples. The scalars, or simpler types ("fixed types") are the elements of the tuples. Collections are managed via relational, not custom functions with ADT wrappers (encapsulation). Encapsulation neuters relational. It is very difficult to have in a given collections both encapsulation and "access" to the relational system. It forces a hard decision that is difficult to just back out of when requirements for the collection change. In our common example, a "stack" is no longer a "stack" if the nodes within are available to relational operators. In relational the relationship between operators and operands tends to be viewed as complex (potentially many-to-many), not tightly bound (one-to-one or hierarchical). This reflects the set theory philosophy in relational and nested little state-machines in ADT/objects.
- RE: The scalars, or simpler types ("fixed types") are the elements of the tuples. - This is TopMind's belief and has no basis in RelationalModel.
- I meant "should be" -t
- RE: I meant "should be" - This is TopMind's belief and has no basis in reality.
- Because you say so?
- Because you've not based it in reality. A belief without real hard evidence and logic has no basis in reality. You've made it up, insisted it true with much HandWaving, and repeated it in page after page after page. It still has no basis in reality.
- Where has macro-rigor EVER been applied in software engineering outside of machine performance? [A quick example is CMMI.] It's back to my techniques can beat up your techniques kind of arguments. Let's not go there again, there's enough bulk on that debate already.
- Your belief that it's okay for you to make baseless statements simply because you feel some other people do so is noted. Are you aware of your own hypocrisy in having issued a "should be" claim without having hard reason and numbers to back it up? If you aren't willing to "go there", then retract the claim.
- Nearly all of this wiki would be deleted if hard evidence was a requirement. Software engineering is a soft science, for good or bad. And stop over-indenting, please. CMMI is not scientific, by the way. If that's the best example you have, I don't feel threatened by your brilliance at all. -t
- Software engineering is an engineering discipline, not a science, but is still subject to rigor. CMMI is not a science, but is still an example of macro-rigor applied in software engineering outside of machine performance (see http://www.sei.cmu.edu/cmmi/results.html for those 'hard numbers' you're always seeking).
- Those I looked at are brochures, not independent studies. You asked for "hard evidence" from me, but cannot deliver it yourself. Now you appear to be back-pedaling, saying "engineering is not really science" (paraphrased). It appears to me that you are being hypocritical here. If I am speaking as an engineer, then why should you demand "hard scientific evidence" from me when you seem to believe that it either does not exist or does not apply to engineers? (I personally believe that software engineering is largely about psychology, which is not a "hard" science at this point.) -t
- Pointing out strong evidence of your hypocrisy is not the same as making a demand for hard evidence for the hypocritical claim; that inference was all you. My own standard has never been evidence, and has always been validly reasoned justification - something you have only rarely succeeded at providing. Also, as a note, SEI is a source of independent study regarding results (that is, independent of the companies integrating CMMI). But it does fold those results up into bite-sized brochures for people like you.
- I was merely clarifying a statement there, not making a claim there. This whole topic is about exploring that claim, but you seem to demand a one-paragraph proof at the spot of clarification. It is a spot for clarification of a specific statement, not the evidence spot. Understand? You should think before insulting. -t
- Yes, I expect a one-paragraph explanation or justification of any statement that is likely to be controversial... on the spot. Further, explanations and justifications are effective tools for clarification (both clarify 'reasons' for a statement, and by so doing clarify the statement), so your little "that's a clarification spot" argument really ain't holding much water.
- Does this apply to your magic Type-A-Tron also?
- If I ever promote a magic type-a-tron, it will certainly apply there, too. But, thus far, I've managed to avoid appeals to use of HandWaving and magic in my support of types. I prefer to focus on refining what has already been done, such as in XML, Haskell, OCaml, MozartProgrammingSystem, and Alice.
- OCaml? Languages that almost nobody uses in the wild? People did HandWaving alright, and waved "bye bye" to these lab-toy languages. Finally that silly term has meaning.
- Sigh. Stop it with the overt fallacy, would you? There is no reason to believe type systems have anything to do with OCaml's failure in the wild; you know as well as I that market success of platform systems (languages, operating systems, CPUs, transport, databases, protocols, guns & ammunition, etc.) is influenced by a great many factors other than technical quality, not the least of which is the incumbent advantage granted due to a VirtuousCircle in which success begets and maintains success. Your SMEQL is doomed to the same fate, I'd put money on it: even if you completed SMEQL and proved it could do anything SQL could do better, it will fail. The problem isn't people waving "bye bye"... it's that most will never even know the language exists, and the rest won't care because it will appear neither maintainable (since nobody else knows it) nor portable (because nobody else supports it).
- This topic is about popularity to some extent ("embrace").
- Even if that statement weren't a stretch, it isn't relevant.
- RE: Collections are managed via relational... - Relations are managed via relational. TopMind's thought that other collections 'should' be represented in relational has, again, no basis in the RelationalModel. It isn't even true in practice, given the prevalence of variable length 'strings' (collections of characters) as a relational type. There also isn't a problem in RelationalModel with relations containing relations (so long as they are properly relating the relations). TopMind should perhaps reconsider his position of arrogant superiority in claiming what relational is and is not.
- If you have non-relational collections, which is certainly possible, such are in a different universe from relational, with a narrow portal to connect them. Strings are a wonderful example of this: you cannot perform relational operations on strings and vice versa without conversion and copying. They are two different "collection universes". It is difficult to blend them if you wanted to. -t
- I submit a challenge: is TopMind prepared to offer a sound and cogent argument that stacks, trees, and other structures are any less "wonderful examples" of this "different universe" phenomenon than are "strings"? If not, I propose to dismiss as indefensible TopMind's attempted distinction between which collections should be managed via relational and which are in 'different universes'.
- It is similar to the issue of NonOrthogonalLanguageFeatures. Don't have different things that are only kind of different unless there is a clear need. Merge them! Screw IS-A; give me the FeatureBuffetModel. "Tree" should be a view. If we make it a strict ADT, then it cannot directly participate in relational reindeer games with the rest of Santa's data. -top
- Sorry, but your shouting and bold-face assertions and "reindeer games" claims and "Screw IS-A" argument are anything but a sound or cogent argument that strings aren't some indefensible arbitrary line in the sand. Logically, your argument is full of holes - it seems you appeal to emotion rather than logic or reason. Can you not demonstrate, with reason, that "tree values" are "only kind of different" from relations whereas "strings" are not? Can you not show, logically, that there is a "clear need" for strings whereas there is no "clear need" for other variadic structures? Can you not demonstrate that somehow, fundamentally, 'tree' structured values need to participate in 'relational reindeer games' while 'strings' do not? If you can't reply properly to the challenge, then your assertion that strings are special is inconsistent, illogical, and unreasonable.
- See MagicEverythingMachine. See RedHerring. That's not a real reply. Yes, that's exactly what "See RedHerring" means in this context. It means "See MagicEverythingMachine is not a real reply."
- RE: custom functions with ADT wrappers (encapsulation) [...] neuter relational. - Two points: (a) ADT wrappers don't imply encapsulation. (b) Given access to an equality operator that is reflexive, transitive, and commutative, encapsulation in no way 'neuters' relational.
- RE: stack example - Relational only requires that 'stack' as a domain type has an equality operator. In any case, since we're dealing with values or value-objects only, there is very little difference between a stack and a list.
- RE: It is very difficult to have in a given collections both encapsulation and "access" to the relational system - Not as difficult as you imply, but it does require an operator not found in SQL if you wish to, for example, 'join' on all elements of a stack 'domain type'. A generalized 'aggregation' operation does the job very well.
- If you can see the contents of a stack without popping them, then it's not a true stack.
- And yet you can still have a true stack if you 'see' the contents of a stack simply by popping all the elements.
- Your "stack" has now merely become a view of sorts. You are moving my way here without realizing it. And SQL is not the pinnacle of relational. -t
- "merely a view" is utter bullshit. There is nothing "mere" about views. Views, and the differences between representation and what is represented, are very significant in terms of both computation, communication, and framing of problems, and you are terribly foolish to dismiss anything as "merely" a view. Related: JustIsaDangerousWord, TheMapIsNotTheTerritory.
- Perhaps we need to start at square one here. What exactly is a stack in your own words?
- A stack is a simple set of operations intersecting a simple set of invariants: pop(push(X,S)) = S, peek(push(X,S)) = X, isEmpty(newStack()) = true, isEmpty(push(X,S)) = false.
- Is this an exhaustive list of operations? In other words, are there any new operations that would render S a non-stack? (other than deletion or outright replacement.) -t
- Yes, this is an exhaustive list. One can (barely) get away with assigning failure conditions (e.g. exceptions) to pop(newStack()) and peek(newStack()) since those operations are undefined, but not all systems support DependentTyping. Otherwise any "new operation" that reduces a stack to a value would need to be defined in terms of the existing reductions (pop, peek, isEmpty).
- PageAnchor: milstack
- So if my manager asks for a sorted report of the elements in a million-element stack, unless I implement it by performing a million pops and pushes, it's no longer a stack? -t
- Nothing about the definition of stack specifies implementation, and you should probably be careful to avoid interpreting manager-speak using programmer jargon.
- If a manager asks for X, you better deliver X, or have a pretty explanation beyond, "It would violate ADT purity to give you such a report, so I won't do it." I've worked with zealots like that, and they didn't last long. Probably went into teaching or something. My point is that the real-world may ask for an "X-ray" of the so-called stack, such that we need to do DB-ish things with the nodes in the stack. A strict stack wouldn't allow such, at least not in a reasonably efficient manner. -t
- I shall roast some marshmallows as I watch that StrawMan burn.
- My point is that the concept of an ISOLATED "stack" is too limited. The real world wants catdogs. You still have not addressed how you'd deal with this situation.
- I have addressed it, more than once here, more than once in RelationalTreesAndGraphsDiscussion, and you are clearly not paying attention. When you need something other than a stack, you apply a function, and, behold! you have something other than a stack. You may produce a relation. If you can define 'catdog' in terms of values, you may produce those too. FoldFunction and other HigherOrderFunctions make this easy. The need to apply a function to produce a view is no different for DomainValues than it is for relations. Indeed, any relational operator may be described as a function closed over relations, and relations may be DomainValues.
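The four invariants listed above can be checked mechanically. The following is only a sketch in Python, assuming one possible representation (an immutable linked pair, with None standing in for newStack()); the function names mirror the operations named in the definition, and the exceptions for the empty stack are the "failure conditions" workaround mentioned above rather than part of the definition itself:

```python
from typing import NamedTuple, Optional


class Stack(NamedTuple):
    """Immutable non-empty stack value: a top element plus the rest."""
    top: object
    rest: Optional["Stack"]


EMPTY = None  # assumed representation of the empty stack


def new_stack():
    return EMPTY


def push(x, s):
    # Returns a new value; the original stack s is left untouched.
    return Stack(x, s)


def pop(s):
    if s is EMPTY:
        raise IndexError("pop is undefined on the empty stack")
    return s.rest


def peek(s):
    if s is EMPTY:
        raise IndexError("peek is undefined on the empty stack")
    return s.top


def is_empty(s):
    return s is EMPTY


def invariants_hold(x, s):
    """Check the four stack invariants for one element x and one stack s."""
    return (
        pop(push(x, s)) == s
        and peek(push(x, s)) == x
        and is_empty(new_stack())
        and not is_empty(push(x, s))
    )
```

Because stacks here are values with structural equality, two stacks built from the same pushes compare equal, which is all the RelationalModel asks of a domain type.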
PageAnchor: stack_example
RE: apply a function, and, behold! you have something other than a stack
How about an illustration of some kind for the million-node report? (Something tells me you will plan to do so only after my join example.)
The stack report is a contrived example, of course (fundamentally, a stack isn't a DomainValue unless you're relating stacks to things as opposed to relating elements-of-stacks to things). But, supposing your goal is an 'X-ray' of a million-node stack into a million-element relation, I'll offer an example using TotalFunctionalProgramming to guarantee termination. But you could drop that syntactically guaranteed termination and translate the following easily enough into OO using 'pop' 'peek' 'isEmpty' and a FunctorObject instead of an HOF:
define StackFold =
{fn Fold: push(X,S) => {fn: HOF => {fn: Seed => {Fold S HOF {HOF X Seed}}}}
| empty => {fn: HOF => {fn: Seed => Seed}}
} // implemented to take advantage of TailCallOptimization
define SequentialUnion = {fn: Elt => {fn: (Counter Rel) => (s(Counter) {union &[(Counter Elt)] Rel})}}
define Report = {StackFold MillionNodeStack SequentialUnion (zero &[])}
The above is a left-fold, but switching to a right-fold would allow one to leverage LazyEvaluation rather than TailCallOptimization.
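For readers unfamiliar with the notation, here is a rough Python rendering of the same left-fold. It is a sketch under assumptions, not a faithful translation: the (top, rest) tuple representation is invented for the example, a while loop stands in for TailCallOptimization, and the accumulator set is mutated in place for speed where the original builds a fresh relation via union.

```python
EMPTY = None  # assumed representation: a stack is EMPTY or a (top, rest) pair


def push(x, stack):
    return (x, stack)


def stack_fold(stack, fn, seed):
    """Left fold over a stack, top element first (a loop instead of a tail call)."""
    acc = seed
    while stack is not EMPTY:
        top, stack = stack  # peek and pop in one step
        acc = fn(top, acc)
    return acc


def sequential_union(elt, state):
    """Add (counter, elt) to the relation and advance the counter."""
    counter, rel = state
    rel.add((counter, elt))  # in-place for efficiency; see the lead-in
    return counter + 1, rel


def report(stack):
    """'X-ray' a stack into a relation of (position, element) tuples."""
    _count, rel = stack_fold(stack, sequential_union, (0, set()))
    return rel


# Small demo stack; 20 ends up on top (position 0).
demo = EMPTY
for sale in [30, 10, 20]:
    demo = push(sale, demo)

relation = report(demo)
# Once it is a relation, sorting (or any other relational operation) applies.
sorted_report = sorted(relation, key=lambda row: row[1])
```

The result of the fold is an ordinary relation, so the sorted report is produced by operating on that relation, not on the stack.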
I'd need clear goals to run further comparisons. The elements can be shared (useful if they are large strings or whatever). Other problems, like asking for all stacks that contain a particular element, would need the same sort of domain-specific indexing support necessary to ask for all strings containing specific words. That particular issue can be solved by generalizing the indexing mechanisms to support HOFs on what aspects of DomainValues need to be indexed, and can be written in the same functional language as the above.
Not sure I follow your notation here. But it looks like you are inventing an FP query language. And searching for a particular element would not be an uncommon request. One cannot know up front all possible tasks/operations asked of a non-trivial collection. And the user of the data has to learn your different little query language here. And these are just a basic WHERE and ORDER-BY operation at this point. Why do we need 2 different query languages to do the same thing?
SELECT * FROM stackNodes WHERE color='red' ORDER BY sales_region
That is by no means equivalent. If I had stacks as DomainValues, that means I probably have hundreds, if not millions, of different stacks, and the operation you describe would return nodes from every single one of them. Now, perhaps that is your goal; it could be done with functional+relational easily enough if it is, but we really need to compare systems that are functionally equivalent (i.e. produce the same information, even if it is represented differently) before we can reasonably compare non-functional properties.
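The difference can be seen with a small experiment, sketched here in Python with the standard sqlite3 module. The stack_id and position columns are assumptions added for the illustration (the query above names only color and sales_region), as is the sample data:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: stack_id says which stack a node belongs to.
conn.execute(
    "CREATE TABLE stackNodes ("
    "stack_id INTEGER, position INTEGER, color TEXT, sales_region TEXT)"
)
conn.executemany(
    "INSERT INTO stackNodes VALUES (?, ?, ?, ?)",
    [
        (1, 0, "red", "west"),
        (1, 1, "blue", "east"),
        (2, 0, "red", "east"),
        (2, 1, "red", "north"),
    ],
)

# The query as posed: red nodes from every stack in the table.
all_red = conn.execute(
    "SELECT stack_id, position FROM stackNodes "
    "WHERE color = 'red' ORDER BY sales_region"
).fetchall()

# A report on one particular stack needs an extra restriction.
one_stack_red = conn.execute(
    "SELECT stack_id, position FROM stackNodes "
    "WHERE color = 'red' AND stack_id = ? ORDER BY sales_region",
    (1,),
).fetchall()
```

With two stacks in the table, the first query mixes nodes from both, while the second returns only the nodes of stack 1.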
RE: not know up front all possible tasks/operations - that's fine. There is no issue with that. Functional and OO are both incredibly flexible when it comes to specifying and composing operations (though I'll amend that: traditional OO really needs a companion language for constructing object configurations ideally with support for DependencyInjection, whereas functional composes readily without that extra effort).
RE: user of the data has to learn your different little query language here - the user needs to learn the representation of the stacks and how to perform domain math operations over them in any case; in your case they need to learn how to join on equal stacks, data entry, and have a tough time performing operations that involve more than one node at a time (e.g. return just the stacks with three red-nodes followed by three blue-nodes).
RE: these are just a basic WHERE and ORDER-BY operation at this point - How so? Please clarify/justify this claim.
RE: Why do we need 2 different query languages to do the same thing? - I'll return the question to you: Why, in Top's approach, do we need 2 different languages for testing equality between and otherwise handling DomainValues (based on whether they belong in a "cell" or not)? As far as "needing 2 query languages", the idea is to support the RelationalModel of data while allowing queries to restrict, aggregate, and perform other operations based on domain math.
- Let's see its vocabulary and some usage samples.
- Sure, just as soon as it has a fully defined vocabulary. For now, all you get to look at are motivations. There are some more motivations and example use-cases in CrossToolTypeAndObjectSharing, too.
In any case, I suspect you're imagining the use of stacks as containers for 'data', but in the sense of DomainValues, stacks (according to the RelationalModel, as opposed to TableOrientedProgramming) shouldn't be treated any differently from short strings or integers. They don't contain data - that is, individual nodes say nothing about the world. Instead, it is the relationship between stacks and other DomainValues that is data. One needs the ability to operate over selected stacks the same way you'd operate over integers. Your approach fails to do so with even a modicum of convenience.
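To make "operate over selected stacks the same way you'd operate over integers" concrete, here is a small Python sketch. Everything in it is hypothetical: stacks are represented as immutable tuples (top first), and the holds and priority relations are invented for the example. The join needs only stack equality, and the multi-node restriction (three red nodes followed by three blue nodes, as mentioned earlier) is ordinary domain math on the stack value:

```python
# Assumed representation: a stack value is an immutable tuple, top element first.
stack_a = ("red", "red", "red", "blue", "blue", "blue")
stack_b = ("blue", "red")

# Two relations whose attributes include whole stacks as domain values.
holds = {("alice", stack_a), ("bob", stack_b), ("carol", stack_a)}
priority = {(stack_a, 1), (stack_b, 2)}


def join_on_stack(left, right):
    """Join (name, stack) with (stack, prio) using nothing but stack equality."""
    return {
        (name, stack, prio)
        for (name, stack) in left
        for (other, prio) in right
        if stack == other
    }


def has_run(stack, pattern):
    """Domain math: does the stack contain pattern as a contiguous run?"""
    n = len(pattern)
    return any(stack[i:i + n] == pattern for i in range(len(stack) - n + 1))


joined = join_on_stack(holds, priority)

three_red_three_blue = ("red",) * 3 + ("blue",) * 3
# Restriction on a stack-valued attribute, as one would restrict on an integer.
matching_holders = {name for (name, stack) in holds
                    if has_run(stack, three_red_three_blue)}
```

No node ever appears as a row of its own here; the facts recorded are about who holds which stack and what priority a stack has.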
Being able to view a stack as a "cell" element is not really the problem here. The real problem is *only* being able to view it as a cell element.
I agree, that would be a problem if one's hands were tied and one had no mechanism to view stacks in other ways. Fortunately, it isn't a real problem, because it isn't a problem at all.
To get a flexible system we also want to use our existing query operators and DB infrastructure on them (elements of the collection) without having to code these features by hand for each and every new "collection type" or copy them in and out of various "containers" to use those container features. You can have your cell view as long as you don't hide "the structure" from the collection-oriented side of the DB. Any given record/node should be able to belong to a million dedicated structure "types" if need be. Think set theory: any given node can be a member of lots of different sets. -t
[So what if an item can belong to lots of different sets? The approach you are arguing against doesn't prevent the items in the stack from belonging to different stacks. In fact, in exactly the way an item can belong to different sets, it can belong to different stacks. And nothing about the approach prevents an item from belonging to a stack, set, or any other collection. However, your approach does prevent a cell view. Each and every time the "cell" as a whole needs to be accessed, we have to explicitly manage that structure.]
RE: "To get a flexible system we also want to use [...]" - Top, you have either mixed up the goal with the strategy, or you have been willfully negligent in your failure to recognize how other strategies on the table for achieving flexible systems accomplish the same goal. Either way, it's irritating. GOAL: We want an efficient, convenient, and ad-hoc flexible system that supports DomainValues from a variety of domains (possibly to better allow CrossToolTypeAndObjectSharing). MANY STRATEGIES:
- One approach involves representing cell types as relations (it's relations all the way down! except for the empty relation, of course) or even full databases (records of relations!) in a manner similar to ZF SetTheory. (That approach can be highly optimized without difficulty, and could easily become intuitive; I gave it some serious thought a while back.)
- Another approach involves representing cell types from CategoryTheory, offering sum-types and codata, offering Fold and Unfold functions over everything, perhaps mixing in a few primitive types and operators that are easy to understand (like atoms or words, maybe relations, too, in order to have full SymmetryOfLanguage)
- Top's proposed approach is to (for 'complex' DomainValues at some arbitrarily decided measure of complexity) force cells to carry references into external tables and use relational operators in order to operate with them, in addition to somehow hacking in arbitrary domain math, equality tests, etc. (presumably via scripting language plugins or something).
- Please define "external". -t
- 'external' tables = tables outside the cell, as opposed to having tables inside the cell
- In cyber-land, location is an abstract concept. [No more abstract than location is for real-land.] I'm only saying that it's more efficient and less complexity to use the same collection management system for all collections unless there's a clear reason to reinvent the wheel for an exception. We don't want node-by-node navigation. That's the 1960's. If we need to lock out somebody or something from using certain relational features on a collection set, then it should be a security issue, not an issue of "structure type X didn't bother to implement Y even though the database does Y, so you are SOL on Y unless you write a node-by-node loop using Fold". That is what I find dumb and limiting. -t
- Your unjustified claims about efficiency and complexity and exaggerations about being 'locked out of relational' are noted for the umpteenth time.
- Please illustrate in sufficient detail how a "pure" stack is not locked out (without storing pointers to table cells). -t
- See 'stack_example'. Now please illustrate in sufficient detail how use of relations offers "more efficient and less complexity" for a stack equality comparison.
- I don't understand your weird notation, let alone its optimization techniques to use existing indexes, existing data (without copying), relational operations, etc. Why should the DB user have to learn both relational notation and your weird "other" notation? Why 2? And, "fold" is similar enough to existing relational idioms that we might as well use them. You seem too fond of your FP conventions to consolidate such. (Perhaps the same can be said about me and relational, but at least I am using mainstream conventions as a basis for expansion.) We might as well use the relational-influenced version of Fold instead of inventing something that is 70% the same: -t
SELECT myFoldLikeOp(...) WHERE foo='bar' AND blah=7
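For comparison, one way a fold-like operator can be plugged into SQL is a user-defined aggregate. The sketch below uses Python's standard `sqlite3` module purely as a stand-in DBMS. The table `nodes`, the column `weight`, the `FROM` clause, and the class `SumFold` are assumptions added for illustration; only the operator name and the `WHERE` clause come from the line above. The fold is deliberately commutative (a sum), because SQL does not promise the order in which rows reach an aggregate.

```python
import sqlite3

class SumFold:
    """User-defined aggregate: a fold over the rows selected by WHERE."""

    def __init__(self):
        self.total = 0            # initial accumulator

    def step(self, value):
        self.total += value       # combine the accumulator with one row

    def finalize(self):
        return self.total         # result of the fold

conn = sqlite3.connect(":memory:")
conn.create_aggregate("myFoldLikeOp", 1, SumFold)

# Hypothetical table; foo and blah are the columns named in the query above.
conn.execute("CREATE TABLE nodes (foo TEXT, blah INTEGER, weight INTEGER)")
conn.executemany("INSERT INTO nodes VALUES (?, ?, ?)", [
    ("bar", 7, 10),
    ("bar", 7, 5),
    ("bar", 8, 100),
    ("baz", 7, 1000),
])

(result,) = conn.execute(
    "SELECT myFoldLikeOp(weight) FROM nodes WHERE foo = 'bar' AND blah = 7"
).fetchone()
```

An order-sensitive fold (such as rebuilding a stack) would need the ordering carried in the data itself, which is part of what is being argued about here.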
Strategies need to be evaluated. Not all of them actually achieve their goals. For example, I don't believe Top's approach offers efficiency or convenience, and I don't believe one can justify a claim that it offers more flexibility than the other strategies. I don't even view it as easier to implement: InventorsParadox seems to favor the broader strokes offered by ZF SetTheory or CategoryTheory. Thus, (between this and other observations from SMEQL/TableOrientedProgramming) I evaluate it as: looks more promising than SQL, looks less promising on almost every measure than other options on the table.
RE: "You can have your cell view as long as you don't hide the structure" - this is another statement of strategy, not goal. The goal you aim to achieve here is composition (which supports ad-hoc flexibility). Achieving this goal doesn't require that "structured" values be directly exposed to relational operators (though various approaches, including yours and the one derived from ZF SetTheory, allow that); it only requires the ability to view a value as a relation and perform relational operations over DomainValues. All three of the above strategies allow one to compose relational operators with cell DomainValues; the OO and CategoryTheory approaches simply make the composition indirect through an extra primitive. (related: PrimitivesAndMeansOfComposition.)
RE: "Any given record/node should be able to belong to a million dedicated structure "types" if need be. Think set theory..." - this is a statement of goal, and does not imply your strategy. As noted above, there is no issue with the same node or value existing in many different structures (stacks, lists, trees, etc.). If the goal is even greater efficiency, such that representations are shared, that goal is favored by all approaches except yours, Top, and actually counts as a point against your approach. Sharing of structured values is performed quite easily, with O(1) time on construction by interning of large and otherwise structured values through an intermediate hashtable (though I'll amend: some special efforts are needed for even-more-optimal interning of large strings and lists, usually involving a behind-the-scenes representation strategy called ropes). Relevantly, for ZF and CategoryTheory/FP approaches, this can happen entirely behind the scenes, applying to all types and values at once, requiring no special efforts, nor any explicit GarbageCollection, to achieve. However, Top's approach requires special application-side efforts to share nodes/values/etc. across structures anywhere near as optimally. E.g. if treeZ has treeX as its LHS and treeY as its RHS, one must design the data entry helper to search for treeX and treeY to see if they are already in the database. And then one must be careful to keep treeX and treeY around for as long as treeZ is around. Cascading deletes can help here, but only if explicitly noted, and they'd require some fairly complex specifications; another approach is application-side nightly GarbageCollection (which is okay, I suppose, since I didn't list continuous service among the goals, though I probably would for my own work).
- Please clarify. -t
- Hmmm... "TopMind's approach unsuitable for automatic sharing. Other approaches don't suffer this problem. Therefore Top's approach worse by Top's own measure of values/elements belonging to million dedicated structure types." That do it for you?
- Because you say so, it must be true.
- The explanation is in the above paragraph. But I have learned that, to you, "clarify" means "dumb it down for me, please".
- That your ego-protecting euphemism for "articulating properly"?
- No. Indeed, I was tempted to put "Oook ook" at the end, as I felt like I was writing Tarzan speak for an illiterate child. Nice to finally have an example of what you consider to be "articulating properly".
- Let me rephrase it: your writing sucks.
- I guess that puts us on even ground then, because your logic sucks and you never say anything worth understanding.
- If the logic was objectively wrong, then convert it into formal or semi-formal logic "outline form" along with givens, like a classroom proof. (Warning: past attempts at something similar usually turn into interpretation fights or LaynesLaw. People tend to mistake internal "notions" for truth that others don't necessarily accept.) -t
- That is yet another illogical request. If your logic is wrong, all I should need to do is point out the fallacy or the assumption you made that you haven't defended. Which. I. do.
- I haven't seen you point out any pivotal objective logic flaws on my part. Usually it's your mis-interpretation of something I've said, such as mixing up statements I've made under different topics (and under different goals). YOU seem to be the problem here, not me. (I agree that English is meant for a physical world and we are dealing with non-physical concepts often, but you don't have to be rude about miscommunication. I'm only rude as retaliation, not as a first resort.) -t
- When you start arguing from logical principles, then I'll be able to point out individual flaws. At the moment, the objective flaw in your logic starts right when you start speaking: you don't even bother with logic. You make or imply vague accusations, claim strengths of your own approach without justification, ignore the counters (or fail to comprehend them), ignore requests to justify your own claims, repeat your accusations and claims - you exhibit the behavior of a man who convinces others by trickery and deceit rather than logic or reason. If you feel otherwise, how about you provide an honest start: please logically defend your implied claim that "destructured values are easily able to belong to a million dedicated structure 'types' whereas structured values cannot easily do so". If you make another claim I consider indefensible, I'll point it out to you too. Perhaps, eventually, you can work your way back to logical principles.
- Software design is a soft science. That is not my fault. Nobody has ever proved their pet paradigm/tool objectively better outside of hardware performance or some very narrow metric. I sense a double-standard because you personally don't like me or because I don't use ivory-tower-speak. Again, show me a sample of a good proof and I will emulate its techniques and style. -t
- I asked for a logical defense, not a proof that your pet paradigm is objectively better on some vague 'macro-rigor' scale. A logical defense only requires (a) a justification for your claim, (b) that all elements of that justification can be logically defended as well, and (c) that the justification makes a meaningful distinction regarding the point of contention (e.g. selecting one option/feature/decision over others). As an example regarding a design decision made last year for one of my projects:
- I choose to use TotalFunctionalProgramming because it allows greater analysis and is guaranteed to terminate. These points are defended by RicesTheorem and the definition of TotalFunctionalProgramming respectively.
- I desire termination because it allows for arbitrary PartialEvaluation optimizations, and also because I'll be running untrusted code and I'd like it to finish, and presumably because it would simplify debugging (if only because I don't need to consider whether or not an infinite loop is the problem).
- I'll be running untrusted code on my machine because running code on my own machine allows a reduction in network latency, and because the network is faulty and I won't always be connected but I'll sometimes need service from the software anyway.
- And so on. If my defense of a decision or claim is logical, then I should be able to trace and justify reasons for it all the way back to observations over the real world, accepted truths, and system requirements. Now, back to you, if you still think you've been reasonable: please logically defend your implied claim that "destructured values are easily able to belong to a million dedicated structure 'types' whereas structured values cannot easily do so"
- [Guys! Guys! Where's the love? I'm not feeling the love here. Group hug?]
- With a powerful enough relative-view engine, we can have group hugs and war at the same time ;-P
- Sort of like the infamous group hug led by Marcus Junius Brutus?
It's a bad argument form to raise a goal in defense of a strategy unless one can both justify the goal, and justify that the strategy is the best option to achieve said goal, at least from among the listed alternatives; raising a goal in defense of a strategy without said justification indicates that the speaker isn't considering other strategies that have been presented to him, which indicates he is not listening, which everyone else reasonably finds rude and irritating. It is also bad discussion form to raise a strategy as though it were a goal, especially if the speaker should be aware of the alternative strategies from prior discussion, because it indicates the speaker really is not listening, which everyone else reasonably finds rude and irritating. Can you please, Top, try to be more aware of what you are presenting as goals vs. strategies? Keep it straight. Consider how each goal holds up in other strategies that have been presented. It would lead to more civil discussion. Seriously.
You are HandWaving, claiming that Z-foo or whatnot solves everything magically without demonstrating it and without trade-offs. My suggestion builds on an existing and common tool, RDBMS. Why toss out something that's tested in the market-place for some obscure untested academic math? I agree that such deserves experimentation, but for the shorter term the safer path is to build on what's out there unless you can show some magic way to share collection-orientation and have hard-walled "types" at the same time. I suspect they are conflicting goals, that EverythingIsRelative will hit face-first against encapsulation, but you are welcome to demonstrate otherwise. -t
- [Please clarify. (Ah, for a "sarcasm" tag...) ZF refers to Zermelo-Fraenkel set theory, which for all intents and purposes you may read as "set theory", something whose implementation, or at least the use of sets, you (Top) have often advocated. As for something that's "tested in the market-place", note that the marketplace -- thankfully -- evolves, progresses and moves on. You may or may not be watching the future of DBMSes & languages evolve in front of you here, but at the very least you shouldn't be so arrogant or ignorant as to behave as if you have nothing to learn.]
- Set theory needs some kind of representation and notation to computerize and/or compare to existing tools/notations/languages. If you wish to document and describe such a contraption, including how data sharing and indexing could work with it effectively, be my guest. Just make sure you use Ascii and don't get symbol-happy.
- [Would SetlLanguage do?]
- Do for what? What exactly are you claiming?
- I would imagine that it is "do for documenting and describing a contraption for computerized SetTheory as you suggested in your previous paragraph". Perhaps you should review when you've forgotten the context.
[User defined types are obscure academic math? I'm fairly certain that more people are comfortable with UDTs than with RDBMSs. Especially since I see more poorly used RDBMSs than UDTs in spite of seeing many more UDTs than RDBMSs. As for tossing out tested products, it should be noted that the parts that the RDBMS currently does well are being kept in; it's only the parts where the market-place has discovered RDBMSs to be lacking that are being tossed. It should be noted that you have also noticed that lack, as you have proposed another solution to fix it. It should also be noted that your approach (special operators for particularly popular structures) has been tried repeatedly and so far hasn't shown any significant success on that front.]
- Where did I say that UDF's were "obscure"?
- [You didn't. But I never claimed you said that, so why bring it up?]
- [Now, if you meant "When did I say that UDTs were "obscure"?", then that would be when you said, "Why toss out something that's tested in the market-place for some obscure untested academic math?", in what I responded to. Since what you are arguing against is allowing the user to implement arbitrary UDTs in an RDBMS, I assumed that that was what you were referring to with the "obscure untested academic math" phrase. It's not like there are any other candidates to which that might apply under discussion.]
[As for that "magic way": you define equality operators on your "hard-walled" types. You define a fold operator on collection types. That's it. Supply those to the RDBMS and it can now use those types in the same manner it uses integers, strings, etc. An example has even been supplied on this page, without the goals conflicting, so your suspicions are suspect.]
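A minimal sketch of the "two operators per type" idea, in Python rather than any real DBMS extension API. The `Stack` class and the generic helpers `rows_where_equal` and `depth` are hypothetical; the helpers stand in for engine code that knows nothing about the type beyond its equality and its fold.

```python
from functools import reduce

class Stack:
    """A hypothetical 'hard-walled' user-defined type: no exposed internals."""

    def __init__(self, items=()):
        self._items = tuple(items)  # hidden representation, bottom to top

    def push(self, item):
        return Stack(self._items + (item,))  # value semantics: a new stack

    # Operator 1: equality (with a consistent hash so stacks can live in sets).
    def __eq__(self, other):
        return isinstance(other, Stack) and self._items == other._items

    def __hash__(self):
        return hash(self._items)

    # Operator 2: fold, from bottom to top.
    def fold(self, step, initial):
        return reduce(step, self._items, initial)

# Generic code of the kind an engine might contain; it uses only == and fold.
def rows_where_equal(relation, attribute, value):
    return [row for row in relation if row[attribute] == value]

def depth(collection):
    return collection.fold(lambda count, _item: count + 1, 0)

s1 = Stack().push("red").push("blue")
s2 = Stack(["red", "blue"])
relation = [
    {"id": 1, "stack": s1},
    {"id": 2, "stack": Stack(["blue"])},
    {"id": 3, "stack": s2},
]

equal_ids = [row["id"] for row in rows_where_equal(relation, "stack", s1)]
distinct_stacks = {row["stack"] for row in relation}
```

The sketch says nothing about indexing or storage; it only shows that restriction, duplicate elimination, and aggregation can be written once against the two operators.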
- That doesn't explain implementation, optimization, and node sharing.
- [You have a sample implementation of a fold function for a particular collection. The others would have to be supplied by the ones who define the UDT. What more do you want?]
- [What are you optimizing for? Obviously, I can't optimize something if I don't know which resource's usage I'm trying to reduce.]
- Typical collection-oriented behaviors, like those found in our relational tool-kit.
- [Node sharing is performed in exactly the same way it is currently done in an RDBMS. I hope I don't have to explain that to you.]
- So then it should also have access to all our favorite relational operations if and when we want them, and participate in joins, indexes, etc. like any other relational record/node without extra copying or excess pointer indirection. Good.
[''As for the costs, you have to define, at most, two operators per type; the RDBMS has to be implemented to allow arbitrary UDTs; and you have to have some way of communicating the types between applications and the RDBMS. Your approach requires that the operators are implemented every single time they are used.]
Where did I put that limit?
[It requires that diverse applications agree on how the exposed structure is to be used. It requires database writes whenever one wishes to search for a particular collection.'']
Security in relational between operators and operands would be via something like an AccessControlList or update constraints, not IS-A modeling. A "push" or "pop" stored procedure could work on any or no tables depending on its configuration and security access lists. Put another way, relational tends to use a subtractive approach to associate operators and operands, while ADT's/classes use an additive approach. It is comparable to a Cartesian join being the default join, and filters (WHERE) are supplied to provide any limits we want to place on it. -t
- This is based on the (unvalidated) assumption that stack domain values "should be" represented in full tables. And 'security' is not part of the RelationalModel (though SecurityModels for securing data make fine experimental extensions to the RelationalModel).
- Security may not be part of the model, but HOW security is done from a sets-and-predicates philosophy will be different than IS-A-based security/protection/encapsulation.
- There is no such thing as an "is-a-based" SecurityModel. ObjectCapabilityModel certainly doesn't depend on 'is-a'. But, yes, security for knowledge available in centralized databases will be different... probably based on some combination of "need-to-know" and "cleared-to-know".
- As far as what a stack "should be", if it's easier to do more things with it by toggling a few feature switches than changing paradigms altogether, then our design is more change-friendly. (I agree there are cases where being change-friendly is not a key goal, but it *usually* is in my domain.)
- Your preferred design isn't more change-friendly. It is only more SQL friendly.
- It certainly gives one more options and abilities than hand-adding them one-by-one.
- FalseDichotomy. Support for both user-defined functions that return relations (common in logic programming) and generalized relational aggregation (the relational equivalent of FoldFunction) are together sufficient to add the desired flexibility 'all-at-once'... and do so WITHOUT the nasty CodeSmell of your favored approach (representing trees and such in relations) that comes with introducing arbitrary need for surrogate keys and application-driven garbage collection.
- Even if it was easy to add the "typical services" listed under DatabaseDefinition (I doubt it), they still would not be "compatible" with the rest of the database, such as using the same indexes and sharing the same data when doing joins, etc. It would still be a separate collections universe. -t
- You repeat these claims from RelationalTreesAndGraphsDiscussion, even though you have failed to defend them over there. Quick summary of counters: (a) If I need relational operators, I simply apply a FoldFunction to convert a DomainValue into a relation (or record of relations). (b) Sharing structure is easy; see CopyOnWrite. (c) Top's implicit claim that his approach offers an indexing advantage is completely unsubstantiated. I'll also add that "separate collections universe" is not a "problem" to be solved and falls naturally out of DomainValues being orthogonal to the RelationalModel.
- Do you mean copying address pointers from your custom structure into a relational structure? That is butt-ugly. I hope that's not what you are really proposing.
- Any pointer operations would be well buried under the hood and not exposed to users. Same with indexing and index maintenance of other sorts.
- It could be a headache for the developer. But I will await an illustration, per above.
- I ask that you properly justify your implied accusation that it is highly probable this is 'more' a headache for the developer than your approach.
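Counter (a) above, applying a FoldFunction to convert a DomainValue into a relation and then using ordinary relational operators, might look roughly like this. The function `stack_to_relation` and the table `stack_view` are hypothetical, and `sqlite3` is only a convenient stand-in; the explicit copy into a table is an artifact of the sketch, not a claim about how an engine would do it under the hood.

```python
import sqlite3
from functools import reduce

def stack_to_relation(stack):
    """Fold a stack (bottom-to-top tuple) into (position, value) rows."""
    def step(rows, value):
        return rows + [(len(rows), value)]
    return reduce(step, stack, [])

stack = ("red", "red", "blue", "red")
rows = stack_to_relation(stack)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stack_view (position INTEGER, value TEXT)")
conn.executemany("INSERT INTO stack_view VALUES (?, ?)", rows)

# Ordinary relational operators now apply to the folded view.
red_positions = [
    pos for (pos,) in conn.execute(
        "SELECT position FROM stack_view WHERE value = 'red' ORDER BY position"
    )
]
counts = dict(conn.execute(
    "SELECT value, COUNT(*) FROM stack_view GROUP BY value"
))
```

Whether the conversion costs a copy or can share structure with the original value is exactly the point in dispute; the sketch only shows that the relational view is reachable from the fold.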
And the JSON/YAML comments generally look like issues already raised in CrossToolTypeAndObjectSharing. -t
- Those were examples, not issues. And, as examples, they finely contradict your claim that complex types 'gum up' relational with regard to sharing and storing structured data.
- Please flesh out the examples more.
- To what end? Look up YamlAintMarkupLanguage and JavascriptObjectNotation on your own if you need examples of how they are used to share structured data.
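For readers who want something concrete, here is a small sketch of a structured value being shared through a relational cell as JSON text. It uses only Python's standard `json` and `sqlite3` modules; the table `docs` and the tree-shaped value are hypothetical. Serializing with `sort_keys=True` is an assumption made here so that structurally equal values produce identical text.

```python
import json
import sqlite3

# A hypothetical structured value: a small tree.
tree = {"label": "z", "lhs": {"label": "x"}, "rhs": {"label": "y"}}

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE docs (id INTEGER PRIMARY KEY, body TEXT)")
conn.execute(
    "INSERT INTO docs (id, body) VALUES (?, ?)",
    (1, json.dumps(tree, sort_keys=True)),
)

# Another tool reads the cell back and recovers the same structure.
(body,) = conn.execute("SELECT body FROM docs WHERE id = 1").fetchone()
restored = json.loads(body)

# Lookup by value: the same structure, written with keys in another order.
probe = {"rhs": {"label": "y"}, "lhs": {"label": "x"}, "label": "z"}
found = conn.execute(
    "SELECT id FROM docs WHERE body = ?",
    (json.dumps(probe, sort_keys=True),),
).fetchone()
```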
EditHint: Merge relevant sections of OopBizDomainGap and ComputationalAbstractionTechniques.
See HierarchicalRelational
FebruaryZeroNine