WikiStub
A revamping of ObjectOriented. This page is for the discussion of concrete details of a UnifiedDataModel (which is otherwise a synonym) and ObjectOrientedRefactored.
The basic idea, since we starting from (off-the-shelf) hardware and not some abstraction (see ComputerScienceVersionTwo), is to build Objects from the ground up using simpler objects to make a DataEcosystem. A unified object model will make threading and async computing much more straightforward. Computing will be able to be delegated to individual objects and message-passing will be uniform.
Since we have multiple uses for many different generic classes, PrototypeBasedProgramming is the way because we can rename our generic prototype objects more descriptively for our DataEcosystem with very little overhead. (See a ClassesPrototypesComparison).
ReFactor all your smallest objects (like strings of chars) into the highest object for which you have the highest "category" for as a compound object. In other words, ClotheYourData. But this must be balanced with: DontContainTheUnknown.
With a DataEcosystem, built in type/class inspection will become paramount. Python's help() built-in provides a good example, it lists the classes documentation string, all of its data members and methods.
Now what kind of objects shall we build to form the basic modular components upon which most everything will be created? What are the fundamental forces?
- An unordered container is the first type/class, which can accept any other smaller data type. It cannot include itself.
- Since it's unordered you can make it a Bag (or Counter) which counts the duplicate items.
- An interesting note: A C++ class counts as the first kind of container as it's designed to be an abstraction of an OBJECT (and, by default, hides its members), while a struct could be considered its ordered cousin, anchored to its concreteness in memory. This is a significant conceptual difference.
- The next obvious thing would be an ordered container like we're all used to with arrays and LinkedLists, etc, but stop for a moment:
- Apart from a NxN matrix, an ordering implies an unstated, often unconscious relationship in the data. This is not the pattern in a unified data model!. (Even an NxM matrix implies an unstated relationship to the data!).
- Break down your "lists within lists within lists" and make an object that describes your sub-objects and eventually you'll find either a single, atomic element or an unordered set. Back to point 1!
- Remember AllDataRelatesToOtherData and strive for minimizing all undocumented constraints on data (like an ordering).
- But in the mean-time, we'll need an ordered container: a "list" will do nicely.
- A simple, linear label type. Our DataEcosystem needs human-readable names just as the internet needed domain names, so they can be located. A "string" label associates to an arbitrary object. A dict is a set of these associations. Strings cannot be broken up and must be contiguous in memory. If you need to read a long string from file, you're either doing it wrong or need to write an interface of smaller sub-objects, like "Line", or "PersonRecord". Methods useful for this label type:
- "change name". In-place modification of a name when space allows. When it doesn't, the name change request will have to be put "in the stack" and objects reordered as can.
- A standard numeric type for computing on the objects we're creating. Integers (perhaps BigInts) should be all that's necessary.
- Optional: a tuple type for making a serial stream for MessagePassing to Objects. Useful for passing data to threads.
- These types will fade or be relegated to sub-domains: complex, float (relegated to physics), file (after the LanguageIsAnOs becomes real).....more to come.
And look there, the same types suggested in
UnifiedDataModel!
Okay, now what will be some common object types to create from this from the get-go?
- A TextFile, for example, is a list of TextLine's.
- TextLine could be considered a list of TextWords if you're doing language processing, or someone's nicely cloned TextLine as CommaSeperatedRecord and you can clone this to EmployeeRecord (for example) if you're doing TextProcessing, otherwise leave it as is, in respect for the rule DontContainTheUnknown.
Now:
>>> f=TextFile?("test.txt")
>>> help(TextFile?)
Need to define a standard versioning scheme so it's clear what DataObjects
? implement what functionality and behaviors.
WikiStub:
Okay, I know I must be reinventing the wheel here somewhat. Can someone tell me what similar attempts to all this: CPAN? CORBA? are two things that are similar.
CPAN is an archive of software, so probably not directly relevant. Your ideas appear to be closer to CORBA, a standard (with various implementations) for creating distributed software by allowing objects written in various programming languages to be exchanged between nodes. However, software agents may be even closer to what you have in mind. See http://en.wikipedia.org/wiki/Software_agent
Thank you. I think, however, that with a DataEcosystem, the programmers become the agents, working in the interpreter environment and querying/manipulating objects directly making more and more interesting objects in a MashUp form.
Sounds like an interactive CORBA, which could be quite interesting.
PythonThreeThousand