Collections As Grouping

Stating what may be obvious to some and not so obvious to others (with examples): collecting and collections are central to ModernTechnology -- DonaldNoyes.ThinkingOutLoud.20101020


Examples: InformationOrganization, PhysicalManufacturing, ElectricalWiring, FilePackaging

What each of these has in common is the action of "Grouping", "Staging", "Compartmentalization" and "Direction".

They flow from the most numerous (rightmost) to the least numerous (leftmost).

It is convenient, and conducive to efficiency, that such organization allows the construction of complex structures, whether physical or abstract.

It is possible to construct artifacts randomly and without organization, leaving it only to chance that anything useful evolves, or trusting to efforts to rescue what has evolved, even though no predetermined centralized organization of effort or structuring and planning is involved. But the chances for success are greatly reduced, approaching zero as complexity increases.


Caveat: One has to be careful about a hierarchical view and an actual hierarchy. For example, in the physical modeling levels (car.assembly.part.etc), the accounting or inventory department may not care much about physical construction hierarchies, yet still track info about a given part. In such a case, the hierarchy should perhaps be a *view* into a "master" collection and not a hard-wired characteristic of the collection itself. Related: LimitsOfHierarchies. --top

I see this naming convention as a classification scheme, not as a hierarchy. Thus a wire.cable can refer to the same thing as a cable.wire, i.e. wire #7 in cable 27 is the same thing as cable 27's wire 7. I also see from the wiring diagram that wire #7 in cable #27 is connected at the machine end, at BSPX445 JunctionBox, to terminal block tb7, terminal #4a, and at the control panel end, at BodyShop Marshalling panel CPMP044 (Area 44), to tb12, terminal 37. The limits are physical, since cable 27 is a 24-conductor cable and the wire numbers are sequential from #1 through #24. It may be that not all wires are used, but those that exist are still classified according to the scheme. While wire numbers may be duplicated, their classification, and thus their naming, is tied to the cable number, which in a given installation is unduplicated. In a large installation the name of the cable usually follows a scheme that is tied to its physical beginnings and endings. The referenced cable might be called CPMP044-027-BSPX0445. This scheme is particularly useful when the plant design utilizes an automated cable routing process which limits cable tray loading while optimizing distances. -- DonaldNoyes
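The classification reading above can be sketched in a few lines of Python. This is only an illustration, not the plant's actual tooling: the names WireRef, parse_cable_name, cable_wire and wire_cable are hypothetical, and the assumption that a cable name splits into control-panel end, cable number and machine end is inferred from the single example name CPMP044-027-BSPX0445.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WireRef:
    """A wire classified by (cable name, wire number); hypothetical type."""
    cable: str  # unduplicated within one installation
    wire: int   # may repeat across cables, unique within one cable


def parse_cable_name(name: str) -> dict:
    """Split a cable name, assuming the form <panel>-<number>-<junction box>."""
    panel, number, junction_box = name.split("-")
    return {"panel": panel, "number": int(number), "junction_box": junction_box}


def cable_wire(cable: str, wire: int, conductors: int = 24) -> WireRef:
    """Address a wire as cable.wire; the conductor count is the physical limit."""
    if not 1 <= wire <= conductors:
        raise ValueError(f"wire {wire} is outside 1..{conductors}")
    return WireRef(cable, wire)


def wire_cable(wire: int, cable: str, conductors: int = 24) -> WireRef:
    """Address the same wire as wire.cable; only the argument order differs."""
    return cable_wire(cable, wire, conductors)


name = "CPMP044-027-BSPX0445"
same = cable_wire(name, 7) == wire_cable(7, name)
```

Because both orderings build the same value, neither ordering is privileged, which is the sense in which the scheme classifies rather than nests.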

Nonetheless you gave these as examples of CollectionsAsGrouping. I think multiple views (CollectionsAsViews or ViewsForGrouping?) are a very common aspect, the finance and production departments being good examples. car.motor.combustion.cable.wire may denote the same thing as inventory.cables.ieee-4711 or assets.inferiors.cm_inc.131072. This is further complicated when each case filters on only some aspects of these leaf elements (possibly completely disjoint ones).

The question is: How is identity kept between these domains? Is the identity even relevant/necessary in all cases? Who is responsible for maintaining the 'actual' hierarchy? How are the multiple hierarchies kept in sync?

The relational answer is to not model hierarchies but only relations - which may be combined in any way. And to use views to let different departments have their own - well - views. The strange thing is that views seem to be seldom used. Are they too expensive? Unwieldy? Too low level (in the sense of not usable by end-users)?

I'd really like to put different views on my data. There are some niche products which do this. Email programs allow filtering by keys. Search folders on the Mac do this. But I'd really like to have this as basic functionality, e.g. as part of the filesystem.

-- Gunnar Zarncke
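A minimal sketch of the relational answer described above, using Python's built-in sqlite3 module. The table, column and view names (part, assembly_of, stock, production_view, inventory_view) and the sample rows are invented for illustration; only the path strings echo the examples given in the discussion.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- One master collection of parts; no hierarchy is stored in it.
    CREATE TABLE part (part_id INTEGER PRIMARY KEY, description TEXT);
    -- Separate relations hold what each department cares about.
    CREATE TABLE assembly_of (part_id INTEGER, assembly TEXT);
    CREATE TABLE stock (part_id INTEGER, catalog_no TEXT, quantity INTEGER);

    INSERT INTO part VALUES (1, 'wire'), (2, 'cable');
    INSERT INTO assembly_of VALUES (1, 'car.motor.combustion.cable'),
                                   (2, 'car.motor.combustion');
    INSERT INTO stock VALUES (2, 'ieee-4711', 40);

    -- Each department gets its own view onto the same parts.
    CREATE VIEW production_view AS
        SELECT p.part_id, a.assembly || '.' || p.description AS path
        FROM part p JOIN assembly_of a ON a.part_id = p.part_id;
    CREATE VIEW inventory_view AS
        SELECT p.part_id, 'inventory.cables.' || s.catalog_no AS path
        FROM part p JOIN stock s ON s.part_id = p.part_id;
""")

VIEWS = ("production_view", "inventory_view")


def names_for(part_id: int) -> dict:
    """Return every name each view gives to one part."""
    result = {}
    for view in VIEWS:  # view names come from the fixed tuple above
        rows = conn.execute(
            f"SELECT path FROM {view} WHERE part_id = ? ORDER BY path",
            (part_id,),
        )
        result[view] = [row[0] for row in rows]
    return result
```

Here identity between the domains is carried only by part_id in the master table; the departmental paths are computed on demand, so there is no second hierarchy to keep in sync. Whether that is practical for end users is the open question raised above.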


There is a physical reality in the example of the classification scheme of atControlRoom/panel/terminal/wire/cable/tray/cable/wire/terminal/panel/atEquipment which does not prevent, in fact which invites, views:

That views are used in this example case is true:


How is identity kept between these domains? Is the identity even relevant/necessary in all cases?

In most cases, identity isn't relevant. ObjectIdentity is fundamentally an illusion since the concept of 'object' is itself a view of data. Modeling ObjectIdentity where it isn't intrinsic (e.g. names in a communication structure) is a silly idea in a system where your aim is to avoid a fixed 'view' of things.

That said, it is reasonable to model 'unknown' properties by, essentially, naming them with unresolved variables. This appears in the database as surrogate keys especially if you use the unknown variables as a key (which implicitly indicates that every variable has a unique set of unknown properties... e.g. you don't know the exact location of any two vehicles, but you do know that no two vehicles occupy the same position at the same time, and so you can assert the unknown variables identifying each vehicle are unique). This allows you to store 'facts' in your database that may otherwise have seemed inconsistent, such as two distinct vehicles being marked by the same VIN (which may occur by manufacturing accident, theft, etc.).

That sort of 'identity' is really just a name for a collection of properties - a 'bundle' as described in BundleSubstanceMismatch. Use of such names fits readily into the classification schemes described by Donald... since, after all, such names 'classify' anything meeting a particular set of properties (albeit a set of properties that is rather specific to what we 'view' as a unique individual).
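The surrogate-key idea can be sketched as follows. This is an assumption-laden illustration rather than a recommended design: record_vehicle, keys_for_vin, the in-memory vehicles dict and the placeholder VIN string are all hypothetical. The generated key stands in for the 'unresolved variable', so two vehicles that share a VIN can both be recorded.

```python
from itertools import count

_next_key = count(1)  # source of surrogate keys (the 'unknown variables')
vehicles = {}         # surrogate key -> bundle of known properties


def record_vehicle(vin: str, **facts) -> int:
    """Store a bundle of facts under a fresh surrogate key and return the key."""
    key = next(_next_key)
    vehicles[key] = {"vin": vin, **facts}
    return key


def keys_for_vin(vin: str) -> list:
    """Return every surrogate key whose bundle carries this VIN."""
    return sorted(k for k, props in vehicles.items() if props["vin"] == vin)


# Two distinct vehicles marked with the same (placeholder) VIN.
first = record_vehicle("VIN-EXAMPLE-0001", colour="red")
second = record_vehicle("VIN-EXAMPLE-0001", colour="blue")
duplicates = keys_for_vin("VIN-EXAMPLE-0001")
```

The VIN is treated as just another property in the bundle, so a duplicate VIN is a queryable fact instead of a key violation.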

That sort of 'identity' is more or less what I mean by vagueness: You do not necessarily know what these unknown variables are. I agree that one is in principle able to introduce them as the need arises. But maybe it would be better to a) infer them from use or b) postpone them as long as possible or create them on demand. -- .gz

It goes against my principle of "ToDescribeIsPerhapsToValue". You cannot value what you do not describe. Inference and Postponement are not solutions; they are avoidances. The groupings I described above can be valued (receive identification and quantification). It follows that, in lists of members of a collection, which is what I mean by Grouping, the name you call something and the description of what that name means allow grouping to occur. In this way you can move forward to ValuingTheObject.


See also: RealWorldHierarchies


CategoryOrganization MarchZeroNine


(last edited October 20, 2011)