Separation And Grouping Are Archaic Concepts

Rethinking what "group" and "separation" really mean or could mean in cyberland. (Note that some object to the usage of "archaic", which itself is also debated below or in linked topics.)

The issue keeps coming up of "grouping related stuff", "separate concerns", etc. As explained in LaynesLaw (PageAnchor: "transcend") , the concepts of location, grouping, separation, etc., are not necessarily applicable to cyber-land. The only things concrete are perhaps references (links, pointers, etc.). But even these may be virtual as one can do CalculatedRelations. We are not bound by 3D associations in cyber-land; and we should take advantage of this. The physical view of certain OOP proponents (not all) and over-reliance on textual or file-tree programming have limited the industry's thinking.

I've never heard of this 'physical view' of certain OOP proponents. Care to point me to the right pages?
I have a book somewhere in boxes that I'll quote when found. -t (Still haven't found that sucker. Sep2014)

Perhaps it sometimes makes a difference as far as performance is concerned, but from an information organization and management standpoint, we should start tossing physical-world positional thinking out the door in my opinion, because cyberland does not have to be bound to locational issues any more. As time goes on, the machine's speed will be less of any issue such that we can take advantage of the theoretically infinite number of association dimensions we can get out of cyber-land. Even if infinite are not possible, we can still do better than the current physical-based approaches.

For example, consider SQL statements. There are long HolyWars about whether SQL statements should be embedded where they are used (unless repeated often such that a shared function is in order), or put in a "separate" place. If the SQL statements were "marked" somehow, and we could "query" our code, then we could view them with the task code, or view them removed (with perhaps a hyperlink inserted). How, when, and where we see the SQL is purely configurable (assuming we provide enough information). We just have to supply enough information for the system to be able to identify the SQL statements in order to control presentation in a dynamic or custom way. What is really wanted is isolatable concerns. Separation is just one of many possible viewpoints of the same info.

"File" technology may be cheap, familiar, and well-studied; but it seems to be limiting us from moving into the higher realm of SeparateMeaningFromPresentation.

-- top

How would the notation for this work? So something like "SELECT * FROM FOO, BAR WHERE FOO.KEY=BAR.KEY" would become. . .?

Are you asking how SQL statements would be identified by the editor/code-browser? One simple approach is by the API or functions used to call them, such as an executeQuery() function. However, if they are dynamically constructed using strings, it may be a bit more difficult to isolate some of it. But this is true almost any time one language is embedded in another. I tend to use a fixed set of operations for building most dynamic SQL clauses, as described in HelpersInsteadOfWrappers. The use of wrappers relies on convention also, I would note.

I have been having that exact problem in a language I've been writing for a SCADA system. Basically we need to precache the points (similar to tables & fields) - before the script executes, but that is near impossible to do with dynamically constructed strings. i.e. "Pre-caching dynamically constructed strings is an unsolvable problem". I suppose that it is a paradox.

Maybe you're just approaching it the wrong way? Why not run it once without caching and write the script such that in its first invocation, it caches, and on subsequent invocations, it uses the cache if warranted. It's not unreasonable for a program using caching to run slowly on its first run, this process can even be folded into the install/configuration process.
- I'm assuming this is feasible, because otherwise the problem you're stating wouldn't make sense. You can't cache something until you know what you want to cache. :)
  - True - except the dynamically generated strings may be new queries based on the previous data being returned. i.e. I can't construct the strings until I have certain data, but I can't have all the data until I construct the strings - it's a temporal sequencing paradox! It may be partially reroutable, but some things are just impossible. Certain types of dynamic constructions are doomed to be slow, impossible, or not truly dynamic.
    - Well caching inherently focuses on the past. A cache is a cheap way to replay what happened recently, like videotaping your last vacation (so you can relive the experience without paying for outrageous hotel prices). It sounds like what you want, though, is not caching, but rather prediction. Maybe you can reformulate the problem to, "Given what I know, can I make a reasonable prediction of what I am going to do next?" While not foolproof, a good understanding of the problem and the resultant prediction algorithm may give you most of the benefits of a cache, but without your temporal problem.
      - Yes, a few rules can help with that problem. The following might help: A) Dynamically constructed point queries cannot include a portion of another dynamically constructed point query. and B) Each program must be able to run in "null mode" - meaning it can make requests, but is not allowed to take any action. Of course these are just hacks to fix the unsolvable temporal problem.

Sorry - didn't mean to make this page take a tangent - but I think the two problems are DeeplyIntertwingled. Static grouping makes one lose flexibility, but improves speed. Dynamic grouping gives maximum flexibility, but there will be a speed hit. Much like everything else, the "real solution" is probably some kind of PragmaticHybrid

Some of the discussion above seems to suggest that "meta-tizing" code has too much overhead and that MooresLaw will never catch up because we always want to do more when we get more machine power. I am not quite sure I agree with this, but want to clarify that this is the point being raised. -- top

Modularity, which seems to be what you mean by separation and grouping, is not an archaic concept. Programs that aren't modular, simply suck. They are rigid, inflexible, prone to bugs during maintenance due to code for one thing changing and breaking code for another thing. Whether you view software as layered, or subsystems, it is important to modularize the code somehow. Sql does not belong with UI code, UI code does not belong with application code, Sql does not belong with application code, etc. Separating and modularizing these separate responsibilities "does" make the code better. Finding all your Sql statements, should be as simple as looking in the data layer.

I have to disagree. One has to hop around and wade through endless interface layers when such is done. But there are other topics about such (see links below). My real point is that whether they are "separate" is purely a viewpoint. If we can mark DB stuff and mark GUI stuff. Then the system can display it however you want: separated, together, blue, orange, etc. I am not disputing the meaning of your "modular" suggestion, just it's ability to be transformed view-wise. Your approach generally uses context and location in files to identify what is what. I am just suggesting that references be used in place of this. If you want to see it all together, such as all SQL code or all GUI code together, you can. I am not removing that ability. I am adding abilities to group by more dimensions than *just* SQL-ness or whatnot, not removing them. The existing approaches hard-wire a single factor's grouping at the expense of others. To get around that, either we become multi-dimensional (4+) beings, or use database-like technology. The second appears to be the better bet so far.

The reason for this is that the grouping/searching factor we want is often not predictable up front and changes depending on purpose. Finding all code that deals with say taxes may be more difficult if we group merely by what is GUI and what is SQL because both the GUI and the SQL may touch taxes in some way. There is no single right factor.

I am just recommending that we normalize code. That is the bottom line here. We don't use physical location itself in databases to indicate what is what, so why should we do it for code? Context is often convenient for humans, but not necessary for the machine and it makes transformation algorithms more difficult to utilize. (We can still use context when displaying it.)

There's a reason source code is stored in plain text and not in databases, and it isn't likely to change anytime soon, programmers in general "prefer" plaintext. Source code, in almost every language, represents an abstract syntax tree, and Sql sucks at representing and allowing free form editing of those trees.

If you notice, I am not necessarily against "text" because it is one of many possible ways to represent things. The problem is when text is the ONLY representation. It is the same information as text code, just in a more machine-chompable form. Text has limited transformability. As far as what developers "like", I feel they have not carefully considered and practiced with alternatives to know otherwise (FileSystemAlternatives, FileTreesToManageCodeDiscussion). Powerful code browsers hint at the possibilities, but are still too tied to linear text thinking and/or navigational structures.

I agree that SQL is the less-than-ideal relational language, and is why I proposed my own (TopsQueryLanguage). But, that is a separate issue I believe.

Note that I think the programming styles would tend to move away from trees (AbstractSyntaxTree) in programs if more view and edit options were provided. Not that they would disappear, but be reduced. I look at code and see a lot of stuff that is tree-ish out of habit, not because it is really best that way. I agree that things like Boolean expressions are still often best as AST's. But what I propose does not preclude trees either way.

In general I predict that the current file-and-text program organization will be replaced or subsumed by something better in the future. The "modules" will become smaller and "cataloged" into a code database with ample meta-tagging. But the file-oriented approach is quite entrenched and will take a while to dissipate. -- top

View A - Unmarked raw implementation

 module X{
   aaaa
   bbbb
   cccc
   dddd
 }

View B - Identify "aspects" by block names

 module X{
   biz{aaaa}  // business logic
   db{bbbb}   // database access
   biz{cccc}  // more biz logic
   out{dddd}  // output (GUI)
 }

View C - Physically separate "aspect" implementation

 module X{
   aaaa
   do_b()
   cccc
   do_d()
 }
 db{
   define do_b{bbbb}
   define do_something_for_another_module{...}
   ....
 }
 out{
   define do_d{dddd}
   ....
 }

View D - As data

 module | aspect | implementation
 -------|--------|---------------
 X      | biz    | aaaa
 X      | db     | bbbb
 X      | biz    | cccc
 X      | out    | dddd
 ...etc...

With information from view D, we can generate all the other views. There is no one right view, so it seems that the approach that most easily allows one to get the view they want based on an as-needed basis is the most logical.

The only thing view D does not provide is reference identifiers (function or method calls) and possibly parameters needed for view C. It could probably auto-generate them, but it may be hard to provide human-friendly names in text. I suppose we could provide an optional "notes" or "title" column for View D for that purpose. Parameters may be a bit trickier. Perhaps the code units could optionally inherit variable scope from the caller so that parameters are not needed or optional. This issue may relate to LongFunctions.

Another tricky issue is overlaps. Some aspects will overlap in practice, and thus discrete blocks may be an overly simplistic model. (Note that one may be able to force full separation with enough parameter management, but there is a point where it becomes very combersome.)

One approach is to have multiple potential aspect labels:

  aspect_block foo having aspects: A, B, C {
    ....
  }

Here "foo" has 3 aspects: A, B, and C. A table version of this would be a many-to-many table.

Generally there are 3 orthogonal aspects to any code unit in typical apps:

Entity - Domain person, places, and thing
Task
Computing-Space: queries, forms, reports, etc.

Thus, if we wanted to make a database or meta-base of each code segment, we could use a table or structure similar to:

 table: codeChunks
 -----------------
 snippetID  // auto-num
 entity    // (name or ID)
 task     // (name or ID)
 compSpace // (name or ID)
 sourceText

This would allow us to bring "together" related issues/snippets for editing, debugging, or inspection.

Another approach is to use a many-to-many table, similar to a "roles" table as seen in EmployeeTypes, for more complex classification of chunks. Here is the many-to-many version:

codeChunks ---------- snippetID sourceText

categ_code_assoc // many-to-many -------------- snippetRef categRef // primary key = (snippetRef, categRef)

categories ---------- categID categTitle categType // entity, task, comp, other (see above)

We may also want to have arbitrary categories of categories. However, let's save that for the "advanced course".

-- top

Sample Aspects/Classifications for Custom Business Apps

Database (database-related)
GUI
Security
I/O (such as files, pipes, sockets, etc.)
Error-handling

Other possibilities can be found under the FuseBox framework.

Moved discussion to SeparationAndGroupingAreArchaicConceptsDiscussion

Top... I really DO NOT appreciate you moving discussion about how SeparationAndGroupingAreArchaicConcepts is an invalid idea off to a discussion page.

I found it growing TooBigToEdit effectively.

Ah, in that case you should have done as I suggested and moved the stuff irrelevant to Separation and Grouping being or not being archaic concepts to a dedicated page. Since the vast majority of your 'views' and 'tables' stuff is irrelevant to defending, supporting, or depending upon the issues you raised at the top of the page, that would have cleared up a great deal.

You are wrong.

I disagree. You're the fool who, as you indicate below, tried to "focus on implementation" when implementation very much should have been another topic from the start.

Name-calling. How immature.

Name-calling. How 'immature'.

And your mother wears army boots.

[You two: Get a room.]

Rooms are archaic ;-P

Rather than focus on implementation first, which seems to be degenerating into a tool/structure/paradigm HolyWar, how about we list desired CodeTrackerFeatures, and then focus on ways to implement it using our pet techniques.

While I agree that ideally from the point of view of remixing data and code it would seem sensible to have everything be OneBigMessyGraph?, which the user then picks particular subsets or views... there's one problem I've seen in systems which use this approach (such as SmalltalkLanguage, and to a lesser extend LispLanguage).

That problem is this: how do we easily separate out and group chunks of 'stuff', be it code or data, which makes a 'working system' for transfer to and installation on a new system?

The Smalltalk object browser is fine until you want to split off your finished application from its debugging harness and make it a standalone app. Same with Lisp. In practice what seems to happen is either whole 'system images' are distributed - which are a nightmare to port across platforms - or we end up with some standard way of grouping code/data into transferrable 'source files' or 'modules'.

But while Smalltalk and Lisp and other 'image-based' VM languages show this very quickly, this problem is actually more widespread (and more self-similar) than it looks, recurring in multiple situations under multiple guises. The nightmare of distributed data, code and configuration is endemic in the modern IT environment.

An Oracle server hosts the SQL database with embedded stored procedures, another the Java business-logic application with a bunch of JARs and a host of XML config files, another the web server with a PHP module hosting the web interface (a bunch of HTML, PHP, XSLT, Javascript), another the SVN server with the source files, another the Unix and Windows home directories for the Eclipse environment including the config files that enable it to talk correctly to SVN and publish to the Web server... and let's not forget the DNS environment, and the Cisco switches with their config files...

If you squint and think in OO terms, what we've actually accomplished is turned whole systems and servers into a system of interacting objects... a 'program' in other words... but each part is not only written in whole different languages, it's split across a LAN or even distributed across the Net. And yet, every single part must work correctly with each other - it's not just a 'program' but really a 'system'. To save and restore the state of all this becomes a major headache and lots of manual programmer / system-admistrator intervention - even the programmers might not have, for example, the login codes to back up and restore the production business database.

The upshot of this Babel is that we've actually *de*-automated many of the adminstration tasks which computers used to be able to do - or at least that they're capable of doing - because on the one hand, we have forced grouping of things which needn't be together - and on the other hand, we don't have any way of clearly encapsulating related sets of state that relate to a whole system.

It seems to me like we ought to be generalising the idea of 'software engineering' to apply to 'systems administration', since it's really the same problem just with a different mask on. And we should be trying to solve the 'dump and transfer code/configuration/data state' problem in the general case for all computer systems as overlapping subsets of the entire Internet - not just for tiny special cases like 'write an application program'.

I suspect the answer will not involve OO as we currently understand it but something more like relational algebra (but not SQL as we know it either), because a bunch of opaque software objects is as much of a management nightmare for the user or programmer as a bunch of opaque hardware black boxes with no common 'load/store state' interface are for the system administrator.

-- NateCull?

If you want to make it easier to move chunks to new or different systems, then it may require a sacrifice. High modularity often increases "interface code" such that one spends more code and effort adapting and mapping modules. I'm not saying this is inherently good or bad, only that one has balance the trade-off and decide where they want to set the modularity/integration dial to select the least of two evils for a given situation.

I would argue that "indexing" can improve re-use in some cases over hard-partitioning by making it easier to isolate the sections and characteristics you want to extract. If what you wish to extract doesn't fit the boundaries of the existing hard-partitioned system, then you have to manually re-draw the lines. -t

DecemberZeroEight

CategoryInfoPackaging, CategorySourceManagement