In BradyBunchGridDiscussion, a possible approach to replace style-sheets was hinted at. Here's an adjusted snippet:
<attribset name="atr">
  <attrib name="autorefresh" value="yes"/>
  <attrib name="width" value="200"/>
  <attrib name="height" value="200"/>
  <attrib name="float" value="left"/>
  <attrib name="autoscroll" value="yes"/>
  <attrib name="border" value="1"/>
</attribset>
...
<div> <!-- row 1 -->
<framediv attribs="atr" interval="2.5" src="panel.php?x=1&y=1"/>
</div>
<div> <!-- row 2 -->
<framediv attribs="atr" interval="4.0" src="panel.php?x=1&y=2"/>
</div>
GuiMarkupProposal "Templating" offers another syntax possibility that is more compact:
<template name="tp01" autorefresh="yes" width="200" height="200" float="left" autoscroll="yes" border="1"/>
...
<div> <!-- row 1 -->
<framediv refTemplate="tp01" interval="2.5" src="panel.php?x=1&y=1"/>
</div>
<div> <!-- row 2 -->
<framediv refTemplate="tp01" interval="4.0" src="panel.php?x=1&y=2"/>
</div>
Not sure I like the name "refTemplate", but this is just an illustration. And we could have a hierarchy such as "refTemplate='tp01,tp02,tp03'".
Basically, the idea that style-sheets are a separate "kind" of thing different from markup is tossed. The "inheritance" mechanism used for the markup (a GUI markup language in that case) could just as well be used for style-oriented elements. Thus, we don't have to reinvent the wheel twice for markup inheritance and style inheritance. (The borderline between what is semantic and what is "style" is blurry anyhow.)
An improvement can be made by allowing multiple inheritance, similar to what the existing HTML "class" attribute can already do:
<sampleTag attribs="atr,atr2,atr3" etc="etc"/>
The first attribute set ("atr" in this case) receives priority if there is an overlap in attributes. (The HTML "class" attribute uses spaces to separate class names; commas are more readable in my opinion. But the specific syntax is not important at this point until somebody goes to implement it.)
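The priority rule just described can be sketched as a small resolver. This is purely an illustration of the proposal; the function name and data shapes are hypothetical, not part of any existing standard:

```javascript
// Hypothetical resolver for the proposed attribs="atr,atr2,atr3" mechanism:
// earlier sets in the list win when attribute names overlap.
function resolveAttribs(setNames, attribSets) {
  const resolved = {};
  for (const name of setNames) {
    const set = attribSets[name] || {};
    for (const key of Object.keys(set)) {
      if (!(key in resolved)) {
        resolved[key] = set[key]; // first (highest-priority) set wins
      }
    }
  }
  return resolved;
}

// Example: "atr" overrides the width given in "atr2", but "atr2" still
// contributes the border attribute that it alone defines.
const sets = {
  atr:  { width: 200, autorefresh: "yes" },
  atr2: { width: 300, border: 1 }
};
console.log(resolveAttribs(["atr", "atr2"], sets));
// → { width: 200, autorefresh: 'yes', border: 1 }
```

Note this mirrors how CSS resolves competing declarations, except that priority here is positional in the reference list rather than determined by selector specificity.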
[What makes this better than CSS? It seems to me it'll just encourage mixing of markup and presentation, the avoidance of which is a major reason why stylesheets exist. -DavidMcLean?]
The distinction is blurry in practice. This doesn't force mixing anyhow. And even if we do "separate" them, there's no reason to have them each in a different language.
The distinction is only "blurry in practice" if you have failed to distinguish content (what is to be presented) from presentation style (how it is to be presented). Whilst it is certainly possible for CSS to have been implemented using XML, it would have been overkill, because CSS is strictly about setting values of properties.
One can force a distinction, but often it's an arbitrary line in the sand. The customer says "make the page look like this", and they don't necessarily categorize what is what; they just want thing X in spot Y because they do and they call the shots. It's like yard landscaping: you can make recommendations and suggestions, but the house owner ultimately decides where stuff goes and how it looks. I cannot and am not paid to read their mind and so cannot classify their thinking into some idealistic categories. PrematureAbstraction. (Perhaps complicated by a central "style cop" versus a department user with a sub-site.) And XML is perfectly capable of setting values on properties. If we want to classify by tag or attribute name for some unstated reason, we can have parsers that do that. I'm not sure what practical problem you are solving by treating them as very distinct things and using different languages for each.
It is not "an arbitrary line in the sand", unless you do not understand the distinction. HTML is a markup language. CSS is a property-setting language. Both are minimal syntax sufficient to implement their respective semantics.
You haven't said anything concrete or distinguishing.
Actually, I have. Your apparent lack of recognition of this is telling. Do you understand the distinction between content and presentation? Do you understand the difference between a markup language and a property-setting language?
Let's look at something more concrete. Let's say we have a web page with one and only one little "panel" that the "customer" has asked to float to the right in the middle of the page with a picture and a little table of statistics. Whether floating to the right is semantics or a presentation style can be argued either way. It's the method chosen by the customer to present their idea/content. Presentation is communication (or at least can be). Whether the presentation contributes to communication of the idea or is "mere decoration" is difficult to pin down, especially for marketing-oriented stuff. And it's probably a philosophical question beyond the worth of having a developer/designer get overly concerned about. If the customer wants X red, make the damned thing red. It's a wasteful thing to obsess on the classification of the "red-ativity".
- [Floating to the right is presentation. It helps that HTML doesn't actually have a way to specify floating-to-the-right, while CSS does, in narrowing that down. Making things red is also presentation. The correct thing to do, in fact, is ask why the thing is red and to encapsulate the en-reddening CSS into a semantically-descriptive class name; for instance, if the thing is red to draw attention to a fatal error, it should have a class "error", not a class "red" or inline CSS setting its colours. -DavidMcLean?]
- In my experience such attempts at generalization often fail. (PrematureAbstraction.) If we know up front we have a standard "error color", sure, make it a formal class. However, a good many things don't fall into neat categories or drift over time or the customer doesn't know what they want. If it's a one-off thing, it's best to embed the sucker near the originating object to avoid hoppity-hop maintenance steps. If a pattern grows over time, then one can refactor. Some YagNi principle applies here also. In other words, wait until it really is a one-to-many relationship (or highly likely to be) before preparing for a one-to-many relationship. Don't force categorization or indirection before it's ready. -t
- [It's not actually generalisation. It's keeping markup semantic, instead of physical. The markup should say "this box is an error", never "this box is red"; it's a presentational issue that "errors are red". We actually don't need to know up-front that errors are red; we just need to know that there are errors, because if we only specify "errors are red" in the stylesheet once, we'll only need to replace it with orange once. -DavidMcLean?]
- I agreed with that already. "If we know up front we have a standard...".
- [My point was that we don't need to know up-front that we have a standard "error colour". We just need to know that there are errors. -DavidMcLean?]
- Same difference. That's a need we "know up front". That's not the kind of situation I'm concerned with.
- [If you're tasked with making something red, ask why it's red, then use a class name that describes the why instead of the what. It's not complex; all it requires is that your customer actually have a point behind making decisions. To quote you, 'it doesn't make a very good example/demo if you just say, "The customer told me to make it that way; I don't know why it has pink counter-rotating flowers that squirt lemonade."'. -DavidMcLean?]
- For once you pay attention to what I write. I'm in shock! Sometimes the customer either doesn't really know or is not in a mood to tell. You ask, but they are not ready to answer, and you move on because one often has bigger fish to fry in the app/site. But if they told me such, then I would normally avoid using their specific request as an example here; but in this case the issue is about dealing with the existence of non-explanations during design and coding. Vagueness does happen often. Maybe later a pattern will emerge, but it takes time and patience.
- [Yes, it's certainly true that the customer won't always elaborate on particular points or mightn't be able to. That doesn't make mixed markup and presentation a good thing, though: It's still bad practice, just currently unavoidable bad practice. -DavidMcLean?]
- "It's a bad practice because I say so" is not good evidence. Per below, my recommendation is to separate if and only if there is a one-to-many relationship or a one-to-many relationship is likely in the near future.
- There's a trivial "proof" that separation of content and style is vital: accessibility. Separating content and style makes it possible to deliver user-selectable styles to support various eyesight disorders and/or dyslexia without any need to re-code the content delivery mechanisms. In some cases, the end user can even supply their own stylesheets. See http://www.csszengarden.com/ for an example of what is possible by separating content and style.
- Well, okay, you may have a decent argument there. But we are trading likely work for the developer for a need that may be relatively rare, especially for intranet apps with a relatively small group of known users. The dyslexic and blind are not likely to be working in accounting (although I think my Bank hires them based on the screw-ups on my statements). Know your "audience".
- See if your HR department agrees with you. They might even cite corporate, state, provincial or federal regulations that require accessibility whether you've got staff that need it or not.
- That's probably moot because the current disability software products generally don't substitute style-sheets anyhow to extract needed info and could in theory perform the very same transformation operations on local styles. The computer cares even less about location than a code maintainer (as long as info is not lost in the process). But I'm not an ADA lawyer.
- Sorry, I'm not following you. What do you mean by "current disability software products generally don't substitute style-sheets anyhow to extract needed info and could in theory perform the very same transformation operations on local styles"?
Second, it doesn't make much sense to use one language to describe something as a DIV or a unit and another for its location info. Why the hell toggle between languages? I see no logic or rationale. Doing it to "remind the developer about separation" is weak, per above.
The partitioning should be based on estimated likely ChangePatterns, not some goofy arbitrary philosophical distinction. Misguided idealism f8cked up the current practice and the CSS designers should be slapped with a large wet curly brace.
[It's not arbitrary nor philosophical. Content and presentation are (quite obviously, really) different things. On the Web, content lives in HTML, and presentation lives in CSS. Both languages, as noted above, are the minimum syntax necessary to specify their respective semantics. Because presentation information does not require arbitrarily-nested markup, as HTML content does, CSS does not need to support those features; all it must do is associate properties with selectors, and it uses a simple yet effective syntax well-geared toward specifying the required selectors and properties. XML doesn't afford the same capability, because it's a way to encode trees, and CSS doesn't constitute a tree. -DavidMcLean?]
CSS should allow for tree-ness (and can, indirectly, using class lists and other tricks). Defaults/inheritance can apply to styles also, and we don't want to repeat or micromanage each element's styling if possible. For example, we may want to make a special "error style" for some elements to highlight errors, but not all. If there is an error, we may want to put a red box around DIV's and buttons but not SPAN's (or vice versa). Thus, we want the SPAN style to default to a wider category if not explicitly overridden by error-centric CSS. Ideally we could mark a general area or DIV as being in "error" mode, and elements designated to recognize error-ness would style themselves differently from the default style, if given an error-style.
- [None of that stuff you describe is tree-ness, but it pretty much is how CSS already works. What's it missing? -DavidMcLean?]
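As a rough sketch of that last point, current CSS descendant selectors can already approximate this kind of "error mode"; the class name and property values below are invented for illustration:

```css
/* Mark a region as being in "error" mode with class="error" on a container.
   Descendant selectors then restyle designated elements inside it. */
.error div,
.error button {
  border: 2px solid red;   /* divs and buttons get the red box */
}
/* No .error span rule is written, so spans inside the region simply
   keep whatever default or wider-category styling already applies. */
```

Toggling the mode then requires only adding or removing the "error" class on the container, not touching each element.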
If the distinction between content and presentation is truly clear, I invite you to write the algorithm/steps in a clear way. Otherwise, I remain skeptical, and so should readers.
- [Content includes the actual text that will be displayed, coupled with the semantic metadata that HTML affords providing structure, such as "this is a paragraph", "this bit's an article", "this particular string of text happens to be a citation". Presentation includes exactly how all those structural elements are arranged and appear on the page, including colours, sizes, margins, locations, whether text is displayed in bold or italics or perhaps underlined, font sizes and families, what kind of bullets to show on an unordered list, and so on. -DavidMcLean?]
- Often there are exceptions to the rules, or custom thingy-mobobs that have yet to be classified or fall into a clear pattern. For example, a certain style of ad that is still being worked out, such that it's premature to group them under one umbrella. Domain or shop conventions often arise organically.
- [Always separate content markup from presentation. Don't allow exceptions to the rule, for the same reason you should never start coding a function without a failing unit test (CodeUnitTestFirst): If you start implementing stuff the wrong way, there's a good chance it'll stay that way due to inertia. -DavidMcLean?]
- No! Again, separation can make one hop around a lot during maintenance and reading. Factor because there's a one-to-many relationship, but not "stretched" one-to-one relationships. Separation is nearly useless for one-to-one relationships. Cluttering up style-sheets with one-off styles makes for bloat and confusion.
- No! Separation eliminates the need to consider irrelevancies like colours, font, backgrounds and layout when developing presentation of content. Separation eliminates the need to consider the irrelevancy of content when developing styles. This makes for less "clutter" when coding content-generation, or working on style and presentation.
- It's not a concern that goes away under either scenario; it's just a matter of how such things are managed. I've seen style sheets bloated up with one-off styles such that they were unmanageable. Maybe with better coordination they could be cleaner to read, but that takes high discipline. If the one-off styles are kept close to their object, then we generally don't have that particular problem.
- Bad code can happen anywhere.
- YagNi dictates keep it local until it's shared. (Although I'm for a softer form of YagNi.)
- By that same argument, YagNi would appear to dictate not using functions or procedures until the same code needs to be invoked in more than one place. Obviously, that isn't the case -- it would result in bad code.
- I happen to believe that about functions to some extent. Split them if they are growing excessively large or you see a natural seam, but don't force splitting just because some authoritative-sounding bloke says "it's good".
- Invocation of good function names helps make code readable. n.getStringLength() is far more readable and self-explanatory than an in-line loop with an incremented count! The alternative is to intersperse your code with comments, but then you have to maintain a (potentially wasteful) discipline of keeping comments up-to-date with code changes.
- I'm pretty sure I've seen this debated before on this wiki. I'll try to find the topic.....leisurely.
Further, it's illogical to have one default/inheritance technique for the markup and another different one for the styles. It should be uniform if possible so that one doesn't have to learn and remember two different ways to manage defaults/inheritance.
[CSS has inheritance. Specifically, it cascades; that's what the C stands for. HTML, by contrast, does not have inheritance. What would that even imply? -DavidMcLean?]
A more powerful GUI markup language will best be served with some form of inheritance and/or templating. HTML is crippled in that regard, and is thus sometimes enhanced with SSI's, ColdFusion, and other techniques.
[Ah, okay, that does make sense. It's less "inheritance" you're finding HTML lacks and more the ability to define custom "subroutines", I think. Generally that's addressed through application of server-side development, though, as you've noted: Most every view layer has support for partials that effectively extend the markup language with app-specific constructs. -DavidMcLean?]
No, subroutines don't handle things like attribute inheritance well. We don't want entire tag versus none.
Attribute inheritance is handled through CSS, which is precisely what it's for.
But we would want that ability for markup also. For example, rather than type a repetitious "onClick" handler for each list item, we can have it inherit that attribute, yet override it for specific items.
- [Or we can not use an onClick attribute in the first place, since as I've mentioned earlier that's also mixing markup with other concerns. Using jQuery we can apply the same handler to a set of items through its selector features (borrowed from CSS), as well as register more specific handlers for particular elements. -DavidMcLean?]
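A minimal sketch of the selector-based approach just described; it assumes jQuery is loaded, and the element ids and handler bodies are made up for illustration:

```html
<!-- One shared click handler for all list items, plus a more specific
     handler overriding it for one particular item. -->
<ul id="menu">
  <li>Alpha</li>
  <li>Beta</li>
  <li id="special">Gamma</li>
</ul>
<script>
  // Shared behaviour, applied to the whole set via a CSS-style selector:
  $('#menu li').on('click', function () {
    console.log('default item clicked');
  });
  // Override for one element: unbind the shared handler, bind a specific one.
  $('#special').off('click').on('click', function () {
    console.log('special item clicked');
  });
</script>
```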
- So we have one language for "presentation" (CSS), another for content classification (HTML), and another for event handling (jQuery). Three fricken languages. Plus the app language (such as Php or DotNet): FOUR. If this is the pinnacle of software development, shoot me now.
- [jQuery isn't a language; it's a JavaScript library. We obviously need an actual programming language (JavaScript) for client-side scripting, as neither HTML nor CSS are capable of specifying behaviour (nor should they be). The HtmlStack has always had "three fricken languages". This is not new. -DavidMcLean?]
- The HtmlStack sucks eggs. (I'd argue jQuery borderlines on being a "language", but that classification doesn't change the problem.)
- [What is the problem, aside from your not liking the HtmlStack? -DavidMcLean?]
- Too many languages is one problem. Having to load bloated libraries for common typical GUI activities is another. HTML's lack of some factoring/referencing-based-reuse abilities is another. The jittery/clunky nature of DOM-centric GUI's is another.
- [Why is three languages "too many", and how is that a problem? The lack of factoring available as part of HTML is a definite inconvenience, but as noted below that's been pretty much entirely solved with server-side scripting. And what recent Web GUIs have you been using that are "jittery"? -DavidMcLean?]
- All else being equal, it's best to work in fewer languages for a given project, wouldn't you agree? Solving the factoring problem with server-side scripting means that one solves the problem a different way for each app language. Giving HTML (or the equivalent) this ability allows more standardization because we are not catering to different app languages. And if HTML had it, then it would be best if it used the same factoring approach(es) that CSS does, so we don't have to remember multiple factoring techniques in different languages. Thus, fixing one problem exposes another problem that can be fixed by merging CSS and the markup (into a master markup) and giving them a common set of factoring techniques. It's factoring the factoring. -t
- ["All else being equal", it's certainly better to work in fewer languages. But all else is not equal. HTML and CSS are domain-specific languages for describing markup and presentation, respectively. Disposing of either would necessitate adding complexity to one of the remaining two languages in the trifecta, as well as mixing that language's original purpose with some seriously unrelated concern. HTML can't readily use the same factoring approaches of CSS, simply because it doesn't do the same thing as CSS. There's nothing HTML specifies that it would make sense to "inherit" the way one does in CSS. Adding new tags to HTML, as AngularJS enables, is valuable, but as noted that's closer to subroutines than inheritance. -DavidMcLean?]
- The power in the distinction is that each becomes a LittleLanguage. While working on styling a person must grok the absolute minimum required to see what's happening, one of the factors that has enabled legions of visual designers to pick up web design. Take this to the extreme and we are left with the simple beauty of HamlLanguage?, SassLanguage?, and CoffeeScript. -AndrewCouch?
- There's nothing special or magical about CSS syntax, and the block "syntax battles" are mostly personal preference issues.
- CSS describes a hierarchy of property values, and nothing more. More complex syntax -- such as markup, as there's no text to "mark up" -- would be overkill and pointless complexity. CSS is a minimal syntax sufficient for its intended purpose. So is HTML. So is Javascript.
- Please elaborate on your claims. It so far sounds like PersonalChoiceElevatedToMoralImperative.
- No, my claims are self-evident. Please provide a counter-argument.
- No, they are not. You think they are because you mistake your (misguided) head models for reality.
You mean an onClick that always invokes the same handler?
It doesn't have to be the same for every list item. But let's take the scenario that a large sub-set of the list will invoke the same handler.
In that case, some mechanism for declaring and using re-usable HTML blocks would be reasonable. However, server-side includes and server-side scripting in general obviate the need for this.
But the idea is to make common GUI behavior declarative. And SSI's don't handle attribute-level templating very well.
Yes, then server-side scripting is appropriate.
Then we have a different UI language for each different app language. That's like having a different DB query language for every different app language. I don't see that as a good thing.
Actually, it's like having SQL for querying and C# for application programming.
SQL is a far more powerful database language than HTML is a GUI language. What we need is The Sql of GUI's. (Ideally the SmeQl of GUI's, but I'll accept SQL for now.)
[HTML isn't a GUI language. It's a markup language. The HtmlStack, however, could be viewed as a GUI language that happens to have intrinsic SeparationOfConcerns. -DavidMcLean?]
Yes, that's why HTML is limited. Having different languages for SOC is silly, partly because the boundaries of concerns are fuzzy, and partly because concerns interweave so as not to be linearly partition-able without sacrificing another concern. If each language did its job far better than the other, you might have a case for having different languages, but such has not been shown.
Having different languages for SeparationOfConcerns minimises the extraneous syntax needed for each, and there are no fuzzy boundaries of concern in the Web client stack. The only thing "fuzzy" is your understanding of it.
So you claim.
Anyway, this debate is moot. You're not going to change Web standards by yourself.
I'm hoping to remind somebody of influence about the illogical, convoluted TowerOfBabble? the HtmlStack has become. At least they should ask for "community comment" before building a new contraption, to make sure they address design-decision questions.
[You're not the only one who isn't particularly fond of the HtmlStack, but most solve it by substituting a different HtmlStack. In the RubyOnRails world, HTML/CSS/JavaScript is often replaced with Haml/Sass/CoffeeScript, for example; Jade/Stylus/CoffeeScript is popular in NodeJs land. No one but you finds the separation of content markup, presentation, and behaviour into separate languages a big enough problem to replace all three languages with one (most don't find it a problem at all). -DavidMcLean?]
More Semantics versus Presentation Fuzz
An example of a fuzzy boundary could be "kinds of ads" a company shows on its web pages. The "kind" may be defined by how they look, or what kind of content is present, or who "invented" the look or style or layout, or a combination. The "meaning" of their classifications may be very nebulous. The "types" may be just names given in an attempt to label a nebulous concept or set of concepts, because a fuzzy name may be slightly better than no name, but the concept is otherwise hard to pin down and may not be pin-down-able. Forced classification can be PrematureAbstraction. Such domain abstractions often change in an organic and unpredictable way.
We have pre-defined historical conventions in text such as "paragraph", "block quote", etc., but domain-specific things don't necessarily have a reliable history. And the semantics versus presentation-ness is hokey even in the established ones. A numbered list could be considered a presentation choice versus a bulleted list, for example. Even the concept of a "table" can be fuzzy. The choice to present a physical grid versus the following is also a presentation choice:
Emp ID: 35
Name: Fred P. Jones
IQ: 68
Hire Date: 12/12/2000
---------
Emp ID: 36
Name: Lisa J. Mac
IQ: 212
Hire Date: 01/14/2006
----------
Emp ID: 37
Name: Flippy J. Offy
IQ: 89
Hire Date: 09/02/1998
dBASE used to allow toggling between a traditional grid table and the "vertical row" style above, by the way.
- [It's also possible to toggle between a traditional grid table and a vertical-row table using CSS, because you're absolutely correct on this front: It's merely a presentational issue. -DavidMcLean?]
- I'd be curious to know how. It's not something I've tried yet.
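One common way this is done (sketched here; the class name and the data-label convention are assumptions, and each cell in the markup would need a matching data-label attribute) is to override the table's display model:

```css
/* Restyle a semantic HTML table into the "vertical row" form shown above.
   Assumes markup like <table class="vertical"> with data-label on each <td>. */
table.vertical,
table.vertical tbody,
table.vertical tr,
table.vertical td {
  display: block;                  /* discard the grid formatting model */
}
table.vertical thead {
  display: none;                   /* hide the column-header row... */
}
table.vertical td::before {
  content: attr(data-label) ": ";  /* ...and label each cell inline instead */
}
table.vertical tr {
  border-bottom: 1px dashed #999;  /* the "---------" separator */
}
```

Removing the class (or serving a different stylesheet) restores the ordinary grid, with no change to the markup.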
Perhaps HTML should have one tag, the TAG tag, which specifies the nesting relationship only. Everything else is "presentation". Whether you want to read a list as a list, or a single-column table, or a comma-separated item paragraph is purely a display choice: do you want your pea soup in a bowl, in a cup, or on your enemy's lap? It facilitates universal customization. You don't have to settle for BLOCKQUOTE; you can have different kinds of blockquotes, such as one color/style for quotations from the good guy and another color/style for the bad guy: GOODYQUOTE, BADDYQUOTE. Why have one syntax for standardized text elements (P, BLOCKQUOTE, etc.) and something different for non-standardized domain-specific or stupid-wishy-washy-customer-specific ones? Reserve label names, fine, but don't force a different syntax between built-in and non-built-in textual abstractions.
- [Your arguments seem to presume that HTML implies presentation that it doesn't. You're correct that you can present a list as a bulleted-list or as a single-column table or as a comma-separated item paragraph, and you're also correct that those are all display choices. You're incorrect in assuming that HTML doesn't already work that way: The <ul> tag doesn't mean "a bulleted list". It means "a collection of items for which order is not of prime importance"; by default, this is displayed in bulleted-list form, but there's absolutely no reason you can't make it show up as a table or a paragraph with comma separation. As for your example of "different kinds of blockquotes", we could allow for unlimited totally arbitrary tags in HTML. We don't, because if we did so HTML would actually not have any semantics. The built-in set of tags all have a particular meaning; as noted, <ul> is "a collection of items for which order is not of prime importance", for example. If tag names were entirely arbitrary, HTML would have no meaning independent of a stylesheet, and since stylesheets only impose presentational meaning that eliminates all structural semantics from the HTML, which has obvious issues in, for instance, accessibility. -DavidMcLean?]
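For instance, the comma-separated-paragraph presentation of a <ul> mentioned above can be sketched in CSS alone; the class name here is hypothetical:

```css
/* Present a <ul class="inline-list"> as a comma-separated run of text
   rather than a bulleted list. The markup stays a semantic list. */
ul.inline-list {
  list-style: none;
  padding: 0;
  margin: 0;
}
ul.inline-list li {
  display: inline;         /* items flow like words in a paragraph */
}
ul.inline-list li + li::before {
  content: ", ";           /* comma between items, none before the first */
}
```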
- "Of prime importance" is nebulous in terms of whether the "importance" is semantics or presentation. I assert that we should not try to force such a classification up front. I'd perhaps consider such a "default presentation" if we want a flexible UI. Thus, the distinction between ordered list and unordered list is perhaps a "style" choice and the tag should really be LIST, not OL and UL. A style attribute can then be something like defaultListStyle:<numbered,lettered,roman,none>.
- [It's definitely of semantic importance. If the order matters semantically, use an <ol>. -DavidMcLean?]
- Importance is relative, and what if we don't have the knowledge of whether it is of "semantic importance" or not? We cannot always probe the author so deeply, being that there are probably many more important questions to get to first. Ideally we should be able to imply the semantic importance is "unknown" (or nil) if we don't know. We only know the draft used numbers, and so we make that the default display. I agree it's probably a nitpicky issue in this case, but in general the developer/designer shouldn't be forced to over-specify intent (oh oh, there's that word again). If we do it right, we'd make the semantic intent be an optional attribute/setting, not a forced and up-front thing. It's like having a mandatory gender (male/female) selector field on a planet full of androgynous hermaphrodites. (You know, the planet where David Bowie, Annie Lennox, and Michael Jackson came from.)
- [We can determine whether the order of a given list is important by examining its elements and assessing whether their order is significant, using the heuristic ability humans possess. Of course, computers can't do that automatically; that's one of the reasons we as humans include information like "the order of this list is important" in our markup. -DavidMcLean?]
- Please clarify; I'm not clear on how examining the list produces certainty. A guess is still a guess. In some cases we can guess with high certainty, in some cases low certainty. It's usually the developer's/preparer's job to deliver a product, and not primarily to classify the domain objects. A "default style" is plenty sufficient to encode the best guess without having to make any iron-clad classification statements. If we have iron-clad info, yes, it would be a nice BONUS to be able to encode such info, perhaps in an optional attribute, but it shouldn't be the primary or forced mechanism. You are essentially forcing the up-front choice of bag versus set when most textual tools and writing conventions don't make that distinction; the question would probably confuse the author and get you an odd glance.
- [How am I requiring a choice between bags and sets? <ul>, being unordered, might be representative of either a bag or set, but <ol> is very obviously an ordered list and not either a bag or set. In the abstract, it's not really possible to quantify how one goes about examining a list to determine whether it's ordered or unordered (or a key-value situation, which calls for a <dl>). In specific cases, one might notice timestamping, or an implication of sequencing between instructions, or some such factor. If we're receiving content from clients directly, we generally assume its initial markup is semantically accurate; this may be flawed in situations where clients don't know their semantics, of course, which is why we sometimes must use our human heuristics to attempt to infer the intended semantics. However, devs don't just accept raw content from clients and slap it onto a page alone. The site's layout must be considered and constructed (partially with HTML, and partially with CSS); while doing this, we apply the various tags structurally as is suitable: Anything that works as an ordered list of elements should be represented as an <ol>, for example. It's common for instance to represent the posts of a blog semantically as <li>s within an <ol>. -DavidMcLean?]
- Sorry, I meant set versus list. And yes, we "must use our human heuristics to attempt to infer the intended semantics", but that's just a verbose way to say "guess" such that my "guess" statements still stand. And there are times one may want to re-sort blog posts, say by topic.
- [Yes, and the very fact that we can re-sort blog posts meaningfully is indicative of the fact that order is meaningful for them. That's a good example of heuristic, actually. Thanks. -DavidMcLean?]
- Sorry, I'm not following. Who is "them", end user or content author? The most common info we reliably have from the author is default preference, and not much else. We cannot and should not assume we can infer deeper or more than that about what the end user should be allowed to do with the list. As a developer/designer, I'm not going to hard-code guesses that are mere guesses. If the end-user has a browser that can re-sort lists or turn on or off numbering (versus bullets or whatnot), I'm not going to attempt to restrict such activity unless there is a stated or obvious business need. You keep skirting the issue that usually we don't know the author(s)' "semantic" intent beyond a guess.
- ["them" is neither end-user nor content-author. "them" is "the blog posts". Blog posts may be sorted meaningfully in various ways (date, topic, tags, number of comments); the very fact that there are meaningful orders possible for blog posts means that order is meaningful for blog posts, and therefore they should be represented in semantic HTML as an <ol> of <li>s. Whether the end-user's browser is capable of re-sorting lists or customising the displayed numbering is irrelevant (most are in fact capable of the latter using user-styles anyway), and the decision to use an <ol> of <li>s does not in any way inhibit doing either. -DavidMcLean?]
- You seem to be agreeing that OL versus UL is simply a "default display preference", but I am not sure. I still believe the distinction between semantics and display is either vague, or the differences require subtle information that is often not practical to obtain, such that in practice our design should not make that distinction required information. Sorting has philosophical elements of BOTH semantics and presentation, such that my "default display preference" approach is the better K.I.S.S. solution. The same information can often be displayed in many different ways: as trees, as lists, as tables, as 3D floating graph blobs (as seen in a Java demo), etc. We cannot practically probe the author to ask and encode whether every possible display makes sense semantically, and our own guess is merely a guess. HAS-A is more flexible than IS-A when there is domain ambiguity, and a default presentation indicator is closer to HAS-A. (If we play ELIZA with the author to probe their mental map of the domain, they'll slap us silly or call the Loony Bin to come pick us up.) -t
- The distinction between semantics and display is not vague. Semantics is what the data means, display is how the data looks. Semantics says "this is an error message", display says "this text is red."
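- The point can be shown concretely; a minimal sketch (the class name and message text here are invented for illustration):

```html
<!-- Semantics: the markup says "this is an error message"... -->
<p class="error">Payment failed: card declined.</p>

<!-- ...display: the stylesheet says "error messages are red". -->
<style>
  .error { color: red; }
</style>
```

Changing the colour later touches only the style rule; the markup's meaning is untouched.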
- That particular example is generally clear-cut; however, I've given an example of something that is not, or at least where it's not realistic to go out and obtain the semantics. I'll restate the scenario in more specific terms. As a developer or web publisher, you are given an MS-Word document of the content to put on a web-site. List X within the document is a bulleted list of "goals". It doesn't state that the topmost item is more important than the bottom-most; it's just a list. You are told not to "bother the author" except for "critical reasons". We can safely say or assume that the author wants the default display of List X to not be numbered, based on the Word document. However, we CANNOT safely say that "order doesn't matter" or that "order is not important" in List X. We can only make a solid (default) presentation decision/classification, NOT a semantic one. Our choice to use UL is not (solidly) based on semantics. And I gave another example about ads.
- <ol> and <ul> are -- like <font>, <b>, <i> and similar tags -- reflective of a pre-CSS legacy. The semantics of a list may be, for example, "these are our customers", the display is "single column, numbered bullets". Again, semantics and display are conceptually distinct. Whilst <ol> and <ul> are also distinct -- by virtue of legacy, as noted -- through CSS they can be made identical. If we had it all to do over again, there would probably be one <list> tag with a plethora of CSS options.
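- For instance, standard CSS can already collapse the legacy distinction between the two list tags:

```css
/* Make an ordered list render exactly like a default unordered list */
ol.bulleted {
  list-style-type: disc;   /* bullets instead of numbers */
}

/* Or strip the markers from both entirely */
ol.plain, ul.plain {
  list-style-type: none;
  padding-left: 0;
}
```

So the visual difference between `<ol>` and `<ul>` is purely a default, already overridable per-element.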
- Put another way, sometimes we have semantic information and sometimes we don't, and sometimes we have presentation information and sometimes we don't. However, we are still called upon to "get the job done", requiring us to make a best guess. I'm not complaining about the existence of guesses, only suggesting we don't "hard classify" things based on such guesses if possible. Record that it's a guess, but don't try to paint it as something it's not. In the list case it's probably no big deal, but in other situations, PrematureAbstraction can create big maintenance/change headaches.
- PrematureAbstraction may -- as a general approach -- be a form of YagNi, but combining semantics and presentation is premature clutter. Unless your job is jamming out one-off throwaway pages as quickly as possible, no loss ever accrues from separating them at the start.
- Sorry, I have to disagree. Separating on 1-to-1 relationships can and does lead to extra maintenance steps and unnecessary clutter. (Even 2-to-1 can be questionable because similarities may be coincidental or volatile.) I agree that software and content structure design decisions involve weighing tricky trade-offs. I generally run a kind of simulation of the future in my head, considering the different possible paths and their likelihood using past experience, and use an informal version of the kind of calculations shown in DecisionMathAndYagni to make such decisions on indirection/fan-out. (For bigger decisions, a formalization of such analysis may be in order.) Design-by-rules is not the optimum approach. The "rules" should merely be reminders of factors to consider, not substitution for future-simulation and probability tree analysis. --top
- There's a 1-to-1 relationship between your car's gear shift lever and its transmission. Even if you shift gears rarely, would you rather do so with the shift lever or by sticking your hand in the transmission? Obviously, this is a contrived example but the analogy is apt. Separating presentation from semantics makes it easier (and therefore safer) to change either one, because both are simpler apart than if they were combined. If your argument is that they might not change, then you're probably in the category of "jamming out one-off throwaway pages as quickly as possible", as I mentioned above, whether you like it or not. Valuable production code has a high probability of change, and therefore benefits from making changes safer and easier by enforcing SeparationOfConcerns right from the start.
- It's not always "safer and easier" because it may require opening up 2 different documents or editing-spaces instead of one (more effort), and also it may not be easy to know what ELSE is using a given style setting, such that the impact of a change is harder to know and test, similar to the FragileBaseClassProblem. There are downsides to both approaches. You need to make a probability-based argument to convince me that separation-up-front is always the best decision. Can we at least agree that different changes/scenarios favor each one over the other?
- [If it's so hard to open two documents in your editor that you'd prefer to put all your styling information directly in your HTML pages, get a different editor. -DavidMcLean?]
- Conversely, a style change that should apply to multiple locations now becomes an awkward and failure-prone search-and-replace, potentially in more documents or editing-spaces than just two. My transmission example, above, makes the probability-based argument that it is always better to separate semantics from style unless -- as I pointed out above -- you are "jamming out one-off throwaway pages as quickly as possible".
- Both approaches can be "done wrong" such that a more useful question is which one is more likely to be done wrong in practice. And toss the car analogy, it's too far removed from this because for one it's about the end-user, not maintainers.
- Obviously, SeparationOfConcerns is preferable to a BigBallOfMud, or even a small ball of mud. In practice, things are far more likely to go wrong when things are complex, and combining concerns always results in more complexity than separating them.
- You are only repeating prior claims, not demonstrating them to be absolute objective truths. I could break down my model into key-strokes, mouse-movements, programmer time, etc. multiplied by probability estimates, but in the past you didn't seem to put much stock into such analysis, which are essentially economic arguments. However, you offer no number-based alternative.
- Such economic arguments are only reasonable if you have actual figures, not purely speculative ones, and can demonstrate that any theories derived therefrom actually work.
- I disagree. They can help tease out where our differences lie. If you think X is 20% likely to happen but I think it's 70% likely, then we explore X more, or at least document that it was a pivotal point. Even if we never agree, we've at least created a model framework for comparing such issues. Just because we cannot finish a scientific analysis of software issues doesn't mean we shouldn't start. You have to walk before you can run. Science is as much about good questions as it is about good answers.
- Random numbers are meaningless.
- Anecdotal estimates based on experience are not "random". Sure, they are not very high on the EvidenceTotemPole, but DontComplainWithoutAlternatives.
- Case studies and anecdotes are valuable. Arbitrary numbers, presumably culled from anecdotes or case studies but sans context, are useless.
- Further, a reader who may agree the model is generally useful but disagree with a specific estimate value can plug their own shop's estimate into the slot to get a more fitting result.
- Arbitrary models, untested with evidence, are useless.
- The alternative is "it's good because I say so and I declare myself smart." I do things for a reason, and I try to articulate my reasons as best I can. Often it is based on probabilistic observations ("people tend to do X when they encounter Y"). I would prefer similar counter-models and observations from those with opposing viewpoints over "I just say so". A consolation prize is better than no prize. But if you don't want such info, you don't have to read or propose draft models. Don't complain, just skip over it. To hold the view that one should have either an OfficialCertifiedDoubleBlindPeerReviewedPublishedStudy OR nothing is extreme, in my opinion. Even if I disagree with others' models and anecdotal observations, I still learn something by probing their thought process and knowledge (if they are clear enough).
- Perhaps the most self-evident evidence in favour of SeparationOfConcerns is the answer to either or both of these questions: Why do we have functions, procedures, modules, protocols, operating systems and other mechanisms to implement SeparationOfConcerns? What would programming be like if they didn't exist?
- Factoring, such as 1-to-N factoring. There's an existing topic on functions and 1-to-1 relationships somewhere on this wiki.
- Eh?
- Removing repetition is the primary purpose. Splitting up steps into sub-steps for easier reading or staff division is sometimes a reason, but I would not state it as an absolute rule, but rather an ItDepends rule of thumb.
- Removing repetition is the primary purpose of protocols and operating systems? Would you consider it good practice to code a new DBMS as an undifferentiated soup of language parsing, query optimisation, query plan generation, data structure manipulation and file access without separating or layering them in order to manage complexity?
- We also want to remove repetition of learning other DB-like products and repetition of rewrites upon vendor swaps or changes in implementation due to improved technology. Similar for OS's. Standards are an attempt at reducing such repetition, and are thus a form of factoring.
- That may be so, but it doesn't answer the question: Would you consider it good practice to code a new DBMS as an undifferentiated soup of language parsing, query optimisation, query plan generation, data structure manipulation and file access without separating or layering them in order to manage complexity?
- Maybe the first DBs did jumble these all together. Over time the above components became their own topics, such that algorithms were grouped and found under such labels and specialists specialized in aspects along these lines. The "parts" formed their own working sub-language and became semi-off-the-shelf tools/algorithms that we use to build new variations. If you make a new database, you grab the B-tree literature and/or C libraries because that "part" is mostly pre-made and road-tested. Humans are lazy (or "economical"); we don't want to do more work if we can borrow others' work. It's a kind of "parts" culture that becomes a kind of standard. If you want to make a new gizmo, it's easier to walk into a hardware store for pre-made components than to hand-shape all your parts from scratch out of rocks and soil. Alien intelligences may partition their "standardized IT concepts" and databases quite differently. We only have one planet to observe, and its IT tools were largely formed and shaped by "western universities" and companies (US and Europe) between roughly the 1890s and 1970s, for good or bad.
- Note that I suspect our differentiation between file systems and databases is a historical accident, and things would have turned out better (single tool instead of 2) if the concepts merged early.
- Probably true, but irrelevant here.
- That may be so, but it doesn't answer the question: Would you consider it good practice to code a new DBMS as an undifferentiated soup of language parsing, query optimisation, query plan generation, data structure manipulation and file access without separating or layering them in order to manage complexity?
- Because I'm a lazy bastard who doesn't want to spend the time to reinvent the wheel (or different wheels). And if I make up or find too many new or different concepts, it's harder to find another maintainer. For example, if a comment says, "This is the hook to log parsing errors", the maintainer generally knows what's going on. However, if it said, "This is the blipniv to log traxurt errors", the maintainer is going to take longer to grok the code and either give up or charge more for his/her time.
- That may be so, but it doesn't answer the question: Would you consider it good practice to code a new DBMS as an undifferentiated soup of language parsing, query optimisation, query plan generation, data structure manipulation and file access without separating or layering them in order to manage complexity?
- At this time, I'd say the primary consideration is economical: it's cheaper to not reinvent the wheel for the reasons given.
- That may be so, but it doesn't answer the question: Would you consider it good practice to code a new DBMS as an undifferentiated soup of language parsing, query optimisation, query plan generation, data structure manipulation and file access without separating or layering them in order to manage complexity? Why are you avoiding the question?
- No, because we already have conventions and kits and samples and labels (for comments) for the necessary parts so we can save time/money by not reinventing the parts, including maintenance costs per those who know about the prior conventions. If it was a brand new kind of "thing" that had no precedent, then starting out as an undifferentiated blob of code and gradually factoring out commonality as it's encountered may be the way to go. But this doesn't go against my advice of not separating out 1-to-1 relationships without a clear reason. "The maintainers are used to having X separated from Y and get confused if it's not" may be a decent reason, unless maybe you feel it's time to break them of bad habits. But that's still a cost/benefit analysis where 2 paths are compared, a SimulationOfTheFuture.
- Whether we have pre-existing parts or write them ourselves matters not. Let me ask a different way: Do you consider it bad practice to separate or layer functionality in order to manage complexity?
- Whether they are pre-existing does matter. Reinventing parts from scratch is more resource-intensive than using off-the-shelf parts for both production costs and maintenance familiarity reasons.
- That may be so, but it's irrelevant here. This is not a discussion about whether to build or buy, but about whether SeparationOfConcerns is a good or bad practice. From a SeparationOfConcerns point of view, it's immaterial whether you build or buy.
- It's not just about purchasing, although that does come into play because if we cater to the way parts are sold or packaged (OSS), then we can later swap those parts out as needed. WhenInRome. If we use vocabulary and partitioning based on what maintainers are likely to be familiar with, then maintenance will generally be smoother. The maintainer will know what "parsing" is, but not zabniffing. Thus, we don't reinvent IT conventions unless we can identify a clear reason why they are not a decent fit.
- Whilst that may be a worthy topic somewhere, it's irrelevant here.
- It appears I don't understand what you are asking for then. And I want to make it very clear that it's not JUST about "buying". It's one factor among many.
- For the second sentence, you'd need to better define "functionality". We know the value of separating out "parsing" from experience, for example, and it's a concept we can label (comments, variable names, etc.) to jog our future maintenance memory. It's a known IT meme, a "known language of parts". But for new things, or conflicting memes, I'd say ItDepends whether to split 1-to-1's or not.
- I have already identified functionality in the DBMS example above: "language parsing, query optimisation, query plan generation, data structure manipulation and file access". This isn't a question of "whether to split 1-to-1's", whatever that is, but one of overall development strategy. Once again: Do you consider it bad practice to separate or layer functionality in order to manage complexity? That applies to new things as much as old, build or buy, and I don't know what you mean by "conflicting memes".
- In THIS case, probably "yes" (not having more details), for reasons already given. But that doesn't change my general ItDepends stance. A different app may have different considerations, and/or the traditional IT conventions may not be a good fit. As a general rule of thumb, I'd say it's best to partition on existing IT conventions, at least at the larger scale. But for one-off styles where the staff is not divided into style-versus-app roles, hard partitioning makes for more maintenance steps overall because one has to jump back and forth and match things up to the MirrorModel. In my estimation, MirrorModels add roughly 30% overhead compared to in-lining. Call it the indirection tax. Not only are the corresponding parts in separate files, but the matching has to be managed using match-names etc. One would not have that matching duty under in-lining, or at least it's less complicated because one does not have to worry about a global matching name-space. And it doesn't bloat up the "list" of global styles. The database scenario you gave doesn't create a mirror model so far as we know.
- If you consider it bad practice to separate or layer functionality in order to manage complexity, then I can only assume you either write very small programs, or ones with very narrow functionality -- e.g., "apply all our business rules to every invoice, and create a list of those that don't comply." Separating and layering functionality -- in other words, enforcing SeparationOfConcerns -- is not just a good habit, it's fundamental to successfully implementing, testing and maintaining anything larger than scripts or classroom examples. It makes UnitTests possible, it makes re-use possible, it makes mentally grasping complexity possible. It makes it easier to write code, it makes it easier to read code, it makes it easier to modify code. The notion that partitioning or layering results in a "MirrorModel", or that it presents an "indirection tax", has no basis in reality. You appear to be conflating something like ObjectRelationalMapping -- which sometimes does result in a MirrorModel -- with SeparationOfConcerns.
- Re: "apply all our business rules to every invoice, and create a list of those that don't comply." -- Please clarify.
- [For each invoice, apply this set of business-rule predicates. If any of the business rules fail (return false), put that invoice into a list; show the list to the user when done. Would be used to find invoices that violate business rules and hence to deal with them. -DavidMcLean?]
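- That description can be sketched directly; a minimal illustration (the rule predicates and invoice fields here are hypothetical, just to make the shape concrete):

```javascript
// Hypothetical business-rule predicates: each takes an invoice, returns true if it complies.
const rules = [
  (inv) => inv.total >= 0,               // no negative totals
  (inv) => inv.customerId !== undefined, // must name a customer
  (inv) => inv.lineItems.length > 0      // must have at least one line item
];

// Collect every invoice that fails at least one business rule.
function findViolations(invoices) {
  return invoices.filter((inv) => !rules.every((rule) => rule(inv)));
}

const invoices = [
  { total: 120, customerId: 7, lineItems: [{ sku: "A1", qty: 2 }] },
  { total: -5,  customerId: 3, lineItems: [{ sku: "B2", qty: 1 }] },
  { total: 80,  lineItems: [{ sku: "C3", qty: 4 }] } // missing customerId
];

console.log(findViolations(invoices).length); // → 2
```

The rule set is separated from the iteration, so adding a rule is one new predicate, not a change to the loop.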
- For one, I did not necessarily consider it a "bad practice" in a general sense. I said ItDepends. I consider such things on a case-by-case basis. If not splitting is likely to result in confusion, then I indeed will split. Avoiding a MirrorModel, if the pattern starts forming, is one reason to consider not splitting, but it must still be weighed with respect to all other known trade-offs. Second, you made a boatload of general claims there without any backing or detailed explanations.
- Finally, SeparationOfConcerns is impossible to a full extent in practice because things do interweave in practice, as described in SeparationAndGroupingAreArchaicConcepts. Concern "marking" is perhaps better than physical separation since we cannot separate all factors at the same time into different files (without making a big multi-indirection mess, the mother of all GoldPlating). Since they cannot all be split out, we must make decisions as to which to split out and which not to. If you are arguing "ALWAYS split concerns" (even if 1-to-1), then you are backing a rather extreme approach which almost no practitioner uses. Otherwise, you have your own ItDepends rules and have joined the ItDepends club. I'd like such rules clarified.
- In general, I'll consider separation on a case-by-case basis (per scenario) and cannot propose any sure-shot rules at this time, only rough guidelines like those already given.
- Accessibility does tend to enforce classification standards for content, but this makes it a standards-versus-flexibility issue. For legal and societal reasons we indeed may stick with and encourage a pre-defined taxonomy, at least as a key aspect. But it's still a good exercise to explore more generic approaches, to tease out aspects of the philosophy of design in order to look at the issues and choices for domain-specific abstractions, such as "ads" above.
Thank you for this opportunity to rant my ass off. You may not agree, but I hope it gets the design philosophy gears rolling in your head. --top
Perhaps a compromise could be a local default style that is overridden by a "master style" set when available. This gives us a "self-contained thing", but also allows adapting to local customs (a shared sheet) if and when available. For example, the default font for all headings could be Arial when there is no master style sheet or if it's not ready yet.
[Already possible:]
<style type="text/css">
/* local styles go here */
</style>
<link rel="stylesheet" type="text/css" href="style/master.css" />
[They're called cascading style sheets for a reason, y'know. -DavidMcLean?]
That's backwards from the priority that's needed, I believe.
[Nope, that's right. The problem statement calls for a "local default style overridden by a 'master style' set when available", and later CSS always overrides earlier (well, subject to specificity and !important and such), so the local styles belong before the master stylesheet. -DavidMcLean?]
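- The ordering rule being described can be illustrated in one place; assuming the inline `<style>` block and `master.css` are applied in source order, the later declaration wins at equal specificity:

```css
/* From the inline <style> block (declared first): */
h1 { font-family: Arial, sans-serif; }  /* local default */

/* From the later-linked master.css (overrides at equal specificity): */
h1 { font-family: Georgia, serif; }     /* shared "master" choice */
```

If `master.css` is absent or doesn't style `h1`, the local Arial default simply stands.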
Some browsers don't work that way if I remember correctly. I'll have to run some tests on some older versions and get back to you.
Broken old browsers seem rather irrelevant. If broken old browsers don't properly support standard style sheets, they're certainly not going to retroactively support an "overhauled" style sheet.
I'm talking about app designer schedules, not desktop updater schedules.
??? What does that mean, exactly?
CategoryWebDesign
AprilThirteen