Normalization Repetition And Flexibility Discussion

A continuation of NormalizationRepetitionAndFlexibility because that is getting TooBigToEdit.

Excellent - that page is a big, sprawling ThreadMess that's too hard to find anything useful in, so let's start another one!
- [We should start a little mini thread war right here..]

They are sufficient when dealing with already formalized computational models, such as programming languages and CPUs. We know exactly how these things interact, down to the hardware level and up to the high-level programming languages - i.e. we can assert that the mathematical models of these are as close to perfection as any model can achieve, and those cases where it isn't constitute manufacturing flaws or compiler bugs.

Comparing hardware design to software design has always been problematic because they are different beasts. For one, design maintenance cost is usually a low-ranking factor in hardware, largely because the largest cost is manufacturing, not design. It is sort of comparable to software design in the late 50's: machine efficiency trumped code design by far. Model maintenance has since overshadowed model-execution-engine maintenance. This is not the case in manufacturing yet. It is still closely bound to the hardware. --top

The fact remains that software and programming languages are entirely defined and subject to complete and 100% correct mathematical analysis (internally, at least). [removed content-free rudeness]

Knowing this, we can, in fact, use math and logic for 80%+ of all observations. The only ones it can't touch are psychological and statistical observations over human users and politics (e.g. noting that humans lack perfect foresight, humans rarely both know exactly what they want and how to communicate it without wanting to later change their mind, and that humans make typographical mistakes and logic errors, and that every human ever observed does so - we've never, in fifty years, found a programmer that gets everything perfect on the first try, and noting that businesses prefer 'plug-in' programmers, and noting that most humans don't like learning many different languages creating StumblingBlocksForDomainSpecificLanguages). These observations are necessary to help shape policy and language-decisions, and they can't be diminished... but they also rarely need to be quantified (qualification is generally sufficient to make appropriate decisions - even if we DID quantify these things, there is simply NO place we could ever plug in the numbers).

If there is no dispute about the qualitative decisions, then of course there is no "problem" with qualitative approaches. But that does not help with disputes or conflicting choices.

Unless you can demonstrate or prove that quantitative analysis would do us one better (since you cannot just assume it) we are left only with the conclusion that disputes or conflicting choices are inevitable. What matters is that qualitative analysis gives us enough information to make a better informed decision regarding properties of the language, not that everyone agree that those are better properties. A language-designer, even with perfect information on rates of typographical errors and the relative number of logic resulting from creation of on-the-fly variables in dynamic scope, will be unlikely to make a better design decision than a language-designer who simply knows that such errors happen enough for people to complain about them.

Your own 'ideal' regarding the use of empirical observation in ComputerScience fields is one I consider entirely naive. You need to use more math and logic, and limit observational science to the places you can demonstrate it will help.

What is the alternative? Self-elected "experts" who dictate "standards"? (I doubt they would agree anyhow.) That is the dark-ages. You may like this state of affairs, but some find it primitive. We still have DisciplineEnvy.

No, the alternative is proper use of math and logic and qualitative analysis.

How will this work if you don't know what you are optimizing or don't know how what you optimize means anything in the real world (more productivity, less errors, etc.)? A "clever model" is not necessarily a useful model. Epicycles were clever and had a certain purity to them, I would note (at least the early models).
MuAnswer. You assume we don't know what we're optimizing. You also assume the goal is always optimization. You assume far too much. As far as asking your style of questions: have you stopped beating your wife yet?
If you wish, you can look at the very long list of objective features that can be qualified in AbsolutismHasGreaterBurdenOfProof. Optimization is one application (and we DO know what we're optimizing when optimization is the goal; where we wish it, we know every detail down to the signals on the wires, and we know every abstraction up to the high-level code and the classes of problems or scenarios we choose to optimize). Making decisions on objectively measurable features (e.g. 'resistance to typographical error', or 'guarantee that no undefined operation will occur') then (and especially) engineering solutions to achieve various features simultaneously is also a primary use of math, logic, and qualitative analysis.
Numerical metrics are almost always necessary to claim something is based on scientific analysis ("objectively better"), especially if the claims are on contentious issues. (If nobody challenges them, then "soft science" usually gets a cheap pass.). I doubt you found the exceptions, and are instead mistaking "clever" design for real-world results. --top
I believe you have a very deep misunderstanding of the scientific method.
The feeling is mutual.

This partially means creation of tools and languages that allow better, cheaper, easier automated analysis of programs. While you and people like you squabble over such dark-ages nonsense as whether a change better 'optimizes psychology' or 'brings your programming closer to God', the hardcore logicians will create languages that provably make being 'correct' or 'safe' or 'secure' by objective, qualitative measures the 'easier' thing to do. ComputationTheory? analysis of EssentialComplexity can help tell us what it is possible to do easily and how close we're getting.

That just may be excessive idealism of your personality rather than a worthy or achievable goal.
OR it may be an achievable goal that has been moving forward continuously while you squabble about 'optimizing psychology', whatever that means.
We need better metrics in order to filter out such human bias.
AND Math and Logic and Qualitative Analysis are a proper location to find such metrics.
Humans are highly biased creatures with their own little pet ways of seeing the world. Or, find ways to make the tools fit our biases like tools that fit the shape of our own hand. You think the solution to better software is some clever math that "proves" the world while I believe it to be fitting the model to the modeler's head to reduce the cost of interfacing. You see the bottleneck as lack of a more self-proving or machine-provable model while I see the bottleneck as the translation effort between the model and the modeler's thinking process.
BUT you have no way to ever show your solution fits the model to the modeler's head. Nor do you have a way of demonstrating that the model still fits the problem after you've twisted it to fit the modeler's head. You know what? Your proposal may be excessive idealism of your personality rather than a worthy or achievable goal. Here's a problem for you to confront: Take a child of eleven years of age. Make the model fit her head. Then make sure that model not only still reflects the domain but remains in good enough condition for programming purposes.
Also, my goal is more about productivity than "safety" because if one is productive they have more time for review and testing.
Ah, 'productivity'. What does that mean in a world where you can get sued for writing bad software? How do you measure 'productivity'? How do you demonstrate improvement therein? Are you talking some sort of non-scientific 'bullsh8t'?
Like your claims are scientifically measured? Not. --top
Eh? A claim of type-safety, hard real-time, security against certain scenarios, resistance to certain typographical or transmission errors, etc. can be mathematically proven and observationally verified, which (in combination) is far better than 'scientifically measured'. So don't point your hypocritical finger my way; take care of your own arguments first.
You have not used science. And, why did Paul Graham and co kick competitor's butts using a scriptish style?
- Scriptish? More like Lisp-ish. And, why, pray tell, did Yahoo stores get rewritten in an industry language called C++? No one could understand his Lisp code.. possibly. See also FunctionalInImperative for why functional programming can be cool - but hard to read and grok at times.
- One theory is that Lisp support and developers are difficult to find consistently and that it didn't match the infrastructure Yahoo already had. The other is that the more fastidious languages are better once an app has matured and stabilized. But my point is that Graham's team beat competitors who used different languages. The way you talk, not using a fastidious language should have slowed them way down. Most of the fastest growing web companies used Perl, NOT C++, Java, Eiffel, Ada, etc.
- [Web companies are using Perl less these days. Anyway, fast growing non web companies like Nasa and Boeing use Ada. Other fast growing companies like Microsoft and Apple used Pascal and Cee in their beginning stages. Java/C# is also used in a lot of web companies - new ones very much so. Paul Graham could have written his shopping cart in Python, Ruby, or Cee and he still would have been famous and still would have succeeded. What Paul does is use his fame as an excuse to promote Lisp as if Lisp saved the day. Any good programmer can save the day in any language... Qomp, Ruby, Python, Java, Cee, whatever. I do think that Cee would slow a person down on the web, and I do also think Ada would.]
- [As for the word anal - yes here it is again. Top is once again being an insulting hypocrite, as usual. No worry. I am not offended. If you wish to call some languages anal, then let's equally call Top's languages weak, sloppy, and unprotected aids infested anti sterile diseases]
- I removed that word. -t
Is there a point to your comments?
I don't know what 'productivity' means, but I can classify code touched after a requirements change, and I can focus on giving business rules explicit and centralized representation for easy change, and I can target security and privacy issues and get a provably correct model so programmers don't have to reinvent it themselves each time, and I can ensure that the language provides safety so you can feel comfortable running other people's code on your machine without it breaking stuff, and I can provide SoftwareTransactionalMemory so programmers don't need to reinvent it or deal with unweildy and risky lock-based solutions, etc. These might offer productivity boosts or might not; that would depend on the domain. But I can also put some effort into ensuring they don't hurt even where they don't help (e.g. SoftTyping with warning support over StaticTyping).
Plus, understanding the domain is often a bigger bottleneck than mere software "errors", at least in my neck of the woods. --top
The domain and its EssentialComplexity doesn't go away even for me, but I don't pretend to have a miracle-solution that will cure acne, regrow your hair, and make the world a simpler place. If you're claiming your solution makes the domain itself a better fit for the programmer's head, I'm afraid you'll have a great deal to prove, and that the only viable approach will be that of a fast-talking charlatan. Meanwhile, I'll stick with focusing my language and framework-design efforts on solving problems that are actually solvable.
You again are asking me to objectively prove subjectivity.
If you make any claims about objective benefits of focusing on psychology, such is my right, and such is your burden of proof. If I recall, did you not make such a claim about 'productivity'? And just now about 'learning cycle'?
I have no studies to cite, only anecdotes. But being that there has not been evidence presented either way, I'll take my anecdotes over yours. And, are you claiming that GUI's don't improve the learning curve? Remember the "skeptical before challenge" rule? I know you don't like it, but I hold to it. It is a rational rule. The default is not that "psychology does not matter". --top
So, if I understand top correctly, he has not a leg to stand on, not even a measurable notion of what 'fits the mind better' means, no evidence to support his claim, doesn't even specify the 'domain' over which GUI supposedly improves learning curve (I'll agree that "the GUI helps you learn the GUI", but not that "the GUI helps you learn X" for arbitrary X), etc. Further, top even violates his own statements (not that this is unusual) of "equal or no different until proven otherwise" - if you're going to claim that psychology matters, you've got to prove it. And your 'rules' are completely unacceptable, but I am skeptical of your claims.
Do you really want to argue that users should go back to CUI's? Be my guest...
I said nothing about going back to CUIs. Is it too much to expect you to know the difference between skepticism and disagreement?
By the way, GUI's are a case where fitting the user's head (familiarity to real-world objects) produces a quicker learning cycle on average. (I don't have any stats on that, but I don't think you'll disagree.
Provisionally, I'll agree if you're defining "fitting the user's head" as "almost already knows the subject" as seems to be the case here. OTOH, I think your argument is circular - the only way you could ever show that a GUI better "fits the user's head" would be to demonstrate that it "produces a quicker learning cycle on average".
Please clarify "almost already knows", and also why it would matter. If it helps at the late stage than it still helps. --top
"almost already knows" = "familiarity", and it matters because you specified it in your original claim.
Also, GUI's don't necessarily improve productivity once learned over good older techniques.) Character-based UI's were more efficient and logical in many ways, such as macro building. But user interface has trumped logic. Similar issues apply to table design. We have to interact with our designs and models to maintain them just like a user does with a spreadsheet or Turbo Tax values.
How did you logically conclude that UI trumps logic?
See above. --top (I am no longer using quotes but sig to distinguish my content because quotes are too hard to follow and check in edit mode.)

And, perhaps by SapirWhorfHypothesis, we'll eventually pull some generation of young students out of the mental quagmire that is the dark ages of computer programming. We don't know what 'optimizing psychology' means, so we can't touch it... but if it means making people think in manners more suited to solve the problems they're faced with, THAT we can do because we CAN prove that certain approaches to problems have better properties (efficiency, correctness, opportunities for error, ability to detect error or prove correctness once the solution is written, etc.) than others. Indeed, that's what mathematics is all about.

ACM has even shied away from software design issues because they are very difficult to quantify, generating a full mailbag for them. As software becomes a bigger part of our lives, finding better ways to measure "better" becomes more and more important. Otherwise, charlatans and inadvertent MentalMasturbation will rule the coop.

Ah, you mean the people that shout "I am great!" and offer advice or opinions that lack any reasonable backing and preach faith-based beliefs regarding the future of computing. People like yourself, perhaps. Indeed, having them rule the coop would be terrible. I imagine the metaphorical 'dark ages' would go on and on and on...

If you're willing to let your DisciplineEnvy motivate you, perhaps you should stop thumbing your nose at academics and actually learn the discipline in which you claim expertise. Read papers on research and applied theory. Learn enough to have a decent comprehension of 'System F' and 'Y combinators' and other arbitrary bits of the common vernacular without having to look it up. Actually create and fully implement a statically typed language - even a simply typed one.

Sure, there will always be a bit of aesthetics, artistry, personalization, and changing requirements in software and HCI just as there is for people building houses then adding paint, porches, and swimming pools. But that doesn't make the plumbing, electricity, structural integrity, cost to build and maintain, security, resistance to damage from quakes or insects or water, heating ventilation and cooling, support for information access or cable television, etc. any less of a full and true engineering discipline. There is enough in software engineering to constitute as much an engineering discipline as any other engineer or architect. Aesthetics are important to being a successful architect, but so is actually getting the building up and keeping it there.

You've said you focus on business reports. That certainly puts you much closer to the 'artistry' side than me (who handles data and pubsub middleware, communications infrastructure, safety testing, etc.). I really don't feel much DisciplineEnvy.

I've never used psychological arguments as a metric, just to help illustrate a reason you can't casually dismiss a valid metric that you had been fastidiously ignoring.

What specific objective metric did I ignore?

Specifically, you ignore two entire classes of corrections required for post-column removal of the wide-table solution that simply don't exist in the narrow-table solution: those resulting from sub-case 1 and sub-case 4. I.e. you blinded yourself to problems so that you wouldn't have to think about them.

I'm sorry, but I don't understand what you are talking about. And you won't name the unit of thing(s) you are measuring. If you don't want to clarify it, then it will stay as is.
I've already named what was being measured and described the measurement. I believe you don't understand it because you don't want to. You know... the proverbial hands-over-ears yelling "LA LA LA LA LA! I'M LOOKING FOR ANY EXCUSE TO IGNORE YOU! AH! FOUND ONE! YOU DIDN'T NAME YOUR UNIT! AND YOU DIDN'T CLARIFY IT FOR ME THE FIFTH TIME I ASKED THE SAME QUESTION EVEN THOUGH I IGNORED YOUR ANSWER THE FIRST FOUR TIMES! WOW! THAT IS SO CLEVER OF ME! I'M SO GREEAAAT! LA LA LA LA LA!"
My experience is that wide is easier to work with.
Please let me know when you have sufficient experience with narrow table solutions for that to be a meaningful statement.
Further, clogging up table-name-space harms my MentalIndexability, requiring a longer pause before I add or change tables to make sure its the right table and know what it is doing.
That, of course, has nothing at all to do with a column removal scenario. In any case, I suppose you never mentally index columns, and you must forget entirely about many<->many relationships. Perhaps your mind should be changed.
Further, it's not clear whether your "--" comments apply to stuff before or after them, and you have nothing in between that tells what alternative applies to what.
"AH! ANOTHER GOOD EXCUSE TO REMAIN IGNORANT!"
So you would rather leave your documentation face down in the gutter. You are afraid of clarity, metrics, and the scientific process. "We don't need to optimize no stinkin' factors". I'm done with you. I thought we were moving to something concrete, but that was illusion. Enjoy your impractical anal-retentive designs that give you MentalMasturbation. --top

I suggest giving these metrics working names in this topic to avoid pronouns etc. Give the scenarios specific names and the metrics specific names.

Suggestion noted.

My metric is asymptotic cost of change per scenario, measured in both potential and necessary (potential >= necessary). In practice, this is closer to volumes of code that need changing - not absolute number, but relative portions.

Please clarify. I find the above obtuse. How is it closer to "volumes of code"? How are you determining "necessary"? Are you sure it's necessary? I can't verify that without knowing what you are looking at in your mind's trekkian main screen.

A 'necessary' cost is one you always pay to achieve a specific purpose (in this case to maintain a working application base). It is determined by the definition of necessary and logical analysis, especially of the trivial and obvious cases. E.g. if you remove a column from a table, it is necessary that you track down and remove explicit references to that column from application code and application queries if those queries and code are to continue correct operation. A 'potential' cost is one that might not exist based on one's ability to discipline the code; e.g. if you avoid use of 'select *', there is no risk of paying the 'potential' cost of fixing fragile code that, while never referencing a particular column, breaks when that column is removed; similarly, if you can guarantee that all application code that touches the database is 100% robust and immune to breakage, you don't need to pay that potential cost.

The cost of having to edit code in one solution that would not need edited in the other is a great deal higher by this metric than is the cost of having to delete one additional "d," along with at least one explicit use of said 'd' in application (indeed, the 'd,' is cheap and has no effect on asymptotic cost at all). You are, perhaps, used to thinking in 'finger-movements' so you attempt to translate what I'm saying into it - I suppose if you only have that hammer, everything looks to you like a nail.

That is measurable and separate from psychological issues. I am not claiming it should be the only metric; in fact I claim that psychological issues are the primary issues to be optimized. But we can't objectively measure these.

What does it even mean to "optimize" a "psychological issue"? How do you measure or determine whether you've accomplished it? Can you even demonstrate that it's possible, or are you possibly claiming that you want the impossible?

However, if somebody claims that it can be *objectively* shown that thin tables makes programmers more productive etc., they are obligated to show the objective metrics.

Technically, they're only obligated to offer proof (or what they consider sufficient evidence to convince a reasonable person of sufficient education whose mind is open to change; the ability to convince fools, children, and stubborn fundamentalists is not required). That is what BurdenOfProof means. Objective metrics are just one possible means of achieving such proof. You might believe they're the 'best' metric, but I disagree; I happen to believe that objective metrics THEN require you to prove that you were measuring the right thing in the first place (e.g. what does 'productivity' mean? how do you measure it? how do you distinguish experience from methodology?).

I happen to find logical case analysis to be a stronger and more accurate proof mechanism in most cases. I don't attempt to prove 'productivity', but there is a very wide slew of objective properties that can be analyzed in this manner (including code change analysis after a change, resistance to programmer typographical fault (% chance of locating it at compile time), ability for an application to continue operation in a disrupted network, etc.

I am not obligated to show objective metrics because I believe psychological factors to be the most important.

If you ever claim that 'optimizing psychological factors' provides ANY objective benefit (including productivity, average increased programmer satisfaction, improved readability (unlikely), etc.), then you ARE obligated to show objective metrics. Similarly, if you ever make a claim that says doing any particular thing helps optimize psychological factors, that is ALSO an objective claim that requires objective evidence.

At the moment, your claim is about equivalent to: "I claim programming should be optimized to get us closer to God." I.e. at the moment, you're not obligated to provide objective evidence. BUT, at the moment, your claim is an infantile fancy and wish that has no real meaning whatsoever.

And, psychology is by definition "subjective". Make sense?

Psychology isn't, as a study (even of individuals), wholly subjective. But I can see why you'd think so. Keep in mind that even psychology has its soft (clinical, talk therapy, etc.) and hard (memory analysis, reaction and response, behaviorism) divisions.

You cannot obligate somebody to objectively prove subjectivity. That's like dividing by zero in obligation-land.

You are obligated in reasonable debate to be willing to meet your burden of proof for ANY claim you make. You are obligated, in reasonable debate, to NOT make claims you cannot or are unwilling to prove. So, if as you say, your claims are purely subjective, then you should not be making them - it's like a fundamentalist shouting his faith-driven beliefs on a corner without a scrap of evidence to back up a claim.

In short, if you make objective claims, you need to show objective metrics. The actual choice of metrics is initially yours. But if I can counter them with objective metrics of my own, such as number of statements that need changing, I will. This does not mean I am endorsing a given metric as being important, only pointing out the there are objective metrics that support my point of view. If you want to demonstrate that the weights of my objective metrics should be less than the weights of your objective metrics, be my guest. I would happily welcome such.

Actually, you have burden to demonstrate that your 'counter' metric is a 'valid' counter. As is, your choice of 'counter' metrics thus far has been to drop some classes of problems from one side of the equation to make it balance in the other direction.

The cost of changing a volume of code internally is, in post-deployment situations, often less expensive than achieving the capability to change it (which may require contacting all sorts of people who have written queries for the database).

Agreed. This is why I encourage CodeChangeImpactAnalysis via scenarios if we want objective metrics.

(... Which happens to be exactly what I had started doing before being utterly sidetracked by you.)

In pre-deployment (or non-deployment situations), the cost of changing each volume of code and performing all unit-tests and passing everything is pretty much the sum of the costs (albeit not minor). I do assume that queries are written to go along with application or other code that utilize them. As a general consequence of that, fixing the queries themselves is a tiny fractional cost of fixing the application code, firing up the appropriate development environment, and testing the change via any unit tests or analysis.

Please clarify. I don't know why this would be the case. Code is code. I see no reason to rank SQL code changes lower than the app code changes.

Perhaps you misunderstood; I am not ranking SQL changes lower OR higher than application code changes. However, in all my experience, fixing and testing changed application code - and even firing up the development environment to run the unit tests and the rest - generally requires far more time, tweaking, and testing than merely deleting a column from a query. I.e. code is code, but there is (in my experience) usually a lot more app code to change for each piece of SQL code, and said app code is (again, in my experience) more difficult to fix. The lowest ratio I've ever seen is ~50:50 where the SQL code and app code were about equal in size and change impact difficulty (e.g. just deleting "d," from the SQL, and just deleting "print(d)" from the application), and it typically only gets worse from there. Your experiences may be different. Have you often encountered situations where changing queries are more than fifty-percent of changing the related application code?

Not that it is pivotal; that changes to queries constitute only a 'fractional' cost of the total code change would remain true nonetheless, and thus fixing the query code will never have an asymptotic cost effect unless there are cases where you need to fix queries without touching the application code (which could happen with views, I suppose).

When you keep tooting your horn about a tiny savings in a tiny fractional cost of the total change, I keep rolling my eyes and yelling at you; it's penny-wise and pound foolish. The relative potential cost of having to change application code that breaks that never even touched 'd' is far, far greater than any such savings.

You have not clearly shown any objective "biggies" yet. A one-eyed man is a king in the land of the blind. You are not clear about what exactly you are counting.

I have clearly shown two objective "biggies" for removal and addition of columns (sub-case 1 and 4), but perhaps only the other people with at least one eye who actually face the evidence and analyze can see it. I can't seem to help you with your inability or unwillingness to take the effort to comprehend. Even after many attempts to explain, you keep coming back with: "I still don't get it."

A similar scenario: just because you don't understand a proof of, say, the Halting problem, doesn't mean it wasn't proven. Technically, it is your job to comprehend any attempt at proof well enough to say why it is invalid or unsound, or to ask specific questions for clarification. Repetitions of "You have not shown X" actually put burden of proof on you to prove said claim (that I "have not shown X").

So, please provide your evidence that the issues I have raised are not valid.

Your proofs are cryptic. A drunk perler could have done a better job. I am not obligated to spend 3 days to decipher your mess. I'll even show you how to document it better once I figure out what you were talking about.

Pffft. You just want information to magically be formatted just for your brain to absorb regardless of essential complexity or your weakness at formal reasoning and math. You doom yourself to ignorance with almost every decision you make. Not my problem. I'm no mental slouch, but I'm also no grand genius; one thing I did learn is that doing my homework thoroughly is a very effective way to learn - it often takes me weeks to grasp a concept, showers in the morning, dreams at night, music off in the car, lunch break after lunch break, pencil and paper in hand sketching out scenarios and testing an idea to eventually make it 'click'. You give up before mere 'days' are up if not hours, and you probably quit after similar lack of effort when confronted with the various concepts that would have helped you comprehend what I was saying as I first said it. I may as well be explaining geometry proofs to a person who barely groks the difference between areas and volumes. I suppose I no longer need to wonder why you can't keep up in conversations with people that actually think, learn, and understand for weeks or months before they talk - you aren't stupid; you're just choicefully, due to your arrogance, uneducated. No wonder you've begun to seek magical solutions to make the world and domains you work in simpler, starting with 'EverythingIsRelative'.

If it takes that long to explain it, then very few others will buy into narrow tables also. You are doing a very poor job of selling to "them" also. And there are too many zealots promoting their pet paradigm/technique/tool in an aggressive way. Why should your pet get such thorough review over the others? (Your debate style and accusations against me are a lot like those of the more aggressive OOP fans.) Further, I've used both kinds of designs in practice and found narrow tables wanting. This is why I require relevant numeric metrics first. --top
Now now, what about your little pet, Table Oriented Programming, Mr. Top? And why does it take you so long to explain why your methods are so useful, without you providing any precise third party references that support your methods, other than your own geocities website (which is not third party).
I don't claim it objectively better anymore. It fits the way I think. TOP was fairly popular among other ExBase programmers I encountered over the years. I've seen them use similar techniques to mine. And, you didn't really address the issue of poor salesmanship/documentation on your part. --top
{You are not talking to the same entity here. And I think issues of 'salesmanship' should wait until AFTER you have proven the technique. The mark of a true snake-oil salesman is someone who sells first and proves never. Even when you're saying your technique fits 'your' mind better, can you prove it?}

Grow wiser, TopMind. Become an EternalStudent instead of a HostileStudent. If you have some extra time, consider taking a few college courses that will challenge you, both to learn something and to knock that ego of yours to a size you can better manage. Grab a book on type theory or category theory and read it. Actually do the exercises at the end of each chapter instead of lazily and arrogantly pretending you could if only you wanted to - and don't fear being wrong so much that you can't stand to test your answers and face truth.

I'm not your damned student. You should learn why science is important so that you don't mistake clever ideas for actual results. The problem is you, not me. --top

And don't hesitate. At the moment you have nothing at all to contribute - not unless you can figure out some objective and formal approach to 'optimizing psychology'. For now, I'm going to stop acknowledging you exist until such a time as you open your mind, kick your ego into submission, and start using your real name. I waste too much of my time explaining stuff to you when you either aren't ready to understand it or simply don't want to try.

"I'm not your damned student." --top

{Yes, but you should be.}

Top as my student? I'd feed the man hemlock at first opportunity. No, no, it would NOT be conducive to his health were he to become my student. I only asserted that he should become a student - someone who continues studying and learning.
Violence solves all problems, eh?
Violence solves some problems.
Murdering people who don't agree with you?
Alleviates headaches and optimizes psychology.
I see.

"You should learn why science is important so that you don't mistake clever ideas for actual results." --top

{And you, Top, should learn why the other components of science -- logic, mathematics, theory, and models -- are important so you don't mistake personal opinion for theoretical foundations.}

They are only ingredients, NOT the muffins that come out of the oven. You have a problem understanding this.

"The problem is you, not me." --top

{No, it's you.}

[No, it's me. --bottom]

Just to bring some closure to this, if change counts suggested in CodeChangeImpactAnalysis were done for both table styles, do you feel that thin tables would score noticeably or significantly better? Myself, I am skeptical. I believe it would be roughly even. --top

[ This doesn't make sense that they are roughly even. If the normalized solution is roughly even then it has already won - because normalized tables can be queried more modularly, maintained easier (and a whole bunch of other advantages, history repeats itself (flat files versus databases)), and hence it isn't even if you say it is roughly even. Maybe a bit hard for you to grok. It's like saying if I have two choices of cars, and they both cost the same money - they are even. Except one of them, has better organizational compartments on the front dash and in the trunk. Which one to pick. They are even, surely.]

To be perfectly clear: for post-deployment refactoring, my opinion is that narrow table solutions will score much better in some situations, some of which have been described. Further, they'll score no worse in all other situations. The combination of these is strictly in favor of the narrow-table solution for CodeChangeImpactAnalysis. Additionally, narrow-table solutions offer greater flexibility with regards to meta-data and a wide array of other features that cannot be effectively achieved in the wide-table solutions. These additional features affect CodeChangeImpactAnalysis for all cases where these features suddenly become desirable, and they further affect the basic value of the solution (more options, more flexibility, at no significant cost). The main costs the narrow-table solution has is a practical requirement for a few additional optimizations, and a practical requirement for more advanced TableBrowser utilities.

Most of these are unsubstantiated "brochury" claims in my opinion. Maybe one of these days I'll go about documenting change metrics in more detail than our first effort to see if there really is a numeric advantage. At this point, I couldn't find any after reviewing your example a third time. You appear to be shifting toward psychological assumptions but not realizing it. If I am simply too dumb to understand your writing, as you flamefully suggest, then it will remain that way and I'll have to reinvent the wheel to see for myself. (And maybe teach you documentation techniques in the process.) Further, asterisking provided at least a few areas of objective numerical advantages, and thin-tabling makes asterisking difficult. --top

You requested an opinion in closure, you offered yours, and you received mine. I have already indicated why your point on 'asterisking' was a non-advantage at best and a strict disadvantage at worst in the scenarios we covered, and this is not the place to re-issue your arguments.

Let's review:

 // Snippet A - wildcarding
 qry = select * from ...; // 1
 print(qry.d); // 2
 ...

 // Snippet B - explicit column
 qry = select ...d... from ...; // 3
 print(qry.d);  // 4
 ...

If column "d" and all references to it are removed, then snippet "B" needs to change statements 3 and 4, while "A" only needs to change statement 2. Thus, wildcarding objectively reduces the number of statements that need changing for this scenario. A similar situation plays out for adding. (If this is not the place to re-issue my arguments, then I'm open to suggestions about where the place is.)

Snippet A does not offer any algorithmic cost advantage over Snippet B,

"algorithmic cost advantage" is not the metric being applied here. (Not sure what it means anyhow.)

Please educate yourself on algorithmic and asymptotic costs. It is a subject with which anyone discussing methodology in computer science should be intimately familiar.
If you mean like BigOh and machine performance, that is a different group of metrics.
Yes, BigOh is an example. BigOh has no intrinsic association with machine performance. Algorithmic and asymptotic costs are a much more appropriate group of metrics when dealing with high-level analysis of different methodologies.

and you've not covered all wildcarding scenarios;

No, because it is focusing on a specific scenario on purpose. If we did this full-out, we'd have lots of scenarios.

The specific snippet you offer is, as a consequence of your decision to avoid case-analysis, simply not representative of the column-removal scenario. You cannot derive valid conclusions from it alone.
What do you mean by "case analysis"? Your approach mixed in too many issues at once, making the discussion messy. I am not claiming it necessarily related to your discussion. I was simply showing that numeric-metric scenarios that favor wide tables do exist.
You cannot show that the metric actually favored wide-tables unless you show that your example was representative. To show that an example is representative, use case analysis (http://en.wikipedia.org/wiki/Case_analysis). Your example, in this case, is not representative, and therefore you have failed to show that the numeric-metric favors wide-tables.
I disagree. But, how about we revisit frequency (likehood) later.
How about people consider your case invalid until you prove it is representative?
And if you wish to show an example of a 'numeric' metric that favors wide-tables, simply use "average number of facts packed per row". Do not forget that you must ALSO prove that your numeric-metrics are valuable and meaningful. Numeric metrics are not useful simply by virtue of being numeric.
"Facts" is more of a psychological issue. We should avoid that if possible. As far as whats useful, we can assign weights to that later also rather then mix it up with this.
By what objective measure do you qualify "Facts" as a psychological issue? I happen to consider 'conjunctive normal form' and representation to be very formally definable. And "we can assign weights later" seems like an invitation to create a billion completely arbitrary metrics because there is no need to justify the metric "now". Is that what you intend?
See bottom of CodeChangeImpactAnalysis. And, how does facts-per-table or whatever you propose relate to "goodness"? It's pretty obvious that the fewer statements that have to be changed per scenario is a good thing.
Are you ready to prove that 'fewer statements changed for one specific scenario' = 'fewer statements changed per scenario overall'?
No, because it may be impossible to document all possible scenarios. Both sides are permitted to submit scenarios. This is typical in the practice of science because nobody can know if they've discovered all possible hypothesis. Thus, the known hypotheses proponents compete with each other.
Obviously you don't comprehend case analysis - I've never seen a situation where it is not possible to document all scenarios by documenting 'X' then documenting 'NOT X' for some predicate X. And those two cases can often profitably be further analyzed. So it is, at least, USUALLY possible to document all possible scenarios. If you believe this "may be" a special exception, I'd like to see your evidence, and if you're just tossing out "may be"s like some sort of intellectual FUD, then you deserve a kick to the teeth. And, yes, you are free to submit a random scenario, but don't forget: garbage in, garbage out. I'll stick with using case analysis to ensure I don't have garbage. You're free to continue creating specific scenarios that favor your world view to ensure you DO have garbage.
It is possible to know all possible code combinations? No way. That's like saying its possible to test the text of all possible poems. Perhaps one can assume that every poem will fall into a given class of scenarios, but doing that thoroughly is probably not possible for a human. -t
It is possible to know all possible code combinations in the same way it is possible to know all integers. It isn't tractable to enumerate them, but there is rarely a reason to enumerate them - almost any analysis one might wish to perform can be performed after identifying a few representative examples that provably match (usually via construction) on pivotal abstraction requirements, or may be performed (if the case has clear properties) upon the abstraction itself.

you've once again ignored the issue:

// Snippet A.2 - wildcarding qry = select * from ...; // 5 ... application code without 'd' ... // 6

Given that fragile application code can and does exist, statement 6 can break even though 'd' is not used, thus objectively increasing the potential cost-of-change to the entire set of queries using 'select *' rather than just those queries with application code that uses column 'd'.

Perhaps, but that's a different scenario. And, not a common one in my experience.

Same scenario (column removal), different snippet.
The only time I remember seeing that happen is when somebody uses column position integers to index columns. It's quite rare IME. But that's a "crash" metric, not a number-of-statements-changed metric. I should have said, "different metric". I apologize.
Are you saying that you don't believe you need to change statements to fix crashes? And can you assert with confidence that your 'experience' is representative? I don't believe it is.
I am focusing one on metric at a time here in an attempt to avoid ThreadMess. I'm not measuring crashness (yet). And, no, my experience may not be representative, but without hard surveys, that's all we have either way. You mentioned C/C++. Perhaps C shops do it more because associative arrays are not built into the language and those used to thinking in terms of machine performance, which C programmers often are, may also avoid associative arrays for performance reasons. C/C++ is generally not that common for custom biz apps anyhow.
So you believe relational databases and the issue of wide vs. narrow tables is only relevant in the context of custom biz apps?
- That's the nature of the vast majority of my experience. I won't speak for other domains. They may have different habits and needs that changes stuff.
- I'm under the impression that your statements of 'in my experience' should be amended with a humility modifier, such as 'in my extremely limited experience', lest you continue to sell false impressions.
- Congratulations on working in IT for 400 years. --top
- Heh. Ten years of doing the same thing repeatedly each year = one year of experience repeated ten times. IIRC, you do custom business reports over and over again. When you start using relational and tables for a wide and far more representative variety of purposes, perhaps your 'experience' will be more meaningful. Today, it is extremely limited.
- I didn't say I only did reporting throughout my career. But why should it matter? Don't thin tables make EVERYTHING better? --top
- It is dishonest to imply you possess a variety of experiences when you do not.
- Now where the fluck did I exaggerate my experience? You drop accusations like a pigeon with diarrhea. -t
- A better question is: where haven't you? Whenever you appeal to your 'experience' as a reason something doesn't apply when the discussion isn't explicitly limited to topics within your experience, you exaggerate your experience. Where haven't you done this?
- I've described my niche many times over the years on this wiki. If you missed it, that is not "dishonesty" on my part. You complain when I repeat myself and you complain when I don't. I'd hate to be married to you. You invent problems, exaggerate problems, or both. -t
- And I have never said that narrow tables make everything better. They make some things better, and, given a particular optimization, a DML with very terse joins, and a good TableBrowser, they make every other thing no worse. I don't believe there are any measurable or qualifiable tradeoffs, while there are some very objective advantages (especially concerning meta-data and security, and to a lesser degree in the elimination of many code-change costs). If there are any 'psychological' tradeoffs, it's that old codgers like you would need to learn a new tool... perhaps due to having used ToolsThatTeachPoorHabits such as PrematureOptimization and ignoring code-change cases because they're inconvenient to one's worldview. You don't want any risk of change because it hurts your 'MentalIndexability', which was shaped by the tools and methodologies you already know (a bit like saying that a certain strategy at chess is better because it's the only one you can remember).
- We all have biases based on lots of things. Related: MindOverhaulEconomics. And, I think it is rational for people to stay with what they are comfortable until someone objectively proves a net advantage. You guys have been UNABLE to produce any significant scenarios where wider tables cause significant problems. Your justifications are very round-about and indirect. That's usually the sign of a charlatan or a zealot. You exaggerate personal accusations, which suggests you also exaggerate technical benefits. -t
- We have been able to prove net advantages... whole pages of 'evidence' to which you plug your ears and shout 'LALALALALALA! that's just evidence!'. There is a reason that 6NF is needed for bi-temporal databases. And sub-cases 1 and 4 from NormalizationRepetitionAndFlexibility ARE significant scenarios where wider-tables cause significant problems even in your favored little metric and subject, that being code-change cost; perhaps I'll finish filling out the table and providing more. In any case, what works for you is fine, but the moment you PROMOTE your method, MindOverhaulEconomics IS NOT AN EXCUSE. I.e. you can stay with what is comfortable for you, but the moment you suggest it to someone else you are no longer in a rational position to claim MindOverhaulEconomics, since at that point you are not humbly claiming: "I don't know if this is good, but here is what I do."; instead, you are in the process of trying to change someone else's mind.
- Your evidence is weak. You simply exaggerate it in your mind like all your other exaggerations. You have a penchant for exaggeration for everything and thus have no credibility. And, psychology matters. Go defend CUI's if you say otherwise. People will just laugh at you. And, I don't necessarily disagree with all higher normalizations. The biggest issue is using it merely to rid blanks/nulls. -t
- UserInterface matters, and I don't deny that aesthetics get their share of the architecture and engineering considerations. But psychology doesn't 'trump' anything else, and it really only matters at the point of human interface. Some benefits even of GUI in certain domains have little to do with 'soft' psychology, and everything to do with communications bandwidth and latency (pumping more information to the user, faster receipt thereof, and allowing the user to perform low-latency inputs back to the computer) which, while related to psychology in terms of how much the brain can usefully absorb, is not at all 'soft' or 'subjective'. And while this is somewhat off-topic, you display remarkable ignorance when you essentially imply that people who believe only in hard-logic and hard-science would be unable to find hard-logic and hard-science reasons to use the GUI.
- I'd like to see you try. The closest thing would be to test the learning curve for a new app with a random user base. But obviously that is beyond the scope of wiki analysis. -t
- Eh? These things are already well-studied (in psychology) and have nothing to do with "new app" or learning curve. Ever try to play a shooter while having to type "aim up ten degrees", "fire"? I have... it involved throwing bananas. But it certainly wasn't low latency or high human-to-machine bandwidth. And I seriously doubt I'd want to play command-line solitaire with the card layout described in paragraph after paragraph of text, because that would be very low machine-to-human bandwidth and latency.
- I am skeptical. Citation? -t
- I can't think of any papers off-hand. However, I did take part in such a study as an undergrad, I remember reading about several such studies as an undergrad, and a team in my workgroup is handling a study on exactly this (latency, bandwidth, along with concerns for ambiguity) right now wrgt use of a dataglove as a possible mechanism for communicating with robotic vehicles.
- And I'd use narrow-table solutions to get rid of blanks/nulls, to eliminate certain cases of high change-expense post-deployment allowing for much more publicly accessible databases, to allow for improved meta-data and security, and for all the various reasons listed under NormalizationRepetitionAndFlexibility. All at the same time. Objectively better for many small things (sometimes a great deal better) and not measurably or qualifiably worse at others (with proper tools) = better overall... objectively. "[... If the normalized solution is roughly even then it has already won - because normalized tables can be queried more modularly, maintained easier (and a whole bunch of other advantages, history repeats itself (flat files versus databases)), and hence it isn't even if you say it is roughly even... ]" And if inflexible persons like yourself aren't willing to accept the mind overhaul for subjective reasons that mostly have to do with the habits you've already grown comfortable with, I really wouldn't care - but you telling other people it's a bad thing or spreading FUD to protect your ways (what you called 'fish brained' protecting of food) would be a different matter; with that, I would not abide.
- I reject the flat-files comparison. Decent wide tables are not objectively more OnceAndOnlyOnce than thin ones. Repeating that myth does not make it so. And I have worked with thin table design enough times to know it is not merely a familiarity thing. They were as annoying the first time as the last. And the assumption that all extra possible modularity (splitting and splitting) is always good is not proven. -t
- Decent wide-tables are, indeed, not objectively more OnceAndOnlyOnce than thin ones; one can have wide-tables up to 5NF and represent every fact just once. And the reason you find narrow-table solutions 'irritating' likely has a lot to do with poor DMLs and TableBrowsers for it. And the assumption that low-cost-of-change (mutating and mutating) is always good is ALSO not proven... but has just as much support as modularity - databases, especially, ought to be highly modular because data often outlives the applications using it and see many new uses over their lifetimes.
- I never claimed "always" or "never" and agreed that it depends on the situation/domain and agreed that I may be more open to the idea of thin tables if tools and DBA practices better supported it. But existing RDBMS seem geared toward wider tables, or at least "encouraging" one to pick one or the other. -t
- You often claim "always" or "never" implicitly and explicitly, and, indeed, did so explicitly in your last paragraph. And, yes, existing business RDBMSs are geared towards wider-tables, partially due to the rather voluble SQL syntax, partially due to the lack of dedicated optimizations, and partially due to the common cycle wherein the tool-makers don't design to support what users aren't using and users don't use what their tools aren't designed for - a cycle that often perpetuates the existence of sub-par tools and operating-systems, that forces backwards compatibility issues and makes messes of APIs, etc.
And do you believe that a code-change impact metric is valid and reasonable if you or I can arbitrarily (e.g. "to avoid a ThreadMess") exclude code-change issues from it? And have you actually avoided the ThreadMess after all the complaints you've received for having chosen an incomplete metric?
What did I exclude that's eating you?
What has been mentioned several times as you having excluded? It just might be that... the whole case analysis, sub-cases 1 and 4, etc. You've been told several times now; if you had bothered reading and comprehending, you'd already know.
I complained it was not clear and you blew me of with insults. -t
No, you complained it was not clear, so I did some work to clarify it, then you complained it was not clear seemingly without having bothered to look, so I did some more work to clarify it, then you complained it was not clear again, and then I said I had clarified it and I expect you to do YOUR share of the mental effort, then you insulted me, saying I was afraid of clarity and essentially said that you have no obligation to do any thinking, that you fully expect such clarity that everything clicks for you the first time you read it and anything less is obviously my fault because you believe yourself so great, and THEN you fully deserved a slap to the face, but the best I could give is an insult. Considering the baffled reaction you had to such simple a common logical analysis form as case analysis, I really don't think it is my fault that you find everything else 'unclear'.

And you shouldn't re-issue your arguments... what ever happened to 'OnceAndOnlyOnce' and 'DontRepeatYourself' that you were, in CriticizeDiplomatically, offering as reasons that particular forms of discussion are 'bad style'? Please resist such hypocrisy.

I was talking about name-calling there, not examples. And I restated in a slightly different way because it appeared to be disputed or not understood. This is proper in my book. In fact I've encouraged it around here.

Do I understand that you are advocating repetition of examples and arguments based simply upon dispute? I.e. that people should repeat their arguments as a response to disagreement? Wow. That's a recipe for "Did not!" "Did too!" exchanges, especially if the GoldenRule was followed. You should amend your book to allow for restatement only in cases of obvious misunderstanding or requests for clarification.
Past requests for clarification were ineffective with you (or somebody with similar style/opinions). You just accuse me of being dumb or lazy for not understanding the first one. So I tried something different this time: restated it in a different way. Quite rational. And if I rephrase it differently, it is not technically OnceAndOnlyOnce, at least not in a pure sense.
You restated your ideas in the same way - see: NormalizationRepetitionAndFlexibility, search for 'Snippet C' and 'Snippet A' respectively. I was able to piece together what you meant then despite the utter lack of context or explanation (GoldenRule - spend at least as long trying to figure out what they said as you'd want them to spend figuring out what you said). As here, I said there that you were skipping relevant cases.
First, somebody complained about the original examples. Quote: "Honestly, top, your example needs considerable work. What's with having two queries each time?" Second, the discussion of wild-cards was too far away from the examples. As far as skipping cases, that's not a problem. We can add them later.
"need work" does not mean "state them again, later, on a completely different page". It means 'fix them'. And proper examples from case-analysis were provided after which the wild-cards were discussed.
They didn't state that explicitly. And it was spread all over. I chose to rework it. Stop complaining. --top

Besides, I was just helping 'bring some closure to this'; I certainly don't plan to get involved in a repeat performance of your argument again and again because you keep feeling need to repeat yourself. But if you must, certainly don't fire up new arguments or repeat old ones (complete with structure) 'in review' or 'to bring some closure'; it is remarkably poor style, and somewhat impolite if intentional.

I have a right to change my mind about closure. I was hoping for a simple "yes" answer about the metrics based on prior statements, but got an unexpected response, so had to change.

Your excuses don't make it better style or any more polite.

I find it a valid "excuse". And to be frank, it is too minor a thing to nit about in my opinion. I've suppressed about 90% of the complaints I wanted to make about you because it would become a nagfest if I did otherwise. For example, your usage of "excuse" is inflammatory and unnecessary. But I didn't mention it (except as an example later). It appears to me what you are doing is trying to find something to complain about, anything, as revenge for things that upset you in prior debates. I cannot read minds and so don't know for sure. But, that's my working guess. --top