Valid Html

HyperTextMarkupLanguage that is valid according to the WorldWideWebConsortium specifications.

The official HTML home page, with specifications, is at: http://www.w3.org/MarkUp/. The W3C offers an online HTML validator at http://validator.w3.org/.


In 2001 I wrote a thesis named "How to cope with incorrect HTML", available at http://elsewhat.com/thesis/ . It deals with the HTML standard and how a browser can parse valid and invalid HTML. Perhaps the most interesting part of this thesis is a validation effort of 2.5 million HTML pages (XHTML was not considered at the time) taken from http://dmoz.org/. It showed that only 0.71% of all HTML pages were valid, and describes in some details the different errors the pages had. You can jump directly to this chapter via http://elsewhat.com/thesis/pages/?nr=81 , which shows each page of the thesis as a separate image. -- DagfinnParnas

(see OpenDirectoryProject for dmoz.org )


But there are lots of versions of HTML, and unfortunately, many who write HTML exclude or use the wrong DocType at the Head of their document. Thus confusing the browser as to what it is meant to be rendered.

Fortunately, the WorldWideWebConsortium have got a version of HTML, which I personally think should be considered finished. This version is XHTML 1.0 (for some reason they then released 1.1!). Now, the efforts of the W3 should be concentrated elsewhere, particularly in the ExtensibleMarkupLanguage area.

If everyone can be persuaded to write XHTML 1.1 for the Web and all browsers support it wonderfully, (nearly) everyone will be happy.

-- MatthewTheobalds

Apparently XHTML 1.1 is descended specifically from XHTML 1.0 Strict (see: http://www.w3.org/TR/2001/REC-xhtml11-20010531/changes.html ). However, I agree with you about XHTML 1.0 being considered finished, barring one aspect: it doesn't support the onmouseover attribute for anchors (although CSS now handles this). Whilst admittedly onmouseover might not be useful in the pure-information-SemanticWeb sense, it's certainly still very useful. The upshot of this is that I can't build sites using XHTML if the design I'm given specifies for even a simple rollover effect, if I want the code to validate successfully. On the issue of DocTypes, I believe that MicrosoftInternetExplorer (for Mac, at least) has a mechanism that allows it to cope with incorrect specifications (reference, anyone?), but I don't know about other browsers.

Try this: http://www.hut.fi/u/hsivonen/doctype.html, and recheck the DocType page for an update. Also read this: http://www.hixie.ch/advocacy/xhtml about not using XHTML when well-formed HTML 4.01 does the job, at least until browsers are properly XML-based -- DaveEveritt

What I said about the DocTypes was probably a bit snotty. I didn't actually think that browsers cared or listened to what they said anymore. I hadn't noticed that about the onmouseover though, it seems a bit strange to leave that one out. Perhaps it is part of a more cunning ploy to make the web simpler and more information-centric? Which can surely be only a good thing. -- MatthewTheobalds

Judging from the W3C's decision to have it descend from XHTML strict, I think you may well be right, and yeah, it can only be a good thing. However I must confess to being quite fond of little rollover effects on navigation - MeaCulpa.

The proper way to do a rollover in (X)HTML is with CascadingStyleSheets' :hover pseudo-selector.

The wrong DocType can cause some browsers problems. For instance, using a 4.0 Transitional DocType while using 'name' within the image tag will cause some versions of InternetExplorer to occasionally fail to load images. Changing to 4.01 Transitional resolves this, as 'name' is included in the image tag. In this case, IE is only trying hard to do the right thing, but is trapped between being forgiving with bad HTML (most of the web) and good standards (which means most bad pages wouldn't load). Just for the record, I tried rewriting a page as XHTML and it worked fine in both IE and NetscapeNavigator 4x browsers on an AppleMacintosh. I have a good guide to choosing the right (pre-XHTML) DocumentTypeDefinition, which is abbreviated for quick reference under DocType. Basically, leave it out if in doubt and let the browser decide. It won't validate, but it will allow picky browsers to parse the file. -- DaveEveritt

MozillaBrowser will lay out the page quite differently depending on the DocType declaration or absence. In quirks mode, it will behave like NetscapeNavigator 4.x. See http://www.mozilla.org/quality/help/bugzilla-helper.html -- StevenNewton

It just shows that the recommendation to include the correct DocumentTypeDefinition is a good one, but people have got so used to sloppy HTML (and visual web page editors have allowed them not to care) that it's often completely ignored and browsers are just left to cope as best they can. I think all docs should be validated with a correct DTD; the web would speed up and browser developers would be happier. But it just isn't going to happen as long as browsers have to cope with TagSoup? -- DaveEveritt


Whenever I write a web page, I run it through Dave Raggett's HTMLTidy, available at http://www.w3.org/People/Raggett/tidy/, an OpenSource project by the W3C HTML lead. It's very good, and it handles DocType declarations. -- EricJablow

HTMLTidy is good. On the Mac, I habitually use the BbEdit or (more recently) Coda inbuilt validation. I have only once been forced to upload invalid pages, while trying to cover how the two main browsers handle borders and spaces between frames in one document without resorting to an irritating branch-for-browsertype hack (I hate frames and no longer use them). -- DaveEveritt


This Wiki's HTML

WikiWiki's generated HTML looks like

  <hr>
   Opera is the browser 8<snip >8... version 6.01.
  <p></p>
where the two p tags come from a blank line in the edit window.

If there were an XHTML DocType at the top and Wiki generated <p> </p> it would be more nearly right :-) -- DaveGarbutt

Hardly. Note also that WikiWikiWeb pages (a) don't declare a DocType, (b) don't have a <html></html> pair(!), (c) capitalize the <ul></ul> tags, (d) have no </li> tags.

But actually the start and end tag for html and the end tag for li are optional meaning that it is valid according to the HTML standards , though not XHTML.


CategoryWebDesign CategorySemanticWeb


EditText of this page (last edited July 13, 2012) or FindPage with title or text search