XHTML, A Better, Cleaner HTML

Chiggins - 2007-12-01 18:29:55 in HTML
Category: HTML
Reviewed by: Chiggins   
Reviewed on: Dec 06 2007

XHTML (Extensible Hypertext Markup Language) is a stricter, cleaner version of HTML. It became a W3C Recommendation on January 26, 2000, is intended to gradually replace HTML, and is supported by all new browsers. Many pages on the web have become cluttered and messy. Here is an example:

<HTML>
<head><title>Page Title</TITLE>
<body><h1>Header 1</body>
</html>

What is wrong with this code:

<HTML> tag is capitalized
</TITLE> tag is capitalized
Missing </head> tag
Missing </h1> tag

Although these mistakes may seem trivial, they are sloppy work, and pages like this may display incorrectly once HTML is fully replaced.

XHTML is based on the HTML 4.01 standard. The main requirements for XHTML markup are:

Being Properly Nested

<b><a href="http://www.google.com">Google</b></a>

The code above is wrong because it is incorrectly nested. The <b> tag opens before the <a> tag, but at the end the <b> tag also closes first and the <a> tag closes last. It should look like:

<b><a href="http://www.google.com">Google</a></b>

As you can see, the first tag to open is the last to close, the second tag to open is the second to last to close, and so on. The same holds with three levels:

<h1><b><a href="http://www.google.com">Google</a></b></h1>
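
Because XHTML is an XML application, any XML parser will reject badly nested tags. The sketch below assumes Python's standard xml.etree.ElementTree module, and the helper name is_well_formed is made up for this example. It checks only XML well-formedness, not validity against an XHTML DTD.

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup):
    """Return True if the markup parses as XML, False otherwise."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


# The two snippets from the nesting rule above.
wrong = '<b><a href="http://www.google.com">Google</b></a>'
right = '<b><a href="http://www.google.com">Google</a></b>'

print(is_well_formed(wrong))  # False: </b> arrives while <a> is still open
print(is_well_formed(right))  # True
```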

Being In Lowercase

This rule is straightforward: all element and attribute names must be lowercase.

This is wrong:
<TITLE>Page Title</TITLE>
<DIV ALIGN="CENTER"><B>This is not accepted</B></DIV>
This is correct:
<title>Page Title</title>
<div align="center"><b>This is accepted</b></div>

Having Ending Tags

One of the most common errors in web coding is leaving out ending tags. In XHTML, every element that is opened must be closed.

This is wrong:
<title>Page title
This is correct:
<title>Page title</title>

Now, sometimes you might be using the <img> tag, or the <hr> tag. These are empty elements with no ending tag, so what do you do? The tag must end with />. Put a space before the slash so that older browsers handle it correctly.

This is wrong:
<img src="thingy.gif" title="Thingy">
This is correct:
<img src="thingy.gif" title="Thingy" />
<hr />
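
If you generate markup from a program, an XML serializer writes the /> form for you. This is a small sketch assuming Python's standard xml.etree.ElementTree module; the variable names are only for illustration.

```python
import xml.etree.ElementTree as ET

# Build two empty elements: they have no text and no children.
img = ET.Element("img", {"src": "thingy.gif", "title": "Thingy"})
hr = ET.Element("hr")

# encoding="unicode" returns a str instead of bytes.
img_markup = ET.tostring(img, encoding="unicode")
hr_markup = ET.tostring(hr, encoding="unicode")

print(img_markup)  # <img src="thingy.gif" title="Thingy" />
print(hr_markup)   # <hr />
```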

Having Root Elements

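This one is also easy. Every element must be nested inside the single <html> root element. Any other element can have child elements, as long as they are properly nested inside their parent.

<html>
<head> … </head>
<body> … </body>
</html>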
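
The root element rule can also be checked with an XML parser, which refuses a document that has more than one top-level element. This is a sketch assuming Python's standard xml.etree.ElementTree module; the sample markup and variable names are made up for the example.

```python
import xml.etree.ElementTree as ET

with_root = (
    "<html><head><title>Page Title</title></head>"
    "<body><p>Hello</p></body></html>"
)
# Same content, but head and body are both at the top level.
without_root = (
    "<head><title>Page Title</title></head>"
    "<body><p>Hello</p></body>"
)

root = ET.fromstring(with_root)
child_tags = [child.tag for child in root]
print(root.tag, child_tags)  # html ['head', 'body']

try:
    ET.fromstring(without_root)
    has_single_root = True
except ET.ParseError:
    has_single_root = False
print(has_single_root)  # False
```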

There are a few more rules for a cleaner, XHTML-friendly syntax.

Quoted Attribute Values

All attribute values must be surrounded by quotes (""), even when the value is a number.

<img src="thingy.gif" />
<table width="95">

No Attribute Minimization

HTML lets some attributes be minimized, that is, written as a bare name with no value. XHTML does not allow this: the attribute must be written out in full, with its own name as its value.

This is wrong:
<input checked />
<hr noshade />
<input disabled />
This is correct:
<input checked="checked" />
<hr noshade="noshade" />
<input disabled="disabled" />

Here is a list of the previously minimized attributes in HTML, and what they should be in XHTML.

HTML        XHTML
compact     compact="compact"
checked     checked="checked"
declare     declare="declare"
readonly    readonly="readonly"
disabled    disabled="disabled"
selected    selected="selected"
defer       defer="defer"
ismap       ismap="ismap"
nohref      nohref="nohref"
noshade     noshade="noshade"
nowrap      nowrap="nowrap"
multiple    multiple="multiple"
noresize    noresize="noresize"
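
Both attribute rules (quoted values and no minimization) are part of XML syntax, so an XML parser can catch violations. The sketch below assumes Python's standard xml.etree.ElementTree module; the helper is_well_formed and the labels in the results dictionary are made up for this example.

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup):
    """Return True if the markup parses as XML, False otherwise."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


results = {
    "unquoted value": is_well_formed("<table width=95></table>"),
    "minimized attribute": is_well_formed("<input checked />"),
    "quoted value": is_well_formed('<table width="95"></table>'),
    "expanded attribute": is_well_formed('<input checked="checked" />'),
}

for label, ok in results.items():
    print(label, "->", "accepted" if ok else "rejected")
```

The first two snippets are rejected and the last two are accepted.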

The id Attribute Replaces the name Attribute

In HTML 4.01 the name attribute is defined for the a, applet, form, frame, iframe, img, and map tags. In XHTML, the name attribute is deprecated for these tags, and we use id instead.

This is wrong:
<img src="thingy.gif" name="thingy" />
This is correct:
<img src="thingy.gif" id="thingy" />
To compensate for older browsers, use both:
<img src="thingy.gif" id="thingy" name="thingy" />

The lang Attribute

In XHTML, the lang attribute can be used on almost every tag. It gives the language of the tag's content. If we use the lang attribute, we must also use the xml:lang attribute with the same value.

<div lang="en" xml:lang="en">Hello!</div>
<div lang="es" xml:lang="es">Hola!</div>
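
To check that the two language attributes match, you can read both with an XML parser. This is a sketch assuming Python's standard xml.etree.ElementTree module, which reports xml:lang under the reserved XML namespace; the function name lang_attributes_agree is made up for this example.

```python
import xml.etree.ElementTree as ET

# The xml: prefix is always bound to this namespace, so the parser
# stores xml:lang under this expanded attribute name.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def lang_attributes_agree(markup):
    """Return True if lang and xml:lang are both present and equal."""
    elem = ET.fromstring(markup)
    lang = elem.get("lang")
    return lang is not None and lang == elem.get(XML_LANG)


both = lang_attributes_agree('<div lang="en" xml:lang="en">Hello!</div>')
only_lang = lang_attributes_agree('<div lang="es">Hola!</div>')

print(both)       # True
print(only_lang)  # False: xml:lang is missing
```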


The XHTML 1.0 standard defines three Document Type Definitions (DTDs): Strict, Frameset, and the most common, Transitional. The DOCTYPE declaration must be at the start of every XHTML document, and because it is not an XHTML tag, it does not need an ending tag. Here is a minimal XHTML document with a Strict DTD.

<!DOCTYPE html
PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head><title>Simple XHTML Page</title></head>
<body><p>This is a simple, minimal XHTML page.</p></body>
</html>

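To declare a DTD, use one of these:

Strict DTD
<!DOCTYPE html
PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
Frameset DTD
<!DOCTYPE html
PUBLIC "-//W3C//DTD XHTML 1.0 Frameset//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-frameset.dtd">
Transitional DTD
<!DOCTYPE html
PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">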

Each DTD has its own use. Use Strict when you want very clean markup, free of presentational clutter; it works great with CSS. Use Transitional when you need some deprecated HTML features or have to support browsers that don’t understand CSS. Lastly, use Frameset when you want a page that uses frames to partition the browser window.

In summary, XHTML is not really a completely new form of web coding. It is a set of rules and guidelines to help you write clean, valid code. Now that you have read this and thought “Oh my, I have been a slob all of my life”, go out there and clean up your web pages.