Fixing XHTML MIME types could break your JavaScript

I’m on a multi-staged mission to do a full, careful, from-the-ground-up rewrite of a client’s website. This time I’m trying to do everything right and apply all the advanced web solutions that I know about.

One of the things I decided I wanted to optimize was the Apache response header settings: things like gzipping HTML, CSS and JavaScript content where possible, making sure MIME types were appropriate, etc.

There was an excellent article posted on IBM’s website about Configuring Apache to send the right MIME type for XHTML. It showed how to configure Apache so that modern browsers would receive the HTTP header Content-Type: application/xhtml+xml while the old broken Microsoft browsers like IE6 would get “text/html”. I made the appropriate changes and confirmed that Apache was serving my HTML pages, all of which were written in XHTML 1.0 Transitional, with the correct MIME type.
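The decision the server has to make can be sketched in a few lines. This is not the Apache configuration from the IBM article; it is a minimal JavaScript illustration of the same idea, assuming the choice is based on the browser's Accept request header. The function name pickContentType is hypothetical.

```javascript
// Hypothetical helper (not from the IBM article): decide which Content-Type
// to send for an XHTML page, based on the browser's Accept request header.
function pickContentType(acceptHeader) {
  const XHTML = "application/xhtml+xml";
  const HTML = "text/html";

  if (!acceptHeader) {
    return HTML; // no Accept header at all: play it safe
  }

  for (const entry of acceptHeader.split(",")) {
    const parts = entry.split(";");
    const type = parts[0].trim().toLowerCase();
    if (type !== XHTML) {
      continue;
    }
    // An explicit q=0 means "do not send me this type".
    let quality = 1;
    for (const rawParam of parts.slice(1)) {
      const param = rawParam.trim().toLowerCase();
      if (param.startsWith("q=")) {
        quality = parseFloat(param.slice(2));
      }
    }
    return quality > 0 ? XHTML : HTML;
  }

  // Browsers that never mention XHTML (IE6 among them) get text/html.
  return HTML;
}
```

A browser that lists application/xhtml+xml in its Accept header gets the XHTML MIME type; anything else, including a browser that only sends a generic */* wildcard, falls back to text/html.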

A couple of days later I discovered that an Adobe Dreamweaver Spry menu bar wasn’t working correctly. The behavior was strange: when I did a “Preview in Browser” the menu worked properly, but the live page on the website did not. Suddenly I was given a subtle lesson: there are multiple meanings to “XHTML”.


Mostly due to browser (ahem: Microsoft) incompatibilities, the lines between XHTML and HTML have grown blurry. For most web designers (myself included) XHTML just means closing all tags (or using the optional trailing-slash shortcut) and forcing case-sensitivity. In other words: some decent best practices. When you use the XHTML 1.0 Transitional DTD in your DOCTYPE declaration you are essentially doing just that.

Most people seem to use XHTML 1.0 Transitional and usually stick to valid coding practices—although I still see the occasionally forgotten trailing slash on <img> tags on even Fortune 500 corporate websites. The web pages keep the .html file suffix and the web servers tell the browsers that the content is actually html via the text/html MIME type. Really, they are coding in a flavor of HTML instead of true XHTML.

Strictly speaking, XHTML is intended to deliver HTML tagging in a valid XML document. The XHTML DOCTYPE (when paired with the appropriate application/xhtml+xml MIME type) is supposed to switch XML-compliant browsers into “XML mode” where subtle new rules apply. (For example, setting CSS background color on the <body> tag doesn’t necessarily change the background of the entire browser window!) You also get the ability to implement some cool and powerful XSLT transformations within the browser.
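One way to see which mode a page actually ended up in is to ask the browser. The following is a small diagnostic sketch, not something from the original setup; the function name isXmlMode is hypothetical, and it assumes it is handed a browser document object.

```javascript
// Hypothetical diagnostic: report whether a document was parsed as XML.
// `doc` is expected to be a browser `document` object.
function isXmlMode(doc) {
  // Newer browsers expose the MIME type the page was parsed as.
  if (typeof doc.contentType === "string") {
    return doc.contentType !== "text/html";
  }
  // Fallback: HTML mode upper-cases element names, XML mode preserves case.
  return doc.createElement("div").tagName === "div";
}
```

Calling isXmlMode(document) from a browser console should return true for a page served as application/xhtml+xml and false for the same markup served as text/html.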

Now, Internet Explorer 6 (and earlier versions) puke when you send XHTML documents with their appropriately paired MIME type. Because of this, servers typically fall back to old-fashioned HTML and all browsers stay in HTML mode.

XHTML and <script> tags

Getting back to my Dreamweaver Spry nav bar problem: proper XHTML rules specify that JavaScript code in <script> tags must be wrapped in a CDATA section. The old-style technique for inline JavaScript (and what Dreamweaver inserts by default) looks like this:

<script type="text/javascript">
<!--
var MenuBar1 = new Spry.Widget.MenuBar("nav", {imgDown:"../SpryAssets/SpryMenuBarDownHover.gif", imgRight:"../SpryAssets/SpryMenuBarRightHover.gif"});
//-->
</script>

The XML-compliant variation needs to be written like this:

<script type="text/javascript">
/* <![CDATA[ */
var MenuBar1 = new Spry.Widget.MenuBar("nav", {imgDown:"../SpryAssets/SpryMenuBarDownHover.gif", imgRight:"../SpryAssets/SpryMenuBarRightHover.gif"});
/* ]]> */
</script>

Otherwise, a browser running in XML mode interprets the JavaScript content as a comment and ignores it. This caused the bizarre behavior I first mentioned, whereby my Dreamweaver “Preview in Browser” content worked and the live content didn’t. The Preview pages were served from my computer’s filesystem, so they were not paired with the XHTML MIME type and the browsers did not run in XML mode. The published pages were served with application/xhtml+xml, so the browsers did.
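Before switching a site over to the XHTML MIME type, it helps to know which pages still carry comment-wrapped inline scripts. The following is a rough sketch of such a check, assuming the page source is available as a string; the function name findCommentWrappedScripts is hypothetical, and the regex scan is a quick approximation rather than a real HTML parser.

```javascript
// Hypothetical lint helper: find inline <script> blocks whose code is wrapped
// in <!-- ... --> comment markers. This is a rough regex scan, not a parser.
function findCommentWrappedScripts(source) {
  const scriptBlock = /<script\b([^>]*)>([\s\S]*?)<\/script>/gi;
  const flagged = [];
  let match;
  while ((match = scriptBlock.exec(source)) !== null) {
    const attributes = match[1];
    const body = match[2];
    if (/\bsrc\s*=/i.test(attributes)) {
      continue; // external script: nothing inline to check
    }
    if (body.indexOf("<!--") !== -1) {
      flagged.push(body.trim()); // would be dropped as a comment in XML mode
    }
  }
  return flagged;
}
```

Any script body the function returns is a candidate for rewrapping in the CDATA form, or for moving into an external .js file where the issue does not arise.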

Conclusion: Best Practices

This leaves me with a puzzling question: what best practices should a web designer adopt? Should we code in XHTML because of its cleaner and more consistent (tag-closing) rules, or should we back out to HTML 4.01? The way I see it, there are three solutions:

  1. Code in HTML 4.01 and stop putting trailing slashes in <img> tags.
  2. Code in XHTML 1.0 (use Strict mode while you’re at it to force more discipline) and make sure your JavaScript uses the compliant CDATA declarations.
  3. Stick with business-as-usual and XHTML 1.0 Transitional but make sure your server only serves content with the text/html MIME type.

Choice #3 embraces a certain level of sloppiness, but that might not be a bad thing. If you are a web designer whose client has many people modifying pages over time (adding Omniture or Google Analytics tags or the like) where tight source control and coding standards are impractical, this actually makes some sense. At least the XHTML DOCTYPE pays some lip service to a commitment to write cleaner code.

Eating Crow

CSS pioneer Dave Shea wrote in his blog in April about his own decision to back out of XHTML and write in HTML 4.01. On May 26th I wrote a comment strongly disagreeing with his perspective, but after some reflection I have to admit he might be right after all.

He also cites a good article (that I lazily did not read before) that essentially says what I just explained.

So what am I going to do now? Actually, I’m going to continue, for this particular client, with the XHTML path and see if I can create a smooth and compliant site. This will be a good learning experience. I think I can make this decision responsibly with this client because I have tight control over the content, but I suspect that for most future work I will follow the HTML 4.01 Strict path as well.

Best Books on Web Design

Anyone who gets to know me as a web designer/developer will soon realize that I’m a bit of a zealot when it comes to clean design. I’m a programming “purist” who believes that real rewards come from doing things right the first time. I try religiously to follow the K.I.S.S. (“Keep It Simple, Stupid!”) Principle, eschewing the newest and fanciest trends unless they involve simplification.

I like to write HTML code by hand, carefully thinking about whether a tag should be given a DOM id or class, and whether that adds semantic meaning or not. I think the ultimate web site should operate in Lynx beautifully, and that the markup of graphical web pages in tables—teaching a generation of graphic designers to design-and-slice in Photoshop—was a horrible practice.

There were two books that, three-to-four years ago, really inspired me and made it possible for me to use careful XHTML+CSS design in my web work. I know that four years is like two decades in “web years” but these books are still timely, and one is just about to receive an update!
