Structural Elements in HTML

Many HTML authors are not aware of the differences between logical (structural) elements and their presentational counterparts, and which element is more appropriate in different circumstances. Here you’ll be introduced to the terminology and methodology behind creating logical page structures.

Block-level and Inline Elements

Most HTML elements fall into one of two groups: they’re either block-level elements or inline elements. This is an important distinction. Block-level elements are used to format whole blocks of text — they stand out on their own, spanning the available screen-width and usually adding line breaks before and after themselves. Inline elements can be introduced along the normal flow of text without causing any major disturbance, and can be used to affect single characters.

The block-level elements are:

<address>
<blockquote>
<div>
<fieldset>
<h1> — <h6>
<hr>
<legend>
<p>
<pre>
<ul>, <ol>, <li>, <dl>, <dt> and <dd>

The rest of the elements are inline elements. The rules that govern these two element types are simple, but important:

Block-level elements can contain other block-level elements and inline elements (with a few exceptions: <p> and the headings, for example, may hold only inline content).
Inline elements cannot contain block-level elements.

For instance, an <a> element cannot have an <h1> inside it, but a heading can enclose a link. Table elements aren’t strictly of either type, but can hold block-level or inline elements. The <br> element is another special case. It’s strictly an “inline structural” type of element, but since it can’t contain any further content anyway, it doesn’t really matter. It’s easy to make mistakes with these rules, so you should check your pages with the W3C » validator.
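
As a small illustration of the two nesting rules, here is a sketch in HTML 4.01 terms. The file name index.html is only a placeholder.

```html
<!-- Valid: a block-level heading enclosing an inline link -->
<h1><a href="index.html">Home</a></h1>

<!-- Invalid in HTML 4.01: an inline element wrapping a block-level one -->
<a href="index.html"><h1>Home</h1></a>
```

A validator will flag the second form, which is one reason it is worth running your pages through one.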

It’s possible to change the way an element displays through its CSS display property.
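
A minimal sketch of the display property at work follows. The class names menu and tags and the file names are made up for this example.

```html
<style type="text/css">
  /* .menu and .tags are hypothetical class names */
  .menu a  { display: block; }   /* each link now takes a line of its own */
  .tags li { display: inline; }  /* list items now flow along one line */
</style>

<div class="menu">
  <a href="index.html">Home</a>
  <a href="about.html">About</a>
</div>

<ul class="tags">
  <li>HTML</li>
  <li>CSS</li>
</ul>
```

Note that this only changes how the elements are drawn. The nesting rules above still apply to the markup itself.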

Logical vs Presentational Elements

There is another grouping method that separates HTML text formatting elements into two groups, according to whether they’re logical or presentational. Logical elements represent the text’s function on the page. The way they’re displayed is up to the browser (although by now the browsers have largely standardised on how this is done), which means they’re platform-independent. Presentational elements exist to create a specific visual effect, but carry no hint of the text’s semantic meaning.

As people keep needing to be reminded, the original HTML specifications contained logical elements almost exclusively. HTML was a structural language, not a design language. It was meant to convey information in the simplest way possible. Since then, Netscape and Microsoft created the hazardous situation we had a year or two ago by adding presentational extensions like the font element. HTML files became ridiculously large, inflated by redundant tags and presentation hacks. Thankfully, the HTML 4.01 and XHTML 1.0 specifications deprecated many of the presentational elements (in favour of stylesheets), and brought in many new logical elements that add depth to your information. HTML code could again be clean and simple.

The Landscape

In the old days, when it wasn’t so easy to control how your page looked (we have CSS for all that now), it was acceptable to use presentational elements for everything and leave your pages without any sort of logical structure. Everyone was obsessed with how their pages looked, never mind the accessibility ‘concerns’. People didn’t use heading elements because they looked terrible, plain and simple.

This was the wrong way to go about things. The flood of WYSIWYG graphical editors at the time did this culture of aesthetics over compatibility no favours at all, and editors sent out thousands of pages full of font tags and no headings.

The problem: presentational elements can only mean one thing. The <b> element, for example, means ‘bold text’. What happens when a text-only or audio browser reads that element? It is meaningless in this context, and so whatever emphasis you may have meant to show by using the element in the first place is lost on that reader. If you use <strong>, each browser can decide on its own treatment and present the result accordingly: by making the text bold, or by pronouncing the words louder, and so on.

Designers used to use large font sizes instead of headings to avoid the line breaks. Interpreted through any device other than a graphical browser, those pages lost a lot of their meaning. What’s the title? Where do sections of the page start? Programs that try to construct a structural outline of your document will have nothing to work with. Also, most search engines try to pick out headings when ranking pages.

Recently there’s been a move back to using logical elements, glazed over with a stylesheet, so your page looks as you want it but the elements that make it up actually do more than make the page look a certain way. They allow the page to be read in any number of browser types, like aural browsers, text-only browsers, Braille displays etc., and still be presented appropriately.
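
To make the contrast concrete, here is a rough sketch of the same content marked up both ways. The heading text and style values are invented for illustration, and in a real page the style block would sit in the head of the document.

```html
<!-- Presentational: looks like a heading, but carries no structure -->
<font size="6"><b>Holiday Opening Hours</b></font><br>
We are <b>closed</b> on public holidays.

<!-- Logical: a real heading and real emphasis, restyled with CSS -->
<style type="text/css">
  h1 { font-size: 1.5em; margin: 0; }
</style>
<h1>Holiday Opening Hours</h1>
<p>We are <strong>closed</strong> on public holidays.</p>
```

The second version can look just the same in a graphical browser, but an aural browser or an outlining tool still knows which line is the title and which word is stressed.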

Using Logical Style

So what are the logical elements, and how will they look by default in a graphical browser?

<h1>–<h6> create headings. They should flow sequentially (try not to skip levels). The title of the page should always appear as a level 1 heading, with subheadings cascading down from it. Text is usually displayed in a large, bold font. Remember that they’re all block-level elements. You can remove the margins with CSS.
<em> creates emphasis, and is usually displayed as italicised text. It is the logical counterpart of the presentational <i>.
<strong> creates strong emphasis, and is usually displayed as bold text. It is the logical counterpart of <b>.
<code> is suitable for giving examples of computer code, and is usually rendered in a mono-spaced font. It is the logical counterpart of <tt>.
<blockquote> is a block-level tag that’s used to enclose multi-line quotations from other sources. It is usually displayed as indented from both sides.
<cite> is used to enclose the title of a work that is currently being referred to. It’s usually displayed as italicised text.
<q> is a short quotation from another source. Modern browsers will display contained text with quotation marks added on both sides.
<pre> is a block-level element that displays text in a fixed-width font exactly how it was typed in the source code (i.e. honouring all tabs, spaces and line breaks). pre is not strictly a logical element, but its use is often necessary.
<del> is an HTML 4.01 element used to show document revisions; text deleted from a page in this case. It is usually displayed as text with a strike-through.
<ins> is del’s partner in crime, used to show text inserted during a revision. It is usually displayed with an underline.
<address> should be wrapped around contact information, including email addresses.
<kbd> is suitable for marking up text that is meant to be entered by the reader on the keyboard. It is usually displayed in a fixed-width font.
<var> marks up a variable’s name. Useful if you’re writing about technical subjects like computer programming.
<samp> is used to signify a sample output from some code.
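
Several of the elements listed above might be combined like this. It is only a sketch: the command, its output and the quoted text are invented for the example.

```html
<h1>Installing the Tool</h1>
<p>Type <kbd>install --help</kbd> and press Enter. You should see
<samp>Usage: install [options] path</samp>, where <var>path</var> is the
folder to install into. Do <strong>not</strong> run the <code>install</code>
command twice.</p>

<p>As <cite>The Imaginary User Guide</cite> puts it,
<q>one installation is plenty</q>.</p>

<p>The current version is <del>1.0</del> <ins>1.1</ins>.</p>

<address>Questions to <a href="mailto:help@example.com">help@example.com</a></address>
```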

Adding Depth

The title attribute allows you to add a tooltip to any element of your page. These are especially helpful when applied to links, as they help the reader in judging what they may find if they click the link. You should add informational titles to as many of your links as possible, following these » title guidelines.

There are three more elements (<abbr> and <acronym> were new in HTML 4) that allow you to add supplementary information in a tooltip, using this attribute.

<abbr> is used to denote an abbreviation. Expand the abbreviation’s full meaning in the tooltip. This tag applies no formatting of its own.
<acronym> is used to enclose an acronym (e.g. ASCII), with the full meaning in the tooltip (which you should try to define for at least the first instance of each acronym on a page). Usually no formatting is applied.
<dfn> is used to give a definition of a tricky word. Write your own definition in the tooltip. Usually displayed in italics.
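
A short sketch of the three elements with their tooltips follows. The sentence and the wording of the definition are made up for illustration.

```html
<p>Pages are written in
<abbr title="Hypertext Markup Language">HTML</abbr>, and plain text was
once limited to
<acronym title="American Standard Code for Information Interchange">ASCII</acronym>.
A <dfn title="A program that reads page content aloud">screen reader</dfn>
can speak the expanded titles instead of spelling out the letters.</p>
```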

Making your content accessible

And so it all comes together nicely… With the advent of all of the recent upgrades to the tools and languages we use to create our web content (think XHTML, CSS), it is clear to see where the » W3C are going with the web — full accessibility. The web, as it was created, was envisioned as a place where anyone had full access to information. That hasn’t been the case for years, but now we are being asked to do something about it. It’s in everyone’s best interests.

A New Web

The following guidelines are intended to be followed by anyone responsible for putting information on the web. That means both designers and the guys who write the words. Separate documents have been published for webmasters, browser makers and HTML editor developers by the W3C as part of their » Web Accessibility Initiative. By following the many guidelines below, your content will become accessible not only to people with disabilities — although that’s the primary objective — but to all users of the Internet, irrespective of their chosen access device.

For examples of the kind of people you’ll be catering for with all this, read this list:

They may not be able to see, hear or move easily or at all.
They may have trouble reading, or may not speak the language the text is written in.
They may not be able to use a keyboard or mouse.
They may have a text-only screen, a small screen, or a slow Internet connection.
They may be in a situation where certain functions cannot be utilised (e.g. browsing while driving, or without sound).
They may not have a current browser, or a voice or Braille browser.

We, as content developers, will increasingly have to consider browsing situations other than the traditional setup of the user sitting in front of their desktop computer, as the web is beginning to be accessed from very diverse locations, and with new devices. Disabled readers use a variety of devices and assistive technologies such as screen magnifiers and screen readers. Designing clearly and simply will aid these people; and hey, there are a lot more people affected by this than you probably think.

The guidelines here are organised by topic, and are built on the foundations of two documents: the » W3C’s Web Content Accessibility Guidelines 1.0, and American Law » Section 508. These are the two existing standards for content creation on the Internet. The W3C’s standards are split into Priorities, and I’ve covered all of the Priority 1 checkpoints here, as well as many of the Priority 2s and 3s. There’s a check-list at the end too.

The Technology

So, the current standards and technologies that we should all be using by now are:

Hypertext Markup Language (HTML) 4.01
Extensible Hypertext Markup Language (XHTML) 1.0/1.1
Cascading Style Sheets (CSS) Level 1 & 2
Document Object Model (DOM) Level 1
Synchronized Multimedia Integration Language (SMIL) 1.0
JavaScript/ECMAScript & Dynamic HTML (DHTML)
Nowadays most of the web audience have browsers that are more-or-less capable of supporting these languages. Better yet, these recent standards have all been designed to put accessibility and structure to the fore, and to degrade gracefully in browsers that lack decent support.

You can offer information in formats other than HTML pages. Java, Flash, PDF files and Word documents can be used, but you should also include HTML versions, since usually these other file formats are the least accessible way of transmitting information. Macromedia and Adobe are both working to make their respective proprietary formats accessible, but as it stands, Flash and PDF files are inaccessible to many people.

Coding

Use current and standard markup languages

You should always be on the ball about which versions of the languages are available to use. Right now, you should be using nothing less than the » HTML 4.01 or » XHTML 1.0 standards, and using » CSS-1 or 2 to control the presentation of your pages. All of these technologies created by the W3C have been designed to be accessible, with many helpful features that make pages easier to read for everybody.

Also, screen readers and similar browsers have a better chance of understanding valid, well-formed markup. Running your code through the » W3C validator and fixing errors will allow these browsers to interpret your code better. For all of the W3C specs, go to their » technical reports page.

Use appropriate markup for page structure

If you remember, and chances are you don’t, HTML is meant to be a purely structural language. Presentational elements like the font element have been going against this principle for years. Using structural tags for presentational reasons (like using a <blockquote> for an indent) is damaging for accessibility, and we shouldn’t do it anymore. We’ve got CSS now to take care of how our pages look.

This also allows us to go back to using logical structure. Most of us avoided using heading elements in the past, because they didn’t look good. Now however, we can give our pages proper document outlines and not have to worry about the way the heading elements look, as we can change that with a stylesheet.

Using proper structural markup such as headings, paragraphs, lists, definitions and quotes allows screen readers, search engines and people to get a better feel for the way the document is laid out, and so makes it easier to read. Skimming documents is made faster too.

Use CSS for all formatting

Apparently, this can’t be stressed enough. Here are a few more benefits of using CSS to complement your HTML:

Users can override CSS formatting easily if it suits their needs, with a personal stylesheet. You can’t do that with HTML formatting.
Having a stylesheet at your disposal will make it easier for you to concentrate on keeping your HTML clean and structural.
Using an external stylesheet makes your pages download faster, since the browser can fetch the stylesheet once and reuse it from its cache on every page that links to it.
If an old browser runs into a styled page, it will just ignore all the CSS and so the content should stay completely available.
When CSS is not supported, your page should still flow logically, and be readable as it is. Usually this isn’t a problem and the style-less page poses no problems to accessibility, but make sure you check.
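
A bare-bones page that keeps all of its formatting in an external stylesheet might look like the sketch below. The file name styles.css and the page content are placeholders.

```html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
  "http://www.w3.org/TR/html4/strict.dtd">
<html lang="en">
<head>
  <title>Opening Hours</title>
  <!-- styles.css is a hypothetical file shared by every page on the site -->
  <link rel="stylesheet" type="text/css" href="styles.css">
</head>
<body>
  <h1>Opening Hours</h1>
  <p>We are open from Monday to Friday.</p>
</body>
</html>
```

If the stylesheet is never loaded, the page is still a heading followed by a paragraph, which is the check described above.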

Text

Avoid using images to display text

The old method of putting text into a GIF just so you get a specific font or effect is particularly damaging to many readers. Users with limited vision often rely on the ability to resize text so that it is legible, or change its colour. Using this method inhibits that. As much as is possible, try to use the greater typographic opportunities of CSS text formatting to achieve a comparable effect. Logos are an exception to this rule, of course, but make sure you include an alt attribute.

Avoid using absolute font sizes

In an effort to keep their sites looking as similar as possible across browsers and platforms, many designers define their content text in absolute font sizes, like points or pixels. In some browsers this makes resizing the text impossible, and so means many people will simply be unable to read it. Use relative font sizes like ems, keywords and percentages to define text size, again in CSS; or don’t specify a size at all and let the user’s preferences take over.
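
The relative units mentioned above could be used along these lines. The selectors and values are only illustrative, and note is a hypothetical class name.

```html
<style type="text/css">
  /* Avoid absolute units such as:  body { font-size: 10pt; }  */

  body  { font-size: 100%; }   /* percentage of the reader's default size */
  h1    { font-size: 1.6em; }  /* ems scale with the surrounding text */
  .note { font-size: small; }  /* keyword size */
</style>
```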

Specify the language of text

The attribute lang can be given to any element to specify what language the text is in. This helps both search engines and screen readers, some of which are able to read out the text in the appropriate language if it’s specified properly. You should apply this attribute to your main <html> tag; and then use <span lang="…"> to denote smaller changes in language.

Some common languages and their codes are: ar (Arabic), de (German), el (Greek), es (Spanish), fr (French), he (Hebrew), hi (Hindi), ja (Japanese), it (Italian), nl (Dutch), pt (Portuguese), ur (Urdu), ru (Russian), sa (Sanskrit), zh (Chinese).
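
As a sketch, a page written in English with a short French phrase could be marked up like this. The sentence itself is just an example.

```html
<html lang="en">
<head>
  <title>Travel Notes</title>
</head>
<body>
  <p>As the locals like to say,
  <span lang="fr">c'est la vie</span>.</p>
</body>
</html>
```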

Back up your text by expanding out acronyms and abbreviations using <acronym> and <abbr>. Use <dfn> too to explain complicated terminology.

Avoid using ASCII art

The formation of pictures and icons using text, such as :) to make a smiley face, or --> to make an arrow, causes problems in screen readers. They call out the characters as they appear, so the arrow will translate as ‘dash dash greater-than’. While emoticons are fine to use in the chat room, they shouldn’t appear on webpages, and graphics with appropriate alt text should be used instead.

Links

Make sure links are understandable out of context

Screen reader users or readers without mice often move between links with the tab key. Blind users can also generate a links list to allow easy access to specific links. It makes scanning the page simpler too. Thus, the links must clearly and succinctly indicate their destination or function without the words around them. You should always avoid using link text like ‘click here’ or ‘more’.

Use the title attribute to add informational tooltips to your links. You can use it to offer additional information about the reason you’re linking to a certain page, and to describe the content on the next page. You should also alert the user if you’re using the target="…" attribute to open new windows. You can add this information in the tooltip or by writing links in a certain way, for example by preceding all external links with certain punctuation or a graphic.
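
Here is one way the advice on link text and titles might look in practice. The page names and title wording are invented, and target is only allowed in the Transitional flavour of HTML 4.01.

```html
<!-- Avoid: the link text means nothing out of context -->
<p>To read our accessibility policy, <a href="policy.html">click here</a>.</p>

<!-- Better: the link text describes the destination by itself -->
<p>Read our <a href="policy.html"
  title="How this site supports assistive technology">accessibility policy</a>.</p>

<!-- A link that opens a new window says so in its tooltip -->
<p><a href="http://www.w3.org/WAI/" target="_blank"
  title="Web Accessibility Initiative (opens in a new window)">W3C Web
  Accessibility Initiative</a></p>
```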

Colour

Do not convey information with colour alone

Often we use bright colours to call attention to certain areas of a page, or to things like required form fields. This emphasis is lost on the blind, those of limited vision, or the many people with colour blindness. When you are doing this, add secondary emphasis with text formatting, or by adding *stars*, for example.
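
A required form field, for instance, might be flagged with colour and with text at the same time. This is a sketch only; required is a hypothetical class name.

```html
<style type="text/css">
  .required { color: #c00; font-weight: bold; }
</style>

<p>
  <!-- The star and the word "required" carry the meaning without colour -->
  <label for="email" class="required">* Email address (required)</label>
  <input type="text" id="email" name="email">
</p>
```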

Remember too that many of the new devices people will be using to access pages will only have black and white displays. Your images will need to remain legible on them.

Use contrasting foreground and background colours

This should be a known requirement to everybody anyway, but it’s particularly important for the people who aren’t just going to be annoyed by this — they’ll be blocked from reading content. As a general rule, use black for all of your main text, and place it on a white background. Avoid using red and green together, and never place text over a background image that makes it difficult to read in any way. Have a look at our page that shows you how to test your colour contrast.

Images

Provide alternate text for all images

You should use the ever-helpful alt attribute in every image you use on your page. Screen readers will read out the alt text of the images that their users can’t see, so you must use this text to describe the image completely. The alternative text should serve the same purpose and convey the same meaning as the image. Creating proper descriptions also helps users who browse with images turned off for faster downloading. Guidelines:

When you’re writing these values, you don’t need to include the text ‘image of’ — this is inserted by the screen reader automatically.
If there’s text in the image (Grr…), make sure to include this in the alternative text.
For active images that link to other pages, the alt text should indicate the link’s destination or function. Don’t write ‘link to’.
For spacer images, or purely decorative images, use the null alt attribute, alt="" (with nothing between the quotes). This will allow the screen reader to skip over it.
If you can’t help yourself from needing to apply humour-filled tooltips to your images, construct the alt attribute as normal, but also add in the title="…" attribute. This allows you to make the image accessible and say whatever you want in the tooltip.
Remember to add an alt attribute to every <applet> and graphical form button too.
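
The guidelines above could be applied roughly as follows. All file names and descriptions here are placeholders.

```html
<!-- A logo containing text: repeat the text in the alt attribute -->
<img src="logo.gif" alt="Acme Widgets">

<!-- An image that acts as a link: describe the destination -->
<a href="index.html"><img src="home.gif" alt="Home page"></a>

<!-- A spacer or decorative image: null alt, so it is skipped -->
<img src="spacer.gif" alt="">

<!-- A proper description in alt, with the joke kept in title -->
<img src="cat.jpg" alt="A ginger cat asleep on a keyboard"
  title="Hard at work, as usual">

<!-- A graphical form button -->
<input type="image" src="go.gif" alt="Search">
```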

Provide full descriptions for informational images

Special graphics like charts, diagrams and graphs should be given more description than an alt attribute can hold. Add a caption underneath the image for best results. The alt text in this instance should be the title of the image. More recent browsers support the longdesc="…" attribute on images, which you can use to link to a long description of the image. Until this attribute is better supported, however, you should also include the information on the page.
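
A chart described in all three ways might be marked up like this sketch. The file names, the chart and its caption are invented for the example.

```html
<!-- alt gives the title, longdesc links to the full description -->
<img src="sales-chart.gif" alt="Quarterly sales chart"
  longdesc="sales-chart.html">

<!-- The caption repeats the key information on the page itself -->
<p>Figure 1: Quarterly sales. A
<a href="sales-chart.html">full text description of the chart</a>
is also available.</p>
```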

Image Maps

Use client-side image maps

In a server-side image map, all of the information about the image and the links contained within it is stored on the server. This means that a screen reader can’t get to this information. You should use client-side maps as much as you can. They’re easier to implement anyway.

Provide alternate text for each area

Just as every separate image needs its own alt attribute, each hot spot on an image map should be given a unique description. The map as a whole should have a main alt attribute to show its primary function, but each part should also have a description of its destination. And, of course, as with all image maps, you should include a row of text links beneath it that repeats all of the links contained within. This makes your map faster to use, and greatly increases its usability.
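
A client-side map that follows both pieces of advice could be sketched like this. The image, coordinates and page names are all hypothetical.

```html
<img src="nav.gif" alt="Site navigation" usemap="#nav"
  width="300" height="40">

<map name="nav">
  <area shape="rect" coords="0,0,99,39" href="index.html" alt="Home">
  <area shape="rect" coords="100,0,199,39" href="products.html" alt="Products">
  <area shape="rect" coords="200,0,299,39" href="contact.html" alt="Contact us">
</map>

<!-- The same links repeated as plain text beneath the map -->
<p><a href="index.html">Home</a> |
<a href="products.html">Products</a> |
<a href="contact.html">Contact us</a></p>
```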

Multimedia

Do not convey information with sound alone

If you use sound to denote an action, like when a user mouses over an image, include a visual clue to this too. Hard of hearing readers are going to miss out on the audio cues. You shouldn’t be using sound in a page design anyway — it’s nearly always annoying. If you are offering downloads of audio containing speech you should try to offer a text transcription of its content too. Not only will this allow people who can’t hear the audio to access the content, but it is invaluable to people without the means to go downloading large sound files too.

Text versions of non-text objects are great for accessibility as they can easily be transmitted in various ways for people with certain needs. For instance, text can take the form of a Braille display, or as synthesised speech.

If your multimedia is both audio and video, you can use SMIL, another W3C technology, to synchronise subtitles from a transcript with the video.

Provide alternatives to all multimedia

Accessing multimedia elements can be a hassle for all visitors, not just those with disabilities, due to the need for a specific piece of software, or bandwidth constraints etc. Often a low-bandwidth or alternate version of an effect is more desirable than the optimum, but more difficult to reach version of a multimedia object. The aforementioned text transcripts are always a good idea. Along with this, compressed versions of videos should always be provided for those with slower connections, or for anyone to preview the full-scale version with.

Animation

Avoid flickering and unnecessary animation

Animated GIFs, Flash movies, <blink> and <marquee> are all used to display a variety of moving elements on-screen. These are always distracting and often irritating to all visitors. More seriously, a flickering animation can cause epileptic seizures in some people. The more obnoxious the animation, the greater the danger. Try not to use animation at all; it rarely benefits a page or gets any further information across than a static image would. Limit the number of ad banners you place on your pages; they get more annoying and pointless every day.

Tables

Use structural markup correctly in data tables

With the recent improvements to table markup brought in by HTML 4, you can now properly denote row and column headers. Data tables are ones used to display information, as opposed to layout tables, which are used to lay out page content and design. You should not use the <th> tag just to create boldfaced cells. It should only be used for the heading of a column or row, with the scope attribute set to row or col to show what the cell is the header of. This helps screen readers pick out the important cells.

Try to keep your data tables simple by breaking larger ones into smaller separate tables. Only recent screen readers are able to use the newer markup to fully describe the table, and it can be difficult for readers using these programs to get a grasp of the table if it is overwhelmingly big.
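
A small data table with scoped headers might look like the sketch below. The data is invented, and the caption element and summary attribute are optional HTML 4 extras not covered above.

```html
<table summary="Opening and closing times for each day">
  <caption>Opening hours</caption>
  <tr>
    <th scope="col">Day</th>
    <th scope="col">Opens</th>
    <th scope="col">Closes</th>
  </tr>
  <tr>
    <th scope="row">Monday</th>
    <td>09:00</td>
    <td>17:00</td>
  </tr>
  <tr>
    <th scope="row">Saturday</th>
    <td>10:00</td>
    <td>14:00</td>
  </tr>
</table>
```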

Frames

Provide meaningful names and page titles for all frames

Screen readers identify the individual frames in a page to their user by calling out the frame’s name and <title>. Giving these elements meaningful values like ‘Navigation’ and ‘Content’ instead of ‘nav’ or ‘left’ will allow users of screen readers to access the particular frame they want more easily. The name should indicate clearly the frame’s function. Add in a title attribute to each frame too. You can use the <title> tag to expand on this further.
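
A frameset with meaningful names and titles could be sketched as follows. The file names and title text are placeholders, and the noframes fallback is an extra touch that is not discussed above.

```html
<frameset cols="200,*">
  <frame src="nav.html" name="Navigation" title="Site navigation links">
  <frame src="content.html" name="Content" title="Main page content">
  <noframes>
    <body>
      <p>This site uses frames. Go to the
      <a href="nav.html">site navigation</a> page instead.</p>
    </body>
  </noframes>
</frameset>
```

Each framed page, such as nav.html, would also carry its own descriptive title element.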

Don’t use non-essential frames

Frames are sometimes used by designers for margins or borders around content pages. This means that a screen reader must identify each frame to the user despite some of them containing no meaningful content, which wastes time and frustrates blind readers. Really, frames are just plain bad for accessibility in all regards, but this is particularly wrong. If a frame contains no proper content, eliminate it. Use margins and padding for your spacing concerns.

Forms

Associate labels with all form elements

Every form field (that’s every text box, radio button etc.) should have a text label defined with the form accessibility tag, <label>, as screen readers can’t always guess correctly which part of explanatory text points to which field. Defining it explicitly with this tag means there’ll never be any mistakes made. The text labels should also be placed in standard positions, like above or to the left of text boxes and select boxes; and to the right of radio and check boxes.

Break complex forms into natural groups of elements using <fieldset> and <legend>, and use <optgroup> to simplify the choices in a complicated <select>. All these elements are discussed in Forms Accessibility.
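
The form elements mentioned in this section fit together roughly as in this sketch. The field names, option values and the /order address are all made up.

```html
<form action="/order" method="post">
  <fieldset>
    <legend>Delivery details</legend>

    <!-- Labels sit before text boxes and select boxes -->
    <label for="name">Full name</label>
    <input type="text" id="name" name="name">

    <label for="country">Country</label>
    <select id="country" name="country">
      <optgroup label="Europe">
        <option value="ie">Ireland</option>
        <option value="fr">France</option>
      </optgroup>
      <optgroup label="Asia">
        <option value="jp">Japan</option>
      </optgroup>
    </select>

    <!-- Labels sit after check boxes and radio buttons -->
    <input type="checkbox" id="gift" name="gift" value="yes">
    <label for="gift">This is a gift</label>
  </fieldset>

  <p><input type="submit" value="Place order"></p>
</form>
```

The for attribute on each label matches the id of its field, which is what ties the two together explicitly.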

Include instructions with each field

Special form instructions and required field notes should be included in the field label. Screen readers generally only read out the label when filling out a form. Screen magnifiers will also miss out on instructions if they’re far away from the fields they are talking about. If the instructions are too unwieldy to go in the label, make sure they’re above the form, rather than below, and try to keep them short and simple, incorporating a list if possible.

Make sure fields tab logically

Screen readers and those of us stuck without a mouse must tab between fields. Often the logical next field is not the one that follows in the HTML code due to the way you’ve laid your form out. You can use tabindex to make every click of the tab key a logical progression to the next field in order. Use accesskey to give quick access to parts of the page also.
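
As a rough example, tabindex and accesskey might be applied to a small search form like this. The key letter and the /search address are arbitrary choices for the sketch.

```html
<form action="/search" method="get">
  <p>
    <!-- accesskey gives a keyboard shortcut that jumps to this field -->
    <label for="q" accesskey="s">Search for</label>
    <!-- tabindex sets the order in which the tab key visits the fields -->
    <input type="text" id="q" name="q" tabindex="1">
    <input type="submit" value="Search" tabindex="2">
  </p>
</form>
```

How an access key is triggered varies between browsers and platforms, so it is worth mentioning the shortcut somewhere on the page.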

And now that you know the rules to play by, you can review your pages’ compliance using this handy » Content accessibility checklist from the W3C. There’s also this handy online » accessibility Validator which can help you diagnose problems.