The Web Design Group

Web Authoring FAQ: Getting Started


This list of Frequently Asked Questions is maintained by the WDG and was last updated on November 29, 1999. It may be found at the following URLs:

If you would like to contribute to this FAQ, please send mail to <darin@htmlhelp.com>. All contributors will be listed at the bottom of the FAQ.

1. Getting Started

  1. What is everyone using to write HTML?
  2. Where can I find a list of all the current HTML tags?
  3. How can I show HTML examples without them being interpreted as part of my document?
  4. How do I get a so-and-so character in my HTML?
  5. Should I put quotes around attribute values?
  6. How can I include comments in HTML?
  7. How can I check for errors?
  8. What is a DOCTYPE? Which one do I use?

1.1. What is everyone using to write HTML?

It seems that everyone has a different preference for which tool works best for them. You may find lists of HTML authoring tools at:

Keep in mind that, typically, the less HTML a tool requires you to know, the worse the HTML it produces. In other words, you can always do better by hand if you take the time to learn a little HTML.

[Table of Contents]

1.2. Where can I find a list of all the current HTML tags?

The current W3C Recommendation is HTML 4.0. HTML 4.0 extends HTML 3.2 to include support for frames, internationalization, style sheets, advanced tables, and more. HTML 4.0 is not well supported by current browsers, but many of its features can be used safely in non-supporting browsers.

Recommended materials on HTML 4.0:

Recommended materials on HTML 3.2:

Some materials on browser-specific versions of HTML:

1.3. How can I show HTML examples without them being interpreted as part of my document?

Within the HTML example, first replace the "&" character with "&amp;" everywhere it occurs. Then replace the "<" character with "&lt;" and the ">" character with "&gt;" in the same way.
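The order of the replacements matters, and a short script can make that concrete. The following is a minimal sketch in Python (not something this FAQ otherwise uses); the function name escape_html_example and the sample markup are made up for illustration.

```python
import html


def escape_html_example(markup: str) -> str:
    """Escape markup so a browser displays it as text instead of interpreting it."""
    # "&" must be replaced first; otherwise the "&" in the "&lt;" and "&gt;"
    # produced by the later steps would itself be escaped.
    escaped = markup.replace("&", "&amp;")
    escaped = escaped.replace("<", "&lt;")
    escaped = escaped.replace(">", "&gt;")
    return escaped


sample = '<A HREF="a.html?x=1&y=2">link</A>'
print(escape_html_example(sample))

# The standard library performs the same three replacements when quote=False.
assert escape_html_example(sample) == html.escape(sample, quote=False)
```

If the "&" step ran last instead, "<" would first become "&lt;" and then "&amp;lt;", which a browser would display as the literal text "&lt;".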

The next Q&A addresses the more general issue of representing arbitrary characters in HTML documents.

1.4. How do I get a so-and-so character in my HTML?

The safest way to write HTML is in (7-bit) US-ASCII, expressing characters from the upper half of the 8-bit code with HTML entities. See the answer to "Which should I use, &entityname; or &#number; ?"

Working with 8-bit characters can also succeed in many practical situations: on Unix and MS-Windows (using Latin-1), and also on Macs (with some reservations).

The available characters are those in ISO-8859-1, listed at <URL:http://www.htmlhelp.com/reference/charset/>. On the Web, these are the only characters widely supported. In particular, characters 128 through 159 as used in MS-Windows are not part of the ISO-8859-1 code set and will not be displayed as Windows users expect. This includes the em dash, en dash, curly quotes, bullet, and trademark symbol; neither the actual character nor &#nnn; is correct. (See the last paragraph of this answer for more about those characters.)
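As a rough illustration of the "7-bit ASCII with entities" approach, here is a minimal Python sketch. The function name to_ascii_html is hypothetical, and the choice to reject everything outside the displayable ISO-8859-1 range (160 through 255) is an assumption made to mirror the advice above, not a requirement of any specification. The sketch does not escape "&", "<" or ">"; that is a separate step.

```python
from html.entities import codepoint2name


def to_ascii_html(text: str) -> str:
    """Return text as 7-bit ASCII, using entities for ISO-8859-1 characters."""
    pieces = []
    for ch in text:
        code = ord(ch)
        if code < 128:
            # Plain US-ASCII passes through unchanged.
            pieces.append(ch)
        elif 160 <= code <= 255:
            # Upper half of ISO-8859-1: prefer the named entity and fall
            # back to a numeric reference if no name is known.
            name = codepoint2name.get(code)
            pieces.append(f"&{name};" if name else f"&#{code};")
        else:
            # Codes 128-159 and everything above 255 are rejected here
            # (an assumption of this sketch) so they get noticed and rewritten.
            raise ValueError(f"U+{code:04X} is not a displayable ISO-8859-1 character")
    return "".join(pieces)


print(to_ascii_html("café © 1999"))
```

A dash or curly quote pasted from a Windows word processor would raise ValueError here, which is a prompt to replace it with a plain ASCII equivalent.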

On platforms whose own character code isn't ISO-8859-1, such as MS-DOS or the Mac, there may be problems: you'd have to use text transfer methods that convert between the platform's own code and ISO-8859-1 (e.g. Fetch for the Mac), or convert separately (e.g. GNU recode). Using 7-bit ASCII with entities avoids those problems, and this FAQ is too small to cover other possibilities in detail. Mac users: see the notes at the above URL.

If you run a web server (httpd) on a platform whose own character code isn't ISO-8859-1, such as a Mac or an IBM mainframe, it's the job of the server to convert text documents into ISO-8859-1 code when sending them to the network.

If you want to use characters outside of the ISO-8859-1 repertoire, you must use HTML 4.0 rather than HTML 3.2. See the HTML 4.0 Recommendation at <URL:http://www.w3.org/TR/REC-html40/> and the Babel site at <URL:http://babel.alis.com:8080/> for more details. Another useful resource for internationalization issues is at <URL:http://ppewww.ph.gla.ac.uk/%7Eflavell/charset/>.

1.5. Should I put quotes around attribute values?

It depends. It is never wrong to use them, but you don't have to if the attribute value consists only of letters (A-Za-z), digits, periods and hyphens. This is explained in the HTML 2.0 specs.

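Be careful when your attribute value includes double quotes, for instance when you want ALT text like "the "King of Comedy" takes a bow" for an image. Humans can parse that to know where the quoted material ends, but browsers can't. You have to code the attribute value specially so that the first interior quote doesn't terminate the value prematurely. There are two main techniques:

  1. Escape the interior quotes with a character entity: ALT="the &quot;King of Comedy&quot; takes a bow"
  2. Use single quotes to enclose the attribute value: ALT='the "King of Comedy" takes a bow'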

Both these methods are correct according to the spec and are supported by current browsers, but both were poorly supported in some earlier browsers. The only truly safe advice is to rewrite the text so that the attribute value need not contain quotes, or to change the interior double quotes to single quotes, like this: ALT="the 'King of Comedy' takes a bow".
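For pages generated by a script, the quoting rules in this answer can be applied mechanically. The following Python sketch is one possible way to do it; the names needs_quotes and quote_attr are made up for illustration, and the sketch assumes any "&" in the value has already been written as "&amp;".

```python
import re

# Letters, digits, periods and hyphens: the characters that may appear
# in an unquoted attribute value according to the rule quoted above.
_SAFE_UNQUOTED = re.compile(r"[A-Za-z0-9.-]+")


def needs_quotes(value: str) -> bool:
    """Return True if the attribute value must be quoted."""
    return _SAFE_UNQUOTED.fullmatch(value) is None


def quote_attr(value: str) -> str:
    """Quote an attribute value, handling interior quote characters."""
    if '"' not in value:
        return f'"{value}"'
    if "'" not in value:
        # Interior double quotes only: single-quote delimiters are enough.
        return f"'{value}'"
    # Both kinds of quote occur: escape the double quotes as &quot;.
    return '"' + value.replace('"', "&quot;") + '"'


print(needs_quotes("center"))            # no quotes required
print(quote_attr('the "King of Comedy" takes a bow'))
```

Since quoting is never wrong, a generator can also skip needs_quotes entirely and call quote_attr on every value.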

Note that XHTML 1.0 (a reformulation of HTML 4.0 as an XML 1.0 application) requires attribute values to be quoted.

1.6. How can I include comments in HTML?

A comment declaration starts with "<!", followed by zero or more comments, followed by ">". A comment starts and ends with "--", and does not contain any occurrence of "--" between the beginning and ending pairs. This means that the following are all legal HTML comments:

  <!-- Hello -->
  <!-- Hello -- -- Hello-->
  <!---->
  <!------ Hello -->
  <!>

But some browsers do not support the full syntax, so we recommend you follow this simple rule to compose valid and accepted comments:

An HTML comment begins with "<!--", ends with "-->" and does not contain "--" or ">" anywhere in the comment.
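The simple rule is easy to check automatically. Below is a minimal Python sketch; the function name is_safe_comment is hypothetical. It is deliberately stricter than the full SGML syntax: it also rejects a comment body that ends in "-", because that hyphen would run into the closing "-->".

```python
def is_safe_comment(comment: str) -> bool:
    """Check a comment against the simple, widely accepted rule."""
    if len(comment) < 7:
        # Too short to hold both "<!--" and "-->".
        return False
    if not (comment.startswith("<!--") and comment.endswith("-->")):
        return False
    body = comment[4:-3]
    if "--" in body or ">" in body:
        return False
    # A trailing "-" would merge with the closing "--" to form "--->".
    return not body.endswith("-")


for candidate in ("<!-- Hello -->", "<!-- a -- b -->", "<!-- a > b -->", "<!>"):
    print(candidate, is_safe_comment(candidate))
```

Note that "<!>" and "<!-- a -- b -->" fail this check even though the first is a legal comment declaration; the point of the rule is portability across browsers, not full coverage of the syntax.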

See <URL:http://www.htmlhelp.com/reference/wilbur/misc/comment.html> for a more complete discussion.

1.7. How can I check for errors?

Various software is available to find errors in your web documents automatically. HTML validators are programs that check HTML documents against a formal definition of HTML syntax and then output a list of errors. Validation is important to give the best chance of correctness on unknown browsers (both existing browsers that you haven't seen and future browsers that haven't been written yet).

HTML linters (checkers) are also useful. These programs check documents for specific portability problems, including some caused by invalid markup and others caused by common browser bugs. Linters may pass some invalid documents, and they may fail some valid ones.

All validators are functionally equivalent; while they may have different reporting styles, they will find the same errors given identical input. Different linters are programmed to look for different problems, so their reports will vary significantly from each other. Also, some programs that are called validators (e.g. the "CSE HTML Validator") are really linters/checkers. They are still useful, but they should not be confused with real HTML validators.

When checking a site for errors for the first time, it is often useful to identify common problems that occur repeatedly in your markup. Fix these problems everywhere they occur (with an automated process if possible), and then go back to identify and fix the remaining problems.

While checking for errors in the HTML, it is also a good idea to check for hypertext links which are no longer valid. There are several link checkers available for various platforms which will follow all links on a site and return a list of the ones which are non-functioning.
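To give a feel for what a link checker does first, here is a small Python sketch of the link-gathering step only. The class name LinkCollector, the function name extract_links and the example.com URL are all made up for illustration; a real link checker would go on to request each URL and report the ones that fail, which is omitted here.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect HREF and SRC attribute values as absolute URLs."""

    def __init__(self, base_url: str) -> None:
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lower-cases attribute names before calling this method.
        for name, value in attrs:
            if name in ("href", "src") and value:
                # Resolve relative links against the document's own URL.
                self.links.append(urljoin(self.base_url, value))


def extract_links(markup: str, base_url: str) -> list:
    collector = LinkCollector(base_url)
    collector.feed(markup)
    collector.close()
    return collector.links


page = '<A HREF="faq.html">FAQ</A> <IMG SRC="/logo.gif" ALT="logo">'
print(extract_links(page, "http://www.example.com/docs/index.html"))
```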

You can find a list of validators, linters, and link checkers at <URL:http://www.htmlhelp.com/links/validators.htm>. Especially recommended is the use of an SGML-based validator such as the WDG HTML Validator <URL:http://www.htmlhelp.com/tools/validator/> or W3C HTML Validation Service <URL:http://validator.w3.org/>.

1.8. What is a DOCTYPE? Which one do I use?

According to HTML standards, each HTML document begins with a DOCTYPE declaration that specifies which version of HTML the document uses. The DOCTYPE declaration is useful primarily to SGML-based tools like HTML validators, which must know which version of HTML to use in checking the document's syntax. Browsers generally ignore DOCTYPE declarations.

See <URL:http://www.htmlhelp.com/tools/validator/doctype.html> for information on choosing an appropriate DOCTYPE declaration.

Note that the public identifier section of the DOCTYPE declaration is case sensitive. Some versions of Netscape Composer are known to insert the lower-case "-//w3c//dtd html 4.0 transitional//en", rather than the correct mixed-case "-//W3C//DTD HTML 4.0 Transitional//EN".
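A case problem like this is easy to detect before handing a document to a validator. The Python sketch below is only an illustration: the function name check_doctype and its return values are made up, and KNOWN_PUBLIC_IDS is a small sample of W3C public identifiers rather than a complete list.

```python
import re

# A small sample of public identifiers (not a complete list).
KNOWN_PUBLIC_IDS = (
    "-//W3C//DTD HTML 4.0//EN",
    "-//W3C//DTD HTML 4.0 Transitional//EN",
    "-//W3C//DTD HTML 4.0 Frameset//EN",
    "-//W3C//DTD HTML 3.2 Final//EN",
)

# The keywords are matched case-insensitively; the quoted public
# identifier is captured exactly as written.
_DOCTYPE = re.compile(r'<!DOCTYPE\s+HTML\s+PUBLIC\s+"([^"]*)"', re.IGNORECASE)


def check_doctype(document: str) -> str:
    """Return "ok", "wrong-case", "unknown" or "missing"."""
    match = _DOCTYPE.match(document.lstrip())
    if match is None:
        return "missing"
    public_id = match.group(1)
    if public_id in KNOWN_PUBLIC_IDS:
        return "ok"
    if public_id.lower() in {known.lower() for known in KNOWN_PUBLIC_IDS}:
        return "wrong-case"
    return "unknown"


print(check_doctype('<!DOCTYPE html public "-//w3c//dtd html 4.0 transitional//en">'))
```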


For additions or omissions to this FAQ, please contact <darin@htmlhelp.com>.

All information contained herein was originally compiled by members of the Web Design Group, principally Arnoud "Galactus" Engelfriet, John Pozadzides, and Darin McGrew.

Additional input has been provided by Boris Ammerlaan, Lori Atwater, Alex Bell, Stan Brown, Roger Carbol, Alex Chapman, Jan Roland Eriksson, Jon Erlandson, Mark Evans, Alan Flavell, Lucie Gelinas, Bjoern Hoehrmann, Tina Marie Holmboe, Peter Jones, Nick Kew, Jukka Korpela, Simon Lee, Nick Lilavois, Neal McBurnett, Glen McDonald, Dan McGarry, Ken O'Brien, Timothy Prodin, Steve Pugh, Liam Quinn, Colin Reynolds, Kai Schätzl, Doug Sheppard, Sue Sims, Toby Speight, Warren Steel, Ian Storms, Peter Thomson, Daniel Tobias, and Diane Wilson.

Thanks everyone!


Copyright © 1996-1999 Web Design Group. All rights reserved.