The Web Design Group

Web Authoring FAQ: Web Publishing


This list of Frequently Asked Questions is maintained by the WDG and was last updated on November 29, 1999. It may be found at the following URLs:

If you would like to contribute to this FAQ, please send mail to <darin@htmlhelp.com>. All contributors will be listed at the bottom of the FAQ.

2. Web Publishing

  1. Where can I put my newly created Web pages?
  2. Where can I announce my site?
  3. Is there a way to get indexed better by the search engines?
  4. How do I prevent my site from being indexed by search engines?
  5. How do I redirect someone to my new page?
  6. How do I password protect my web site?
  7. How do I stop my page from being cached?
  8. How do I hide my source?
  9. How do I detect what browser is being used?
  10. How do I get my visitors' email addresses?
  11. Why is my custom 404 message not displayed?

2.1. Where can I put my newly created Web pages?

Many ISPs offer web space to their dial-up customers. Typically this will be less than 5MB, and there may be other restrictions; for example, many do not allow commercial use of this space.

There are several companies and individuals who offer free web space. This usually ranges from 100KB up to 1MB, and again there are often limitations on its use. They may also require a link to their home page from your pages. The following have pointers to providers of free web space:

There are also many web space providers (also known as presence providers) who will sell you space on their servers. Prices range from as little as $1 per month up to $100 per month or more, depending upon your needs. Non-virtual Web space is typically the cheapest, offering a URL like <URL:http://www.some-provider.com/yourname/>. For a little more, plus the cost of registering a domain name, you can get virtual web space, which will allow you to have a URL like <URL:http://www.yourname.com/>.

If you have a permanent connection to the Internet, perhaps via a leased line from your ISP, then you could install an HTTP server (httpd) and operate your own Web server. Web servers are available for almost all platforms.

If you just wish to share information with other local users, or with people on a LAN or WAN, you could simply place your HTML files on the LAN for everyone to access. Alternatively, if your LAN supports TCP/IP, you could install a Web server on your computer.

[Table of Contents]

2.2. Where can I announce my site?

[Table of Contents]

2.3. Is there a way to get indexed better by the search engines?

There is no single technique, but a number of factors can help.

Note that the CONTENT attribute of the META keywords and description tags may contain up to 1022 characters, but no markup other than entities.
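As a rough illustration of the limit mentioned above, the following Python sketch builds a META element and rejects values that are too long. The function name meta_tag is hypothetical, and counting the length after entity escaping is an assumption; the text does not say how the 1022 characters are measured.

```python
from html import escape

MAX_CONTENT_LENGTH = 1022  # limit quoted in the FAQ text


def meta_tag(name: str, content: str) -> str:
    """Return a META element whose CONTENT holds text and entities only."""
    # Escape &, <, > and quotes so the value contains no markup.
    escaped = escape(content, quote=True)
    # Assumption: the limit applies to the escaped value.
    if len(escaped) > MAX_CONTENT_LENGTH:
        raise ValueError(
            f"CONTENT is {len(escaped)} characters; limit is {MAX_CONTENT_LENGTH}"
        )
    return f'<META NAME="{name}" CONTENT="{escaped}">'


print(meta_tag("description", "Tips & tricks for Web authors"))
print(meta_tag("keywords", "HTML, Web authoring, FAQ"))
```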

You might want to preview your site with a text-only browser like Lynx, to get an idea of how your site appears to search engines. Search Engine Watch at <URL:http://searchenginewatch.com/> is a Web site dedicated to search engines and strategies for Web page authors.

[Table of Contents]

2.4. How do I prevent my site from being indexed by search engines?

See <URL:http://info.webcrawler.com/mak/projects/robots/exclusion.html>.
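The page referenced above describes the Robots Exclusion Standard, under which a well-behaved robot reads a robots.txt file at the top level of the server before fetching anything else. The Python sketch below uses the standard library's robots.txt parser to show how a robot would interpret a file that excludes everything. The robot name and URL are made up for illustration, and robots that ignore the standard are not affected by such a file.

```python
from urllib.robotparser import RobotFileParser

# A robots.txt that asks all robots to stay away from the whole site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def may_fetch(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if a compliant robot would be allowed to fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# "ExampleBot" and the URL are hypothetical.
print(may_fetch(ROBOTS_TXT, "ExampleBot", "http://www.example.com/index.html"))
```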

[Table of Contents]

2.5. How do I redirect someone to my new page?

The most reliable way is to configure the server to send out a redirection instruction when the old URL is requested. Then the browser will automatically get the new URL. This is the fastest and most efficient way, and is the only way described here that can convince indexing robots to phase out the old URL. For configuration details consult your server admin or documentation (with NCSA or Apache servers, use a Redirect statement in .htaccess).
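To make the "redirection instruction" concrete, here is a minimal Python sketch of a server that answers requests for an old path with a redirect status and a Location header. It is not a substitute for configuring your real server. The MOVED table, the paths, and the choice of status 301 (moved permanently) are assumptions made for the example.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical mapping from old paths to new URLs.
MOVED = {"/old.html": "http://www.example.com/new.html"}


def redirect_for(path):
    """Return (status, headers) for a moved path, or None if it has not moved."""
    new_url = MOVED.get(path)
    if new_url is None:
        return None
    # 301 tells browsers and robots that the move is permanent.
    return 301, {"Location": new_url}


class RedirectHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        result = redirect_for(self.path)
        if result is None:
            self.send_error(404)
            return
        status, headers = result
        self.send_response(status)
        for name, value in headers.items():
            self.send_header(name, value)
        self.end_headers()


def serve(port=8000):
    """Run the sketch locally; not called automatically."""
    HTTPServer(("127.0.0.1", port), RedirectHandler).serve_forever()
```

Calling serve() and requesting /old.html from a browser should send the browser on to the new URL without any action by the visitor.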

If you can't set up a redirect, there are other possibilities. These are inferior because they tell the search engines that there's still a page at the old location, not that the page has moved to a new location. But if it's impossible for you to configure redirection at your server, here are two alternatives:

[Table of Contents]

2.6. How do I password protect my web site?

Password protection is done through HTTP authentication. The configuration details vary from server to server, so you should read the authentication section of your server documentation. Contact your server administrator if you need help with this.
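For readers curious about what the server does behind the scenes, the Python sketch below checks a request's Authorization header under the Basic scheme, which is one common form of HTTP authentication. The username, password, and realm are hypothetical, and a real server would do this for you once configured.

```python
import base64
import hmac

# Hypothetical credentials; a real server keeps these in a password file.
USERNAME = "visitor"
PASSWORD = "secret"


def is_authorized(authorization_header):
    """Return True if the header carries the expected Basic credentials."""
    if not authorization_header:
        return False
    scheme, _, encoded = authorization_header.partition(" ")
    if scheme.lower() != "basic":
        return False
    try:
        decoded = base64.b64decode(encoded.strip(), validate=True).decode("utf-8")
    except (ValueError, UnicodeDecodeError):
        return False
    user, sep, password = decoded.partition(":")
    if not sep:
        return False
    # Compare as bytes in constant time.
    return (hmac.compare_digest(user.encode(), USERNAME.encode())
            and hmac.compare_digest(password.encode(), PASSWORD.encode()))


def challenge():
    """Status and header that make the browser prompt for a password."""
    return 401, {"WWW-Authenticate": 'Basic realm="Members"'}
```

Note that the Basic scheme only encodes the password; it does not encrypt it.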

For example, if your server is Apache, see <URL:http://www.apache.org/docs/misc/FAQ.html#user-authentication>.

[Table of Contents]

2.7. How do I stop my page from being cached?

Browsers cache web documents; they store local copies of documents to speed up repeated references to documents that haven't changed. Also, many browsers are configured to use public proxy caches, which serve many users (e.g., all customers of an ISP, or all employees behind a corporate firewall). To effectively control how your documents are cached you must configure your server to send appropriate HTTP headers. The configuration details vary from server to server, so check your server documentation.

The Expires header is understood by virtually all caches. The cached document will be retrieved again automatically once it has expired. The Expires header must contain an HTTP date, which must be Greenwich Mean Time (GMT), not local time.
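A small Python sketch of producing a correctly formatted Expires value follows. The function name and the optional now parameter are hypothetical conveniences; the point is that the date is rendered in GMT in the HTTP date format rather than in local time.

```python
import time
from email.utils import formatdate


def expires_header(seconds_from_now, now=None):
    """Return an HTTP date, in GMT, the given number of seconds in the future."""
    if now is None:
        now = time.time()
    # usegmt=True yields the "GMT" suffix that HTTP dates require.
    return formatdate(now + seconds_from_now, usegmt=True)


# Example: let caches keep the document for one hour.
print("Expires:", expires_header(3600))
```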

HTTP 1.1 introduced the Cache-Control header, which provides more flexibility for telling caches how to handle the document. For more information, see the HTTP 1.1 draft (see <http://www.w3.org/Protocols/>).
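As an illustration of that flexibility, this Python sketch assembles a Cache-Control value from a few HTTP 1.1 directives (max-age, private, and no-cache). The helper and its parameters are hypothetical, and it covers only a small subset of what the specification defines.

```python
def cache_control(max_age=None, private=False, no_cache=False):
    """Build a Cache-Control header value from a few common directives."""
    directives = []
    if no_cache:
        # Caches must revalidate with the server before reusing the document.
        directives.append("no-cache")
    if private:
        # Only the user's own browser cache may store the document.
        directives.append("private")
    if max_age is not None:
        # Lifetime in seconds, relative to the time of the response.
        directives.append(f"max-age={int(max_age)}")
    return ", ".join(directives)


print("Cache-Control:", cache_control(private=True, max_age=3600))
```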

The Pragma header is generally ineffective as a response header because its meaning there is not standardized and few caches honor it. Using <META HTTP-EQUIV=...> elements in HTML documents is also generally ineffective; some browsers may honor such markup, but other caches ignore it completely.

Further discussion can be found at <http://www.mnot.net/cache_docs/>.

[Table of Contents]

2.8. How do I hide my source?

You can't. The HTML source is necessary for the browser to display your document; you must send the complete, unencrypted source to the browser. Even if a particular browser doesn't have a "View Source" feature, there are many that do, and someone can always retrieve the document by hand (using telnet) or from the browser's cache.
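To show how little is needed to retrieve a document "by hand", here is a Python sketch that sends the same few lines one would type into a telnet session and returns whatever the server sends back. The function names are hypothetical, and the host is whatever server you point it at.

```python
import socket


def build_request(host, path="/"):
    """Return the raw bytes of a minimal HTTP GET request."""
    return f"GET {path} HTTP/1.0\r\nHost: {host}\r\n\r\n".encode("ascii")


def fetch_source(host, path="/", port=80, timeout=10.0):
    """Send the request over a plain socket and return the raw response."""
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(build_request(host, path))
        chunks = []
        while True:
            data = conn.recv(4096)
            if not data:  # server closed the connection
                break
            chunks.append(data)
    return b"".join(chunks)
```

The response contains the HTTP headers followed by the complete, unmodified HTML source.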

There are tricks that make it more difficult for some readers to view or save your source (e.g., tricking newbies into thinking there's nothing there by adding dozens of blank lines to the beginning of the document). However, just as with tricks that try to protect images from being saved, these tricks have very limited effectiveness and can cause various problems for law-abiding users.

[Table of Contents]

2.9. How do I detect what browser is being used?

Many browsers identify themselves when they request a document. A CGI script will have this information available in the HTTP_USER_AGENT environment variable, and it can use that to send out a version of the document which is optimized for that browser.
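A minimal Python sketch of a CGI script reading that variable is shown below. The function names are hypothetical, and the script simply reports the value rather than acting on it.

```python
import os


def user_agent(environ=os.environ):
    """Return the browser's self-reported name, or "" if none was sent."""
    return environ.get("HTTP_USER_AGENT", "")


def cgi_response(environ=os.environ):
    """Build a complete CGI response that echoes the User-Agent value."""
    agent = user_agent(environ) or "unknown"
    body = f"Your browser identifies itself as: {agent}\n"
    # A CGI response is a header block, a blank line, then the body.
    return "Content-Type: text/plain\r\n\r\n" + body


if __name__ == "__main__":
    print(cgi_response(), end="")
```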

Keep in mind that not all browsers identify themselves correctly. Microsoft Internet Explorer, for example, claims to be "Mozilla" in order to receive Netscape-enhanced documents.

And of course, if a caching proxy keeps the Netscape-enhanced document, someone using another browser will also get this document when going through the cache.
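The following Python sketch shows why a naive check goes wrong. The two User-Agent strings are representative examples of what Netscape and Internet Explorer of that era send, not an exhaustive list, and the function name is hypothetical.

```python
def naive_is_netscape(user_agent):
    """A naive check: assume anything calling itself Mozilla is Netscape."""
    return user_agent.startswith("Mozilla/")


# Representative strings; real values vary by version and platform.
NETSCAPE = "Mozilla/4.7 [en] (Win98; I)"
EXPLORER = "Mozilla/4.0 (compatible; MSIE 5.0; Windows 98)"

for agent in (NETSCAPE, EXPLORER):
    print(agent, "->", naive_is_netscape(agent))
```

Both strings pass the check, so Internet Explorer would be served the Netscape-enhanced document.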

For these reasons and others, it is not a good idea to play the browser guessing game.

[Table of Contents]

2.10. How do I get my visitors' email addresses?

You can't. Although each request for a document is usually logged with the name or address of the remote host, the actual username is almost never logged. This is mostly for performance reasons: logging it would require the server to use the ident protocol to see who is on the other end, which takes time. And if a caching proxy is making the request, you don't get anything sensible.

But just stop to think for a minute... would you really want every single site you visit to know your email address? Imagine the loads of automated thank-you messages you would receive. If you visited 20 sites, you would get at least 20 emails that day, plus no doubt they would send you invitations to return later. It would be a nightmare as well as an invasion of privacy!

In Netscape 2.0, it was possible to use JavaScript to automatically submit a form with a mailto: URL as its action. This would send email to the document's owner, with the address the visitor had configured in the From line. Of course, that address could be something like "mickey.mouse@disney.com". This was fixed in Netscape 2.01.

The most reliable way is to put up a form asking visitors to fill in their email address. To increase the chances that visitors will actually do it, offer them something useful in return.
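On the receiving end, a script has to pull the address out of the submitted form data. The Python sketch below parses a URL-encoded form body and applies a very loose plausibility check. The field name "email" and the function name are assumptions, and the check cannot tell whether the address is real.

```python
from urllib.parse import parse_qs


def email_from_form(body, field="email"):
    """Extract an email address from a URL-encoded form submission."""
    values = parse_qs(body).get(field, [])
    address = values[0].strip() if values else ""
    # Loose check only: something@domain with a dot in the domain.
    local, sep, domain = address.partition("@")
    if not (local and sep and "." in domain):
        return None
    return address


print(email_from_form("name=Ann&email=ann%40example.com"))
```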

[Table of Contents]

2.11. Why is my custom 404 message not displayed?

Recent versions of Internet Explorer default to "friendly" HTTP error messages. When the body of certain HTTP error responses (e.g., a 404 response) is shorter than 512 bytes, the browser substitutes its own message for the one delivered by the server. As a user of Internet Explorer, you can disable this feature in the "Advanced" options panel. As a web author, your only recourse is to make the error page at least 512 bytes long.
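One way to reach that size without changing what visitors see is to pad the page with an HTML comment. The Python sketch below does this; the function name is hypothetical, and the 512-byte threshold is simply the figure given above.

```python
MIN_BYTES = 512  # threshold quoted in the FAQ text


def pad_error_page(html, encoding="utf-8"):
    """Append an HTML comment so the page is at least MIN_BYTES long."""
    size = len(html.encode(encoding))
    if size >= MIN_BYTES:
        return html
    overhead = len("<!-- ") + len(" -->")  # 9 bytes of comment syntax
    filler = "x" * max(MIN_BYTES - size - overhead, 0)
    return html + f"<!-- {filler} -->"


page = "<HTML><TITLE>Not Found</TITLE><BODY>Sorry, no such page.</BODY></HTML>"
print(len(pad_error_page(page).encode("utf-8")))
```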

[Table of Contents]


For additions or omissions to this FAQ, please contact <darin@htmlhelp.com>.

All information contained herein was originally compiled by members of the Web Design Group, principally Arnoud "Galactus" Engelfriet, John Pozadzides, and Darin McGrew.

Additional input has been provided by Boris Ammerlaan, Lori Atwater, Alex Bell, Stan Brown, Roger Carbol, Alex Chapman, Jan Roland Eriksson, Jon Erlandson, Mark Evans, Alan Flavell, Lucie Gelinas, Bjoern Hoehrmann, Tina Marie Holmboe, Peter Jones, Nick Kew, Jukka Korpela, Simon Lee, Nick Lilavois, Neal McBurnett, Glen McDonald, Dan McGarry, Ken O'Brien, Timothy Prodin, Steve Pugh, Liam Quinn, Colin Reynolds, Kai Schätzl, Doug Sheppard, Sue Sims, Toby Speight, Warren Steel, Ian Storms, Peter Thomson, Daniel Tobias, and Diane Wilson.

Thanks everyone!



Copyright © 1996-1999 Web Design Group. All rights reserved.