ASP Techniques for Webmasters
By Alex Homer

    Introduction

    This article is an abridged version of a chapter, by Alex Homer, in a new book called 'Pro ASP Techniques for Webmasters' from Wrox Press (ISBN 1861001797). The article discusses how ASP can be used in conjunction with various other techniques to provide feedback about your site, and to prevent errors. The book as a whole covers a wide range of issues, from basic site navigation and browser compatibility, through security and remote administration, to tasks like visitor logging and mailing list management.

    Feedback, Errors, and Broken Links

    It would be nice to think that once we've set up our site, we can sit back and just keep tweaking the content and adding the occasional new item on demand, while it all keeps working transparently in the background. This kind of ideal world exists only in textbooks, not in reality. We're only too aware of how easy it is for your site to get 'broken' by your own or other people's actions. Even the biggest sites, which have large development teams and no obvious shortage of resources, suffer broken links.

    As an example, if your site has links to another site outside your direct control, you can find that suddenly the links are broken. The other site may have moved some pages around, changed their main site URL, or just gone out of business altogether. And they didn't have the decency to let you know! But, there again, do you know who provides links to your site? If you change the URL of a page on your site, do you email all the other sites that link to it to tell them?

    In this chapter we'll tackle the issues that arise when you provide links to other sites, and they provide links to you. And of course we also need to think about how we ensure that we don't break any links that are wholly within our own site when we move pages around. We'll also look at how we can collect opinions from our visitors about our site. In particular we'll consider:

    Ways of preventing errors and broken links appearing on your site
    How we can create custom error pages to catch broken links or other errors
    How we can log inter-site navigation and other errors, when they do occur
    Ways of checking that other sites we provide links to are still available
    How we can provide feedback and collect opinions from our visitors
    We'll start with a look at how we can try and prevent errors or broken links appearing on our site in the first place.

    Designing to Prevent Errors

    No matter how well you plan and manage the development of your Web site, you are going to have at least a few errors or broken links appearing at some stage. It might not always be your fault. The Web depends on the unimaginably complex mass of inter-site and inter-page links to make it what it is. Maintaining and updating your share of these links is often the biggest headache of all for the site administrator. In this part of the chapter, we'll look at how these errors can arise, and see some basic procedures that will help to minimize their effects.

    A Typical Scenario

    Here's a typical scenario. A potential visitor to your site is looking for information on, say, reverse boost accelerator flanges. They go to a search engine and enter the criteria. Back comes a list of suppliers, with your site sitting proudly at the top of the list. They click the link and get this:

    OK, so it might be that you have stopped making reverse boost accelerator flanges. However, it might be that the page in the search engine's list referred to purple ones, and you now only make green ones. Even worse, it might be that you just changed the name or location of the page. Whichever is the case, the result is the same. The potential visitor (and customer) will buy their flanges from someone else.

    A Better Solution

    Here's an alternative scenario. Someone goes to a search engine and finds a link that points to a page on our Web-Developer site that no longer exists. Instead of the impersonal 404 Not Found error message, this is what they see:

    As well as an acknowledgement that our site still exists, the visitor gets a helpful message and (more importantly) a couple of links to follow. This is crucial because, having attracted them to our site, we want to keep them there. From this page, they can go to our Home page or to a map of our site. You'll see how we implement custom error pages like this later in the chapter.

    Provide a Site Map

    Most Web users have got used to the fact that pages (and sites) move around, and they accept this. However, once they get to our site, particularly if they are looking for something specific, we need to make things easy to find. The common answer is a site map page. This can be as simple as a list of pages, or as complex as a clickable graphical representation of the site.

    Our Web-Developer site uses a page that is mid-way between these styles. It provides a description of each section of the site, including the kind of content it contains, followed by links to the main pages within each section. It's certainly more attractive than just a list of links, and hopefully easier to use as well:

    Click Here to View the Figure

    Missing Images and Broken Links

    Keeping track of links between pages can be hard enough, but keeping track of image file links can be even harder. Placing all of your site's images in one folder helps, because at least then they all have the same path. However, if the image isn't there, getting the right path doesn't help. There isn't much that looks worse than a page with missing images:

    Click Here To View the Figure

    To make matters worse, the page you see in the previous screenshot is even more broken than it appears. The first and third links both have incorrect URLs, and so clicking on them results in an error. The only redeeming factor is that they do bring up our custom error page, so the user can go to the site map or home page to find the resource they want.

    Recording the Errors

    While it's generally easy to find missing image errors, because they are so visible in the page, they can arise without warning. If you change a graphic for one page, and delete the original one, do you remember to check for any other pages that use the old graphic? You might not visit that page for a while to find the 'missing image' symbol. Even worse, broken links to other pages are not visible when you just load a page. It's only when you click on a link that you find the error.

    The 404 Not Found HTTP Header

    This is where the custom error page you saw earlier in this chapter is so useful. When any broken link is activated, whether it's a link to another page or just a link to an image within a page, Internet Information Server loads our custom error page.

    Notice that the error page (either the standard version or our custom one) is still sent back to the browser when the link is to an image on the page, rather than a link to another page. The viewer doesn't see the error page, however, because the Web server accompanies it with a 404 Not Found header in the response (see Chapter 2 for a discussion of HTTP headers). The browser interprets this header and, because it knows that the request is for an image, displays the 'missing image' icon instead.

    So our custom error page is still loaded by the Web server and sent to the browser for every broken link, be it a missing page, a missing image, or any other missing resource that the browser requests, such as a Java applet or an ActiveX control. Inside our custom error page is code that records the fact that there was an error, and the reason why the error occurred, in a database on our server. The next screenshot shows the page that our administrators can view to monitor for errors. This is the result after loading the Programming Articles page with its two missing graphics you saw earlier, and after clicking on the two broken links:

    You can see the entries for the missing graphics, which appear twice because we went back to the page after getting the first 404 Not Found error, and the entries for the broken page links. Later in this chapter we'll look at the whole process of logging broken links, and the way we can display and manage the results. You'll see how the pages shown here are created.

    Preventing Broken Links

    Before we move on to the techniques for creating your own custom error pages, we'll look at some of the ways that you can help to prevent broken links from appearing both on your site, and on other sites and search engines that provide links to your site.

    Tools for Automatically Checking and Maintaining Links

    If you create and maintain your site using one of the proprietary tools, such as Microsoft Visual InterDev or FrontPage, you can take advantage of their features for monitoring links. Using the View Links on WWW option, you can see a graphical representation of your site that automatically highlights any broken links.

    For example, the following screenshot shows Visual InterDev displaying the links in a special version of the Stonebroom Software home page, into which we intentionally introduced two broken links so that you can see how it works:

    Click Here to View the Picture

    The broken links are shown with the icon appearing to be torn into two pieces. And, although you can't see it here, the 'broken link' lines are also displayed in red, rather than gray like the other links. Notice also that one of the broken links (software.htm) is on this site, whereas the other (http://www.ewarhouse.com) is a misspelled external link. This is visible because we turned on the Show External Files option using the toolbar shown above. Most tools, including Visual InterDev and FrontPage, will also display links to all the graphics on the pages; we had this option turned off in the previous screenshot.

    There are several tools available that can provide this feature, in a range of different ways. You'll also find that most tools that allow you to create and manage your site's content will attempt to automatically update the links when you move pages from one folder to another. This can be a real time-saver, and avoids most instances of this kind of error.
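
    The kind of check these tools perform on links within a site can be sketched in a few lines. The following is a minimal Python illustration, not how Visual InterDev or FrontPage work internally: it assumes the site is supplied as a dictionary of site-relative paths and HTML text, and the function name find_broken_links and the placeholder host site.invalid are made up for the example. External links are skipped rather than tested.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collects HREF and SRC attribute values from one HTML page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        for name, value in attrs:
            if name in ("href", "src") and value:
                self.links.append(value)


def find_broken_links(pages):
    """Return (page, link) pairs whose target is not a known page or file.

    `pages` maps site-relative paths such as "/default.htm" to HTML text;
    files with no HTML (images, for instance) can map to an empty string.
    """
    host = "site.invalid"  # placeholder host, used only to resolve relative paths
    broken = []
    for path, html in pages.items():
        collector = LinkCollector()
        collector.feed(html)
        for link in collector.links:
            target = urlparse(urljoin("http://" + host + path, link))
            if target.netloc != host:
                continue  # external or non-HTTP link: not checked here
            if target.path not in pages:
                broken.append((path, link))
    return broken
```

    Given a home page that links to software.htm and to images/logo.gif, where only the image exists, the function reports the pair ("/default.htm", "software.htm").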

    Check your HREF Syntax

    Even if you get the name of the page or graphic correct, you can still sometimes cause broken links. A typical case is where you only check your site using Internet Explorer, because it is a lot less fussy about the syntax of links than most other browsers. In particular, if you use a backslash instead of a forward slash in a URL, Navigator will disregard everything after the backslash. This means that it won't find graphics or other pages that are referenced in a relative link.

    For example, if we insert an image into a page named http://sunspot/test/stuff/mypage.htm using:

    <IMG SRC="mypicture.gif">

    then load this page into Navigator or Internet Explorer, it works fine. And if we put a link with the wrong type of slash on another page that points to this page:

    <HTML><A HREF="stuff\mypage.htm">Go to my page</A></HTML>

    IE converts the backslash into a forward slash and still finds the picture in the new page. However, in Navigator (and many other browsers), you don't get the image. They look for the image in the directory one level up (test in these screenshots). You can see the unconverted slash character in the box in the next screenshot:

    Click Here To View Figure

    The same applies if you include a backslash in the path to an image. It's an easy mistake to make if you are used to typing physical paths to files in a DOS Command window. Hence the often-stated advice: always check your pages in as many different browsers as you can.
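
    Checking in several browsers catches the problem after the fact; a simple scan of the HTML can catch it beforehand. The following Python sketch (the names BackslashFinder and find_backslash_urls are invented for this example) lists every HREF or SRC value that contains a backslash. It is only an illustration and looks at no other attributes.

```python
from html.parser import HTMLParser


class BackslashFinder(HTMLParser):
    """Records HREF and SRC values that contain a backslash."""

    def __init__(self):
        super().__init__()
        self.suspect = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value and "\\" in value:
                self.suspect.append((tag, name, value))


def find_backslash_urls(html):
    """Return (tag, attribute, value) for each URL written with a backslash."""
    finder = BackslashFinder()
    finder.feed(html)
    return finder.suspect
```

    Feeding it the link from the example above, HREF="stuff\mypage.htm", yields one entry for the A tag, while the IMG tag with SRC="mypicture.gif" is not reported.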

    Provide Default Pages

    Back in Chapter 2, we discussed how we can use a default page to redirect visitors to an appropriate starting point for the resource they are looking for, or a main Home page. This is useful when users access the site without providing the name of a page, for example http://yoursite/usefulpages/. You can send back a menu of the useful pages you provide, or a Home page containing a prominent link to these useful pages.

    The default page can be an HTML page (usually default.htm) which contains a <META> redirection tag, some client-side redirection code, or just a normal <A> link (or preferably all three, as shown in Chapter 2). However, an even better solution is to use an ASP page (usually default.asp) that performs the redirection through a Response.Redirect statement. This is less obvious to the visitor than displaying a link for them to click, or a blank page with a <META> redirection tag.

    For example, we might have this line in the default.asp page in our Reference directory:

    <% Response.Redirect "/default.asp?page=reference" %>

    Any user that accesses the URL http://webdev.wrox.co.uk/reference/ will be automatically and invisibly redirected to the main site frameset page, default.asp in the root directory of the whole site. But, because this page accepts a parameter named page in the query string that will load a particular page into the main frame, they get a frameset with the Wrox Reference Tools menu page displayed instead of the main site Home page:

    ...
    <FRAMESET cols="100,*" FRAMEBORDER="NO" BORDER="NO">
    <%
    strPageURL = Request.QueryString("page")
    If Len(strPageURL) = 0 Then 'main Home page
    %>
    <FRAME src="/navigate.htm" SCROLLING="NO" MARGINWIDTH=1 MARGIN0>
    <FRAME src="/webdev/WhatsNew.asp" NAME="mainframe">
    <%
    Else
    Select Case strPageURL
    Case "reference" 'reference tools menu
    %>
    <FRAME src="/nav_wdr.htm" SCROLLING="NO" MARGINWIDTH=1 MARGIN0>
    <FRAME src="/reference/reference.asp" NAME="mainframe">
    <%
    ... 'code for other 'page' values goes here
    ...
    <%
    End Select
    End If
    %>
    </FRAMESET>
    ...
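
    The Select Case dispatch in the listing above amounts to a lookup from the page parameter to a pair of frame sources. As a rough model only, here it is as a Python table; the two entries come from the listing, while the fallback to the Home page for unrecognized values is an assumption, since the original elides its other Case branches. The names FRAMES and frames_for are invented for the sketch.

```python
# Frame sources taken from the frameset listing; the key is the value of
# the "page" query-string parameter.
FRAMES = {
    "": ("/navigate.htm", "/webdev/WhatsNew.asp"),              # main Home page
    "reference": ("/nav_wdr.htm", "/reference/reference.asp"),  # reference tools menu
}


def frames_for(page):
    """Return (navigation frame URL, main frame URL) for a "page" value.

    Assumption: a missing or unrecognized value falls back to the Home
    page frames. The original listing does not show this case.
    """
    return FRAMES.get(page or "", FRAMES[""])
```

    With this table, a request carrying page=reference selects /nav_wdr.htm and /reference/reference.asp, and a request with no parameter selects the Home page frames.
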
    Although search engines and other sites tend to specify the full URL (i.e. including the page's filename) when they link to your site, this technique is still useful. Experienced visitors, when faced with a 'Not Found' message, often trim off the file name or parts of the path in the URL in their browser's address box and try again. If you've provided a default page, you'll catch the request this time.

    And of course it's always worthwhile trying to get other sites to omit the filename when they link to your site anyway, especially if it's to a specific subset of pages. One way to do this is to make it easy for other sites to put links to you on their pages. On the Web-Developer site we provide a Trade A Link page that contains not only some graphics that other sites can use, but also the code to insert them into the HTML. This means that we get to specify the HREF they use:

    Click Here To View the Figure

    Using a Directory Listing

    Back in Chapter 2, we mentioned the risks involved in allowing users to browse the folders on your Web server. If you turn on the Directory Browsing Allowed option (which can be done for an individual directory in IIS4, but only for the whole site in IIS3), visitors will be able to view that folder's contents and navigate between folders that don't contain a default file (usually default.htm or default.asp ):

    This might be useful if you want to give visitors free access to all the contents of that folder on the server. However, if the pages depend on each other (for example they are part of an application or have to be loaded in a particular order), visitors may get inconsistent results if they load a page that is not the proper 'start' page. Also remember that the listing allows users to move from one folder to another, so make sure that at least one physical folder above this one (i.e. nearer to the root folder of the site) has a default.htm or default.asp file to prevent the entire site's contents from being listed.
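
    One way to see which folders would be exposed by directory browsing is to list those that hold no default file. The Python sketch below assumes the folder contents have already been gathered into a dictionary (how they are gathered is left open), and the name folders_without_default is made up for the example.

```python
DEFAULT_NAMES = {"default.htm", "default.asp"}


def folders_without_default(tree):
    """Return the folders that contain neither default.htm nor default.asp.

    `tree` maps a folder path to the collection of file names it holds.
    Names are compared case-insensitively, as on Windows NT.
    """
    missing = []
    for folder, files in sorted(tree.items()):
        names = {name.lower() for name in files}
        if not names & DEFAULT_NAMES:
            missing.append(folder)
    return missing
```

    Any folder the function returns would show a file listing to visitors if directory browsing were turned on for it.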

    Problems with Search Engines

    One area where more broken links than ever can arise is when your page is indexed by one of the Web's search engines. This might be done without you realizing. Often 'crawlers' that you've never even heard of follow links from one site to another, so they can index your site while you know nothing about it. The Excite Web Robot pages list 173 known robots currently active on the Web at the time of writing; see http://info.webcrawler.com/mak/projects/robots/robots.html for more details.

    If you have custom visitor logging enabled (as demonstrated in the next chapter) you can see if any crawlers have passed through your site, by examining the list of user agents and comparing them to the Excite Robots list.
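
    The comparison itself is a simple matching exercise. As a hedged illustration in Python, assuming you have the logged user-agent strings and a list of robot names to hand (the function name find_crawler_visits is invented, and any sample strings you try it with are your own data, not values from the Robots list):

```python
def find_crawler_visits(user_agents, known_robots):
    """Return the logged user-agent strings that mention a known robot name.

    Matching is a case-insensitive substring test, which is deliberately
    loose: very short robot names may produce false positives.
    """
    robots = [name.lower() for name in known_robots]
    return [
        agent for agent in user_agents
        if any(robot in agent.lower() for robot in robots)
    ]
```

    A logged agent string that contains "Scooter", for instance, would be picked out when "Scooter" is in the list of known robots, while an ordinary browser string would not.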

    The traditional search engines, such as Yahoo, Alta Vista, Infoseek, Excite, etc. maintain huge indexes that contain millions of entries, and some don't get round to checking and confirming that entries are valid very often-if ever. They depend on you to do the work of keeping your index entries up to date. However, with the dozens of search engines in everyday use, it's very difficult to keep track of every link to your site.

    Remove Old Pages from Search Engines

    However, if you discover a broken link to your site, you can (and should) do something about it. One option is to place a default.asp or default.htm redirection file at the referenced position in your site. Then, visitors following the link provided by the search engine will be redirected to your new page, or directly to your Home page.

    Alternatively, once you know which search engine holds the broken link, you can go to that search engine's site and remove it. Almost all search engines provide a facility to report and remove old and non-existent links. For example, here is the page on Infoseek for doing just that:

    Alta Vista also provides the same service. Their 'Scooter' crawler checks that indexed pages still exist, and removes them automatically. However, it may not get round to your site for a while. To remove pages from the index yourself, you just submit the old invalid URLs using the normal Add/Remove URL option. Scooter will then attempt to index them immediately, find that they have gone, and delete them from the index.

    Preventing Directory and File Indexing

    If you want to prevent crawlers and search engines from indexing parts of your site, you can use either a special <META> tag or a robots.txt file. Both techniques are simple enough to implement, though you can't guarantee that all search engines will abide by your instructions. However, one or both are worth including, as most of the popular search engines recognize them.

    Using a robots.txt File

    The simplest and quickest way to control indexing of your pages is to provide a single robots.txt file in the root folder of your site. Note that this file must be in the folder referenced by your root URL, i.e. /robots.txt or http://yoursite/robots.txt, and that the filename is case-sensitive: it must all be in lowercase.

    Inside the file you place a single User-agent identifier, followed by any number of Disallow statements. Each one prevents indexing of this folder (or file), and any folders below this folder. You can add comments after a hash character in any line. The usual kinds of entries are:

    User-agent: * # applies to all robots
    Disallow: /fileorfolder # all files and folders with this name
    Disallow: /thisfolder/ # disallow a folder with this name
    Disallow: /thisfolder/thisfile.htm # disallow just this file
    Note that the entry for /fileorfolder will prevent indexing not only of any folder named fileorfolder that is in the root folder of your site, but also of any pages named fileorfolder.htm, fileorfolder.asp, etc. in the root folder, because the rule matches every URL that starts with that path. To prevent any part of your site being indexed, you can use the file:

    User-agent: * # applies to all robots
    Disallow: / # applies to the entire site
    Remember that, even though Windows NT is relaxed about case sensitivity, this is not universally so on the Web. Make sure the URLs are all of the correct case in your robots.txt file. Search engines change the rules that they apply when indexing sites, in particular to protect themselves from multiple entries pointing to the same page. At the time of writing, some were reviewing their policy on ASP and other dynamically created pages. Search sites generally publish the conditions that they apply when indexing sites, and you should keep up to date with these to make sure your sites are included where this is appropriate.
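
    To see how a well-behaved robot would read the first example file, you can try the rules against a few URLs. The sketch below uses the RobotFileParser class from Python's standard urllib.robotparser module; the helper name is_allowed, the agent name ExampleBot and the test URLs on yoursite are made up for the illustration, and real crawlers may interpret the file differently.

```python
from urllib.robotparser import RobotFileParser

ROBOTS_TXT = """\
User-agent: *                        # applies to all robots
Disallow: /fileorfolder              # all files and folders with this name
Disallow: /thisfolder/               # disallow a folder with this name
Disallow: /thisfolder/thisfile.htm   # disallow just this file
"""


def is_allowed(url, robots_txt=ROBOTS_TXT, agent="ExampleBot"):
    """Report whether `agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)
```

    Under these rules, http://yoursite/fileorfolder.htm and http://yoursite/thisfolder/other.htm are refused, while http://yoursite/thisfolder.htm and http://yoursite/default.asp are allowed, which shows the difference the trailing slash makes.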

    Using a <META> ROBOTS tag

    The alternative method of controlling indexing is with a <META> tag in each page or each section menu page. However (as in the case of using a robots.txt file), you can't guarantee that an instruction in one page will prevent indexing of pages linked to, or in folders below, this page.

    The <META> tag only instructs the crawler on whether it can index this page and follow the links in this page. If you stop it following links to, say, secret.htm in this page, it may still find another page that links to secret.htm and has no <META> indexing control tag. And this page could even be on a different Web site. You really should put the tag in all pages that you don't want to be indexed.

    The tag itself is simple enough. The NAME part is just "robots", and the CONTENT part is a comma-delimited string of instructions. Here are some examples:

    <META NAME="robots" CONTENT="noindex">
    <META NAME="robots" CONTENT="nofollow">
    <META NAME="robots" CONTENT="noindex,nofollow">
    <META NAME="robots" CONTENT="none">
    The first just prevents indexing of this page. The second allows indexing of this page, but prevents the crawler from following any links in the page. The third entry prevents it from indexing this page and following any links in it. The value none in the fourth entry is the equivalent of noindex,nofollow. You can also use index, follow and all. However, these are the defaults if omitted, so there is no real point in doing so.
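
    The meaning of the CONTENT string can be summed up as two yes/no answers: may the page be indexed, and may its links be followed. Here is a small Python sketch of that interpretation as described above; the function name robots_directives is invented, and it makes no attempt to resolve contradictory values such as index,noindex in any particular way.

```python
def robots_directives(content):
    """Translate a robots CONTENT string into (may_index, may_follow)."""
    may_index, may_follow = True, True  # defaults when nothing is specified
    for word in content.lower().split(","):
        word = word.strip()
        if word in ("noindex", "none"):
            may_index = False
        if word in ("nofollow", "none"):
            may_follow = False
        # "index", "follow" and "all" restate the defaults, so they change nothing
    return may_index, may_follow
```

    For the four examples above the function returns (False, True), (True, False), (False, False) and (False, False) respectively.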

    Using Custom HTTP Headers

    You'll recall from the discussion we had in Chapter 2 about <META> tags that they are often just another way of creating HTTP headers, but in the browser rather than on the server. In theory, a search engine or crawler should react to a <META> tag like this:

    <META HTTP-EQUIV="robots" CONTENT="noindex">

    in the same way as it does to the previous examples of the <META> tag. This would allow us to use custom headers created within Internet Information Server for all the files in one folder or virtual directory:

    While we haven't been able to verify that this technique is reliable, there is no harm in implementing it as well as the more traditional methods.

    One interesting point about robots is what you should do if you have pages that are secret, or for which you don't want to advertise their presence on your site. If you name them in robots.txt, are you just providing an excellent way for any crawler to find out about them? It's probably safer to stick to including a <META> tag in the page, making sure there are no links to it anywhere on your site, and of course protecting the file using one of the techniques we described in Chapter 5.

    And Finally

    And, finally, think about providing a site map or resources map to help your visitors find what they want more easily. However, even if you put all these techniques into practice, you can still get 404 Not Found errors. If a user types the URL of a page that doesn't exist, you can't do much about it. Even if there is a default page in that folder (when you have directory listing option disabled) the Web server will still send back a 404 Not Found error. To get round this, we can implement a custom error page like that we showed earlier in this chapter. We'll see how we created our custom error page next.

    Creating Custom Error Pages

    We saw earlier in the chapter just how effective a custom error page can be on your site. It can help you to retain visitors that found your site through an old or partially broken link, or those one-fingered typists who still insist on using the other nine thumbs to enter URLs into their browser's address box. As long as the link gets that visitor to your site (i.e. it includes your domain) it doesn't matter if the rest, the path and filename, is wrong. The visitor will get the custom error page.

    The Custom '404 Not Found' Page

    The error messages that your visitors see in their browser are just ordinary HTML pages. With Internet Information Server 4, when they request a page that doesn't exist, the page 404.htm (stored by default in the Winnt\help\common\ folder of your server) is sent back instead. This means that we can edit this file to personalize it, and, more importantly, add links to it that take the lost visitor to our Home page or site map.

    We can also use a different page, stored in a different directory on our server, or specify the URL of a page stored elsewhere. This means that we can effectively redirect users to another site if that is appropriate. In this case, the target page can be an ASP script instead of the default of an HTML file. And, best of all, by combining these two techniques it's possible to persuade IIS4 to use an ASP page stored on our own server. This allows us to execute a script in response to a navigation error or broken link, which is exactly the plan we had in mind.

    Setting Up the naverror.asp Page in IIS4

    The first point to note is that IIS treats custom error pages that contain ASP code differently when it loads them in response to an error. If an ASP page is just specified as a file (in the IIS error configuration settings), any script code in it won't be executed. To make our custom error page work, we have to specify it as being a URL. Also note that this technique isn't limited to just 404 Not Found errors. There are a whole host of different errors defined and detected within IIS, and there is a default HTML page for each one. You can apply the following technique to any of these pages.

    In IIS4, you can specify individual directories for which the custom error page will be used, so you can have different combinations of custom pages for each error in each directory or virtual directory. In IIS3, you can only specify one set of pages for the entire WWW server.

    Internet Service Manager Configuration

    The first step is to build the custom error page we want to use. We'll show you how the one we use works after we see how to set up the custom error configuration within IIS, because there are issues involved that we'll have to resolve later in our code. We're going to apply our custom page to the entire Web site, so in the Internet Service Manager we right-click the Default Web Site entry and open the Properties dialog. In this dialog, open the Custom Errors page, scroll down to the entry for 404 Not Found, and click the Edit Properties button:

    Click Here to View the Figure

    This displays the Error Mapping Properties dialog, where we change the Message Type from File to URL, and enter the URL of the ASP page to be executed when the error occurs. Here it's named naverror.asp, and is in the root folder of our site.

    Because we are editing the properties of the default site, clicking OK brings up the Inheritance Overrides dialog. We don't want our custom error page used with the two child nodes (virtual applications) shown, so we click OK here without selecting them.

    You'll probably see a different list of child nodes, depending on where you are specifying your custom page, and what virtual applications you have set up on your server.

    How IIS Loads the Custom Page

    With configuration complete in Internet Service Manager, our custom page will now be used instead of the standard 404 Not Found error page, and sent to the browser with every instance of this error. However, one point to note is that the way IIS loads the page depends on what type of file caused the error in the first place.

    The next screenshot shows two error instances, one for an HTML file that doesn't exist and one for an ASP page. Notice that the URL for the missing ASP page is the name of our custom error page with a query string containing the error number and the full URL of the page that was requested. In the case of the HTML page that doesn't exist, the URL is just that of the missing file. You'll see why this is important when we look at the code in the page.

    Click Here To View Figure

    The Code for the naverror.asp Page

    Now that we know what to expect from IIS when our custom error page is loaded, we can look at the code. The first part creates a '404 Not Found' status value to return to the browser, indicating that the page it requested does not exist. Then it defines the heading for the page:

    <%@LANGUAGE="VBScript"%>
    <% Response.Status = "404 Not Found" %>
    <html>
    <head>
    <meta NAME="robots" CONTENT="noindex">
    <title>Wrox Web-Developer Navigation Error</title>
    </head>
    <body BGCOLOR="#FFFFFF">
    <img SRC="/images/WDLogoLg.gif" ALT="Wrox Web-Developer"><BR>
    <font FACE="Arial" SIZE="4" COLOR="darkgray">
    &nbsp; <B>Navigation Error</B></font>
    <hr><P>
    <font FACE="Arial" SIZE="3">The page you requested cannot be found.</font><P>
    ...

    Of course, you'll want to be sure to put a robots type <META> tag with the value noindex in this page, as shown in the code above, to prevent search engines indexing it!

    Collecting the Values We'll Need

    Now we can collect the two values we'll need: the URL of the page that the user requested and which wasn't found, and the URL of the page that contained the link they clicked. If they simply typed the URL into their browser's address box, the second value will be empty. We're going to store the details using two varchar-type fields in our SQL Server table, which are limited to 255 characters.

    The final part of this section of code creates the information strings that we'll put into the page. The ones we actually end up using will depend on the values we obtained for the target and referrer. We also need to limit the referring URL in strReferer to 220 characters so that, even when added to the rest of the error message, it is no more than 255 characters long:

    ...
    <%
    On Error Resume Next 'important in an error page to prevent another error
    strTarget = Request.ServerVariables("QUERY_STRING")
    strReferer = Request.ServerVariables("HTTP_REFERER")
    If Len(strReferer) > 220 Then strReferer = Left(strReferer, 220)
    strInform = "Please inform the WebMaster of the site that contains the link."
    strTyping = "Please check the URL and try again or:"
    strRecord = "Error has been recorded and will be fixed as soon as possible."
    strSQLInfo = "The page " & strReferer & " contains a broken link."
    ...
    You'll recall from our earlier discussion on how IIS loads the custom error page that, if the request was for a missing ASP page, the target URL returned by IIS contains the URL of our error page followed by a query string containing the error number, a semi-colon, and the URL of the target page. The next step is to extract this by stripping it off the end of the original target string in strTarget. If there's no semi-colon, we know that it's an HTML page that caused the error, and so strTarget will just contain the target URL. Then we can trim it to a maximum of 255 characters, so as to fit into our table's varchar-type field. We also write it into the page:

    ...
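    intSemiColon = InStr(strTarget, ";") 'get the original target
    If (intSemiColon > 0) And (intSemiColon < Len(strTarget)) Then
    strTarget = Mid(strTarget, intSemiColon + 1)
    End If
    'trim in both cases so the value always fits the 255-character field
    If Len(strTarget) > 255 Then strTarget = Left(strTarget, 255)
    'encode before echoing, because the URL comes straight from the request
    Response.Write "<B>&gt; &nbsp;" & Server.HTMLEncode(strTarget) & "</B><P>"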
    ...
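
The semi-colon handling is easy to get subtly wrong, so here is a minimal sketch of the same logic in JavaScript that you can experiment with outside ASP. The function name extractTarget and the example.com URL mentioned below are our own assumptions for illustration; they are not part of the naverrors.asp page:

```javascript
// Hypothetical helper: recover the originally requested URL from the query
// string handed to a custom 404 page, e.g. "404;http://example.com/page.asp".
function extractTarget(queryString, maxLength) {
  var target = queryString;
  var semi = target.indexOf(";");
  // Only strip the prefix when something actually follows the semi-colon.
  if (semi >= 0 && semi < target.length - 1) {
    target = target.substring(semi + 1);
  }
  // Trim in every case so the value fits the database field.
  return target.substring(0, maxLength);
}
```

Called with "404;http://example.com/missing.asp" and a limit of 255, it returns the part after the semi-colon. A value with no semi-colon is passed through unchanged, apart from the trim.
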
    Having got our final target string, we can now examine the referrer information we saved in the strReferer string. If this is empty, it indicates that the user typed the URL themselves, so there's not much point in storing the values. However, if it's not empty, we know the user came from another page. In this case, we'll store the values for the target and referrer in our database, because they indicate a broken link:

    ...
    If Len(strReferer) > 0 Then 'they came from a link on another page
    Response.Write "A link on the referring page:<B> " _
    & Server.HTMLEncode(strReferer) & "</B> contains an error.<BR>"
    ...
    'write the values into the NavErrors table
    ...
    Else
    Response.Write strTyping 'they just typed it wrong into their browser
    End If
    %>
    ...

    Storing the Values in the Database

    To get the values into the database, we just create our Connection object, open it, and fire a suitable SQL INSERT statement at it. The connection string is defined in the variable strConnect by using an include file, as we've done in previous chapters. However, you can specify a suitable DSN instead if required:

    If Len(strReferer) > 0 Then 'they came from a link on another page
    Response.Write "A link on the referring page:<B> " _
    & Server.HTMLEncode(strReferer) & "</B> contains an error.<BR>"
    Set oConn = Server.CreateObject("ADODB.Connection") 'to store the details
    oConn.Open strConnect 'defined elsewhere in the page
    'double any single quotes so the URLs cannot break out of the SQL strings
    strSQL = "INSERT INTO NavErrors (NavError, TargetURL) " _
    & "VALUES ('" & Replace(strSQLInfo, "'", "''") & "', '" _
    & Replace(strTarget, "'", "''") & "')"
    oConn.Execute strSQL
    If Err.Number = 0 And InStr(strReferer, ".wrox.co") > 0 Then
    Response.Write strRecord 'came from a page on our site
    Else
    Response.Write strInform 'came from a page on another site
    End If
    oConn.Close 'release the connection once we've finished with it
    Set oConn = Nothing
    Else
    Response.Write strTyping 'they just typed it wrong into their browser
    End If
    Notice that, after we've stored the values in the table, we examine the referrer string to see if the link that caused the error is on our site or another site. Even though we've recorded both, we only tell the viewer it will be fixed if it's one of ours-that's the only time we can be sure it will. For other errors we ask them to let the referring site know about the error.
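
The choice between the three messages can be sketched as a small function. This is an illustrative JavaScript version, not the code from the page: the name chooseMessage and its return values are our own, and the storedOk flag stands in for the Err.Number = 0 test:

```javascript
// Hypothetical helper mirroring the three outcomes described above.
// ownDomainFragment is whatever identifies your own site; the article's
// page tests for ".wrox.co".
function chooseMessage(referrer, ownDomainFragment, storedOk) {
  if (!referrer) {
    return "typed";     // no referrer: the visitor typed the URL
  }
  if (storedOk && referrer.indexOf(ownDomainFragment) >= 0) {
    return "recorded";  // broken link on our own site, and it was logged
  }
  return "inform";      // link on another site, or the logging failed
}
```

Note that a failed database write falls through to the 'inform' message even for our own pages, so we never promise a fix for an error that wasn't recorded.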

    Of course, when we come to examine the table contents, we'll see the broken links on other sites. We can drop them an email to let them know, or put a default redirection page in place to catch any more referrals we get to the missing page.

    Finally, we can finish up our custom error page with a couple of links to our Home page and site map:

    ...
    <P><A HREF="/default.asp" TARGET="_top"><B>Click here to go to our Home page</B></A>.
    <P><A HREF="/default.asp?bookcode=sitemap" TARGET="_top"><B>Click here for a map of our site</B></A>.
    <P>or click the <B>Back</B> button in your browser to return to the previous page.
    </body>
    </html>

    A Table to Store the Error Details

    The code in the naverrors.asp page assumes that we have a suitable table (named NavErrors) already available to store the error details. If you implement the visitor logging and information system that we describe in the next chapter, and set up the database for it using the scripts that we supply, this table will be created for you in the IISLogs database.

    If not, you can use the script naverrors.sql that is included with the samples for this chapter to create the table in another database instead. This is the table, as seen in SQL Server:

    The table includes a field that can be used to store the IP address of the server-easily obtained from the Request.ServerVariables collection. This might be useful if you host several sites, which is why it is included. However, as you'll be able to see the referring URL in the values stored in the table, you may prefer to disregard this field as we've done.

    Managing the NavErrors Table

    Having stored all this useful data about the broken links on our and other people's sites, we'll need to be able to use it. Viewing and deleting the stored records is easy with ASP, and we'll very quickly show you how it can be done.

    Viewing the Stored Broken Links Details

    This page, viewbrokenlinks.asp, is included with the samples for the book. It simply opens the NavErrors table, using a connection string defined in an include file (as you've seen done many times earlier), and dumps the contents of the fields into an HTML table in the page. We haven't repeated the page headings again, but simply listed the code that does the work:

    <%@ LANGUAGE=VBSCRIPT %>
    <!-- #include virtual="/connect/iislog.inc" -->
    <html>
    ...
    <% '--get error information---
    On Error Resume Next
    Set oConn = Server.CreateObject("ADODB.Connection")
    oConn.Open strConnect 'from include file at top of page
    strSQL = "SELECT ErrDateTime, NavError, TargetURL FROM NavErrors " _
    & "ORDER BY ErrDateTime DESC"
    Set oRs = oConn.Execute(strSQL)
    If (oRs.EOF) Or (Err.Number <> 0) Then 'ADO error numbers are usually negative
    Response.Write "<FONT FACE=" & QUOT & "Arial" & QUOT & " SIZE=3>" _
    & "<B>Sorry, database cannot be accessed.</B></FONT></BODY></HTML>"
    Response.End
    End If
    %>
    <table>
    <tr>
    <th nowrap>Date &nbsp; Time &nbsp;</th>
    <th nowrap>Error Details</th>
    </tr>
    <% Do While Not oRs.EOF %>
    <tr>
    <td nowrap valign="top"><% = oRs("ErrDateTime") %> &nbsp;</td>
    <td nowrap>
    <% = Server.HTMLEncode(oRs("NavError") & "") %><BR><B>Target:</B> <% = Server.HTMLEncode(oRs("TargetURL") & "") %>
    </td>
    </tr>
    <% oRs.MoveNext
    Loop
    Set oRs = Nothing
    Set oConn = Nothing %>
    </table>
    ...
    As all the best chefs say, here's one we made earlier. You can see that we have a broken link on Yahoo, and a missing file on our own site:

    Click Here to View Figure

    Deleting the Stored Broken Links Details

    Deleting the error details from the table, once we've fixed the links or reported the error to another Webmaster, is even easier. We just need to apply a SQL DELETE statement to the table to remove all the contents. The statement DELETE NavErrors does this; it removes every row but doesn't drop (delete) the table itself. If you're feeling nervous about this, use DELETE FROM NavErrors instead, which does the same thing:

    <%@ LANGUAGE=VBSCRIPT %>
    <!-- #include virtual="/connect/iislog.inc" -->
    <html>
    ...
    <% If Request.QueryString("Sure") = "Yes" Then 'delete the table contents
    On Error Resume Next
    Set oConn = Server.CreateObject("ADODB.Connection")
    oConn.Open strConnect 'from include file at top of page
    strSQL = "DELETE NavErrors"
    Set oRs = oConn.Execute(strSQL)
    If Err.Number <> 0 Then 'ADO error numbers are usually negative
    Response.Write "<FONT FACE=" & QUOT & "Arial" & QUOT & " SIZE=3>" _
    & "<B>Sorry, the database cannot be accessed.</B></FONT></BODY></HTML>"
    Response.End
    End If
    Set oRs = Nothing
    Set oConn = Nothing %>
    <P><B>All entries deleted...</B><P><A HREF="mainmenu.asp">Main Menu</A>
    <% Else 'prompt for confirmation before deleting records %>
    Are you sure you want to delete all the entries ? &nbsp;
    <A HREF="<% = Request.ServerVariables("SCRIPT_NAME") %>?Sure=Yes">Delete</A>
    &nbsp; <A HREF="mainmenu.asp">Cancel</A>
    <% End If %>
    </body>
    </html>

    Confirming the Delete Action

    One trick you can see used in the code above is how we get the user to confirm the delete action first, by looking for a value Sure=Yes in the query string. When the page is first loaded this value won't be there, so we create the 'Are you sure' page that contains the Delete and Cancel links:


    The Delete link, which reloads this page, contains the value Sure=Yes in the query string. So, this time, the code that deletes the records will be executed.
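
The two-step confirmation boils down to one test on the query string. Here is a minimal JavaScript sketch of that decision; the function name nextStep and its return values are our own assumptions, and URLSearchParams is the standard query-string parser in current browsers and Node.js:

```javascript
// Sketch of the confirmation logic: act only when Sure=Yes is present,
// otherwise show the 'Are you sure' prompt.
function nextStep(queryString) {
  var params = new URLSearchParams(queryString);
  return params.get("Sure") === "Yes" ? "delete" : "prompt";
}
```

Because the delete is triggered by following an ordinary link, anything that follows links automatically (such as a crawler) could trigger it too. On a page that isn't protected by a login, a form that uses POST is the more cautious choice.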

    Checking Links to Other Sites

    In Chapters 1 and 2, you saw the Links page on our Web-Developer site. This provides links to other related sites, and to sources of components and information on techniques that developers may be interested in:

    Click Here to View Figure

    However, as we discovered earlier in this chapter, providing links to other sites can be an easy way of introducing errors. Unless you are vigilant, and check each link on a regular basis, you won't know if one of the sites has moved its content around, changed the name of a file you are linking to, or just closed down altogether.

    Using Default URLs for Links

    We try to limit the effects of pages being moved around by following our own advice. Almost all the links are to a root URL or a specific folder on the target host, and not to a particular file. For example, we provide links to the World Wide Web Consortium at http://www.w3.org/, and the independent Active Server Pages resource site at http://www.activeserverpages.com/, using just their root URLs. In specific cases, we link to a pre-arranged file, such as Microsoft's Internet Explorer Home page at http://www.microsoft.com/ie/logo.asp. It's unlikely that this page will suddenly change, however, because it is used in the 'Get IE' banners you see at the foot of many Web pages.

    But this alone doesn't guarantee that our visitors will always have a current and active page to go to. To make sure they do, we've implemented a simple administration page named checkdeadlinks.asp that can be used to check URLs to make sure that they respond, and to discover a bit about them. We'll show you how this is done next. The code for the finished page (shown below) is included with the samples for this book:

    About Automated Link Confirmation

    As well as loading a Web page or any other file into our browser, we can use specialized tools to retrieve the file and present it in a way more appropriate to the task in hand. In the case of the component we're using in this example, the appropriate way is as an ordinary character string, which we can store in a variable in our ASP code.

    The ASP.HTTP Component

    There are several components that can retrieve files across the Web, using the HTTP protocol. The one we use is produced by Stephen Genusa (steve@genusa.com), and is available from his ASPServer Components Web site at http://www.serverobjects.com/. You can download a free time-limited evaluation copy of the latest version to experiment with, and we've included more details about the component with the samples available for this book.

    Basically, the component accepts a URL as a string, plus various other values that control the timeout, the protocol version to use, the headers it will present to the host, etc. Once we've specified the properties for the component, it connects to the specified URL (just like a browser running on our server would), and fetches the page.

    We are only using it in a basic way, but you'll no doubt find other ways of putting it to work in your own applications. For example, you can set it up to save the files it retrieves to disk, present usernames and passwords to remote hosts, etc.

    Building the checkdeadlinks.asp Page

    The checkdeadlinks.asp page has a simple enough job to do, though the code to implement it does become a little complex. The plan is to examine all the pages listed in our Links table, to make sure that the URLs we use to link to these pages are still valid. At the start of the page, we include a text file containing our database connection details (as we've done in most of the earlier examples), then we set the two timeout values.

    Setting the Timeouts

    The first timeout value is the ASP script execution timeout, which we'll increase from the default value of 90 seconds to 40 minutes. This is (hopefully) far more time than we'll need, but it allows for those days when the 'Net, or our connection to it, is running slowly. Remember, we'll be collecting each page listed in our Links table from its host site, and this could take some time.

    The second timeout is the value we'll use while fetching individual pages-in our example this is 45 seconds. If the host server doesn't respond within that time, we'll flag the page up as being doubtful. If it regularly takes this kind of time to react, we probably don't want to provide a link to it anyway, because our visitors will give up waiting if they try to follow the link:

    <%@ LANGUAGE=VBSCRIPT %>
    <!-- #include virtual="/connect/linksdb.inc" -->
    <% Server.ScriptTimeOut = 2400 'will probably take a while to run %>
    <% seekTitleTimeout = 45 'seconds to wait for page to arrive %>
    ...

    Getting a List of URLs from the Links Table

    Now we can open the Links table and create a recordset containing the URL of each page. You've seen this kind of thing done many times before in this book:

    ...
    <% '--get list of HREFs from Links table--
    QUOT = Chr(34)
    CRLF = Chr(13) & Chr(10)
    On Error Resume Next
    Set oConn = Server.CreateObject("ADODB.Connection")
    oConn.Open strConnect 'from include file at top of page
    strSQL = "SELECT tLinkHRef FROM Links"
    Set oRs = oConn.Execute(strSQL)
    If (oRs.EOF) Or (Err.Number <> 0) Then 'ADO error numbers are usually negative
    Response.Write "<FONT FACE=" & QUOT & "Arial" & QUOT & " SIZE=3>" _
    & "<B>Sorry, the database cannot be accessed.</B></FONT></BODY></HTML>"
    Response.End
    End If
    ...

    Checking each URL

    It's now time for the fun part, where we fetch each page and examine the contents. We'll use a couple of variables to keep track of the number of 'doubtful' pages we find (intNumPages), and to provide a set of unique window names to place in hyperlinks in this page (intWinNum). It makes sense to list each URL we examine as a hyperlink, so that we can easily open the site or page it refers to in cases where we get an error or warning. By opening each one in a separate browser window, we allow the administrator to view them without having to reload (and hence re-execute) our checkdeadlinks.asp page.

    Then we start looping through the URLs in our recordset. We write the URL as a hyperlink into our page, creating our unique TARGET value as we go (the value of intWinNum is incremented at the end of the loop each time):

    ...
    intNumPages = 0 'number of dead or possibly doubtful links found
    intWinNum = 1 'target window number for URL to be opened in for checking
    Do While Not oRs.EOF
    strURL = oRs("tLinkHRef") 'get the link URL
    '--process each link--
    Response.Write "Processing Link to: <A HREF=" & QUOT & strURL & QUOT _
    & " TARGET=" & QUOT & "CDLWin" & intWinNum & QUOT & ">" & strURL _
    & "</A><BR>" & CRLF
    ...
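
To make the string building easier to follow, here is the same 'Processing Link to:' line as a small JavaScript sketch. The helper names escapeHtml and linkLine are our own assumptions. The escaping step is an addition that the ASP code above doesn't perform; it guards against a stored URL that contains quotes or angle brackets:

```javascript
// Minimal HTML escaping for text and attribute values.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Build one result line; winNum makes each TARGET window name unique.
function linkLine(url, winNum) {
  var safe = escapeHtml(url);
  return 'Processing Link to: <A HREF="' + safe + '" TARGET="CDLWin' +
    winNum + '">' + safe + "</A><BR>";
}
```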
     

    Fetching the Page with the ASP.HTTP Component

    To fetch the page, we first set the values of the two 'result' variables, strResult and strTitle, to empty strings, and then instantiate our component. Then we set the Url and TimeOut properties, and call the GetURL method to retrieve the page. If it returns an empty string, we probably got a timeout against the host server, so we'll print a suitable message into the page and increment the number of doubtful pages in intNumPages.

    If we do get a result, we can look to see what it contains. Remember that the component returns the entire content of the page as a string, including the HTML tags, and we can manipulate it using normal string-handling functions. If the page is an error message, it will usually contain the word 'error' or 'invalid', for example 'HTTP Error 404' within the body of the page and 'Error 404' in the <TITLE> section. If it does, we'll flag this page as also being doubtful:

    ...
    strResult = "" 'to hold entire retrieved contents of the page
    strTitle = "" 'to hold the page title
    Set oHTTP = Server.CreateObject("ASP.HTTP") 'create component
    oHTTP.Url = strURL 'set the URL
    oHTTP.TimeOut = seekTitleTimeout 'set the timeout
    strResult = oHTTP.GetURL 'and get the page
    Set oHTTP = Nothing 'destroy the component
    If Len(strResult) = 0 Then
    Response.Write "<B>&gt;&gt; No reply from server in " _
    & seekTitleTimeout & " seconds.</B><P>" & CRLF
    intNumPages = intNumPages + 1 'increment number of doubtful links
    Else
    If Instr(LCase(strResult), "error") > 0 _
    Or Instr(LCase(strResult), "invalid") > 0 Then
    Response.Write "<B>&gt;&gt; Request returned an error.</B><P>" & CRLF
    intNumPages = intNumPages + 1 'increment number of doubtful links
    Else 'extract the title from the page if there is one
    ...
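
The decision made so far has three outcomes: no reply, a page that looks like an error message, or a page worth examining further. A minimal JavaScript sketch of that logic follows. The function name classifyPage and its return values are our own assumptions, and the body argument stands in for the string returned by the component:

```javascript
// Returns "no-reply", "error" or "ok" for the text fetched from a URL.
function classifyPage(body) {
  if (!body) {
    return "no-reply";  // empty string: probably a timeout
  }
  var lower = body.toLowerCase();
  if (lower.indexOf("error") >= 0 || lower.indexOf("invalid") >= 0) {
    return "error";     // the page text mentions an error somewhere
  }
  return "ok";
}
```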

    Extracting the Page Title and Checking the Content

    The next section of the code strips out the page title, by looking for the HTML <TITLE> and </TITLE> tags (in upper or lower case). If it doesn't find a title, it sets strTitle to 'Untitled page at:' instead. And while we've got the page content, we can play with it as well.

    For example, the next few lines of our code look to see if we somehow got a page with 'doubtful' content added to our list of links. You might also like to look for other words that identify if the page is connected with the topics you want to include in your list of links. If you provide links to other Windows NT pages, you could flag up any that didn't contain the words 'Windows NT' somewhere in the page:

    ...
    Else 'extract the title from the page if there is one
    intStart = Instr(UCase(strResult), "<TITLE>") 'zero if there is no title tag
    intFinish = Instr(UCase(strResult), "</TITLE>")
    If (intStart > 0) And (intFinish > intStart + 7) Then
    strTitle = Trim(Mid(strResult, intStart + 7, intFinish - intStart - 7))
    End If
    If Len(strTitle) = 0 Then strTitle = "Untitled page at:"
    strResult = LCase(strResult) 'check for unwelcome content
    If InStr(strResult, " sex ") Or InStr(strResult, " adult ") Or _
    InStr(strResult, " porn ") Or InStr(strResult, " xxx ") Or _
    InStr(strResult, " nude ") Or InStr(strResult, " sexy ") Then
    intNumPages = intNumPages + 1 'increment number of doubtful links
    Response.Write "<B>&gt;&gt; Content Warning!</B> &nbsp; "
    End If
    Response.Write "Page title is: " & strTitle & "<P>" & CRLF
    strResult = ""
    End If
    End If
    oRs.MoveNext 'go to the next link
    intWinNum = intWinNum + 1 'increment the target window number
    Loop
    Set oRs = Nothing
    ...
    At the end of the previous section of code, we write the results of our content parsing exercise into the current page, and then go round and do the same for the next page. Once we've checked all the links, we write out a note as to what we found, and provide a link back to our main menu:

    ...
    If intNumPages = 0 Then %>
    <P>There were no unresponsive links, errors or content warnings.</P>
    <% Else %>
    <P><B>There were <% = intNumPages %> unresponsive links,
    errors or content warnings.</B></P>
    <% End If %>
    <HR>
    <FORM ACTION="mainmenu.asp">
    <INPUT TYPE="SUBMIT" NAME="Submit" VALUE="Main Menu &gt;">
    </FORM>
    </body>
    </html>
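
The title extraction step inside the loop can also be written with a regular expression, which avoids the position arithmetic altogether. This is a JavaScript sketch under our own assumptions (the function name extractTitle is hypothetical), not the code used in checkdeadlinks.asp:

```javascript
// Extract the text between <TITLE> and </TITLE>, ignoring case.
// Falls back to the same placeholder text the ASP page uses.
function extractTitle(html) {
  var match = /<title>([\s\S]*?)<\/title>/i.exec(html);
  var title = match ? match[1].trim() : "";
  return title.length > 0 ? title : "Untitled page at:";
}
```

Like the ASP version, this only recognizes a <TITLE> tag that has no attributes.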

    Examining the Results

    The next screenshot shows the results of running this page against our own Links table. It looks really useful, but notice that we have an error shown against two of the pages in the table, including the one you see near the bottom of the list:

    Click Here To View the Figure

    We know that the site is still there, because we were visiting it yesterday. It might be that the folder has moved. By clicking the link, we can examine the page and see what happened. In fact, the page opens fine-so what went wrong?

    The answer is in the way we checked for an error. The technique of just looking for the words 'error' or 'invalid' means that pages containing either of these words within the text will fail our test. And because our component returns the entire content of the page (HTML and all) as a string, the word that triggered the error might not be visible in the page. This is the case with the activeserverpages.com site-examining the source of the page we find the following line that creates an entry in a SELECT list on the page:

    ...

    <OPTION value="/learn/dbtablewitherrortrap.asp">Db2table high quality

    ...

    Some Notes About the Code

    As you've just seen, the somewhat inelegant techniques we used for checking for an error do tend to trap valid pages. There are plenty of ways we could be more selective; for example, we could check for " error " or " invalid " (i.e. complete words), or even " http error ". However, it's easy enough to open a site that returns a 'doubtful' status anyway, so you may not think it worth doing any more complex processing.
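
To show what checking for complete words might look like, here is a minimal JavaScript sketch using a regular expression with word boundaries. The function name looksLikeErrorPage is our own assumption, and this is one possible refinement rather than the code we run:

```javascript
// True when "error" or "invalid" appears as a complete word,
// so that "dbtablewitherrortrap" no longer counts as a match.
function looksLikeErrorPage(html) {
  return /\b(error|invalid)\b/i.test(html);
}
```

With this test, the <OPTION> line from the activeserverpages.com page no longer triggers a warning, while 'HTTP Error 404' still does. It remains a heuristic, though: a URL such as /error.asp, or ordinary text that happens to discuss errors, will still be flagged.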

    The results we got earlier also include another anomaly, in that our own 'Doing Windows DNA' page (at http://webdev.wrox.co.uk/dna/) returned a title of 'Object Moved'. We know that this is because the page redirects the user via default.asp to collect a frameset (as shown in Chapter 2 and again earlier in this chapter). So, as you can see, picking out pages that are OK and those that aren't is more difficult than it first appears. How much code you implement in this respect ultimately depends on how many links you have to monitor, and how much you need to fully automate the process of picking out dead ones-without having to scan the list by eye.

    You might also have noticed that we are destroying and re-instantiating the ASP.HTTP component each time we use it in the page, i.e. for each URL we check. It would seem to be more efficient to create an instance of it before the loop, and then use the same instance for each URL. In fact we are running an old version of the component (which we use in other applications as well) and this version sometimes gets confused by servers that time out or return an invalid response. By re-instantiating it each time, we solve any problems this might cause.

    You'll see very similar techniques to the ones we've used here later in the book, in Chapter 8, where we build up a list of sites that link to us (referrers). It uses the same ASP.HTTP component as we've done in our checkdeadlinks.asp page, but adds some extra features such as allowing users to delete links.

    Collecting Opinions and Feedback

    We've spent most of this chapter looking at ways that we can track errors on our site, and try to prevent them occurring in the first place. To finish off, we'll take a brief look at one way that you can prompt your visitors to tell you what they think about your site. While this might seem to be quite unconnected with errors, just remember that there are a whole variety of users, operating systems, browser types, and various other kinds of assorted network software and hardware out there. Your site may appear to work fine in your own browser and operating system environment, but be fundamentally broken when viewed by users in other environments.

    Checking Your Site for 'Content' Errors

    Unlike physical errors (such as broken links and missing graphics), cosmetic and content errors on your site generally can't be detected automatically. The only way you can really confirm that your site is always presenting the appearance you want is to view it in as many different browsers as possible. For example, our HTMLReference Database works fine in Navigator and in Internet Explorer from version 3 upwards:

    Click Here to View Figure

    But this page just refuses to work at all in the latest version of Opera 3.0. This is because the values of the check boxes are not sent back to the server in the expected format, something we discussed when we looked at client-side scripting issues in Chapter 4. Our page contains <NOSCRIPT> tags within a <FORM>, which Opera 3.0 cannot handle (and yes, we've fixed it now):

    Click Here to View Figure

    What about Operating Systems, Networks and Language?

    Some errors are quite easy to find, but there can be more insidious ones that are very difficult to track down. For example, do you have a Unix box and a Mac for testing your pages? If you only check your site using Windows-based browsers, how do you know they will work properly with other operating systems? These probably won't have the fonts you specify in your pages, and will provide HTML controls that look totally different to the Windows-style ones you're used to.

    And what about other network software or hardware? Do you know what effect the various kinds of proxy servers have on pages that pass values between them? As we discovered in Chapter 5, you can't depend on Windows Challenge/Response authentication working through all proxy servers (depending on how they're configured), and it doesn't work at all on non-Microsoft browsers anyway.

    The chances of being able to check your site on all the possible combinations of browser, operating system, and network configuration are probably as close to zero as makes no difference. And did you remember language issues? Words and phrases you use in your pages may be meaningless to people in other countries, or-even worse-mean something rude! In reality, you are going to depend a lot on feedback from people who use (or try to use) your site to discover these kinds of problems.

    Providing an 'Opinions' Page

    It's nice to know what people really think about your site, as well as how they physically use it (i.e. the kind of thing you discover from traffic figures that you collect in your server log files). The easiest way to collect this kind of information is to provide a questionnaire or survey page on your site, and ask people to fill it in.

    We wanted to do more than that, because there are traditional difficulties with surveys on Web sites. Visitors often tend to be 'surfing', or moving around from site to site, when they get to your site. Expecting them to spend ten minutes filling in forms can be over-optimistic in a lot of cases. Unless there is some immediate inducement or benefit, it can often seem to be a waste of time.

    Instead, we collect visitors' opinions using a small, non-intrusive, and very simple page that pops up away from the main browser window. We use the term 'opinions', because to the visitor it probably sounds a lot less time consuming and complicated than filling out a questionnaire or survey form. We also make it physically simple for a visitor to express their opinions, by including just six check boxes and two buttons:

    Opening the 'Opinions' Page

    To open this page into a new browser window, we use some client-side JavaScript code. This means that it won't be seen by users of non script-enabled browsers, but, as the page itself requires script to work properly, this isn't a bad thing. We also want to display the opinions window only at certain times, and certainly only once during a visitor's session on our site.

    In fact, it would be nice to only open it at certain intervals for any one visitor, say a maximum of once every two weeks. This stops it annoying them by opening every time they come to our site, and makes the process of providing feedback look less like an automated operation and more like we really value their opinions.

    The code that achieves the control we want over opening the 'opinions' window is shown below. It implements a function named getOpinions which needs to appear in every page where we want to trigger the opening of the window (providing it hasn't been seen in the last two weeks). To make updating the code easier, we place it in an include file, and we can then insert it into each page where we want to use it with an SSI #include tag. Here are the contents of opinions.inc:

    <SCRIPT LANGUAGE="JAVASCRIPT">
    <!--
    function getOpinions()
    {
    strCookie = document.cookie;
    if (strCookie.indexOf("WDDoneOpinions=True") < 0)
    {
    theDate = new Date();
    theDay = theDate.getDate() + 14; // gets the day of the month
    if (theDay > 28) {
    theDay = theDay - 28;
    theMonth = theDate.getMonth() + 1;
    theDate.setMonth(theMonth);
    };
    theDate.setDate(theDay); // sets the day of the month
    strDate = theDate.toGMTString(); // expiry dates must be UTC (GMT)
    document.cookie = "WDDoneOpinions=True;path=/;expires=" + strDate;
    window.open("/common/opinion.htm","opinion_win",
    "resizable=yes,scrollbars=no,toolbar=no,"
    + "location=no,directories=no,status=no,"
    + "menubar=no,width=370,height=200,top=5,left=5");
    }
    }
    // -->
    </SCRIPT>
    You can see that the code uses a cookie, in a similar way to the pop-up window examples we looked at in Chapter 2. If the value WDDoneOpinions=True is present in the document's cookie property, we don't show the 'opinions' window. If the value is not present, we create a cookie containing this value and add it to the document's cookie property then open the 'opinions' window. Remember that this is all happening on the client, not on our server.
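
One thing to be aware of is that a plain indexOf test also matches a cookie whose name merely ends with the same text (for example XWDDoneOpinions=True). If that matters on your site, a stricter check can split the cookie string into its name=value pairs first. This is a minimal sketch; the function name hasCookie is our own assumption, and it takes the cookie string as an argument so that it can be tried outside a browser:

```javascript
// Return true when cookieString (in the format of document.cookie)
// contains a cookie with exactly this name and value.
function hasCookie(cookieString, name, value) {
  var parts = cookieString.split(";");
  for (var i = 0; i < parts.length; i++) {
    if (parts[i].trim() === name + "=" + value) {
      return true;
    }
  }
  return false;
}
```

In the page itself you would call it as hasCookie(document.cookie, "WDDoneOpinions", "True").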

    Creating the Expiry Date

    What is interesting is how we create the expiry date for the cookie so that it will remain on the client machine's hard disk for two weeks. Normally, a cookie with no expiry date is discarded when the user closes their browser, so without one the visitor would see the opinions page again on every visit after closing and reopening their browser.

    By creating an expiry date two weeks into the future in our code, and placing it in the cookie, we know it will not disappear when the browser is closed. But after two weeks it will expire, and no longer appear in the document's cookie property (or be sent back to our server). At this point, we'll create a new cookie with a new date two weeks into the future.

    In our code, we add 14 days to the current day of the month, and then check to see if we went past the end of the shortest month. If so, we subtract 28 to make it wrap into the next month and increment the month. This is only approximate: the result can be up to three days out, and because setMonth is called before the day is reset, on the last few days of a long month the date can overflow into the month after next. It's not critical in this situation. Notice that the getDate and setDate methods retrieve and set the day within the date, not the whole date. Once we've calculated the new expiry date, we can convert it to a UTC (GMT) string, and drop it into the cookie:

    theDate = new Date();
    theDay = theDate.getDate() + 14; // gets the day of the month
    if (theDay > 28) {
    theDay = theDay - 28;
    theMonth = theDate.getMonth() + 1;
    theDate.setMonth(theMonth);
    };
    theDate.setDate(theDay); // sets the day of the month
    strDate = theDate.toGMTString(); // expiry dates must be UTC (GMT)
    document.cookie = "WDDoneOpinions=True;path=/;expires=" + strDate;
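
If you would rather have an expiry date that is exactly 14 days ahead, millisecond arithmetic on the Date value avoids the month handling altogether. The following is a sketch rather than the code in opinions.inc: the function names expiryAfter and opinionsCookie are our own, and toUTCString is the current name for the older toGMTString method:

```javascript
// Return a Date that is `days` days after `from`. Working in milliseconds
// means month and year boundaries need no special handling.
function expiryAfter(from, days) {
  return new Date(from.getTime() + days * 24 * 60 * 60 * 1000);
}

// Build the cookie text with an expiry date two weeks after `from`.
function opinionsCookie(from) {
  return "WDDoneOpinions=True;path=/;expires=" +
    expiryAfter(from, 14).toUTCString();
}
```

In the page, the assignment would become document.cookie = opinionsCookie(new Date());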
     

    Executing the getOpinions Function

    All that remains now is to insert the opinions.inc include file into appropriate pages, and fire the getOpinions function it contains to open the new window. We include the opinions.inc file in the main 'content' menus for each section of the site, and fire the getOpinions function in the unload event of the window object, which is triggered when the page is unloaded:

    ...
    <!-- #include virtual="/path_to_include_files/opinions.inc" -->
    </HEAD>
    <BODY ONUNLOAD="getOpinions();">
    ...
     
    An alternative way of opening the page would be to use server-side code to create the page and then add an HTTP WINDOW-TARGET header to the response, as shown in Chapter 2. However, as we discovered there, this is unreliable in many browsers. It also gives us no control over window size and position, so a client-side solution is the better option.

    The 'Opinions' Table

    We now have a browser window that opens at the kind of intervals that we want, and we need to think about what we'll display inside it. We also need to think about how we'll store the opinions expressed by our users. We're collecting six items of information from checkboxes, which we'll store as a number in our database. SQL Server doesn't have an explicit True/False field type like Access and other desktop databases, so we'll use the usual notation of zero for False (no) and -1 for True (yes).

    The next screenshot shows the table named Opinions that we'll be using to store values. As well as the six yes/no answers, we're also going to give the user the opportunity to provide their email address to add to our mailing list, so we need a text field for that. We'll also store the IP address of the site, so that we can use the table and associated reports with all three sites we host, and we'll store the current date and time using a default value in the table:

    You'll notice that we've also included a unique key field, of type IDENTITY (which automatically generates unique numeric record identifiers, like an Access AutoNumber field). We'll need this when we come to use the information in a later example. If you implement the visitor logging and information system that we describe in the next chapter, and set up the database for it using the scripts that we supply, this table will be created for you in the IISLogs database. If not, you can use the script opinions.sql that is included with the samples for this chapter to create the table in another database instead.

    Building the 'Opinions' Page

    As to what you actually display inside the new browser window, that's obviously up to you. It depends on the kind of opinions and information you want to collect. The following code shows the file we use, named opinion.htm. We've included it with the other samples for this book, and you can modify it to suit your own requirements.

    The Visible Part of the Page

    The visible (body) part of the page is simple enough, with all the controls placed on a <FORM> that has the ACTION set to the name of an ASP file that will process the user's opinions. The form also contains two HIDDEN-type controls that will return the user's email address (if supplied), and the IP address of our site (hard-coded into the HTML). To make it easier to see what's going on, we've removed the <FONT> tags that format the text from the code below. They are in the sample file that we provide:

    ...
    <BODY TOPMARGIN=10>
    <FORM NAME="opinion" ACTION="opinion.asp" METHOD="POST">
    <TABLE CELLPADDING=0 CELLSPACING=0>
    <TR>
    <TD NOWRAP COLSPAN=2><B>* Thank you for visiting ... </B></TD>
    </TR><TR>
    <TD NOWRAP VALIGN="TOP"><INPUT TYPE="CHECKBOX" NAME="chkUseful">&nbsp;</TD>
    <TD NOWRAP VALIGN="MIDDLE">I find that your site is a useful ... </TD>
    </TR><TR>
    <TD NOWRAP VALIGN="TOP"><INPUT TYPE="CHECKBOX" NAME="chkDesign">&nbsp;</TD>
    <TD NOWRAP VALIGN="MIDDLE">I like the design and organization ... </TD>
    </TR><TR>
    <TD NOWRAP VALIGN="TOP"><INPUT TYPE="CHECKBOX" NAME="chkVisit">&nbsp;</TD>
    <TD NOWRAP VALIGN="MIDDLE">I am a regular visitor and/or ... </TD>
    </TR><TR>
    <TD NOWRAP VALIGN="TOP"><INPUT TYPE="CHECKBOX" NAME="chkBuy">&nbsp;</TD>
    <TD NOWRAP VALIGN="MIDDLE">I plan to buy at least one of your ... </TD>
    </TR><TR>
    <TD NOWRAP VALIGN="TOP"><INPUT TYPE="CHECKBOX" NAME="chkEmail"
    ONCLICK="getAddress();">&nbsp;</TD>
    <TD NOWRAP VALIGN="MIDDLE">Please add me to your Web Developer ... </TD>
    </TR><TR>
    <TD NOWRAP VALIGN="TOP"><INPUT TYPE="CHECKBOX" NAME="chkComment"
    ONCLICK="sendComments();">&nbsp;</TD>
    <TD NOWRAP VALIGN="MIDDLE">I'm sending you my comments in a ... </TD>
    </TR><TR>
    <TD NOWRAP ALIGN="RIGHT" COLSPAN="2">
    <INPUT TYPE="HIDDEN" NAME="txtEmail">
    <INPUT TYPE="HIDDEN" NAME="txtSiteIP" VALUE="199.199.199.199">
    <INPUT TYPE="SUBMIT" VALUE="Submit">
    <INPUT TYPE="RESET" VALUE="Cancel" ONCLICK="window.close();">
    </TD>
    </TR>
    </TABLE>
    </FORM>
    </BODY>
    </HTML>

    The Code That Makes It Work

    At the start of the page, after the page title in the <HEAD> section, we provide two JavaScript functions. The first, getAddress, uses a prompt dialog to collect the user's email address, and places it into a HIDDEN-type control named txtEmail in the <FORM> section of the main page.

    Click Here to View Figure

    The second function, sendComments, just opens the browser's or user's default email message dialog, by setting the page's location.href property to a URL that uses the mailto: protocol. In both cases, the functions only carry out the task if the corresponding checkbox is ticked. The checkbox value will also be stored in our database to indicate that the visitor provided an email address or sent a comment.

    In fact, we don't actually know this to be the case, because they could cancel the email prompt or new message dialog. We could check for a null value returned from the prompt and clear the checkbox, but we'll trust our visitors to provide accurate information. The results overall will be close enough for our needs anyway.

    ...
    <SCRIPT LANGUAGE="JavaScript">
    <!--
    function getAddress() {
    // prompts for an email address when user 'turns on' the fifth check box
    if (document.forms[0].chkEmail.checked)
    document.forms[0].txtEmail.value = prompt('Enter your email address:', '')
    };
    function sendComments() {
    // opens a new email message window when user 'turns on' the sixth check box
    if (document.forms[0].chkComment.checked)
    location.href = 'mailto:feedback@yoursite.com'
    };
    //-->
    </SCRIPT>
    ...
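
    The null check mentioned above, which we chose to leave out, might look something like the following sketch. The function name applyAddress is our own, and the form argument is simply any object shaped like document.forms[0], so the logic can be tried outside a browser:

```javascript
// Sketch only: 'form' is any object shaped like document.forms[0];
// 'answer' is the value returned by prompt() (null if the user cancelled).
function applyAddress(form, answer) {
  if (answer === null || answer === "") {
    form.txtEmail.value = "";        // nothing was supplied
    form.chkEmail.checked = false;   // so don't record a mailing-list request
  } else {
    form.txtEmail.value = answer;    // keep the address for the form post
  }
}

// In the page, getAddress could then be written as:
// function getAddress() {
//   var f = document.forms[0];
//   if (f.chkEmail.checked) applyAddress(f, prompt('Enter your email address:', ''));
// }
```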

    The ASP Page to Store the Opinions

    To complete the operation of collecting opinions, we need a page to take the values in the form and store them in our database. It will also return a 'thank you' message in the opinions window. The page, opinion.asp, is shown below. It collects the values from the Request.Form collection and places them into a set of variables, converting the checkbox values into numbers as it goes.

    Then the code checks to see if at least one of the checkboxes was actually checked. If not, we assume they pressed Submit instead of Cancel by mistake, and so we don't want to store the results. But, providing we have got at least one opinion, we can go ahead and create the SQL statement, then execute it against our table:

    <%@LANGUAGE="VBSCRIPT"%>
    <!-- #include virtual="/connect/iislog.inc" -->
    ...
    ...
    <%
    '---------- store comments if there are any ------------------
    bUseful = CInt(Request.Form("chkUseful") = "on") 'gives -1 for 'yes'
    bDesign = CInt(Request.Form("chkDesign") = "on") 'and zero for 'no'
    bVisit = CInt(Request.Form("chkVisit") = "on")
    bBuy = CInt(Request.Form("chkBuy") = "on")
    bEmail = CInt(Request.Form("chkEmail") = "on")
    bComment = CInt(Request.Form("chkComment") = "on")
    strAddress = Replace(Request.Form("txtEmail"), "'", "''") 'double any single quotes
    strSiteIP = Replace(Request.Form("txtSiteIP"), "'", "''") 'so they can't break the SQL
    If bUseful + bDesign + bVisit + bBuy + bEmail + bComment < 0 Then
    %>
    <TD NOWRAP><B>* Thank you for your opinions about our site</B></FONT></TD>
    <%
    On Error Resume Next 'store opinions in database table
    Set oConn = Server.CreateObject("ADODB.Connection")
    oConn.Open strConnect 'defined in include file at head of page
    strSQL = "INSERT INTO Opinions(bUseful,bDesign,bVisit,bBuy,bMailList," _
    & "bComment,tAddress,tHostIP) VALUES(" & bUseful & "," & bDesign _
    & "," & bVisit & "," & bBuy & "," & bEmail & "," & bComment & ",'" _
    & strAddress & "','" & strSiteIP & "')"
    oConn.Execute strSQL
    Set oConn = Nothing
    ...
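
    The conversion and test at the top of the listing above rely on VBScript's CInt(True) being -1. As a rough illustration only, the same logic can be restated in JavaScript; the function names toFlag and hasOpinion are our own and do not appear in opinion.asp:

```javascript
// Browsers submit "on" for a checked box that has no VALUE attribute,
// and nothing at all for an unchecked one.
function toFlag(formValue) {
  return formValue === "on" ? -1 : 0;   // -1 for 'yes', zero for 'no'
}

// True when at least one flag is -1, i.e. the sum of the flags is below zero.
function hasOpinion(flags) {
  var sum = 0;
  for (var i = 0; i < flags.length; i++) sum += flags[i];
  return sum < 0;
}
```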
     

    If we didn't get any opinions expressed, we provide a polite note to the user that this is the case. Then the remainder of the page creates a nice 'thank you', including our feedback email address:

    ...
    Else 'no opinions expressed
    %>
    <TD NOWRAP><B>* You didn't check any of the boxes</B></FONT></TD>
    <%
    End If
    %>
    </TR><TR>
    <TD>This site is here to help you, as a Web Developer, ... etc.</TD>
    </TR><TR>
    <TD NOWRAP ALIGN="RIGHT" COLSPAN="2">
    <INPUT TYPE="RESET" VALUE="Close" ONCLICK="window.close();"> &nbsp;
    </TD>
    </TR>
    </TABLE>
    </FORM>
    </BODY>
    </HTML>
     
    And here's the result. It takes only a few moments for a visitor to tell us what they think, and then they can get on and use our site to find what they came for:

    You might disagree with our decision to treat no checked controls as meaning they have no opinion. It could mean that they just hated the site and never intend to buy any of our books. However, it's up to you to ask the questions you want answers to in relation to your own site, and to gauge the meaning of the answers you get.

    Analyzing the Opinions You Collect

    We've now got our visitors' opinions safely stored away in a table in our database. As it's likely that someone is going to want to know the results at some point, we need to find a way of extracting and using them.

    In Chapter 10, we'll show you a way that we can manage and use the email addresses that we collect this way, but in the meantime, the following page (showopinions.asp) is just one solution for analyzing the results. It shows the percentage of times that each of the six checkboxes was ticked by the users that expressed at least one opinion.

    Building the showopinions.asp Page

    The showopinions.asp page is quite a bit more complex than most of the ASP/ADO pages we've seen so far that extract and display data from a database. The reason is that we have to do quite a lot of work to get the results we want from the records.

    Each record contains six numeric fields holding the results of our users' opinions. Between one and six of these fields can have the value -1, with the remainder being zero. We want to find the number of times that each of the six checkboxes was ticked (i.e. has the value -1), as a percentage of the total number of visitors who provided opinions. In other words, we want the percentage who agreed with each of the opinion statements in the page. We also want to present the results by week, rather than as an overall total, so that we can see any changes and trends.

    To do this, we need to summarize the results for each week in turn, starting from this week and going backwards to the first week that we have opinions for. Getting a meaningful result for a single week can be achieved by summing the values for each 'opinion' field for that week's records. This works because each of the six fields will have either zero or -1 in each record, so the result of adding the values together and reversing the sign will be the number of 'yes' answers for that field.
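
    To see why summing works, here is a small worked sketch in JavaScript. The four records are invented sample data, the function name percentYes is our own, and the sketch assumes there is at least one record for the week:

```javascript
// Sums one -1/0 field over a week's records and converts it to a percentage.
function percentYes(records, field) {
  var sum = 0;
  for (var i = 0; i < records.length; i++) sum += records[i][field];
  var yes = -sum;                                 // reverse the sign to count 'yes' answers
  return Math.round(100 * yes / records.length);  // whole-number percentage
}

// Hypothetical sample: four visitors, two of the six opinion fields shown.
var week = [
  { bUseful: -1, bDesign: 0 },
  { bUseful: -1, bDesign: -1 },
  { bUseful: 0,  bDesign: -1 },
  { bUseful: -1, bDesign: 0 }
];
// percentYes(week, "bUseful") gives 75; percentYes(week, "bDesign") gives 50
```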

    Timeout, Connection and Parameters

    Our page starts off with the customary code to extend the script timeout and insert the data connection string. Then we check the value of the site parameter (if supplied) to see which of the sites we host the listing is for:

    <%@ LANGUAGE=VBSCRIPT %>
    <% Server.ScriptTimeOut = 600 %>
    <!-- #include virtual="/connect/iislog.inc" -->
    ...
    ...
    <%
    QUOT = Chr(34)
    CRLF = Chr(13) & Chr(10)
    strSite = Request("site") 'site to select opinions for
    Select Case strSite
    Case "wd" 'Web-Developer site
    strSiteName = "Web-Developer"
    strWhere = " WHERE THostIP = '194.73.51.228' "
    Case "cd" 'COMDeveloper site
    strSiteName = "COMDeveloper"
    strWhere = " WHERE THostIP = '194.73.51.229' "
    Case "wa" 'World Of ATL site
    strSiteName = "World Of ATL"
    strWhere = " WHERE THostIP = '194.73.51.230' "
    Case Else
    strSiteName = "(all sites)"
    strWhere = ""
    End Select
    %>
    Summary of opinions expressed for the <% = strSiteName %> site: <P>
    ...

    Getting the Date of the First Opinions

    Now we can get the date of the first record using a SQL statement that includes the MIN function, and store it in our dFirstDate variable. We're going to use a different SQL statement to create the recordsets later in our page, so we can destroy this recordset once we've got the value we need:

    ...
    <%
    On Error Resume Next
    Set oConn = Server.CreateObject("ADODB.Connection")
    oConn.Open strConnect
    '---------- get date of oldest record ----------------
    strSQL = "SELECT FirstDate=MIN(dOpinionDate) FROM Opinions" & strWhere
    Set oRs = oConn.Execute(strSQL)
    If (oRs.EOF) Or (Err.Number <> 0) Then
    Response.Write "<FONT FACE=" & QUOT & "Arial" & QUOT & " SIZE=3>" _
    & "<B>Sorry, the database cannot be accessed.</B></FONT></BODY></HTML>"
    Response.End
    End If
    dFirstDate = oRs("FirstDate")
    Set oRs = Nothing
    ...

    Getting the Opinions for Each Week

    To select records from our Opinions table by week, we need to specify the week and year number in the SQL statement. We can calculate these values for any given date using the VBScript DatePart function. As we're starting with this week and working backwards, we first get the values for this week and year:

    ...
    intYear = DatePart("yyyy",Now())
    intWeek = DatePart("ww",Now())
    ...

    Now comes the loop that we'll use to extract and display each week's results. The condition makes sure that the week and year we're currently processing are greater than or equal to the first week and year in our Opinions records:

    ...
    '---------- loop while this week >= first date ----------------
    Do While ((DatePart("ww",dFirstDate) <= intWeek) _
      And (DatePart("yyyy",dFirstDate) = intYear)) _
      Or (DatePart("yyyy",dFirstDate) < intYear)
    ...
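
    The loop condition is easier to check when pulled out into a function of its own. This JavaScript sketch restates the same test; the function name notBeforeFirst is our own, and the week and year values are assumed to be already extracted as numbers:

```javascript
// True while the (week, year) being processed is not earlier than the
// (week, year) of the first opinion record.
function notBeforeFirst(firstWeek, firstYear, week, year) {
  return (firstWeek <= week && firstYear === year) || firstYear < year;
}
```

    So a first record in week 50 of 1998 keeps the loop running for week 2 of 1999, but the loop stops once we step back to week 49 of 1998.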

    SQL Server provides us with a DATEPART function that works in a similar way to the VBScript equivalent, but with a different syntax for the first argument. Instead of a series of code letters in quotation marks, we use a word without quotation marks. With this function, it's easy to create a WHERE clause that selects just the required week's records. We add this clause onto the end of the existing WHERE clause that selects the site we're interested in:

    ...
    If Len(strWhere) > 0 Then
    strQuery = strWhere & " AND DATEPART(week,dOpinionDate)=" & intWeek _
    & " AND DATEPART(year,dOpinionDate)=" & intYear
    Else
    strQuery = " WHERE DATEPART(week,dOpinionDate)=" & intWeek _
    & " AND DATEPART(year,dOpinionDate)=" & intYear
    End If
    ...

    Now we can execute our SQL statement to collect the results for the week. We COUNT the total number of values in one of the opinion fields to get the total number of opinion records (this field will never be null, because we always place either zero or -1 in it). Then we SUM the six opinion fields:

    ...
    strSQL = "SELECT NumOpinions=COUNT(bUseful), Useful=SUM(bUseful), " _
    & "Design=SUM(bDesign), Visit=SUM(bVisit), Buy=SUM(bBuy), " _
    & "MailList=SUM(bMailList), Comment=SUM(bComment) FROM Opinions" _
    & strQuery
    Set oRs = oConn.Execute(strSQL)
    If (oRs.EOF) Or (Err.Number <> 0) Then
    Response.Write "Week <B>" & intWeek & " : " & intYear _
    & "</B>. No opinions recorded.<P>" & CRLF
    Else
    ...

    Now that we have all the totals, we can calculate the percentages for the week in question, remembering to take the absolute value of each sum to remove the minus sign:

    ...
    intCount = oRs("NumOpinions")
    pcUseful = FormatPercent(Abs(oRs("Useful")) / intCount ,0)
    pcDesign = FormatPercent(Abs(oRs("Design")) / intCount ,0)
    pcVisit = FormatPercent(Abs(oRs("Visit")) / intCount ,0)
    pcBuy = FormatPercent(Abs(oRs("Buy")) / intCount ,0)
    pcMailList = FormatPercent(Abs(oRs("MailList")) / intCount ,0)
    pcComment = FormatPercent(Abs(oRs("Comment")) / intCount ,0)
    Response.Write "Week <B>" & intWeek & " : " & intYear & "</B>. " _
    & "Number of opinions recorded: <B>" & intCount & "</B><BR>" & CRLF
    %>
    ...

    Displaying the Opinions for Each Week

    And, finally, we can display the results. We create a separate table for each week, which means that the user will see each week's results while the page is still downloading. If we used one big table, they would have to wait for all the results to be calculated:

    ...
    <table>
    <tr>
    <td align="center">&nbsp; Found &nbsp; <br>&nbsp; Useful &nbsp;</td>
    <td align="center">&nbsp; Like &nbsp; <br>&nbsp; Design &nbsp;</td>
    <td align="center">&nbsp; Regular &nbsp; <br>&nbsp; Visitor &nbsp;</td>
    <td align="center">&nbsp; Intend &nbsp; <br>&nbsp; to Buy &nbsp;</td>
    <td align="center">&nbsp; Add to &nbsp; <br>&nbsp; MailList &nbsp;</td>
    <td align="center">&nbsp; Emailed &nbsp; <br>&nbsp; Comment &nbsp;</td>
    </tr>
    <tr>
    <td align="center" nowrap><b><% = pcUseful %></b></td>
    <td align="center" nowrap><b><% = pcDesign %></b></td>
    <td align="center" nowrap><b><% = pcVisit %></b></td>
    <td align="center" nowrap><b><% = pcBuy %></b></td>
    <td align="center" nowrap><b>