A Comprehensive Strategy for Using Web Site Statistics - page 3

When using these methods for identifying users, the following situations occur when sequentially processing access logs:

  1. a new IP address is encountered (assume this is a new user),
  2. an already processed IP address is encountered:
    • the user agent matches prior requests (assume this is the same user),
    • the user agent field does not match any prior request from the same IP (assume this is a new user),
    • the session has been terminated due to a timeout (assume a new user has entered the site).
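
The rules above can be sketched in a few lines of Python. This is only an illustration of the heuristic, not the code of any particular product; the 30-minute timeout, the function name count_users, and the sample requests are assumptions made for the example.

```python
from datetime import datetime, timedelta

# Assumed timeout; the article does not say which value to use.
SESSION_TIMEOUT = timedelta(minutes=30)


def count_users(requests, timeout=SESSION_TIMEOUT):
    """Count presumed users in a time-ordered list of (timestamp, ip, user_agent)."""
    last_seen = {}  # (ip, user_agent) -> timestamp of the most recent request
    users = 0
    for timestamp, ip, user_agent in requests:
        key = (ip, user_agent)
        previous = last_seen.get(key)
        if previous is None:
            # New IP, or known IP with an unseen user agent: assume a new user.
            users += 1
        elif timestamp - previous > timeout:
            # The earlier session timed out: assume a new user has entered the site.
            users += 1
        # Otherwise the IP and user agent match a recent request: same user.
        last_seen[key] = timestamp
    return users


# Hypothetical requests, already sorted by time.
sample = [
    (datetime(2007, 1, 1, 10, 0), "192.0.2.1", "Mozilla/4.0"),     # new IP
    (datetime(2007, 1, 1, 10, 5), "192.0.2.1", "Mozilla/4.0"),     # same user
    (datetime(2007, 1, 1, 10, 6), "192.0.2.1", "Opera/9.0"),       # new user agent
    (datetime(2007, 1, 1, 11, 0), "192.0.2.1", "Mozilla/4.0"),     # after timeout
    (datetime(2007, 1, 1, 11, 1), "198.51.100.7", "Mozilla/4.0"),  # new IP
]
print(count_users(sample))  # prints 4
```

Note that the count is a set of assumptions, not a measurement: two people behind one proxy with the same browser are counted once, and one person returning after a break is counted twice.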

Therefore, if your statistics show that a substantial share of the new hosts and timeouts came from hosts in the same domain or IP address space, you can infer that many of the site's users connect to the Web through ISPs with load-balancing proxies, that many different users access the site from within the same domain (as would occur with a large company), or that some combination of both cases exists.

Regardless, a significant number of page requests can result in ambiguous cases in which it is not possible to determine with certainty whether a new user is present. The incidence rate varies considerably from Web site to Web site, but the results can be inaccurate wherever these IP-based methods and their derivatives are used because unique identifiers such as cookies are not present.

Caching

Another major problem that dilutes the quality of the data is caching. There are two major types of caching. First, browsers automatically cache files when they are downloaded, so it is not necessary to download the entire page again on a later visit. Depending on its settings, the browser may ask the server whether the page has changed; in that case the server sees the request, and a page request is recorded in the log. However, if the browser is not set to verify whether a page has changed, the user can read the page without any entry being recorded in the web log.
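
The check a browser makes is normally a conditional HTTP request: it sends an If-Modified-Since header, and the server answers 304 (Not Modified) or 200 with the full page. The sketch below imitates the server's decision; the function name respond and the dates are assumptions made for the example.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def respond(last_modified, if_modified_since=None):
    """Return the status code a server would log for one page request."""
    if if_modified_since is not None:
        cached_at = parsedate_to_datetime(if_modified_since)
        if last_modified <= cached_at:
            return 304  # the browser's cached copy is still current
    return 200  # send (and log) the full page


page_changed = datetime(2007, 1, 1, 9, 0, tzinfo=timezone.utc)
header = "Mon, 01 Jan 2007 10:00:00 GMT"  # hypothetical header sent by the browser

print(respond(page_changed))          # first visit, nothing cached
print(respond(page_changed, header))  # revalidation of a cached copy
```

In both calls the server sees a request and can log it. The case the log misses is the one in which the browser never asks and simply shows its cached copy.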

In addition, almost all ISPs now have their own cache. When a user requests a page that anyone else at the same ISP has requested recently, the cache will have saved it and will serve it without any request being made to the original site. Therefore, many people could request a site's pages from the same cache without the original web site (or its logs) ever knowing about it.

For example, AOL uses caching extensively, and a single user with an AOL account may appear in your server logs under several different IP addresses as AOL's caches fetch the files on the user's behalf. When this happens, the logs will fail to identify a repeat customer. In addition, the logs cannot record whether a visitor typed a URL into their browser after seeing a particular advertisement. If the page is already cached when it is requested, no page requests at all might show up in the logs.

Intelligent interpretation

Even with these limitations of both log file analysis and cookies, a statistical interpretation of the data is currently the most effective form of analysis available. While the number of visitors is directly tied to the revenue that a site can generate, it is important to understand that this number can be interpreted in several ways, depending on whether the data is collected from the log file or from cookies.

While there is no ideal solution for getting precise site visitor statistics, you can seek a solution that is congruent with your organization's business plan. Two of the more popular web site analysis software products are Webtrends and Site Statistics. Site Statistics is sold by NetPromoter, which has a suite of Search Engine Optimization products that are designed to optimize commercial websites according to the missions and strategies of an organization (see Figure 1). The corporate mission is realized through execution of its strategies, which are influenced by the data and statistics that are collected from the log statistics and tracking counter modules. This information in turn is used to adjust the web site's use of keywords, phrases, and navigation paths. The Site Statistics Top Ten Analyzer and Site Analyzer modules are designed to provide feedback that supports the design effort. The process is iterative: the results are fed back to re-evaluate the mission and strategies of the overall ecommerce initiative.

Figure 1 How Site Statistics Supports a Successful Web Site

Log Analyzer

The log analyzer module parses and analyzes the log files and presents the results in a useful format. It displays information on visitor IP addresses, referring pages, requested pages, user paths through the site, and more. In Figure 2 a sample report displays the number of hits (unique visitors), the number of visitors and pages viewed, and the bandwidth used by browser type.
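
As a rough sketch of what such a parser does, the Python below reads lines in the widely used "combined" access log layout and totals hits, bytes, and distinct IP addresses per browser. It is not the Site Statistics implementation; the log format, the function name summarize_by_agent, and the sample lines are assumptions made for the example.

```python
import re
from collections import defaultdict

# Pattern for the "combined" access log layout (an assumed format).
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<size>\d+|-) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)


def summarize_by_agent(lines):
    """Total hits, bytes sent, and distinct IPs for each user agent string."""
    totals = defaultdict(lambda: {"hits": 0, "bytes": 0, "ips": set()})
    for line in lines:
        match = LOG_PATTERN.match(line)
        if match is None:
            continue  # skip lines that do not fit the assumed layout
        entry = totals[match["agent"]]
        entry["hits"] += 1
        size = match["size"]
        entry["bytes"] += 0 if size == "-" else int(size)
        entry["ips"].add(match["ip"])
    return totals


# Hypothetical log lines.
sample_lines = [
    '192.0.2.1 - - [01/Jan/2007:10:00:00 +0000] "GET /index.html HTTP/1.0" '
    '200 5120 "http://example.com/" "Mozilla/4.0"',
    '192.0.2.1 - - [01/Jan/2007:10:00:05 +0000] "GET /logo.gif HTTP/1.0" '
    '304 - "http://example.com/index.html" "Mozilla/4.0"',
    '198.51.100.7 - - [01/Jan/2007:10:01:00 +0000] "GET /index.html HTTP/1.0" '
    '200 5120 "-" "Opera/9.0"',
]

for agent, entry in summarize_by_agent(sample_lines).items():
    print(agent, entry["hits"], entry["bytes"], len(entry["ips"]))
```

The distinct-IP count is only a stand-in for visitors and inherits every limitation of IP-based identification described earlier.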

Figure 2 A Sample report by the Site Statistics Log Analyzer

Cookie-based module (Script Generator)

The cookie-based module places JavaScript-based tracking counters on pages on the web site. When a user requests a page, the JavaScript counter places a cookie on the user's computer, which serves as a unique user ID and helps identify subsequent visits and paths through the site even when users are working through a proxy server or are using dynamic IP addresses. The Site Statistics counter can discriminate between different users working through a proxy server, and even different users on the same computer (provided each of them is logged in under their own username).

The module does not require the installation of any additional software on either the web server or the client. However, potential drawbacks include visits by users who have disabled cookies and cases in which the counter cannot function correctly because the page did not load completely.
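
The core idea of a cookie-based ID can be sketched independently of JavaScript. The Python below shows what a counter might do with the Cookie header it receives: reuse the ID if one is present, otherwise issue a new one. It is a minimal sketch, not the Site Statistics counter; the cookie name visitor_id, the one-year lifetime, and the function name identify_visitor are assumptions.

```python
import uuid
from http.cookies import SimpleCookie

COOKIE_NAME = "visitor_id"  # hypothetical cookie name


def identify_visitor(cookie_header):
    """Return (visitor_id, set_cookie_value_or_None, is_new_visitor)."""
    cookie = SimpleCookie()
    cookie.load(cookie_header or "")
    if COOKIE_NAME in cookie:
        # Returning visitor: the ID survives proxy and IP address changes.
        return cookie[COOKIE_NAME].value, None, False
    # No cookie yet (first visit, or cookies disabled): issue a fresh ID.
    visitor_id = uuid.uuid4().hex
    reply = SimpleCookie()
    reply[COOKIE_NAME] = visitor_id
    reply[COOKIE_NAME]["path"] = "/"
    reply[COOKIE_NAME]["max-age"] = 365 * 24 * 3600  # assumed lifetime: one year
    return visitor_id, reply[COOKIE_NAME].OutputString(), True


first_id, set_cookie, is_new = identify_visitor(None)
print(is_new, set_cookie)
print(identify_visitor(COOKIE_NAME + "=" + first_id))
```

A visitor whose browser rejects cookies arrives without an ID on every request, so each page view looks like a new visitor; this is the drawback mentioned above.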

Top Ten Analyzer

The Top Ten Analyzer module queries search engines for keywords that are important to your site and retrieves the sites that occupy the top-ranked positions for these search terms. Figure 3 is an example of a report that displays the top 10 search results for every selected keyword and search engine. This module can subsequently analyze these sites and discover the reasons for their top ranking, which include the density and prominence of keywords on these pages, embedded tags, and the referring sites.
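
Keyword density and prominence can be made concrete with a small calculation. The definitions used below are assumptions for illustration, not the formulas used by the Top Ten Analyzer: density is the share of words on the page that equal the keyword, and prominence is 1.0 when the keyword is the first word and falls toward 0 the later it first appears.

```python
import re


def keyword_stats(text, keyword):
    """Return (density, prominence) of a single-word keyword in the text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    keyword = keyword.lower()
    positions = [i for i, word in enumerate(words) if word == keyword]
    if not positions:
        return 0.0, None  # keyword absent: no prominence to report
    density = len(positions) / len(words)
    # Assumed definition: earlier first occurrence means higher prominence.
    prominence = 1.0 - positions[0] / len(words)
    return density, prominence


# Hypothetical page text.
page = "Certification exams and certification training for IT staff"
print(keyword_stats(page, "certification"))  # prints (0.25, 1.0)
```

Running the same function over your own page and over the pages in the top ten gives a simple side-by-side comparison for each keyword.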

Figure 3 An example Top Ten Report

Site Analyzer

The Site Analyzer module retrieves information from all pages on your site, generates the site structure and site map, and analyzes your site using the referrer and keyword information that is exported from the Log Statistics module. In addition, it:

  • Analyzes the site tree structure
  • Displays the site map as an interactive chart, which allows all page correlations to be studied
  • Conducts a site analysis by the actual keywords and key phrases (see Figure 4), and
  • Conducts a site analysis by referring pages.

Figure 4 Keyword Distribution Report

A comprehensive strategy

The suggested strategy of combining log analysis with a tracking counter helps overcome the limitations of each data-gathering tool and leads to a more comprehensive and accurate way to evaluate the success of a commercial web site. This article also describes several other NetPromoter tools that help to evaluate the success of your web site. The Site Analyzer tool lets you compare the effectiveness of your strategy with that of your competitors through an in-depth analysis of keyword distributions and other important parameters. These tools provide information that can be used iteratively to fine-tune your corporate strategies and ultimately help you achieve your goals.

(c) copyright 2000-2007 Anventure.  All Rights Reserved.