When generating and discussing web metrics, there are several units
that are commonly referred to. It is important to know the distinction
between these so that you know what you are looking at. One of the
most common metrics is ‘hits.’ You often hear people
saying “we had 10,000 hits this week on our website.”
Unfortunately, statements like that are not well defined. This kind
of information usually comes from a ‘hit counter’ placed
on the home page of the site. In this case, what this person really
means is “we had 10,000 hits on our home page last week.”
Hit counters serve their purpose, but only provide basic information.
Web analytics allows you to go much deeper, without losing this
basic information. The following are some basic and common measurements
of web analytics.
Hits and impressions
In most web analytics discussions a ‘hit’ is defined
as a single request for any item on your website. This can include
images, animations, audio, video, downloads, PDF or Word documents
or anything else that you allow visitors to access. When a web browser
loads a page, it also loads all the components referenced by that
page. This means that it requests all the images (often including
‘roll over’ images for mouse effects) and, perhaps,
stylesheets, JavaScript files or other external references. A single
page load can result in many ‘hits.’
Quickly, pull up the home page of your site and see if you can
figure out how many ‘hits’ you generated by doing that:
count all the graphics that it loads. If you are familiar with HTML,
look at the source and see if there are external stylesheets, script
files, or roll-over images that are loaded invisibly. You may be
surprised at the number of items required just to build your home
page.
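If you would rather not count by hand, the exercise above can be approximated with a short script. The following is a minimal sketch in Python, assuming the page source is available as a string; the class name, function name and sample page are invented for illustration. It only sees items referenced directly in the HTML, so roll-over images loaded by script would not be counted.

```python
from html.parser import HTMLParser


class ResourceCounter(HTMLParser):
    """Collects the external items a browser would request for a page."""

    def __init__(self):
        super().__init__()
        self.resources = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in ("img", "script") and attrs.get("src"):
            self.resources.append(attrs["src"])
        elif (tag == "link"
              and (attrs.get("rel") or "").lower() == "stylesheet"
              and attrs.get("href")):
            self.resources.append(attrs["href"])


def estimate_hits(html):
    parser = ResourceCounter()
    parser.feed(html)
    # One hit for the page itself plus one per distinct referenced item
    # (a browser normally requests a repeated image only once).
    return 1 + len(set(parser.resources))


# Hypothetical home page used only to exercise the sketch.
SAMPLE_PAGE = """
<html><head>
<link rel="stylesheet" href="/site.css">
<script src="/menu.js"></script>
</head><body>
<img src="/logo.gif"><img src="/banner.jpg"><img src="/logo.gif">
</body></html>
"""

print(estimate_hits(SAMPLE_PAGE))
```

For the sample page this counts the page, one stylesheet, one script and two distinct images.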
The term ‘impression’ has been adopted into web analytics
lingo from traditional advertising. In advertising, a count of impressions
is the number of times an advertisement has been seen or heard.
This is usually very hard to count (in newspapers, radio or television)
and is often estimated. With websites, we have logs to tell us the
number of hits the server has registered for the ad. When requests
for a particular item, such as a page or advertisement, are counted,
these hits can nominally indicate how many times the item was seen
and are therefore sometimes counted as impressions.
So if a single request to the home page can generate many hits
in your logs, what value is a hit count? Mainly it is useful for
managing or monitoring the load on your server. Each hit represents
a single network request. So a large number of hits corresponds
to many processes that your server has to deal with and more traffic
over your network connection. It is also a general indication of
traffic volume and can, therefore, be used to view growth of your
website.
A hit count can also be indicative of the user experience. If a single visit
to your home page generates 75 hits, that means the new visitor
has to wait through loading 75 items before she gets the full experience.
Most browsers will only load four or five items at a time, so loading
the home page will take a while. It might be a good idea to rethink
the design of that site to reduce the number of items on the pages
so they load faster. (Browsers do actually cache items; the effect
of this on your statistics is discussed in detail in Appendix B.)
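To put rough numbers on that wait, the arithmetic can be sketched as follows. This assumes every item takes about the same time to fetch and that the browser always keeps its parallel slots full, neither of which holds exactly in practice; the function name is illustrative.

```python
import math


def loading_rounds(items, parallel):
    """Minimum number of sequential batches needed to fetch `items` files
    when at most `parallel` of them are requested at the same time."""
    return math.ceil(items / parallel)


for parallel in (4, 5):
    print(parallel, "at a time:", loading_rounds(75, parallel), "rounds")
```

Under these assumptions a 75-item page needs between 15 and 19 rounds of requests before it is complete.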
Page views
Total hits provide only a rough estimate of the visitors’
experience. Thus we have developed other metrics in web analytics
to get a better sense of our visitors’ behavior. “Page
Hits” (also called page views) tells the number of hits to pages
(as opposed to graphics or other content). This means that when you load the home page of
your site you register only one page hit, in addition to several
other kinds of hits.
Page hits are a much more useful metric than hits for analyzing
user experience. Web sites are organized into pages, and users (and
designers) think of them in pages, so counting page hits makes more
sense. Now that you have loaded your home page, wander around your
site a bit. Keep track of the number of times a new page loads.
This is the number of page hits your visit is registering. The more
time you spend on the site, exploring, the more page hits you register.
So page hits represent the navigational experience of the visitor.
There is one situation in which page hits do not correlate directly
to the user experience. If your site uses frames, the frameset page
and all the frames that load to make the ‘visible page’
are each counted as a single page hit. So if your ‘home page’
actually is a frameset with three frames, it will register four
page hits, not one (what the user is experiencing) for that initial
load. If the visitor then clicks on a link that only changes one
frame, that will register as a single page hit. If you have built
your site so that each click loads a whole new frameset, then you
will get multiple page hits for each click the visitor performs.
If you understand the design of your site, however, you can easily
adjust the page hits count to reflect these aspects.
Graphics hits
Other types of hits are important too. “Graphics Hits”
are the number of requests for images, animations or other graphics.
Graphics are often larger than the content of a page and take up
a good portion of bandwidth as well as requiring the visitor to
wait while the page loads. If the number of graphics hits for your
site is significantly larger than that of page hits, then there
are probably a lot of graphics on some or all of your pages. You
might consider redesigning the site to make it load faster.
Downloads
Downloads can be programs, archives, zip files, or PDF documents that
users download from your site. If you distribute software or documents
from the site, it is helpful to have a quick count of the number
of downloads in each period.
Errors
Errors are simply a count of the number of requests for items that
did not complete – either because they were not there or could
not be produced. It is a quick diagnostic measurement of your
site’s health.
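The hit types described so far (page hits, graphics hits, downloads and errors) can all be tallied in one pass over a log. The sketch below assumes the server writes Common Log Format lines and that the type of an item can be guessed from its file extension. The extension lists and sample lines are illustrative assumptions, not a description of how any particular analytics package classifies requests.

```python
import re
from collections import Counter
from pathlib import PurePosixPath

# Common Log Format: host ident user [time] "METHOD path PROTO" status bytes
LOG_LINE = re.compile(
    r'^(\S+) \S+ (\S+) \[([^\]]+)\] "(\S+) (\S+)[^"]*" (\d{3}) (\S+)$'
)

# Assumed extension groups; adjust these to match your own site.
PAGE_EXT = {"", ".html", ".htm", ".php", ".asp"}
GRAPHIC_EXT = {".gif", ".jpg", ".jpeg", ".png"}
DOWNLOAD_EXT = {".zip", ".pdf", ".exe", ".tar", ".gz"}


def classify(path, status):
    if status >= 400:  # request did not complete
        return "error"
    ext = PurePosixPath(path.split("?", 1)[0]).suffix.lower()
    if ext in PAGE_EXT:
        return "page"
    if ext in GRAPHIC_EXT:
        return "graphic"
    if ext in DOWNLOAD_EXT:
        return "download"
    return "other"


def tally(lines):
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if not match:
            continue  # skip lines that are not in the assumed format
        counts["hits"] += 1
        counts[classify(match.group(5), int(match.group(6)))] += 1
    return counts


SAMPLE_LOG = [
    '10.0.0.1 - - [01/Mar/2004:10:00:00 +0000] "GET / HTTP/1.0" 200 5120',
    '10.0.0.1 - - [01/Mar/2004:10:00:01 +0000] "GET /logo.gif HTTP/1.0" 200 20480',
    '10.0.0.1 - - [01/Mar/2004:10:00:01 +0000] "GET /site.css HTTP/1.0" 200 1024',
    '10.0.0.2 - - [01/Mar/2004:10:05:00 +0000] "GET /docs/guide.pdf HTTP/1.0" 200 204800',
    '10.0.0.2 - - [01/Mar/2004:10:06:00 +0000] "GET /missing.html HTTP/1.0" 404 512',
]

print(dict(tally(SAMPLE_LOG)))
```

Every matched line counts as a hit, and each hit falls into exactly one of the narrower categories.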
Bytes
The count of bytes in a period is very useful for tracking the
bandwidth usage on your network. If you are billed for bandwidth
usage on a monthly basis you can see an estimate of the amount of
bandwidth your website uses. You can also use bytes to assist in improving
the design of the site. Individual pages can take a long time to
load if the graphics they reference are large. You can look at the
ratio of Bytes to Pages in a report and determine the average number
of bytes loaded per page on your site.
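As a rough illustration of the bytes-to-pages ratio, the sketch below divides the total bytes served by the number of page hits. The record layout (a page flag and a byte count per request) and the sample figures are assumptions made for the example.

```python
def bytes_per_page(records):
    """records: iterable of (is_page, bytes_sent) tuples, one per request.

    Returns total bytes served divided by the number of page hits,
    or 0.0 when there were no page hits at all.
    """
    records = list(records)
    total_bytes = sum(sent for _, sent in records)
    pages = sum(1 for is_page, _ in records if is_page)
    return total_bytes / pages if pages else 0.0


# Two page hits plus the graphics and stylesheet they pulled in.
SAMPLE_REQUESTS = [
    (True, 5120),    # home page HTML
    (False, 20480),  # logo image
    (False, 1024),   # stylesheet
    (True, 4096),    # second page HTML
]

print(bytes_per_page(SAMPLE_REQUESTS))
```

Here 30,720 bytes over two page hits gives 15,360 bytes per page.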
There are other factors that can affect the byte-to-page-hit value
that make this ratio only useful in some situations. If you have
any downloads on your site, this will likely increase the byte count
significantly. On the other hand, caches can make the byte counts
significantly lower than if all graphics were loaded. When assessing
the byte-to-page-hits ratio, you may want to consider the environment
as well. For example, if these statistics are for an intranet, where
users have a lot of available bandwidth, you may be perfectly comfortable
with a 100KB-per-page average, where you would not be when your
visitors were using 56k dial-up Internet connections. Nonetheless,
the byte count does accurately reflect the amount of data requested
from your server.
Users
So far we have discussed metrics that are helpful in relating content
to user experience, but you are probably wondering now just how
many users experienced your content. The most accurate way of counting
users is to require them to log in. If you use your web server’s
authentication system, then each user’s request will be associated
with his username. By counting unique usernames you can know the
number of unique users over the course of a period.
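Counting unique usernames amounts to building a set. The sketch below assumes the usernames have already been pulled from the log and that unauthenticated requests carry a "-" in the user field, as they do in Common Log Format; the names are made up.

```python
def unique_users(usernames):
    """Count distinct authenticated usernames, ignoring the "-"
    placeholder that marks a request with no login."""
    return len({name for name in usernames if name and name != "-"})


SAMPLE_USERS = ["alice", "-", "bob", "alice", "-", "carol"]

print(unique_users(SAMPLE_USERS))
```

Six requests, two of them anonymous and one a repeat, yield three unique users.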
Unique hosts
Unfortunately, most websites do not require login. Because of the
way the HTTP protocol was designed, there is no connection between
requests by the same user to your server. So without requiring authentication,
visitors and their visits have to be estimated.
In the past an estimate of unique visitors could be found by looking
at the Unique Hosts. Each request that a visitor makes to the site
(whether for a page or for the graphics and other referenced files
from that page) is associated with a host address (or IP number).
When the Internet first started, each computer connected to it had
its own host address, so a count of unique hosts was usually a good
indicator of the number of unique visitors you had in a given period.
Unfortunately this is no longer true. The majority of Internet users
connect through dial-up services of some kind. In order to preserve
IP numbers (there are a limited number available right now), the
dial-up providers will assign each user a number when he connects
and then reuse the number when he is done with it. So a dial-up
service may have 100 IP numbers that they select from and use to
serve 2000 users. This gets even more complicated with the caches and
proxies that many providers now use to improve performance.
There is still value to counting unique hosts. A comparative value
of unique hosts as a trend over time can be indicative of the reach
or breadth of coverage that your website is getting. If you increase
the number of unique hosts per month in a six month period, then
you are probably reaching a broader group of visitors.
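A month-by-month trend of unique hosts could be computed along these lines. This is a sketch that assumes each request has already been reduced to a (month, host) pair; the month labels and addresses are invented.

```python
from collections import defaultdict


def unique_hosts_by_month(records):
    """records: iterable of (month, host) pairs, one per request.

    Returns a dict mapping each month to its number of distinct hosts,
    ordered by month label.
    """
    hosts = defaultdict(set)
    for month, host in records:
        hosts[month].add(host)
    return {month: len(seen) for month, seen in sorted(hosts.items())}


SAMPLE_RECORDS = [
    ("2004-01", "10.0.0.1"),
    ("2004-01", "10.0.0.2"),
    ("2004-01", "10.0.0.1"),
    ("2004-02", "10.0.0.1"),
    ("2004-02", "10.0.0.3"),
    ("2004-02", "10.0.0.4"),
]

print(unique_hosts_by_month(SAMPLE_RECORDS))
```

The absolute numbers undercount or overcount visitors for the reasons given above, but the direction of the trend is still informative.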
Visits or sessions
In addition to counting the number of unique visitors you have
had, you often want to know how many visits you have had. A given
user may visit the site three distinct times in the course of a
month. This would count as three visits. Summary uses some complicated
heuristics to determine all the requests in a given visit. The
nature of the web makes it very difficult to know exactly what requests
belong to each individual visit. The Visit count can therefore be
used as an estimate of the number of visits that occurred in a period.
Another common term for visits is ‘sessions.’ This
is because a visit is often defined as a series of consecutive requests
from a given user bounded by inactivity. So if a user makes 100
requests, then does not make any for 45 minutes, then 75 more, it
is assumed those were two separate ‘sessions’ or ‘visits’
to your site.
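The inactivity rule can be sketched in a few lines. This is not the heuristic Summary uses, which is more involved; it is a bare illustration that assumes each request is identified by some visitor key (here the host address) and that 30 minutes of silence ends a visit. Both the key and the 30-minute threshold are assumptions.

```python
from datetime import datetime, timedelta


def count_visits(requests, timeout=timedelta(minutes=30)):
    """requests: iterable of (visitor_id, timestamp) pairs.

    A new visit starts whenever a visitor's request arrives more than
    `timeout` after that visitor's previous request.
    """
    last_seen = {}
    visits = 0
    for visitor, when in sorted(requests, key=lambda r: r[1]):
        previous = last_seen.get(visitor)
        if previous is None or when - previous > timeout:
            visits += 1
        last_seen[visitor] = when
    return visits


start = datetime(2004, 3, 1, 10, 0)
SAMPLE_REQUESTS = [
    ("10.0.0.1", start),
    ("10.0.0.1", start + timedelta(minutes=5)),
    ("10.0.0.2", start + timedelta(minutes=10)),
    ("10.0.0.1", start + timedelta(minutes=50)),  # 45 minutes of silence
]

print(count_visits(SAMPLE_REQUESTS))
```

The first host is credited with two visits because of its 45-minute gap, and the second host with one.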
Visit tracking with cookies
One of the most common techniques for improving visit metrics
and request data in general is to configure your web server to send
out a ‘session cookie.’ A cookie is a small bit of information
that your site can have visitors store with their browsers. Whenever
they request a page, the browser sends the cookie back to your server.
Cookies can be given an expiration date if you want them to last
for a while. A ‘session cookie’ has no expiration date
set so the browser deletes it as soon as the session completes (usually
when the user closes the browser). Session cookies are therefore
unique to each visit. While a given browser in a given visit may
still cache some of the requests, two different visitors accessing
your site through a web cache or proxy server will have to make
distinct requests.
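To make the idea concrete, here is a sketch of issuing and reading back a session cookie with Python's standard http.cookies module. The cookie name visit_id and the helper functions are hypothetical; in practice your web server or application framework would normally do this for you.

```python
import uuid
from http.cookies import SimpleCookie


def new_session_cookie(name="visit_id"):
    """Build a cookie carrying a random identifier for this visit."""
    cookie = SimpleCookie()
    cookie[name] = uuid.uuid4().hex
    cookie[name]["path"] = "/"
    # Deliberately no "expires" or "max-age" attribute: without one the
    # browser treats this as a session cookie and drops it when it closes.
    return cookie


def read_visit_id(cookie_header, name="visit_id"):
    """Extract the visit identifier from an incoming Cookie header value."""
    cookie = SimpleCookie()
    cookie.load(cookie_header)
    return cookie[name].value if name in cookie else None


issued = new_session_cookie()
set_cookie_header = issued.output(header="Set-Cookie:")
print(set_cookie_header)

# On later requests the browser sends the name=value pair back.
returned = "visit_id=" + issued["visit_id"].value
print(read_visit_id(returned))
```

Because the identifier is random per visit, two visitors behind the same proxy receive different values.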
With a properly configured server and properly configured analysis
software, you can even use these cookies to help the software figure
out what requests comprise each visit. Summary SP will allow you
to configure it to track visit cookies in your logs. In order to
make this most effective you should configure your web server to
log outgoing and incoming cookies so that you can correlate the
first request with subsequent requests in a visit.
While this is much more accurate than the heuristic approach, especially
when dealing with proxy sharing, it does not provide a perfect solution.
Some users will adamantly refuse to accept cookies from your sites.
Other user agents (like robots or spiders) never accept cookies.
Finally, some firewalls and proxies will strip cookies from requests
completely. If you require users to accept cookies, you may be blocking
a large number of potential visitors. If you do not, then you are
back to having some visits without cookie data and the previously
mentioned issues all return.
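The trade-off can be seen in a small sketch that counts visits from logged cookie values and separately reports the requests that arrived without one. The record layout, host names and cookie values are assumptions for illustration; this is not how Summary SP is configured or how it processes cookies.

```python
def visits_from_cookies(requests):
    """requests: iterable of (host, visit_cookie) pairs, where
    visit_cookie is None when the request carried no cookie.

    Returns (distinct cookie-identified visits, requests without a cookie).
    """
    cookie_visits = set()
    without_cookie = 0
    for _host, visit_cookie in requests:
        if visit_cookie:
            cookie_visits.add(visit_cookie)
        else:
            without_cookie += 1
    return len(cookie_visits), without_cookie


SAMPLE_REQUESTS = [
    ("proxy.example.net", "a1"),  # first visitor behind a shared proxy
    ("proxy.example.net", "b2"),  # second visitor, same proxy address
    ("proxy.example.net", "a1"),
    ("10.0.0.9", None),           # a robot or a user refusing cookies
]

print(visits_from_cookies(SAMPLE_REQUESTS))
```

The two visitors sharing one proxy address are told apart by their cookies, while the cookieless request still has to be assigned to a visit by the heuristic approach.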
Salient Marketing can help you understand how to evaluate your
web analytics (what your visitors are doing on your site), select
the performance metrics that matter and make the information from
your log files actionable. Stop guessing about what your visitors
are doing; understand it and convert more visitors into customers.
The integration of our metrics services with our search engine
optimization services and our pay-per-click management services
- all managed by the same dedicated campaign management team - provides
a considerable advantage in terms of coordination and communication
over managing pieces of this process by yourself, or through separate
vendors.
For more information on log file analysis please contact us
at 613-266-9812 or email us at info@salientmarketing.com.