ERCIM News No.25 - April 1996 - CLRC
Computing and Information Systems Department Web Demographics
by Victoria Marshall and Martin Prime
At RAL we are interested in the twin issues of Web demographics and Web
user interfaces.
As part of our on-going research work, we took as an example one month's
logs (October 1995) of CISD's web, analysed these files, and began to draw
some conclusions about reader demographics and user interfaces. The analysis
was by necessity highly localised and idiosyncratic, as this was the first
time we had done any serious analysis of the web usage. The analysis did,
however, give some constructive pointers as to reader behaviour.
In October 1995 the CISD web had 352 UK readers (28% of all readers), of
whom 89% were from the academic domain and 11% from the commercial domain.
This reflects the very large number of academic people who are interested
in an academic site such as RAL. However, of the 9929 pages accessed by UK
readers (81% of the total), 98% were accessed by academic readers and only
2% by commercial readers.
This would seem to indicate that commercial readers are coming into the
CISD web but then reading very little, perhaps because the web has very
little of interest to them, or alternatively because the sheer number of
icons is (for a modem user) too great.
In the notionally American domain, the situation was reversed: we had 289
readers (18% of all readers), of whom 38% were in the academic domain
and 42% in the commercial domain. Of the 1155 pages accessed (9% of the
total), 40% were accessed by academic readers and 47% by commercial readers.
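A breakdown of this kind can be sketched in a few lines of Python. This is an illustration only, not the procedure used for the figures above: the suffix table and the function names are assumptions, and a reader is taken to be one distinct host name in the logs.

```python
from collections import Counter

# Hypothetical suffix table: the article does not list the rules it used.
# Order matters only in that the first matching suffix wins.
SUFFIX_RULES = [
    (".ac.uk", ("UK", "academic")),
    (".co.uk", ("UK", "commercial")),
    (".edu", ("US", "academic")),
    (".com", ("US", "commercial")),
]


def classify_host(hostname):
    """Return a (region, sector) pair for a host name, or ("other", "other")."""
    host = hostname.lower().rstrip(".")
    for suffix, label in SUFFIX_RULES:
        if host.endswith(suffix):
            return label
    return ("other", "other")


def reader_breakdown(hostnames):
    """Count distinct hosts per (region, sector); repeat visits count once."""
    distinct_hosts = {h.lower() for h in hostnames}
    return Counter(classify_host(h) for h in distinct_hosts)
```

Counting pages rather than readers would mean applying `classify_host` to every log line instead of to the set of distinct hosts, which is how the reader share and the page share can differ so sharply.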
Readership throughout the rest of the world was fairly widespread. Germany
and France provided the most readers, followed by the Netherlands, Italy,
Sweden (all of which are our ERCIM partners) and Canada.
The People section of our web is the most popular within the CISD web. Most
of the accesses seemed to come from people simply trying to contact someone,
from people looking at the pages of staff involved in some Web development,
or from various types of search engine. Some deliberate actions can be
discerned, however.
Quite a number of accesses were for specific groups of people; other accesses
were to the pages of internationally recognised names, or those involved
with some high-profile activity such as ERCIM. A small minority of people
accessed pages for what would seem to be infantile reasons; one person (possibly
a German student) accessed only the pages of female staff; others accessed
only the pages of people whose names are perceived to have some quirk or
"foreign" names spelled with an umlaut.
A third (33%) of outside referrals came via search engines. Because so many
readers were coming into the Web in this way, it seemed sensible to analyse
the queries they were using to get here.
Infoseek was the most popular web search engine (19% of queries came from
its site); Lycos and AltaVista came next (with 5% and 4% of queries respectively).
The 3838 incoming queries contained 1555 unique queries. Not surprisingly,
perhaps, the most common search was for "Rutherford", which occurred
either on its own or in combination with "Appleton" and/or "Laboratory".
Other common search terms were "heathrow", "gatwick"
and "transputer".
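As a rough sketch of how such queries might be recovered from the referrer field of a log, the Python below pulls the search string out of each referring URL and tallies engines and queries. The parameter names in `QUERY_PARAMS` are assumptions (each search engine chooses its own), and the function names are hypothetical; this is not the tooling used for the figures above.

```python
from collections import Counter
from urllib.parse import parse_qs, urlsplit

# Assumed names of the URL parameter that carries the search string.
QUERY_PARAMS = ("q", "query", "qt")


def extract_query(referrer):
    """Return the normalised search string in a referrer URL, or None."""
    params = parse_qs(urlsplit(referrer).query)  # decodes "+" as a space
    for name in QUERY_PARAMS:
        if name in params:
            # Lower-case and collapse whitespace so variants count together.
            return " ".join(params[name][0].lower().split())
    return None


def tally_queries(referrers):
    """Count search referrals per engine host and per distinct query."""
    engines = Counter()
    queries = Counter()
    for referrer in referrers:
        query = extract_query(referrer)
        if query is None:
            continue  # not a recognisable search referral
        engines[urlsplit(referrer).hostname] += 1
        queries[query] += 1
    return engines, queries
```

With this in place, `len(queries)` gives the number of unique queries and `queries.most_common(10)` the most frequent ones.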
A few slightly dubious terms were also used: "chippendales" (a
male performance troupe) and "stevie+nicks" (a singer with Fleetwood
Mac) for example. These were mis-indexed as variants on the names of various
members of staff within the Department.
Some confusions were inevitable and evident. Of the 142 searches for "everest"
(a project within the Department), 57 were clearly intended to apply to
"Mount Everest" and included terms such as "everest+explorers",
"sir+edmund+hillary+everest" and "skidoo+everest".
It was also likely that some users had completely misunderstood the capabilities
of the search engine: "indirect+flight+from+heathrow+to+tokyo"
and "what+is+web+site+for+calton+university".
We also looked at readers' sessions once they were in the CISD web. 120
users made a total of 2424 hits, an average of 20 each, with a maximum of
149 hits. Of 1250 users, 38% hit just one page of the web, 18% made just
2 hits, 12% made 3 hits and 8% made 4 hits.
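Session figures of this kind (mean, maximum, and the share of users making one, two or three hits) can be derived from a list of the hosts that made each request. The sketch below is a minimal illustration under a strong assumption, namely that each distinct host is one user and that all of its hits form a single session; the function name is hypothetical.

```python
from collections import Counter


def session_summary(log_hosts):
    """Summarise hits per user from one host name per logged request.

    Assumes one host equals one user; proxies and shared machines
    would break that assumption.
    """
    hits_per_user = Counter(log_hosts)
    users = len(hits_per_user)
    total_hits = sum(hits_per_user.values())
    # Maps "number of hits" to "number of users who made that many hits".
    distribution = Counter(hits_per_user.values())
    return {
        "users": users,
        "hits": total_hits,
        "mean_hits": total_hits / users if users else 0.0,
        "max_hits": max(hits_per_user.values(), default=0),
        "share_by_hits": {
            hits: count / users for hits, count in sorted(distribution.items())
        },
    }
```

The `share_by_hits` entry for 1 corresponds to the proportion of users who looked at a single page and left.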
It is clear that much work could be done on the analysis of web demographics,
and this is an area of research that we are actively pursuing. At the moment
it is very much an exploratory exercise; however some suggestions and observations
can already be made.
Firstly, not all browsers comply with the HTTP standard! Further information
about users (if only the machine name) would be invaluable. Secondly, as
the web grows, the use of search engines is bound to increase, which necessitates
more efficient use of search terms and the careful design of web page content.
Thirdly, (in the academic domain at least) people are interested in people.
We have already subtly restructured our Web to exploit this fact, and
"lead" readers from people pages to our project pages.
Please contact:
Victoria Marshall or Martin Prime - CLRC
Tel: +44 1235 82 1900
E-mail: V.A.Marshall@rl.ac.uk,