Evaluating Web Server Log Analysis Tools
David Strom
david@strom.com
SD’98, 2/13/98
SD'98 (c) David Strom, Inc.

Summary
• Examine different log files
• What you can and can’t learn from your logs
• Pros and cons of various tools

Different types of log files
• Access
• Error
• Referral
• Other

Access logs
• Domain name
• Date, time
• Server command processed and result
• URL of visitor
• Bytes transmitted

Sample access log data
rm258.fav.usu.edu [31/May/1995:09:03:23 +0600] "GET /NEI.html HTTP/1.0" 302 396
rm258.fav.usu.edu [31/May/1995:09:03:28 +0600] "GET /xculture/nei.html HTTP/1.0" 200 2114
rm258.fav.usu.edu [31/May/1995:09:03:30 +0600] "GET /gifs/sedlbutton.gif HTTP/1.0" 200 1336
129.71.83.161 [31/May/1995:09:20:32 +0600] "GET /RELs.html HTTP/1.0" 304 0
Leslie-Francis.tenet.edu [31/May/1995:09:36:06 +0600] "GET / HTTP/1.0" 200 1867
ls973.ulib.albany.edu [31/May/1995:09:40:52 +0600] "GET /viii1.html HTTP/1.0" 404 244
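
Access log entries like these can be split into fields with a short script. The following is a minimal Python sketch, not the method of any tool covered here. It assumes each entry has the shape host, [timestamp], "request", status, bytes, and treats the ident and user fields of Common Log Format as optional. The names LINE_RE and parse_line are hypothetical.

```python
import re
from datetime import datetime

# Assumed layout: host, optional ident and user fields, [timestamp],
# "request", status code, bytes sent. Names here are illustrative only.
LINE_RE = re.compile(
    r'^(?P<host>\S+)\s+'
    r'(?:\S+\s+\S+\s+)?'          # ident and user, when the server logs them
    r'\[(?P<when>[^\]]+)\]\s+'
    r'"(?P<request>[^"]*)"\s+'
    r'(?P<status>\d{3})\s+(?P<size>\d+|-)\s*$'
)

def parse_line(line):
    """Return a dict of fields, or None if the line does not match."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    parts = m.group("request").split()
    return {
        "host": m.group("host"),
        "when": datetime.strptime(m.group("when"), "%d/%b/%Y:%H:%M:%S %z"),
        "method": parts[0] if parts else "",
        "path": parts[1] if len(parts) > 1 else "",
        "status": int(m.group("status")),
        "size": 0 if m.group("size") == "-" else int(m.group("size")),
    }

entry = parse_line(
    'rm258.fav.usu.edu [31/May/1995:09:03:23 +0600] "GET /NEI.html HTTP/1.0" 302 396'
)
print(entry["host"], entry["path"], entry["status"], entry["size"])
```

Returning None for lines that do not match lets a caller count and inspect malformed entries instead of stopping on the first one.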

Errors reported in your logs
• Clients that time out (or leave in frustration!)
• Scripts that don’t produce any output
• Server bugs
• User authentication or configuration problems

Sample error log data
[Thu May 30 07:25:32 1996] send timed out for bamberg.sedl.org
[Thu May 30 07:57:41 1996] send timed out for kenya.sedl.org
[Thu May 30 08:23:11 1996] send timed out for ppp092.kyotoinet.or.jp
[Thu May 30 09:15:52 1996] access to /usr/local/www/htdocs/scimath/compass/vol03 failed for 170.211.67.51, reason: File does not exist
[Thu May 30 09:57:56 1996] send timed out for dd10048.compuserve.com
[Thu May 30 10:47:25 1996] read timed out for ncia110b.ncia.net
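
Error entries of this kind can be grouped into rough categories to show which problem dominates. This is a hedged Python sketch that assumes each entry starts with a bracketed timestamp followed by free text. The category names and the helpers classify and summarize are made up for illustration.

```python
import re
from collections import Counter

# Assumed shape: "[timestamp] message"
ENTRY_RE = re.compile(r'^\[(?P<when>[^\]]+)\]\s+(?P<message>.*)$')

def classify(message):
    """Map an error message to a coarse, illustrative category."""
    if "timed out" in message:
        return "timeout"
    if "File does not exist" in message:
        return "missing file"
    return "other"

def summarize(lines):
    """Count error entries per category, skipping lines that do not match."""
    counts = Counter()
    for line in lines:
        m = ENTRY_RE.match(line.strip())
        if m:
            counts[classify(m.group("message"))] += 1
    return counts

sample = [
    "[Thu May 30 07:25:32 1996] send timed out for bamberg.sedl.org",
    "[Thu May 30 09:15:52 1996] access to /usr/local/www/htdocs/scimath/compass/vol03 "
    "failed for 170.211.67.51, reason: File does not exist",
    "[Thu May 30 10:47:25 1996] read timed out for ncia110b.ncia.net",
]
print(summarize(sample))
```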

Referral logs
• Who links to your site?
• Who downloads your pages?

Sample referral log data
http://www.isisnet.com/ -> /change/welcome.html
http://www.ipl.org/ref/RR/EDU/Research-rr.html -> /welcome.html
http://www.tenet.edu/snp/main.html -> /policy/networks/toc.html
http://www.tenet.edu/new/main.html -> /policy/networks/toc.html
http://guide-p.infoseek.com/NS/Titles?qt=teacher+training -> /resources/SCIMAST/announcement.html
http://www.tenet.edu/new/main.html -> /policy/networks/toc.html
http://www.nwrel.org/national/regional-labs.html -> /welcome.html
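
A tally of referring hosts gives a quick answer to "who links to your site?". The Python sketch below assumes each referral entry is a referring URL and a local path separated by "->". The function name referring_hosts is hypothetical.

```python
from collections import Counter
from urllib.parse import urlsplit

def referring_hosts(lines):
    """Count referral entries per referring host name."""
    counts = Counter()
    for line in lines:
        referrer, sep, _target = line.partition("->")
        if not sep:
            continue  # not a "referrer -> target" entry
        host = urlsplit(referrer.strip()).hostname
        if host:
            counts[host] += 1
    return counts

sample = [
    "http://www.isisnet.com/ -> /change/welcome.html",
    "http://www.tenet.edu/snp/main.html -> /policy/networks/toc.html",
    "http://www.tenet.edu/new/main.html -> /policy/networks/toc.html",
    "http://guide-p.infoseek.com/NS/Titles?qt=teacher+training -> /resources/SCIMAST/announcement.html",
]
for host, count in referring_hosts(sample).most_common():
    print(count, host)
```

Counting by host rather than by full URL keeps several pages on one linking site from showing up as separate sources.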

Common log format
• Output by most standard servers
• Needed by most third-party log analyzers
• hoohoo.ncsa.uiuc.edu/docs/setup/httpd/Overview.html

Extended/custom log formats
• Log whatever you wish in whatever order you wish
• Useful if you will read them regularly!
• But can’t work with the analyzers
• Now in IIS v4, NSCP v3, others.

What you can learn from your log files
• Hits per day
• Domain origins
• The path people take in and around your web
• Problem areas

HITS
• (How Idiots Track Success)
• Nobody uses this word anymore
• Doesn’t really measure individual users, just access
• Caching servers and proxies mess up these statistics
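
The gap between hits and visitors shows up when requests and distinct hosts are counted side by side. This Python sketch assumes the access log has already been reduced to (day, host) pairs. The function daily_traffic is hypothetical, and even the unique-host figure only approximates visitors, since proxies and caches hide real users.

```python
from collections import defaultdict

def daily_traffic(requests):
    """requests: iterable of (day, host) pairs, one per logged request.

    Returns {day: (hits, unique_hosts)}.
    """
    hits = defaultdict(int)
    hosts = defaultdict(set)
    for day, host in requests:
        hits[day] += 1
        hosts[day].add(host)
    return {day: (hits[day], len(hosts[day])) for day in hits}

sample = [
    ("1995-05-31", "rm258.fav.usu.edu"),   # redirect
    ("1995-05-31", "rm258.fav.usu.edu"),   # page
    ("1995-05-31", "rm258.fav.usu.edu"),   # image on that page
    ("1995-05-31", "129.71.83.161"),
    ("1995-05-31", "Leslie-Francis.tenet.edu"),
]
print(daily_traffic(sample))
```

In this sample one visitor fetching a single page accounts for three of the five hits.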

Domain origins
• Where users are coming from -- sometimes
• Just because they are from ibm.net doesn’t mean they work at IBM!
• Forgotten accounts, friends and family using the account
• Hacked user names
• Proxies don’t help here either
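
A rough breakdown of domain origins can be built from the host field alone. The Python sketch below groups hosts by their last label and reports bare IP addresses as "unresolved". That label and the function names are assumptions for illustration, and the caveats on this slide still apply to whatever the tally shows.

```python
import ipaddress
from collections import Counter

def top_level_domain(host):
    """Return the last label of a host name, or "unresolved" for a bare IP."""
    try:
        ipaddress.ip_address(host)
        return "unresolved"
    except ValueError:
        pass  # not an IP address, so treat it as a host name
    return host.rsplit(".", 1)[-1].lower()

def domain_origins(hosts):
    """Count hosts per top-level domain."""
    return Counter(top_level_domain(h) for h in hosts)

sample = [
    "rm258.fav.usu.edu",
    "Leslie-Francis.tenet.edu",
    "129.71.83.161",
    "ppp092.kyotoinet.or.jp",
    "dd10048.compuserve.com",
]
print(domain_origins(sample))
```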

The path people take in and around your web
• Search engines help sometimes
• Which search site was the most popular front door
• Who links to you and why
• Is there a pattern or a random walk?

Problem areas to deal with
• Broken links (locally)
• Broken outbound links
• Time outs (sunspots?)
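
Broken local links show up in the access log as requests answered with status 404. A minimal Python sketch, assuming the log has been reduced to (path, status) pairs; missing_pages is a hypothetical helper and the sample paths are illustrative.

```python
from collections import Counter

def missing_pages(requests):
    """requests: iterable of (path, status) pairs. Counts 404s per path."""
    return Counter(path for path, status in requests if status == 404)

sample = [
    ("/NEI.html", 302),
    ("/viii1.html", 404),
    ("/", 200),
    ("/viii1.html", 404),
]
for path, count in missing_pages(sample).most_common():
    print(count, path)
```

Paths that fail repeatedly are the first candidates for a fix or a redirect.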

What you can’t learn from your logs
• Who are these people, anyway?
  – No specific user names
  – Is it a bot or a real human?
• How long did they view a page?
  – Most people don’t spend much time on your web
  – Where did they go visit next?

What technologies are available?
• Built-in analyzer tools
• Sites that capture user info
• Secure sites with registration
• Build your own from perl
• Third-party tools

Built-in tools
• WebSite, website.ora.com
• IIS with Site Server, www.microsoft.com/iis
• Netscape servers, www.netscape.com
• Easy to use but limited

WebSite Professional v2
• Win NT, 95
• Best web server for learning about logs, best docs
• QuickStats module for instant analysis:
  – single report but nice set of information
  – shows today, last two days requests and unique hosts
  – IP addresses of visitors, average requests/hour

IIS Site Server
• NT Server v4 w/SP3 only
• Lots of preconfigured reports
• Two versions, Express and Full (customized reports)
• backoffice.microsoft.com/products/siteserver/express/

Netscape v3 web servers
• Various NT, Unix versions
• Reports for a few variables but nothing too extensive
• Best to use a third-party tool here

Sites that capture user info
• WebCounter, www.digits.com -- third-party hit counter
• Someone else does the programming and debugging
• But beyond your control

Secure sites with registration
• You know your users
• But many won’t register, or forget their passwords
• Requires scripting, database integration, more maintenance

Build your own from perl
• Needs some in-house support
• Works best with Unix-based webs
• Examples:
  – refstats, members.aol.com/htmlguru/refstats.html
  – surfreport, bienlogic.com/SurfReport/

Third-party tools
• WebTracker, www.CQMInc.com/webtrack
• WebTrends, www.webtrends.com
• net.Genesis, www.netgen.com
• MarketWave, www.marketwave.com
• IIS Assistant, www.go-iis.com

Third-party tools (cont’d)
• Can make very pretty reports
• Customizable
• Make sure they support your particular log format
• Not that expensive, mostly run on Windows