A useful little web access log reporting tool

One of the things that was on my to-do list after setting up the Meninos do Morumbi Oldham website was to do something on the reporting front with the server log files. I'd already set up Tomcat to generate combined log format files by putting this in the server.xml file:

    <Valve className="org.apache.catalina.valves.AccessLogValve"
           directory="logs"
           prefix="access."
           suffix=".log"
           pattern="combined"
           resolveHosts="false" />

so I had the raw data I needed; I just had to do something with it.  In the past I've used AWStats to do log file reporting, but it is written in Perl and therefore needs a cgi-bin setup.  This is easily done if you are running Apache, but I'm running stand-alone Tomcat, and although you can run CGI scripts under Tomcat, it isn't really recommended.

As is often the way, I was looking for something else entirely when I came across Visitors, a stand-alone log file analyser written in C.  It writes its report as either a single HTML or text file, and the report looked fine, so I could just run it from cron once an hour and put the generated report somewhere in the tree managed by MeshCMS.

One small additional wrinkle: as you can see from the server.xml entry above, I've turned off DNS lookups for the access logs. The reason for this is that DNS lookups can take some time, and I don't want logging to slow down the web server. However, it's useful to have resolved names for reporting purposes, so I pipe the log files through the Apache logresolve utility before feeding them into Visitors.  At the moment I'm doing this each time I build the reports - I should really just do this once and cache the result, but that's a job for another day :-)
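
For what it's worth, the "resolve once and cache" idea could look something like the sketch below. This is not how logresolve works internally and it isn't what I'm running; it is just a minimal Python illustration that assumes the client address is the first space-separated field of each line (as it is in combined format). The names `reverse_dns` and `resolve_lines` are made up for the example.

```python
import socket
from typing import Callable, Dict, Iterable, Iterator


def reverse_dns(ip: str) -> str:
    """Return the host name for ip, or ip itself if the lookup fails."""
    try:
        return socket.gethostbyaddr(ip)[0]
    except OSError:
        # socket.herror and socket.gaierror are both subclasses of OSError.
        return ip


def resolve_lines(
    lines: Iterable[str],
    cache: Dict[str, str],
    lookup: Callable[[str], str] = reverse_dns,
) -> Iterator[str]:
    """Replace the leading client address of each log line with its host name.

    Each distinct address is looked up at most once; the result is stored
    in `cache`, which the caller owns and can reuse across runs.
    """
    for line in lines:
        # Assumption: the address is everything before the first space.
        ip, sep, rest = line.partition(" ")
        if ip not in cache:
            cache[ip] = lookup(ip)
        yield cache[ip] + sep + rest
```

Because the cache is an ordinary dictionary, it could be written to disk between runs (with the json module, say), so that an hourly report only pays for lookups on addresses it hasn't seen before.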

Categories : Web, Tech