Google Analytics (GA) is a decent service for tracking activity on your site. However, there are times when you may need to log some statistics yourself.
By its nature, GA does not include any information that would allow you to identify who visited a page or clicked a link. Also, GA does not give immediate updates – you have to wait up to 24 hours to see the current day’s stats.
A site that provides recommendations to registered users (Amazon, for instance) may wish to log its own statistics. An ad network may want to log impressions and clicks on its widget. What do you need to be aware of when logging your own stats?
Writing to Text Files
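If you’re going to run a widget on any site with high traffic, you may run into problems if you try to write to your database directly from the script that runs the widget. Every view or click becomes a database write, and a burst of traffic can slow down both the database and the widget.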
A safer though not particularly elegant solution is to write to text files instead. Each file should allow you to easily recognise what it’s for, e.g. the filename “Advert1000” could be used to log stats for advert ID 1000. Or, you could store all advert stats in a folder called advert-stats and just put the ID in the filename.
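As a rough illustration, here is a minimal sketch in Python of appending one line per event to a per-advert file. The function name log_event, the tab-separated "timestamp, event" line format and the event names are all assumptions made for this example; only the advert-stats folder and the “Advert1000” filename come from the text above.

```python
import os
import time

STATS_DIR = "advert-stats"  # folder name taken from the example in the text


def log_event(advert_id, event, stats_dir=STATS_DIR):
    """Append one line for this advert to its stats file and return the file path."""
    os.makedirs(stats_dir, exist_ok=True)
    # e.g. advert-stats/Advert1000 for advert ID 1000
    path = os.path.join(stats_dir, f"Advert{advert_id}")
    # Assumed line format: Unix timestamp, a tab, then the event name
    with open(path, "a", encoding="utf-8") as handle:
        handle.write(f"{int(time.time())}\t{event}\n")
    return path
```

Calling log_event(1000, "view") from the widget script would add one line to advert-stats/Advert1000 without touching the database.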
Beware of Write Clashes
If you’re using text files, it may be safest to include a random number in the filename in case a large number of visitors clock up stats at the same time, e.g. “Advert1000_12345678”. The ID is still in the filename, and the random number appears at the end, with an underscore as a delimiter.
Of course, don’t just hard-code the same random number for each file or it’s not going to work.
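One way this could look in Python is sketched below. It assumes one small file per event, an eight-digit random suffix and the same tab-separated "timestamp, event" line as an illustrative format; the function names log_event_unique and advert_id_from_filename are made up for the example.

```python
import os
import secrets
import time


def log_event_unique(advert_id, event, stats_dir="advert-stats"):
    """Write one event to a new file named Advert<ID>_<random number>."""
    os.makedirs(stats_dir, exist_ok=True)
    while True:
        # A fresh random number for every file, zero-padded to eight digits
        suffix = secrets.randbelow(10**8)
        path = os.path.join(stats_dir, f"Advert{advert_id}_{suffix:08d}")
        try:
            # Mode "x" creates the file and fails if the name is already taken
            with open(path, "x", encoding="utf-8") as handle:
                handle.write(f"{int(time.time())}\t{event}\n")
            return path
        except FileExistsError:
            continue  # rare collision: pick another random number


def advert_id_from_filename(filename):
    """Recover the advert ID from a name such as Advert1000_12345678."""
    name, _, _suffix = filename.partition("_")
    return int(name[len("Advert"):])
```

Opening the file in exclusive-create mode means that even if two requests pick the same random number, one of them simply tries again rather than overwriting the other.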
Gather the Stats Regularly
You’ll need to set up a cron job to regularly parse and then delete your stats files. With a single file per advert, reading the file, writing to your database and deleting the file afterwards can lose stats: on a busy site, some views may have been logged in the file since you started reading it. If you’re using a random number in the filename, that is much less likely.
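The script that the cron job runs might look something like the sketch below. It makes several assumptions that are not from the text: each stats file holds tab-separated "timestamp, event" lines, filenames start with "Advert" followed by the ID and an optional "_random" suffix, and SQLite (via Python's built-in sqlite3 module) stands in for whatever database you actually use. The function name collect_stats and the advert_stats table are hypothetical.

```python
import os
import sqlite3
from collections import Counter


def collect_stats(stats_dir, db_path):
    """Read every stats file, add the totals to the database, then delete the files."""
    totals = Counter()
    processed = []
    for name in os.listdir(stats_dir):
        if not name.startswith("Advert"):
            continue
        # "Advert1000_12345678" -> 1000 (the suffix after "_" is ignored)
        advert_id = int(name[len("Advert"):].split("_")[0])
        path = os.path.join(stats_dir, name)
        with open(path, encoding="utf-8") as handle:
            for line in handle:
                if not line.strip():
                    continue
                _timestamp, event = line.rstrip("\n").split("\t")
                totals[(advert_id, event)] += 1
        processed.append(path)

    conn = sqlite3.connect(db_path)
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "CREATE TABLE IF NOT EXISTS advert_stats ("
            "advert_id INTEGER, event TEXT, total INTEGER, "
            "PRIMARY KEY (advert_id, event))"
        )
        for (advert_id, event), count in totals.items():
            cur = conn.execute(
                "UPDATE advert_stats SET total = total + ? "
                "WHERE advert_id = ? AND event = ?",
                (count, advert_id, event),
            )
            if cur.rowcount == 0:
                conn.execute(
                    "INSERT INTO advert_stats (advert_id, event, total) "
                    "VALUES (?, ?, ?)",
                    (advert_id, event, count),
                )
    conn.close()

    # Delete only the files that were read, and only after the database commit
    for path in processed:
        os.remove(path)
    return totals
```

Deleting the files only after the database write has been committed means a failed run leaves the files in place for the next one. Files created while the script is running are not in the list it read, so they are picked up on the following run.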
Look At Alternative Database Options
Logging a lot of statistics for a lot of sites means storing a lot of data. Typically, this type of data only needs to be stored and retrieved: once you’ve collected all of your stat files for a given day, you won’t be going back and updating those values.
As a result, it’s worthwhile looking at other databases for storing your stats. Infobright is one option you could look at. It allows you to store very large amounts of data and run queries against it in a fraction of the time that it would take, say, MySQL.
You don’t have to change your entire application to use a different database engine – in fact it’s probably best that you don’t. Just the statistics will do.
Start Small, Work Up
Once you start logging your first statistics, you may start thinking of many other stats that you could log. It’s really important to get it right with your first attempt before adding more statistics to your site. Starting with something such as total pageviews is best. You can then look at unique views, clicks and so on once your stats have started to build up.
Watch out for today’s Hack of the Day, where I’ll be showing you a tool that can be used to display your stats in a graph.
Photo by kevindooley