Tracking RSS subscribers…

I’ve been talking about blogging and general business/marketing things a lot recently, so I thought it was about time for some more technical stuff. Don’t expect this to be too in-depth though, as I had this idea while washing up :0)

The problem with people subscribing to RSS feeds rather than visiting your site is that you have much less idea of who is reading. Visitors to web pages leave tracks: IP addresses, browser versions, referrer details and all manner of other tidbits of information. RSS feed readers don’t provide so much in the way of clues, so I started thinking about how I could tell who is reading the RSS feeds on the Wiblog system.

The extremely popular Feedburner does this and much more, but requires signing up for every individual feed, which isn’t really practical for the Wibloggers. So as usual I’m going to attempt to roll my own.

The first thing to realise is that to track requests to any RSS resource, you need a unique reference for each visitor. The IP address would be great for this, except that some readers (including my favourite, Bloglines) fetch the feed from a single IP on behalf of many people. Everyone reading through such a service would show up as one subscriber. That’s a problem, and I’m not sure what to do about it. Whether Feedburner have a way around that particular bugbear or not I don’t know.

So there’s one problem. The second niggle is that feed readers by their very nature hit an RSS feed quite often. Sometimes every 4 hours, sometimes every 30 minutes. But at any rate, it’s a lot. And there’s no point in storing every individual request as the database would easily become very large with pretty useless data.

So what I came up with is this. Each RSS feed on Wiblog.com will check the database to see if the requesting IP address has made a request for that feed before. If not, a new record will be inserted storing the IP address (I could do a lookup of the domain, too) and the datestamp of the first feed request. If the requesting IP has requested that feed before, the database will just be updated with the datestamp of the request. So, the table could end up looking like this:

Unique ID   Requesting IP   First request       Last request
1           1.1.1.1         01/01/2006 10:00    30/03/2006 14:00
2           2.2.2.2         01/01/2006 10:00    27/03/2006 12:00
3           3.3.3.3         01/01/2006 10:00    12/01/2006 12:00

So here we can see that all the IPs made their first requests at the same date and time. Therefore at that date I had 3 subscribers. However IP 3.3.3.3 hasn’t made a request for well over a month, so I can discount them. Any feed reader that hasn’t read a feed in over 2 weeks is pretty much to be ignored, in my book. So at the present time I have 2 subscribers.
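To make the idea a bit more concrete, here is a rough sketch of the insert-or-update step and the two-week cut-off in Python with SQLite. It isn’t how Wiblog is actually built: the table name feed_requests, the feed_id column and both function names are made up for illustration, and I’m assuming one row per feed and IP pair.

```python
import sqlite3
from datetime import datetime, timedelta

# Hypothetical schema (not Wiblog's real one): one row per (feed, IP) pair.
SCHEMA = """
CREATE TABLE IF NOT EXISTS feed_requests (
    id            INTEGER PRIMARY KEY AUTOINCREMENT,
    feed_id       INTEGER NOT NULL,
    ip            TEXT    NOT NULL,
    first_request TEXT    NOT NULL,
    last_request  TEXT    NOT NULL,
    UNIQUE (feed_id, ip)
)
"""


def record_request(conn, feed_id, ip, when):
    """Store the first request from an IP; afterwards only bump last_request."""
    # ISO-style strings sort chronologically, so they can be compared as text.
    stamp = when.isoformat(sep=" ", timespec="seconds")
    cur = conn.execute(
        "UPDATE feed_requests SET last_request = ? WHERE feed_id = ? AND ip = ?",
        (stamp, feed_id, ip),
    )
    if cur.rowcount == 0:
        # No existing row was updated, so this IP is new for this feed.
        conn.execute(
            "INSERT INTO feed_requests (feed_id, ip, first_request, last_request) "
            "VALUES (?, ?, ?, ?)",
            (feed_id, ip, stamp, stamp),
        )
    conn.commit()


def count_subscribers(conn, feed_id, now, max_age_days=14):
    """Count IPs whose last request falls within the last max_age_days days."""
    cutoff = (now - timedelta(days=max_age_days)).isoformat(
        sep=" ", timespec="seconds"
    )
    row = conn.execute(
        "SELECT COUNT(*) FROM feed_requests WHERE feed_id = ? AND last_request >= ?",
        (feed_id, cutoff),
    ).fetchone()
    return row[0]


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    first = datetime(2006, 1, 1, 10, 0)
    last_seen = {
        "1.1.1.1": datetime(2006, 3, 30, 14, 0),
        "2.2.2.2": datetime(2006, 3, 27, 12, 0),
        "3.3.3.3": datetime(2006, 1, 12, 12, 0),
    }
    for ip, last in last_seen.items():
        record_request(conn, 1, ip, first)  # first hit creates the row
        record_request(conn, 1, ip, last)   # later hit only updates it
    print(count_subscribers(conn, 1, datetime(2006, 3, 30, 14, 0)))  # 2
```

However often a reader polls, each feed and IP pair only ever takes up one row, so the table grows with the number of distinct readers rather than the number of requests.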

A simple table like that is enough to create basic statistics of subscribers. There are a lot of caveats and problems with this idea, but it’s a start. However, I’m not the only person to think about this. If you know a better way to do this, please let me know.