There’s been a lot of talk recently about referral spam and how it’s ruining everyone’s analytics data. While this isn’t cause for panic, it is very annoying, and depending on the size of your site it could be having a very meaningful effect on your data.
Most solutions I’ve seen so far involve maintaining a list of spammy domains, which seems impractical at best. In this post I’ll outline two filters which, in most cases, should exclude the vast majority of referral spam and require zero maintenance. Lastly, if you want to go the extra mile and filter out those last few spammy sessions, I’ll outline a low-maintenance option for that, too.
Firstly, if all this is news to you, check the referring domains report in your Google Analytics account and see if it contains any of these:
The image above is a screenshot from a dummy Google Analytics profile I created several months ago for testing. I never ended up using it – the tracking ID has never been added to a website. Given this, I can only assume that the spammers are going about their work by randomly cycling through possible tracking IDs. More usefully, I can be sure that all of the visits to this profile are spam.
If you’re not familiar with filters in Google Analytics, check out Google’s documentation here. It’s worth noting that filters are not retroactive: they only affect data collected after they were implemented. To look at historical data spam-free, you’ll have to create a segment that replicates all of your filter rules.
In the case of my dummy Analytics account, and most of the client profiles I’ve looked at, the vast majority of spam sessions fall into one or both of the following two categories:
- Invalid hostname (i.e. not your site)
- Screen resolution = “(not set)”
As such, they can be permanently excluded with just two filters:
Screen Resolution Exclusion:
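As a rough illustration of the logic behind these two filters, here is a minimal Python sketch rather than the Google Analytics configuration itself. It assumes a hypothetical site at `example.com` and a handful of made-up sessions; the function name `is_probable_spam` and the field names are purely illustrative. In Analytics, the equivalent would most likely be an include filter on the Hostname field and an exclude filter on the Screen Resolution field, using similar patterns.

```python
import re

# Assumption: your site lives at example.com (including subdomains such as www).
# Replace this with your own hostname pattern.
VALID_HOSTNAME = re.compile(r"(^|\.)example\.com$")

# Matches the literal value "(not set)"; the parentheses must be escaped.
NOT_SET = re.compile(r"^\(not set\)$")


def is_probable_spam(hostname, screen_resolution):
    """Return True if a session fails either of the two checks."""
    invalid_host = VALID_HOSTNAME.search(hostname) is None
    no_resolution = NOT_SET.match(screen_resolution) is not None
    return invalid_host or no_resolution


# Hypothetical sessions, for illustration only.
sessions = [
    {"hostname": "www.example.com", "screen_resolution": "1920x1080"},
    {"hostname": "(not set)", "screen_resolution": "1366x768"},
    {"hostname": "www.example.com", "screen_resolution": "(not set)"},
    {"hostname": "spam-host.example.net", "screen_resolution": "(not set)"},
]

kept = [
    s for s in sessions
    if not is_probable_spam(s["hostname"], s["screen_resolution"])
]
```

Only the first session survives both checks. If your site is legitimately served from several hostnames (a checkout subdomain on another domain, for example), the hostname pattern needs to include all of them, or real traffic will be dropped along with the spam.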
The Last Few Sessions
If you want to mop up the last few visits, rather than maintaining a list of domains, you can use the spammers’ spamminess to your advantage: they have spammy domains, and it’s fairly easy to spot a spammy domain. Because of this, you can use a filter like the following, which shouldn’t need to be updated for each new wave of spammy domains:
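To show the general idea of a keyword-based filter, here is a small Python sketch. The keyword list is an assumption chosen for illustration and is not the pattern from the original filter; the referrer domains are hypothetical. In Google Analytics, a pattern like this would typically go into a custom exclude filter on the referral source.

```python
import re

# Assumption: an illustrative set of "spammy-looking" keywords.
# Tune this against your own referral report before relying on it.
SPAMMY_SOURCE = re.compile(r"seo|traffic|button|free-|best-", re.IGNORECASE)


def looks_spammy(referrer_domain):
    """Return True if the referrer domain contains any spammy keyword."""
    return SPAMMY_SOURCE.search(referrer_domain) is not None


# Hypothetical referrer domains, for illustration only.
referrers = [
    "free-share-buttons.example",
    "best-seo-offer.example",
    "news.example.org",
    "partner-blog.example",
]

flagged = [r for r in referrers if looks_spammy(r)]
clean = [r for r in referrers if not looks_spammy(r)]
```

A keyword pattern trades precision for low maintenance: a legitimate referrer whose domain happens to contain one of these words would be excluded too, so it is worth testing the pattern against your historical referrers before applying it.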
Brian Clifton suggests a more exhaustive version of a filter of this type here, for anyone interested.