
Comparing Siteimprove Analytics data to data from other analytics tools doesn’t always make sense - here's why

By Søren Laumand - Jul 24, 2020 - Web Analytics

One of the most common questions that we receive from our customers is about why the numbers they see in their Google Analytics (GA) account (or other third-party analytics providers) don’t always match the numbers in Siteimprove Analytics. It’s a legitimate question – with many answers.

Below I address and elaborate on six possible explanations for these differences, knowing that there are likely many more causes of the discrepancies between the numbers we provide and the numbers that customers see in GA and other analytics tools.

1. Different technologies and different ways of data processing

Other analytics providers have their scripts and we have ours. Although there are obviously similarities (many are based on JavaScript), the scripts are not identical and therefore do not operate identically. This covers everything from how data is collected when visitors arrive on a page, to how data is routed from the visitor’s browser via endpoints, to how data is stored in the database. We simply have no way of knowing how Google, Adobe, or any other analytics provider does every little thing. As such, data discrepancies can be impossible to troubleshoot, because they might not stem from actual “trouble” but simply from the fact that the technologies aren’t the same.

2. Different settings = different numbers

Metrics and parameters are counted differently from tool to tool and are even customizable. This means that even if two tools ship with the same default settings, the metrics won’t necessarily match once those settings have been changed. If the criteria for determining a specific metric are different, the data will also vary.

An example: In GA and Adobe, admins can customize how long it takes for a visit/session to time out, for example by decreasing it from the default 30 minutes to 10 minutes. This would likely lead to a higher number of sessions/visits when compared to Siteimprove Analytics, where visits always time out after 30 minutes of inactivity.
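To make the timeout effect concrete, here is a minimal sketch that counts sessions for one visitor under two inactivity timeouts. The function name `count_sessions` and the timestamps are made up for illustration; this is not how GA, Adobe, or Siteimprove Analytics actually implements sessionization.

```python
from datetime import datetime, timedelta


def count_sessions(hit_times, timeout_minutes):
    """Count sessions for one visitor.

    A new session starts on the first hit, or whenever the gap since the
    previous hit is longer than the inactivity timeout.
    """
    timeout = timedelta(minutes=timeout_minutes)
    sessions = 0
    previous = None
    for current in sorted(hit_times):
        if previous is None or current - previous > timeout:
            sessions += 1
        previous = current
    return sessions


# Hypothetical page views from a single visitor (gaps of 5, 15 and 25 minutes).
hits = [
    datetime(2020, 7, 24, 9, 0),
    datetime(2020, 7, 24, 9, 5),
    datetime(2020, 7, 24, 9, 20),
    datetime(2020, 7, 24, 9, 45),
]

print(count_sessions(hits, timeout_minutes=30))  # 1
print(count_sessions(hits, timeout_minutes=10))  # 3
```

The same four page views are reported as one session with a 30-minute timeout and as three sessions with a 10-minute timeout.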

Another example: GA closes a session and starts a new one if the same user arrives through a different campaign source before the original session would normally time out (after 30 minutes). This can happen if someone searches for one keyword and clicks a paid link, then goes back, searches for another keyword, and clicks a different paid link.

Siteimprove Analytics does not record a new visit/session if the campaign source changes, which would then lead to a different total number of visits/sessions.
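The campaign-source rule can be sketched the same way. The code below is a simplified illustration under assumed data: the function `count_visits`, the `split_on_source_change` flag, and the source labels are hypothetical and do not come from either tool.

```python
from datetime import datetime, timedelta

TIMEOUT = timedelta(minutes=30)


def count_visits(hits, split_on_source_change):
    """Count visits for one visitor.

    hits: list of (timestamp, campaign_source) tuples, sorted by time.
    split_on_source_change: if True, a change of campaign source starts a
    new visit even when the inactivity timeout has not been reached.
    """
    visits = 0
    previous = None
    for time, source in hits:
        if previous is None:
            new_visit = True
        else:
            prev_time, prev_source = previous
            timed_out = time - prev_time > TIMEOUT
            source_changed = split_on_source_change and source != prev_source
            new_visit = timed_out or source_changed
        if new_visit:
            visits += 1
        previous = (time, source)
    return visits


# Hypothetical visitor who clicks two different paid links within 10 minutes.
hits = [
    (datetime(2020, 7, 24, 10, 0), "paid-link-a"),
    (datetime(2020, 7, 24, 10, 5), "paid-link-a"),
    (datetime(2020, 7, 24, 10, 10), "paid-link-b"),
    (datetime(2020, 7, 24, 10, 20), "paid-link-b"),
]

print(count_visits(hits, split_on_source_change=True))   # 2
print(count_visits(hits, split_on_source_change=False))  # 1
```

Both counts are correct under their own rule, which is why the totals differ between tools.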

These examples demonstrate how different settings can lead to discrepancies in figures between tools. In other cases, the metrics are actually counted in a uniform way between two analytics tools, but there might be system filters applied that leave out a certain IP group, such as internal users, or that exclude users from a specific geographical location. This is not necessarily visible when looking at the data, requiring the user to know whether a system filter has been applied through the settings page. Otherwise you might think there is an issue when comparing a certain metric, such as total visits, across tools that don’t have the same filters applied.

Siteimprove Analytics allows customers to apply system filters on a site-by-site basis. This enables them to, for example, not log data for internal IPs on public-facing websites, while still being able to log data for that IP range on their intranet.
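A site-by-site IP filter could look roughly like the sketch below. The site names, the IP range, and the `EXCLUDED_NETWORKS` structure are assumptions chosen for illustration; they do not reflect the actual Siteimprove Analytics configuration format.

```python
import ipaddress

# Hypothetical filter configuration: internal traffic is excluded on the
# public site but still logged on the intranet.
EXCLUDED_NETWORKS = {
    "www.example.com": [ipaddress.ip_network("10.0.0.0/8")],
    "intranet.example.com": [],
}


def is_logged(site, visitor_ip):
    """Return True if a hit from visitor_ip should be logged for this site."""
    ip = ipaddress.ip_address(visitor_ip)
    return not any(ip in network for network in EXCLUDED_NETWORKS.get(site, []))


hits = [
    ("www.example.com", "10.1.2.3"),       # internal user on the public site
    ("www.example.com", "203.0.113.7"),    # external visitor
    ("intranet.example.com", "10.1.2.3"),  # internal user on the intranet
]

logged = [hit for hit in hits if is_logged(*hit)]
print(len(hits), len(logged))  # 3 2
```

A tool without this filter would report three hits, while a tool with it would report two, even though both count visits in the same way.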

Finally, users might have varying levels of access in different tools, and some tools offer more granular ways of restricting access for individual users. A user with access to all data in one tool, but only partial data in another, would see different numbers in the two tools.

3. Multiple domains on one site

Some tools, like GA, do not show the root domain in their site content section, which means that in cases where multiple subdomains are tracked within the same site, some data might be merged under one page. In comparison, Siteimprove shows the entire URL, and subdomains are therefore not merged with the root domain. For example, site.com and sub.site.com might both be tracked on the same account. In GA they might both appear under the “/” page (the homepage), while in Siteimprove Analytics they will appear individually with the full URL.
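The merging effect is easy to reproduce. The sketch below groups the same hypothetical page views once by path only and once by host plus path; it illustrates the reporting difference and is not taken from either tool.

```python
from collections import Counter
from urllib.parse import urlsplit

# Hypothetical page views collected on one account covering two hostnames.
page_views = [
    "https://site.com/",
    "https://sub.site.com/",
    "https://sub.site.com/",
    "https://site.com/contact",
]

# Report keyed by path only: both homepages collapse into "/".
by_path = Counter(urlsplit(url).path for url in page_views)

# Report keyed by host and path: each homepage keeps its own row.
by_full_url = Counter(
    urlsplit(url).netloc + urlsplit(url).path for url in page_views
)

print(by_path["/"])                   # 3
print(by_full_url["site.com/"])       # 1
print(by_full_url["sub.site.com/"])   # 2
```

The totals agree (four page views in both reports), but the per-page numbers do not, so comparing a single row across tools can be misleading.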

4. Implementation can vary

How the script is implemented on a website can also influence the data collected. A script placed high up in the HTML will normally be fully loaded earlier in the page loading process than a script placed lower in the HTML, especially if the page is heavy and has a long load time. This variation can generate different numbers if a user clicks through to another page before the previous page has fully loaded.

Besides the placement of the script in the HTML, how the script is implemented through a tag manager can also influence the data. Tag managers allow for a lot of added functionality when it comes to firing different scripts at different times – but if the tag manager implementation is not the same for all analytics scripts the data most likely won’t be either.

Different implementation recommendations

Google's recommendation on implementing their script: "The code should be added near the top of the <head> tag and before any other script or CSS tags"

Siteimprove’s recommendation: "We recommend that the script is always added at the bottom of the website in a footer or similar, just before the closing body-tag </body>"

Both scripts use asynchronous downloading, but the script placed in the header is loaded before the one placed in the footer, potentially leading to data discrepancies.
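The effect of script placement can be illustrated with a toy model: a page view is only recorded if the visitor is still on the page when the script has loaded. All timings below are invented for illustration and say nothing about the real load times of either script.

```python
def recorded_page_views(seconds_on_page, script_ready_after):
    """Count page views where the visitor stayed until the script had loaded."""
    return sum(1 for stay in seconds_on_page if stay >= script_ready_after)


# Hypothetical time (in seconds) each of five visitors spent on a heavy page
# before clicking through to the next one.
seconds_on_page = [0.4, 1.2, 3.0, 8.5, 0.9]

# Assumed load points: the script in the head is ready after 0.3 s,
# the script just before </body> after 1.0 s.
head_count = recorded_page_views(seconds_on_page, script_ready_after=0.3)
footer_count = recorded_page_views(seconds_on_page, script_ready_after=1.0)

print(head_count, footer_count)  # 5 3
```

In this model the two quickest visitors are counted by the script in the head but leave before the script in the footer has had a chance to run.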

5. User behavior influences the numbers

Some users install ad-blockers that can block an analytics script from running successfully. If the ad-blocker allows a script from one analytics provider, but not another, it can lead to data discrepancies. It is also possible for a user to whitelist one analytics provider, but not another, again leading to discrepancies between the two.

In addition to ad-blockers, the user’s choice of browser can also influence which scripts are being loaded, as some browsers automatically apply a stricter privacy policy than others.
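As a rough sketch of the ad-blocker effect, assume each visitor's browser blocks a set of script hosts, and a tool only counts visitors whose browser lets its script through. The host names and the `SCRIPT_HOSTS` mapping are placeholders, not the real script domains of any provider.

```python
# Placeholder script hosts for two analytics tools.
SCRIPT_HOSTS = {
    "tool_a": "analytics-a.example",
    "tool_b": "analytics-b.example",
}

# Hypothetical visitors, each represented by the set of hosts their
# ad-blocker or browser privacy settings refuse to load.
visitors = [
    set(),                                           # nothing blocked
    {"analytics-a.example"},                         # only tool A blocked
    {"analytics-a.example", "analytics-b.example"},  # both blocked
    set(),                                           # nothing blocked
]


def counted_visitors(tool, visitors):
    """Count visitors whose browser allows the given tool's script to load."""
    host = SCRIPT_HOSTS[tool]
    return sum(1 for blocked in visitors if host not in blocked)


print(counted_visitors("tool_a", visitors))  # 2
print(counted_visitors("tool_b", visitors))  # 3
```

Four people visited the site, yet neither tool reports four, and the two tools do not agree with each other.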

6. Bots and crawlers might be counted (or not counted) differently

Some analytics tools have a bots and crawlers exclusion list. The lists differ from tool to tool, and some tools might not use an exclusion list at all. Depending on the amount of bot and crawler traffic on your site, these exclusion lists can dramatically affect how much traffic is filtered out, and consequently the numbers presented in the analytics tool.

Siteimprove Analytics provides customers with the option to filter out bots and crawlers based on an industry-standard list from the IAB.
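The sketch below shows how three tools with different exclusion lists would report the same traffic. The patterns are simplified stand-ins and are not the IAB list; "ExampleCrawler" is a made-up user agent.

```python
def is_bot(user_agent, patterns):
    """Return True if the user agent contains any of the exclusion patterns."""
    lowered = user_agent.lower()
    return any(pattern in lowered for pattern in patterns)


def human_hits(user_agents, patterns):
    """Count hits that remain after filtering with the given exclusion list."""
    return sum(1 for ua in user_agents if not is_bot(ua, patterns))


# Four hypothetical hits: one browser, two well-known crawlers, one made-up crawler.
user_agents = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64)",
    "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)",
    "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)",
    "ExampleCrawler/1.0",
]

short_list = ["googlebot", "bingbot"]            # tool with a short list
long_list = ["googlebot", "bingbot", "crawler"]  # tool with a broader list
no_list = []                                     # tool without any exclusion list

print(human_hits(user_agents, short_list))  # 2
print(human_hits(user_agents, long_list))   # 1
print(human_hits(user_agents, no_list))     # 4
```

Only one of the four hits comes from a real visitor, but the reported totals range from one to four depending on the list in use.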

From data collection to analytics action

While analytics tools are all (to a certain degree) similar, and the metrics being shown often appear to be the same, comparing data across tools is not reliable for the reasons mentioned in this article – and many more besides.

The only thing you can say with certainty when the numbers in one tool don’t tally with the numbers from another tool is that the numbers don’t match. Even then, both tools can still be correct, in the sense that they both track data in accordance with the implementations and setups they have in place. The most important factor is that the person analyzing the data understands exactly what data is being analyzed.

Data in itself is worthless. It’s the leveraging of data that provides value. This is the key strength of Siteimprove Analytics. First, we provide an intuitive and approachable display of your data. And second, we help you understand how we collect data and what you see in our tool, empowering you to shift from simply collecting data to gaining powerful user insights and optimizing your website.