dataLayer-based tracking is best practice.
Relying heavily on CSS selectors to scrape the data we need will inevitably fall apart when changes are made to a site. But relying on the dataLayer means that the vital flow of data we manage for the business depends on code managed by Front-End.
I’m not blaming Front-End; they would never intentionally break our setup. Their focus, though, isn’t on an innocuous ‘dataLayer’ object but on whether the user can put something in their basket and buy it. They want to be sure that the website is working. Analytics will always be a lower priority, and when something breaks there, it is much less noticeable. It’s only two days later (though I hope sooner), when you check your dashboard or analytics tool and see a problem, that the alarm bells go off.
After the problem is solved and the damage to the data is assessed, your thoughts will naturally turn to monitoring – how can we prevent this from happening again?
As the saying goes, ‘Trust is good; control is better.’
But how do we monitor this? Can we set up an alert system for when purchases suddenly drop? What if purchases continue but the transaction ID is missing? Or the value is suddenly a string, breaking all of our custom JS variables?
Testing Frameworks
Over the past months I’ve been working extensively with Jest. If you haven’t checked it out before, I would highly recommend it. It’s delightful.
In Jest we define our expectations for a value, run our code, and then compare the result against what we expected. For example, we could state that we expect our dataLayer to be an array. We could also state we expect it to have a length greater than 0.
// Assert dataLayer is an array
expect(Array.isArray(window.dataLayer)).toStrictEqual(
true
);
// Assert dataLayer has length > 0
expect(window.dataLayer.length).toBeGreaterThan(0);
We could go deeper by looking directly at the ecommerce object. We expect the currency code to be present with three uppercase letters, following the three-letter ISO 4217 format.
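// objectContaining allows the purchase event to carry other keys as well
expect(window.dataLayer).toContainEqual(
  expect.objectContaining({
    event: "purchase",
    ecommerce: expect.objectContaining({
      // Anchored so that only exactly three uppercase letters match
      currency: expect.stringMatching(/^[A-Z]{3}$/)
    })
  })
);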
The ecommerce object is very predictable, and there are a lot of standards that need to be followed. For prices we expect a numerical value (though a stringified version of this will also suffice). Whenever we have a value, we also need to have a valid currency. An Item ID or Item Name is required for each entry in the items array, so that we know what was purchased or added to the cart.
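As a rough illustration, the rules above can be written down as small predicate functions in plain JavaScript. The function names (isNumericLike, hasItemIdentifier) are hypothetical and only sketch the checks described here; item_id and item_name are the GA4 item parameter names.

```javascript
// Hypothetical helper: accepts a finite number or a stringified number.
function isNumericLike(value) {
  if (typeof value === 'number') {
    return isFinite(value);
  }
  // Reject empty strings explicitly, because Number('') evaluates to 0.
  return typeof value === 'string' &&
    value.trim() !== '' &&
    isFinite(Number(value));
}

// Hypothetical helper: an item needs at least an item_id or an item_name.
function hasItemIdentifier(item) {
  return !!item && (!!item.item_id || !!item.item_name);
}
```

Both functions only return true or false, so they can be dropped into whatever assertion or alerting mechanism is available.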
Can we build something similar in GTM?
But can we do such a thing in GTM directly? Even better, could we do it in a template so that the logic is easily reusable between containers?
We don’t have access to testing frameworks built for this kind of validation; we can only use vanilla JavaScript (and often only ES5). Templates are even more restrictive, only allowing access to specific APIs.
It turns out we can do this by using a schema structure that mimics what we want to test. Let’s take the currency example from earlier:
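// Example ecommerce object to validate
var ecommerce = {
  currency: "USD"
};
// The schema mirrors the ecommerce object: one validator function per key
var schema = {
  currency: function(value) {
    // String pattern, anchored so only exactly three uppercase letters pass
    return !!value.match('^[A-Z]{3}$');
  }
};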
Unfortunately, RegEx support in templates is still somewhat lacking. We don’t have access to the constructor or literals, but we can pass a string pattern and use the basic .match function. For validating our currency code, this will suffice to check our upper case and three letter criteria.
We can then loop through the schema using Object.keys combined with forEach, passing in the actual value from the ecommerce object as our value and performing the validation.
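A minimal sketch of that loop in plain ES5 JavaScript might look like the following. The validate function, the shape of the failure entries and the transaction_id rule are assumptions made for illustration, not the template's actual code.

```javascript
// Illustrative schema: one validator function per ecommerce key.
var schema = {
  currency: function (value) {
    return typeof value === 'string' && !!value.match('^[A-Z]{3}$');
  },
  // Assumed rule for this sketch: a non-empty string.
  transaction_id: function (value) {
    return typeof value === 'string' && value.length > 0;
  }
};

// Hypothetical function: runs every validator in the schema against the
// matching value in the ecommerce object and collects what failed.
function validate(ecommerce, schema) {
  var failures = [];
  Object.keys(schema).forEach(function (key) {
    var value = ecommerce[key];
    if (!schema[key](value)) {
      failures.push({ parameter: key, received: value });
    }
  });
  return failures;
}

// The lowercase currency fails, the transaction ID passes.
var failures = validate({ currency: 'usd', transaction_id: 'T-1001' }, schema);
```

Because the validators live in the schema, supporting a new parameter only means adding another key; the loop itself stays unchanged. Inside a sandboxed template the same idea applies, but the built-ins have to come from the template APIs rather than the global scope.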
We then just have to define the schema for all of the values. Luckily I’ve already done this and present to you:
The GA4 Ecommerce Validator
The full template is now available in the GTM Template gallery. If you have any questions or find any bugs, then feel free to reach out to me here.
Features
- Activate/de-activate the item parameters that you actually use
- Patented fatFinger Detection™ – detect whitespace that has been added to your string parameters. Because you don’t want half of your products with the category ‘ apparel’ and the other half with ‘apparel’.
- Add your own custom parameters for validation, defining what they should look like: number or string? Whitespace allowed? Is an empty string acceptable?
- Read the ecommerce object from wherever you are using it, whether you read directly from the dataLayer or have one or more Custom JavaScript variables building it out.
- Pushes an event to the dataLayer when values fail validation, stating in which event and what exactly failed (passed as stringified JSON).
- From here you have multiple options: pass it directly to your error logging tool? Pass it to GA4 and then on to BigQuery for analysis?
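To make the whitespace check and the error event more concrete, here is a small sketch in plain JavaScript. The helper names, the event name ecommerce_validation_error and the keys on the pushed object are all assumptions for illustration; the template's real output may look different.

```javascript
// Hypothetical helper: true when a string has leading or trailing whitespace.
function hasStrayWhitespace(value) {
  return typeof value === 'string' && value !== value.trim();
}

// Hypothetical helper: builds the dataLayer message for failed validations.
// The event name and keys are assumptions, not the template's real output.
function buildErrorEvent(eventName, failures) {
  return {
    event: 'ecommerce_validation_error',
    failed_event: eventName,
    validation_errors: JSON.stringify(failures)
  };
}

var dataLayer = []; // stand-in for window.dataLayer in this sketch
var failures = [];

if (hasStrayWhitespace(' apparel')) {
  failures.push({ parameter: 'item_category', received: ' apparel' });
}

// Only push when something actually failed validation.
if (failures.length > 0) {
  dataLayer.push(buildErrorEvent('purchase', failures));
}
```

Stringifying the failures keeps the payload in a single parameter, which is easier to forward to an error logging tool or to GA4.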
Why do it this way?
This validation could also be performed as the data arrives in your data warehouse. If you have the infrastructure to do this, then this is definitely a valid way to tackle the problem:
Snowplow (no affiliation), for example, offers this as part of its schema validation via Iglu.
For GA4 – if you export your data to BigQuery, you could then perform your validation in a tool like Dataform.
However, not everyone has access to this infrastructure and there are other benefits to this approach:
1- Speed – by passing the errors to an error tool with alerting capabilities, you could be warned in almost real time of a dataLayer issue on the site.
2- This approach could also be running on the DEV environment. As your devs test their changes on the success page, a warning message could already be triggered that the dataLayer is about to be modified.
Again – I’m not blaming the Front-End devs, it’s just that missing a day’s worth of purchase data can be critical. It takes a long time to build trust in the data with your stakeholders, and that trust can be destroyed very quickly.
Losing data, even for a single day, can have a significant impact. Protecting our dataLayer is key.
Trust is good. Control is most definitely better.