Using CASE WHEN in Google Data Studio to Supercharge Your Reporting
Thursday, 6th May 2021
Google Data Studio has made creating and sharing SEO reports with clients much quicker and easier than ever before - however, there are ways that you can add even more value to your SEO reports. Enter the CASE WHEN statement. The CASE WHEN statement is a way of creating custom fields that create new data groups, and I'm going to show you exactly how to use this way of grouping information in Google Data Studio to supercharge your SEO reports This post is going to cover;
- What CASE WHEN statements are in Google Data Studio
- How CASE WHEN statements work
- Using REGEXP_MATCH / RegEx in Data Studio
- Real world examples of the CASE WHEN statement used in Data Studio
What does the CASE WHEN statement do?
In short, it can label groups of data in a way that you can specify - you can use these new groups to aggregate statistics together, and analyze or view your data better.
What does it look like?
Here is an example of a CASE WHEN statement - in this example it's being used to group the different social channels in the Google Analytics 'Source' field into one singular group
Source IN( “facebook” , “m.facebook.com” , “l.facebook.com” , “facebook.com” , “lm.facebook.com” ) THEN “Facebook”
Which is essentially saying;
If [specific field] matches ("this", "somethingelse", "orthis" ) THEN label as "GROUP NAME"
You can also group everything else that hasn't been covered by the statement, by including an ELSE statement, like this;
Source IN( “facebook” , “m.facebook.com” , “l.facebook.com” , “facebook.com” , “lm.facebook.com” ) THEN “Facebook”
This groups everything which is not one of those items listed above. It groups those terms with the label "Not Facebook" - or whatever else you'd like to see.You do not have to have an ELSE clause within a CASE WHEN statement, but this can be useful to include as this means you can view the information in your data set that may have been missed by your grouping method.
If you filter it as a table and include 'Other' or whichever group you've named your ELSE statement, you can use this information to tweak the grouping formulas you create to cover a larger range of your data.
The CASE WHEN function really comes into its own when used with REGEXP_MATCH - with this, you can cover all the statements in one fell swoop using Regular Expressions to match the data.
Using RegEx in Google Data Studio
I'm not going into detail on how exactly to use RegEx in this post but here is the Google support article on REGEX within Data Studio. The most common RegEx expression you'll see used here is [.*] which indicates "find this keyword anywhere within this search phrase, page URL etc - beginning, middle or end.
I'm no RegEx pro, so I will caveat to say there may be a more elegant RegEx solution to a field calculation than what I use within these examples.You can include multiple RegEx matches within that section by using the pipe | character to add extra matches. This means OR - so for example ".*seven.*|.*morning.*" will match the word seven or morning anywhere in the URL.
Going back to the social media example above that groups the different Facebook sources into one category, using RegEx means you don't have to specify each Facebook version - instead it would look like this;
REGEXP_MATCH(Source,'.*facebook.*') THEN "Facebook"
ELSE "Not Facebook"
As a note - Google Data Studio does use a slightly modified version of RegEx (RE2 Syntax) so if you use a tool to generate a RegEx string, it may not work 100% of the time.
That sounds great! How do I use it in Google Data Studio?
You'll need to add a Calculated Field into your data source. There are two ways to do this:
1) Select ADD A FIELD in the Data panel on the right hand side
2) Go to Resource in the top menu, and go to Manage Data Sources and select the data source you need, then click Add A Field in the top right, under Explore. Then you can edit the Field name, and add your custom formula in the box as shown below:
Use Cases for CASE WHEN
It's a super flexible function, and can be used in multitudes of ways - in this post I'm going to focus on my favourite uses for CASE WHEN statements that can help supercharge your SEO reporting in Data Studio.
Grouping Page Types in GA or GSC
This example shows how you can group pages of a site that is using Shopify as the CMS.
WHEN REGEXP_MATCH(Landing Page,'(.*blog.*)') THEN "Blog Page"
WHEN REGEXP_MATCH(Landing Page,'(.*collection.*)') THEN "Category Page"
WHEN REGEXP_MATCH(Landing Page,'(.*product.*)') THEN "Product Page"
WHEN REGEXP_MATCH(Landing Page,'(.*page.*)') THEN "General Page"
Why this is useful:
By separating these pages into groups, you can then use this information to see performance by page type, including impression and click through rates to page engagement. This will give you a more accurate view of how various sections of the site are performing, instead of looking at those metrics across the whole site, which often doesn't tell you the full story.
Adding the page groupings as a dimension within a table can enable you to assess engagement metrics from a page group level and compare performance with each other, making it easier to spot areas for improvement.
Calculating Branded Split: Brand vs Non-Brand Keywords
CASE WHEN example for use with Google Search Console data:
WHEN REGEXP_MATCH(Query, ('.*keyword1.*|.*keyword2.*') THEN "Brand"
This is best used on Google Search Console keyword view, segmenting keywords between ones which include the brand name and ones that don't. You can use this to then calculate the branded split. I like to show this in a pie chart to show the percentage of branded vs non-branded searches:
Common problems / things to bear in mind
If your brand name is misspelt a lot, you can often miss some keywords which are actually brand, but are being missed by being too specific in your statement. You may have to tweak the RegEx, or add extra pipes, to cover potential mispellings. It's always worth manually checking against the keyword list - to do this duplicate the chart, change the style to Table and add your new calculated field.
Things to be aware of:
Using the branded split like this does have some issues - you may find that adding "Brand" (GROUP 1) to "Non-Brand" (GROUP 2) does not equal the TOTAL Queries in your data set. This is because the grouping process can be imperfect, and sometimes Google will exclude some search terms from Google Search Console data, such as low volume search terms or ones that may include personally identifiable information, this means that you will see the brand split heavily in favour of brand.
To account for this, you can take the total of the "Brand" GROUP and take it away from the TOTAL in your data set, to get a more accurate measure of the non-brand total.When you are adding keywords for the brand name, you may come across misspellings - so it may be worth using a unique section in the brand name word within the RegEx formula.
How you can use this to supercharge your reporting:Using this method you can showcase the effect of brand awareness campaigns: Are you doing brand related PR activities? You could show the % of branded split over a time, and note when brand led campaigns went out to show the impact of this on brand searches.
Assessing Location Performance in Aggregate
This is super useful when working with clients who have multiple locations - whether that's service areas or stores, you can group keywords into those which are location specific, or specify which keywords are Local vs Generic.- In this example, we use the (i?) RegEx functions to make sure the case is not sensitive, so it picks up capitalisation as well, which can be useful when working with some CMS platforms.
CASE WHEN Example for use with page level GSC data:
WHEN REGEXP_MATCH(Landing Page ,'((?i).*birmingham.*)') THEN "Birmingham"
WHEN REGEXP_MATCH(Landing Page ,'((?i).*liverpool.*)') THEN "Liverpool"
WHEN REGEXP_MATCH(Landing Page ,'((?i).*london.*)') THEN "London"
WHEN REGEXP_MATCH(Landing Page ,'((?i).*manchester.*)') THEN "Manchester"
WHEN REGEXP_MATCH(Landing Page ,'((?i).*nottingham.*)') THEN "Nottingham"
WHEN REGEXP_MATCH(Landing Page ,'((?i).*sheffield.*)') THEN "Sheffield"
WHEN REGEXP_MATCH(Landing Page ,'((?i).*milton.*)') THEN "Milton Keynes"
WHEN REGEXP_MATCH(Landing Page ,'.*-chester.*') THEN "Chester"
You can also use Google Analytics as a source and use the same CASE WHEN statement to view site entrances to landing pages - just remember if you're looking at organic traffic only, to add the Organic segment as a filter. You could also swap 'Landing Page' out to 'Page' for calculating all sessions to page URLs which contain locations.
CASE WHEN Example for grouping location queries in GSC:
WHEN REGEXP_MATCH(Query,"((?i).*birmingham.*)") THEN "Birmingham"
WHEN REGEXP_MATCH(Query,".*( )chester.*|^chester.*") THEN "Chester"
WHEN REGEXP_MATCH(Query,"((?i).*Dublin.*)") THEN "Dublin"
WHEN REGEXP_MATCH(Query,"((?i).*Liverpool.*)") THEN "Liverpool"
WHEN REGEXP_MATCH(Query,"((?i).*London.*)") THEN "London"
WHEN REGEXP_MATCH(Query,"((?i).*Manchester.*)") THEN "Manchester"
WHEN REGEXP_MATCH(Query,"((?i).*Nottingham.*)") THEN "Nottingham"
WHEN REGEXP_MATCH(Query,"((?i).*Sheffield.*)") THEN "Sheffield"
WHEN REGEXP_MATCH(Query,"((?i).*Bristol.*)") THEN "Birmingham"
WHEN REGEXP_MATCH(Query,"((?i).*Derby.*)") THEN "Derby"
This example stumbles on a problem:
I was struggling to separate queries that contained "chester" as it was also grouping queries with the word "manchester" in, so we couldn't get accurate data. To combat this, the regex was tweaked to '(*( )chester.*|^chester.*)' this means anything which includes a space ( ) and the word chester, or if the keyword starts with (^) chester. This makes sure only queries with the specific word chester in them were included.
H/T to @RoxanaStingu for the solution to that problem.
Weekends vs weekdays (via Omi Sido )
This would be useful for analysing traffic for retail or local businesses, such as holiday places and hotels.
When Day of Week Name = "Monday" Then "Weekday"
When Day of Week Name = "Tuesday" Then "Weekday"
When Day of Week Name = "Wednesday" Then "Weekday"
When Day of Week Name = "Thursday" Then "Weekday"
When Day of Week Name = "Friday" Then "Weekday"
When Day of Week Name = "Saturday" Then "Weekend"
When Day of Week Name = "Sunday" Then "Weekend"
Using REGEXP_MATCH, this statement can be simplified to;
WHEN REGEXP_MATCH(Day of Week Name,('Monday|Tuesday|Wednesday|Thursday|Friday') THEN "Weekday"
WHEN REGEXP_MATCH(Day of Week Name,('Saturday|Sunday') THEN "Weekend"
Tips, Tricks & Things to watch out for
Copying data sources
You can't copy calculated fields across to different data sources - if you change the data source, you'll have to re-add the calculated fields each time. You should still be able to copy and paste the CASE WHEN statement itself without any edits though.Once a data source has been removed from Data Studio, again you will have to re-add any calculated fields.
The order matters
The CASE WHEN statements apply in order of appearance - for example, if a keyword would come under two categories, once it comes across the first CASE WHEN statement on the list that it applies to, it will add it to that group and ignore any further groupings further down the line.
You can't have a keyword in multiple groups within one CASE WHEN statement. This means the order that you write the statements is important. Go from the most specific statements first to the broadest last to make sure your groups cover as many keywords or pages as possible. Otherwise, you might end up with a large section of pages marked as 'Other' or keywords missing from categories you might have expected them in.
I recommend using a modifier to identify which fields are calculated, as this can sometimes get confusing if you name it similarly to other fields in a data set. I typically add a * to the front of calculated fields so it appears at the top of the list.
I would also recommend adding something to note what data source the calculated field uses. This means you know at a glance what data the calculated field is connected to, which is really useful when you head back into a report you may have not looked at in a while.
These are just a small selection of examples that use the power of the CASE WHEN statement in Google Data Studio. There are many more ways that you can use this to create calculated fields that can supercharge your reporting, but we hope this post gives you some inspiration on how to use this in your own work.
Have a CASE WHEN example to share? Let us know!
Also, take a look at our 15 Advanced Data Studio Hacks