Campaign (UTM) Parameter Naming Conventions revisited: Cryptic vs. Positional vs. Key-Value Notation
And why having all in one parameter is key
I have accompanied many an organization in designing their Campaign URL Parameter Naming Convention (“UTM parameter convention” for those out there with Google Analytics tunnel vision).
What is the same everywhere:
- No naming convention is perfect.
- Nobody likes to do it.
- You can spend months discussing which attributes should be in it, and which channels should be derived from it and how. And that’s not bad. You should spend some time on this, as it is the foundation of your marketing analysis capabilities and really tedious to change later on!
- It takes somebody with good discipline and a prison camp overseer gene to make sure agencies and co-workers follow the convention.
- Everybody will tell you that this is soo important, few will actually understand the relevance, and many will try to get by with “my_email_spam_campaign_jul21” until the annoying finger-pointer catches them.
- Too many marketing folks are not good at thinking on an abstract level — by that I mean questions like “what characteristics of a campaign should be grouped together / what do we want to be able to easily filter by”.
- There are vast differences in the usefulness of the various types of URL parameter naming conventions!
The last point is important. Campaign naming conventions are not in the realm of “it’s all relative, people just need to follow them”. Not at all. Let’s look at the 3 most common schemes and see why “Key-Value Notation” beats the others in my very biased, but fully objective opinion. We will finish with some more general truths on campaign naming conventions.
Let’s look at the most common campaign naming convention models:
1) Cryptic Notation
Here, your campaign name becomes a cryptic ID, e.g. “utm_campaign=23478efj823948jewf9823j479h”. You then upload/import a “classification” aka “lookup” file that maps this ID to some sensible readable dimensional values in your Analytics tool.
Benefits:
- This model is common among organizations who want to make sure nobody shall be able to read from their campaign parameters that they are targeting you because you like to eat frog legs.
- Another benefit is that the ID can stay the same no matter how often you change the actual name of the campaign.
- It makes URLs and parameters shorter
- You can extend it (derive other attributes from a single ID) easily
Downsides:
- Unreadable for outsiders, but also for your own co-workers!
- Everyone is lost if the lookup data import does not work or takes longer
- Want to create a quick campaign without the “campaign ID generator” tool because it’s down or your agency does not have access to it? No chance.
- Every system that wants to consume campaign data (your CRM, your Tableau, your other Analytics tool, your Data Science’s recommendation algorithm etc.) also needs this lookup data, which in some cases can become gruesomely complicated and delay things mightily.
- Makes debugging tracking issues harder. I am often happy when I can see a raw readable URL query string in the data, e.g. when I want to quickly filter for people coming via newsletter campaigns because we suspect some issue with those.
- Spotting errors is near impossible: “Is that 23478wefj480we8j243wer2 in the Facebook URL correct?” — “Hmm I would need to look that up in our grand lookup sheet first, but that is 200 MB large and freezes my Excel.”
So you can go with this, but need some seriously good and proactive tech infrastructure / people in place to work around these downsides.
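To make the lookup dependency concrete, here is a minimal Python sketch. The attribute names and the `classify` helper are made up for illustration; in a real setup the mapping lives in the classification import of your Analytics tool, not in a dictionary.

```python
# Hypothetical lookup ("classification") table: cryptic ID -> readable attributes.
LOOKUP = {
    "23478efj823948jewf9823j479h": {
        "language": "de",
        "medium": "email",
        "type": "newsletter",
    },
}


def classify(campaign_id, lookup=LOOKUP):
    """Return the readable attributes for an ID, or None if the ID is unknown."""
    return lookup.get(campaign_id)


known = classify("23478efj823948jewf9823j479h")
unknown = classify("23478wefj480we8j243wer2")  # mistyped or not imported yet

print(known)    # {'language': 'de', 'medium': 'email', 'type': 'newsletter'}
print(unknown)  # None
```

Without the lookup table, the second ID tells you nothing at all, and every system that consumes the campaign data needs its own copy of that table.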
2) Positional Notation
This is probably the most common method. Here, you simply line up the values of all campaign attributes (e.g. engine, language, campaign type) with a separator between them, often an underscore. The meaning of each value (the attribute it describes) is then identified via the position.
Examples:
- de_20210607_emarsys_email_nl_special-offer-xy
- psrc_pbch_ch_202011_alonp_wr_na_go_na_na_w07732
The first example has the language in first position (“de”), then comes the date “20210607”, then the engine “emarsys”, then the campaign medium “email”, the campaign type “nl” (newsletter), and then the campaign topic (“special-offer-xy”).
Benefits:
- more readable even without any lookup files, thus none of the downsides of the cryptic notation mentioned above
- easy to understand if you only need 4–5 attributes -> good for starters
- no weird rarely used characters, braces etc.
Downsides:
- Dependency on order. Forget one of the attributes (e.g. the one in position 3) and everything else after that gets interpreted wrongly because it is now out of position.
- Thus, it is more likely to fail with machine integrations. A typical case is a Regular Expression that extracts the value in position 4 into a Data Studio drop-down menu called “Language” because position 4 is always supposed to be the language (apart from those 48 cases where the agency mixed it up with position 6).
- You always need to provide all attributes. If the attribute in position 3 is not needed, you need some “filler” (e.g. “_na_na_” in the second example).
- Which position stands for what again? Would you easily remember which attribute the value in position 5 or 6 of the second example refers to, even if you work with campaign data regularly?
- All this means it is harder to extend this model, because each new attribute makes the campaign name longer, harder to read, and more likely to break.
This is why I don’t recommend this notation. It is less fun for machines, less readable than the key-value notation, and it breaks more often with bigger impact when humans make mistakes (which they never do, I forgot, my mistake).
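The order dependency is easy to demonstrate. The following Python sketch parses the first example purely by position; the field list is my reading of that example, and `parse_positional` is a hypothetical helper, not part of any tool.

```python
# Assumed field order, taken from the first positional example above.
FIELDS = ["language", "date", "engine", "medium", "type", "topic"]


def parse_positional(name, fields=FIELDS, sep="_"):
    """Map each value to a field purely by its position in the name."""
    return dict(zip(fields, name.split(sep)))


good = parse_positional("de_20210607_emarsys_email_nl_special-offer-xy")
# Same campaign, but somebody forgot the engine in position 3.
bad = parse_positional("de_20210607_email_nl_special-offer-xy")

print(good["medium"])  # email
print(bad["engine"])   # email
print(bad["medium"])   # nl
print("topic" in bad)  # False
```

One forgotten value shifts everything after it: the medium lands in the engine field, the campaign type lands in the medium field, and the topic disappears. Nothing raises an error, which is exactly the problem.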
3) Key-Value Notation (aka “Attribute Notation”)
Here, every attribute gets…
- an attribute code, e.g. “la” for “language” or “t” for “Campaign type”
- an attribute value, e.g. “en” for English or “shopping” for Google Shopping
- a separator between attribute and value (usually “-”), e.g. “la-en”
- a separator between each attribute-value pair (e.g. !la-en!t-shopping) or brackets (e.g. (la-en)(t-shopping)) around each pair. Brackets distinguish each pair more visibly. One-character separators (e.g. “!” or “:”) save you one character with each pair. I prefer brackets. You can also use [] or {}, but beware of encoding issues.
Examples:
- Brackets: (ft-hp_camp)(cc-cmot)(pl-hnd)(l-d)(d-190506)(e-fb)(ap-inh)(t-osp)(t2-ppa)(i-brand)(cm-2)(s-fb)(w-hp)
- One-character separator: !cc-didp!b2-c!l-d!e-ta!t-pro!z-new_landingpage_partnerxx_footer
Let’s start with the downsides first.
Downsides
- Harder to type. Easy to miss a ! or a -, so a campaign URL builder tool is even more important (a self-made Google Sheet or a more professional tool should be in place anyway, no matter which notation you use)
- You still need to learn the meaning of the attributes (e.g. “l”, “d”, “t”)
- It is not that cryptic to outsiders (see benefits of cryptic notation)
- Unlike the cryptic notation, the campaign name changes when the Marketing Manager changes it.
Benefits
- Good for machines and readable for humans!
- Unlike in the positional notation, it is clear which value stands for what (e.g. in “l-d”, the “d” follows the attribute code “l” (language), so I know it means language “d”eutsch (German)).
- The order does not matter
- You only need to provide the attributes needed for this particular campaign link. So there is no need to provide all attributes every time. Campaign Managers do not need to think about how to incorporate the “bidding type” attribute into their newsletter campaign names or use an ugly filler like the “_na_” in the earlier example.
- Since not every new attribute means yet another position to add, this system is more extensible. Your attributes can grow more easily.
- Machines can work with it more easily and more robustly. With a simple RegExp extraction rule, you can for example create automatic drop-down menus in Google Data Studio, one per attribute, each auto-populated with all its values. That lets you easily filter an entire dashboard by campaigns with language “German” and type “newsletter” or “lifecycle emails” (thanks to a truly cool client for the screenshots).
- Similarly, if you use Adobe Analytics, you can use a simple RegEx extraction ruleset in the Classification Rule Builder to automatically create entirely free extra dimensions for each attribute. This is hugely powerful as you can do attribute-specific attribution reports, use attribute values as filters, and it also makes creating segments a lot easier.
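To show how little machinery the extraction needs, here is a rough Python sketch that parses both example names from above. The two patterns and the `parse_key_value` helper are my own illustration; the rules you would enter in Data Studio or the Classification Rule Builder follow the same idea but use each tool's own syntax.

```python
import re

# One pattern per notation: attribute code, "-", then the value.
BRACKET_PAIR = re.compile(r"\(([a-z0-9]+)-([^()]*)\)")
BANG_PAIR = re.compile(r"!([a-z0-9]+)-([^!]*)")


def parse_key_value(name):
    """Return {attribute code: value} for a bracket or "!" campaign name."""
    pattern = BRACKET_PAIR if name.startswith("(") else BANG_PAIR
    return dict(pattern.findall(name))


brackets = parse_key_value(
    "(ft-hp_camp)(cc-cmot)(pl-hnd)(l-d)(d-190506)(e-fb)(ap-inh)"
    "(t-osp)(t2-ppa)(i-brand)(cm-2)(s-fb)(w-hp)"
)
bang = parse_key_value(
    "!cc-didp!b2-c!l-d!e-ta!t-pro!z-new_landingpage_partnerxx_footer"
)
reordered = parse_key_value("(d-190506)(l-d)")

print(brackets["l"], brackets["t2"])  # d ppa
print(bang["z"])                      # new_landingpage_partnerxx_footer
print(reordered["l"])                 # d
```

The last call shows the two benefits from the list in one go: the pairs are in a different order and most attributes are missing, yet “l” still resolves to the right value.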
Key-Value Notation everywhere
That’s not all: key-value notation is great not only for campaign names. I also use it for On-Site Campaigns, for example, and a client uses it for naming product lists. Key-value notation is good for anything that needs a common, human-readable key and yearns for systematic grouping/filtering.
More Campaign Naming Convention Tips
Let’s finish off with some more general recommendations when it comes to campaign naming conventions.
Have ONE parameter that contains all necessary info, even if that means a bit of redundancy.
Most of the examples above already employ this principle. “All necessary info” means you have everything you ever need for filtering/grouping on a campaign level. So do not spread that info across parameters just because you learned in your SQL course that you should “normalize” data models. Instead, make sure all you need is in utm_campaign.
Example: Do not put the “Campaign Engine” (“google/bing”) only into utm_source, the medium (“email”) only into utm_medium, and everything else into utm_campaign. Make sure the engine and the medium are also part of utm_campaign. Likewise, your campaign name should also contain the channel that it belongs to (e.g. Google Shopping could be represented as “(cc-gshop)”).
Why? Because…
- being able to work on top of one parameter (one dimension) speeds up a lot of things. E.g. you need to get only a one-dimensional table and have all you need in there.
- For example, Data Studio or other systems have an easier (and faster) time if they can simply pull a one-dimensional table and then create e.g. a RegExp filter on top of it (see examples above)
- It is also easier to store and retrieve the value of one parameter in your CRM as e.g. the “acquisition campaign for this user” instead of having to do that for 3 and then later combine them again to get the full picture. And lookup dimensions aka classifications (dimensions with values automatically derived from a key (=the campaign name)) in Adobe Analytics also depend on having all in one dimension.
- I am not saying you should not use utm_source and utm_medium. Just make sure the info in them is also available in utm_campaign.
- utm_content (or some other parameter that distinguishes on a below-the-campaign level) is still valid as a separate parameter because you also don’t want to make campaign names too granular for every-day reporting. Likewise, utm_term and others.
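As a sketch of what “all in one parameter” looks like when building a link, consider the following Python snippet. The `build_campaign_url` helper, the example.com URL and the attribute codes “e” (engine) and “m” (medium) are assumptions for illustration; only “(cc-gshop)” and “l-d” come from the examples above. It also assumes the base URL has no query string yet.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_campaign_url(base_url, attributes, source, medium):
    """Build a tagged URL whose utm_campaign carries every attribute."""
    campaign = "".join(f"({key}-{value})" for key, value in attributes.items())
    # urlencode percent-encodes the brackets; parse_qs decodes them again.
    query = urlencode({
        "utm_source": source,
        "utm_medium": medium,
        "utm_campaign": campaign,
    })
    return f"{base_url}?{query}"


url = build_campaign_url(
    "https://www.example.com/landing",
    {"cc": "gshop", "e": "google", "m": "cpc", "l": "d"},
    source="google",
    medium="cpc",
)
campaign = parse_qs(urlsplit(url).query)["utm_campaign"][0]

print(campaign)  # (cc-gshop)(e-google)(m-cpc)(l-d)
```

Source and medium are deliberately redundant here: they still go into utm_source and utm_medium, but a system that only stores utm_campaign has the full picture.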
Use UTM Parameters
The common trifecta of source, medium, and campaign is a conceptually weak legacy of GA’s predecessor Urchin, and you can spend your life in futility explaining what a “source” and what a “medium” is. They might as well be called “A” and “B”; the distinction is arbitrary.
However, UTM parameters come with the benefit that many people have used them before, many more think (including Wikipedia until I corrected it some years ago) that UTM parameters are sort of the only way to track campaigns in any form whatsoever (as if UTM were built into browsers) and many marketing tools (e.g. HootSuite, Facebook, some E-Mail Marketing tools) offer automatic support for these parameters.
So do use utm parameters even if Google Analytics is not your primary tool. If you’re an Adobe user, just make utm_campaign your s.campaign. This can in turn make some campaign cost data imports easier, as you can e.g. use the value of utm_campaign also as the campaign identifier in your Marketing tools and thus have a common identifier for cost, clicks and traffic.
Use only simple lower-case characters
This should be a no-brainer, but with mixed upper and lower case, spaces, umlauts or other stuff you are just creating an encoding problem waiting to happen (it may look good at first in your Analytics system, but wait until you have to export it out of the tool). Mixing cases is especially problematic: for some tools (e.g. GA) “Dog” is a different row entry than “dog”, whereas in others (e.g. Adobe) both are treated identically. So go with a-z0-9_ plus the “-” between attribute code and value, and the brackets () or the one-character separator (e.g. “:” or “!”, see above).
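A campaign URL builder can enforce this rule before a link ever goes live. Here is a minimal Python sketch for the bracket variant; the `is_clean` and `normalize` helpers are hypothetical, and allowing “-” is my assumption because the key-value examples use it as the separator.

```python
import re

# Allowed: a-z, 0-9, "_", the brackets and "-" between code and value.
ALLOWED = re.compile(r"[a-z0-9_()\-]+")


def is_clean(name):
    """True if the name uses only the allowed lower-case character set."""
    return ALLOWED.fullmatch(name) is not None


def normalize(name):
    """Lower-case the name and turn whitespace runs into underscores."""
    return re.sub(r"\s+", "_", name.strip().lower())


print(is_clean("(la-en)(t-shopping)"))         # True
print(is_clean("(la-EN)(t-Shopping)"))         # False
print(is_clean("(t-grüße)"))                   # False
print(normalize("(LA-en)(t-Special Offer)"))   # (la-en)(t-special_offer)
```

Note that `normalize` only handles case and spaces. Umlauts and other special characters still fail the `is_clean` check and need a human decision on how to spell them.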
How about you?
What is your schema of choice? I am sure I missed some other naming convention models out there, so make sure I get enlightened! Thank you!