Decision-making with Amplify

TV and Digital

There are a number of different comparisons and decisions people want to make when pre-testing ads.  The truth is that most brands need at least some media support, so 'using nothing' is not an option.  Therefore, common decisions people need to make are:

Step 1:

  • Is this ad/are any of my ads strong enough for good ROI? 
  • Is/are the ad(s) good or great?

Step 2:

  • Which execution(s)/creative route(s) is/are strongest?
  • Is this new ad stronger than the most recent advertising I’ve invested in? Is this new ad stronger than my competitor’s ad?
  • Which iteration of the ad is strongest? (recommended for meaningful iterations, not small ones)

Step 3: 

  • What can I do to make the ad(s) stronger?

Here we focus on steps 1 and 2. Step 3 is where you dive deeper into your analysis, following your initial two steps, to identify the strengths, weaknesses and opportunities for improvement within the time and budget available. It would require a whole guide of its own, but we’ve included some ideas for next steps at the end.

Step 1 is to identify whether (any of) the ad(s) is strong enough. 

There are a few actions to take here:

Look at your Sales and Brand Impact scores and see if the ad/any of the ads are green (i.e., hitting your action standard).

  • Green means they are strong ads and hence likely, all things being equal, to deliver a good return on media spend.

Look at the percentile score and see if it’s nearer the low end or the high end of the green range.

  • If you have an ad with a percentile of over 84, for example, your ad is likely not just good but great, and hence you may want to invest more or run the campaign for longer, evolving your plans! 
  • You should not use these percentile scores to compare between ads, as you won’t be sure how big a gap you need for a meaningful difference between any two ads (for example, an 85 may not be significantly better than a 75). See step 2 for how to check whether an ad is meaningfully better than another. 

Look at key metrics related to your brand’s specific jobs to be done and again, see if your ad/any of the ads achieve green (significantly above average) on that/those key measures.

  • For example, the brand may have strong salience, but little emotional connection.  So the key metric for the ad may be ‘overall emotion’ to see if the ad can start to build that emotional connection. 
  • Or it could be that the advertising has historically not stood out and hence a key objective for this ad is to ensure it is ‘distinctive.’  
  • Or it may be that you need to be sure that the ad is not only effective, but also ‘on strategy’, associating the brand (emotionally or functionally) with the right things.

If one or more of your ads hits green on the key metrics above, look for opportunities to optimize and know you have one you can confidently invest in. Move to step 2 below to identify whether one is truly stronger than another to move forward with.

If your ad/none of your ads hit green on the metrics above but do hit amber, look to understand why people are reacting as they are and whether there are opportunities to optimize. Move to step 2 below to identify which ad or ads are strongest to move forward with.

If your ad/all your ads are red on the metrics above, it is likely that you’ll need to rethink, or use an old ad. There are occasions when, through diagnosis, you identify a single thing you could do differently that would unlock the success of the ad, but this is rare.

Step 2 is where you start choosing between ads.

Step 1 alone does not tell you whether the ‘best’ ad (i.e., the one with the highest score) is actually meaningfully better than any other ad. It may be that you have two or more ads that are green on the key metrics, two or more that are amber, or even one that is amber and one that is green, and you now need to see if one is meaningfully different from/better than the other(s).

For step 1, you compared each ad to a norm. For step 2, you need to change this view in-platform and compare the ads to each other. To do this, select the comparison to be stimulus to stimulus (rather than to the norm). You will then see the ads sig tested against one another.

Here’s how to interpret what you see:

  • Where a measure for a specific ad is significantly above another ad, the color will be bolded to draw attention to this strength.
  • Under the achieved score for that ad (where you would see a norm in step 1) you will see upward-pointing chevrons (^) with letters next to them. These letters tell you which ads (columns) this ad is significantly above on this particular measure. 
  • For sales/brand impact, you will still see the percentile score for each ad, but we are using the absolute sales impact score (from which the percentile is calculated) to inform whether one ad is significantly above another on the summary metric.

In the above example, Ad A is the strongest of the three potential options. It has a greater number of bolded measures, including the overall sales and brand impact score. Ad C is bolded for ‘watched full ad’, showing it is significantly stronger than at least one other ad on this measure. Looking closer, the letters A and B under the score show it is better than both other ads on this measure.

Now that you have your view ready for step 2, here are the actions to take:

Determine whether there are a number of ads/creative routes which are similar to one another in performance on key measures (i.e., not significantly different to one another). 

  • Where there are a couple which are ‘equal’, you have options! They should all be explored to determine which has the biggest opportunity to meet the brand’s jobs to be done and be optimized (i.e., dive deeper into the diagnosis, strengths, weaknesses and the why).

Determine whether a new ad is a meaningful improvement versus previous executions or will result in similar outcomes. 

  • If the new ad is significantly better than the current one you are investing in, the decision is easy: invest in the new one! If the new and previous executions are similar in performance, the choice comes down to whether the new one can be optimized further or whether a new ad is needed because of a change in strategy or wear out in news (note: wear out only really happens with news!).

Determine whether the ad being developed will allow you to get a better bang for buck and achieve a higher share of voice with equal investment to a competitor.

  • Here is where regular competitive testing can be really useful!

Determine whether one iteration of an ad is stronger than another, or whether they are equally strong. 

  • This can help you choose between iterations and learn for the future. Note that only meaningful iterations should be researched, not tiny changes like a different end card. See more guidance here.

Once you have done steps 1 and 2, you have the key ‘first evaluations’: you know whether you have any options on the table, and you know which options to dive deeper on.

Step 3 is then to use all the other many diagnostics available in Amplify to:

  • Finalize your choice of ad(s) that you invest in 
  • Optimize the chosen ad(s)
  • Input learnings around audiences to media planning
  • Learn about how consumers respond to advertising to help with your next campaign development

FAQs

What do the brand/sales impact scores mean?  How can I interpret the number?

  • The sales/brand impact scores are made up of a number of measures (link to what they are). Each of them is transformed onto the same scale before being weighted together according to their coefficients. 
  • When comparing ad to ad, this score is sig tested between ads so you know whether one is meaningfully better than the other (the sig testing uses this absolute score even if it is displayed as a percentile). 
  • When you look at this score as a percentile, you are then seeing where the sales/brand impact score for this ad falls in a distribution of all other ads within your chosen universe.  
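To make the mechanics concrete, here is a minimal, purely illustrative sketch in Python. The measure names, weights and norm scores below are invented for illustration and are not Amplify’s actual components or coefficients; the point is simply that a weighted composite gives the absolute score, and the percentile is where that score sits within the norms distribution.

```python
import numpy as np

# Hypothetical component measures for one ad, already transformed
# onto a common scale (names and values are illustrative only).
measures = {"persuasion": 0.62, "branding": 0.55, "emotion": 0.71}

# Hypothetical weights (coefficients) - not Amplify's real ones.
weights = {"persuasion": 0.5, "branding": 0.3, "emotion": 0.2}

# Weighted composite = the 'absolute' impact score for the ad.
absolute_score = sum(measures[m] * weights[m] for m in measures)

# Percentile = share of ads in the chosen norms universe that this
# ad's absolute score beats (norm scores here are simulated).
rng = np.random.default_rng(0)
norm_scores = rng.normal(loc=0.60, scale=0.08, size=5000)
percentile = (norm_scores < absolute_score).mean() * 100

print(f"absolute score: {absolute_score:.3f}, percentile: {percentile:.0f}")
```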

How big a difference do I need on a percentile to be meaningful?

Percentiles can be a very useful way to understand how well an ad has been received. They have instant meaning and they discriminate between ads (because there are effectively 100 points on the scale). They are particularly useful as a guide for how an individual ad compares to all other ads within your chosen norms universe.

You shouldn’t, however, use percentiles to compare across ads and infer meaningful differences. You often need quite a large difference in percentiles to be meaningful, and how large will differ depending on the distribution of the norm you are using. Where you use a sub-category norm, for example, there will be many ads which are similar to one another in terms of performance, and where this is the case, small differences in actual performance can look much bigger on a percentile because the distribution of scores is narrower.

Where you have multiple ads and want to see which is best, we recommend switching to the ‘stimuli’ comparison view (rather than ad to norm view).  When you look across the ads here, you will not only see the percentiles, but importantly you will see whether an ad is significantly better than another on Sales/brand impact by sig testing the absolute sales/brand impact score that sits behind the percentile.
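As a small illustration of why the significance test sits on the absolute score rather than the percentile, here is a hedged sketch using simulated respondent-level data (this is not Amplify’s methodology, just a generic two-sample t-test): two ads can look far apart on percentiles against a tightly clustered norm, yet fail to differ significantly on the scores behind them.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# Simulated respondent-level impact scores for two ads (illustrative only).
ad_a = rng.normal(loc=0.63, scale=0.15, size=150)
ad_b = rng.normal(loc=0.61, scale=0.15, size=150)

# A narrow norms universe: many ads clustered close together.
norm_scores = rng.normal(loc=0.60, scale=0.02, size=5000)

def percentile(score, norms):
    return (norms < score).mean() * 100

print("Ad A percentile:", round(percentile(ad_a.mean(), norm_scores)))
print("Ad B percentile:", round(percentile(ad_b.mean(), norm_scores)))

# Significance test on the absolute scores, not the percentiles.
t_stat, p_value = stats.ttest_ind(ad_a, ad_b)
print(f"p-value: {p_value:.2f}")  # often above 0.05 despite the percentile gap
```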

Why are small differences in an absolute score sometimes big differences in percentile?

This is due to the shape of the distribution. In the image below, you can see two different norms universes. They share the same average, but the green one has a wider spread of scores and hence a higher standard deviation. 

  • When a distribution of scores is narrow, differences in scores which aren’t meaningful can result in bigger differences in percentile. 
  • When a distribution of scores is wider, small differences in absolute score are less likely to lead to big differences in percentile score.
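As an illustrative sketch of this effect (the means and standard deviations below are invented), consider the same small gap in absolute score mapped onto a narrow and a wide normal norms distribution:

```python
from scipy import stats

# Two hypothetical norms universes sharing the same average (0.60)
# but with different spreads; the standard deviations are illustrative.
norm_mean = 0.60
spreads = {"narrow norm": 0.02, "wide norm": 0.08}

# The same small difference in absolute score between two ads.
score_a, score_b = 0.62, 0.60

for label, sd in spreads.items():
    pct_a = stats.norm.cdf(score_a, loc=norm_mean, scale=sd) * 100
    pct_b = stats.norm.cdf(score_b, loc=norm_mean, scale=sd) * 100
    print(f"{label}: Ad A = {pct_a:.0f}th, Ad B = {pct_b:.0f}th, "
          f"gap = {pct_a - pct_b:.0f} percentile points")
```

Under the narrow norm, the identical 0.02 gap in absolute score shows up as a gap of roughly 34 percentile points; under the wide norm it is only about 10.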