Zappi Amplify Out-of-Home


Solution Summary

Zappi Amplify Out-of-Home (OOH) provides rich diagnostics that unlock winning creative. You’ll get fast, powerful, and detailed consumer reactions, resulting in actionable insights that help you optimize and improve creative effectiveness.

We take a human-centered approach to the creative effectiveness of OOH ads such as billboards, bus stops, and more. This solution is part of an agile, end-to-end ad research system that fuels ad development with better, more actionable consumer insights, early and often, when developing campaigns. The system provides exceptional sales and brand growth predictions and is 60% more predictive of in-market ROI.

Get smarter over time by analyzing across all your research, at all stages of development, quickly and easily to learn what works and what doesn’t, creating a learning loop.

Solution Basics

Respondents are exposed to the ad for 2 seconds, reflecting the short dwell times common for OOH ads.

After the 2-second exposure, we capture unaided brand recall, understanding (instant meaning), and message take-away, reflecting key impacts in the real world.

People are then exposed to the ad a second time and control how long they look at it. We capture System 1 emotional response using emojis, as well as other key ratings and diagnostics such as likes/dislikes and heatmaps, uncovering the why behind their response.

Available for these verticals:

  • CPG & QSR

In-Context Availability: Amplify engages consumers in the same way they see, hear and respond to content in the real world. In reality, 98% of all outdoor advertising receives less than 2 seconds of attention. Zappi uses a 2-second “fast exposure” to measure brand breakthrough, comprehension and message breakthrough, followed by a forced exposure to understand additional creative potential and identify optimization opportunities.

Stimuli: 1-10 ads per test

Evaluation: Monadic

Sample default: 200 respondents, with flexibility up to 800. Increasing the sample size may make the project harder to complete in field.

Norms: Yes, the solution can use norms*.

*Market-wide norms become available when total stimuli researched reaches 20. Your own customer norm per market becomes available when you've researched 20 stimuli in each market.


Getting Started with Amplify Out-of-Home

Research Analysis | Understanding & Interpreting Results


Key Measures

The 3 Rs framework

The reporting output blends System 1 (emoji), System 2, and short-exposure survey data, which ladder into two comprehensive indicators of sales impact and brand impact:

  • Sales Impact Score: The sales impact score measures the potential of the creative to drive short term sales.
  • Brand Impact Score: The brand impact score measures the potential of the creative to build the brand and drive sales into the future.

Sales Impact is then diagnosed using the 3 Rs - Reach, Resonance and Response. Within your Quick report there are summary scores for each, as well as the ability to see which individual measures are driving the summary score up or down. 

  • Reach: The ad’s ability to grab attention, link to the brand, and build distinct memory structures, making the brand more salient.
  • Resonance: The ad’s ability to engage and trigger an emotional response, helping the ad go into memory and come to mind more positively in the future.
  • Response: The ad’s ability to generate a more immediate response and brand reappraisal; more important for smaller brands or news-based ads.

How are Sales and Brand Impact scores calculated?

Sales and Brand Impact are composite scores that measure the potential of the creative to drive short-term sales (Sales Impact) and longer-term brand equity (Brand Impact). They are available as absolute scores, which can be sig tested against the norm (using cross tabs), and are displayed in the platform as percentile scores calculated from the absolute score and the norms scope you have selected.

Sales Impact and Brand Impact include the following measures:


Sales Impact

  • Reach: Unaided brand recall, Brand connection, Ad distinctiveness
  • Resonance: Overall emotion, Understanding
  • Response: Brand appeal/persuasion

Brand Impact

  • Reach: Unaided brand recall, Claimed attention, Uniqueness of brand impressions, Brand distinctiveness, Brand connection
  • Resonance: Overall emotion, Likability, Relevance, Brand meets needs
  • Response: Brand appeal, Category drivers

Research process

  1. Upload 1-10 ads per project.
  2. A default of 200 category-consumer respondents are exposed to each ad for 2 seconds.
  3. Results are provided in the context of the available norm.

Configuration checklist and guide

Image Guidelines

Formats accepted: JPEG, JPG, PNG. (RGB color format only - CMYK not supported)

Resolution: max 800 pixels high, 1366 pixels wide

System limitations: The larger the file, the longer the image takes to display, so respondents with slow devices or internet connections may drop out.


  • Add the following information about your stimuli:
    • Ad name
    • Brand information 
    • Stage of development
    • Target audience profile
    • 2-20 brand/category attributes that reflect category drivers (category entry points)
      • 255 character limit per attribute
    • 1-4 key messages for your stimuli
      • 255 character limit per message
    • Presence of celebrities
    • Tags
      • Tagging your stimuli allows you to categorize your content efficiently and unlocks additional analytic capabilities.
      • Standardized tags - taxonomy defined by your organization.
      • Custom and smart tags.

Step-by-Step Configuration Process Guide


The fast exposure | The basics & FAQs


What is the survey experience for the respondents like?

After a short welcome message letting them know they will be seeing an ad that they would see out and about, the respondent will see the ad for 2 seconds. They will answer some questions about brand and message recall as well as comprehension, then they can look at the ad again for as long as they want before answering the rest of the questions.

What do we ask after the fast exposure?

Following the fast exposure, we ask 3 questions: unaided brand recall, ease of understanding, and unaided key message recall.

What if someone missed the ad?

The respondent is prompted to click next when they are ready to see the ad, and they are given a 3-second countdown, so missing the ad is unlikely. If someone does miss the ad, they are still asked the memorability questions. They are then shown the ad in full again in a ‘forced exposure’ to get more qualitative/System 2 responses to the ad.

What is the Unaided Brand Recall score based on and how do I configure brands and sub brands to ensure my unaided brand recall score is as accurate as possible? 

The percentage for Unaided Brand Recall is based on auto-coding for Parent brand and Sub brand. This means that some ads will only have a Parent brand present but others will have their score based on their Sub brand too. 

  • For auto-coding to pick up your brand as accurately as possible, be sure to also submit alternative brand names and slang terms. The auto-coding accounts for fuzzy matching, such as simple spelling errors, so that those answers still count. Submit as many alternatives as consumers may use (examples for M&Ms: M&Ms, M&M, eminems, M and Ms, M and M, etc.)

A few notes on spelling:

  • Capital Letters: Don Simón vs don Simon = no need to add alternatives (except for Cyrillic languages).
  • Accent: Don simón vs Don simon = Must add the alternative (“ó” is a different character to “o” from a computer's perspective).
  • Spaces: Don Simon vs DonSimon = no need to add alternatives
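
As an illustration only (this is not Zappi's production auto-coder, whose exact matching rules are not documented here), a simple fuzzy match against a list of accepted spellings could look like the sketch below; the function name, threshold, and brand list are all hypothetical:

```python
from difflib import SequenceMatcher

def matches_brand(answer: str, accepted_names: list[str],
                  threshold: float = 0.8) -> bool:
    """Toy fuzzy brand coder: an open-ended answer counts as brand
    recall if it is close enough to any accepted spelling. Note that
    accented characters are distinct characters, which is why
    alternatives like "Simon" vs "Simón" must be listed explicitly."""
    answer = answer.strip().lower()
    return any(
        SequenceMatcher(None, answer, name.lower()).ratio() >= threshold
        for name in accepted_names
    )

names = ["M&Ms", "M&M", "eminems", "M and Ms", "M and M"]
print(matches_brand("M&MS", names))   # case-insensitive exact match
print(matches_brand("m&mms", names))  # simple spelling slip still counts
print(matches_brand("pepsi", names))  # unrelated brand does not match
```

The wider the list of alternatives submitted, the more genuine recall answers the coder can credit.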

Brand recall only shows coded brand recall for my tested brand, but can I see misattribution?

While on the unaided brand recall chart, you can export the raw data from the export tab and code this up manually to get a sense; however, there is currently no quick way to automate this.


Audience and sample

What is our approach to sample? Who are we interviewing? 

We interview a Representative Audience that reflects the real market for your category. Having Profiles (subgroup norm) analysis available means we can check resonance with narrower audiences, while defaulting reporting to category consumers/users ensures that we are both benchmarking consistently and providing a high threshold for creating great advertising.

While other testing approaches allow a user to configure a sample each time and database everything together, Zappi Amplify applies a standard approach to sample, based upon consistent quotas and broad category relevant sampling. This consistent sample means: 

  1. Any category will find its target audience in a dataset and understand a broad consumer reaction.
  2. Any test can be compared, confidently, to any other test at a total population and sub-group level.

How do we ensure the sample composition of each test is comparable?

We apply several different weights to our data to ensure consistency across studies. The weighting reflects the actual makeup of the relevant country so the sample remains consistent.

Sampling:

Data for Amplify OOH is collected with sample targets set for age nested within gender, and socio-economic class (SEC).

For age and gender we ensure that within each age group there is a 50/50 M/F split based upon census data. 

Weighting:

Weighting is applied on four axes: Age nested within gender, socio-economic class, brand usage, and category usage.

The targets for category and brand usage are calculated dynamically based on a norm:

  • Category usage targets are calculated independently and across different customers for each category within a country. A norm is created for each usage frequency response option for all cells in the database.
  • Brand usage targets are also calculated independently, in this case for each brand within a country/category combination. A norm is then created for each usage frequency response option for all cells in the database.

Our sampling and weighting approach means that every comparable project has the exact same weighted fallout of each of our five variables: age, gender, SEC, category usage, and brand usage.
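
For intuition, the core idea can be sketched for a single axis (a simplified cell-weighting illustration only; Zappi's production weighting across all four axes is not specified here and may use a different technique, such as rim weighting):

```python
def cell_weights(achieved_counts: dict[str, int],
                 target_props: dict[str, float]) -> dict[str, float]:
    """Toy single-axis weighting: each cell's weight is its target
    proportion divided by its achieved proportion, so the weighted
    sample matches the target makeup of the market."""
    total = sum(achieved_counts.values())
    return {
        cell: target_props[cell] / (achieved_counts[cell] / total)
        for cell in achieved_counts
    }

# 120 men and 80 women achieved against a 50/50 census target:
# men are weighted down (~0.83), women are weighted up (1.25).
weights = cell_weights({"M": 120, "F": 80}, {"M": 0.5, "F": 0.5})
print(weights)
```

Applying the same targets to every project is what keeps the weighted fallout identical across comparable tests.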


Understanding metrics that matter

Reporting deliverables focus on the “Metrics that matter” to drive short-term “Sales Impact” and long-term “Brand Impact”. Based on performance on these metrics, users are guided to the linked diagnostic sections of the report to better understand how they can improve their ads.


Norms and interpretation

What is a percentile?

A percentile score is a method of ranking that takes the score for an ad and reports where that ad’s results sit within the total distribution of the norm it is being compared to. For example, a score in the 70th percentile means the ad has performed in the top 30% of ad scores.

The norm that is being used is an important part of this calculation, and changing the norm will change the percentile scores.  An ad may be in the 50th percentile when looking at all ads in the market but in the 99th percentile when looking at ads for a specific category. 

At Zappi, we perform a specific type of norms calculation that smooths out the database and removes any skews. We take the mean and the standard deviation of the norm, and use them to create a normative database that follows a normal distribution. Each ad’s performance is then plotted against this distribution. This is known as a cumulative distribution function.
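
As an illustration only, that calculation can be sketched in a few lines of Python (the function name and the norm mean/SD values are made up for the example):

```python
from statistics import NormalDist

def percentile_vs_norm(ad_score: float, norm_mean: float,
                       norm_sd: float) -> int:
    """Plot an ad's absolute score against a normal distribution
    built from the norm's mean and standard deviation, then read
    the cumulative probability off as a percentile."""
    cdf = NormalDist(mu=norm_mean, sigma=norm_sd).cdf(ad_score)
    return round(cdf * 100)

# A score one standard deviation above the norm mean lands around
# the 84th percentile; a score equal to the norm mean lands at the 50th.
print(percentile_vs_norm(70, norm_mean=60, norm_sd=10))  # → 84
print(percentile_vs_norm(60, norm_mean=60, norm_sd=10))  # → 50
```

Changing the norm changes the mean and standard deviation fed into the calculation, which is why the same absolute score can map to different percentiles under different norm scopes.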

Norms 

Definitions for norms:

  1. Country level - Ads can be compared to the norm for the country in which they were tested.
  2. Language - The language that the fieldwork for the ad was done in. This can be different from the language of the ad itself, since you can test an English ad with Spanish-speaking respondents. 

  3. Parent Category - The vertical or industry the brand is from (i.e., a fast food company would go in the restaurant parent category). This norm includes the data from all our other Zappi customers in the category. 

  4. Child Category - This norm encompasses all the ads tested within the narrower category. For example, within beverages the child categories would be carbonated soft drinks, sports drinks, bottled water, etc. The category will have all the ads from our other Zappi customers tested within the category.

  5. Brand - This will use only your own ads as the normative comparison for the norm.

  6. User Defined norms - You can create custom norms based on a number of criteria or on tags you have applied to your ads in the platform.

You can choose to include only ads in your domain in the norm rather than across other customers. This is set up once and becomes your default.


FAQs

Why has my Sales and brand impact percentile or color coding changed?

While your absolute score for Sales and Brand Impact won’t change, how these scores compare to the database will change as the database changes. The database is dynamic, and monthly norms updates will result in changes. If, for example, lots of great ads are researched and added to the database, your ad may achieve a lower percentile (or a different color code) as a result. To see the same percentile score/color code each time, you need to ensure you have selected:

  • Norm from the time of testing (first of the month)
  • The same norms scope each time

Alternatively, Quick reports are static and reflect a snapshot in time from the month that your survey completed. This means the percentiles and color coding will not change on the quick report over time.

Key messages can be unique per project. How is the norm calculated?

The key message norm is the average across key messages asked across all projects. The norm does not take into account the specific message or the order in which the messages were shown in the survey. Since they are all messages that you want to convey, the norm compares how well you got your message across to how other ads got theirs across, regardless of what the message actually is.

What is a profile and how does it differ from a filter?

Profiles represent the different data cuts that you use to look at your data (for example: men only, or ages 18-34 only). Each profile that is created leverages all available data to create a norm for that Profile. A study contributes to a profile if it has a minimum of 30 respondents that match the description. There are two different types of Profiles available:

  1. Universal Profiles - these are profiles based on questions that are asked about ALL ads such as age and gender. This profile uses all the available data of all ads tested to generate a norm for that subsegment of the data.
  2. Client Specific Profiles - these are profiles based on questions that are only asked for a particular client’s ads (rather than for all ads), and in these cases the norm is only based on this group of people within the client’s own projects.

When you filter data and look at a norm, you are comparing the sub group in the filter to a total sample norm.

When you use profiles and look at a norm, you are comparing the sub group in the profile to a norm made up of only that sub group. This is more meaningful because it accounts for the fact that a specific sub group may always be more positive or negative in their responses.

Interpreting data and making decisions

There are a number of different comparisons and decisions people want to make when pre-testing ads. Most brands need at least some media support, so ‘using nothing’ is often not an option. Therefore, common decisions people need to make are:

  1. Is this ad/are any of my ads strong enough for good ROI? 
  2. Is/are the ad(s) good or great?
  3. Which execution(s)/creative route(s) is/are strongest?
  4. Is this new ad stronger than the most recent advertising I’ve invested in? Is this new ad stronger than my competitor’s ad?
  5. Which iteration of the ad is strongest? (recommended for meaningful differences between creative, not small iterations)

Within the platform there are two different analysis routes that enable you to do all of the above, and there is a simple toggle on the right-hand side of the page called ‘significance testing’ so you can switch between them. Toggle to ‘Norm’ for Questions 1 and 2 (comparing to a norm). Toggle to ‘Stimuli’ to sig test between the chosen ads and answer Questions 3-5.


How does the ad to ad comparison work?

The ad to ad comparison sig tests between all the stimuli on each measure. 

  • Where a measure for a specific ad is significantly above another ad, the color will be bolded to draw attention to this strength.
  • In place of showing the norm under the achieved score, there is a letter which denotes which column (ad) this ad is stronger than (if, for example, it says B, the ad in this column is stronger than the ad in column B).
  • For Sales/Brand Impact, you will still see the percentile score for each ad, but we use the absolute Sales/Brand Impact score (from which the percentile is calculated) to determine whether one ad is significantly above another on the summary metric.
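
For intuition, a standard two-proportion z-test is one common way to sig test a percentage measure between two monadic cells (Zappi's exact test statistics are not documented here, so this is a generic sketch rather than the platform's implementation):

```python
from statistics import NormalDist

def prop_sig_above(p1: float, n1: int, p2: float, n2: int,
                   alpha: float = 0.05) -> bool:
    """Generic one-sided two-proportion z-test: is ad 1's percentage
    significantly above ad 2's, given each cell's sample size?"""
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    se = (pooled * (1 - pooled) * (1 / n1 + 1 / n2)) ** 0.5
    z = (p1 - p2) / se
    return z > NormalDist().inv_cdf(1 - alpha)

# 55% vs 45% with 200 respondents per cell is significant at 95%;
# identical scores obviously are not.
print(prop_sig_above(0.55, 200, 0.45, 200))  # → True
print(prop_sig_above(0.50, 200, 0.50, 200))  # → False
```

The same comparison logic is run for every pair of stimuli on every measure, which is what populates the column letters in the report.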

To learn more about making decisions with Amplify, read here.


Interpreting cultural sensitivity

Why measure it

It's crucial to approach the interpretation of sensitive or potentially offensive content with empathy and cultural awareness. This type of feedback is largely subjective, but by actively engaging with the feedback and striving to stay attuned to cultural nuances, you can refine your advertising to better align with the values and expectations of your audience, thus building trust and equity for your brand. Read more in our full best practice advice.


AI-Generated Reports

AI Quick Reports will become available once norms are available.


For Zappi Customers

For additional information please reach out to Zappi Support or your CSM for access.

  • Methodology Guide
  • Demo Report