Research and sample quality
How does Zappi source sample?
At Zappi, we take data quality very seriously. We work with PureSpectrum, our survey technology partner, to access a range of ESOMAR- and ISO 20252-certified panels. Because we do not own a panel of our own, we aren't limited to a single supplier; instead, we vary our sample sources and choose them based on quality and feasibility.
We developed a custom supplier blend to ensure we source sample from the highest-quality suppliers in the marketplace who manage a mix of double opt-in panels and affiliate networks/sources. This approach allows us to maintain control over the quality and consistency of data across projects.
Because we work through a network of panel partners, we rely on those partners to handle all aspects of panel management, including panelist recruitment and validation, incentive management, and any panelist interactions. In return, our partners ensure that all Zappi work is managed according to their panel best practices and complies with their panelists' terms and policies.
Zappi conducts regular research on research to evaluate the stability of the results offered by our suppliers, particularly when entering new markets.
What are Zappi’s primary considerations when sourcing sample?
Some of our typical considerations are:
- Minimum quality requirements for responses we receive, and the continuity and stability of results offered.
- Market availability
- Feasibility
- Targeting capabilities
- Measure profiling depth
- Cost & fees
- Support SLAs
- Project management SLAs (if applicable)
- Fit with Zappi requirements and typical study types
- Technical aptitude (if applicable)
- Technology commitment to the roadmap (if applicable)
How does Zappi ensure high-quality insights?
Our operating model is built on three pillars:
- Survey Design
- Sample Consistency
- Data Quality
We pride ourselves on producing research that has a positive impact, so we build quality into your data from the start.
Survey Design
Survey design and respondent experience are crucial first steps in gathering quality, engaged responses. We follow these survey design best practices:
- Short length: The most engaged responses from our panels come in the first 10-12 minutes of a survey, so our surveys take no longer than 10 minutes.
- Low dropout: We mandate a 20% minimum inclusion rate in all surveys.
- Mobile optimized: All our surveys are mobile-optimized, so respondents can access the survey and have a consistent experience regardless of the device.
We regularly ask respondents for feedback on their experience completing our surveys so we can make improvements. In our most recent research, 68% of respondents gave an enjoyment score of at least 8 out of 10, and 72% of respondents gave an ease-of-use score of at least 8 out of 10.
Sample Consistency
We use a sample blend, categorized by product and country, to make sure we uphold our consistency and quality standards even when we source from multiple suppliers.
We create these blends at the launch of a new product or upon entering a new market. Factors like quality, production capacity, speed, and automation ease guide how blends are created.
Using quota sampling, our automated sample platform and integrated panel partners deliver consistent samples.
To ensure a representative and consistent sample frame, our standard practice is to set quotas by product, including controls for variables such as provider blend, respondent device type, age, and gender. In most cases, there will be a standard audience selection that aligns with the category. For certain solutions, we can collaborate with you to determine quotas and in-survey screening questions that define a custom target audience. We use the same supplier blend for each combination of client, product, and country, so you get consistent results and can confidently compare results across multiple surveys and against multiple benchmarks.
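For illustration, here is a minimal sketch of how quota-controlled sampling works in principle. The quota cells, target counts, and respondent fields below are hypothetical examples, not Zappi's actual configuration.

```python
from collections import Counter

# Target counts per quota cell (age band x gender). Hypothetical values only.
quotas = {
    ("18-34", "female"): 125,
    ("18-34", "male"): 125,
    ("35-54", "female"): 125,
    ("35-54", "male"): 125,
}

filled = Counter()

def accept(respondent: dict) -> bool:
    """Accept a complete only while its quota cell is still open."""
    cell = (respondent["age_band"], respondent["gender"])
    if cell not in quotas or filled[cell] >= quotas[cell]:
        return False  # unknown or already-full cell -> screen out
    filled[cell] += 1
    return True

# A 29-year-old female respondent is accepted while her cell is open.
print(accept({"age_band": "18-34", "gender": "female"}))  # True
```

Once a cell fills, further respondents matching that cell are screened out, which is how quotas keep the final sample balanced across devices, ages, and genders even when several suppliers feed the same survey.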
Data Quality
How is Zappi keeping up with modern data quality threats?
Advancements in LLMs such as ChatGPT, together with a spike in quality issues across the online sample ecosystem, mean that data quality features have never been more important.
Bad actors appear in many forms, from simple screen-recording macros, to human click farms, to more advanced LLM bots. Zappi's data quality features are designed with each of these challenges in mind. There is a difference between a low-engagement but well-intentioned respondent who is taking a survey quickly and a malicious actor who seeks to maximize incentive payouts. Zappi treats each of these challenges differently:
| Issue | How Zappi handles it | Benefit |
| --- | --- | --- |
| Unengaged respondents | Monitor and understand them using the Zappi Quality Score. | Improved sample sourcing and solution design. |
| Bots and bad actors | Preventative measures to detect bot and bad actor activity. | Flagged responses are removed from the data. |
Data Quality measures are present on every single survey we run.
We start by ensuring our partners provide the highest quality sample, then we take it a step further with our own quality measures.
- Sourcing our sample from suppliers that meet ESOMAR and ISO 20252 certification levels.
- Adhering to all regional regulations, including GDPR.
Bot and Bad Actor Detection
- At the survey level
  - CAPTCHA - An established bot detection technique.
  - Geo IP fingerprinting - To ensure the respondent is where they say they are.
  - Response repetition - To check for duplicate responses across multiple respondents.
  - Deduplication and respondent quarantine - To check that each respondent is unique.
  - Speeder analysis - To determine if the respondent is truly engaged or just speeding through.
- In open-ended questions, we detect how engaged individuals are when completing a survey, checking for:
  - Gibberish answers - Random characters.
  - Illogical answers - Responses that have nothing to do with the question being asked.
  - Automatically generated text - Pasted or predictive text can be a flag of bot activity.
  - LLM detection - Checks for known LLM behaviors, such as text characteristic of ChatGPT.
- Research Defender integration - Tracks activity across the sample ecosystem on the supply side.
All respondents deemed poor quality are referred back to our providers with the appropriate behavioral flag.
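To give a feel for the kinds of checks listed above, the sketch below shows a simple speeder check and a naive gibberish heuristic. The thresholds, names, and logic are hypothetical and far simpler than production fraud detection; they are not Zappi's implementation.

```python
import re
from statistics import median

# Hypothetical thresholds -- illustrative only, not Zappi's actual values.
SPEEDER_FRACTION = 0.33   # flag completes much faster than the median duration
VOWELS = set("aeiou")

def is_speeder(duration_s: float, all_durations: list[float]) -> bool:
    """Flag respondents who finish implausibly faster than the median."""
    return duration_s < SPEEDER_FRACTION * median(all_durations)

def looks_like_gibberish(text: str) -> bool:
    """Naive heuristic: mostly long 'words' with no vowels, or no letters at all."""
    words = re.findall(r"[A-Za-z]+", text)
    if not words:
        return True
    vowelless = [w for w in words if len(w) > 4 and not (set(w.lower()) & VOWELS)]
    return len(vowelless) / len(words) > 0.5

print(is_speeder(90, [300, 420, 360, 480]))  # True -> flagged as a speeder
print(looks_like_gibberish("dfhdhndghd"))    # True -> flagged as gibberish
```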
Exclusions
Due to the high-volume nature of online research, the standard recommendation is to set exclusions (sometimes known as “lockouts”) on a project-by-project basis. This will help to narrow the focus to the target audience, reduce bias, and increase accuracy and efficiency.
Exclusions are applied automatically to:
- Cells in a single order
Or
- Multiple orders grouped by:
- The same product.
- The same customer.
- The same audience.
- Projects launched within 5 days of each other.
If you need to apply more complex exclusions because they don’t fall into the rules above, please reach out to our Customer Support team before launching your project or program.
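As a rough sketch of how the automatic exclusion rules above could be expressed, the example below treats two orders as sharing exclusions when product, customer, and audience match and the launch dates fall within five days of each other. The field names and data structure are hypothetical, not Zappi's internal representation.

```python
from datetime import date, timedelta

EXCLUSION_WINDOW = timedelta(days=5)  # projects launched within 5 days of each other

def share_exclusions(a: dict, b: dict) -> bool:
    """Two orders share automatic exclusions when product, customer, and audience
    match and their launch dates fall within the 5-day window."""
    return (
        a["product"] == b["product"]
        and a["customer"] == b["customer"]
        and a["audience"] == b["audience"]
        and abs(a["launch_date"] - b["launch_date"]) <= EXCLUSION_WINDOW
    )

order_1 = {"product": "Ad Test", "customer": "Acme", "audience": "US adults",
           "launch_date": date(2024, 3, 1)}
order_2 = {**order_1, "launch_date": date(2024, 3, 4)}
print(share_exclusions(order_1, order_2))  # True -> respondents locked out across both
```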
Zappi Quality Score
The Quality Score looks at 14 different signals to measure the value we can gain from a respondent. Unlike other quality measures, this score does not prevent respondents from entering our surveys; instead, it monitors respondent quality after the fact. As well as looking at behaviors such as gibberish or straightlining, it looks at the level of engagement and the density of insight available in the verbatim responses we collect.
Check out our page on Quality Score for more information.
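The 14 signals and their weighting are not detailed here, but conceptually the score aggregates respondent-level signals into a single monitoring value. The sketch below shows one hypothetical way such signals could be combined; the signal names, weights, and formula are illustrative assumptions, not the actual Quality Score calculation.

```python
# Hypothetical per-respondent signals, each normalized to 0-1.
signals = {
    "gibberish": 0.0,         # 1.0 = gibberish detected in the verbatims
    "straightlining": 0.2,    # share of grid questions answered in a straight line
    "verbatim_density": 0.8,  # richness of insight in the open-ended answers
}

# Hypothetical weights: negative signals pull the score down, positive ones push it up.
weights = {"gibberish": -0.5, "straightlining": -0.2, "verbatim_density": 0.3}

score = 0.5 + sum(weights[name] * value for name, value in signals.items())
print(round(score, 2))  # 0.7 -- a monitoring score, not a gate that blocks respondents
```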
Respondent Quality Service
In particularly challenging markets, we add an additional layer of human checks on top of our automated checks.
A Zappi consultant manually checks every project following a set of specified criteria and removes poor-quality responses from the dataset. Then, we re-field the study to fill in the gaps. Once we receive the new responses, we quality-check those and repeat as needed. By completing this process, we can ensure the data in these higher-risk markets has been validated so you can make decisions with confidence.
What do we check manually?
In addition to the automated checks, we perform manual checks on collected responses. Open-ended responses are a leading indicator of respondent quality.
- Repetition across questions within a respondent: When a respondent answers with the exact same response across all verbatims, this is often a sign that they are speeding through the survey and should be removed. This could be “none” or “it was OK” or something longer.
- Repetition across respondents: This is where multiple respondents have the exact same answers for a question. In some cases this is not a red flag. Many respondents may write “I liked all of it” or “The actor”, but we should not have many respondents answering with something more complex, or something that has nothing to do with the question being asked. This is a strong indicator of bots or fraud.
- Answering the wrong question or nonsensical answers: Here we are looking for responses that have nothing to do with the question being asked. An example of this would be responses such as "yes" and "it was very good" to brand recall, a measure that asks respondents to type in the brands they remember seeing ads for. This is also a strong indicator of bots or fraud.
- Gibberish answers: This can be anything from multiple words that do not form a sentence to "dfhdhndghd". This is a sign of a poor-quality respondent. Note that "No response" or "I don't know" or "I can't answer this question" are all perfectly legitimate responses.
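To illustrate the cross-respondent repetition check described above, the sketch below flags identical, unusually specific verbatims shared by several respondents while ignoring short, generic answers. The threshold and the example answers are fabricated for illustration; this is not the reviewers' actual tooling.

```python
from collections import Counter

# Verbatim answers to one open-ended question, keyed by respondent ID (fabricated examples).
answers = {
    "r1": "I liked the music and the surprise ending with the dog",
    "r2": "I liked the music and the surprise ending with the dog",
    "r3": "The actor",
    "r4": "The actor",
}

MIN_WORDS = 5  # short, generic answers such as "The actor" are not treated as red flags

counts = Counter(text.strip().lower() for text in answers.values())
suspicious = [text for text, n in counts.items()
              if n > 1 and len(text.split()) >= MIN_WORDS]
print(suspicious)  # only the identical, unusually specific answer is flagged
```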