Translating top-box proportions between different scales
"What scale should I use?" is one of the age-old questions in market research. It invites many viewpoints and has no single right answer. This is why our team of research experts at Zappi has reframed the question as "In what ways does it matter what scale I use?" We have made great strides in learning how respondents interact with different scale types, and in bridging the gap between them by identifying ways to translate between scales with similar properties but a different number of options.
Interpreting top-box proportions
One such instance is the interpretation of top-box proportions. Top-boxes on scalar measures are a great way to size the group of people who express an opinion or hold an attitude that you would like your creative or concept to influence - for example, the share of respondents who report themselves most likely to purchase your new product idea. Having this information for any project, along with how it compares to a relevant norm, will give you important insights into the quality of your stimulus.
Most Zappi products use the 11-point scale, which ranges from 0 as the leftmost negative option up to 10 as the rightmost positive option. Other frequently used scales in survey research are the 5-point scale (from 1 to 5), and the 7-point scale (from 1 to 7).
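As a minimal sketch of what a top-box proportion means in practice, the snippet below counts the share of responses that fall in the top options of a scale. The function name `top_box_share` and the sample answers are hypothetical and purely illustrative; they are not taken from any Zappi product.

```python
def top_box_share(responses, scale_max, n_boxes=1):
    """Share of responses falling in the top `n_boxes` options of a scale."""
    if not responses:
        raise ValueError("responses must not be empty")
    # Lowest score that still counts as part of the top-n-box.
    threshold = scale_max - n_boxes + 1
    hits = sum(1 for score in responses if score >= threshold)
    return hits / len(responses)


# Hypothetical answers on an 11-point scale (0 to 10).
answers_11pt = [10, 9, 9, 8, 7, 7, 6, 5, 3, 0]

top_box = top_box_share(answers_11pt, scale_max=10, n_boxes=1)    # only the 10s
top_2_box = top_box_share(answers_11pt, scale_max=10, n_boxes=2)  # 9s and 10s
print(f"top-box: {top_box:.0%}, top-2-box: {top_2_box:.0%}")
```

The same function works for a 5-point or 7-point scale by passing `scale_max=5` or `scale_max=7`.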
Top-box converter: how to compare scales
We ran a series of controlled side-by-sides in order to analyze how the results of those scales compare to each other. Everything in the studies we ran was kept equal - the sample composition and fieldwork, the question wording, the tested stimuli - other than the number of options on the numerical scale which they used. Then, we analyzed the studies and asked if proportions that were logically similar would give similar scores.
For example, if 40% of respondents chose the top-box in the 5-point scale as the most positive possible answer, we wondered how this top 40% would be distributed if we asked the same question on the 11-point scale - would they map exactly to the top options on the scale, or would they fall between the margins? Having asked comparable samples, we could make this comparison directly between the studies - we observed a great deal of consistency between the proportion scores for top-box in the 5-point scale and top-2-box in the 11-point scale for any measure.
See the comparison of the two scales on the chart below:
A significance test of proportions across all of our observations (at the 95% confidence level) showed that the values for those top-box proportions were not significantly different any more often than you would expect as a result of random error. This held true when repeated on common age and gender groups, giving us great confidence that the result is not due to chance. Therefore we can conclude that the top-box in a 5-point scale and the top-2-box in an 11-point scale are equivalent proportions.
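For readers who want to run a similar check on their own data, here is a minimal sketch of one common significance test of proportions: a two-sided two-proportion z-test with a pooled standard error. The article does not say which exact test was used, so this choice is an assumption, and the counts in the example are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(hits_a, n_a, hits_b, n_b):
    """Two-sided z-test for the difference between two independent proportions.

    Assumption: pooled standard error, normal approximation.
    Returns (z statistic, p-value).
    """
    p_a = hits_a / n_a
    p_b = hits_b / n_b
    pooled = (hits_a + hits_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        # Both samples are all hits or all misses: no observable difference.
        return 0.0, 1.0
    z = (p_a - p_b) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical cells: top-box on a 5-point scale vs top-2-box on an 11-point scale.
z, p = two_proportion_z_test(hits_a=160, n_a=400, hits_b=152, n_b=400)
print(f"z = {z:.2f}, p = {p:.3f}, significantly different at 95%: {p < 0.05}")
```

When many such comparisons are run at the 95% level, roughly 5% of them would be expected to come out significant by chance alone, which is the benchmark the paragraph above refers to.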
The same analysis was repeated on all possible combinations of proportions, and we identified a set of proportions that were mathematically equivalent, logically consistent, and analytically useful.
Original proportion in 5-point scale | Equivalent proportion in 11-point scale |
Top-box | Top-2-box |
Top-2-box | The average of top-4-box and top-5-box |
Top-3-box | Top-6-box |
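The table above can be applied to 11-point data along the following lines. This is a sketch under stated assumptions: the function names and the response counts are hypothetical, and the top-2-box row is read as the simple mean of the top-4-box and top-5-box proportions.

```python
def top_n_share(counts, n):
    """Share of responses in the top `n` options; `counts` is ordered low to high."""
    return sum(counts[-n:]) / sum(counts)


def five_point_equivalents(counts_11pt):
    """Project 5-point top-box proportions from 11-point response counts."""
    if len(counts_11pt) != 11:
        raise ValueError("expected one count per option on the 0-10 scale")
    return {
        "top-box": top_n_share(counts_11pt, 2),
        # Assumption: mean of the top-4-box and top-5-box proportions.
        "top-2-box": (top_n_share(counts_11pt, 4) + top_n_share(counts_11pt, 5)) / 2,
        "top-3-box": top_n_share(counts_11pt, 6),
    }


# Hypothetical number of responses for each score from 0 to 10 (200 in total).
counts = [10, 5, 5, 10, 10, 20, 20, 30, 40, 30, 20]

for box, share in five_point_equivalents(counts).items():
    print(f"5-point {box}: {share:.1%}")
```

With these made-up counts, the projected 5-point proportions come out at 25% for top-box, 65% for top-2-box and 80% for top-3-box.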
This points toward a fundamental shared quality of scales with a different number of options - they measure the same sentiment, but with a different degree of nuance. If we study how this nuance is expressed, we can find ways to translate between the scales.
The same analysis as above was repeated in order to identify the link between the 7-point and the 11-point scales, and the results were equally promising. Below are the equivalent proportions we identified for those two scales:
Original proportion in 7-point scale | Equivalent proportion in 11-point scale |
Top-box | Top-box |
Top-2-box | Top-3-box |
Top-3-box | Top-5-box |
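The 7-point table lends itself to a simple lookup. The sketch below transcribes the three rows above into a dictionary and uses it to read the equivalent proportion off raw 11-point scores; the names `SEVEN_TO_ELEVEN` and `equivalent_share` and the sample scores are hypothetical.

```python
# Top-n-box on the 7-point scale -> top-n-box on the 11-point scale,
# transcribed from the table above.
SEVEN_TO_ELEVEN = {1: 1, 2: 3, 3: 5}


def equivalent_share(scores_11pt, top_n_7pt):
    """Share of 11-point scores (0-10) in the box equivalent to a 7-point top-n-box."""
    if top_n_7pt not in SEVEN_TO_ELEVEN:
        raise ValueError(f"no equivalent listed for 7-point top-{top_n_7pt}-box")
    if not scores_11pt:
        raise ValueError("scores_11pt must not be empty")
    n_boxes = SEVEN_TO_ELEVEN[top_n_7pt]
    threshold = 10 - n_boxes + 1  # lowest score inside the 11-point top-n-box
    return sum(1 for s in scores_11pt if s >= threshold) / len(scores_11pt)


# Hypothetical 11-point answers.
scores = [10, 10, 9, 8, 8, 7, 6, 5, 4, 2]

for n in (1, 2, 3):
    print(f"7-point top-{n}-box is approximated by {equivalent_share(scores, n):.0%}")
```

Proportions that are not listed in the table, such as a 7-point top-4-box, are rejected rather than guessed.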
We validated these results by repeating this experiment several times with a completely distinct set of tests - using a variety of stimuli types, and in several markets and languages. In total, we collected 7,200 responses across tests in the US, UK, and Japan. The success of this approach in Japan was particularly encouraging, since it is a country where survey responses are known to tend towards the negative end of a scale.
An important thing to note here is that this applies only to scales that run from the negative option on the left to the positive option on the right. Inverted scales have different features, and comparing a left-to-right scale with a right-to-left scale may make the translation less reliable. It is, of course, best practice to compare only questions whose wording and end labels are identical or very similar.
The ability to project a top-box proportion from one scale to another makes it even easier for our customers to adopt Zappi’s expert-led solutions. Based on this extensive research, you can have confidence that as long as your preferred top-box proportion has an equivalent in the scale used by Zappi, the data you receive will be reliable and comparable to your previous body of research. This is just one of the possibilities that we have discovered for bridging gaps between data from different scale types - we will continue to carry out more research and share results.
If you have any specific queries related to transposing between scales - including for mean scores - get in touch with us.