Getting started with product recommendations testing and personalization
Selecting the right primary audiences, test locations, algorithms, and optimization opportunities.
Personalization is a big word, and teams often feel overwhelmed when tasked with deciding where to get started given the many different ways there are to tailor the digital customer experience.
However, beyond being one of the most straightforward types of personalization, product recommendations can also generate some of the most meaningful results for companies, which is why so many choose to employ them right out of the gate (and well into their personalization journey). For example, product recommendations can be implemented across the site (on the homepage, category and product pages, and within search and checkout) to improve discovery and upselling/cross-selling opportunities.
So how can teams put this foundational element of personalization into practice and start seeing the payoffs in conversions, average order value, and revenue?
It all begins with process
In a nutshell, teams will ultimately accomplish the following after implementing the steps outlined in this post:
- Prioritizing product recommendation tests based on impact
- Testing algorithms to determine the best one per audience
- Fine-tuning performance by testing other experience elements
Now, let’s dive into the work that goes behind it.
Step 1. Determine the audience for targeting
Since we know we want to launch a product recommendations campaign, our next natural step is to choose the audience we want to target the experience to.
Unsure of how to prioritize the right one?
Using the Primary Audiences Framework, teams can identify 3-4 primary audiences that cover 100% of the site’s traffic and are based on a single segmentation principle to ensure accurate and consistent targeting as well as scalability in both execution and learning.
An increasingly popular segmentation principle, given how consumer behavior naturally evolves over time, is to break audiences down according to low, medium, and high purchase intent.
Here are some characteristics and observations we might associate with each audience (a simple classification sketch follows the list):
- Low Intent:
- New to the brand or site
- Unfamiliar with site’s product selection
- Limited time on-site and minimal browsing data
- Has not added any items to cart
- Medium Intent:
- Some familiarity with the company
- Might be deciding between brands or products
- May have signed up for email or loyalty membership
- Has added an item to cart but not yet purchased
- High Intent:
- Has a good sense of what they want
- Understands why they should shop with the company
- Either hasn’t found the product yet or needs a final push
- Lots of behavioral data from site activity
- Further along the funnel
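To make the principle concrete, here's a minimal rule-based sketch of how sessions might be bucketed into these three audiences. The `UserSession` fields and thresholds are illustrative assumptions, not any particular platform's API:

```python
# Illustrative sketch only: field names and thresholds are assumptions,
# not a real personalization platform's data model.
from dataclasses import dataclass


@dataclass
class UserSession:
    is_returning: bool     # has visited the brand or site before
    pages_viewed: int      # breadth of browsing data this session
    has_cart_items: bool   # has added an item to cart but not purchased
    is_subscriber: bool    # signed up for email or loyalty membership


def classify_intent(s: UserSession) -> str:
    """Bucket a session into low / medium / high purchase intent."""
    if s.is_returning and s.pages_viewed >= 10:
        return "high"    # lots of behavioral data, further along the funnel
    if s.has_cart_items or s.is_subscriber:
        return "medium"  # some familiarity, weighing brands or products
    return "low"         # new to the brand, minimal browsing data


print(classify_intent(UserSession(True, 12, True, False)))  # -> "high"
```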
To keep things simple, we’ll focus on the “High Intent” audience and take you through the process of hypothesizing, setting up, and running an experiment for that audience of users.
Step 2. Choose an area on the site for implementation
Did you know that recommendations deployed on product detail pages (PDPs) drive the majority of direct revenue from these campaigns when measured against a company’s larger personalization program?
In fact, 60-70% or more on average.
It makes sense: areas of the site like PDPs are either highly trafficked (gaining a lot of exposure) or fall lower in the funnel, where product recommendations are well-suited given that users at that stage in the buying process are typically more ready to convert.
So we’ll start there to maximize impact early on in our experimentation with product recommendations, making it easier to show value and unlock internal resources and support for additional tests.
Step 3. Choose the algorithms to test
Recommendation strategies provide the logic behind the item selection within your widget based on a particular algorithm (e.g. the most popular items, items that the user viewed in the past, etc.). And much like there are lots of different types of personalized experiences a brand can deliver, there are also a ton of recommendation strategies to experiment with.
Let’s try to whittle our options down by hypothesizing what we believe would work best for each audience overall and why. A good way to build a hypothesis is to use the following statement and plug in the missing details:
IF we [proposed change to the site or digital experience] FOR [audience], THEN we will [impact on KPI] BECAUSE [reason behind why we think the outcome will be positive].
Using some of the initial observations we already made in our audience selection process, how might our “High Intent” audience pair nicely with different recommendation algorithms?
| Audience | Site Location | Algorithms to Test |
| --- | --- | --- |
| High Intent | Product Detail Page | Similarity, Affinity |
The above options represent some basic, initial algorithms to try on the PDP – winners can vary greatly according to the brand, user base, and audience, which is exactly why we test to begin with.
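For intuition about the logic behind these two strategies, here's a toy sketch: “Similarity” ranks catalog items closest to the one currently in view, while “Affinity” ranks them against a profile of the user's accumulated preferences. The attribute vectors and profile below are made-up illustrations; production recommendation engines use far richer signals:

```python
# Toy contrast of the two strategies; catalog vectors and the affinity
# profile are invented for illustration.
import math


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


# Items described by simple attribute vectors: [price_tier, sporty, formal]
catalog = {
    "running-shoe-a": [1, 1.0, 0.0],
    "running-shoe-b": [2, 0.9, 0.1],
    "dress-shoe":     [3, 0.0, 1.0],
}


def similarity_recs(viewed_item, k=2):
    """'Similarity': rank items closest to the one currently in view."""
    target = catalog[viewed_item]
    scored = [(cosine(target, vec), item)
              for item, vec in catalog.items() if item != viewed_item]
    return [item for _, item in sorted(scored, reverse=True)[:k]]


def affinity_recs(user_profile, k=2):
    """'Affinity': rank items against the user's accumulated preferences."""
    scored = [(cosine(user_profile, vec), item) for item, vec in catalog.items()]
    return [item for _, item in sorted(scored, reverse=True)[:k]]


print(similarity_recs("running-shoe-a"))
print(affinity_recs([2, 0.8, 0.2]))  # a user who skews sporty, mid price
```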
In general, we see product recommendations as evergreen elements on the site, so it’s important to set the right KPIs to understand impact over the long term. While each product recommendation test will be different depending on which page the widget is being placed on (and even how the widget is designed, e.g. whether it includes an add-to-cart CTA), a good rule of thumb is to look at the next logical step in the funnel.
For example, for the homepage that would be click-through rate (CTR) or pageviews, whereas product views or add-to-cart would make sense for a product listing page (PLP), and revenue per user for the cart itself. Since we’re focused on the product detail page, the appropriate KPI in our case would be add-to-cart or purchases per user.
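As a quick reference, that rule of thumb might be captured in a simple mapping (the placement names are illustrative):

```python
# Illustrative mapping of widget placement to its "next step in the
# funnel" KPI, per the rule of thumb above.
PRIMARY_KPI_BY_PLACEMENT = {
    "homepage":        "click-through rate (CTR) / pageviews",
    "product_listing": "product views / add-to-cart",
    "product_detail":  "add-to-cart / purchases per user",
    "cart":            "revenue per user",
}
```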
Now, let’s break down our hypothesis for the “Similarity” strategy, which we expect to pair well with the “High Intent” audience, and see how it resonates:
IF we show similar products on the product detail page FOR high-intent users, THEN we will see an increase in purchases per user BECAUSE we’re showing them items that may be a more suitable version of the product they are looking at.
However, we always recommend testing at least two recommendation strategies against one another (as well as a control) in case our initial hypothesis doesn’t end up holding true.
Therefore, our second hypothesis for the “Affinity” strategy (also well-suited for this audience) might look something like this:
IF we show affinity-based products on the product detail page FOR high-intent users, THEN we will see an increase in purchases per user BECAUSE we’re delivering recommendations drawn from their behavior across the site that are more personalized to their preferences.
On to testing those theories.
Step 4. Collect data for analysis
After launching any experiment, it’s important to let the test run for a minimum of two weeks, both to give it a chance to reach statistical significance and to reduce any potential effects from seasonality.
Beyond statistical significance, we also recommend gauging a particular variation’s Probability to Be Best (P2BB). This indicates its long-term probability to outperform all other live variations, which in our case includes the recommendations being delivered with a “Similarity” strategy vs. the “Affinity” strategy and then vs. our control.
Standard practice holds that if the P2BB of the leading variation goes above 95%, it can be declared a winner. However, 80% is typically a healthy enough threshold to act on – especially for websites with less traffic that may require more time to accumulate results.
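For a sense of how a P2BB figure can be computed for a conversion-style KPI, here's a minimal sketch using Beta posteriors and Monte Carlo sampling. Vendor implementations differ, and the conversion counts below are hypothetical:

```python
# Minimal P2BB sketch: sample each variation's conversion rate from its
# Beta posterior and count how often each one comes out on top.
import random


def p2bb(variations, draws=100_000):
    """variations: {name: (conversions, visitors)} -> {name: P2BB}"""
    wins = {name: 0 for name in variations}
    for _ in range(draws):
        samples = {
            name: random.betavariate(1 + conv, 1 + vis - conv)
            for name, (conv, vis) in variations.items()
        }
        wins[max(samples, key=samples.get)] += 1
    return {name: count / draws for name, count in wins.items()}


# Hypothetical counts for the three live variations in our test.
print(p2bb({
    "control":    (120, 4000),
    "similarity": (135, 4000),
    "affinity":   (160, 4000),
}))
```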
For our purposes, let’s say the “Affinity” algorithm achieved a P2BB of 80%. We’d then declare it the winner and kill off the losing algorithm, making notes on why our “High Intent” audience preferred it over the “Similarity” strategy.
In this case, we’ll conclude that, for this audience, personalized recommendations based on an individual’s previously expressed affinities were more powerful than showing items similar to the one already in view. And with these insights, we’ll go on to test another algorithm against our winning “Affinity” strategy.
Our next test could be a completely different algorithm, such as “Viewed Together,” or modifications could be made to the existing experiment. For example, adding dynamic inclusion/exclusion filtering rules such as “only show products above $25” or “only show items within the same category.”
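Such filtering rules are conceptually just predicates applied to a strategy's output before it reaches the widget. A minimal sketch, assuming hypothetical `Item` fields:

```python
# Sketch of dynamic inclusion/exclusion rules layered on a strategy's
# output; the Item fields are illustrative assumptions.
from dataclasses import dataclass


@dataclass
class Item:
    sku: str
    price: float
    category: str


def apply_filters(recs, min_price=25.0, category=None):
    """Keep items above a price floor and, optionally, within one category."""
    return [
        item for item in recs
        if item.price > min_price and (category is None or item.category == category)
    ]


recs = [Item("a", 19.99, "shoes"), Item("b", 49.99, "shoes"), Item("c", 30.0, "bags")]
print(apply_filters(recs, min_price=25.0, category="shoes"))  # keeps only "b"
```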
This process can be continued indefinitely to further maximize results, but it’s always good to set a benchmark that, once hit, leaves room for additional improvements to be made to the experience, like the messaging element of the recommendation widget.
Step 5. Fine-tuning by testing other experience elements
It might seem trivial given that the title of a recommendation widget usually consists of only a few words, but it’s absolutely critical to give audiences context for why a particular selection of products is being shown.
Using our winning “Affinity” algorithm, let’s put together a series of message options to test and roll out the best-performing variation (a simple declaration of this test follows the list):
- Control – Recommended for You
- Variation A – Just for You
- Variation B – Tailored to You
- Variation C – Chosen for You
- Variation D – Based on Your Views
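Declaring the test itself can be as simple as mapping variation names to title copy; the structure below is illustrative, not a specific platform's API:

```python
# Hypothetical declaration of the widget-title copy test.
WIDGET_TITLE_TEST = {
    "control":     "Recommended for You",
    "variation_a": "Just for You",
    "variation_b": "Tailored to You",
    "variation_c": "Chosen for You",
    "variation_d": "Based on Your Views",
}
# The same P2BB analysis from Step 4 can then pick the winning title.
```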
Use CopywriteML to generate more alternative copy for your text and calls-to-action with AI.
Beyond messaging, teams can also experiment with the overall look and feel of the recommendation widget as well as its placement on the page.
From zero to personalization with recommendations
The only way to overcome the paradox of choice associated with personalization is to simply get started. And while there is no right answer when it comes to where a company’s journey begins, there are higher impact efforts teams can prioritize right out of the gate to generate meaningful quick wins that will lay the foundation for the rest of their program.
Hopefully the process above will allow you to get there just a little easier!