Selling fragrance online is like selling music on compact discs.
Imagine you can’t download it.
There are genres, notes, and bands or brands, but you can’t say whether you like something before you experience
it, maybe several times. But you know that your friend has recommended a lot of great stuff to you before because
they tried it all, not because of the notes or octaves or instrumentals or vocals, right? And they recommended
different genres, too.
That’s where the recommender comes into play: a friend.
Users range from newbies to connoisseurs and most of them have to figure out how to pick a fragrance online.
They rely on an algorithm to narrow down the options, but they want to make the final choice themselves.
How do I pick the next fragrance online?
I want personalized recommendations.
Increase LTV by providing better recommendations.
Scope of work + outcomes
Learn how users pick the next product online: 20 interviews.
Product card redesign
Add to queue metric performed ~50% better.
We needed more product ratings to train the new model; redesigning the Orders history section tripled the number of ratings in the first month.
We've built data pipelines to track each recommender's performance from the moment the product was displayed to the moment it was received and rated.
The new recommender algorithm
The new algorithm was implemented and tested against the previous one in an A/B test.
Pick the right algorithm
The previous recommender was based only on product properties (TF-IDF). For example, if you like banana, it will
recommend more banana-based items.
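A minimal sketch of how a property-based recommender of this kind can work, assuming each product is described by a list of notes. The catalog, product names, and function names are made up for illustration and are not the production implementation.

```python
import math
from collections import Counter

# Hypothetical catalog: product name -> list of notes (illustrative data only).
CATALOG = {
    "Banana Split": ["banana", "vanilla", "cream"],
    "Tropical Dusk": ["banana", "coconut", "cream"],
    "Cedar Smoke": ["cedar", "smoke", "leather"],
    "Vanilla Sky": ["vanilla", "musk", "amber"],
}


def tfidf_vectors(catalog):
    """Return {product: {note: tf-idf weight}} for a small catalog."""
    n_docs = len(catalog)
    # Document frequency: in how many products each note appears.
    doc_freq = Counter(note for notes in catalog.values() for note in set(notes))
    vectors = {}
    for name, notes in catalog.items():
        counts = Counter(notes)
        vectors[name] = {
            note: (count / len(notes)) * math.log(n_docs / doc_freq[note])
            for note, count in counts.items()
        }
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(key, 0.0) for key, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def similar_products(liked, catalog, top_n=2):
    """Rank the other products by note similarity to the liked one."""
    vectors = tfidf_vectors(catalog)
    scores = {
        name: cosine(vectors[liked], vec)
        for name, vec in vectors.items()
        if name != liked
    }
    return sorted(scores, key=scores.get, reverse=True)[:top_n]


print(similar_products("Banana Split", CATALOG))
```

With this toy catalog, the product that shares banana and cream with "Banana Split" ranks first, which is exactly the "more of the same" behavior described above.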
My goal was to provide recommendations based on other users' real-life experience, to give people a wider
variety of recommendations. For example, if you like Tesla EVs, you are likely to be interested in SpaceX as well,
even though their properties are different. The same goes for fragrance.
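Recommending from other users' experience can be sketched as item-based collaborative filtering over positive ratings. This is an illustration only: the ratings, the 4-star "liked" threshold, and all function names are assumptions, and the text does not say which collaborative filtering variant was actually used.

```python
import math
from collections import defaultdict

# Hypothetical ratings: user -> {product: rating on a 1-5 scale}.
RATINGS = {
    "ann": {"Banana Split": 5, "Cedar Smoke": 5, "Vanilla Sky": 1},
    "bob": {"Banana Split": 4, "Cedar Smoke": 5},
    "cara": {"Banana Split": 5, "Cedar Smoke": 4, "Vanilla Sky": 2},
    "eve": {"Vanilla Sky": 5, "Tropical Dusk": 4},
    "dan": {"Banana Split": 5},
}


def liked_by(ratings, threshold=4):
    """Map each product to the set of users who rated it at or above threshold."""
    fans = defaultdict(set)
    for user, rated in ratings.items():
        for item, score in rated.items():
            if score >= threshold:
                fans[item].add(user)
    return fans


def similarity(fans_a, fans_b):
    """Cosine similarity between two products, based only on shared fans."""
    if not fans_a or not fans_b:
        return 0.0
    return len(fans_a & fans_b) / math.sqrt(len(fans_a) * len(fans_b))


def recommend(user, ratings, top_n=3, threshold=4):
    """Suggest unrated products liked by people who like what the user likes."""
    fans = liked_by(ratings, threshold)
    rated = ratings.get(user, {})
    liked = [item for item, score in rated.items() if score >= threshold]
    scores = defaultdict(float)
    for candidate, candidate_fans in fans.items():
        if candidate in rated:
            continue  # never recommend something the user already rated
        for item in liked:
            scores[candidate] += similarity(fans[item], candidate_fans)
    ranked = sorted(
        (c for c in scores if scores[c] > 0), key=scores.get, reverse=True
    )
    return ranked[:top_n]


print(recommend("dan", RATINGS))
```

Note that no product properties appear anywhere in the code: "dan" gets "Cedar Smoke" purely because other fans of "Banana Split" liked it, even though the two share no notes.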
Natural language processing
Another option was to use natural language processing methods to extract topics from user-generated product
reviews. I built a prototype. After analyzing 400,000+ comments, I found that most of
them were vague ("LOOOOOOVE IT!!!"-kind of vague) and the remainder was not enough to extract representative (useful)
information. At that moment we decided to postpone this path, although scraping other sources was really an
The ultimate idea was to combine all of them, but it was too much for an MVP.
Lack of data on products outside Scentbird's portfolio (~500 items vs. 65,000+ out in the world)
The original algorithm was based on product properties.
Zero data on how existing recommenders affected user retention.
Vague user-generated product reviews forced us to postpone natural language processing experiments
(recommendations based on topic extractions).
Hypothesis: recommendation quality affects user retention
Core design stages
Build data pipelines to start tracking the influence of recommendations on retention
Gather product ratings to train the collaborative filtering model (recommendations based on
the real-life experience from other users)
Test and implement collaborative filtering algorithm
There were a lot of recommenders sprinkled all over the UI. Depending on the context, they were mostly smart
database queries based on similar notes, brand names, collections, search suggestions, the fragrance of the
There was a hypothesis that some of them caused more harm than good.
To measure the efficiency of each recommender and its influence on retention, we decided to build data
pipelines, gather the data, and analyze it to make informed decisions in the future.
Product tracking stages
Exposure: location (recommender id) and frequency
Added to the Queue
Collecting this data takes a long time, since fragrances ship one item per month (at most three
items per month on the highest plan).
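The tracking described above can be sketched as a per-recommender funnel. This is a rough illustration under stated assumptions: the event tuple layout, the stage names ("exposed", "queued", "received", "rated"), and the recommender ids are hypothetical, chosen to follow a product from display to rating.

```python
from collections import Counter, defaultdict

# Assumed stage names, in funnel order.
STAGES = ("exposed", "queued", "received", "rated")

# Hypothetical event log: (user_id, product_id, recommender_id, stage).
EVENTS = [
    ("u1", "p1", "similar_notes", "exposed"),
    ("u1", "p1", "similar_notes", "exposed"),
    ("u1", "p1", "similar_notes", "queued"),
    ("u1", "p1", "similar_notes", "received"),
    ("u1", "p1", "similar_notes", "rated"),
    ("u2", "p1", "similar_notes", "exposed"),
    ("u2", "p2", "same_brand", "exposed"),
    ("u2", "p2", "same_brand", "queued"),
    ("u3", "p2", "same_brand", "exposed"),
]


def funnel_by_recommender(events):
    """Count distinct (user, product) pairs that reached each stage."""
    reached = defaultdict(lambda: {stage: set() for stage in STAGES})
    for user, product, recommender, stage in events:
        reached[recommender][stage].add((user, product))
    return {
        recommender: {stage: len(pairs) for stage, pairs in stages.items()}
        for recommender, stages in reached.items()
    }


def exposure_frequency(events):
    """How many times each (user, product, recommender) was displayed."""
    return Counter(
        (user, product, recommender)
        for user, product, recommender, stage in events
        if stage == "exposed"
    )


def conversion(funnel, recommender, start="exposed", end="queued"):
    """Share of pairs at the start stage that also reached the end stage."""
    started = funnel[recommender][start]
    return funnel[recommender][end] / started if started else 0.0


funnel = funnel_by_recommender(EVENTS)
print(funnel["similar_notes"], conversion(funnel, "similar_notes"))
```

Keeping the recommender id on every event is what makes it possible to compare recommenders later, even though the "received" and "rated" stages arrive weeks after the exposure.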
Get 3× more ratings to improve the new model
We added a simple rating mechanism to the Orders history section. The number of ratings tripled
instantly, thanks to the existing user base.
Product card redesign
I learned from power users how they pick fragrances: they read a lot, digging into notes and reviews, searching
for more information on other sites. To ease their process and to show more options to newbies, we altered a
(Small product cards remained for backward compatibility.)
Personalization and future algorithm improvements
We added like/dislike options to the recommendations feed to gather people's reactions as an initial measurement
for any of our recommenders. These options are also a great tool to personalize future recommendations.
Dislike options were implemented first to learn what confuses users. Sometimes dislikes don't mean that the
recommendation is wrong; it's just that the user has already tried the fragrance and doesn't want to see it in
the feed again.
The previous quiz was based on product properties.
This time we mapped positive ratings to quiz answers, and from that moment we could also map quiz answers to
other users' experiences. That is, new users could take a quiz and get recommendations based on similar responses
and on products that other users rated positively.
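One way to picture the quiz mapping is to weight each existing user's positively rated products by how closely their quiz answers match the new user's. The sketch below assumes Jaccard overlap as the similarity measure; the quiz questions, answers, data, and function names are all hypothetical.

```python
from collections import Counter

# Hypothetical quiz answers: user -> set of (question, answer) pairs.
QUIZ = {
    "ann": {("mood", "fresh"), ("season", "summer")},
    "bob": {("mood", "fresh"), ("season", "winter")},
    "cara": {("mood", "warm"), ("season", "winter")},
}

# Hypothetical positively rated products per user.
POSITIVE = {
    "ann": {"Tropical Dusk", "Banana Split"},
    "bob": {"Tropical Dusk", "Cedar Smoke"},
    "cara": {"Cedar Smoke", "Vanilla Sky"},
}


def jaccard(a, b):
    """Overlap between two answer sets: |intersection| / |union|."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend_from_quiz(answers, quiz, positive, top_n=3):
    """Score products by the answer similarity of the users who liked them."""
    scores = Counter()
    for user, their_answers in quiz.items():
        weight = jaccard(answers, their_answers)
        if weight == 0:
            continue  # no shared answers, so this user's taste is ignored
        for product in positive.get(user, ()):
            scores[product] += weight
    return [product for product, _ in scores.most_common(top_n)]


new_user = {("mood", "fresh"), ("season", "summer")}
print(recommend_from_quiz(new_user, QUIZ, POSITIVE))
```

A new user with no ratings still gets suggestions here, because the quiz answers stand in for the rating history they do not have yet.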
Building services like recommendations goes far beyond the UI.
Some important features were not delivered because we lacked the necessary data (or it was broken and
inconsistent). So, data is the key.
Even though it takes time to test the results at scale, we could test an algorithm right in the office,
just by trying the recommended fragrances. It doesn't give you statistically significant results, but you can
feel a bit more confident launching the experiment.
As with everything at Scentbird, the project was decomposed into smaller tasks that were implemented
gradually, which made it a bit harder to track changes.
It was very pleasing to see positive reactions from users and colleagues to the new product presentation.
People wanted to explore more products, which was the initial goal.
There is a correlation between negative product ratings and churn rate, although it takes time to gather enough
data to give a definitive answer. We implemented all the tools to track the necessary data and improved the UI.
The new algorithm will continue to use data to provide different recommendations — ones that real users found useful.
Collaborative filtering, being automated and feature-independent, scales better for recommender services.