
Amazon's Recommendation Engine Secret Sauce - Issue #4

By Mario Gavira • Issue #4
Hi from tropical Barcelona,
You buy a shiny new digital camera on Amazon and, in the blink of an eye, the site recommends a memory card that perfectly fits your purchase.
How does the magic happen? Does Amazon's engine have any weaknesses? Can the recommendation algorithms be improved?
Let’s uncover the secret. 

Amazon’s share of US retail e-commerce is expected to increase from 44% last year to half of total e-commerce sales in 2018. Amazon owns the world's richest dataset on how consumers consume and how sellers sell. This allows the retail giant to continuously optimize its online shopping experience through data signals like purchases, searches, and reviews in an ongoing data feedback loop, creating a classic network-effect flywheel.
Source: https://www.cbinsights.com/research/report/amazon-strategy-teardown/
At the heart of this ecosystem lies Amazon's recommendation algorithm, which creates a personalized shopping experience and increases the revenue generated from each customer. According to a McKinsey report, 35% of all Amazon’s transactions come from algorithmic product recommendations.
Given its scale, Amazon has developed its own recommendation engine capable of handling tens of millions of customers and products in near real time.
Although the proprietary recommendation algorithm is secret, Amazon’s patent application “Personalized recommendations of items represented within a database” provides a fascinating insight into the ins and outs of the engine. You can dive into the 14,040-word document here or discover the key learnings in this piece.
As Brent Smith, Amazon Sr Manager in Personalization, poetically stated:
Recommendations and personalization live in the sea of data we all create as we move through the world, including what we find, what we discover, and what we love.
More specifically, Amazon's recommendation system is based on a number of data signals collected throughout the shopping experience: what a user has bought in the past, which products they place in their shopping cart, items they’ve rated and liked, and what other customers have viewed and purchased.
Two approaches to recommendations have become the mainstream solutions in e-commerce:
Source: http://datameetsmedia.com/an-overview-of-recommendation-systems
Collaborative filtering: looking at the user-product interaction by finding customers with a similar transaction history and recommending the top products bought by those similar buyers to the shopper under study.
Content-based filtering: looking at the product and not the customer, by simply recommending the items most similar to the product viewed by the user.
Amazon combined the best of both approaches to create its homegrown algorithm, defined as “item-to-item collaborative filtering”.
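For the curious, here is a minimal Python sketch of the item-to-item idea. It is not Amazon's actual implementation: the customers, the products and the choice of cosine similarity over co-purchases are all assumptions made purely for illustration.

```python
from collections import defaultdict
from itertools import combinations
from math import sqrt

# Hypothetical purchase histories: customer -> set of items bought.
purchases = {
    "alice": {"camera", "memory_card", "tripod"},
    "bob": {"camera", "memory_card"},
    "carol": {"camera", "camera_bag"},
    "dave": {"tripod", "camera_bag"},
}

def build_similar_items_table(purchases):
    """Score item pairs by cosine similarity of their sets of buyers."""
    buyers = defaultdict(set)  # item -> customers who bought it
    for customer, items in purchases.items():
        for item in items:
            buyers[item].add(customer)

    co_counts = defaultdict(int)  # (item_a, item_b) -> number of shared buyers
    for items in purchases.values():
        for a, b in combinations(sorted(items), 2):
            co_counts[(a, b)] += 1

    table = defaultdict(dict)  # item -> {other item: similarity}
    for (a, b), shared in co_counts.items():
        score = shared / sqrt(len(buyers[a]) * len(buyers[b]))
        table[a][b] = score
        table[b][a] = score
    return table

similar_items = build_similar_items_table(purchases)
ranked = sorted(similar_items["camera"].items(), key=lambda kv: kv[1], reverse=True)
print(ranked[0][0])  # the item most often bought together with the camera
```

In this toy dataset the memory card comes out on top for the camera, because two of the three camera buyers also bought one.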
This flowchart illustrates how the list of recommendations is built. It begins by looking at the items that are associated with the user and builds the recommendation table by computing how similar each of them is to other items in the collection.
Source: "Personalized recommendations of items represented within a database" Patent
To determine how relevant the recommended items are, the algorithm looks at customer ratings for each product and filters out items that have already been bought by the user.
Source: "Personalized recommendations of items represented within a database" Patent
The beauty of the system is that most of the computation is done offline. Once the recommendation table is built (38) it is injected into the engine.
Source: "Personalized recommendations of items represented within a database" Patent
This allows recommendations to be displayed almost in real time.
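To make the offline/online split tangible, here is a small Python sketch of what the online step could look like once a similarity table exists. The table contents, the scores and the function name `recommend` are hypothetical, and the ratings-based relevance check described in the patent is left out for brevity.

```python
from collections import defaultdict

# Hypothetical table computed offline: item -> {similar item: similarity score}.
SIMILAR_ITEMS = {
    "camera": {"memory_card": 0.82, "tripod": 0.41, "camera_bag": 0.41},
    "tripod": {"camera": 0.41, "camera_bag": 0.50, "memory_card": 0.50},
}

def recommend(user_items, table, top_n=3):
    """Online step: add up the scores of items similar to what the user owns."""
    scores = defaultdict(float)
    for owned in user_items:
        for candidate, score in table.get(owned, {}).items():
            if candidate not in user_items:  # filter out items already bought
                scores[candidate] += score
    # Highest score first; ties are broken alphabetically for stable output.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:top_n]]

print(recommend({"camera", "tripod"}, SIMILAR_ITEMS))
```

The online work is just a handful of dictionary lookups and additions per shopper, which is why the expensive similarity computation can stay offline.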
Amazon uses neural networks for its engine. To handle hundreds of millions of customers and products in real time, it created the so-called DSSTNE, the “Deep Scalable Sparse Tensor Neural Engine”.
This system trains neural networks and powers the different personalized experiences for millions of customer journeys.
Amazon opened its machine learning power to external users as part of its Amazon Web Services (AWS) product. The architecture of its real-time recommendation engine for third parties is detailed in this diagram.
Source: https://aws.amazon.com/blogs/big-data/analyze-data-in-amazon-dynamodb-using-amazon-sagemaker-for-real-time-prediction/
Lost in translation?  Let’s recap
The key advantages of Amazon’s item-based collaborative recommendation algorithm are:
  1. the recommendations are highly relevant
  2. they are computed in real time
  3. the algorithm scales to hundreds of millions of users and tens of millions of items without sampling or other techniques that reduce the quality of the recommendations
  4. it updates immediately on new information about a shopper’s interests
  5. this feedback loop allows Amazon to constantly improve and tweak the algorithmic models.
Can we conclude that Amazon has reached its maximum potential in its recommendation capabilities?
I would argue that the system is far from perfect and that the tech giant has only scratched the surface of what AI can offer in terms of recommendation intelligence.
Let me tell you why.
Timing is everything
Modelling time correctly in the recommendation algorithm is both an art and a science.
Amazon.com’s catalog is continually changing. Thousands of new items arrive and disappear on a daily basis,  especially in categories such as seasonal clothing fashions and consumer electronics. 
The so-called cold-start problem means that new arrivals can be at a disadvantage because they don’t yet have enough data to show a strong correlation with other products. This requires an explore/exploit process to give new items an opportunity to be shown.
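One common way to implement an explore/exploit process is an epsilon-greedy rule. The Python sketch below is only an illustration under my own assumptions: the article does not say which method Amazon uses, and the item names, click numbers and the `pick_item` helper are made up.

```python
import random

def pick_item(stats, epsilon, rng):
    """stats maps item -> (clicks, impressions); returns the item to show next."""
    items = sorted(stats)
    if rng.random() < epsilon:
        # Explore: show any item at random, so new arrivals get a chance.
        return rng.choice(items)

    def click_rate(item):
        clicks, impressions = stats[item]
        return clicks / impressions if impressions else 0.0

    # Exploit: show the item with the best observed click-through rate.
    return max(items, key=click_rate)

# Hypothetical numbers: the new arrival has no data yet.
stats = {"bestseller": (120, 1000), "steady_seller": (40, 1000), "new_arrival": (0, 0)}
rng = random.Random(42)
shown = [pick_item(stats, epsilon=0.1, rng=rng) for _ in range(1000)]
print({item: shown.count(item) for item in stats})
```

With `epsilon=0.1`, roughly one impression in ten is spent on exploration, so the new arrival is shown occasionally and can start to accumulate data of its own.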
The recommendation engine also faces the cold-start problem for new visitors, about whose interests and behaviour it has no information. It is critical to collect data on these first-time users: which referral sites they come from, what ads they are attracted to, what categories they browse, what items are added to the shopping cart and which ones are abandoned. Processing this browsing behaviour on the fly to generate relevant recommendations is key to converting first-time visitors into customers.
Even for loyal customers the algorithm needs to factor in critical timing elements:  
  • older purchases become less relevant to the customer's current interests. The speed of the decline differs between products indicating a durable long-term interest (e.g. a bike helmet) and items that fulfill a short-term need (e.g. light bulbs)
  • some purchases will trigger a change in recommendations over a longer period of time (e.g. from baby diapers to children's games)
  • for daily-use products (FMCG) such as toiletries or packaged food, recommendations can be scheduled at regular intervals based on purchasing patterns
  • time-limited external events can massively influence buyers' behaviour and need to be factored into the recommendation engine.
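The first point, older purchases fading at different speeds, can be sketched with a simple exponential decay. This is a hedged illustration only: the half-life values and the `purchase_weight` function are assumptions of mine, not figures from Amazon.

```python
from datetime import date

# Hypothetical half-lives in days: how long until a purchase counts half as much.
HALF_LIFE_DAYS = {"bike_helmet": 365, "light_bulb": 30}

def purchase_weight(category, purchased_on, today, default_half_life=90):
    """Weight of a past purchase as a signal of current interest (1.0 = bought today)."""
    age_days = (today - purchased_on).days
    half_life = HALF_LIFE_DAYS.get(category, default_half_life)
    return 0.5 ** (age_days / half_life)

today = date(2018, 7, 1)
bought = date(2018, 6, 1)  # 30 days earlier
print(round(purchase_weight("bike_helmet", bought, today), 3))
print(round(purchase_weight("light_bulb", bought, today), 3))
```

After the same 30 days, the helmet purchase still carries most of its weight while the light bulb purchase has lost half of it.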
Let's get personal
Let me share a real-life personal example to illustrate a use case with an external event. In mid-June I bought a German national flag… you guessed right, we supported Germany in the FIFA World Cup.
At the beginning of July, a week after Germany was (to our deep regret) eliminated from the tournament, I received an Amazon email highlighting a German flag identical to the one I had bought a few weeks earlier:
Amazon email "recommending" the same flag again when Germany was out
Amazon's state-of-the-art recommendation engine suggesting the same flag I had already bought, and on top of that with Germany already out of the tournament?
What went wrong? I can’t say for sure, but let me share my theory…
Finding the right match
Amazon must have a unique ID in its back office for each product from each seller, but in many cases several sellers will offer nearly identical products with different IDs. As a result, the engine is unable to match the recommended item with the purchased one.
This problem is likely to grow over time as the number of Amazon sellers worldwide, and of overlapping catalogues, keeps growing exponentially.
I dare say that an AI powerhouse like Amazon is capable of tackling this problem. Combine some image recognition programs with text mining algorithms and it should be fairly straightforward to match different product IDs when images and text descriptions overlap, and to filter the item out of the recommendation table.
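Here is a minimal Python sketch of the text-matching half of that idea, using Jaccard similarity on the words of product titles. The titles, the 0.6 threshold and the function names are hypothetical, and the image recognition part is not covered.

```python
import re

def tokens(title):
    """Lowercase word and number tokens of a product title."""
    return set(re.findall(r"[a-z0-9]+", title.lower()))

def jaccard(title_a, title_b):
    """Share of tokens the two titles have in common (0.0 to 1.0)."""
    a, b = tokens(title_a), tokens(title_b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

def drop_near_duplicates(candidates, purchased_titles, threshold=0.6):
    """Keep only candidates that do not look like something already bought."""
    return [
        c for c in candidates
        if all(jaccard(c, p) < threshold for p in purchased_titles)
    ]

purchased = ["Germany National Flag 90 x 150 cm"]
candidates = [
    "National Flag Germany 150 x 90 cm with Eyelets",  # other seller, same product
    "Football Fan Scarf Germany",
]
print(drop_near_duplicates(candidates, purchased))
```

The second seller's flag shares seven of nine distinct words with my purchase and is filtered out, while the scarf stays in the list.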
The world moves too fast
The second problem in my personal use case is more complex. With Germany’s exit from the tournament, customers' desire to buy its national flag should evidently drop dramatically. Amazon’s algorithm clearly did not factor this external event into the recommendation process.
How can the system be improved to avoid this type of misplaced recommendation?
External events do influence consumer behaviour. World Cup results impacting the purchase of football fan equipment might be an extreme case, but other events such as fashion trends, marketing campaigns or even economic or political changes can influence buyers' behaviour and shopping habits.
Trying to factor in all these elements by adding external data signals to the recommendation algorithm seems not only a titanic effort but also a potentially biased approach: how do we prioritise external data sources and where do we set the limits?
Certain recurring events, where a clear correlation is identified and a robust data source is available, might justify being integrated into the algorithm. An example would be local weather conditions and their impact on apparel sales.
But overall, the most effective systemic solution to this problem will be to continuously refresh the weight of items in the algorithm based on the aggregated navigation and buying behaviour of other users, without needing to understand which external factors trigger the changes in purchase behaviour.
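As a rough Python sketch of what "refreshing the weight of items" could mean, the snippet below scales each item's recommendation score by how its recent sales compare with the previous period. Everything here is an assumption for illustration: the weekly window, the clipping bounds, the sales figures and the function names.

```python
def trend_multiplier(recent_sales, previous_sales, floor=0.1, cap=2.0):
    """Ratio of recent to previous demand, smoothed (+1) and clipped to [floor, cap]."""
    ratio = (recent_sales + 1) / (previous_sales + 1)
    return max(floor, min(cap, ratio))

def refresh_scores(base_scores, sales_by_week):
    """sales_by_week maps item -> (units sold the week before, units sold last week)."""
    refreshed = {}
    for item, score in base_scores.items():
        previous, recent = sales_by_week.get(item, (0, 0))
        refreshed[item] = score * trend_multiplier(recent, previous)
    return refreshed

# Hypothetical figures: flag sales collapse after the team is eliminated.
base_scores = {"germany_flag": 0.9, "memory_card": 0.6}
sales_by_week = {"germany_flag": (499, 24), "memory_card": (99, 99)}
print(refresh_scores(base_scores, sales_by_week))
```

The flag starts with the higher score but ends up well below the memory card, and the system never needs to know that a football match was the cause.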
Back to my personal example: either Amazon did not identify a drop in searches, clicks and sales for German Flags after the country dropped out of the tournament, or the algorithm is not sophisticated enough to factor in the drop in my recommended item list.
What will Amazon’s future recommendation engine look like?
A recommendation engine boils down to a number of pipelines (or filter pattern implementations)  that allow for a context to be evaluated by a number of modules applying certain business rules.
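In code, such a pipeline can be as simple as a list of small functions that each receive a context and pass it on. The Python sketch below is a generic illustration with hypothetical names (`Context`, `drop_purchased`, `keep_top`), not a description of Amazon's internals.

```python
from dataclasses import dataclass, field

@dataclass
class Context:
    """What the pipeline knows about one shopper at one moment."""
    purchased: set
    candidates: list = field(default_factory=list)

def drop_purchased(ctx):
    """Business rule: never recommend something the shopper already owns."""
    ctx.candidates = [c for c in ctx.candidates if c not in ctx.purchased]
    return ctx

def keep_top(n):
    """Business rule factory: keep only the first n candidates."""
    def step(ctx):
        ctx.candidates = ctx.candidates[:n]
        return ctx
    return step

def run_pipeline(ctx, steps):
    """Pass the context through each module in order."""
    for step in steps:
        ctx = step(ctx)
    return ctx

ctx = Context(purchased={"flag"}, candidates=["flag", "scarf", "jersey", "ball"])
result = run_pipeline(ctx, [drop_purchased, keep_top(2)])
print(result.candidates)
```

New rules can be added or reordered by editing the list of steps, without touching the modules themselves.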
Amazon's leading edge in machine learning techniques, computing power and massive wealth of consumer data put the company in a unique position to keep tweaking and optimizing its recommendation algorithms going forward. Quoting Mr Bezos, it is still “Day One” for Amazon's AI efforts, reflecting his belief that Amazon’s journey has just begun, and begins each day again and again.
So where will this algorithmic path lead us?
Amazon’s vision is that recommendation engines will move beyond the current paradigm of searching, clicking and buying and will become like talking to a friend who knows you, your interests, what happens around you  and anticipates your needs. 
 We’re convinced the future of recommendations will further build on intelligent computer algorithms leveraging collective human intelligence. The future will continue to be computers helping people help other people.
Can AI become your best teammate?
Google's DeepMind believes that AI agents might be more collaborative than humans if properly trained.
The Office
Teamwork is extremely difficult to develop in AI programs, because it involves dealing with complex and ever-changing situations.
Google’s DeepMind shared the results of an experiment in which AI bots were trained to play the first-person multiplayer game “Quake III Arena Capture The Flag”. The goal was to learn how AI agents can team up in a simple video game and to understand how effectively machines can work with other machines and also with human players.
The researchers used an architecture named “For the Win (FTW),” that trains hundreds of agents using reinforcement learning. 
A schematic of the For The Win (FTW) agent architecture. The agent combines recurrent neural networks (RNNs) on fast and slow timescales, includes a shared memory module, and learns a conversion from game points to internal reward.
DeepMind's AI agents had to learn general strategies to be able to adapt to each new map, cooperate with team members, compete against the opposing team and adjust to different enemy play styles.
The machines were never told anything about the rules of the game, yet learned about fundamental game concepts and apparently developed an intuition for the game.
Three examples of the automatically discovered behaviours that the trained agents exhibit.
Researchers found that AI agents win more often than humans.
The performance of AI agents during training. A new agent, the FTW agent, obtains a much higher Elo rating - which corresponds to the probability of winning - than the human players and baseline methods of Self-play + RS and Self-play.
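For readers wondering how an Elo rating translates into a probability of winning, here is the standard Elo formula in Python. The 400-point scale is the usual chess convention and is my assumption here; the ratings in the example are made up and are not taken from the DeepMind chart.

```python
def elo_expected_score(rating_a, rating_b):
    """Expected score (roughly, win probability) of player A against player B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400))

# Hypothetical ratings, 400 points apart.
print(round(elo_expected_score(1600, 1200), 3))
print(round(elo_expected_score(1200, 1600), 3))
```

Under this convention, two equally rated players each have a 50% expectation, and a 400-point lead corresponds to odds of about ten to one.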
Even more surprising though was the fact that machines learned human-like behaviors like following teammates and turned out to be more collaborative than people. 
Can machines become the future teammates you always wished to have?  
Not so fast. The game environment in which these AI algorithms were trained and operated was limited by a clear set of rules.
Demonstrating teamwork in the real world will be a lot more challenging—and achieving that is likely to be a long way off.
The Office
Capture the Flag: the emergence of complex cooperative agents | DeepMind
Can AI beat professional gamers in complex videogames?
The annual world championship of Dota 2 this August will bring together the best professional gamers in one of the internet's most watched events. Five AI bots from OpenAI, a company backed by Elon Musk, will be among the contenders, with the goal of setting a new milestone in man-vs-machine game duels.
The strategy game presents the player with an average of 1,000 possible actions at any moment of the match. That makes it harder to win than other games such as Go, which has 250 actions, or chess.
OpenAI's bots learned to play Dota 2 by playing the equivalent of 180 years of matches using reinforcement learning, a technique in which software uses trial and error to discover which actions will maximize a virtual reward.
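To give a feel for trial-and-error learning, here is a toy Python sketch of tabular Q-learning, a classic reinforcement learning method. It is a deliberately tiny, hypothetical example (an agent on a five-cell line that is rewarded for reaching the last cell) and has nothing to do with the far larger system OpenAI built for Dota 2.

```python
import random

GOAL = 4          # cells 0..4 on a line; reaching cell 4 ends the episode
ACTIONS = (0, 1)  # 0 = step left, 1 = step right

def step(state, action):
    """Move one cell; the only reward is 1.0 for arriving at the goal."""
    nxt = max(0, state - 1) if action == 0 else state + 1
    reward = 1.0 if nxt == GOAL else 0.0
    return nxt, reward

def train(episodes=300, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(GOAL)]  # q[state][action] = estimated value
    for _ in range(episodes):
        state = 0
        for _ in range(200):  # cap on episode length
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)  # trial: try something at random
            else:
                best = max(q[state])  # otherwise take the best known action
                action = rng.choice([a for a in ACTIONS if q[state][a] == best])
            nxt, reward = step(state, action)
            future = 0.0 if nxt == GOAL else max(q[nxt])
            # Error: nudge the estimate toward reward plus discounted future value.
            q[state][action] += alpha * (reward + gamma * future - q[state][action])
            state = nxt
            if state == GOAL:
                break
    return q

q_table = train()
policy = [row.index(max(row)) for row in q_table]
print(policy)  # preferred action per cell (1 means "step right")
```

Nobody tells the agent that the reward sits on the right; it finds out by stumbling onto it and then reinforcing the moves that led there.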
Will they beat professional gamers? Stay tuned…
Elon Musk’s OpenAI Takes on Pro Gamers in Dota 2—And Could Win | WIRED
The quote of the week
It’s no longer about having a technology strategy but data strategy. If you don’t have a data strategy, you won’t be here long.
Rob Brown, MD OTA Travelport, in Travelport Live 2018 Bangkok.
Thanks for reading, AI enthusiasts!
Mario Gavira

Your regular dose of AI news and trends decrypted for Non Techies, including practical insights and case studies on how AI can have an impact in your daily business.
Carefully curated by non-artificial intelligence!
