Lola’s Recommendation Challenge
How to make recommendations with limited data and unusual constraints
As a member of Lola’s hotel recommendations team, I’ve frequently been asked what makes our work interesting. Recommenders are old news - Netflix and Amazon have been around for ages now and seem to have nearly perfected the formula.
My answer is simple - Amazon and Netflix got the easy version of the problem.
Fundamentally, recommendation engines are tools that identify patterns of relationships. The more relationships - between two products, between a user and product, or between users - the better your recommendations get. This means you want:
- Users who are willing to buy frequently, to establish more user-product relationships;
- Users who will purchase products that cover a wide space of possibilities, to create more relationships between your products;
- Users who readily share their demographics, to reveal better user-user relationships.
At Amazon, users buy new products all the time, from a vast array of categories. Aside from simple user-product relationships, these purchases also reveal lots of demographic information through particularly indicative sales or overall behavior.
At Netflix, users can freely try out new shows with no fear of sunk cost - if they don’t like it, they can simply stop watching. Additionally, users are on Netflix all the time, watching hours of content every day.
With so many clear relationships, you can extract great recommendations through any number of mathematical techniques. Collaborative filtering recommends items rated similarly to others a user prefers; matrix decomposition attempts to identify latent characteristics of users and items that indicate a good fit; clustering assumes there is a discrete number of user types and recommends the same set of items to everyone of a given type.

Of course, Amazon and Netflix have grown far more sophisticated than any of these simple techniques - they have to account for things like when you made a purchase (as your tastes might change), whether your preferences shift around certain holidays, whether you sometimes make purchases for another person, and so on. But it is important to understand that these companies (and others in similar spaces) can rely on these basic techniques to quickly make decent recommendations with easily understood consequences. Even as a simple baseline from which to build a machine learning (ML) system, this is hugely valuable.
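To make the first of those techniques concrete, here is a minimal sketch of item-item collaborative filtering in numpy. The rating matrix, user indices, and function names are all invented for illustration - none of this is Lola's (or Amazon's) actual system, just the textbook version of the idea: score each unrated item by how similar it is to the items a user already rates highly.

```python
import numpy as np

# Toy user-item rating matrix (rows = users, columns = items); 0 = no rating.
# Purely illustrative data.
R = np.array([
    [5, 4, 0, 0],
    [4, 5, 1, 0],
    [1, 0, 5, 4],
    [0, 1, 4, 5],
], dtype=float)

def item_similarity(R):
    """Cosine similarity between item columns."""
    norms = np.linalg.norm(R, axis=0)
    norms[norms == 0] = 1.0  # avoid dividing by zero for never-rated items
    normalized = R / norms
    return normalized.T @ normalized

def recommend(R, user, k=2):
    """Rank a user's unrated items by similarity-weighted ratings."""
    sim = item_similarity(R)
    rated = R[user] > 0
    scores = sim[:, rated] @ R[user, rated]
    scores[rated] = -np.inf  # never re-recommend items already rated
    return np.argsort(scores)[::-1][:k]

print(recommend(R, user=0))  # prints [2 3]
```

Note how directly the quality of the output depends on how densely populated `R` is - which is exactly the problem a hotel recommender faces.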
Hotels, however, are a different sort of beast. Lola’s users buy infrequently - few people need hotels more than once or twice a month. Purchases don’t vary much - brand loyalty and the complexity of the space means people default to staying at a small set of familiar hotels, revealing very little about what they’d truly enjoy the most. And perhaps most importantly, hotels are a major purchase. We cannot afford to suggest that our users book a hotel we’re not certain they’ll enjoy, which means it’s very difficult to explore lesser-known possibilities.
An example of hotel results. After distilling all of this data, we make sure to present the options to our users in the simplest way possible.
These challenges do not mean we should invent an entirely new algorithm. After all, collaborative filtering works no matter how much data you have; it just gets better with more. Our question is: how can we kick-start these sorts of algorithms by squeezing as much information as possible out of as little data as possible?
Hmm. That sure sounds like the domain of deep learning, doesn’t it?
Deep learning is incredibly good at identifying every pattern the data has to offer - though at the risk of getting too good and latching onto "phantom" patterns that aren't real. Deep neural networks have been used to make great strides in a wide variety of problems, ranging from identifying the content of an image, to predicting someone's personality from a picture of them, and, of course, making recommendations.
Deep learning, while not typically used as the first attempt at a recommender, is still a fairly standard technique. It does have some challenges - primarily bootstrapping: the chicken-and-egg problem of how to initially train the system. The ideal way to train a deep learning system is with examples of what great recommendations look like - but our best way of getting examples of great recommendations is to start making them and keep track of the best ones. So in order to obtain data to train our recommender, we need to use our recommender! This is a typical issue with deep learning systems, and it can be solved in a variety of ways, from purchasing data from another organization to finding mathematical rules to train the system on. In our case, we believe our human travel agents have a lot of wisdom, so we have used their best recommendations as a starting point for training our system, while still letting it experiment and improve itself over time.
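The expert-seeding idea can be sketched with a deliberately simplified stand-in model. Everything below is hypothetical - synthetic features, invented labels, and plain logistic regression in place of a real deep network - but the workflow is the same: train first on labels supplied by human experts, then retrain later as genuine booking outcomes accumulate.

```python
import numpy as np

# Hypothetical seed data: feature vectors for (user, hotel) pairs that human
# travel agents marked as great matches (1) or poor ones (0). Synthetic here.
rng = np.random.default_rng(0)
X_expert = rng.normal(size=(200, 5))
y_expert = (X_expert[:, 0] + X_expert[:, 1] > 0).astype(float)  # stand-in labels

def train_logistic(X, y, lr=0.1, epochs=200):
    """Plain logistic regression via gradient descent - a toy stand-in for
    the real recommender, trained on the expert-labeled seed set alone."""
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        p = 1 / (1 + np.exp(-X @ w))       # predicted match probability
        w -= lr * X.T @ (p - y) / len(y)   # gradient step on log loss
    return w

w = train_logistic(X_expert, y_expert)

# Later, logged real bookings (X_live, y_live) can be appended to the
# training set and the model retrained - letting it improve beyond its
# expert-seeded starting point.
```

The design point is that the expert labels are only a prior: the same training loop accepts live outcome data as it arrives, so the system is never permanently limited by what the human agents happened to know.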
So we seem to have a typical ML problem - difficult, but perhaps not as interesting as I implied, right? Not quite, because we aren't done. Our challenge isn't just that we're using ML and that ML is hard. It's that we're constrained to ML, for the reasons above, while also facing one more critical constraint: explainability.
Lola doesn't just recommend the right hotels - it makes sure you understand why they are right for you.
At Lola, we know a new user won't book a hotel just because our AI tells them to, no matter how well the AI knows them. They'll want to know why the hotel is a good fit - and we want to tell them! But anyone who has dealt with the black boxes that are neural nets knows how difficult that is. Pulling an explanation out of a neural net is truly cutting-edge work - while some techniques exist for visualizing the latent space, they are still immature, meant to help human researchers gain intuition rather than to produce plain-English, automated explanations of the network's thinking.
I'd love to report on our brilliant solution to this problem, but we're still quite early in our process. We're beginning to explore a number of strategies:

- Heavy NLP: trying to mathematically describe what makes a good explanation, so we can design systems that generate one from a recommendation;
- Massively complex: training a recommender to produce both a recommendation and a plain-English explanation;
- Basically psychological: researching whether the information users most want to hear is the same as the information that drives the best recommendations - any divergence might let us design honest descriptions while still accepting a largely black-box recommender.

We're looking forward to reporting back as we make further progress - but for now, I'm just excited to dive further into what promises to be an adventure.