Five skills matter for product managers: customer perspective and an understanding of design, communication, collaboration, business strategy, and technology. As machine learning is used more and more, understanding the technology becomes even more important. This doesn’t mean you can’t be a good product manager without deep machine learning expertise, but you do need some understanding of how machine learning works in order to make good product decisions.
By understanding the impact and limitations of machine learning algorithms, you can judge whether a given customer problem should be solved through product design or through machine learning. So we’ve compiled a list of five limitations of machine learning algorithms that every product manager needs to know.
- bias in the data
It’s important to have data that is representative of the users you’re targeting. Data bias comes up very often in machine learning projects. Take predicting conversions as a somewhat exaggerated example: so few people convert in the first place that the data is almost entirely made up of people who haven’t converted, which makes it difficult to predict who will. Another example is Google’s image labeling, which tagged photos of black people as gorillas. This was probably also because the original training data did not contain enough photos of black people.
Even if there’s nothing wrong with the way the data is collected, the data itself can be biased. I once gave IBM Watson data from the Urban Dictionary to learn language from, and it started cursing. The intention was for it to use polite language, but the cursing was in the data, and the computer learned that too. This is an example of data that needed to be cleaned up ahead of time.
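The conversion example can be made concrete with a small sketch. The numbers below are invented for illustration (1,000 visitors, 10 of whom converted), and the "model" is a deliberately naive one that always predicts "no conversion". It shows why a high accuracy figure can hide the fact that a model has learned nothing about the rare group.

```python
# Hypothetical data: 1,000 visitors, only 10 of whom converted (1 = converted).
labels = [1] * 10 + [0] * 990

# A naive "model" that ignores its input and always predicts "no conversion".
predictions = [0] * len(labels)

# Accuracy: share of all visitors the model labeled correctly.
correct = sum(1 for y, p in zip(labels, predictions) if y == p)
accuracy = correct / len(labels)

# Recall: of the visitors who really converted, how many did the model find?
found = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 1)
recall = found / sum(labels)

print(f"accuracy={accuracy:.2%} recall={recall:.2%}")
# prints: accuracy=99.00% recall=0.00%
```

A model like this looks excellent on accuracy alone, so it is worth asking how well it does on the minority group you actually care about.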
- the trade-off between precision and recall
I recently had the opportunity to speak with a team building a product that makes the same type of prediction as my team’s. Both products predict bad user behavior, but the two teams’ objectives were different. My team wanted to get rid of bad users only; we wanted good users to keep using our service, so we didn’t want to remove them by mistake. For us, precision is what matters: the probability that a user predicted to be bad really is bad. The other team’s requirement was that bad users must never use the service, even if that means occasionally removing a good user. For them, recall is what matters: the proportion of truly bad users that the model catches. Precision and recall usually trade off against each other, so that improving one tends to worsen the other, which is why it’s important to understand the objective.
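As a rough illustration of the trade-off, the sketch below computes precision and recall for the same set of model scores at two different thresholds. The scores, labels, and thresholds are made up for the example; the function name `precision_recall` is also just an assumption of this sketch, not part of any particular library.

```python
def precision_recall(scores, is_bad, threshold):
    """Flag users whose score is >= threshold; return (precision, recall)."""
    flagged = [s >= threshold for s in scores]
    tp = sum(1 for f, b in zip(flagged, is_bad) if f and b)      # bad and flagged
    fp = sum(1 for f, b in zip(flagged, is_bad) if f and not b)  # good but flagged
    fn = sum(1 for f, b in zip(flagged, is_bad) if not f and b)  # bad but missed
    # Both ratios are undefined when the denominator is zero; report 0.0 here.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Hypothetical model scores (higher = more likely a bad user) and true labels.
scores = [0.95, 0.90, 0.80, 0.60, 0.55, 0.40, 0.30, 0.10]
is_bad = [True, True, False, True, False, True, False, False]

strict = precision_recall(scores, is_bad, threshold=0.85)   # (1.0, 0.5)
lenient = precision_recall(scores, is_bad, threshold=0.35)  # (about 0.67, 1.0)
print(strict, lenient)
```

With the strict threshold, every flagged user really is bad, but half of the bad users slip through. With the lenient threshold, all bad users are caught, but a third of the flagged users are good. The first setting fits a team like mine; the second fits the other team.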
- cold start
This is the question of what an algorithm should predict when there isn’t enough data yet.
There are two patterns. Let’s start with the cold start for users.
For example, when a new user arrives, we don’t have any data on them yet, so we don’t know what to recommend to them. There are a few ways to solve this. One is to ask a few questions when a new user first arrives, such as “What movies do you like?” Another is to make the most of whatever other data is available. For example, if the user is from California, you could recommend the 10 most popular movies in that area.
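The regional fallback could be sketched roughly as follows. The viewing log, the movie titles, and the function `recommend_for_new_user` are all hypothetical; a real system would draw on a proper data store and probably more signals than region alone.

```python
from collections import Counter

# Hypothetical viewing log: (user_region, movie_title).
views = [
    ("California", "Movie A"), ("California", "Movie A"),
    ("California", "Movie B"), ("Texas", "Movie C"),
    ("Texas", "Movie C"), ("Texas", "Movie A"),
]

def recommend_for_new_user(region, views, top_n=10):
    """Recommend the most-watched titles in the user's region.

    If nothing is known about that region, fall back to overall popularity.
    """
    regional = Counter(title for r, title in views if r == region)
    counts = regional if regional else Counter(title for _, title in views)
    return [title for title, _ in counts.most_common(top_n)]

print(recommend_for_new_user("California", views, top_n=2))  # ['Movie A', 'Movie B']
print(recommend_for_new_user("Ohio", views, top_n=2))        # ['Movie A', 'Movie C']
```

The second call shows the fallback: there is no data for that region, so the overall most popular titles are returned instead.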
Next is the cold start for items, such as products.
When a new item arrives, say a new movie title in the case of Netflix, we don’t know who to recommend it to, because no one has watched it yet and we have no data about it at all.
One solution is human annotation. Stitch Fix does this: experts in the field tag items with categories, which makes it easier to recommend them to users who are interested in those categories.
An alternative solution is to use an algorithm. This works like A/B testing or, more precisely, like testing with an algorithm called a multi-armed bandit: you start by showing the new item to users at random, then quickly learn from the results which segment of users is interested in it.
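To give a feel for how a bandit learns which segment likes a new item, here is a minimal sketch using epsilon-greedy, one of the simplest bandit strategies (chosen here for brevity; it is not necessarily what any particular product uses). The segment names and click rates are invented, and user responses are simulated with a random number generator rather than coming from real traffic.

```python
import random

def run_bandit(true_click_rates, rounds=6000, epsilon=0.2, seed=0):
    """Epsilon-greedy: with probability epsilon show the new item to a random
    segment (explore); otherwise show it to the segment with the best
    observed click rate so far (exploit)."""
    rng = random.Random(seed)
    segments = list(true_click_rates)
    shows = {s: 0 for s in segments}
    clicks = {s: 0 for s in segments}

    def observed_rate(s):
        return clicks[s] / shows[s] if shows[s] else 0.0

    for _ in range(rounds):
        if rng.random() < epsilon:
            segment = rng.choice(segments)               # explore
        else:
            segment = max(segments, key=observed_rate)   # exploit
        shows[segment] += 1
        # Simulated response; in a real product this would be an actual click.
        if rng.random() < true_click_rates[segment]:
            clicks[segment] += 1
    return shows, clicks

# Hypothetical segments and their (unknown to the algorithm) click rates.
true_rates = {"action fans": 0.02, "documentary fans": 0.40, "comedy fans": 0.05}
shows, clicks = run_bandit(true_rates)
best = max(shows, key=shows.get)
print(best, shows)
```

Over time, most impressions should go to the segment that responds best, while the small share of random exploration keeps checking the other segments in case the early results were misleading.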
Outlier detection systems can also be thrown off by the cold start problem. For example, you may have had your credit card suddenly blocked the first time you traveled somewhere, or seen a new employee get locked out of their account the first time they tried to access a database. To correct these false positives, where the predictive model flags something as a problem when it is not, you need a solid feedback loop in place, which is the next topic.
- a feedback loop for model validation
A product that uses machine learning models should have a mechanism for giving feedback on those models. That way, you can validate a model’s performance in the real world and keep improving it. The feedback can be implicit, such as whether a user ignored a recommended news item or, conversely, read it. You can also ask users for explicit feedback. This is the kind of message you often see that asks, “Did you find this article useful?”
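One way to picture such a feedback loop is as a log of events that gets summarized into a few health metrics for the model. The sketch below is only an illustration: the `FeedbackEvent` fields and the `summarize` function are assumptions made for this example, not a standard schema.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class FeedbackEvent:
    article_id: str
    recommended: bool                    # did the model recommend this article?
    was_read: bool                       # implicit feedback: did the user open it?
    rated_useful: Optional[bool] = None  # explicit feedback; None if unanswered

def summarize(events):
    """Summarize feedback on recommended articles only."""
    recommended = [e for e in events if e.recommended]
    read_rate = (
        sum(e.was_read for e in recommended) / len(recommended)
        if recommended else 0.0
    )
    rated = [e for e in recommended if e.rated_useful is not None]
    useful_rate = sum(e.rated_useful for e in rated) / len(rated) if rated else None
    return {"read_rate": read_rate, "useful_rate": useful_rate}

# Hypothetical events collected from the product.
events = [
    FeedbackEvent("a1", recommended=True, was_read=True, rated_useful=True),
    FeedbackEvent("a2", recommended=True, was_read=True, rated_useful=False),
    FeedbackEvent("a3", recommended=True, was_read=True),
    FeedbackEvent("a4", recommended=True, was_read=False),
    FeedbackEvent("a5", recommended=False, was_read=True),
]
summary = summarize(events)
print(summary)  # {'read_rate': 0.75, 'useful_rate': 0.5}
```

Tracking numbers like these over time gives the team a signal for when the model’s real-world performance starts to drift from what was measured offline.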
- exploration and exploitation
Let’s say Netflix finds out that I like watching football. My recommendations would then include relevant content, such as football games and documentaries. If I watch some of them, more and more football-related content gets recommended. This type of algorithm exploits the signals it finds and optimizes for them. But of course I’m interested in more than just football. For example, I’m interested in technology, but Netflix doesn’t recommend any technology-related content. This is what the media now calls a filter bubble: if you ‘like’ a particular news item on Facebook, your timeline fills up with that kind of news, and you stop seeing other news and other perspectives.
This means I’m never shown anything beyond what I already like.
To solve this problem, the system needs to recommend content that encourages users to venture out, even when there is no clear signal to do so. There are various ways to do this, such as picking content at random or drawing on the preferences of users who are similar to you.
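The random approach can be sketched as reserving a slot or two in each recommendation list for content outside the user’s known interests. Everything here is hypothetical: the titles, the `recommend` function, and the choice of one exploration slot out of five are assumptions for illustration only.

```python
import random

def recommend(ranked_by_preference, catalog, slots=5, explore_slots=1, rng=None):
    """Fill most slots with the user's top-ranked items (exploitation) and the
    rest with random items the user has shown no signal for (exploration)."""
    rng = rng or random.Random()
    exploit = ranked_by_preference[: slots - explore_slots]
    unexplored = [item for item in catalog if item not in ranked_by_preference]
    explore = rng.sample(unexplored, min(explore_slots, len(unexplored)))
    return exploit + explore

# Hypothetical user whose history is all football.
ranked = ["Football Final", "Football Legends", "Club Documentary",
          "Transfer Stories", "Derby Classics", "Stadium Tour"]
catalog = ranked + ["Tech Documentary", "Cooking Show", "Space Series"]

result = recommend(ranked, catalog, slots=5, explore_slots=1, rng=random.Random(0))
print(result)
```

The first four items come from what the user already likes; the last one is drawn at random from everything else. If the user engages with it, that becomes a new signal the system can start to exploit.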