Before explaining what this operator does, let’s review what recommendation algorithms are.

Typically, data needs fall into classifications and regressions. A classification attempts to correctly assign one of a set of known, modeled labels (as defined by our problem requirements) to each incoming data point in whatever dataset we want to know about, based on previously gathered data we treat as training data. The training data teaches the classifier we use how to link data points to one of the known labels.

Regressions work similarly to classifications, except that the co-domain we want to predict is a continuous one. Common examples involve a price, sales volumes, the number of people in a given place, the spread and incidence of an airborne disease, or the correlation between certain genes and diseases. Again, the algorithm needs to be trained first, a process from which it learns how to assign a value in that continuous co-domain.
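The shared train-then-predict pattern can be sketched with a toy regression. The sizes and prices below are made-up illustration data, not part of the operator:

```python
import numpy as np

# Hypothetical training data: apartment size (m^2) -> price (thousands).
# "Training" fits a line to the known points; "prediction" applies it
# to a new, unseen data point in the same continuous co-domain.
sizes = np.array([50.0, 60.0, 80.0, 100.0])
prices = np.array([150.0, 180.0, 240.0, 300.0])

slope, intercept = np.polyfit(sizes, prices, deg=1)

# Predict the price of an unseen 70 m^2 apartment.
predicted = slope * 70.0 + intercept
```

A classifier follows the same shape, except the predicted value is one of a finite set of labels rather than a point on a continuous scale.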

What they have in common is that both are algorithms that learn a pattern from previous training and then operate over new incoming data.

Recommendation algorithms (in particular, the subset we’ll care about in this document, called collaborative filtering algorithms) work in a different problem domain: they operate on the actual and potential relationships between a known set of users and a known set of products, and then suggest or predict values for those known users in that known space, like this:

  1. For a target user (in practice, this runs for many or all users simultaneously), it is known which products they bought, or which content they were exposed to, and, if available, which rank, rating, or score they gave to that interaction (if the system does not support ratings, a constant value is typically used instead).
  2. The algorithm compares the target user with other users who also interacted with those products and gave their own scores.
  3. The algorithm attempts to fill the empty spaces: it predicts which products or services the target user has not acquired yet but would like, and how highly they would rate them.
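The steps above operate over a user-product matrix with missing entries. A minimal sketch of that structure, with made-up ratings, looks like this:

```python
import numpy as np

# Hypothetical ratings matrix: rows are users, columns are products.
# NaN marks interactions that never happened -- the "empty spaces"
# a collaborative-filtering algorithm tries to fill with predictions.
ratings = np.array([
    [5.0, 3.0, np.nan],   # user 0 rated products 0 and 1
    [4.0, np.nan, 1.0],   # user 1 rated products 0 and 2
    [np.nan, 2.0, 5.0],   # user 2 rated products 1 and 2
])

# The observed entries are the training signal; the NaN positions
# are exactly what the algorithm must predict for each target user.
observed = int((~np.isnan(ratings)).sum())
missing = int(np.isnan(ratings).sum())
```

Note that, unlike classification or regression, there is no separate incoming dataset: the predictions target the empty cells of this same known user-product space.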

The best-known algorithm (and the one implemented in this operator) is called Alternating Least Squares (ALS), a form of matrix factorization; approaches of this kind were popularized by the Netflix Prize, a competition Netflix created to find a better recommendation system for its streaming platform. Both names stem directly from the implementation: the algorithm derives two lower-rank matrices that share an extra dimension, typically interpreted as features and each party’s appreciation of them (these are called latent factors and are inferred during training), one matrix for the users and one for the products. Multiplying these two matrices (hence the factorization term) yields a good enough approximation of the ratings users would give to unwatched / unbought products. The factors themselves are found with an iterative procedure that alternately solves a least-squares problem over the known ratings (hence the ALS name). While there are more algorithms (each with its own hyperparameters, where that applies), this document will focus on ALS, its input needs, and how the output results should be understood.
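A minimal sketch of ALS itself, using NumPy and a made-up 4x4 ratings matrix, can make the alternation concrete. This is an illustration of the technique, not the operator’s actual implementation; the data, the number of latent factors, and the regularization strength are all assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy ratings matrix; 0 marks unobserved entries (illustration only).
R = np.array([
    [5.0, 3.0, 0.0, 1.0],
    [4.0, 0.0, 0.0, 1.0],
    [1.0, 1.0, 0.0, 5.0],
    [0.0, 1.0, 5.0, 4.0],
])
mask = R > 0          # which entries were actually observed
k = 2                 # number of latent factors
lam = 0.1             # regularization strength

U = rng.normal(scale=0.1, size=(R.shape[0], k))  # user factors
V = rng.normal(scale=0.1, size=(R.shape[1], k))  # product factors

def solve_side(fixed, ratings, observed, lam):
    # Holding one factor matrix fixed, each row of the other side is
    # a small ridge-regression solve over that row's observed ratings.
    out = np.zeros((observed.shape[0], fixed.shape[1]))
    for i in range(observed.shape[0]):
        obs = observed[i]
        A = fixed[obs].T @ fixed[obs] + lam * np.eye(fixed.shape[1])
        b = fixed[obs].T @ ratings[i, obs]
        out[i] = np.linalg.solve(A, b)
    return out

for _ in range(20):                    # alternate the two solves
    U = solve_side(V, R, mask, lam)    # update users, products fixed
    V = solve_side(U, R.T, mask.T, lam)  # update products, users fixed

pred = U @ V.T                         # filled-in approximation of R
err = float(np.sqrt(((pred - R)[mask] ** 2).mean()))
```

After the loop, `pred` contains a value for every user-product pair, including the cells that were empty in `R`: those are the predicted ratings used for recommendations.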

What can it be used for?

The scope for this algorithm type is quite narrow: it can be used to recommend an interaction, or to predict how much a user would enjoy interacting with a given element of the system or a product to consume. It has no notion of sequence, so time-domain problems or probabilities of future interactions may fall outside this domain, depending on the problem structure. Again: streaming platforms use algorithms like ALS to predict what a visitor would like, based on previous purchases and users with similar profiles.

How to use it?

We will need at least three elements in our pipeline in order to set this operator up appropriately and retrieve the expected results:

The first thing to understand is the input of the Recommended Products operator. That dataset must have the following fields: