I want to know if my current customers will buy my products or not, or if they will buy a specific product in the future.
Logistic regression is a binary classification algorithm that predicts the probability that the answer to a question is 1 or 0 (yes or no, true or false, and so on). In this case, we use logistic regression to estimate the probability that a customer will buy a product. The model outputs the probability of the event occurring, and that probability is compared against a cut-off to separate customers who will buy a product from those who will not.
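To make the idea of a probability and a cut-off concrete, here is a minimal Python sketch. It is not the operator's implementation: the function names, the weights, and the 0.5 cut-off are assumptions chosen for illustration only.

```python
import math


def sigmoid(z: float) -> float:
    """Logistic function: maps any real score to a probability in (0, 1)."""
    # Two branches so that math.exp never overflows for large |z|.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_proba(features, weights, bias):
    """Probability of the positive class (the customer buys)."""
    score = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(score)


def classify(probability, cutoff=0.5):
    """Turn a probability into a 1/0 label using the cut-off (assumed 0.5)."""
    return 1 if probability >= cutoff else 0


# Hypothetical customer with two features and made-up weights.
probability = predict_proba([2.0, 1.0], weights=[0.8, -0.3], bias=-1.0)
label = classify(probability)
```

Raising the cut-off makes the model more conservative about predicting a purchase, and lowering it does the opposite.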
Once the data source is integrated, go to the Pipelines section and use the Logistic Regression operator.
Identify the independent variables (predictors) and the dependent variable (target). We call the feature to predict the “Y column” and the features used to predict it the “X columns”. The goal is to build a model that classifies the “Y column”.
After this, build a dataset that contains only the features to be used, that is, only x_columns and y_column.
Once the dataset is built, it is fed into the model.
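For intuition about what fitting a logistic regression model involves, the sketch below trains one with plain batch gradient descent in Python. This is an illustrative assumption, not a description of how the Logistic Regression operator works internally; the data, learning rate, and number of epochs are made up.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(X, y, learning_rate=0.5, epochs=2000):
    """Fit weights and bias by batch gradient descent on the mean log-loss."""
    n_samples, n_features = len(X), len(X[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for row, target in zip(X, y):
            score = bias + sum(w * x for w, x in zip(weights, row))
            error = sigmoid(score) - target  # gradient of log-loss w.r.t. score
            for j, x in enumerate(row):
                grad_w[j] += error * x
            grad_b += error
        weights = [w - learning_rate * g / n_samples
                   for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / n_samples
    return weights, bias


# Hypothetical data: one X column (e.g. number of visits), Y column = bought or not.
X = [[0.0], [1.0], [2.0], [3.0]]
y = [0, 0, 1, 1]
weights, bias = fit_logistic(X, y)
p_low = sigmoid(bias + weights[0] * 0.0)   # customer with 0 visits
p_high = sigmoid(bias + weights[0] * 3.0)  # customer with 3 visits
```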
If the dataset or data source contains more features than are going to be used in the model, it is recommended to use the “CustomSQL operator” to select only the desired features.
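The column selection described above amounts to a plain SELECT over the X columns and the Y column. The sketch below demonstrates this with Python's built-in sqlite3 module as a stand-in: the SQL dialect accepted by the CustomSQL operator is not specified here, and the table name `customers` and all column names are hypothetical.

```python
import sqlite3

# In-memory table standing in for the integrated data source (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers ("
    "customer_id INTEGER, age INTEGER, visits INTEGER, "
    "signup_channel TEXT, purchased INTEGER)"
)
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?, ?, ?)",
    [
        (1, 34, 5, "email", 1),
        (2, 22, 1, "ads", 0),
        (3, 45, 8, "organic", 1),
    ],
)

# Keep only the features the model will use.
x_columns = ["age", "visits"]
y_column = "purchased"
selected = ", ".join(x_columns + [y_column])
query = f"SELECT {selected} FROM customers ORDER BY customer_id"

rows = conn.execute(query).fetchall()
conn.close()
```

The resulting rows contain only `age`, `visits`, and `purchased`; `customer_id` and `signup_channel` are left out of the dataset passed to the model.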
Each step produces a table. If the PREDICTION DATA table is provided as input, the corresponding model metrics or predicted values are returned.
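As an illustration of how predicted values relate to model metrics, the following Python sketch computes accuracy, precision, and recall from predicted probabilities. Which metrics the operator actually reports is not stated here, so these three are an assumption, as are the sample values and the 0.5 cut-off.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, tn, fn


def evaluate(y_true, probabilities, cutoff=0.5):
    """Apply the cut-off to the probabilities, then compute common metrics."""
    y_pred = [1 if p >= cutoff else 0 for p in probabilities]
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Hypothetical actual outcomes and predicted purchase probabilities.
y_true = [1, 0, 1, 0]
probabilities = [0.9, 0.4, 0.3, 0.6]
metrics = evaluate(y_true, probabilities)
```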