Machine Learning in R with caret

Machine Learning is one of the main uses of the R programming language. There are many models in the world of Machine Learning, and even more R packages that implement these algorithms, and caret is one of the main ones.

Although having many libraries that implement models is a good thing, it also has its downsides. For me, the main one is that the way you find the optimal hyperparameter values changes from library to library, which makes the process really tedious.

Luckily, in R we have caret: in my opinion, a fantastic package that lets you apply more than 230 Machine Learning models, all of them following the same structure. In addition, caret includes a lot of helpful functions for the different stages of the Machine Learning process: feature selection, data splitting, model validation, etc.

As you can see, caret is a fantastic package and, without a doubt, if you use R it is one of the packages you should know in depth. And that is exactly what we are going to do in this post. Let’s get to it!

How to prepare data with caret

Before building a Machine Learning model, we must first perform several steps, such as feature selection, imputation of missing values, creation of dummy variables, etc.

These are many steps that you generally have to code yourself. Luckily, caret offers features to help us through most of the steps in the data preparation process. Let’s get started with Feature Selection!

How to perform Feature Selection

One of the fundamental aspects in the selection of variables is to check if their variance is zero or close to zero. This makes perfect sense: if the variance is close to zero, that means that there is not much variation within the data, that is, almost all observations have similar values.

Therefore, variables with variance 0 are usually discarded, since it is very likely that they only add noise to our model (obviously you should also consider the range of the variable).
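As a rough sketch of the idea (caret’s nearZeroVar actually combines a frequency-ratio criterion with a percent-unique criterion; the cutoffs below mirror its documented defaults, but this is not its exact code):

```r
# Simplified illustration of near-zero-variance detection:
# a variable is flagged when its most common value dominates
# (high frequency ratio) AND it has few distinct values.
near_zero = function(x, freq_cut = 95/5, unique_cut = 10) {
  tab = sort(table(x), decreasing = TRUE)
  freq_ratio = if (length(tab) > 1) tab[1] / tab[2] else Inf
  pct_unique = 100 * length(unique(x)) / length(x)
  freq_ratio > freq_cut && pct_unique < unique_cut
}

near_zero(c(rep(0, 99), 1))  # almost constant -> TRUE
near_zero(rnorm(100))        # varied values -> FALSE
```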

Checking whether variables have zero variance or not with caret is very simple. We just have to use the nearZeroVar function.

Let’s see how it works with an example from the Sacramento dataset, which includes information on house prices in Sacramento. First of all, in case you are not familiar with the dataset, let’s visualize it:

library(caret)
data(Sacramento)
str(Sacramento[1:3,])
'data.frame':   3 obs. of  9 variables:
 $ city     : Factor w/ 37 levels "ANTELOPE","AUBURN",..: 34 34 34
 $ zip      : Factor w/ 68 levels "z95603","z95608",..: 64 52 44
 $ beds     : int  2 3 2
 $ baths    : num  1 1 1
 $ sqft     : int  836 1167 796
 $ type     : Factor w/ 3 levels "Condo","Multi_Family",..: 3 3 3
 $ price    : int  59222 68212 68880
 $ latitude : num  38.6 38.5 38.6
 $ longitude: num  -121 -121 -121

As we can see, we have several numerical variables (number of bathrooms, number of beds, price, latitude, and longitude). Let’s see if they have zero variance or not.

numeric_cols = sapply(Sacramento, is.numeric)
variance = nearZeroVar(Sacramento[numeric_cols], saveMetrics = T)
variance

As we can see, if we pass the saveMetrics argument, the function also returns the metrics it used for the calculation. One of the returned columns is nzv (near-zero variance), which is FALSE in all cases.

So we can use all of our numeric variables in our model prediction, at least for now.

Another important question is the correlation between variables. Several models, such as linear regression and logistic regression, do not handle highly correlated predictors well.

So, let’s see how to check the correlation between variables in R with caret.

How to find correlated variables with caret

Finding correlated variables in R using caret is very easy. To do this, you just have to pass a correlation matrix to the findCorrelation function. With this, caret will tell us which variables to eliminate (if there are any).

Let’s see how it works:

sacramento_cor = cor(Sacramento[numeric_cols])
findCorrelation(sacramento_cor)
integer(0)

As we can see, in this case, there are no correlated variables, so caret tells us that there is no variable to eliminate. However, if we create a new correlated variable, we will see how it would tell us that there are problems. Let’s see:

fake_data = data.frame(
  variable1 = 1:20,
  variable2 = (1:20)*2,
  variable3 = runif(20),
  variable4 = runif(20) * runif(20)
)

findCorrelation(cor(fake_data), 
                verbose = T,
                names = T)
Compare row 1  and column  2 with corr  1 
  Means:  0.438 vs 0.269 so flagging column 1 
All correlations <= 0.9 
[1] "variable1"

As we can see, the findCorrelation function identifies that variable1 is correlated with variable2 and indicates that it should be removed. But what if a variable were a linear combination of other variables? For example, a variable5 built from variable1 and variable3. That would still be a problem, even if the pairwise correlations did not flag it.

Well, precisely to detect these cases, caret includes the findLinearCombos function. Let’s see how it works:

# Add a variable that is a linear combination of variable1 and variable3
fake_data$variable5 = fake_data$variable1 + 2*fake_data$variable3

# Check whether there are any linear combinations
findLinearCombos(fake_data)
$linearCombos
$linearCombos[[1]]
[1] 2 1

$linearCombos[[2]]
[1] 5 1 3


$remove
[1] 2 5

As we can see, the findLinearCombos function tells us that columns 1 and 2 form a linear combination, and so do columns 5, 1, and 3. That is why it recommends removing columns 2 and 5.

As we can see, thanks to the caret package we can address the key questions of the Feature Selection process in R in a very simple way.

But that’s not all: R’s caret package also helps a lot with data transformation. Let’s see how it does it!

How to transform data with caret

Among the transformations we usually apply in Machine Learning, the following stand out:

  • Creating dummy variables: many models cannot work with categorical variables directly. Instead, we dummify them, that is, we create n-1 columns (where n is the number of categories), each of which indicates the presence (1) or absence (0) of one particular value. Although many implementations (such as logistic regression) handle this themselves, others (such as xgboost) require you to do it manually.
  • Data scaling: consists of normalizing the scale of the data, which is very important for algorithms such as regularization models (Ridge, Lasso and Elastic Net) or kNN, among others.
  • Imputation of missing values: the vast majority of models (except those based on trees) cannot work with missing values. That is why, when we have missing values, we either impute them or remove the affected observations or variables. Luckily, caret makes it very easy to impute missing values using various types of models.
  • Dimensionality reduction: when we work on a problem with high dimensionality, that is, with many variables, it is usually worthwhile to reduce the number of variables while keeping as much variability as possible. This is usually done with a principal component analysis (PCA).

So, let’s see how to do all these types of transformations in our machine learning models in R with caret, the vast majority of them with the same function: preProcess.

How to create dummy variables with caret

Creating dummy variables with caret is very simple, we simply have to use the dummyVars function and apply a predict to obtain the resulting data.

pre_dummy = dummyVars(price ~ type, data = Sacramento)
sacramento_dummy = predict(pre_dummy, Sacramento)

head(sacramento_dummy)
  type.Condo type.Multi_Family type.Residential
1          0                 0                1
2          0                 0                1
3          0                 0                1
4          0                 0                1
5          0                 0                1
6          1                 0                0

As we can see, caret has converted a single column (type) into three columns (one per category), each of them being binary. However, it has not eliminated one of the categories, creating redundancy. After all: Condo = 0 & Multi_Family = 0 --> Residential = 1 .

Luckily, we can avoid this redundancy with the fullRank = TRUE argument, which drops the first level:

pre_dummy = dummyVars(price ~ type, data = Sacramento,
                      fullRank = TRUE)
sacramento_dummy = predict(pre_dummy, Sacramento)

head(sacramento_dummy)

With this, only the Multi_Family and Residential columns remain, and Condo is encoded implicitly as a zero in both.

How to scale data

To scale the data, we simply have to pass the appropriate value to the method parameter of caret’s preProcess function. The three main options are:

  • center: subtracts the mean from the values, so that they have mean 0.
  • scale: divides the values by the standard deviation, so that the data has standard deviation 1.
  • range: normalizes the data so that it ranges from 0 to 1.
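The three methods boil down to simple base-R formulas; here is a conceptual sketch with hypothetical square-footage values (not caret’s internal code):

```r
# Conceptual sketch of the three scaling methods (hypothetical sqft values):
x = c(484, 1167, 796, 4878)

centered = x - mean(x)                       # "center": mean becomes 0
scaled   = x / sd(x)                         # "scale": sd becomes 1
ranged   = (x - min(x)) / (max(x) - min(x))  # "range": rescaled to [0, 1]

round(mean(centered), 10)  # 0
round(sd(scaled), 10)      # 1
range(ranged)              # 0 1
```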

preProcess(Sacramento, method = "center")
Created from 932 samples and 9 variables

Pre-processing:
  - centered (6)
  - ignored (3)

As we can see, caret has centered 6 variables (the numerical ones) and ignored the other 3. However, we see the message, but not the data. Why?

The reason is that the preProcess function is not meant to transform the data immediately, but to apply the transformation during the training (or inference) process.

However, we can see how our data looks after applying the transformation. To do this, we have to pass the result of the preprocessing and our data to the predict function. Let’s see how it works.

preprocess = preProcess(Sacramento, method = "center")
predict(preprocess, Sacramento)[1:10,]

As we can see, now caret does return all the data with the processing already applied (in this case, having subtracted the mean). Let’s see, for example, how we would normalize the data.

preprocess = preProcess(Sacramento, method = "range")
Sacramento_processed = predict(preprocess, Sacramento)

cat("--- Raw data ---","\n",
    "Min:", min(Sacramento$sqft),"\n",
    "Max:", max(Sacramento$sqft), "\n","\n",
    "--- Processed data ---","\n",
    "Min:", min(Sacramento_processed$sqft),"\n",
    "Max:", max(Sacramento_processed$sqft)
    )
--- Raw data --- 
 Min: 484 
 Max: 4878 

 --- Processed data --- 
 Min: 0 
 Max: 1

As we can see, we have normalized the data simply with one line of code. But this is not all, since the preProcess function allows you to do much more, such as impute missing values. Let’s see.

How to impute missing values with caret

To impute missing values with caret, we will use the preProcess function. In this case, there are different values that we can pass to the method parameter:

  • knnImpute: uses the kNN algorithm to impute missing values. As you know (if not, I explain it in this post), the kNN algorithm requires you to indicate the number of neighbors to use in the prediction. That is why, if we use the knnImpute method, we will also have to indicate the k parameter.
  • bagImpute: uses several bagged decision trees to impute the missing values.
  • medianImpute: as its name suggests, it imputes the median (in the case of a numeric variable). This is usually preferable to imputing the mean, since the mean can be affected by outliers.
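As a conceptual sketch (not caret’s internal code), medianImpute essentially does the following for each numeric column:

```r
# Median imputation by hand: replace each NA with the median
# of the observed values in that column.
impute_median = function(x) {
  x[is.na(x)] = median(x, na.rm = TRUE)
  x
}

impute_median(c(1, 2, NA, 4, 100))  # the NA becomes 3, the median of the rest
```

Note how the outlier (100) barely affects the imputed value, which is exactly why the median is usually preferred over the mean here.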

Let’s see how missing value imputation works with caret in practice. To do this, first of all, we are going to “remove” some data from our dataset to simulate that we have missing values.

# Simulate missing values by removing some data from the original dataset
sacramento_missing = Sacramento
sacramento_missing[c(1, 4, 5), c("city", "beds", "baths", "sqft")] = NA

colSums(is.na(sacramento_missing))
     city       zip      beds     baths      sqft      type     price  latitude longitude 
        3         0         3         3         3         0         0         0         0 

As we can see, we now have 4 variables with 3 missing values each. Let’s see how each imputation method works:

# Perform the imputation
pre_knn = preProcess(sacramento_missing, 
                     method = "knnImpute", k = 2)

pre_bag = preProcess(sacramento_missing, 
                     method = "bagImpute")

pre_median = preProcess(sacramento_missing, 
                        method = "medianImpute")

# Get the imputed data
imputed_knn = predict(pre_knn, sacramento_missing)
imputed_bag = predict(pre_bag, sacramento_missing)
imputed_median = predict(pre_median, sacramento_missing)

# Compare with the real values
print(Sacramento[c(1,4,5), c(1,3,4,5)])
print(imputed_knn[c(1,4,5), c(1,3,4,5)]) # knnImpute returns normalized data
print(imputed_bag[c(1,4,5), c(1,3,4,5)])
print(imputed_median[c(1,4,5), c(1,3,4,5)])

As we can see, we have been able to impute the missing values in a very simple way. So far we have covered a lot of data preprocessing with caret: variable selection, data transformation, imputation of missing data… But there is still more! With caret you can do cool things like using a PCA for dimensionality reduction. Let’s see how it works!

How to reduce dimensionality

When we work on Machine Learning problems with many variables, we often run into trouble, because the vast majority of models do not work well with many predictor variables and, when they do, they require a lot of data.

In these cases, a good option is usually to apply a dimensionality reduction method, such as principal component analysis or PCA.

Luckily, applying a PCA to our dataset in R is very easy thanks to caret. To do this, we simply have to pass the value "pca" to the method parameter of the preProcess function. Likewise, with the thresh parameter we can indicate the percentage of variance that we want to keep.

pre_pca = preProcess(Sacramento, method = "pca", thresh = 0.8)
predict(pre_pca, Sacramento)

As we can see, now the dataset has 6 columns instead of 9. Yes, I know, this is not the best example in which applying a PCA adds a lot of value, but, as we can see, we can do it and in a very simple way thanks to caret.

With this, we have already seen all the options that the caret library offers for data transformation. But the options go much, much further, especially in modeling. Let’s see what it offers.

How to create machine learning models with caret

Choose the machine learning model to use

When we want to create a Machine Learning model in R, we generally load a library that contains the algorithm that interests us. For example, if we want to use a Random Forest, we will load the randomForest package, while if we want to use AdaBoost, we will load the ada package.

And here the first problem arises: each package is different and has its own implementation. Some require that you pass a formula, others that you pass the predictors and the dependent variable separately; some handle the dummification, but others do not…

In addition, each model has its own hyperparameters, and the way to tune them changes from package to package.

Well, creating machine learning models in R with caret is very simple, since caret unifies the way of creating and optimizing the hyperparameters of 238 different models.

So, if we want to create a machine learning model with caret, the first thing is to know how that model is called within caret. We can discover this on this page. For example, there we will see that we can call the randomForest model from the randomForest library with the name rf.

Random Forest models available in caret

How to partition data in train and test

Once we have chosen our model, we will have to divide the data into train and test. To do this, caret offers a very useful function, called createDataPartition, which is used to make this partition.

The function is very simple: we just have to pass it our dependent variable and the proportion of the data that we want in the training set (generally between 0.7 and 0.8 of the total).

With this, the createDataPartition function returns the indices of the observations that must go to each partition. By default, the information is returned as a list, which I personally don’t like. Luckily, we can avoid this by specifying the list = FALSE parameter.

Let’s see how to split our data between train and test with caret:
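For example, an 80/20 split might look like this (the seed is an assumed value, used only for reproducibility):

```r
library(caret)
data(Sacramento)

set.seed(42)  # assumed seed, just for reproducibility
in_train = createDataPartition(Sacramento$price, p = 0.8, list = FALSE)

train = Sacramento[in_train, ]   # rows selected for training
test  = Sacramento[-in_train, ]  # the remaining rows for testing
```

Note that createDataPartition samples within quantiles of the outcome, so the price distribution stays balanced between the two partitions.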

cat('Train rows: ', nrow(train),"\n",
    'Test rows: ', nrow(test),
    sep="")
Train rows: 747
Test rows: 185

As we can see, we have been able to create the data partition in a super simple way in caret. Having seen this, let’s see how to train a machine learning model in R with caret.

How to train a machine learning model with caret

Once we have defined the model, we can create it very easily with the train function. We will simply have to pass the independent variables on one side and the dependent variable on the other and indicate the model in the method parameter.

# Remove the high-cardinality factors zip and city to simplify the model
Sacramento$zip = NULL
Sacramento$city = NULL

indep_var = colnames(Sacramento) != "price"
model_rf = train(x = Sacramento[indep_var], 
                 y = Sacramento$price,
                 method = 'rf'
                 )

model_rf
Random Forest 

932 samples
  6 predictor

No pre-processing
Resampling: Bootstrapped (25 reps) 
Summary of sample sizes: 932, 932, 932, 932, 932, 932, ... 
Resampling results across tuning parameters:

  mtry  RMSE      Rsquared   MAE     
  2     76275.05  0.6534992  54482.84
  4     77632.80  0.6416936  55744.72
  6     78933.34  0.6310063  56864.50

RMSE was used to select the optimal model using the
 smallest value.
The final value used for the model was mtry = 2.

As we can see, with the train function we have not only created the model (in this case a Random Forest), but also performed a small tuning of the mtry parameter (the number of variables randomly sampled as candidates at each split) and obtained the main performance metrics of the model (RMSE, R² and MAE).

But there is still more: when creating our model, we can tell caret to apply one of the data transformations we saw previously. For that, we simply have to pass the preProcess argument to the train function.

For example, suppose we are going to use the kNN algorithm, which requires that the data be normalized. Let’s see how we can process the data in the training itself:

# Keep the numeric variables
num_cols = sapply(Sacramento, is.numeric)
Sacramento_num = Sacramento[num_cols]

# Split into dependent and independent variables
indep_var = colnames(Sacramento_num) != "price"
model_knn = train(x = Sacramento_num[indep_var], 
                 y = Sacramento_num$price,
                 preProcess = "range",
                 method = 'knn'
                 )

model_knn
k-Nearest Neighbors 

932 samples
  6 predictor

Pre-processing: re-scaling to [0, 1] (6) 
Resampling: Bootstrapped (25 reps) 
Summary of sample sizes: 932, 932, 932, 932, 932, 932, ... 
Resampling results across tuning parameters:

  k  RMSE      Rsquared   MAE     
  5  39670.84  0.9157169  26896.41
  7  39718.85  0.9183391  27052.33
  9  40274.35  0.9188837  27325.56

RMSE was used to select the optimal model using the
 smallest value.
The final value used for the model was k = 5.

As we can see, the model has normalized the variables, trained kNN with different values of k, and decided that the optimal value is k = 5.

And, best of all, there is still more: caret makes tuning a model much easier using grid search. Let’s see how.

How to optimize the hyperparameters of a model with caret

As we have seen, when we build a model in caret, it directly applies a default tuning. However, we may be interested in controlling which values these hyperparameters take. Well, doing this with caret is very simple.

In order to test different parameters, we must first create our own Grid Search. When we do an optimization by Grid Search we basically test all the possible combinations of all the hyperparameters that we indicate.

For example, suppose we want to fit a Lasso regression and tune the lambda parameter, which indicates the strength of the penalty.

Important: the parameters that we can tune for each model appear in the list of available models.

To do this, we simply have to pass every value of each parameter that we want to test to the expand.grid function. Important: if there is a parameter for which we only want a single value, we have to include it as well. Let’s see how it’s done:

tunegrid = expand.grid(
  alpha  = 1,  # alpha = 1 corresponds to the Lasso penalty
  lambda = c(0, 1, 100, 200, 500, 1000, 2000, 5000, 10000, 50000)
)

model_lasso = train(x = Sacramento[indep_var], 
                 y = Sacramento$price,
                 method = "glmnet",
                 family = "gaussian",
                 tuneGrid = tunegrid
                 )
model_lasso
glmnet 

932 samples
  3 predictor

No pre-processing
Resampling: Bootstrapped (25 reps) 
Summary of sample sizes: 932, 932, 932, 932, 932, 932, ... 
Resampling results across tuning parameters:

  lambda  RMSE      Rsquared   MAE     
      0   83549.32  0.5893103  61258.44
      1   83549.32  0.5893103  61258.44
    100   83549.32  0.5893103  61258.44
    200   83549.08  0.5893124  61258.51
    500   83524.09  0.5894426  61275.17
   1000   83508.27  0.5894367  61319.27
   2000   83566.59  0.5887109  61473.51
   5000   84354.89  0.5812886  62298.71
  10000   85149.90  0.5764404  63031.99
  50000   97271.45  0.5764554  72190.38

Tuning parameter 'alpha' was held constant at a value of 1
RMSE was used to select the optimal model using the smallest value.
The final values used for the model were alpha = 1 and lambda = 1000.

As you can see, performing a Grid Search with caret is very simple. And yes, although this is already a lot, there is still more: caret also makes it easy to perform cross-validation. Let’s see how to do it.

How to do Cross Validation with caret

To perform cross validation in R with caret we simply have to call the trainControl function and pass this result to our training function.

Within the trainControl function we can specify many of the options that interest us, such as the resampling method to use or how many times to repeat it.

The most typical choice is to set the method to cv or repeatedcv, which perform cross-validation, although we can also use bootstrapping by setting the value to boot, boot632, optimism_boot or boot_all.

Also, if our data is imbalanced, we can rebalance it in different ways using the sampling parameter. The accepted values are: down for downsampling, up for upsampling, and smote or rose to apply those specific sampling algorithms.
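As a sketch of how these options combine (the numbers are illustrative, and down-sampling only makes sense for classification problems):

```r
library(caret)

# 10-fold CV repeated 3 times, downsampling the majority class
# in each resample (sampling = "down" applies to classification):
ctrl = trainControl(method = "repeatedcv",
                    number = 10, repeats = 3,
                    sampling = "down")
```

The resulting object is then passed to train via the trControl argument, exactly as in the example below.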

Let’s see how it works by applying it to the Lasso regression example we created previously:

fitControl = trainControl(method = "repeatedcv",
                          number = 10,
                          repeats = 10)

cv_model_lasso = train(x = Sacramento[indep_var], 
                 y = Sacramento$price,
                 method = 'glmnet',
                 family = 'gaussian',
                 tuneGrid = tunegrid,
                 trControl = fitControl
                 )
cv_model_lasso
glmnet 

932 samples
  3 predictor

No pre-processing
Resampling: Cross-Validated (10 fold, repeated 10 times) 
Summary of sample sizes: 839, 840, 838, 839, 839, 839, ... 
Resampling results across tuning parameters:

  lambda  RMSE      Rsquared   MAE     
      0   82431.41  0.6050358  60544.17
      1   82431.41  0.6050358  60544.17
    100   82431.41  0.6050358  60544.17
    200   82431.36  0.6050361  60544.15
    500   82428.08  0.6050931  60566.41
   1000   82445.83  0.6050029  60618.18
   2000   82539.04  0.6043956  60790.61
   5000   83361.05  0.5979379  61635.24
  10000   84335.30  0.5926398  62447.44
  50000   97474.98  0.5926398  72309.98

Tuning parameter 'alpha' was held constant at a value of 1
RMSE was used to select the optimal model using the smallest value.
The final values used for the model were alpha = 1 and lambda = 500.

As we can see, each candidate model has been evaluated with 10-fold cross-validation, repeated 10 times. And, of course, the model errors for the different lambda values have changed.

Finally, I would like to mention an important topic for training our machine learning models in R with caret: parallel training.

How to train Machine Learning models in R in parallel

When we create models, they can take time to execute, especially if we carry out a very extensive Grid Search (besides, it must be said, caret is not very fast).

Luckily, caret offers us the option of parallelizing the models, in such a way that we can make many more models in less time.

To check this, let’s see how long it takes to train a glmnet model over a large hyperparameter grid if we do not parallelize:

tic = Sys.time()

tunegrid = expand.grid(
  alpha  = seq(0,1,0.1),
  lambda = c(0,1,100,200,500,1000,2000,5000,10000,50000)
) 

fitControl = trainControl(method = "repeatedcv",
                          number = 10,
                          repeats = 10)

cv_model_lasso = train(x = Sacramento[indep_var], 
                 y = Sacramento$price,
                 method = 'glmnet',
                 family = 'gaussian',
                 tuneGrid = tunegrid,
                 trControl = fitControl
                 )

toc = Sys.time()

cat("Total time:",toc-tic)
Total time: 20.35262

As we can see, it took a little over 20 seconds to complete the entire process. But what if we parallelize it?

Parallelizing a model in R with caret is very simple, you just have to create a cluster with the doParallel library and stop the cluster once we have trained.

The cluster can be created as follows:

library(doParallel)
cl = makePSOCKcluster(5)
registerDoParallel(cl)

Now that we have created the cluster, we can run the same code as before, which will now be parallelized automatically.

tic = Sys.time()

tunegrid = expand.grid(
  alpha  = seq(0,1,0.1),
  lambda = c(0,1,100,200,500,1000,2000,5000,10000,50000)
) 

fitControl = trainControl(method = "repeatedcv",
                          number = 10,
                          repeats = 10)

cv_model_lasso_par = train(x = Sacramento[indep_var], 
                           y = Sacramento$price,
                           method = 'glmnet',
                           family = 'gaussian',
                           tuneGrid = tunegrid,
                           trControl = fitControl
                           )

toc = Sys.time()

cat("Total time:",toc-tic)
Total time: 9.741221

As we can see, now the creation of the model has taken only 9 seconds, that is, less than half the time it took without parallelizing. All with 2 very simple lines of code. And note that this applies to all 238 models that caret includes.

Finally, we have to stop the cluster, which we can do with the following function:

stopCluster(cl)

As you can see, caret offers very interesting advantages. Finally, we come to the final stretch of this post, where we will see how to make predictions with caret, as well as how to evaluate the performance of an ML model. Let’s go!

How to make predictions and measure the predictive capacity of a model with caret in R

To make predictions, we simply pass our model and new data to the predict function, like any other model in R.

# A sketch: generate predictions, for example on the first 100 rows
pred = predict(model_lasso, Sacramento[1:100, ])

head(pred)
       1        2        3        4        5        6 
141811.1 168743.7 135557.5 144312.5 135713.9 161708.4 

Likewise, caret also offers functions to calculate the predictive capacity of our models. Which one to use depends on the type of data. For numeric targets, we can use the RMSE function or the defaultSummary function, which returns the main metrics (RMSE, R² and MAE).
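Under the hood, RMSE is just the square root of the mean squared error; here is a quick base-R sanity check with hypothetical values:

```r
# RMSE computed by hand, matching what caret's RMSE() returns
rmse_manual = function(pred, obs) sqrt(mean((pred - obs)^2))

pred_example = c(100, 210, 290)  # hypothetical predictions
obs_example  = c(110, 200, 300)  # hypothetical observed values

rmse_manual(pred_example, obs_example)  # sqrt((100 + 100 + 100) / 3) = 10
```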

I personally tend to prefer the RMSE function, basically because RMSE is the metric I generally use to measure the predictive capacity of models. Also, it is easier to use than defaultSummary, since the latter requires you to build a dataframe first. Let’s see how they work:

print("Use of defaultSummary")
defaultSummary(
  data = data.frame(obs = Sacramento$price[1:100], 
                    pred = pred))

print("Use of RMSE")
RMSE(pred, Sacramento$price[1:100])
[1] "Use of defaultSummary"
        RMSE     Rsquared          MAE 
5.615570e+04 3.388848e-01 4.676571e+04 

[1] "Use of RMSE"
[1] 56155.7

Likewise, in the case of categorical variables, caret offers the confusionMatrix function, which calculates the confusion matrix, as well as all the metrics associated with it.

pred_fake = factor(round(runif(100)))
real_fake = factor(round(runif(100)))

confusionMatrix(pred_fake, real_fake)
Confusion Matrix and Statistics

          Reference
Prediction  0  1
         0 25 28
         1 21 26

               Accuracy : 0.51           
                 95% CI : (0.408, 0.6114)
    No Information Rate : 0.54           
    P-Value [Acc > NIR] : 0.7591         

                  Kappa : 0.0247         

 Mcnemar's Test P-Value : 0.3914         

            Sensitivity : 0.5435         
            Specificity : 0.4815         
         Pos Pred Value : 0.4717         
         Neg Pred Value : 0.5532         
             Prevalence : 0.4600         
         Detection Rate : 0.2500         
   Detection Prevalence : 0.5300         
      Balanced Accuracy : 0.5125         

       'Positive' Class : 0              
                                         

Although it is not a real case, we see that caret offers a lot of information with just one line of code.

Summary

In short, if you are going to do machine learning with R, caret is a package that you should know. Not only does it unify many models in the same package, but it standardizes super interesting things such as hyperparameter optimization or cross-validation. In addition, it allows you to train all the models in a super simple way.

As if that were not enough, it has several functions with which, in a very simple way, we can see how good our model has been.

In short, caret is a very good package, and I hope this post has helped you see all the potential it has. See you in the next post!