Compare Classification Models

This function compares 13 different classification models using the caret package.

Usage

model_comparer(
  data,
  target_var,
  train_prop = 0.8,
  seed = 3456,
  for_utest = FALSE
)

Format

The two required arguments are data and target_var. For the Pima Indians Diabetes data used in the example:

data: contains 768 rows (observations) and 9 columns (8 predictors plus the outcome).
target_var: the column holding the binary outcome, where 1 indicates a patient with diabetes and 0 otherwise.
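
The outcome is described here with 0/1 coding, while the mlbench copy of the data used in the Examples stores it as a factor with levels neg and pos. caret generally treats a numeric outcome as a regression target, so a 0/1 column is often converted to a factor before classification. The sketch below shows one way to do that on toy values; whether model_comparer performs this conversion internally is not stated on this page, and the variable names are illustrative only.

```r
# Toy 0/1 outcome standing in for a real target column (not the actual data)
outcome <- c(0, 1, 1, 0)

# Convert to a two-level factor; the labels mirror the mlbench coding
diabetes <- factor(outcome, levels = c(0, 1), labels = c("neg", "pos"))

levels(diabetes)
table(outcome, diabetes)   # cross-check that 0 maps to neg and 1 to pos
```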

Source

https://www.kaggle.com/uciml/pima-indians-diabetes-database

Arguments

data: A data frame containing the dataset to be used for modeling.
target_var: The name of the target variable in the dataset.
train_prop: The proportion of data to be used for training (default is 0.8).
seed: The random seed for reproducibility (default is 3456).
for_utest: Set to TRUE only when running unit tests (default is FALSE).
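
As a rough illustration of what train_prop and seed control, the sketch below performs a seeded random split in base R. It is not taken from the model_comparer source: the helper name split_train_test is hypothetical, and the function itself may instead rely on a caret helper such as createDataPartition, which stratifies the split by the outcome.

```r
# Hypothetical helper, not part of the package: a plain seeded train/test split
split_train_test <- function(data, train_prop = 0.8, seed = 3456) {
  stopifnot(is.data.frame(data), train_prop > 0, train_prop < 1)
  set.seed(seed)                               # same seed gives the same partition
  n_train <- floor(train_prop * nrow(data))    # number of rows kept for training
  train_idx <- sample(seq_len(nrow(data)), size = n_train)
  list(
    train = data[train_idx, , drop = FALSE],
    test  = data[-train_idx, , drop = FALSE]
  )
}

# Toy data standing in for a real dataset
toy <- data.frame(x = rnorm(100), y = factor(rep(c("neg", "pos"), 50)))
parts <- split_train_test(toy)
nrow(parts$train)
nrow(parts$test)
```

Calling the helper twice with the same seed returns the same rows, which is the sense in which the seed argument makes a run reproducible.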

Value

A list containing the trained models and their performance metrics.

Details

The data set used in the example is originally from the National Institute of Diabetes and Digestive and Kidney Diseases.

References

Smith, J.W., Everhart, J.E., Dickson, W.C., Knowler, W.C., & Johannes, R.S. (1988). Using the ADAP learning algorithm to forecast the onset of diabetes mellitus. In Proceedings of the Symposium on Computer Applications and Medical Care (pp. 261–265). IEEE Computer Society Press.

Examples

library(mlbench)
# for_utest is an argument of model_comparer(), not of data()
data("PimaIndiansDiabetes", package = "mlbench")
# results <- model_comparer(PimaIndiansDiabetes, "diabetes")