Overview
TCMC facilitates the comparison of multiple classification models using the caret package. It provides a streamlined way to train, evaluate, and compare 13 classification algorithms, giving users a framework for model selection in machine learning projects.
What this tool does
- Trains and compares 13 different classification models
- Utilizes repeated cross-validation for robust performance estimation
- Provides variable importance plots for each model
- Generates confusion matrices for model evaluation
- Supports customizable training/test split ratios
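The workflow listed above can be sketched directly with caret. This is a minimal illustration, not TCMC's own interface: the dataset (two iris species), the 80/20 split, the 10-fold, 3-repeat cross-validation settings, and the two models chosen are all assumptions made for the example.

```r
library(caret)

# Binary toy problem: keep two of the three iris species (illustrative data).
dat <- droplevels(subset(iris, Species != "setosa"))

# Stratified train/test split; p = 0.8 is an arbitrary choice for this sketch.
set.seed(42)
idx <- createDataPartition(dat$Species, p = 0.8, list = FALSE)
train_set <- dat[idx, ]
test_set  <- dat[-idx, ]

# Repeated cross-validation: 10 folds, repeated 3 times (assumed settings).
ctrl <- trainControl(method = "repeatedcv", number = 10, repeats = 3)

# Reset the seed before each call so both models see the same resamples.
set.seed(42)
fit_lda <- train(Species ~ ., data = train_set, method = "lda", trControl = ctrl)
set.seed(42)
fit_rpart <- train(Species ~ ., data = train_set, method = "rpart", trControl = ctrl)

# Compare resampled accuracy, then evaluate one model on the held-out set.
res <- resamples(list(LDA = fit_lda, RPART = fit_rpart))
print(summary(res))
print(confusionMatrix(predict(fit_lda, test_set), test_set$Species))
```

Resetting the seed before each `train()` call keeps the fold assignments identical across models, which `resamples()` needs for a like-for-like comparison.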
Installation instructions
Get the latest stable R release from CRAN. Then install TCMC from GitHub using the devtools package:
devtools::install_github("danymukesha/TCMC")
In the near future, TCMC may also be available on Bioconductor, in which case it could be installed as follows:
if (!requireNamespace("BiocManager", quietly = TRUE)) {
    install.packages("BiocManager")
}
BiocManager::install("TCMC")
Models included
TCMC includes the following classification algorithms:
- Learning Vector Quantization (LVQ)
- Gradient Boosting Machine (GBM)
- Support Vector Machine with Radial Basis Function Kernel (SVM-RBF)
- Generalized Linear Model (GLM)
- Bagged CART (Tree Bag)
- Random Forest (RF)
- C5.0
- Linear Discriminant Analysis (LDA)
- Elastic Net (glmnet)
- k-Nearest Neighbors (KNN)
- Recursive Partitioning and Regression Trees (rpart)
- Naive Bayes (NB)
- Extreme Gradient Boosting (XGBoost)
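For orientation, the algorithms above correspond to method strings accepted by `caret::train()`. The mapping below uses caret's standard method names, but which strings TCMC passes internally is an assumption (for example, Naive Bayes could be `"nb"` or `"naive_bayes"`).

```r
# Assumed mapping from the algorithms listed above to caret method strings.
# These are valid caret method names; TCMC's internal choices are not confirmed.
caret_methods <- c(
  "Learning Vector Quantization" = "lvq",
  "Gradient Boosting Machine"    = "gbm",
  "SVM (radial basis kernel)"    = "svmRadial",
  "Generalized Linear Model"     = "glm",
  "Bagged CART"                  = "treebag",
  "Random Forest"                = "rf",
  "C5.0"                         = "C5.0",
  "Linear Discriminant Analysis" = "lda",
  "Elastic Net"                  = "glmnet",
  "k-Nearest Neighbors"          = "knn",
  "Recursive Partitioning"       = "rpart",
  "Naive Bayes"                  = "nb",
  "Extreme Gradient Boosting"    = "xgbTree"
)

print(caret_methods)
```

Most of these methods depend on additional packages, which caret offers to install the first time a method is used.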
Performance metrics
The package uses accuracy as the primary metric for model comparison. However, it also provides confusion matrices for each model, allowing for the calculation of additional metrics such as sensitivity, specificity, and F1 score.
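As an illustration of how these additional metrics follow from a binary confusion matrix, here is a small base R sketch. The helper name `binary_metrics` and the counts are hypothetical, chosen only for the example; they are not TCMC output.

```r
# Hypothetical helper: derive common metrics from binary confusion-matrix counts.
binary_metrics <- function(tp, fn, fp, tn) {
  sensitivity <- tp / (tp + fn)   # true positive rate (recall)
  specificity <- tn / (tn + fp)   # true negative rate
  precision   <- tp / (tp + fp)   # positive predictive value
  f1 <- 2 * precision * sensitivity / (precision + sensitivity)
  accuracy <- (tp + tn) / (tp + fn + fp + tn)
  c(accuracy = accuracy, sensitivity = sensitivity,
    specificity = specificity, precision = precision, f1 = f1)
}

# Made-up counts: 40 true positives, 10 false negatives,
# 5 false positives, 45 true negatives.
m <- binary_metrics(tp = 40, fn = 10, fp = 5, tn = 45)
print(round(m, 3))
```

With these counts, accuracy is 0.85, sensitivity 0.8, specificity 0.9, and F1 is 80/95, roughly 0.842. Accuracy alone would hide the gap between sensitivity and specificity.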
Variable importance
TCMC generates variable importance plots for each model, offering insights into feature relevance across different algorithms.
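In caret, variable importance for a fitted model is available through `varImp()`. The sketch below shows that underlying call for a single model; the dataset, the choice of `rpart`, and the 5-fold cross-validation are assumptions for the example, and TCMC's own plotting function may look different.

```r
library(caret)

# Binary toy problem (illustrative data, not from TCMC).
dat <- droplevels(subset(iris, Species != "setosa"))

# Fit a single model; rpart and 5-fold CV are arbitrary choices here.
set.seed(1)
fit <- train(Species ~ ., data = dat, method = "rpart",
             trControl = trainControl(method = "cv", number = 5))

# Importance scores are scaled to 0-100 by default.
imp <- varImp(fit)
print(imp)

# Dot plot of the importance scores.
print(plot(imp))
```

Importance is computed differently for each algorithm, so rankings are best compared within a model instead of across models.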
Limitations and future work
- Currently limited to binary classification problems
- Future versions will include support for multiclass classification and regression tasks (possibly in a separate package)
- Plans to incorporate more advanced hyperparameter tuning methods
Citation
Below is the citation output from using citation('TCMC') in R. If you use TCMC in your research, please cite it as follows:
print(citation("TCMC"), bibtex = TRUE)
#> To cite package 'TCMC' in publications use:
#>
#> Mukesha D (2024). _TCMC: Compare Classification Models_. R package
#> version 0.99.0.
#>
#> A BibTeX entry for LaTeX users is
#>
#> @Manual{,
#> title = {TCMC: Compare Classification Models},
#> author = {Dany Mukesha},
#> year = {2024},
#> note = {R package version 0.99.0},
#> }