Cross Validation — perf_cv • scorecard

perf_cv performs cross-validation on logistic regression and other binomial classification models.

perf_cv(dt, y, x = NULL, no_folds = 5, seeds = NULL,
  binomial_metric = "ks", positive = "bad|1", breaks_list = NULL, ...)

Arguments

dt: A data frame with both x (predictor/feature) and y (response/label) variables.
y: Name of y variable.
x: Names of x variables. Defaults to NULL. If x is NULL, then all columns except y are treated as x variables.
no_folds: Number of folds for K-fold cross-validation. Defaults to 5.
seeds: Seeds used to create multiple random splits of the input dataset into training and validation data with the split_df function. Defaults to NULL.
binomial_metric: Metric or metrics used to evaluate the binomial model, such as "ks" or c("ks", "auc"). Defaults to "ks".
positive: Value of positive class, defaults to "bad|1".
breaks_list: List of break points. Defaults to NULL. If NULL, the model is fitted on the original values of the input data; otherwise the data are converted into WOE values based on the training data.
...: Additional parameters.

Value

A list of data frames containing the binomial metrics for each dataset.

Examples

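if (FALSE) {
library(scorecard)
data("germancredit")

# filter variables, build WOE bins, then convert the data to WOE values
dt = var_filter(germancredit, y = 'creditability')
bins = woebin(dt, y = 'creditability')
dt_woe = woebin_ply(dt, bins)

# 5-fold cross-validation with the default metric (ks)
perf1 = perf_cv(dt_woe, y = 'creditability', no_folds = 5)

# supply 10 randomly drawn seeds for repeated random splits
perf2 = perf_cv(dt_woe, y = 'creditability', no_folds = 5,
   seeds = sample(1000, 10))

# report both ks and auc
perf3 = perf_cv(dt_woe, y = 'creditability', no_folds = 5,
   binomial_metric = c('ks', 'auc'))

}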
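
To make the idea behind no_folds and the "ks" metric concrete, here is a minimal base-R sketch of K-fold cross-validation for a logistic regression, scored with the KS statistic on each held-out fold. It is an illustration under stated assumptions, not perf_cv's actual implementation: the simulated data and the names dat, ks_stat, fold_id and cv_ks are hypothetical and do not come from the scorecard package.

```r
# Illustrative sketch only; not how perf_cv is implemented internally.
set.seed(42)

# Simulated data (hypothetical): one predictor x and a binary response y
n   <- 500
x   <- rnorm(n)
y   <- rbinom(n, size = 1, prob = plogis(-0.5 + 1.2 * x))
dat <- data.frame(x = x, y = y)

# KS statistic: largest gap between the empirical CDFs of the predicted
# scores for the positive class and for the negative class
ks_stat <- function(score, label) {
  pos  <- ecdf(score[label == 1])
  neg  <- ecdf(score[label == 0])
  grid <- sort(unique(score))
  max(abs(pos(grid) - neg(grid)))
}

# Randomly assign every row to one of no_folds folds of near-equal size
no_folds <- 5
fold_id  <- sample(rep(seq_len(no_folds), length.out = n))

# For each fold: fit on the remaining folds, score the held-out fold
cv_ks <- sapply(seq_len(no_folds), function(k) {
  train <- dat[fold_id != k, ]
  valid <- dat[fold_id == k, ]
  fit   <- glm(y ~ x, data = train, family = binomial())
  pred  <- predict(fit, newdata = valid, type = "response")
  ks_stat(pred, valid$y)
})

# One KS value per fold
print(round(cv_ks, 3))
```

Each element of cv_ks is the KS value for one held-out fold; the spread across folds gives a rough sense of how stable the model's discrimination is on unseen data.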