One-hot encodes categorical variables and replaces missing values. It is not needed when building a standard scorecard model, but it is required for models that do not apply WOE transformation.

one_hot(dt, var_skip = NULL, var_encode = NULL, nacol_rm = FALSE, ...)



A data frame.


Names of categorical variables to skip when one-hot encoding. Defaults to NULL.


Names of categorical variables to be one-hot encoded. Defaults to NULL, in which case all categorical variables except those in var_skip are encoded.


Logical. When one-hot encoding a categorical variable that contains missing values, whether to remove the column generated to indicate the presence of NAs. Defaults to FALSE.


Additional parameters.
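To make the effect of nacol_rm concrete, the following is a minimal base R sketch of one-hot encoding a single categorical vector. The helper `one_hot_sketch` and its `prefix` argument are hypothetical names introduced for illustration only; this is not the package's implementation, and it assumes the NA indicator column is only produced when the variable actually contains missing values.

```r
# Hypothetical helper for illustration only, NOT the scorecard implementation:
# one-hot encode a single character/factor vector using base R.
one_hot_sketch = function(x, prefix, nacol_rm = FALSE) {
  x = as.character(x)
  lv = sort(unique(x[!is.na(x)]))
  # one 0/1 integer column per observed level; rows with NA get 0 everywhere
  out = lapply(lv, function(l) as.integer(!is.na(x) & x == l))
  names(out) = paste(prefix, lv, sep = "_")
  out = as.data.frame(out, optional = TRUE)
  # assumption: the NA indicator column is added only if x contains NAs
  if (!nacol_rm && anyNA(x)) {
    na_col = setNames(data.frame(as.integer(is.na(x))), paste0(prefix, "_NA"))
    out = cbind(na_col, out)
  }
  out
}

telephone = c("yes", "none", NA, "none")
keep_na = one_hot_sketch(telephone, "telephone", nacol_rm = FALSE)
drop_na = one_hot_sketch(telephone, "telephone", nacol_rm = TRUE)
names(keep_na)  # "telephone_NA" "telephone_none" "telephone_yes"
names(drop_na)  # "telephone_none" "telephone_yes"
```

With nacol_rm = TRUE the rows that were missing are still present; they simply carry 0 in every level column, so the information that the value was missing is no longer recoverable from the encoded columns alone.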


A data frame.


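# load germancredit data
library(scorecard)
library(data.table)
data("germancredit")

# append 10 rows with NA in every column except creditability (1000 + 10 = 1010 obs.)
dat = rbind(
  setDT(germancredit)[, c(sample(20,3),21)],
  data.table(creditability = sample(c("good", "bad"), 10, replace = TRUE)),
  fill = TRUE)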

# one hot encoding
## keep na columns from categorical variable
dat_onehot1 = one_hot(dat, var_skip = 'creditability', nacol_rm = FALSE) # default
str(dat_onehot1)
#> Classes ‘data.table’ and 'data.frame':	1010 obs. of  6 variables:
#>  $ credit.amount                                     : num  1169 5951 2096 7882 4870 ...
#>  $           : num  2 1 1 1 2 1 1 1 1 2 ...
#>  $ creditability                                     : Factor w/ 2 levels "bad","good": 2 1 2 2 1 2 2 2 2 1 ...
#>  $ telephone_NA                                      : int  0 0 0 0 0 0 0 0 0 0 ...
#>  $ telephone_none                                    : int  0 1 1 1 1 0 1 0 1 1 ...
#>  $ telephone_yes, registered under the customers name: int  1 0 0 0 0 1 0 1 0 0 ...
#>  - attr(*, ".internal.selfref")=<externalptr> 
## remove na columns from categorical variable
dat_onehot2 = one_hot(dat, var_skip = 'creditability', nacol_rm = TRUE)
str(dat_onehot2)
#> Classes ‘data.table’ and 'data.frame':	1010 obs. of  5 variables:
#>  $ credit.amount                                     : num  1169 5951 2096 7882 4870 ...
#>  $           : num  2 1 1 1 2 1 1 1 1 2 ...
#>  $ creditability                                     : Factor w/ 2 levels "bad","good": 2 1 2 2 1 2 2 2 2 1 ...
#>  $ telephone_none                                    : int  0 1 1 1 1 0 1 0 1 1 ...
#>  $ telephone_yes, registered under the customers name: int  1 0 0 0 0 1 0 1 0 0 ...
#>  - attr(*, ".internal.selfref")=<externalptr>