BestFeatures {tools4uplift} | R Documentation |
Fits a penalized logistic regression (LASSO) in order to select the features that maximize the Qini coefficient.
BestFeatures(data, treat, outcome, predictors, nb.lambda = 100, nb.group = 10, validation = FALSE, p = 0.3, value = FALSE)
data |
a data frame containing the treatment, the outcome and the predictors. |
treat |
name of a binary (numeric) vector representing the treatment assignment (coded as 0/1). |
outcome |
name of a binary response (numeric) vector (coded as 0/1). |
predictors |
a vector of names representing the predictors to consider in the model. |
nb.lambda |
the number of lambda values - Default is 100. |
nb.group |
the number of groups for computing the Qini coefficient - Default is 10. |
validation |
if TRUE, the best features are selected based on cross-validation - Default is FALSE. |
p |
if validation is TRUE, the desired proportion for the validation set. p is a value between 0 and 1, expressed as a decimal; it is set to be proportional to the number of observations per group - Default is 0.3. |
value |
if TRUE, the values of the best lambda and Qini coefficient will be printed - Default is FALSE. |
The regularization parameter (lambda) is chosen as the value for which the interaction uplift model maximizes the Qini coefficient. Because of the LASSO penalty, some predictors have their coefficients set to exactly zero.
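To make the selection criterion concrete, the following is a minimal base R sketch of an interaction uplift model scored with a Qini-type coefficient. It is an illustration under stated assumptions, not the package's internal code: the helper `qini_sketch`, the simulated data, and the variable names are all hypothetical, and the Qini coefficient is computed with one common definition (the area between the incremental-uplift curve and the straight line of random targeting), which may be normalized differently in tools4uplift.

```r
# Hypothetical helper (not part of tools4uplift): Qini-type coefficient
# computed from predicted uplift scores over nb.group groups.
qini_sketch <- function(y, treat, score, nb.group = 10) {
  # Rank observations from highest to lowest predicted uplift
  ord <- order(score, decreasing = TRUE)
  y <- y[ord]
  treat <- treat[ord]
  n <- length(y)
  # Assign the ranked observations to nb.group groups of near-equal size
  group <- ceiling(seq_len(n) * nb.group / n)
  n_t_total <- sum(treat == 1)
  inc <- numeric(nb.group)
  for (k in seq_len(nb.group)) {
    top <- group <= k
    n_t <- sum(treat[top] == 1)
    n_c <- sum(treat[top] == 0)
    y_t <- sum(y[top][treat[top] == 1])
    y_c <- sum(y[top][treat[top] == 0])
    # Incremental positive responses among the top k groups, with the
    # control count rescaled to the treated group size
    inc[k] <- if (n_c > 0) (y_t - y_c * n_t / n_c) / n_t_total else y_t / n_t_total
  }
  x <- c(0, seq_len(nb.group) / nb.group)
  g <- c(0, inc)
  # Difference between the curve and the random-targeting line
  d <- g - x * inc[nb.group]
  # Trapezoidal area of that difference
  sum(diff(x) * (d[-length(d)] + d[-1]) / 2)
}

# Simulated data (an assumption for illustration only)
set.seed(1)
n <- 2000
x1 <- rnorm(n)
x2 <- rnorm(n)
treat <- rbinom(n, 1, 0.5)
# x1 modifies the treatment effect; x2 is pure noise
y <- rbinom(n, 1, plogis(-1 + 0.5 * x1 + treat * (0.3 + 0.8 * x1)))
dat <- data.frame(y, treat, x1, x2)

# Interaction uplift model: main effects plus treatment-by-predictor interactions
fit <- glm(y ~ treat * (x1 + x2), family = binomial, data = dat)

# Predicted uplift = P(y = 1 | x, treat = 1) - P(y = 1 | x, treat = 0)
p1 <- predict(fit, newdata = transform(dat, treat = 1), type = "response")
p0 <- predict(fit, newdata = transform(dat, treat = 0), type = "response")
uplift <- p1 - p0

q_sim <- qini_sketch(dat$y, dat$treat, uplift, nb.group = 10)
q_sim
```

In this sketch the model is fitted without a penalty. A LASSO-based selection would presumably repeat the scoring step for the model associated with each value of lambda and keep the one with the largest Qini coefficient.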
a vector of names representing the selected best features from the penalized logistic regression.
Mouloud Belbahri
Belbahri, M., Murua, A., Gandouet, O., and Partovi Nia, V. (2019) Uplift Regression, <https://dms.umontreal.ca/~murua/research/UpliftRegression.pdf>
library(tools4uplift)
data("SimUplift")
features <- BestFeatures(data = SimUplift, treat = "treat", outcome = "y",
                         predictors = colnames(SimUplift[, 3:7]))
features