Train an xgboost classification model with k-fold cross-validation
Source: R/prediction_model.R
Trains an xgboost binary classification model using k-fold cross-validation, fitting one model per fold.
Usage
.mldpEHR.cv_train_outcome(
  target,
  features,
  folds,
  required_conditions = "id==id",
  xgboost_params = list(
    booster = "gbtree", objective = "binary:logistic", subsample = 0.7,
    max_depth = 3, colsample_bytree = 1, eta = 0.05, min_child_weight = 1,
    gamma = 0, eval_metric = "auc"
  ),
  nrounds = 1000
)
Arguments
- target
data.frame containing the patient id, sex, target_class (0/1) and fold (the number assigning each patient to a cross-validation fold)
- features
data.frame containing the patient id along with all other features to be used in the classification model
- folds
number of cross-validation folds
- required_conditions
filter expression applied to the features to exclude training/testing samples (e.g. samples with missing data)
- xgboost_params
parameters used for xgboost model training
- nrounds
number of training rounds
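The argument descriptions above can be illustrated with toy inputs. This is a minimal sketch in base R: all values are synthetic, the feature names `age` and `hemoglobin` and the coding of `sex` are hypothetical, and evaluating the condition string against the columns of `features` is an assumption about how `required_conditions` is applied, not something this page confirms.

```r
# Toy inputs shaped like the arguments described above (all values synthetic).
set.seed(1)
n <- 20
folds <- 5

# target: one row per patient with id, sex, target_class (0/1) and fold.
# The coding of `sex` here is hypothetical.
target <- data.frame(
  id = seq_len(n),
  sex = sample(c("male", "female"), n, replace = TRUE),
  target_class = rbinom(n, 1, 0.3),
  fold = sample(rep(seq_len(folds), length.out = n))  # balanced fold labels
)

# features: patient id plus predictor columns (hypothetical names).
features <- data.frame(
  id = seq_len(n),
  age = round(runif(n, 40, 80)),
  hemoglobin = c(NA, rnorm(n - 1, 13, 1.5))  # first patient has a missing value
)

# Assumption: the condition is a string evaluated against the feature columns.
# The default "id==id" then keeps every row with a non-missing id.
keep_default <- eval(parse(text = "id==id"), envir = features)

# A stricter condition that drops samples with a missing feature.
required_conditions <- "!is.na(hemoglobin)"
keep <- eval(parse(text = required_conditions), envir = features)
features_kept <- features[keep, , drop = FALSE]
```

With inputs of this shape, `target`, `features`, `folds` and `required_conditions` would be passed to the function in the order shown under Usage.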
Value
a predictor: a list with the following elements
model - list of xgboost models, one for each fold
train - data.frame containing the patient id, fold, target class and predicted value in training (each id is used for training in folds - 1 of the models)
test - data.frame containing the patient id, fold, target class and predicted value in testing (each id is tested once, in the fold where it was not used for training)
xgboost_params - the set of parameters used in xgboost
nrounds - number of training iterations conducted
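The shape of the returned list can be sketched with a plain k-fold loop. This is an illustration only, not the package's implementation: `glm()` stands in for xgboost so that the example runs in base R, the data are synthetic, and the column name `predicted` and the meaning of `fold` in `train` (the fold held out by that model) are assumptions.

```r
# Sketch of a k-fold loop; glm() stands in for xgboost (assumption, base R only).
set.seed(42)
n <- 100
k <- 5
d <- data.frame(
  id = seq_len(n),
  fold = sample(rep(seq_len(k), length.out = n)),
  x = rnorm(n)
)
d$target_class <- rbinom(n, 1, plogis(d$x))

model <- vector("list", k)
train <- vector("list", k)
test <- vector("list", k)

for (f in seq_len(k)) {
  held_out <- d$fold == f
  # Fit on the k - 1 folds that are not held out.
  model[[f]] <- glm(target_class ~ x, data = d[!held_out, ], family = binomial())
  # In-sample predictions for the rows this model was trained on.
  train[[f]] <- data.frame(
    id = d$id[!held_out],
    fold = f,  # here: the fold held out by this model
    target_class = d$target_class[!held_out],
    predicted = predict(model[[f]], d[!held_out, ], type = "response")
  )
  # Out-of-fold predictions for the held-out rows.
  test[[f]] <- data.frame(
    id = d$id[held_out],
    fold = f,
    target_class = d$target_class[held_out],
    predicted = predict(model[[f]], d[held_out, ], type = "response")
  )
}

predictor <- list(
  model = model,
  train = do.call(rbind, train),
  test = do.call(rbind, test)
)
```

In this sketch every id appears once in `predictor$test` and k - 1 times in `predictor$train`, matching the description of the train and test elements above.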