Analyze feature significance for a k-fold cross-validation model using Shapley values

Usage

mldpEHR.prediction_model_features(predictor)

Arguments

predictor

classification model trained by mldpEHR.cv_train_outcome

Value

A list with 3 objects:

summary - data frame containing, for each feature, the mean absolute Shapley value of that feature across all training data
shap_by_patient - data frame containing, for each patient and feature, the feature value and the mean SHAP value across all training folds
shap_by_fold - similar to shap_by_patient, but reported for each fold separately
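The relationship between the three objects can be sketched in base R on toy data. This is only an illustration under stated assumptions: the column names (`id`, `feature`, `fold`, `shap`), the toy values, and the choice to compute the summary from the per-patient means are hypothetical and are not taken from the package source. The feature-value column is omitted for brevity.

```r
# Toy per-fold SHAP table: 3 patients x 2 features x 2 folds.
# Column names and values are assumptions made for this illustration.
shap_by_fold <- expand.grid(
    id = 1:3,
    feature = c("a", "b"),
    fold = 1:2,
    stringsAsFactors = FALSE
)
shap_by_fold$shap <- c(
    0.4, -0.2, 0.1, -0.3, 0.5, 0.0, # fold 1
    0.6, -0.4, 0.3, -0.1, 0.3, 0.2  # fold 2
)

# Per-patient table: average the SHAP value over folds
# for each (patient, feature) pair.
shap_by_patient <- aggregate(shap ~ id + feature, data = shap_by_fold, FUN = mean)

# Summary table: mean absolute SHAP value per feature.
shap_by_patient$abs_shap <- abs(shap_by_patient$shap)
shap_summary <- aggregate(abs_shap ~ feature, data = shap_by_patient, FUN = mean)

# Rank features from most to least influential.
shap_summary <- shap_summary[order(-shap_summary$abs_shap), ]
print(shap_summary)
```

Sorting the summary by mean absolute SHAP value gives a simple feature ranking; the per-patient table keeps the sign, so it can show the direction of each feature's contribution.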

Examples

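# Synthetic cohort: N patients, all aged 80, alternating sex codes;
# death is NA for the first 20% of patients and 82 for the rest.
N <- 100
patients <- list(data.frame(
    id = 1:N,
    sex = rep(c(1, 2), N / 2),
    age = 80,
    death = c(rep(NA, 0.2 * N), rep(82, 0.8 * N)),
    followup = 5
))
# Two random numeric features per patient, keyed by the same ids.
features <- list(data.frame(
    id = 1:N,
    a = c(rnorm(0.2 * N), rnorm(0.8 * N, mean = 2, sd = 1)),
    b = rep(c(rnorm(N / 4), rnorm(N / 4, mean = 3)), 2)
))
# Train the predictors; the first element of the result is analyzed below.
predictor <- mldpEHR.mortality_multi_age_predictors(patients, features, 5, 3, q_thresh = 0.05)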
#>   Training [=============================] 3/3 (100%) in  1s
predictor_features <- mldpEHR.prediction_model_features(predictor[[1]])
#> Error in pivot_longer(., !id, names_to = "feature", values_to = "shap"): could not find function "pivot_longer"
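The error above indicates that `pivot_longer`, a function from the tidyr package, was not visible when the example ran. Attaching tidyr with `library(tidyr)` before the call may resolve it, but that is an assumption and has not been verified here. The following sketch only reproduces the reshaping step named in the error message on toy data. The `shap_wide` table and its values are hypothetical, and a base R fallback is included for the case where tidyr is not installed.

```r
# Toy wide table: one row per patient, one SHAP column per feature.
# The name shap_wide and all values are assumptions for illustration.
shap_wide <- data.frame(
    id = 1:3,
    a = c(0.5, -0.2, 0.1),
    b = c(-0.3, 0.4, 0.0)
)

if (requireNamespace("tidyr", quietly = TRUE)) {
    # Same call shape as in the error message, namespace-qualified so that
    # it works even when tidyr is not attached.
    shap_long <- as.data.frame(
        tidyr::pivot_longer(shap_wide, !id, names_to = "feature", values_to = "shap")
    )
} else {
    # Base R fallback producing the same columns (row order may differ).
    shap_long <- data.frame(
        id = rep(shap_wide$id, times = 2),
        feature = rep(c("a", "b"), each = nrow(shap_wide)),
        shap = c(shap_wide$a, shap_wide$b)
    )
}

print(shap_long)
```

The result has one row per patient and feature, which is the long layout that the shap_by_patient and shap_by_fold descriptions imply.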