model_tests.FEAT.SHAPFeatureImportance

SHAPFeatureImportance Objects

@dataclass
class SHAPFeatureImportance(ModelTest)

Tests whether subgroups of the protected attributes rank among the most important variables by Shapley (SHAP) feature importance.

To pass, no subgroup may fall within the top n most important variables.

The test also stores a dataframe showing the results for each group.

Arguments:

attrs - List of protected attributes.
threshold - Number of top-ranked variables (n) to check. To pass, no subgroup may fall within the top n most important variables.
test_name - Name of the test, default is 'Shapely Feature Importance Test'.
test_desc - Description of the test. If none is provided, an automatic description will be generated based on the rest of the arguments passed in.
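The pass condition described above can be illustrated with a minimal, library-independent sketch. The function `passes_top_n_rule`, the feature names, and the importance values are all hypothetical and chosen for illustration; the class may compute and compare rankings differently.

```python
def passes_top_n_rule(importance, protected_columns, threshold):
    """Return True if no protected column is among the `threshold` most important features."""
    # Sort feature names by importance, largest first.
    ranked = sorted(importance, key=importance.get, reverse=True)
    top_n = set(ranked[:threshold])
    return top_n.isdisjoint(protected_columns)


# Hypothetical global importance scores (e.g. mean absolute SHAP value per encoded feature).
importance = {
    "income": 0.42,
    "loan_amount": 0.31,
    "gender_female": 0.12,
    "tenure": 0.10,
    "age_over_60": 0.05,
}
# Hypothetical encoded columns that represent protected subgroups.
protected = ["gender_female", "age_over_60"]

print(passes_top_n_rule(importance, protected, threshold=2))  # True
print(passes_top_n_rule(importance, protected, threshold=3))  # False
```

With a threshold of 2 the top features are `income` and `loan_amount`, so the check passes; raising the threshold to 3 pulls `gender_female` into the top n and the check fails.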

 | get_shap_values(model, model_type, x_train_encoded, x_test_encoded) -> list

Get SHAP values for a set of test samples.

Arguments:

model - Trained model object.
model_type - Type of model algorithm, either 'trees' or 'others'.
x_train_encoded - Training data features. Categorical features must be encoded.
x_test_encoded - Test data to be used for Shapley explanations. Categorical features must be encoded.
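For readers unfamiliar with what a SHAP value represents, the sketch below computes exact Shapley values for one sample of a toy linear model by averaging marginal contributions over every feature ordering. It is not how this method obtains its values: the helper `exact_shapley`, the toy weights, and the data are assumptions made for illustration, and exhaustive enumeration is only feasible for a handful of features.

```python
from itertools import permutations
from statistics import mean


def exact_shapley(predict, background, sample):
    """Exact Shapley values for one sample, averaged over all feature orderings."""
    n_features = len(sample)

    def value(coalition):
        # Expected prediction when features in `coalition` take the sample's values
        # and every other feature takes the values of each background row in turn.
        return mean(
            predict([sample[j] if j in coalition else row[j] for j in range(n_features)])
            for row in background
        )

    phi = [0.0] * n_features
    orderings = list(permutations(range(n_features)))
    for order in orderings:
        coalition = set()
        previous = value(coalition)
        for j in order:
            coalition.add(j)
            current = value(coalition)
            phi[j] += current - previous  # marginal contribution of feature j
            previous = current
    return [p / len(orderings) for p in phi]


# Toy linear "model" with hypothetical weights.
weights = [2.0, -1.0, 0.5]


def predict(x):
    return sum(w * v for w, v in zip(weights, x))


background = [[0.0, 0.0, 0.0], [2.0, 4.0, 2.0]]  # plays the role of training data
sample = [3.0, 2.0, 5.0]                          # plays the role of one test row

shap_values = exact_shapley(predict, background, sample)
print(shap_values)  # [4.0, 0.0, 2.0]
```

For a linear model each value reduces to the weight times the feature's deviation from its background mean, and the values sum to the gap between the sample's prediction and the average background prediction (here 6.5 - 0.5 = 6.0).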

 | shap_summary_plot(x_test_encoded, save_plots: bool = True)

Make a SHAP summary plot.

Arguments:

x_test_encoded - Data to be used for Shapley explanations. Categorical features must be encoded.
save_plots - If True, saves the plots to the class instance.

 | get_result(model, model_type: str, x_train_encoded: pd.DataFrame, x_test_encoded: pd.DataFrame) -> pd.DataFrame

Output a dataframe containing the test results of the protected attributes.

Arguments:

model - Trained model object.
model_type - Type of model algorithm, either 'trees' or 'others'.
x_train_encoded - Training data features. Categorical features must be encoded.
x_test_encoded - Test data to be used for Shapley explanations. Categorical features must be encoded.
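As a rough picture of what a per-subgroup result table could contain, the sketch below aggregates per-sample SHAP values into a global importance score, ranks the columns, and reports each protected subgroup's rank. Everything here is an assumption for illustration: the helper `subgroup_results`, the output keys, the use of mean absolute SHAP value as the importance score, and the `<attr>_<subgroup>` naming of one-hot columns are not taken from the library, and plain dictionaries stand in for dataframe rows.

```python
def subgroup_results(shap_values, columns, attrs, threshold):
    """Rank columns by mean absolute SHAP value and report protected subgroup columns."""
    n_rows = len(shap_values)
    # Global importance: mean absolute SHAP value per column (an assumed aggregation).
    importance = {
        col: sum(abs(row[i]) for row in shap_values) / n_rows
        for i, col in enumerate(columns)
    }
    ranked = sorted(columns, key=importance.get, reverse=True)
    rank = {col: position for position, col in enumerate(ranked, start=1)}

    rows = []
    for col in columns:
        # ASSUMPTION: one-hot encoded subgroup columns are named "<attr>_<subgroup>".
        if any(col.startswith(attr + "_") for attr in attrs):
            rows.append({
                "feature": col,
                "rank": rank[col],
                "passed": rank[col] > threshold,  # fails if inside the top n
            })
    return rows


# Hypothetical encoded columns and per-sample SHAP values (one inner list per test row).
columns = ["income", "gender_female", "gender_male", "tenure"]
shap_values = [
    [0.50, -0.20, 0.05, 0.10],
    [-0.30, 0.40, -0.05, 0.10],
]

for row in subgroup_results(shap_values, columns, attrs=["gender"], threshold=2):
    print(row)
```

In this toy data `gender_female` ranks second and so fails a threshold of 2, while `gender_male` ranks last and passes.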

 | shap_dependence_plot(x_test_encoded, show_all: bool = True, save_plots: bool = True)

Create a SHAP partial dependence plot showing the effect of each individual subgroup on the Shapley values.

Arguments:

x_test_encoded - Test data to be used for Shapley explanations. Categorical features must be encoded.
show_all - If False, only show subgroups that failed the test.
save_plots - If True, saves the plots to the class instance.

 | run(model, model_type: Literal["trees", "others"], x_train_encoded: pd.DataFrame, x_test_encoded: pd.DataFrame) -> bool

Runs the test by calculating the result and evaluating whether it passes the defined condition.

Arguments:

model - Trained model object.
model_type - Type of model algorithm, either 'trees' or 'others'.
x_train_encoded - Training data features. Categorical features must be encoded.
x_test_encoded - Test data to be used for Shapley explanations. Categorical features must be encoded.