ballet.validation.gfssf module

class ballet.validation.gfssf.GFSSFIterationInfo(i, n_samples, candidate_feature, candidate_cols, candidate_cmi, omitted_feature, omitted_cols, omitted_cmi, statistic, threshold, delta)

Bases: object

candidate_cmi: float
candidate_cols: int
candidate_feature: ballet.feature.Feature
delta: float
i: int
n_samples: int
omitted_cmi: float
omitted_cols: int
omitted_feature: ballet.feature.Feature
statistic: float
threshold: float
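
The fields listed above can be pictured as a plain record. The following is a minimal sketch, not ballet's own class: `IterationInfoSketch` is a hypothetical stand-in that mirrors the documented field names and types, the `Feature` fields are typed as `Any` so the block runs without ballet installed, and every value passed in is an arbitrary placeholder.

```python
from dataclasses import dataclass, fields
from typing import Any


@dataclass
class IterationInfoSketch:
    """Hypothetical stand-in mirroring the documented fields; not ballet's class."""

    i: int
    n_samples: int
    candidate_feature: Any  # documented type: ballet.feature.Feature
    candidate_cols: int
    candidate_cmi: float
    omitted_feature: Any  # documented type: ballet.feature.Feature
    omitted_cols: int
    omitted_cmi: float
    statistic: float
    threshold: float
    delta: float


# Arbitrary placeholder values, chosen only to show the shape of the record.
info = IterationInfoSketch(
    i=0,
    n_samples=100,
    candidate_feature="candidate",
    candidate_cols=2,
    candidate_cmi=0.30,
    omitted_feature="omitted",
    omitted_cols=1,
    omitted_cmi=0.10,
    statistic=0.20,
    threshold=0.05,
    delta=0.0,
)

# Field order follows the constructor signature shown for GFSSFIterationInfo.
field_names = [f.name for f in fields(info)]
```

The field order in the sketch follows the constructor signature rather than the alphabetical attribute listing.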

class ballet.validation.gfssf.GFSSFPerformanceEvaluator(*args, lmbda_1=0.0, lmbda_2=0.0, lambda_1_adjustment=64, lambda_2_adjustment=64)

Bases: ballet.validation.base.FeaturePerformanceEvaluator

A feature performance evaluator that uses a modified version of GFSSF [1].

lmbda_1

GFSSF parameter used to calculate the information threshold. Default is a function of the entropy of y.

lmbda_2

GFSSF parameter used to calculate the information threshold. Default is a function of the entropy of y.

lambda_1_adjustment

Adjustment to estimated entropy used to calculate lmbda_1.

lambda_2_adjustment

Adjustment to estimated entropy used to calculate lmbda_2.
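
The parameter descriptions above say only that the default lambdas are a function of the entropy of y and that an adjustment is applied to the estimated entropy. As a rough illustration, the sketch below assumes the adjustment is a divisor and uses a plug-in entropy estimate over discrete labels. Both choices are assumptions made for this example, and `shannon_entropy` and `default_lambda_sketch` are hypothetical helpers, not part of ballet.

```python
import math
from collections import Counter


def shannon_entropy(labels):
    """Plug-in Shannon entropy (in nats) of a sequence of discrete labels."""
    counts = Counter(labels)
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in counts.values())


def default_lambda_sketch(labels, adjustment=64):
    """Hypothetical default: estimated entropy divided by the adjustment.

    ASSUMPTION: division is used purely for illustration; the source does not
    state how the adjustment is applied.
    """
    return shannon_entropy(labels) / adjustment


# A balanced binary target has entropy ln(2), so the sketch yields ln(2) / 64.
y = [0, 1, 0, 1]
lmbda = default_lambda_sketch(y)
```

Under this assumption, a larger adjustment gives a smaller lambda and therefore a lower information threshold.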

References

[1] H. Li, X. Wu, Z. Li and W. Ding, “Group Feature Selection with Streaming Features,” 2013 IEEE 13th International Conference on Data Mining, Dallas, TX, 2013, pp. 1109-1114. doi: 10.1109/ICDM.2013.137