ballet.encoder module

class ballet.encoder.EncoderPipeline(*args, can_skip_transform_none=False, **kwargs)[source]

Bases: sklearn_pandas.pipeline.TransformerPipeline

Pipeline of target encoder steps

This wraps sklearn.pipeline.Pipeline. Each step receives a single argument, y, in its fit and transform methods. This is needed because some consumers, such as MLBlocks, pass arguments by keyword, so the argument must be named y rather than X.
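To illustrate why the argument name matters, the following is a minimal, self-contained sketch and not ballet's implementation. The names FeatureStyleStep, TargetStyleStep, and call_by_keyword are hypothetical and exist only for this example; call_by_keyword stands in for a consumer that passes every input by keyword.

```python
class FeatureStyleStep:
    """Hypothetical step with the usual scikit-learn style signature."""

    def fit(self, X, y=None):
        self.seen_ = X
        return self


class TargetStyleStep:
    """Hypothetical step whose only input is named y."""

    def fit(self, y):
        self.seen_ = y
        return self


def call_by_keyword(step, **named_inputs):
    """Mimic a consumer that passes every input by keyword."""
    return step.fit(**named_inputs)


targets = [0, 1, 1, 0]

# Works: the step accepts an argument named y.
fitted = call_by_keyword(TargetStyleStep(), y=targets)

# Fails: the required argument X is never supplied by the consumer.
try:
    call_by_keyword(FeatureStyleStep(), y=targets)
    keyword_call_failed = False
except TypeError:
    keyword_call_failed = True
```

In this sketch, only the step whose input is named y can be driven by a keyword-only consumer, which is the situation the class description refers to.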

Parameters

can_skip_transform_none – behavior when the input y is None during the transform stage (as would be the case during the predict stage of an MLPipeline). If False (the default), the pipeline's transform method is called on y. If True, the transform method is skipped and None is returned immediately.
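The effect of can_skip_transform_none can be sketched with a small stand-in class. This is an illustration under stated assumptions, not ballet's code: SketchEncoderPipeline and double_all are hypothetical, and steps are modeled as plain functions rather than fitted transformers.

```python
class SketchEncoderPipeline:
    """Hypothetical stand-in, not ballet's EncoderPipeline."""

    def __init__(self, steps, can_skip_transform_none=False):
        self.steps = steps  # list of (name, function) pairs
        self.can_skip_transform_none = can_skip_transform_none

    def transform(self, y):
        # Return early only when the flag is set and there is no y.
        if y is None and self.can_skip_transform_none:
            return None
        for _name, func in self.steps:
            y = func(y)
        return y


def double_all(y):
    """Hypothetical step that doubles every target value."""
    return [2 * value for value in y]


steps = [("double", double_all)]

skipping = SketchEncoderPipeline(steps, can_skip_transform_none=True)
strict = SketchEncoderPipeline(steps)

skipped = skipping.transform(None)    # steps are never called
doubled = skipping.transform([1, 2])  # steps run as usual

# With the default (False), None is handed to the steps; this
# particular hypothetical step cannot iterate over None.
try:
    strict.transform(None)
    strict_failed = False
except TypeError:
    strict_failed = True
```

Whether a step fails on None depends on the step. The sketch only shows that, with the default setting, None is passed through to the steps rather than short-circuited.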

fit(y, **fit_params)[source]

Fit the model.

Fit all the transformers one after the other and transform the data. Finally, fit the transformed data using the final estimator.

Parameters
  • y (iterable) – Training targets, passed as the input to the first step. Must fulfill the input requirements of the first step of the pipeline.

  • **fit_params (dict of string -> object) – Parameters passed to the fit method of each step, where each parameter name is prefixed such that parameter p for step s has key s__p.

Returns

self – Pipeline with fitted steps.

Return type

object
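The s__p naming convention for fit_params can be sketched as follows. The helper split_fit_params and the step names are hypothetical; this shows only the key convention and is not the routing code used by scikit-learn or ballet.

```python
def split_fit_params(step_names, fit_params):
    """Group keys of the form '<step>__<param>' by step name."""
    routed = {name: {} for name in step_names}
    for key, value in fit_params.items():
        # Split on the first double underscore only.
        step, sep, param = key.partition("__")
        if not sep or step not in routed:
            raise ValueError(f"cannot route fit parameter {key!r}")
        routed[step][param] = value
    return routed


routed = split_fit_params(
    ["impute", "scale"],
    {"scale__sample_weight": [1, 2]},
)
```

Here the parameter sample_weight ends up under the hypothetical step named scale, and the step named impute receives no extra parameters.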

fit_transform(y, **fit_params)[source]

Fit the model and transform with the final estimator.

Fits all the transformers one after the other and transforms the data. Then uses fit_transform on the transformed data with the final estimator.

Parameters
  • y (iterable) – Training targets, passed as the input to the first step. Must fulfill the input requirements of the first step of the pipeline.

  • **fit_params (dict of string -> object) – Parameters passed to the fit method of each step, where each parameter name is prefixed such that parameter p for step s has key s__p.

Returns

Xt – Transformed samples.

Return type

ndarray of shape (n_samples, n_transformed_features)

steps: List[Any]

transform(y)[source]

Transform the data, and apply transform with the final estimator.

Call transform of each transformer in the pipeline. The transformed data are then passed to the final estimator, whose transform method is called. Only valid if the final estimator implements transform.

This also works when the final estimator is None, in which case all prior transformations are applied.

Parameters

y (iterable) – Data to transform. Must fulfill the input requirements of the first step of the pipeline.

Returns

Xt – Transformed data.

Return type

ndarray of shape (n_samples, n_transformed_features)

ballet.encoder.make_encoder_pipeline(steps, **kwargs)[source]
ballet.encoder.make_robust_encoder(steps, **kwargs)[source]