ballet.pipeline module

class ballet.pipeline.EngineerFeaturesResult(X_df, features, pipeline, X, y_df, encoder, y)

Bases: tuple

X: numpy.ndarray

Alias for field number 3

X_df: pandas.core.frame.DataFrame

Alias for field number 0

encoder: ballet.eng.base.BaseTransformer

Alias for field number 5

features: List[ballet.feature.Feature]

Alias for field number 1

pipeline: ballet.pipeline.FeatureEngineeringPipeline

Alias for field number 2

y: numpy.ndarray

Alias for field number 6

y_df: pandas.core.frame.DataFrame

Alias for field number 4
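
The field numbers above follow the constructor's argument order, so a result can be read by name, by index, or by positional unpacking. A minimal sketch of that mapping, using a stand-in named tuple with the same field order rather than ballet's own class (the name ResultSketch and the placeholder string values are assumptions made for illustration):

```python
from collections import namedtuple

# Stand-in with the same field order as EngineerFeaturesResult.
# This is NOT ballet's class; it only illustrates the index/name mapping.
ResultSketch = namedtuple(
    "ResultSketch",
    ["X_df", "features", "pipeline", "X", "y_df", "encoder", "y"],
)

# Placeholder values; in real use these would be data frames, arrays, etc.
result = ResultSketch(
    X_df="X_df", features="features", pipeline="pipeline",
    X="X", y_df="y_df", encoder="encoder", y="y",
)

# Access by field name and by position is equivalent.
assert result.X is result[3]
assert result.y is result[6]

# Positional unpacking follows the same order as the field numbers.
X_df, features, pipeline, X, y_df, encoder, y = result
```

Reading fields by name (result.X) is usually less error-prone than indexing, since the raw and transformed fields are interleaved.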

class ballet.pipeline.FeatureEngineeringPipeline(features)

Bases: sklearn_pandas.dataframe_mapper.DataFrameMapper

Feature engineering pipeline

Parameters

features (Union[Feature, List[Feature]]) – feature or list of features

property ballet_features
Return type

List[Feature]

get_names(columns, transformer, x, alias=None)

Return verbose names for the transformed columns.

This extends the behavior of DataFrameMapper to allow alias to rename all of the output columns, rather than just providing a common base name. It also allows columns to be a callable, so that the input columns can be selected from the data frame by a callable.
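
The naming rules described above can be sketched in isolation. This is a simplified illustration, not the method's implementation: sketch_get_names is a hypothetical helper that takes the number of output columns directly instead of transformer and x, the "selected" placeholder for callable selectors is an assumption, and the fallback of joining column names and appending a numeric suffix is an assumed reading of "a common base".

```python
def sketch_get_names(columns, num_cols, alias=None):
    """Hypothetical stand-in for the naming rules described above."""
    # A list alias with one entry per output column renames every column.
    if isinstance(alias, list) and len(alias) == num_cols:
        return list(alias)

    # A callable selector has no natural string name; use a placeholder.
    # (Assumption: the placeholder text is arbitrary.)
    if callable(columns):
        columns = "selected"

    # Otherwise derive a common base name (assumed fallback behaviour).
    if alias is not None:
        base = alias
    elif isinstance(columns, list):
        base = "_".join(columns)
    else:
        base = columns

    # Several output columns share the base and get a numeric suffix.
    if num_cols > 1:
        return [f"{base}_{i}" for i in range(num_cols)]
    return [base]


print(sketch_get_names(["a", "b"], 2))                    # ['a_b_0', 'a_b_1']
print(sketch_get_names(["a", "b"], 2, alias="z"))         # ['z_0', 'z_1']
print(sketch_get_names(["a", "b"], 2, alias=["p", "q"]))  # ['p', 'q']
print(sketch_get_names(lambda df: df.columns, 1))         # ['selected']
```

The third call shows the extension: a list alias replaces the output names outright, where a string alias only changes the shared prefix.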

ballet.pipeline.make_engineer_features(pipeline, encoder, load_data)
Return type

Callable[[DataFrame, DataFrame], EngineerFeaturesResult]
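
The signature and return type suggest a factory: it closes over pipeline, encoder, and load_data, and returns a function of two data frames that produces an EngineerFeaturesResult. The body is not documented here, so the following is only a sketch of that closure pattern under stated assumptions: that pipeline and encoder expose scikit-learn-style fit_transform methods, that pipeline.ballet_features supplies the features field, and that load_data() returns an (X_df, y_df) pair used when no frames are passed. ResultSketch, make_engineer_features_sketch, and the stub classes are hypothetical stand-ins, and plain lists stand in for data frames.

```python
from collections import namedtuple

# Stand-in result type with the documented field order (not ballet's class).
ResultSketch = namedtuple(
    "ResultSketch",
    ["X_df", "features", "pipeline", "X", "y_df", "encoder", "y"],
)


def make_engineer_features_sketch(pipeline, encoder, load_data):
    """Return a function (X_df, y_df) -> ResultSketch (assumed behaviour)."""

    def engineer_features(X_df=None, y_df=None):
        # Assumption: fall back to load_data() when frames are not given.
        if X_df is None or y_df is None:
            X_df, y_df = load_data()
        X = pipeline.fit_transform(X_df, y_df)
        y = encoder.fit_transform(y_df)
        return ResultSketch(
            X_df, pipeline.ballet_features, pipeline, X, y_df, encoder, y
        )

    return engineer_features


class StubPipeline:
    """Hypothetical pipeline: doubles every input value."""

    ballet_features = ["double"]

    def fit_transform(self, X, y=None):
        return [[2 * v for v in row] for row in X]


class StubEncoder:
    """Hypothetical encoder: passes the target through unchanged."""

    def fit_transform(self, y):
        return list(y)


engineer = make_engineer_features_sketch(
    StubPipeline(), StubEncoder(), load_data=lambda: ([[1], [2]], [0, 1])
)
result = engineer()
print(result.X, result.y)  # [[2], [4]] [0, 1]
```

Binding the pipeline and encoder once means callers only pass data, and every call returns both the raw inputs and the fitted objects alongside the transformed arrays.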