chooser.py module

USER_DEFINED_METADATA_KEY = 'user_defined_metadata'

Key used to store Improve AI metadata inside booster (<booster>.attr(USER_DEFINED_METADATA_KEY))

FEATURE_NAMES_METADATA_KEY = 'ai.improve.features'

Key used to store Improve AI booster feature names. Feature names are stripped from the booster when it is saved, so they are kept in the metadata under this key.

class XGBChooser

Bases: object

MODEL_NAME_REGEXP = '^[a-zA-Z0-9][\\w\\-.]{0,63}$'

Model name regexp used to verify all model names (both user provided and cached in boosters)
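As a minimal sketch of how this pattern behaves, the snippet below checks a few names against it with Python's standard `re` module. The helper `is_valid_model_name` is hypothetical and is not part of the documented API; only the pattern itself comes from this page.

```python
import re

# Pattern copied from MODEL_NAME_REGEXP above.
MODEL_NAME_REGEXP = '^[a-zA-Z0-9][\\w\\-.]{0,63}$'


def is_valid_model_name(name):
    """Hypothetical helper: True if `name` matches MODEL_NAME_REGEXP."""
    return isinstance(name, str) and re.match(MODEL_NAME_REGEXP, name) is not None


# First character must be alphanumeric; the rest may be word characters, '-' or '.'.
print(is_valid_model_name('songs-2.0'))
# A leading '-' is rejected.
print(is_valid_model_name('-songs'))
# One leading character plus at most 63 more: 65 characters is too long.
print(is_valid_model_name('a' * 65))
```

The pattern allows names of 1 to 64 characters in total.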

property model_name

Model name for this Improve AI model

Returns:

Model name of this Improve AI model

Return type:

str

property imposed_noise

Imposed noise value. Needed for SDK validation with synthetic models.

Returns:

Forced noise value

Return type:

float

property improveai_major_version_from_metadata: str

Stores the Improve AI model version

Returns:

string version of the Improve AI model or None

Return type:

str or None

property FEATURE_NAMES_METADATA_KEY

Key in model metadata storing feature names

Returns:

‘ai.improve.features’

Return type:

str

property STRING_TABLES_METADATA_KEY

Key in model metadata storing string tables

Returns:

‘ai.improve.string_tables’

Return type:

str

property MODEL_SEED_METADATA_KEY

Key in model metadata storing model seed

Returns:

‘ai.improve.seed’

Return type:

str

property MODEL_NAME_METADATA_KEY

Key in model metadata storing model name

Returns:

‘ai.improve.model’

Return type:

str

property CREATED_AT_METADATA_KEY

Key in model metadata storing model creation time

Returns:

‘ai.improve.created_at’

Return type:

str

property IMPROVE_AI_ALLOWED_MAJOR_VERSION

Latest supported major model version

Returns:

8

Return type:

int

property VERSION_METADATA_KEY

Key in model metadata storing model version

Returns:

‘ai.improve.version’

Return type:

str

property USER_DEFINED_METADATA_KEY

Booster’s attribute name storing the entire user defined metadata dict

Returns:

‘user_defined_metadata’

Return type:

str

property REQUIRED_METADATA_KEYS

Keys expected / required in model metadata

Returns:

list of required keys present in model metadata

Return type:

list

__init__()

Initialize chooser object

property model: Booster

xgboost’s booster used by this chooser

Returns:

xgboost’s booster used by this chooser

Return type:

Booster

property model_metadata: dict

Improve AI model metadata dict

Returns:

Improve AI model metadata dict

Return type:

dict

property feature_encoder: FeatureEncoder

FeatureEncoder of this chooser

Returns:

FeatureEncoder of this chooser

Return type:

FeatureEncoder

property feature_names: list

Feature names of this Improve AI model

Returns:

Feature names of this Improve AI model

Return type:

list

property current_noise

Currently used noise value. Needed for SDK validation with synthetic models.

Returns:

Currently used noise

Return type:

float

load_model(input_model_src, verbose=False)

Loads desired model from input path.

Parameters:
  • input_model_src (str) – URL / path to desired model

  • verbose (bool) – whether to print debug messages

_get_noise()

Private noise getter. By default, noise is randomly sampled from a uniform distribution between 0 and 1. Noise can also be set manually; this option is provided for testing purposes.

Returns:

noise used by chooser

Return type:

float
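The behaviour described for `_get_noise()` can be illustrated with a small sketch. The class `NoiseSource` and its `get_noise()` method are hypothetical stand-ins written for this example; they assume that a manually imposed value takes precedence over random sampling, which is how the description above reads, and they are not the chooser's actual implementation.

```python
import random


class NoiseSource:
    """Hypothetical stand-in for the chooser's noise handling."""

    def __init__(self):
        # Set to a float to force a fixed noise value (e.g. in tests).
        self.imposed_noise = None

    def get_noise(self):
        # A manually imposed value wins; otherwise sample uniformly from [0, 1).
        if self.imposed_noise is not None:
            return float(self.imposed_noise)
        return random.random()


source = NoiseSource()
print(source.get_noise())      # random value between 0 and 1

source.imposed_noise = 0.25
print(source.get_noise())      # always the imposed value
```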

encode_candidates_to_matrix(candidates, context, noise=0.0)

Encodes list of candidates to 2D np.array for a given context with provided noise

Parameters:
  • candidates (list or tuple or np.ndarray) – list of JSON encodable candidates / items to encode

  • context (object) – JSON encodable object

  • noise (float) – noise to be used for sprinkling of encoded features

Returns:

2D numpy array with encoded candidates

Return type:

np.ndarray

score(candidates, context, **kwargs)

Calculates scores for all provided candidates in 2 steps:

  1. encodes candidates to np array

  2. predicts with booster on encoded features

Parameters:
  • candidates (list or tuple or np.ndarray) – list of candidates to score

  • context (dict or None) – context dict needed for encoding

  • kwargs (dict) – kwargs

Returns:

1D numpy array with scores

Return type:

np.ndarray

calculate_predictions(features_matrix)

Calculates predictions on provided matrix with loaded model

Parameters:

features_matrix (np.ndarray) – array to be a source for DMatrix

Returns:

an array of double scores

Return type:

np.ndarray

encode_candidates_with_context(candidates, context)

Encodes the provided candidates with a given context into a 2D numpy matrix. Implemented as an XGBChooser helper method (uses the Cython backend to speed things up if possible).

Parameters:
  • candidates (list or tuple or np.ndarray) – collection of input variants to be encoded

  • context (dict or None) – context to be encoded with variants

Returns:

2D array of encoded values

Return type:

np.ndarray

static get_model_src(model_src)

Based on the provided model_src, this method returns:

  • a FS string path for an input FS path to an unzipped booster

  • a Path object for an input Path object to an unzipped booster

  • an unzipped bytearray for an input FS path / Path object to a gzipped booster

  • an (unzipped) bytearray for an input URL (if the URL leads to a gzipped booster, it will be unzipped)

Output from this method can be passed directly to Booster.load_model().

Parameters:

model_src (str or Path or bytes) – path to model, URL or bytes

Returns:

path or downloaded model

Return type:

str or Path or bytearray
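The local-file cases listed for `get_model_src()` can be sketched roughly as follows. This is an illustration only: the function `resolve_local_model_src` is hypothetical, it detects gzip by the two-byte gzip magic number (the real method may use a different check), and it leaves out the URL case entirely.

```python
import gzip
from pathlib import Path

GZIP_MAGIC = b'\x1f\x8b'  # first two bytes of every gzip stream


def resolve_local_model_src(model_src):
    """Hypothetical sketch of the local-file branches of get_model_src()."""
    path = Path(model_src)
    with open(path, 'rb') as f:
        is_gzipped = f.read(2) == GZIP_MAGIC
    if not is_gzipped:
        # Plain booster file: hand the original str / Path back unchanged.
        return model_src
    # Gzipped booster: decompress the whole file into memory.
    with gzip.open(path, 'rb') as f:
        return bytearray(f.read())
```

In this sketch an unzipped file yields the same `str` or `Path` that was passed in, while a gzipped file yields a `bytearray` with the decompressed content.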

_get_improveai_major_version(model_metadata)

Extracts the Improve AI version from model metadata and returns it if it is valid / allowed

Parameters:

model_metadata (dict) – a dictionary containing model metadata

Returns:

major Improve AI version extracted from loaded improve model

Return type:

str or None
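A rough sketch of such a version check is shown below. It uses the key and the allowed major version documented on this page, but the rest is assumed for illustration: the function `get_major_version` is hypothetical, the version string is assumed to be dot-separated (for example '8.0.0'), and raising `ValueError` for an unsupported version is a choice made for this example rather than documented behaviour.

```python
ALLOWED_MAJOR_VERSION = 8            # documented value of IMPROVE_AI_ALLOWED_MAJOR_VERSION
VERSION_KEY = 'ai.improve.version'   # documented value of VERSION_METADATA_KEY


def get_major_version(model_metadata):
    """Hypothetical sketch: return the major version as str, or None if absent."""
    version = model_metadata.get(VERSION_KEY)
    if version is None:
        return None
    # Assumption: the version is dot-separated, so the major part comes first.
    major = str(version).split('.')[0]
    if major != str(ALLOWED_MAJOR_VERSION):
        # Assumption: an unsupported major version is reported as an error.
        raise ValueError(f'Unsupported model major version: {major}')
    return major


print(get_major_version({'ai.improve.version': '8.0.0'}))  # major part of '8.0.0'
print(get_major_version({}))                               # no version stored
```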

_get_model_metadata()

Gets ‘model metadata’ from JSON string stored in ‘user defined metadata’ attribute of Improve AI booster

Returns:

dict with model metadata

Return type:

dict
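The idea of reading metadata from a booster attribute can be sketched without xgboost installed. In the snippet below, `FakeBooster` is a hypothetical stand-in that only mimics `Booster.attr()` (which returns a string or None), and `read_model_metadata` is a hypothetical helper. The sketch assumes the attribute holds a flat JSON object; the actual layout used by Improve AI models may be nested differently.

```python
import json

USER_DEFINED_METADATA_KEY = 'user_defined_metadata'


class FakeBooster:
    """Hypothetical stand-in exposing only attr(), like xgboost's Booster.attr()."""

    def __init__(self, attributes):
        self._attributes = attributes

    def attr(self, key):
        # Returns the stored string, or None when the attribute is missing.
        return self._attributes.get(key)


def read_model_metadata(booster):
    """Hypothetical helper: parse the JSON string stored in the booster attribute."""
    raw = booster.attr(USER_DEFINED_METADATA_KEY)
    if raw is None:
        raise ValueError('Booster has no user defined metadata attribute')
    return json.loads(raw)


booster = FakeBooster({
    USER_DEFINED_METADATA_KEY: json.dumps(
        {'ai.improve.model': 'songs', 'ai.improve.seed': 1}
    )
})
metadata = read_model_metadata(booster)
print(metadata['ai.improve.model'])
```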

_get_model_feature_names(model_metadata)

Gets model feature names from model metadata

Parameters:

model_metadata (dict) – a dict containing model metadata

Returns:

list of feature names

Return type:

list

_get_model_seed(model_metadata)

Gets model seed from model metadata

Parameters:

model_metadata (dict) – a dict containing model metadata

Returns:

model seed

Return type:

int

_get_model_name(model_metadata)

Gets model name from model metadata

Parameters:

model_metadata (dict) – a dict containing model metadata

Returns:

Improve AI model name

Return type:

str

_get_string_tables(model_metadata)

Gets string tables from model metadata

Parameters:

model_metadata (dict) – a dict containing model metadata

Returns:

dict of lists with string tables

Return type:

dict