The ML.ADVANCED_WEIGHTS function

This document describes the ML.ADVANCED_WEIGHTS function, which lets you see the underlying weights that a linear or binary logistic regression model uses during prediction, along with the associated p-value and standard error for each weight. ML.ADVANCED_WEIGHTS is an extended version of ML.WEIGHTS for linear and binary logistic regression models.

Usage requirements

You can only use ML.ADVANCED_WEIGHTS on linear and binary logistic regression models that are trained with the following option settings:

It's common to require standard errors or p-values for either the regression coefficients or other estimated quantities for these penalized regression methods. In principle, such standard errors can be calculated, for example by using the bootstrap. In practice, this calculation isn't done, for reasons that the authors of the R package explain as follows:

Standard errors are not very meaningful for strongly biased estimates such as arise from penalized estimation methods. Penalized estimation is a procedure that reduces the variance of estimators by introducing substantial bias. The bias of each estimator is therefore a major component of its mean squared error, whereas its variance may contribute only a small part. Unfortunately, in most applications of penalized regression it is impossible to obtain a sufficiently precise estimate of the bias. Any bootstrap-based calculations can only give an assessment of the variance of the estimates. Reliable estimates of the bias are only available if reliable unbiased estimates are available, which is typically not the case in situations in which penalized estimates are used.

Multiclass logistic regression models aren't supported.

Syntax

ML.ADVANCED_WEIGHTS(
  MODEL `project_id.dataset.model`,
  STRUCT(
    [standardize AS standardize]))

Arguments

ML.ADVANCED_WEIGHTS takes the following arguments:

Output

ML.ADVANCED_WEIGHTS returns the following columns:

If the TRANSFORM clause was used in the CREATE MODEL statement that created the model, ML.ADVANCED_WEIGHTS outputs the weights of the TRANSFORM output features. The weights are denormalized by default, with the option to get normalized weights, exactly like models that are created without TRANSFORM.

Permissions

You must have the bigquery.models.create and bigquery.models.getData Identity and Access Management (IAM) permissions to run ML.ADVANCED_WEIGHTS.

Limitations

The total cardinality of training features must be less than 1,000. This limit stems from constraints on computing p-values and standard errors, which happens when you train the model with the CALCULATE_P_VALUES option set to TRUE.
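As a rough sketch of where the CALCULATE_P_VALUES option is set, the following statement trains a linear regression model that ML.ADVANCED_WEIGHTS could then inspect. The table name mydataset.training_data and the label column name label are hypothetical placeholders. The CATEGORY_ENCODING_METHOD setting is an assumption about what p-value calculation requires, so check the CREATE MODEL reference for the full set of required options before relying on it.

```sql
-- Sketch only: `mydataset.training_data` and the `label` column are
-- hypothetical placeholders, not names defined in this document.
CREATE OR REPLACE MODEL `mydataset.mymodel`
  OPTIONS (
    MODEL_TYPE = 'LINEAR_REG',
    INPUT_LABEL_COLS = ['label'],
    -- Compute p-values and standard errors during training.
    CALCULATE_P_VALUES = TRUE,
    -- Assumption: dummy encoding is needed when p-values are calculated.
    CATEGORY_ENCODING_METHOD = 'DUMMY_ENCODING'
  ) AS
SELECT
  *
FROM
  `mydataset.training_data`;
```

Because the total feature cardinality must stay below 1,000, it might help to drop or bucket high-cardinality categorical columns in the SELECT list before training.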

Examples

The following examples demonstrate ML.ADVANCED_WEIGHTS with and without standardization.

Without standardization

The following example retrieves weight information from mymodel in mydataset. The dataset is in your default project.

The query returns the weights associated with each one-hot encoded category for the input column input_col.

SELECT
  *
FROM
  ML.ADVANCED_WEIGHTS(MODEL `mydataset.mymodel`,
    STRUCT(FALSE AS standardize))

With standardization

The following example retrieves weight information from mymodel in mydataset. The dataset is in your default project.

The query retrieves standardized weights, which assume all features have a mean of 0 and a standard deviation of 1.0.

SELECT
  *
FROM
  ML.ADVANCED_WEIGHTS(MODEL `mydataset.mymodel`,
    STRUCT(TRUE AS standardize))
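
Building on the query above, a common follow-up is to keep only the weights that are statistically significant. The following sketch assumes that the output includes columns named processed_input, category, weight, standard_error, and p_value. Those names aren't listed in this section, so verify them against the function's output schema. The 0.05 threshold is an illustrative choice, not a recommendation from this document.

```sql
-- Sketch only: the column names below are assumed, and 0.05 is an
-- arbitrary illustrative significance threshold.
SELECT
  processed_input,
  category,
  weight,
  standard_error,
  p_value
FROM
  ML.ADVANCED_WEIGHTS(MODEL `mydataset.mymodel`,
    STRUCT(TRUE AS standardize))
WHERE
  p_value < 0.05
ORDER BY
  p_value
```

Using standardized weights here puts the remaining features on a comparable scale, which can make the magnitudes easier to compare across features.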

What's next