The ML.EXPLAIN_PREDICT function

This document describes the ML.EXPLAIN_PREDICT function, which lets you generate a predicted value and a set of feature attributions for each instance of the input data. Feature attributions indicate how much each feature in your model contributed to the final prediction for each given instance. ML.EXPLAIN_PREDICT is essentially an extended version of ML.PREDICT.

Syntax

ML.EXPLAIN_PREDICT(
  MODEL `project_id.dataset.model_name`,
  { TABLE `project_id.dataset.table` | (query_statement) },
  STRUCT(
  [top_k AS top_k_features]
  [, threshold AS threshold]
  [, integrated_gradients_num_steps AS integrated_gradients_num_steps]
  [, approx_feature_contrib AS approx_feature_contrib])
)
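
As a minimal sketch of the syntax above, the following call passes the input as a TABLE reference instead of a query and sets only top_k_features. The names mydataset.mymodel and mydataset.mytable are placeholders for illustration, and the sketch assumes that the table contains the feature columns the model was trained on.

```sql
-- Placeholder model and table names; substitute your own.
SELECT
  *
FROM
  ML.EXPLAIN_PREDICT(
    MODEL `mydataset.mymodel`,
    TABLE `mydataset.mytable`,
    -- Return only the three largest attributions for each row.
    STRUCT(3 AS top_k_features))
```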

Arguments

ML.EXPLAIN_PREDICT takes the following arguments:

Output

ML.EXPLAIN_PREDICT returns the following columns in addition to any passthrough columns:

Examples

The following examples assume that your model and input table are in your default project.

Explain a prediction generated by a linear regression model

The following example explains a prediction for a linear regression model by generating the top three attributions.

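Assume a linear regression model stored in mydataset.mymodel was trained with the table mydataset.mytable, which has the following columns: label, column1, column2, column3, column4, and column5. The following query explains the model's predictions for that table: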

SELECT
  *
FROM
  ML.EXPLAIN_PREDICT(MODEL `mydataset.mymodel`,
    (
    SELECT
      label,
      column1,
      column2,
      column3,
      column4,
      column5
    FROM
      `mydataset.mytable`), STRUCT(3 AS top_k_features))
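
To read the attributions one per row, the result can be flattened with UNNEST. The following is a sketch under assumptions, not a definitive query: it assumes that the attributions are returned in an array column named top_feature_attributions whose elements have feature and attribution fields, and that the prediction column is named predicted_label because the label column is named label. Verify these names against the output of your own model.

```sql
SELECT
  explained.predicted_label,  -- assumed name: predicted_<label_column_name>
  attr.feature,               -- assumed field of top_feature_attributions
  attr.attribution            -- assumed field of top_feature_attributions
FROM
  ML.EXPLAIN_PREDICT(MODEL `mydataset.mymodel`,
    (
    SELECT
      label,
      column1,
      column2,
      column3,
      column4,
      column5
    FROM
      `mydataset.mytable`), STRUCT(3 AS top_k_features)) AS explained,
  -- Produce one output row per (input row, feature attribution) pair.
  UNNEST(explained.top_feature_attributions) AS attr
ORDER BY
  ABS(attr.attribution) DESC
```

Ordering by the absolute value places the most influential features first, whether they raise or lower the prediction.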

Explain a prediction generated by a boosted tree or a random forest binary classification model

The following example explains a prediction generated by a boosted tree or a random forest binary classification model. It generates the top three attributions with a custom threshold.

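Assume a boosted tree or a random forest binary classification model stored in mydataset.mymodel was trained with the table mydataset.mytable, which has the following columns: label, column1, column2, column3, column4, and column5. The following query explains the model's predictions for that table: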

SELECT
  *
FROM
  ML.EXPLAIN_PREDICT(MODEL `mydataset.mymodel`,
    (
    SELECT
      label,
      column1,
      column2,
      column3,
      column4,
      column5
    FROM
      `mydataset.mytable`), STRUCT(3 AS top_k_features, 0.7 AS threshold))
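
The syntax section also lists an approx_feature_contrib option, which none of the examples set. The following variant is a sketch that assumes the option takes a BOOL value and applies to tree-based models such as the one described above. Confirm both assumptions before relying on it.

```sql
SELECT
  *
FROM
  ML.EXPLAIN_PREDICT(MODEL `mydataset.mymodel`,
    (
    SELECT
      label,
      column1,
      column2,
      column3,
      column4,
      column5
    FROM
      `mydataset.mytable`),
    STRUCT(
      3 AS top_k_features,
      0.7 AS threshold,
      -- Assumed to be a BOOL flag that requests approximate attributions.
      TRUE AS approx_feature_contrib))
```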

Explain a prediction generated by a DNN classifier model

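The following example explains a prediction generated by a DNN classifier model. It generates the top three attributions and sets integrated_gradients_num_steps to 30.

Assume a DNN classifier stored in mydataset.mymodel was trained with the table mydataset.mytable, which has the following columns: label, column1, column2, column3, column4, and column5. The following query explains the model's predictions for that table: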

SELECT
  *
FROM
  ML.EXPLAIN_PREDICT(MODEL `mydataset.mymodel`,
    (
    SELECT
      label,
      column1,
      column2,
      column3,
      column4,
      column5
    FROM
      `mydataset.mytable`), STRUCT(3 AS top_k_features, 30 AS integrated_gradients_num_steps))

What's next