The ML.TFDV_VALIDATE function

This document describes the ML.TFDV_VALIDATE function, which compares the statistics for training and serving data, or for two sets of serving data, to identify anomalous differences between the two datasets. Calling this function provides the same behavior as calling the TensorFlow validate_statistics API. You can use the output of this function for model monitoring.

Syntax

ML.TFDV_VALIDATE(
  base_statistics,
  study_statistics
  [, detection_type]
  [, categorical_default_threshold]
  [, categorical_metric_type]
  [, numerical_default_threshold]
  [, numerical_metric_type]
  [, thresholds]
)

Arguments

ML.TFDV_VALIDATE takes the arguments shown in the preceding syntax, in that order.

ML.TFDV_VALIDATE uses positional arguments, so if you specify an optional argument, you must also specify every argument that precedes it. For more information on argument types, see Named arguments.
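
As a minimal sketch of the positional rule, the following script sets categorical_metric_type, the fifth argument, and therefore also supplies detection_type and categorical_default_threshold. The table names are the placeholders used elsewhere in this document, the variable names base_stats and study_stats are illustrative, and the values are taken from the examples on this page rather than from a list of documented defaults.

```sql
-- Sketch only: base_stats and study_stats are illustrative variable names.
DECLARE base_stats JSON;
DECLARE study_stats JSON;

SET base_stats = (SELECT * FROM ML.TFDV_DESCRIBE(TABLE `myproject.mydataset.training`));
SET study_stats = (SELECT * FROM ML.TFDV_DESCRIBE(TABLE `myproject.mydataset.serving`));

-- To set the 5th argument, the 3rd and 4th must also be present.
SELECT ML.TFDV_VALIDATE(
  base_stats,   -- base_statistics (required)
  study_stats,  -- study_statistics (required)
  'SKEW',       -- detection_type
  .3,           -- categorical_default_threshold
  'L_INFTY'     -- categorical_metric_type
);
```

The trailing arguments (numerical_default_threshold, numerical_metric_type, and thresholds) are omitted here, which is allowed because no later argument is specified.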

Output

ML.TFDV_VALIDATE returns a TensorFlow Anomalies protocol buffer in JSON format.
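
Because the result is a single JSON value, one way to use it for monitoring is to store each result alongside a timestamp. The following is a sketch under assumptions: anomaly_log is a hypothetical table name, checked_at and anomalies are hypothetical column names, and the source tables are the placeholders used in the examples on this page.

```sql
-- Sketch only: `anomaly_log`, `checked_at`, and `anomalies` are hypothetical names.
CREATE OR REPLACE TABLE `myproject.mydataset.anomaly_log` AS
SELECT
  CURRENT_TIMESTAMP() AS checked_at,
  -- Only the two required arguments are passed; all optional arguments are omitted.
  ML.TFDV_VALIDATE(
    (SELECT * FROM ML.TFDV_DESCRIBE(TABLE `myproject.mydataset.training`)),
    (SELECT * FROM ML.TFDV_DESCRIBE(TABLE `myproject.mydataset.serving`))
  ) AS anomalies;
```

Later runs could append rows to the same table with an INSERT statement instead of replacing it, so that anomalies can be tracked over time.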

Examples

The following example returns the skew between training and serving data and sets custom anomaly detection thresholds for two of the feature columns. It then stores the base statistics (stats1), together with a timestamp, in a table:

DECLARE stats1 JSON;
DECLARE stats2 JSON;

SET stats1 = (SELECT * FROM ML.TFDV_DESCRIBE(TABLE `myproject.mydataset.training`));

SET stats2 = (SELECT * FROM ML.TFDV_DESCRIBE(TABLE `myproject.mydataset.serving`));

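SELECT ML.TFDV_VALIDATE(
  stats1,                                 -- base_statistics
  stats2,                                 -- study_statistics
  'SKEW',                                 -- detection_type
  .3,                                     -- categorical_default_threshold
  'L_INFTY',                              -- categorical_metric_type
  .3,                                     -- numerical_default_threshold
  'JENSEN_SHANNON_DIVERGENCE',            -- numerical_metric_type
  [('feature1', 0.2), ('feature2', 0.5)]  -- thresholds
);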

INSERT `myproject.mydataset.serve_stats`
  (t, dataset_feature_statistics_list)
SELECT CURRENT_TIMESTAMP() AS t, stats1;

The following example returns the drift between two sets of serving data:

SELECT ML.TFDV_VALIDATE(
  (SELECT dataset_feature_statistics_list FROM `myproject.mydataset.servingJan24`),
  (SELECT * FROM ML.TFDV_DESCRIBE(TABLE `myproject.mydataset.serving`)),
  'DRIFT'
);

Limitations

The ML.TFDV_VALIDATE function doesn't conduct schema validation.

ML.TFDV_VALIDATE handles type mismatch as follows: