The ML.CENTROIDS function

This document describes the ML.CENTROIDS function, which lets you return information about the centroids in a k-means model.

Syntax

ML.CENTROIDS(
  MODEL `project_id.dataset.model`
  STRUCT([, standardize AS standardize]))

Arguments

ML.CENTROIDS takes the following arguments:

Output

ML.CENTROIDS returns the following columns:

The output contains one row per feature per centroid.

Examples

The following examples show how to use ML.CENTROIDS with and without the standardize argument.

Without standardization

Numerical features

The following example retrieves centroid information from the model mydataset.my_kmeans_model in your default project. This model only contains numerical features.

SELECT
  *
FROM
  ML.CENTROIDS(MODEL `mydataset.my_kmeans_model`)

This query returns results like the following:

+-------------+-------------------+----------------------+---------------------+
| centroid_id | feature           | numerical_value      | categorical_value   |
+-------------+-------------------+----------------------+---------------------+
|           3 | x_coordinate      |            3095929.0 |                  [] |
|           3 | y_coordinate      | 1.0089726307692308E7 |                  [] |
|           2 | x_coordinate      |        3117072.65625 |                  [] |
|           2 | y_coordinate      | 1.0083220745833334E7 |                  [] |
|           1 | x_coordinate      |    3259947.096227731 |                  [] |
|           1 | y_coordinate      | 1.0105690227895036E7 |                  [] |
|           4 | x_coordinate      |   3109887.9056603773 |                  [] |
|           4 | y_coordinate      | 1.0057112358490566E7 |                  [] |
+-------------+-------------------+----------------------+---------------------+

Categorical features

The following example retrieves centroid information from the model mydataset.my_kmeans_model in your default project. This model contains categorical features.

SELECT
  *
FROM
  ML.CENTROIDS(MODEL `mydataset.my_kmeans_model`)
ORDER BY
  centroid_id;

This query returns results like the following:

+-------------+-------------------+---------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| centroid_id | feature           |numerical_value| categorical_value                                                                                                                                                                                                                                              |
+-------------+-------------------+---------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|           1 | department        |          NULL | [{"category":"Medieval Art","feature_value":"1.0"}]                                                                                                                                                                                                            |
|           1 | medium            |          NULL | [{"category":"Iron","feature_value":"0.21602160216021601"},{"category":"Glass, ceramic","feature_value":"0.3933393339333933"},{"category":"Copper alloy","feature_value":"0.39063906390639064"}]                                                               |
|           2 | medium            |          NULL | [{"category":"Wood, gesso, paint","feature_value":"0.15"},{"category":"Carnelian","feature_value":"0.2692307692307692"},{"category":"Papyrus, ink","feature_value":"0.2653846153846154"},{"category":"Steatite, glazed","feature_value":"0.3153846153846154"}] |
|           2 | department        |          NULL | [{"category":"Egyptian Art","feature_value":"1.0"}]                                                                                                                                                                                                            |
|           3 | medium            |          NULL | [{"category":"Faience","feature_value":"1.0"}]                                                                                                                                                                                                                 |
|           3 | department        |          NULL | [{"category":"Egyptian Art","feature_value":"1.0"}]                                                                                                                                                                                                            |
|           4 | medium            |          NULL | [{"category":"Steatite","feature_value":"1.0"}]                                                                                                                                                                                                                |
|           4 | department        |          NULL | [{"category":"Egyptian Art","feature_value":"1.0"}]                                                                                                                                                                                                            |
|           5 | medium            |          NULL | [{"category":"Red quartzite","feature_value":"0.20316027088036118"},{"category":"Bronze or copper alloy","feature_value":"0.3476297968397291"},{"category":"Gold","feature_value":"0.4492099322799097"}]                                                       |
|           5 | department        |          NULL | [{"category":"Egyptian Art","feature_value":"1.0"}]                                                                                                                                                                                                            |
+-------------+-------------------+---------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

Numerical and categorical features

The following are the results from the same query against a k-means model with both numerical and categorical features.

+-------------+--------------------+-------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| centroid_id |      feature       |  numerical_value  | categorical_value                                                                                                                                                                                                                                                                 |
+-------------+--------------------+-------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|           1 | start_station_name |              NULL | [{"category":"Toomey Rd @ South Lamar","value":"0.5714285714285714"},{"category":"State Capitol @ 14th & Colorado","value":"0.42857142857142855"}]                                                                                                                                |
|           1 | duration_minutes   | 9.142857142857142 | []                                                                                                                                                                                                                                                                                |
|           2 | duration_minutes   |               9.0 | []                                                                                                                                                                                                                                                                                |
|           2 | start_station_name |              NULL | [{"category":"Rainey @ River St","value":"0.14285714285714285"},{"category":"11th & San Jacinto","value":"0.42857142857142855"},{"category":"ACC - West & 12th Street","value":"0.14285714285714285"},{"category":"East 11th St. at Victory Grill","value":"0.2857142857142857"}] |
+-------------+--------------------+-------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

With standardization

The following example retrieves centroid information from the model mydataset.my_kmeans_model in your default project. The query in this example assumes that all features have a mean of 0 and a standard deviation of 1.

SELECT
  *
FROM
  ML.CENTROIDS(MODEL `mydataset.my_kmeans_model`,
    STRUCT(TRUE AS standardize))

What's next