finos / morphir-elm / Morphir.SDK.Aggregate

This module contains functions specifically designed to work with large data sets.

Aggregations


type alias Aggregation a key =
{ key : a -> key
, filter : a -> Basics.Bool
, operator : Operator a 
}

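Type that represents an aggregation on a type a with a key of type key. It encapsulates a key function that returns the aggregation key for each row, a filter predicate that selects the rows included in the aggregation, and the operator to apply.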

groupBy : (a -> key) -> List a -> AssocList.Dict key (List a)

Group a list of items into a dictionary. Grouping is done using a function that returns a key for each item. The resulting dictionary uses those keys for its entries, and each value is the list of items that share that key.
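The examples on this page build rows with TestInput1, whose definition is not shown. A record alias along the following lines would be consistent with how the examples use it. The field types are an assumption inferred from the accessors .key1, .key2 and .value and from the operators taking functions that return Basics.Float:

```elm
module TestInputSketch exposing (TestInput1)

-- Hypothetical definition inferred from the examples; it is not taken from
-- the library. Declaring a record alias also gives a constructor function
-- that takes the fields in order, so `TestInput1 "k1_1" "k2_1" 1` builds
-- { key1 = "k1_1", key2 = "k2_1", value = 1 }.


type alias TestInput1 =
    { key1 : String
    , key2 : String
    , value : Float
    }
```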

testDataSet =
    [ TestInput1 "k1_1" "k2_1" 1
    , TestInput1 "k1_1" "k2_1" 2
    , TestInput1 "k1_1" "k2_2" 3
    , TestInput1 "k1_1" "k2_2" 4
    , TestInput1 "k1_2" "k2_1" 5
    , TestInput1 "k1_2" "k2_1" 6
    , TestInput1 "k1_2" "k2_2" 7
    , TestInput1 "k1_2" "k2_2" 8
    ]

testDataSet
    |> groupBy .key1
        {- == Dict.fromList
                    [ ( "k1_1"
                      , [ TestInput1 "k1_1" "k2_1" 1
                        , TestInput1 "k1_1" "k2_1" 2
                        , TestInput1 "k1_1" "k2_2" 3
                        , TestInput1 "k1_1" "k2_2" 4
                        ]
                      )
                    , ( "k1_2"
                      , [ TestInput1 "k1_2" "k2_1" 5
                        , TestInput1 "k1_2" "k2_1" 6
                        , TestInput1 "k1_2" "k2_2" 7
                        , TestInput1 "k1_2" "k2_2" 8
                        ]
                      )
                    ]
        -}

aggregate : (key -> Aggregator a Morphir.SDK.Key.Key0 -> b) -> AssocList.Dict key (List a) -> List b

Aggregates a dictionary that contains lists of items as values into a list that contains exactly one item per key. The first argument is a function that takes a key and an aggregator and it should return a single item in the resulting list. The aggregator is a function that takes one of the aggregation functions in this module (count, sumOf, minimumOf, ...) and returns the aggregated value for the list of values in the input dictionary.

grouped =
    Dict.fromList
        [ ( "k1_1"
          , [ TestInput1 "k1_1" "k2_1" 1
            , TestInput1 "k1_1" "k2_1" 2
            , TestInput1 "k1_1" "k2_2" 3
            , TestInput1 "k1_1" "k2_2" 4
            ]
          )
        , ( "k1_2"
          , [ TestInput1 "k1_2" "k2_1" 5
            , TestInput1 "k1_2" "k2_1" 6
            , TestInput1 "k1_2" "k2_2" 7
            , TestInput1 "k1_2" "k2_2" 8
            ]
          )
        ]

grouped
    |> aggregate
        (\key inputs ->
            { key = key
            , count = inputs (count |> withFilter (\a -> a.value < 7))
            , sum = inputs (sumOf .value)
            , max = inputs (maximumOf .value)
            , min = inputs (minimumOf .value)
            }
        )
        {- ==
            [ { key = "k1_1", count = 4, sum = 10, max = 4, min = 1 }
            , { key = "k1_2", count = 2, sum = 26, max = 8, min = 5 }
            ]
        -}

This function is designed to be used in combination with groupBy.

    testDataSet =
        [ TestInput1 "k1_1" "k2_1" 1
        , TestInput1 "k1_1" "k2_1" 2
        , TestInput1 "k1_1" "k2_2" 3
        , TestInput1 "k1_1" "k2_2" 4
        , TestInput1 "k1_2" "k2_1" 5
        , TestInput1 "k1_2" "k2_1" 6
        , TestInput1 "k1_2" "k2_2" 7
        , TestInput1 "k1_2" "k2_2" 8
        ]

    testDataSet
        |> groupBy .key1
        |> aggregate
            (\key inputs ->
                { key = key
                , count = inputs (count |> withFilter (\a -> a.value < 7))
                , sum = inputs (sumOf .value)
                , max = inputs (maximumOf .value)
                , min = inputs (minimumOf .value)
                }
            )
            {- ==
                [ { key = "k1_1", count = 4, sum = 10, max = 4, min = 1 }
                , { key = "k1_2", count = 2, sum = 26, max = 8, min = 5 }
                ]
            -}

aggregateMap : Aggregation a key1 -> (Basics.Float -> a -> b) -> List a -> List b

Map function that provides an aggregated value to the mapping function. The first argument is an Aggregation that combines a function defining the aggregation key, a predicate that lets you filter out certain rows from the aggregation, and the aggregation operation to apply. Usage:

    testDataSet =
        [ TestInput1 "k1_1" "k2_1" 1
        , TestInput1 "k1_1" "k2_1" 2
        , TestInput1 "k1_1" "k2_2" 3
        , TestInput1 "k1_1" "k2_2" 4
        , TestInput1 "k1_2" "k2_1" 5
        , TestInput1 "k1_2" "k2_1" 6
        , TestInput1 "k1_2" "k2_2" 7
        , TestInput1 "k1_2" "k2_2" 8
        ]

    testDataSet
        |> aggregateMap
            (sumOf .value |> byKey .key1)
            (\totalValue input ->
                ( input, totalValue / input.value )
            )
        {- ==
            [ ( TestInput1 "k1_1" "k2_1" 1, 10 / 1 )
            , ( TestInput1 "k1_1" "k2_1" 2, 10 / 2 )
            , ( TestInput1 "k1_1" "k2_2" 3, 10 / 3 )
            , ( TestInput1 "k1_1" "k2_2" 4, 10 / 4 )
            , ( TestInput1 "k1_2" "k2_1" 5, 26 / 5 )
            , ( TestInput1 "k1_2" "k2_1" 6, 26 / 6 )
            , ( TestInput1 "k1_2" "k2_2" 7, 26 / 7 )
            , ( TestInput1 "k1_2" "k2_2" 8, 26 / 8 )
            ]
        -}

aggregateMap2 : Aggregation a key1 -> Aggregation a key2 -> (Basics.Float -> Basics.Float -> a -> b) -> List a -> List b

Map function that provides two aggregated values to the mapping function. Each of the first two arguments is an Aggregation that combines a function defining the aggregation key, a predicate that lets you filter out certain rows from the aggregation, and the aggregation operation to apply. Usage:

    testDataSet =
        [ TestInput1 "k1_1" "k2_1" 1
        , TestInput1 "k1_1" "k2_1" 2
        , TestInput1 "k1_1" "k2_2" 3
        , TestInput1 "k1_1" "k2_2" 4
        , TestInput1 "k1_2" "k2_1" 5
        , TestInput1 "k1_2" "k2_1" 6
        , TestInput1 "k1_2" "k2_2" 7
        , TestInput1 "k1_2" "k2_2" 8
        ]

    testDataSet
        |> aggregateMap2
            (sumOf .value |> byKey .key1)
            (maximumOf .value |> byKey .key2)
            (\totalValue maxValue input ->
                ( input, totalValue * maxValue / input.value )
            )
        {- ==
            [ ( TestInput1 "k1_1" "k2_1" 1, 10 * 6 / 1 )
            , ( TestInput1 "k1_1" "k2_1" 2, 10 * 6 / 2 )
            , ( TestInput1 "k1_1" "k2_2" 3, 10 * 8 / 3 )
            , ( TestInput1 "k1_1" "k2_2" 4, 10 * 8 / 4 )
            , ( TestInput1 "k1_2" "k2_1" 5, 26 * 6 / 5 )
            , ( TestInput1 "k1_2" "k2_1" 6, 26 * 6 / 6 )
            , ( TestInput1 "k1_2" "k2_2" 7, 26 * 8 / 7 )
            , ( TestInput1 "k1_2" "k2_2" 8, 26 * 8 / 8 )
            ]
        -}

aggregateMap3 : Aggregation a key1 -> Aggregation a key2 -> Aggregation a key3 -> (Basics.Float -> Basics.Float -> Basics.Float -> a -> b) -> List a -> List b

Map function that provides three aggregated values to the mapping function. Each of the first three arguments is an Aggregation that combines a function defining the aggregation key, a predicate that lets you filter out certain rows from the aggregation, and the aggregation operation to apply. Usage:

    testDataSet =
        [ TestInput1 "k1_1" "k2_1" 1
        , TestInput1 "k1_1" "k2_1" 2
        , TestInput1 "k1_1" "k2_2" 3
        , TestInput1 "k1_1" "k2_2" 4
        , TestInput1 "k1_2" "k2_1" 5
        , TestInput1 "k1_2" "k2_1" 6
        , TestInput1 "k1_2" "k2_2" 7
        , TestInput1 "k1_2" "k2_2" 8
        ]

    testDataSet
        |> aggregateMap3
            (sumOf .value |> byKey .key1)
            (maximumOf .value |> byKey .key2)
            (minimumOf .value |> byKey (key2 .key1 .key2))
            (\totalValue maxValue minValue input ->
                ( input, totalValue * maxValue / input.value + minValue )
            )
        {- ==
            [ ( TestInput1 "k1_1" "k2_1" 1, 10 * 6 / 1 + 1 )
            , ( TestInput1 "k1_1" "k2_1" 2, 10 * 6 / 2 + 1 )
            , ( TestInput1 "k1_1" "k2_2" 3, 10 * 8 / 3 + 3 )
            , ( TestInput1 "k1_1" "k2_2" 4, 10 * 8 / 4 + 3 )
            , ( TestInput1 "k1_2" "k2_1" 5, 26 * 6 / 5 + 5 )
            , ( TestInput1 "k1_2" "k2_1" 6, 26 * 6 / 6 + 5 )
            , ( TestInput1 "k1_2" "k2_2" 7, 26 * 8 / 7 + 7 )
            , ( TestInput1 "k1_2" "k2_2" 8, 26 * 8 / 8 + 7 )
            ]
        -}

aggregateMap4 : Aggregation a key1 -> Aggregation a key2 -> Aggregation a key3 -> Aggregation a key4 -> (Basics.Float -> Basics.Float -> Basics.Float -> Basics.Float -> a -> b) -> List a -> List b

Map function that provides four aggregated values to the mapping function. Each of the first four arguments is an Aggregation that combines a function defining the aggregation key, a predicate that lets you filter out certain rows from the aggregation, and the aggregation operation to apply. Usage:

    testDataSet =
        [ TestInput1 "k1_1" "k2_1" 1
        , TestInput1 "k1_1" "k2_1" 2
        , TestInput1 "k1_1" "k2_2" 3
        , TestInput1 "k1_1" "k2_2" 4
        , TestInput1 "k1_2" "k2_1" 5
        , TestInput1 "k1_2" "k2_1" 6
        , TestInput1 "k1_2" "k2_2" 7
        , TestInput1 "k1_2" "k2_2" 8
        ]

    testDataSet
        |> aggregateMap4
            (sumOf .value |> byKey .key1)
            (maximumOf .value |> byKey .key2)
            (minimumOf .value |> byKey (key2 .key1 .key2))
            (averageOf .value |> byKey (key2 .key1 .key2))
            (\totalValue maxValue minValue average input ->
                ( input, totalValue * maxValue / input.value + minValue + average )
            )
        {- ==
            [ ( TestInput1 "k1_1" "k2_1" 1, 10 * 6 / 1 + 1 + 1.5 )
            , ( TestInput1 "k1_1" "k2_1" 2, 10 * 6 / 2 + 1 + 1.5 )
            , ( TestInput1 "k1_1" "k2_2" 3, 10 * 8 / 3 + 3 + 3.5 )
            , ( TestInput1 "k1_1" "k2_2" 4, 10 * 8 / 4 + 3 + 3.5 )
            , ( TestInput1 "k1_2" "k2_1" 5, 26 * 6 / 5 + 5 + 5.5 )
            , ( TestInput1 "k1_2" "k2_1" 6, 26 * 6 / 6 + 5 + 5.5 )
            , ( TestInput1 "k1_2" "k2_2" 7, 26 * 8 / 7 + 7 + 7.5 )
            , ( TestInput1 "k1_2" "k2_2" 8, 26 * 8 / 8 + 7 + 7.5 )
            ]
        -}

Operators

count : Aggregation a Morphir.SDK.Key.Key0

Count the number of rows in a group.

sumOf : (a -> Basics.Float) -> Aggregation a Morphir.SDK.Key.Key0

Apply a function to each row that returns a numeric value and return the sum of the values.

minimumOf : (a -> Basics.Float) -> Aggregation a Morphir.SDK.Key.Key0

Apply a function to each row that returns a numeric value and return the minimum of the values.

maximumOf : (a -> Basics.Float) -> Aggregation a Morphir.SDK.Key.Key0

Apply a function to each row that returns a numeric value and return the maximum of the values.

averageOf : (a -> Basics.Float) -> Aggregation a Morphir.SDK.Key.Key0

Apply a function to each row that returns a numeric value and return the average of the values.

weightedAverageOf : (a -> Basics.Float) -> (a -> Basics.Float) -> Aggregation a Morphir.SDK.Key.Key0

Apply two functions to each row that each return a numeric value and return the weighted average of the values, using the first function to get the weights.
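To make the weighted average concrete, here is a minimal sketch of the usual formula (sum of weight times value, divided by the sum of weights), written with elm/core only. The helper name weightedAverage and the module name are hypothetical. This models the arithmetic under the assumption that the operator follows the standard definition; it is not the library's implementation:

```elm
module WeightedAverageSketch exposing (weightedAverage)

-- Illustrative model of a weighted average, written with elm/core only.
-- This is NOT the library's implementation; `weightedAverage` is a
-- hypothetical helper. The first function supplies the weights and the
-- second supplies the values, mirroring the argument order described above.


weightedAverage : (a -> Float) -> (a -> Float) -> List a -> Float
weightedAverage getWeight getValue rows =
    let
        -- Sum of all weights in the group.
        totalWeight =
            rows |> List.map getWeight |> List.sum

        -- Sum of each value multiplied by its weight.
        weightedSum =
            rows
                |> List.map (\row -> getWeight row * getValue row)
                |> List.sum
    in
    -- An empty list divides 0 by 0, which yields NaN.
    weightedSum / totalWeight
```

For instance, weightedAverage .weight .value [ { weight = 1, value = 10 }, { weight = 3, value = 20 } ] works out to (1 * 10 + 3 * 20) / (1 + 3) = 17.5.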

byKey : (a -> key) -> Aggregation a Morphir.SDK.Key.Key0 -> Aggregation a key

Changes the key of an aggregation. Usage:

count
    |> byKey .key1
    == { key = .key1
       , filter = always True
       , operator = Count
       }

withFilter : (a -> Basics.Bool) -> Aggregation a key -> Aggregation a key

Adds a filter to an aggregation. Usage:

count
    |> withFilter (\a -> a.value < 0)
    == { key = key0
       , filter = \a -> a.value < 0
       , operator = Count
       }
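Going by the signatures above, byKey and withFilter can be chained on the same aggregation, since withFilter keeps the key type unchanged. The sketch below only shows that the composition type-checks against the documented signatures. The TestInput1 alias and the module name are assumptions, and no result is claimed for it:

```elm
module AggregationCompositionSketch exposing (filteredSumByKey1)

import Morphir.SDK.Aggregate exposing (Aggregation, byKey, sumOf, withFilter)

-- Hypothetical row type, inferred from the examples on this page.


type alias TestInput1 =
    { key1 : String
    , key2 : String
    , value : Float
    }


-- Sum of `value` keyed by `key1`, with a filter on `value < 7` attached.
-- A value like this could be passed as the first argument of aggregateMap.


filteredSumByKey1 : Aggregation TestInput1 String
filteredSumByKey1 =
    sumOf .value
        |> byKey .key1
        |> withFilter (\a -> a.value < 7)
```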

Utilities

constructAggregationCall : Morphir.IR.Value.TypedValue -> Morphir.IR.Value.TypedValue -> Morphir.IR.Value.TypedValue -> Result ConstructAggregationError AggregationCall

constructAggregationCall transforms a Morphir.SDK.Aggregate groupBy and aggregate call into a single data structure.


type AggregationCall
    = AggregationCall Morphir.IR.Name.Name (Maybe Morphir.IR.Name.Name) (List AggregateValue) Morphir.IR.Value.TypedValue

An AggregationCall represents a call to Morphir.SDK.Aggregate.aggregate

Its values are:


type AggregateValue
    = AggregateValue Morphir.IR.Name.Name (Maybe Morphir.IR.Value.TypedValue) Morphir.IR.FQName.FQName (Maybe Morphir.IR.Value.TypedValue)

An AggregateValue represents a single aggregation used within the overall Aggregation call

Its values are:


type ConstructAggregationError

A ConstructAggregationError represents the ways constructAggregationCall may fail

Its values are: