jxxcarlson / elm-stat / Stat

The goal of this module is to provide the most commonly used statistical functions.

Measures of Central Tendency

mean : List Basics.Float -> Maybe Basics.Float

Compute the mean of a list of floats.

> Stat.mean [1,2,4,5] == Just 3
> Stat.mean [] == Nothing

meanWithDefault : List Basics.Float -> Basics.Float -> Basics.Float

Compute the mean of a list of floats, but in case of an empty list return the default value that was provided.

> Stat.meanWithDefault [1,2,4,5] 0 == 3
> Stat.meanWithDefault [] 0 == 0
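Going by the description above, the same fallback can presumably also be written with mean and the core Maybe.withDefault. The following is a minimal sketch under that assumption; the module name Example and the helper averageOrZero are hypothetical and not part of the package.

```elm
module Example exposing (averageOrZero)

import Stat


{-| Hypothetical helper: the mean of the list, or 0 when the list is empty.
-}
averageOrZero : List Float -> Float
averageOrZero values =
    Stat.mean values
        |> Maybe.withDefault 0
```

meanWithDefault keeps the call site shorter; the Maybe-based form is useful when the empty case needs different handling than a plain default.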

average : List Basics.Float -> Maybe Basics.Float

Same as mean.

> Stat.average [1,2,4,5] == Just 3
> Stat.average [] == Nothing

geometricMean : List Basics.Float -> Maybe Basics.Float

Compute the geometric mean of a list of floats.

> Stat.geometricMean [1,2.7,5.9] == Just 2.51
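As a worked check of the example above, assuming the usual definition of the geometric mean (the n-th root of the product of n values); the symbols n and x_i are notation introduced here for illustration, and the package source is not shown in this document:

```latex
\[
\mathrm{GM} = \Bigl(\prod_{i=1}^{n} x_i\Bigr)^{1/n},
\qquad
\bigl(1 \cdot 2.7 \cdot 5.9\bigr)^{1/3} = 15.93^{1/3} \approx 2.516
\]
```

The documented result shows this value to two decimal places.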

harmonicMean : List Basics.Float -> Maybe Basics.Float

Compute the harmonic mean of a list of floats.

> Stat.harmonicMean [1,2,4,5] == Just 2.0512820512820515
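The example above can be checked by hand, assuming the usual definition of the harmonic mean (the number of values divided by the sum of their reciprocals); n and x_i are notation introduced here for illustration:

```latex
\[
\mathrm{HM} = \frac{n}{\sum_{i=1}^{n} \frac{1}{x_i}},
\qquad
\frac{4}{1 + \tfrac{1}{2} + \tfrac{1}{4} + \tfrac{1}{5}} = \frac{4}{1.95} \approx 2.0513
\]
```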

weightedMean : List ( Basics.Float, Basics.Float ) -> Maybe Basics.Float

Compute the weighted mean of a list of tuples, where the first element of each tuple is the weight and the second is the value.

> Stat.weightedMean [(2,5),(8,10)] == Just 9
> Stat.weightedMean [(0,5),(0,10)] == Nothing -- the sum of the weights cannot be 0
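Written out for the first example, with w_i for the weights and v_i for the values (notation introduced here for illustration, assuming the standard weighted-mean formula):

```latex
\[
\bar{v}_w = \frac{\sum_i w_i v_i}{\sum_i w_i},
\qquad
\frac{2 \cdot 5 + 8 \cdot 10}{2 + 8} = \frac{90}{10} = 9
\]
```

The denominator is the sum of the weights, which is why a list whose weights sum to 0 yields Nothing.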

median : List Basics.Float -> Maybe Basics.Float

Compute the median of the list. The median is the value separating the higher half from the lower half of a data sample. If the sample has an odd number of values, the median is the value in the middle. If the sample has an even number of values, the median is the mean of the two middle values.

> Stat.median [1,6,10] == Just 6
> Stat.median [1,6,8,10] == Just 7

mode : List comparable -> Maybe ( comparable, Basics.Int )

Compute the mode of the data: the most frequently occurring value, paired with the number of times it occurs.

> data = [1, 5, 2, 2, 2, 2, 5, 3, 1]
> mode data
  Just (2,4) : Maybe ( number, Int )

> data = ["red", "green", "red", "blue", "blue", "red"]
> mode data
  Just ("red",3) : Maybe ( String, Int )

rootMeanSquare : List Basics.Float -> Maybe Basics.Float

Root mean square (RMS) is the square root of the mean of the squared values, that is, the sum of the squares of the values in a list divided by the length of the list, then square-rooted. Also known as the quadratic mean.

> Stat.rootMeanSquare [ 1, 10, 20 ] == Just 12.92
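The definition above, applied step by step to the example list; n and x_i are notation introduced here for illustration:

```latex
\[
\mathrm{RMS} = \sqrt{\frac{1}{n}\sum_{i=1}^{n} x_i^{2}},
\qquad
\sqrt{\frac{1^{2} + 10^{2} + 20^{2}}{3}} = \sqrt{\frac{501}{3}} = \sqrt{167} \approx 12.92
\]
```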

skewness : List Basics.Float -> Maybe Basics.Float

Skewness (or skew) is a measure of the asymmetry of the probability distribution of a variable around its mean. There are several ways to calculate it; this function uses Pearson’s moment coefficient of skewness.

> Stat.skewness [1,10.5,20] == Just 0
> Stat.skewness [1,2,3,10] == Just 1.01
> Stat.skewness [1,30,30,30] == Just -1.15
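A worked version of the second example, assuming the moment coefficient is computed from population moments (dividing by n); that assumption is consistent with the documented values but is not confirmed by the package source here. The symbols m_2, m_3, n, and x_i are notation introduced for illustration:

```latex
\[
g_1 = \frac{m_3}{m_2^{3/2}},
\qquad
m_k = \frac{1}{n}\sum_{i=1}^{n} (x_i - \bar{x})^{k}
\]
\[
\text{For } [1, 2, 3, 10]:\quad
\bar{x} = 4,\quad
m_2 = \frac{9 + 4 + 1 + 36}{4} = 12.5,\quad
m_3 = \frac{-27 - 8 - 1 + 216}{4} = 45,\quad
g_1 = \frac{45}{12.5^{3/2}} \approx 1.018
\]
```

A positive value reflects the long right tail created by the value 10; the symmetric list [1,10.5,20] gives m_3 = 0 and therefore a skewness of 0.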

Measures of Dispersion

variance : List Basics.Float -> Maybe Basics.Float

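In statistics, variance is the expectation of the squared deviation of a random variable from its mean. The example below is consistent with the population variance, which divides by the number of values n rather than by n - 1.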

> Stat.variance [1,2,3,4,5] == Just 2
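Checking the example by hand, assuming the population form of the variance (division by n), which is the form that reproduces the documented result; n, x_i, and the bar notation are introduced here for illustration:

```latex
\[
\sigma^{2} = \frac{1}{n}\sum_{i=1}^{n} (x_i - \bar{x})^{2},
\qquad
\bar{x} = 3,\quad
\sigma^{2} = \frac{4 + 1 + 0 + 1 + 4}{5} = 2
\]
```

Dividing by n - 1 instead would give 2.5, which does not match the example.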

standardDeviation : List Basics.Float -> Maybe Basics.Float

The standard deviation is the square root of variance. A low standard deviation indicates that the values tend to be close to the mean.

> Stat.standardDeviation [1,2,3,4,5] == Just 1.41
> Stat.standardDeviation [2,2,2] == Just 0

meanAbsoluteDeviation : List Basics.Float -> Maybe Basics.Float

The average absolute deviation, or mean absolute deviation, of a data set is the average of the absolute deviations from the mean.

> Stat.meanAbsoluteDeviation [1,2,5,4] == Just 1.5
> Stat.meanAbsoluteDeviation [1,2,4] == Just 1.11
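The first example worked through, following the definition above; n and x_i are notation introduced here for illustration:

```latex
\[
\mathrm{MAD}_{\text{mean}} = \frac{1}{n}\sum_{i=1}^{n} \lvert x_i - \bar{x} \rvert,
\qquad
\bar{x} = \frac{1 + 2 + 5 + 4}{4} = 3,\quad
\frac{2 + 1 + 2 + 1}{4} = 1.5
\]
```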

medianAbsoluteDeviation : List Basics.Float -> Maybe Basics.Float

The median absolute deviation of a data set is the average of the absolute deviations from the median.

> Stat.medianAbsoluteDeviation [ 1, 2, 4 ] == Just 1

zScore : Basics.Float -> Basics.Float -> Basics.Float -> Basics.Float

Calculate the Z-score (standard score) of a given element. The arguments are the element, the mean, and the standard deviation, in that order.

> Stat.zScore 1 3 1.58 == -1.26
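In formula form, assuming the standard definition of the Z-score, with x the element, \mu the mean, and \sigma the standard deviation (symbols introduced here for illustration):

```latex
\[
z = \frac{x - \mu}{\sigma},
\qquad
\frac{1 - 3}{1.58} \approx -1.266
\]
```

The documented result shows this value to two decimal places. The sign indicates that the element lies below the mean.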

zScores : List Basics.Float -> Maybe (List Basics.Float)

Calculate the Z-score (standard score) of each element of the provided list.

> Stat.zScores [1,2,4,5] == Just [-1.26,-0.63,0.63,1.26]

Similarity

covariance : List ( Basics.Float, Basics.Float ) -> Maybe Basics.Float

Covariance is a measure of how two random variables vary together. When the greater values of one variable correspond to the greater values of the other variable, the covariance is positive. When the greater values of one variable correspond to the lesser values of the other variable, the covariance is negative.

> Stat.covariance [(1,2),(4,8),(5,10)] == Just 5.77
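The example can be reproduced by hand, assuming the population covariance (division by n), which is the form that matches the documented value; the package source is not shown here, and n, x_i, y_i are notation introduced for illustration:

```latex
\[
\operatorname{cov}(x, y) = \frac{1}{n}\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y}),
\qquad
\bar{x} = \tfrac{10}{3},\quad \bar{y} = \tfrac{20}{3}
\]
\[
\frac{1}{3}\Bigl(\tfrac{98}{9} + \tfrac{8}{9} + \tfrac{50}{9}\Bigr)
= \frac{1}{3}\cdot\frac{156}{9}
= \frac{52}{9} \approx 5.778
\]
```

The documented result shows this value to two decimal places.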

correlation : List ( Basics.Float, Basics.Float ) -> Maybe Basics.Float

A correlation is a “normalized” covariance; its values are between -1.0 and 1.0.

> Stat.correlation [(1,2),(4,8),(5,10)] == Just 1.00
> Stat.correlation [(10,0),(40,-30),(50,-32)] == Just -0.98
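The normalization can be made explicit for the first example, assuming the Pearson correlation coefficient with population standard deviations; \sigma_x and \sigma_y are notation introduced here for illustration:

```latex
\[
r = \frac{\operatorname{cov}(x, y)}{\sigma_x \, \sigma_y},
\qquad
\operatorname{cov}(x, y) = \tfrac{52}{9},\quad
\sigma_x^{2} = \tfrac{26}{9},\quad
\sigma_y^{2} = \tfrac{104}{9}
\]
\[
r = \frac{52/9}{\sqrt{\tfrac{26}{9}\cdot\tfrac{104}{9}}}
  = \frac{52/9}{52/9} = 1
\]
```

The result is exactly 1 because every y value in the first example is twice the corresponding x value.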

r2 : List ( Basics.Float, Basics.Float ) -> Maybe Basics.Float

R² is the square of the correlation coefficient.

> Stat.r2 [(1,2),(4,8),(5,10)] == Just 1.00
> Stat.r2 [(10,0),(40,-30),(50,-32)] == Just 0.97

Linear Regression

linearRegression : List ( Basics.Float, Basics.Float ) -> Maybe ( Basics.Float, Basics.Float )

Linear regression finds the line that best fits the given points. The method used here is simple linear regression. The tuple returned is (alpha, beta), where y = alpha + beta * x.

> Stat.linearRegression [(1,3),(4,9),(5,11)] == Just (1,2) -- 3 = 1 + 2 * 1
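For reference, the example can be derived with the usual least-squares formulas for simple linear regression; this is a sketch of the standard method, not necessarily the exact computation in the package, and the summation notation is introduced here for illustration:

```latex
\[
\beta = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sum_i (x_i - \bar{x})^{2}},
\qquad
\alpha = \bar{y} - \beta\,\bar{x}
\]
\[
\bar{x} = \tfrac{10}{3},\quad \bar{y} = \tfrac{23}{3},\quad
\beta = \frac{\tfrac{98}{9} + \tfrac{8}{9} + \tfrac{50}{9}}{\tfrac{49}{9} + \tfrac{4}{9} + \tfrac{25}{9}}
      = \frac{156/9}{78/9} = 2,
\quad
\alpha = \tfrac{23}{3} - 2\cdot\tfrac{10}{3} = 1
\]
```

All three points lie exactly on y = 1 + 2 * x, so the fit is perfect in this case.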

linearRegressionLine : ( Basics.Float, Basics.Float ) -> Basics.Float -> Basics.Float

Given an (alpha, beta) tuple, returns a function of the form y = alpha + beta * x. This may come in handy when generating points on the regression line.

> Stat.linearRegression [(1,3),(4,9),(5,11)] == Just (1,2)
> f = Stat.linearRegressionLine (1,2)
  <function> : Float -> Float
> f 5 == 11
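Because linearRegression returns a Maybe, the two functions are typically combined with a case expression. The following is a minimal sketch of generating points on the fitted line; the module name Example and the helper fittedPoints are hypothetical and not part of the package, and returning an empty list on Nothing is just one possible choice.

```elm
module Example exposing (fittedPoints)

import Stat


{-| Hypothetical helper: fit a line to `samples`, then evaluate it at each
x in `xs`. Returns an empty list when no line could be fitted.
-}
fittedPoints : List ( Float, Float ) -> List Float -> List ( Float, Float )
fittedPoints samples xs =
    case Stat.linearRegression samples of
        Just coefficients ->
            let
                -- Turn the (alpha, beta) tuple into a Float -> Float function.
                line =
                    Stat.linearRegressionLine coefficients
            in
            List.map (\x -> ( x, line x )) xs

        Nothing ->
            []
```

With the samples from the example above, each returned point should lie on y = 1 + 2 * x.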