BrianHicks / elm-csv / Csv.Decode

Decode values from CSV. This package tries to be as unsurprising as possible, imitating elm/json and NoRedInk/elm-json-decode-pipeline so that you can apply whatever you already know about JSON decoders to a different data format.

A Crash Course on Constructing Decoders

Say you have a CSV like this:

ID,Name,Species
1,Atlas,cat
2,Axel,puffin

You want to get some data out of it, so you're looking through these docs. Where do you begin?

The first thing you need to know is that decoders are designed to fit together to match whatever data shapes are in your CSV. So to decode the ID (an Int in the "ID" field), you'd combine int and field like this:

data : String
data =
    -- \u{000D} is the carriage return
    "ID,Name,Species\u{000D}\n1,Atlas,cat\u{000D}\n2,Axel,puffin"

decodeCsv FieldNamesFromFirstRow (field "ID" int) data
--> Ok [ 1, 2 ]

But this is probably not enough, so we'll need to combine a bunch of decoders together using into:

decodeCsv FieldNamesFromFirstRow
    (into
        (\id name species ->
            { id = id
            , name = name
            , species = species
            }
        )
        |> pipeline (field "ID" int)
        |> pipeline (field "Name" string)
        |> pipeline (field "Species" string)
    )
    data
--> Ok
-->     [ { id = 1, name = "Atlas", species = "cat" }
-->     , { id = 2, name = "Axel", species = "puffin" }
-->     ]

You can decode as many things as you want by giving into a function that takes more arguments.

Basic Decoders


type Decoder a

A way to specify what kind of thing you want to decode into. For example, if you have a Pet data type, you'd want a Decoder Pet.

string : Decoder String

Decode a string.

decodeCsv NoFieldNames string "a" --> Ok [ "a" ]

Unless you specify otherwise (e.g. with field) this will assume there is only one column in the CSV and try to decode that.

decodeCsv NoFieldNames string "a,b"
--> Err
-->     (DecodingErrors
-->         [ FieldDecodingError
-->             { row = 0
-->             , column = OnlyColumn
-->             , problem = ExpectedOneColumn 2
-->             }
-->         ]
-->     )

int : Decoder Basics.Int

Decode an integer.

decodeCsv NoFieldNames int "1" --> Ok [ 1 ]

decodeCsv NoFieldNames int "volcano"
--> Err
-->     (DecodingErrors
-->         [ FieldDecodingError
-->           { row = 0
-->           , column = OnlyColumn
-->           , problem = ExpectedInt "volcano"
-->           }
-->         ]
-->     )

Unless you specify otherwise (e.g. with field) this will assume there is only one column in the CSV and try to decode that.

decodeCsv NoFieldNames int "1,2"
--> Err
-->     (DecodingErrors
-->         [ FieldDecodingError
-->           { row = 0
-->           , column = OnlyColumn
-->           , problem = ExpectedOneColumn 2
-->           }
-->         ]
-->     )

float : Decoder Basics.Float

Decode a floating-point number.

decodeCsv NoFieldNames float "3.14" --> Ok [ 3.14 ]

decodeCsv NoFieldNames float "mimesis"
--> Err
-->     (DecodingErrors
-->         [ FieldDecodingError
-->           { row = 0
-->           , column = OnlyColumn
-->           , problem = ExpectedFloat "mimesis"
-->           }
-->         ]
-->     )

Unless you specify otherwise (e.g. with field) this will assume there is only one column in the CSV and try to decode that.

decodeCsv NoFieldNames float "1.0,2.0"
--> Err
-->     (DecodingErrors
-->         [ FieldDecodingError
-->           { row = 0
-->           , column = OnlyColumn
-->           , problem = ExpectedOneColumn 2
-->           }
-->         ]
-->     )

blank : Decoder a -> Decoder (Maybe a)

Handle blank fields by turning them into Maybes. We consider a field to be blank if it's empty or consists solely of whitespace characters.

decodeCsv NoFieldNames (blank int) "\r\n1"
--> Ok [ Nothing, Just 1 ]
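As a rough sketch of how blank composes with the other decoders, the example below wraps float in blank and reads it from a named column. The CSV contents and the name `weights` are made up for illustration.

```elm
import Csv.Decode exposing (Error, FieldNames(..), blank, decodeCsv, field, float)


-- Hypothetical data: the second data row has an empty Weight field.
weights : Result Error (List (Maybe Float))
weights =
    decodeCsv FieldNamesFromFirstRow
        (field "Weight" (blank float))
        "Name,Weight,Species\r\nAtlas,14,cat\r\nAxel,,puffin"

--> Ok [ Just 14, Nothing ]
```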

Finding Values

column : Basics.Int -> Decoder a -> Decoder a

Parse a value at a numbered column, starting from 0.

decodeCsv NoFieldNames (column 1 string) "a,b,c" --> Ok [ "b" ]

decodeCsv NoFieldNames (column 100 float) "3.14"
--> Err
-->     (DecodingErrors
-->         [ FieldDecodingError
-->           { row = 0
-->           , column = Column 100
-->           , problem = ColumnNotFound 100
-->           }
-->         ]
-->     )

field : String -> Decoder a -> Decoder a

Parse a value at a named column. There are several ways to provide these names; see FieldNames.

decodeCsv
    FieldNamesFromFirstRow
    (field "Country" string)
    "Country\r\nArgentina"
--> Ok [ "Argentina" ]

optionalColumn : Basics.Int -> Decoder a -> Decoder (Maybe a)

Like column, parse a value at a numbered column. The parsing succeeds even if the column is missing.

decodeCsv
    NoFieldNames
    (optionalColumn 1 string)
    "Pie\r\nApple,Argentina"
--> Ok [ Nothing, Just "Argentina" ]

optionalField : String -> Decoder a -> Decoder (Maybe a)

Like field, parse a value at a named column. The parsing succeeds even if the column is missing.

decodeCsv
    FieldNamesFromFirstRow
    (optionalField "Country" string)
    "Country\r\nArgentina"
--> Ok [ Just "Argentina" ]


decodeCsv
    FieldNamesFromFirstRow
    (optionalField "Country" string)
    "Pie\r\nApple"
--> Ok [ Nothing ]
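If you would rather have a fallback value than a Maybe, one option is to combine optionalField with map. This is only a sketch: the "Unknown" default and the name `countries` are assumptions made for the example.

```elm
import Csv.Decode exposing (Error, FieldNames(..), decodeCsv, map, optionalField, string)


-- The "Country" column is missing, so every row falls back to the default.
countries : Result Error (List String)
countries =
    decodeCsv FieldNamesFromFirstRow
        (optionalField "Country" string
            |> map (Maybe.withDefault "Unknown")
        )
        "Pie\r\nApple"

--> Ok [ "Unknown" ]
```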

Running Decoders


type FieldNames
    = NoFieldNames
    | CustomFieldNames (List String)
    | FieldNamesFromFirstRow

Where do we get names for use with field?
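As a minimal sketch of CustomFieldNames, the example below supplies the names by hand for a CSV that has no header row. It assumes the names are assigned to columns in the order given and that every row is treated as data; the data and the name `names` are hypothetical.

```elm
import Csv.Decode exposing (Error, FieldNames(..), decodeCsv, field, string)


-- "Id" is assumed to name column 0 and "Name" column 1.
names : Result Error (List String)
names =
    decodeCsv
        (CustomFieldNames [ "Id", "Name" ])
        (field "Name" string)
        "1,Atlas\r\n2,Axel"

--> Ok [ "Atlas", "Axel" ]
```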

decodeCsv : FieldNames -> Decoder a -> String -> Result Error (List a)

Convert a CSV string into some type you care about using the Decoders in this module!

decodeCustom : { fieldSeparator : Char } -> FieldNames -> Decoder a -> String -> Result Error (List a)

Convert something shaped roughly like a CSV. For example, to decode a TSV (tab-separated values) string:

decodeCustom { fieldSeparator = '\t' }
    NoFieldNames
    (map2 Tuple.pair
        (column 0 int)
        (column 1 string)
    )
    "1\tBrian\n2\tAtlas"
    --> Ok [ ( 1, "Brian" ), ( 2, "Atlas" ) ]
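The same approach should work for other separators. Here is a small sketch for semicolon-separated data with a header row; the input and the name `petNames` are made up for the example.

```elm
import Csv.Decode exposing (Error, FieldNames(..), decodeCustom, field, string)


-- Semicolon-separated input whose first row holds the field names.
petNames : Result Error (List String)
petNames =
    decodeCustom { fieldSeparator = ';' }
        FieldNamesFromFirstRow
        (field "Name" string)
        "ID;Name\r\n1;Atlas\r\n2;Axel"

--> Ok [ "Atlas", "Axel" ]
```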


type Error
    = ParsingError Csv.Parser.Problem
    | NoFieldNamesOnFirstRow
    | DecodingErrors (List DecodingError)

Sometimes we cannot decode every row in a CSV. This is how we tell you what went wrong. If you need to present this to someone, you can get a human-readable version with errorToString.

Some more detail:


type DecodingError
    = FieldDecodingError ({ row : Basics.Int, column : Column, problem : Problem })
    | OneOfDecodingError Basics.Int (List DecodingError)
    | FieldNotProvided String
    | NoFieldNamesProvided

Errors when decoding can be any of the variants listed above.

errorToString : Error -> String

Produce a human-readable version of an Error.
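One way to use this is to collapse both branches of the Result into a message you can show to a user. The helper below is a hypothetical sketch (`summarize` is not part of this package), and the exact wording of the error text is left to errorToString.

```elm
import Csv.Decode exposing (FieldNames(..), decodeCsv, errorToString, int)


-- Hypothetical helper: describe the outcome of decoding a one-column CSV of ints.
summarize : String -> String
summarize csv =
    case decodeCsv NoFieldNames int csv of
        Ok values ->
            "Decoded " ++ String.fromInt (List.length values) ++ " row(s)"

        Err error ->
            errorToString error
```

For example, `summarize "1\r\n2"` takes the Ok branch and reports two rows, while `summarize "volcano"` returns whatever errorToString produces for the ExpectedInt problem.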


type Column
    = Column Basics.Int
    | Field String (Maybe Basics.Int)
    | OnlyColumn

Where did the problem happen?
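If you want to work with errors programmatically instead of printing them, you can pattern match on these types. The sketch below collects the row numbers of field-level failures; `failedRows` and `fieldErrorRow` are hypothetical helpers, and other kinds of errors are simply ignored here.

```elm
import Csv.Decode exposing (DecodingError(..), Error(..))


-- Hypothetical helper: rows that produced a FieldDecodingError.
failedRows : Error -> List Int
failedRows error =
    case error of
        DecodingErrors errors ->
            List.filterMap fieldErrorRow errors

        _ ->
            []


-- Only FieldDecodingError carries a row in its record; skip the other variants.
fieldErrorRow : DecodingError -> Maybe Int
fieldErrorRow decodingError =
    case decodingError of
        FieldDecodingError { row } ->
            Just row

        _ ->
            Nothing
```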


type Problem
    = ColumnNotFound Basics.Int
    | FieldNotFound String
    | ExpectedOneColumn Basics.Int
    | ExpectedInt String
    | ExpectedFloat String
    | Failure String

Things that can go wrong while decoding.

Transforming Values

map : (from -> to) -> Decoder from -> Decoder to

Transform a decoded value.

decodeCsv NoFieldNames (map (\i -> i * 2) int) "15"
--> Ok [ 30 ]

decodeCsv NoFieldNames (map String.reverse string) "slap"
--> Ok [ "pals" ]

map2 : (a -> b -> c) -> Decoder a -> Decoder b -> Decoder c

Combine two decoders to make something else.

decodeCsv NoFieldNames
    (map2 Tuple.pair
        (column 0 int)
        (column 1 string)
    )
    "1,Atlas"
    --> Ok [ (1, "Atlas") ]

map3 : (a -> b -> c -> d) -> Decoder a -> Decoder b -> Decoder c -> Decoder d

Like map2, but with three decoders. map4 and beyond don't exist in this package. Use into to decode records instead!

decodeCsv NoFieldNames
    (map3 (\r g b -> (r, g, b))
        (column 0 int)
        (column 1 int)
        (column 2 int)
    )
    "255,255,0"
    --> Ok [ (255, 255, 0) ]

into : (a -> b) -> Decoder (a -> b)

Combine an arbitrary number of fields. You provide a function that takes as many arguments as you need, then send it values by providing decoders with pipeline.

type alias Pet =
    { id : Int
    , name : String
    , species : String
    , weight : Float
    }

petDecoder : Decoder Pet
petDecoder =
    into Pet
        |> pipeline (column 0 int)
        |> pipeline (column 1 string)
        |> pipeline (column 2 string)
        |> pipeline (column 3 float)

Now you can decode pets like this:

decodeCsv NoFieldNames petDecoder "1,Atlas,cat,14\r\n2,Axel,puffin,1.37"
--> Ok
-->     [ { id = 1, name = "Atlas", species = "cat", weight = 14 }
-->     , { id = 2, name = "Axel", species = "puffin", weight = 1.37 }
-->     ]

pipeline : Decoder a -> Decoder (a -> b) -> Decoder b

See into.

Fancy Decoding

oneOf : Decoder a -> List (Decoder a) -> Decoder a

Try several possible decoders in sequence, committing to the first one that passes.

decodeCsv NoFieldNames
    (oneOf
        (map Just int)
        [ succeed Nothing ]
    )
    "1"
--> Ok [ Just 1 ]

decodeCsv NoFieldNames
    (oneOf
        (map Just int)
        [ succeed Nothing ]
    )
    "a"
--> Ok [ Nothing ]
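oneOf can also be used to decode into a custom type with one variant per alternative. This is a sketch only: the `Cell` type and its constructors are made up for the example.

```elm
import Csv.Decode exposing (Decoder, Error, FieldNames(..), decodeCsv, int, map, oneOf, string)


-- Hypothetical type: a cell is either a number or arbitrary text.
type Cell
    = Number Int
    | Text String


-- Try int first; fall back to keeping the raw string.
cell : Decoder Cell
cell =
    oneOf
        (map Number int)
        [ map Text string ]


cells : Result Error (List Cell)
cells =
    decodeCsv NoFieldNames cell "1\r\nvolcano"

--> Ok [ Number 1, Text "volcano" ]
```

Order matters: putting the string alternative first would make every cell a Text, since string accepts any single column.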

andThen : (a -> Decoder b) -> Decoder a -> Decoder b

Decode some value and then make a decoding decision based on the outcome. For example, if you wanted to reject negative numbers, you might do something like this:

positiveInt : Decoder Int
positiveInt =
    int
        |> andThen
            (\rawInt ->
                if rawInt < 0 then
                    Decode.fail "Only positive numbers allowed!"

                else
                    Decode.succeed rawInt
            )

You could then use it like this:

decodeCsv NoFieldNames positiveInt "1" --> Ok [ 1 ]

decodeCsv NoFieldNames positiveInt "-1"
--> Err
-->     (DecodingErrors
-->         [ FieldDecodingError { row = 0, column = OnlyColumn, problem = Failure "Only positive numbers allowed!" } ]
-->     )
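The same pattern can turn raw strings into a custom type. The sketch below is an illustration under assumptions: the `Species` type, its constructors, and the error message are all made up.

```elm
import Csv.Decode exposing (Decoder, Error, FieldNames(..), andThen, decodeCsv, fail, string, succeed)


-- Hypothetical type for the example.
type Species
    = Cat
    | Puffin


-- Decode the raw string first, then decide whether it names a known species.
species : Decoder Species
species =
    string
        |> andThen
            (\raw ->
                case raw of
                    "cat" ->
                        succeed Cat

                    "puffin" ->
                        succeed Puffin

                    _ ->
                        fail ("Unknown species: " ++ raw)
            )


decoded : Result Error (List Species)
decoded =
    decodeCsv NoFieldNames species "cat\r\npuffin"

--> Ok [ Cat, Puffin ]
```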

succeed : a -> Decoder a

Always succeed, no matter what. Mostly useful with andThen.

fail : String -> Decoder a

Always fail with the given message, no matter what. Mostly useful with andThen.
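Although succeed is mostly useful with andThen, it can also supply a constant to a pipeline when a value is not present in the CSV at all. A minimal sketch, with made-up data and a hypothetical `cats` value:

```elm
import Csv.Decode exposing (Error, FieldNames(..), column, decodeCsv, into, pipeline, string, succeed)


-- The file only lists names; the species is filled in as a constant.
cats : Result Error (List { name : String, species : String })
cats =
    decodeCsv NoFieldNames
        (into (\name species -> { name = name, species = species })
            |> pipeline (column 0 string)
            |> pipeline (succeed "cat")
        )
        "Atlas\r\nMilo"

--> Ok
-->     [ { name = "Atlas", species = "cat" }
-->     , { name = "Milo", species = "cat" }
-->     ]
```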

fromResult : Result String a -> Decoder a

Make creating custom decoders a little easier. If you already have a function that parses into something you care about, you can combine it with this.

For example, here's how you could parse a hexadecimal number with rtfeldman/elm-hex:

import Hex

hex : Decoder Int
hex =
    andThen
        (\value -> fromResult (Hex.fromString value))
        string

decodeCsv NoFieldNames hex "ff"
--> Ok [ 255 ]
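If you would rather not pull in another package, the same idea works with a parsing function of your own. In this sketch, `parseBool` and `bool` are hypothetical names and the accepted spellings are an arbitrary choice.

```elm
import Csv.Decode exposing (Decoder, Error, FieldNames(..), andThen, decodeCsv, fromResult, string)


-- Hypothetical parser: accepts "yes" or "no" in any letter case.
parseBool : String -> Result String Bool
parseBool raw =
    case String.toLower raw of
        "yes" ->
            Ok True

        "no" ->
            Ok False

        _ ->
            Err ("Expected yes or no, got: " ++ raw)


bool : Decoder Bool
bool =
    andThen (\raw -> fromResult (parseBool raw)) string


flags : Result Error (List Bool)
flags =
    decodeCsv NoFieldNames bool "yes\r\nNo"

--> Ok [ True, False ]
```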

fromMaybe : String -> Maybe a -> Decoder a

Like fromResult but you have to specify the error message since Nothing has no further information.

For example, you could implement something like int using this:

myInt : Decoder Int
myInt =
    andThen
        (\value ->
            fromMaybe "Expected an int"
                (String.toInt value)
        )
        string

decodeCsv NoFieldNames myInt "123"
--> Ok [ 123 ]

(That said, you probably want to use int instead... it has better error messages and is more tolerant of unusual situations!)

availableFields : Decoder (List String)

Returns all available field names. The behavior depends on your configuration: