Decode values from CSV. This package tries to be as unsurprising as possible, imitating elm/json and NoRedInk/elm-json-decode-pipeline so that you can apply whatever you already know about JSON decoders to a different data format.
Say you have a CSV like this:
ID,Name,Species
1,Atlas,cat
2,Axel,puffin
You want to get some data out of it, so you're looking through these docs. Where do you begin?
The first thing you need to know is that decoders are designed to fit together to match whatever data shapes are in your CSV. So to decode the ID (an Int in the "ID" field), you'd combine int and field like this:
data : String
data =
    -- \u{000D} is the carriage return
    "ID,Name,Species\u{000D}\n1,Atlas,cat\u{000D}\n2,Axel,puffin"

decodeCsv FieldNamesFromFirstRow (field "ID" int) data
--> Ok [ 1, 2 ]
But this is probably not enough, so we'll need to combine a bunch of decoders together using into:
decodeCsv FieldNamesFromFirstRow
    (into
        (\id name species ->
            { id = id
            , name = name
            , species = species
            }
        )
        |> pipeline (field "ID" int)
        |> pipeline (field "Name" string)
        |> pipeline (field "Species" string)
    )
    data
--> Ok
-->     [ { id = 1, name = "Atlas", species = "cat" }
-->     , { id = 2, name = "Axel", species = "puffin" }
-->     ]
You can decode as many things as you want by giving into a function that takes more arguments.
A Decoder is a way to specify what kind of thing you want to decode into. For example, if you have a Pet data type, you'd want a Decoder Pet.
string : Decoder String
Decode a string.
decodeCsv NoFieldNames string "a" --> Ok [ "a" ]
Unless you specify otherwise (e.g. with field) this will assume there is only one column in the CSV and try to decode that.
decodeCsv NoFieldNames string "a,b"
--> Err
-->     (DecodingErrors
-->         [ FieldDecodingError
-->             { row = 0
-->             , column = OnlyColumn
-->             , problem = ExpectedOneColumn 2
-->             }
-->         ]
-->     )
int : Decoder Basics.Int
Decode an integer.
decodeCsv NoFieldNames int "1" --> Ok [ 1 ]
decodeCsv NoFieldNames int "volcano"
--> Err
-->     (DecodingErrors
-->         [ FieldDecodingError
-->             { row = 0
-->             , column = OnlyColumn
-->             , problem = ExpectedInt "volcano"
-->             }
-->         ]
-->     )
Unless you specify otherwise (e.g. with field) this will assume there is only one column in the CSV and try to decode that.
decodeCsv NoFieldNames int "1,2"
--> Err
-->     (DecodingErrors
-->         [ FieldDecodingError
-->             { row = 0
-->             , column = OnlyColumn
-->             , problem = ExpectedOneColumn 2
-->             }
-->         ]
-->     )
float : Decoder Basics.Float
Decode a floating-point number.
decodeCsv NoFieldNames float "3.14" --> Ok [ 3.14 ]
decodeCsv NoFieldNames float "mimesis"
--> Err
-->     (DecodingErrors
-->         [ FieldDecodingError
-->             { row = 0
-->             , column = OnlyColumn
-->             , problem = ExpectedFloat "mimesis"
-->             }
-->         ]
-->     )
Unless you specify otherwise (e.g. with field) this will assume there is only one column in the CSV and try to decode that.
decodeCsv NoFieldNames float "1.0,2.0"
--> Err
-->     (DecodingErrors
-->         [ FieldDecodingError
-->             { row = 0
-->             , column = OnlyColumn
-->             , problem = ExpectedOneColumn 2
-->             }
-->         ]
-->     )
blank : Decoder a -> Decoder (Maybe a)
Handle blank fields by turning them into Maybes. We consider a field to be blank if it's empty or consists solely of whitespace characters.
decodeCsv NoFieldNames (blank int) "\r\n1"
--> Ok [ Nothing, Just 1 ]
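A common follow-up to blank is replacing the missing value with a default. The sketch below assumes a hypothetical "Weight" column and an arbitrary default of 0; neither comes from the package, and the qualified import is just one way to bring the functions into scope.

```elm
import Csv.Decode as Decode exposing (Decoder)


-- Hypothetical decoder: reads the "Weight" field and treats a blank
-- cell as 0. The column name and the default are illustrative only.
weight : Decoder Float
weight =
    Decode.field "Weight" (Decode.blank Decode.float)
        |> Decode.map (Maybe.withDefault 0)
```

If a blank cell should stay distinguishable from a real zero, keeping the Maybe Float and dropping the map step is probably the better choice.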
column : Basics.Int -> Decoder a -> Decoder a
Parse a value at a numbered column, starting from 0.
decodeCsv NoFieldNames (column 1 string) "a,b,c" --> Ok [ "b" ]
decodeCsv NoFieldNames (column 100 float) "3.14"
--> Err
-->     (DecodingErrors
-->         [ FieldDecodingError
-->             { row = 0
-->             , column = Column 100
-->             , problem = ColumnNotFound 100
-->             }
-->         ]
-->     )
field : String -> Decoder a -> Decoder a
Parse a value at a named column. There are a number of ways to provide these names; see FieldNames.
decodeCsv
    FieldNamesFromFirstRow
    (field "Country" string)
    "Country\r\nArgentina"
--> Ok [ "Argentina" ]
optionalColumn : Basics.Int -> Decoder a -> Decoder (Maybe a)
Like column, parse a value at a numbered column. The parsing succeeds even if the column is missing.
decodeCsv
    NoFieldNames
    (optionalColumn 1 string)
    "Pie\r\nApple,Argentina"
--> Ok [ Nothing, Just "Argentina" ]
optionalField : String -> Decoder a -> Decoder (Maybe a)
Like field, parse a value at a named column. The parsing succeeds even if the column is missing.
decodeCsv
    FieldNamesFromFirstRow
    (optionalField "Country" string)
    "Country\r\nArgentina"
--> Ok [ Just "Argentina" ]

decodeCsv
    FieldNamesFromFirstRow
    (optionalField "Country" string)
    "Pie\r\nApple"
--> Ok [ Nothing ]
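A column can be missing from the file entirely, or present but blank in a given row. One way to treat both cases the same is to combine optionalField with blank and flatten the nested Maybe. This is a sketch under that assumption; the "Country" column name is reused from the examples above, and the decoder name is hypothetical.

```elm
import Csv.Decode as Decode exposing (Decoder)


-- optionalField wraps the result in one Maybe (column missing),
-- blank wraps it in another (cell empty). Maybe.andThen identity
-- collapses Maybe (Maybe String) into Maybe String.
country : Decoder (Maybe String)
country =
    Decode.optionalField "Country" (Decode.blank Decode.string)
        |> Decode.map (Maybe.andThen identity)
```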
Where do we get names for use with field?
NoFieldNames: don't get field names at all. field will always fail.
CustomFieldNames: use the provided field names in order (so ["Id", "Name"] will mean that "Id" is in column 0 and "Name" is in column 1).
FieldNamesFromFirstRow: use the first row of the CSV as the source of field names.
decodeCsv : FieldNames -> Decoder a -> String -> Result Error (List a)
Convert a CSV string into some type you care about using the Decoders in this module!
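When a file has no header row, CustomFieldNames lets you keep using field instead of numbered columns. The following sketch assumes a headerless two-column input; the names "Id" and "Name" are supplied by the caller, as in the description of CustomFieldNames above.

```elm
import Csv.Decode as Decode exposing (FieldNames(..))


-- Every row is treated as data because the field names are given
-- explicitly rather than read from the first row.
names : Result Decode.Error (List String)
names =
    Decode.decodeCsv
        (CustomFieldNames [ "Id", "Name" ])
        (Decode.field "Name" Decode.string)
        "1,Atlas\r\n2,Axel"
```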
decodeCustom : { fieldSeparator : Char } -> FieldNames -> Decoder a -> String -> Result Error (List a)
Convert something shaped roughly like a CSV. For example, to decode a TSV (tab-separated values) string:
decodeCustom { fieldSeparator = '\t' }
    NoFieldNames
    (map2 Tuple.pair
        (column 0 int)
        (column 1 string)
    )
    "1\tBrian\n2\tAtlas"
--> Ok [ ( 1, "Brian" ), ( 2, "Atlas" ) ]
Sometimes we cannot decode every row in a CSV. This is how we tell you what went wrong. If you need to present this to someone, you can get a human-readable version with errorToString.
Some more detail:
ParsingError: there was a problem parsing the CSV into rows and columns. All these errors have to do with quoting issues. Check that any quoted fields are closed and that quotes are escaped.
NoFieldNamesOnFirstRow: we tried to get the field names from the first row (using FieldNames) but couldn't find any, probably because the input was blank.
DecodingErrors: we couldn't decode a value using the specified decoder. See DecodingError for more details.
Errors when decoding can either be:
A problem decoding a single field (FieldDecodingError), in which case there is a specific Problem in a specific location.
A oneOf where all branches failed (OneOfDecodingError).
A named field that was not provided (FieldNotProvided).
A call to availableFields when NoFieldNames was passed.
errorToString : Error -> String
Produce a human-readable version of an Error.
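A typical use is to branch on the Result from decodeCsv and only call errorToString on the failure side. The function below is a minimal sketch; its name and the message wording are made up for illustration.

```elm
import Csv.Decode as Decode exposing (FieldNames(..))


-- Hypothetical helper: turns a decoding attempt into a message
-- that could be shown to a user.
summarize : String -> String
summarize input =
    case Decode.decodeCsv NoFieldNames Decode.int input of
        Ok values ->
            "Decoded " ++ String.fromInt (List.length values) ++ " rows"

        Err error ->
            "Could not decode: " ++ Decode.errorToString error
```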
Where did the problem happen?
Column: at the given column number
Field: at the given named column (with optional column number if we were able to look up what column we should have found.)
OnlyColumn: at the only column in the row
Things that can go wrong while decoding:
ColumnNotFound Int and FieldNotFound String: we looked for the specified column, but couldn't find it. The argument specifies where we tried to look.
ExpectedOneColumn Int: basic decoders like string and int expect to find a single column per row. If there are multiple columns, and you don't specify which to use with column or field, you'll get this error. The argument says how many columns we found.
ExpectedInt String and ExpectedFloat String: we failed to parse a string into a number. The argument specifies the string we got.
Failure: we got a custom failure message from fail.
map : (from -> to) -> Decoder from -> Decoder to
Transform a decoded value.
decodeCsv NoFieldNames (map (\i -> i * 2) int) "15"
--> Ok [ 30 ]
decodeCsv NoFieldNames (map String.reverse string) "slap"
--> Ok [ "pals" ]
map2 : (a -> b -> c) -> Decoder a -> Decoder b -> Decoder c
Combine two decoders to make something else.
decodeCsv NoFieldNames
    (map2 Tuple.pair
        (column 0 int)
        (column 1 string)
    )
    "1,Atlas"
--> Ok [ (1, "Atlas") ]
map3 : (a -> b -> c -> d) -> Decoder a -> Decoder b -> Decoder c -> Decoder d
Like map2, but with three decoders. map4 and beyond don't exist in this package. Use into to decode records instead!
decodeCsv NoFieldNames
    (map3 (\r g b -> (r, g, b))
        (column 0 int)
        (column 1 int)
        (column 2 int)
    )
    "255,255,0"
--> Ok [ (255, 255, 0) ]
into : (a -> b) -> Decoder (a -> b)
Combine an arbitrary number of fields. You provide a function that takes as many arguments as you need, then send it values by providing decoders with pipeline.
type alias Pet =
    { id : Int
    , name : String
    , species : String
    , weight : Float
    }

petDecoder : Decoder Pet
petDecoder =
    into Pet
        |> pipeline (column 0 int)
        |> pipeline (column 1 string)
        |> pipeline (column 2 string)
        |> pipeline (column 3 float)
Now you can decode pets like this:
decodeCsv NoFieldNames petDecoder "1,Atlas,cat,14\r\n2,Axel,puffin,1.37"
--> Ok
-->     [ { id = 1, name = "Atlas", species = "cat", weight = 14 }
-->     , { id = 2, name = "Axel", species = "puffin", weight = 1.37 }
-->     ]
pipeline : Decoder a -> Decoder (a -> b) -> Decoder b
See into.
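Each pipeline step can use any decoder, so named fields and optional fields mix freely in one record. The sketch below assumes a header row with "ID" and "Name" columns and a "Weight" column that may be absent; the PetSummary alias is hypothetical and separate from the Pet type used elsewhere in these docs.

```elm
import Csv.Decode as Decode exposing (Decoder)


type alias PetSummary =
    { id : Int
    , name : String
    , weight : Maybe Float
    }


-- The order of the pipeline steps must match the order of the
-- fields in the PetSummary alias, since the alias is used as a
-- three-argument constructor function.
petSummary : Decoder PetSummary
petSummary =
    Decode.into PetSummary
        |> Decode.pipeline (Decode.field "ID" Decode.int)
        |> Decode.pipeline (Decode.field "Name" Decode.string)
        |> Decode.pipeline (Decode.optionalField "Weight" Decode.float)
```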
oneOf : Decoder a -> List (Decoder a) -> Decoder a
Try several possible decoders in sequence, committing to the first one that passes.
decodeCsv NoFieldNames
    (oneOf
        (map Just int)
        [ succeed Nothing ]
    )
    "1"
--> Ok [ Just 1 ]

decodeCsv NoFieldNames
    (oneOf
        (map Just int)
        [ succeed Nothing ]
    )
    "a"
--> Ok [ Nothing ]
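Another plausible use of oneOf is tolerating small variations in header names. The sketch below assumes files that label the same column either "Name" or "name"; both labels are illustrative, and the example relies on oneOf moving to the next decoder whenever the previous one fails for any reason.

```elm
import Csv.Decode as Decode exposing (Decoder)


-- Tries the capitalized header first, then the lowercase one.
-- If neither field can be decoded, the whole decoder fails.
name : Decoder String
name =
    Decode.oneOf
        (Decode.field "Name" Decode.string)
        [ Decode.field "name" Decode.string ]
```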
andThen : (a -> Decoder b) -> Decoder a -> Decoder b
Decode some value and then make a decoding decision based on the outcome. For example, if you wanted to reject negative numbers, you might do something like this:
positiveInt : Decoder Int
positiveInt =
    int
        |> andThen
            (\rawInt ->
                if rawInt < 0 then
                    Decode.fail "Only positive numbers allowed!"

                else
                    Decode.succeed rawInt
            )
You could then use it like this:
decodeCsv NoFieldNames positiveInt "1" -- Ok [ 1 ]
decodeCsv NoFieldNames positiveInt "-1"
-- Err (DecodingErrors [ FieldDecodingError { row = 0, column = OnlyColumn, problem = Failure "Only positive numbers allowed!" } ])
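The same andThen pattern works for turning strings into a custom type. This is a sketch only: the Species type and the accepted spellings are assumptions made for illustration, loosely based on the sample data at the top of these docs.

```elm
import Csv.Decode as Decode exposing (Decoder)


type Species
    = Cat
    | Puffin


-- Decode the raw string first, then decide whether it names a
-- known species. Unknown values become a custom Failure message.
species : Decoder Species
species =
    Decode.string
        |> Decode.andThen
            (\raw ->
                case String.toLower (String.trim raw) of
                    "cat" ->
                        Decode.succeed Cat

                    "puffin" ->
                        Decode.succeed Puffin

                    _ ->
                        Decode.fail ("Unknown species: " ++ raw)
            )
```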
succeed : a -> Decoder a
Always succeed, no matter what. Mostly useful with andThen.
fail : String -> Decoder a
Always fail with the given message, no matter what. Mostly useful with andThen.
fromResult : Result String a -> Decoder a
Make creating custom decoders a little easier. If you already have a function that parses into something you care about, you can combine it with this.
For example, here's how you could parse a hexadecimal number with rtfeldman/elm-hex:
import Hex

hex : Decoder Int
hex =
    andThen
        (\value -> fromResult (Hex.fromString value))
        string

decodeCsv NoFieldNames hex "ff"
--> Ok [ 255 ]
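fromResult also pairs well with a parser you write yourself. The sketch below assumes cells such as "12.5%" and a hypothetical parsePercent helper; neither is part of this package.

```elm
import Csv.Decode as Decode exposing (Decoder)


-- Hypothetical parser: accepts strings like "12.5%" and returns
-- the value as a fraction (0.125), or an error message.
parsePercent : String -> Result String Float
parsePercent raw =
    if String.endsWith "%" raw then
        raw
            |> String.dropRight 1
            |> String.toFloat
            |> Maybe.map (\n -> n / 100)
            |> Result.fromMaybe ("Not a number before %: " ++ raw)

    else
        Err ("Expected a trailing %: " ++ raw)


-- Lift the parser into a Decoder via fromResult.
percent : Decoder Float
percent =
    Decode.string
        |> Decode.andThen (\raw -> Decode.fromResult (parsePercent raw))
```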
fromMaybe : String -> Maybe a -> Decoder a
Like fromResult but you have to specify the error message since Nothing has no further information.
For example, you could implement something like int using this:
myInt : Decoder Int
myInt =
    andThen
        (\value ->
            fromMaybe "Expected an int"
                (String.toInt value)
        )
        string

decodeCsv NoFieldNames myInt "123"
--> Ok [ 123 ]
(That said, you probably want to use int instead... it has better error messages and is more tolerant of unusual situations!)
availableFields : Decoder (List String)
Returns all available field names. The behavior depends on your configuration:
NoFieldNames: The decoder fails.
CustomFieldNames: Decodes to the provided list.
FieldNamesFromFirstRow: Returns the first row of the CSV.
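One possible use of availableFields is decoding rows whose columns are not known ahead of time. The sketch below, with a hypothetical rowAsDict decoder, collects every named field of a row into a Dict of raw strings; it assumes field names are available, so it would fail under NoFieldNames as described above.

```elm
import Csv.Decode as Decode exposing (Decoder)
import Dict exposing (Dict)


-- Ask for the field names, then build one decoder per name and
-- fold them together, inserting each decoded value into the Dict.
rowAsDict : Decoder (Dict String String)
rowAsDict =
    Decode.availableFields
        |> Decode.andThen
            (\names ->
                List.foldl
                    (\name ->
                        Decode.map2 (Dict.insert name)
                            (Decode.field name Decode.string)
                    )
                    (Decode.succeed Dict.empty)
                    names
            )
```

If two columns share a name, the Dict keeps only one entry for it, so this approach suits files with unique headers.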