dasch / parser / Parser

Easy-to-use text parsing.

Definitions


type Parser a

A Parser a describes how to take some input text and turn it into a value of type a.


type alias Error =
{ message : String
, position : Basics.Int
, context : Maybe String 
}

Describes a parsing failure: what caused the parser to fail, the position in the input text where it failed, and the context, if one was set.

Core

parse : String -> Parser a -> Result Error a

Parse an input string using a specific parser, returning a result containing either the parsed value or an error.

parse "xyz" (char 'x') -- Ok 'x'

parse "xyz" (char 'w') -- Err { message = "expected char w", position = 0, context = Nothing }

inContext : String -> Parser a -> Parser a

Sets the context of the parser. Useful for providing better error messages.
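
As a rough sketch of how inContext might be used, the example below labels a parser so that a failure can say which part of the input was being parsed. The names closingBracket and result are hypothetical, and the idea that the label ends up in the context field of Error is an assumption inferred from the Error type above.

```elm
module InContextExample exposing (closingBracket, result)

import Parser exposing (Error, Parser, char, inContext, parse)


-- Hypothetical parser: label the closing bracket for error reporting.
closingBracket : Parser Char
closingBracket =
    char ']'
        |> inContext "closing bracket"


-- Fails, because the input starts with 'x'. Assumption: the returned
-- Error carries the label in its context field.
result : Result Error Char
result =
    parse "x" closingBracket
```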

succeed : a -> Parser a

A parser that always succeeds with a specified value without reading any input.

parse "xyz" (succeed 42) -- Ok 42

fail : String -> Parser a

A parser that always fails with a specified error message without reading any input.

parse "xyz" (fail "nope") -- Err { message = "nope", position = 0, context = Nothing }

lazy : (() -> Parser a) -> Parser a

Defers the construction of a parser until it is needed. Self-referential (recursive) parsers require this kind of lazy evaluation.

type Tree = Leaf | Node Tree Tree

tree : Parser Tree
tree =
    oneOf [ leaf, node ]

leaf : Parser Tree
leaf =
    map (always Leaf) (char 'x')

node : Parser Tree
node =
    into Node
        |> ignore (char '@')
        |> grab (lazy (\_ -> tree))
        |> grab (lazy (\_ -> tree))

parse "x" tree -- Ok Leaf
parse "@x@xx" tree -- Ok (Node Leaf (Node Leaf Leaf))

Without lazy, this example would fail due to a circular reference.

Matching Specific Text

char : Char -> Parser Char

Matches a specific character.

parse "hello" (char 'h') -- Ok 'h'

string : String -> Parser String

Matches a specific string.

parse "hello world" (string "hello") -- Ok "hello"

Matching with Patterns

anyChar : Parser Char

Matches any character.
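
A minimal sketch of anyChar in the style of the neighbouring examples. The module name and the name firstChar are illustrative only.

```elm
module AnyCharExample exposing (firstChar)

import Parser exposing (Error, anyChar, parse)


-- anyChar accepts whatever character comes first, here 'x'.
firstChar : Result Error Char
firstChar =
    parse "xyz" anyChar -- Ok 'x'
```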

when : (Char -> Basics.Bool) -> Parser Char

Matches a character if some predicate holds.

parse "123" (when Char.isDigit) -- Ok '1'

except : Parser Char -> Parser Char

Matches a character if the specified parser fails.

parse "xyz" (except (char 'a')) -- Ok 'x'

parse "xyz" (except (char 'x')) -- Err { message = "expected to not match", ... }

end : Parser ()

Matches the end of the input.

char 'x'
    |> followedBy end
    |> parse "x" -- Ok ()

chomp : Basics.Int -> Parser String

A parser that simply reads a specific number of characters from the input.

parse "xyz" (chomp 2) -- Ok "xy"

Matching Multiple Different Patterns

oneOf : List (Parser a) -> Parser a

Matches one of a list of parsers.

parse "y" (oneOf [ char 'x', char 'y' ]) -- Ok 'y'

Matching Sequences

maybe : Parser a -> Parser (Maybe a)

Optionally matches a value. If the parser succeeds with x, we'll succeed with Just x. If it fails, we'll succeed with Nothing.

parse "42" (maybe int) -- Ok (Just 42)

parse "hello" (maybe int) -- Ok Nothing

zeroOrMore : Parser a -> Parser (List a)

Matches zero or more successive occurrences of a value. Succeeds with an empty list if there are no occurrences.

parse "xxy" (zeroOrMore (char 'x')) -- Ok [ 'x', 'x' ]

parse "yyy" (zeroOrMore (char 'x')) -- Ok []

oneOrMore : Parser a -> Parser (List a)

Matches one or more successive occurrences of a value. Fails if there are no occurrences.

parse "xxy" (oneOrMore (char 'x')) -- Ok [ 'x', 'x' ]

parse "yyy" (oneOrMore (char 'x')) -- Err { message = "expected char `x`", position = 0, context = Nothing }

sequence : List (Parser a) -> Parser (List a)

Matches a sequence of parsers in turn, succeeding with a list of their values if they all succeed.

parse "helloworld" (sequence [ string "hello", string "world" ]) -- Ok [ "hello", "world" ]

repeat : Basics.Int -> Parser a -> Parser (List a)

Matches a specific number of occurrences of a parser, succeeding with a list of values.

parse "xxxx" (repeat 3 (char 'x')) -- Ok [ 'x', 'x', 'x' ]

until : Parser a -> Parser b -> Parser (List b)

Matches zero or more values until a "stop" parser matches.

char '['
    |> followedBy (until (char ']') anyChar)
    |> parse "[abc]" -- Ok [ 'a', 'b', 'c' ]

Chaining Parsers

andThen : (a -> Parser b) -> Parser a -> Parser b

Create a parser that depends on the previous parser's result.

For example, you can support two different versions of a format if there's a version number included:

spec : Parser Spec
spec =
    let
        specByVersion version =
            case version of
                1 ->
                    -- assume v1 is a Parser Spec
                    v1

                2 ->
                    -- assume v2 is a Parser Spec
                    v2

                x ->
                    fail ("unknown spec version " ++ String.fromInt x)
    in
    string "version="
        |> followedBy int
        |> andThen specByVersion

orElse : Parser a -> Parser a -> Parser a

Create a fallback for when a parser fails.
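
A small sketch of orElse used in pipeline style, assuming (from the signature and the pipeline conventions used elsewhere on this page) that the fallback is passed as the first argument. With the two alternatives used here the results are the same whichever parser is tried first. The names xOrY, matchesX and matchesY are hypothetical.

```elm
module OrElseExample exposing (matchesX, matchesY, xOrY)

import Parser exposing (Error, Parser, char, orElse, parse)


-- Accept an 'x', or fall back to accepting a 'y'.
xOrY : Parser Char
xOrY =
    char 'x'
        |> orElse (char 'y')


matchesX : Result Error Char
matchesX =
    parse "x" xOrY -- Ok 'x'


matchesY : Result Error Char
matchesY =
    parse "y" xOrY -- Ok 'y'
```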

followedBy : Parser a -> Parser b -> Parser a

Runs a parser only after the previous parser has succeeded. Unlike andThen, it discards the previous parser's value, so use it only when that value is not needed.

atMention : Parser String
atMention =
    char '@'
        |> followedBy username

Pipelines

into : (a -> b) -> Parser (a -> b)

Start a parser pipeline that feeds values into a function.

Typically used to build up complex values.

type Operation = Binary Int Char Int

operation : Parser Operation
operation =
    into Binary
        |> grab int
        |> ignore blanks
        |> grab (oneOf [ char '+', char '-', char '*' ])
        |> ignore blanks
        |> grab int

parse "42 * 13" operation -- Ok (Binary 42 '*' 13)

Here we feed three values into the Binary constructor while ignoring blank characters between the values.

grab : Parser a -> Parser (a -> b) -> Parser b

Grabs a value and feeds it into a function in a pipeline.

See into.

ignore : Parser a -> Parser b -> Parser b

Ignores a matched value, preserving the previous value in a pipeline.

See into.
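
To show grab and ignore working together, here is a small sketch that parses a parenthesised pair of characters. The Pair record and the names pair and result are made up for illustration; only into, grab, ignore, char, anyChar and parse come from the definitions on this page.

```elm
module PipelineExample exposing (Pair, pair, result)

import Parser exposing (Error, Parser, anyChar, char, grab, ignore, into, parse)


-- Hypothetical record; its constructor Pair takes the two grabbed values.
type alias Pair =
    { left : Char
    , right : Char
    }


-- ignore consumes the punctuation, grab feeds each character into Pair.
pair : Parser Pair
pair =
    into Pair
        |> ignore (char '(')
        |> grab anyChar
        |> ignore (char ',')
        |> grab anyChar
        |> ignore (char ')')


result : Result Error Pair
result =
    parse "(a,b)" pair -- Ok { left = 'a', right = 'b' }
```

Each grab supplies the next argument of the function given to into, in order, while each ignore must match but contributes nothing to the final value.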

Transforming Parsed Values

map : (a -> b) -> Parser a -> Parser b

Transforms the value produced by a parser by applying a function to it.

map (\x -> x * x) int
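
A sketch of map run end to end, using char rather than int so that it relies only on parsers defined on this page. The name upper is illustrative.

```elm
module MapExample exposing (upper)

import Parser exposing (Error, char, map, parse)


-- char 'x' yields 'x'; map then applies Char.toUpper to that value.
upper : Result Error Char
upper =
    parse "xyz" (map Char.toUpper (char 'x')) -- Ok 'X'
```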

map2 : (a -> b -> c) -> Parser a -> Parser b -> Parser c

Matches two parsers and combines the result.

map2 (\x y -> (x, y)) anyChar anyChar
    |> parse "xy" -- Ok ('x', 'y')

withError : String -> Parser a -> Parser a

Use the specified error message when the parser fails.

string "</div>"
    |> withError "expected closing tag"
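
A sketch of what a caller might observe when such a parser fails. It keeps only the message field of the error, since the reported position is not specified here. The name message is hypothetical.

```elm
module WithErrorExample exposing (message)

import Parser exposing (parse, string, withError)


-- The input does not start with "</div>", so the parser fails and the
-- custom message replaces the default one.
message : Result String String
message =
    string "</div>"
        |> withError "expected closing tag"
        |> parse "<p>"
        |> Result.mapError .message -- Err "expected closing tag"
```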

stringWith : Parser (List Char) -> Parser String

Turns a parser that returns a list of characters into a parser that returns a String.

parse "xyz" (stringWith (sequence [ char 'x', anyChar, char 'z' ])) -- Ok "xyz"

matchedString : Parser a -> Parser String

Turns a parser into one that succeeds with all the input it matched, as a String, in place of its original value.

matchedString (sequence [ word, string "@", word ])
    |> parse "hello@world!" -- Ok "hello@world"

High Level Parsers

separatedBy : Parser s -> Parser a -> Parser (List a)

Matches zero or more values separated by a specified parser.

separatedBy (char ',') int
    |> parse "42,13,99" -- Ok [ 42, 13, 99 ]