andre-dietrich / parser-combinators / Combine

This library provides facilities for parsing structured text data into concrete Elm values.

API Reference

Core Types


type Parser state res

The Parser type.

At their core, Parsers wrap functions from some state and an InputStream to a tuple representing the new state, the remaining InputStream and a ParseResult res.


type alias InputStream =
{ data : String
, input : String
, position : Basics.Int 
}

The input stream over which Parsers operate.


type alias ParseLocation =
{ source : String
, line : Basics.Int
, column : Basics.Int 
}

A record representing the current parse location in an InputStream.


type alias ParseContext state res =
( state
, InputStream
, ParseResult res 
)

A tuple representing the current parser state, the remaining input stream and the parse result. Don't worry about this type unless you're writing your own primitive parsers.


type alias ParseResult res =
Result (List String) res

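Running a Parser results in one of two states: Ok res when the parser has successfully parsed the input, or Err messages when the parser has failed with a list of error messages.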


type alias ParseError state =
( state
, InputStream
, List String 
)

A tuple representing a failed parse. It contains the state after running the parser, the remaining input stream and a list of error messages.


type alias ParseOk state res =
( state, InputStream, res )

A tuple representing a successful parse. It contains the state after running the parser, the remaining input stream and the result.

Running Parsers

parse : Parser () res -> String -> Result (ParseError ()) (ParseOk () res)

Parse a string. See runParser if your parser needs to manage some internal state.

import Combine exposing (parse)
import Combine.Num exposing (int)
import String

parseAnInteger : String -> Result String Int
parseAnInteger input =
  case parse int input of
    Ok (_, stream, result) ->
      Ok result

    Err (_, stream, errors) ->
      Err (String.join " or " errors)

parseAnInteger "123"
-- Ok 123

parseAnInteger "abc"
-- Err "expected an integer"

runParser : Parser state res -> state -> String -> Result (ParseError state) (ParseOk state res)

Parse a string while maintaining some internal state.

import Combine exposing (Parser, ignore, modifyState, runParser, sepBy, string)
import Combine.Num exposing (int)
import String

type alias Output =
  { count : Int
  , integers : List Int
  }

statefulInt : Parser Int Int
statefulInt =
  -- Parse an int, then increment the state and return the parsed
  -- int.  It's important that we try to parse the int _first_
  -- since modifying the state will always succeed.
  int |> ignore (modifyState ((+) 1))

ints : Parser Int (List Int)
ints =
  sepBy (string " ") statefulInt

parseIntegers : String -> Result String Output
parseIntegers input =
  -- The parsed list is bound to `result` so that it does not shadow
  -- the top-level `ints` parser.
  case runParser ints 0 input of
    Ok (state, stream, result) ->
      Ok { count = state, integers = result }

    Err (state, stream, errors) ->
      Err (String.join " or " errors)

parseIntegers ""
-- Ok { count = 0, integers = [] }

parseIntegers "1 2 3 45"
-- Ok { count = 4, integers = [1, 2, 3, 45] }

parseIntegers "1 a 2"
-- Ok { count = 1, integers = [1] }

Constructing Parsers

primitive : (state -> InputStream -> ParseContext state res) -> Parser state res

Construct a new primitive Parser.

If you find yourself reaching for this function often, consider opening a GitHub issue with the library to have your custom Parsers included in the standard distribution.
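As a minimal sketch of what a primitive parser can look like, the following defines a parser that returns the unconsumed input without consuming it. The name remainingInput is hypothetical and not part of the library, and the sketch assumes that the input field of InputStream holds the not yet consumed part of the source.

```elm
module RemainingInput exposing (remainingInput)

import Combine exposing (Parser, primitive)


-- Hypothetical helper, not part of the library.
-- Assumption: `stream.input` is the part of the source that has not
-- been consumed yet. The state and the stream are passed through
-- unchanged, so this parser never consumes input and never fails.
remainingInput : Parser s String
remainingInput =
    primitive
        (\state stream ->
            ( state, stream, Ok stream.input )
        )
```

A parser built this way can be combined with the other combinators like any built-in one.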

app : Parser state res -> state -> InputStream -> ParseContext state res

Unwrap a parser so it can be applied to a state and an input stream. This function is useful if you want to construct your own parsers via primitive. If you're using this outside of the context of primitive then you might be doing something wrong so try asking for help on the mailing list.

Here's how you would implement a greedy version of manyTill using primitive and app:

manyTill : Parser s a -> Parser s x -> Parser s (List a)
manyTill p end =
    let
        accumulate acc state stream =
            case app end state stream of
                ( rstate, rstream, Ok _ ) ->
                    ( rstate, rstream, Ok (List.reverse acc) )

                _ ->
                    case app p state stream of
                        ( rstate, rstream, Ok res ) ->
                            accumulate (res :: acc) rstate rstream

                        ( estate, estream, Err ms ) ->
                            ( estate, estream, Err ms )
    in
    primitive <| accumulate []

lazy : (() -> Parser s a) -> Parser s a

Unfortunately, this is no longer a truly lazy function, because the underlying functionality is no longer accessible to ordinary developers. Use this function only to avoid "bad-recursion" errors, or use the following snippet in your code to work around the problem:

recursion x =
    \() -> recursion x
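To illustrate the typical use of lazy, here is a small sketch of a self-referential parser that counts how deeply a run of parentheses is nested. The name depth is made up for this example.

```elm
module Nesting exposing (depth)

import Combine exposing (Parser, lazy, map, or, parens, succeed)


-- Counts the nesting depth of input such as "(())".
-- The reference to `depth` sits inside the function handed to `lazy`,
-- so the value is not defined directly in terms of itself.
depth : Parser s Int
depth =
    lazy
        (\() ->
            or
                (parens depth |> map ((+) 1))
                (succeed 0)
        )


-- parse depth "(())"
-- Ok 2
--
-- parse depth ""
-- Ok 0
```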

Parsers

fail : String -> Parser s a

Fail without consuming any input.

parse (fail "some error") "hello"
-- Err ["some error"]

succeed : a -> Parser s a

Return a value without consuming any input.

parse (succeed 1) "a"
-- Ok 1

string : String -> Parser s String

Parse an exact string match.

parse (string "hello") "hello world"
-- Ok "hello"

parse (string "hello") "goodbye"
-- Err ["expected \"hello\""]

regex : String -> Parser s String

Parse a Regex match.

Regular expressions must match from the beginning of the input and their subgroups are ignored. A ^ is added implicitly to the beginning of every pattern unless one already exists.

parse (regex "a+") "aaaaab"
-- Ok "aaaaa"

regexSub : String -> Parser s ( String, List (Maybe String) )

Parse a Regex match.

Same as regex, but also returns the submatches as the second element of the result tuple.

parse (regexSub "a+") "aaaaab"
-- Ok ("aaaaa", [])
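The example above has no capture groups, so the list of submatches is empty. The following sketch shows one way the submatches might be used when the pattern does contain groups. The keyValue parser and its pattern are illustrative assumptions, and the sketch assumes that each capture group yields one entry in the list, in order.

```elm
module KeyValue exposing (keyValue)

import Combine exposing (Parser, andThen, fail, regexSub, succeed)


-- Hypothetical parser for input such as "width=42".
-- Assumption: the two capture groups appear as two `Just` entries in
-- the submatch list when the pattern matches.
keyValue : Parser s ( String, String )
keyValue =
    regexSub "([a-z]+)=([0-9]+)"
        |> andThen
            (\( _, groups ) ->
                case groups of
                    [ Just key, Just value ] ->
                        succeed ( key, value )

                    _ ->
                        fail "expected key=value"
            )
```

Under that assumption, parsing "width=42" should produce the pair of "width" and "42".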

regexWith : Basics.Bool -> Basics.Bool -> String -> Parser s String

Parse a Regex match.

Regex supports additional options, so this package exposes them as well. Call regexWith with two extra parameters, caseInsensitive and multiline, to tweak your expression. Otherwise it behaves like regex: regular expressions must match from the beginning of the input and their subgroups are ignored. A ^ is added implicitly to the beginning of every pattern unless one already exists.

parse (regexWith True False "a+") "aaaAAaAab"
-- Ok "aaaAAaAa"

regexWithSub : Basics.Bool -> Basics.Bool -> String -> Parser s ( String, List (Maybe String) )

Parse a Regex match.

Similar to regexWith, but returns a tuple whose second element is the list of submatches. Otherwise the same rules apply: regular expressions must match from the beginning of the input, and a ^ is added implicitly to the beginning of every pattern unless one already exists.

parse (regexWithSub True False "a+") "aaaAAaAab"
-- Ok ("aaaAAaAa", [])

end : Parser s ()

Fail when the input is not empty.

parse end ""
-- Ok ()

parse end "a"
-- Err ["expected end of input"]

whitespace : Parser s String

Parse zero or more whitespace characters.

parse (whitespace |> keep (string "hello")) "hello"
-- Ok "hello"

parse (whitespace |> keep (string "hello")) "   hello"
-- Ok "hello"

whitespace1 : Parser s String

Parse one or more whitespace characters.

parse (whitespace1 |> keep (string "hello")) "hello"
-- Err ["whitespace"]

parse (whitespace1 |> keep (string "hello")) "   hello"
-- Ok "hello"

Combinators

Transforming Parsers

map : (a -> b) -> Parser s a -> Parser s b

Transform the result of a parser.

let
  parser =
    string "a"
      |> map String.toUpper
in
  parse parser "a"
  -- Ok "A"

onsuccess : a -> Parser s x -> Parser s a

Run a parser and, on success, return the given value in place of the parser's own result.

parse (string "true" |> onsuccess True) "true"
-- Ok True

parse (string "true" |> onsuccess True) "false"
-- Err ["expected \"true\""]

mapError : (List String -> List String) -> Parser s a -> Parser s a

Transform the error of a parser.

let
  parser =
    string "a"
      |> mapError (always ["bad input"])
in
  parse parser "b"
  -- Err ["bad input"]

onerror : String -> Parser s a -> Parser s a

Variant of mapError that replaces the Parser's error with a List of a single string.

parse (string "a" |> onerror "gimme an 'a'") "b"
-- Err ["gimme an 'a'"]

Chaining Parsers

andThen : (a -> Parser s b) -> Parser s a -> Parser s b

Sequence two parsers, passing the result of the first parser to a function that returns the second parser. The value of the second parser is returned on success.

import Combine.Num exposing (int)

choosy : Parser s String
choosy =
  let
    createParser n =
      if modBy 2 n == 0 then
        string " is even"
      else
        string " is odd"
  in
    int
      |> andThen createParser

parse choosy "1 is odd"
-- Ok " is odd"

parse choosy "2 is even"
-- Ok " is even"

parse choosy "1 is even"
-- Err ["expected \" is odd\""]

andMap : Parser s a -> Parser s (a -> b) -> Parser s b

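Sequence two parsers: run the parser that produces a function, then the parser given as the first argument, and apply the function to that parser's result.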

import Combine.Num exposing (int)

plus : Parser s String
plus = string "+"

sum : Parser s Int
sum =
  int
    |> map (+)
    |> andMap (plus |> keep int)

parse sum "1+2"
-- Ok 3

sequence : List (Parser s a) -> Parser s (List a)

Run a list of parsers in sequence, accumulating the results. The main use case for this parser is when you want to combine a list of parsers into a single, top-level, parser. For most use cases, you'll want to use one of the other combinators instead.

parse (sequence [string "a", string "b"]) "ab"
-- Ok ["a", "b"]

parse (sequence [string "a", string "b"]) "ac"
-- Err ["expected \"b\""]

Parser Combinators

lookAhead : Parser s a -> Parser s a

Apply a parser without consuming any input on success.
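As a small sketch of lookAhead in use, the parser below checks a prefix without consuming it and then parses the full word from the same position. The name greeting is made up for this example.

```elm
module Peek exposing (greeting)

import Combine exposing (Parser, keep, lookAhead, string)


-- `lookAhead (string "he")` succeeds on input starting with "he" but
-- leaves the input untouched, so `string "hello"` still sees the
-- whole word.
greeting : Parser s String
greeting =
    lookAhead (string "he")
        |> keep (string "hello")


-- parse greeting "hello"
-- Ok "hello"
```

If the look-ahead parser fails, the whole parser fails without reaching string "hello".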

while : (Char -> Basics.Bool) -> Parser s String

Consume input while the predicate matches.

parse (while ((/=) ' ')) "test 123"
-- Ok "test"

or : Parser s a -> Parser s a -> Parser s a

Choose between two parsers.

parse (or (string "a") (string "b")) "a"
-- Ok "a"

parse (or (string "a") (string "b")) "b"
-- Ok "b"

parse (or (string "a") (string "b")) "c"
-- Err ["expected \"a\"", "expected \"b\""]

choice : List (Parser s a) -> Parser s a

Choose between a list of parsers.

parse (choice [string "a", string "b"]) "a"
-- Ok "a"

parse (choice [string "a", string "b"]) "b"
-- Ok "b"

optional : a -> Parser s a -> Parser s a

Return a default value when the given parser fails.

letterA : Parser s String
letterA = optional "a" (string "a")

parse letterA "a"
-- Ok "a"

parse letterA "b"
-- Ok "a"

maybe : Parser s a -> Parser s (Maybe a)

Wrap the return value into a Maybe. Returns Nothing on failure.

parse (maybe (string "a")) "a"
-- Ok (Just "a")

parse (maybe (string "a")) "b"
-- Ok Nothing

many : Parser s a -> Parser s (List a)

Apply a parser zero or more times and return a list of the results.

parse (many (string "a")) "aaab"
-- Ok ["a", "a", "a"]

parse (many (string "a")) "bbbb"
-- Ok []

parse (many (string "a")) ""
-- Ok []

many1 : Parser s a -> Parser s (List a)

Parse at least one result.

parse (many1 (string "a")) "a"
-- Ok ["a"]

parse (many1 (string "a")) ""
-- Err ["expected \"a\""]

manyTill : Parser s a -> Parser s end -> Parser s (List a)

Apply the first parser zero or more times until second parser succeeds. On success, the list of the first parser's results is returned.

string "<!--" |> keep (manyTill anyChar (string "-->"))

many1Till : Parser s a -> Parser s end -> Parser s (List a)

Apply the first parser one or more times until second parser succeeds. On success, the list of the first parser's results is returned.

string "<!--" |> keep (many1Till anyChar (string "-->"))

sepBy : Parser s x -> Parser s a -> Parser s (List a)

Parse zero or more occurrences of one parser separated by another.

parse (sepBy (string ",") (string "a")) "b"
-- Ok []

parse (sepBy (string ",") (string "a")) "a,a,a"
-- Ok ["a", "a", "a"]

parse (sepBy (string ",") (string "a")) "a,a,b"
-- Ok ["a", "a"]

sepBy1 : Parser s x -> Parser s a -> Parser s (List a)

Parse one or more occurrences of one parser separated by another.
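A short sketch of sepBy1, mirroring the sepBy examples above. The name letters is made up for this example.

```elm
module NonEmptyList exposing (letters)

import Combine exposing (Parser, regex, sepBy1, string)


-- One or more lowercase letters separated by commas.
letters : Parser s (List String)
letters =
    sepBy1 (string ",") (regex "[a-z]")


-- parse letters "a,b,c"
-- Ok ["a", "b", "c"]
```

Unlike sepBy, this parser fails when not even one item can be parsed, for example on empty input.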

sepEndBy : Parser s x -> Parser s a -> Parser s (List a)

Parse zero or more occurrences of one parser separated and optionally ended by another.

parse (sepEndBy (string ",") (string "a")) "a,a,a,"
-- Ok ["a", "a", "a"]

sepEndBy1 : Parser s x -> Parser s a -> Parser s (List a)

Parse one or more occurrences of one parser separated and optionally ended by another.

parse (sepEndBy1 (string ",") (string "a")) ""
-- Err ["expected \"a\""]

parse (sepEndBy1 (string ",") (string "a")) "a"
-- Ok ["a"]

parse (sepEndBy1 (string ",") (string "a")) "a,"
-- Ok ["a"]

skip : Parser s x -> Parser s ()

Apply a parser and skip its result.

skipMany : Parser s x -> Parser s ()

Apply a parser zero or more times and skip its results.

skipMany1 : Parser s x -> Parser s ()

Apply a parser one or more times and skip its results.
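The skipping combinators are handy for discarding insignificant input. The following sketch skips any mix of whitespace and line comments before a word. The comment syntax (a # up to the end of the line) and the names junk and word are assumptions made for this example.

```elm
module Comments exposing (word)

import Combine exposing (Parser, keep, or, regex, skip, skipMany, whitespace1)


-- Either a run of whitespace or a `#` comment up to the end of the
-- line. Both alternatives consume at least one character, so
-- `skipMany` cannot loop on empty matches.
junk : Parser s ()
junk =
    skipMany
        (or
            (skip whitespace1)
            (skip (regex "#[^\\n]*"))
        )


word : Parser s String
word =
    junk |> keep (regex "[a-z]+")


-- parse word "  # note\n  hello"
-- Ok "hello"
```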

chainl : Parser s (a -> a -> a) -> Parser s a -> Parser s a

Parse one or more occurrences of p separated by op, then apply the functions returned by op to the values returned by p in left-associative order. Here op is the first argument and p the second. See the examples/Calc.elm file for an example.

chainr : Parser s (a -> a -> a) -> Parser s a -> Parser s a

Similar to chainl but functions of op are applied in right-associative order to the values of p. See the examples/Python.elm file for a usage example.
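The difference between chainl and chainr shows up with a non-associative operator such as subtraction. The sketch below is illustrative only: number is a stand-in built from regex so that the example relies solely on functions documented on this page, and all names are made up.

```elm
module Subtraction exposing (leftAssoc, rightAssoc)

import Combine exposing (Parser, chainl, chainr, map, onsuccess, regex, string)


-- A non-negative integer literal (stand-in for a real number parser).
number : Parser s Int
number =
    regex "[0-9]+"
        |> map (String.toInt >> Maybe.withDefault 0)


-- Parses "-" and returns the subtraction function.
minus : Parser s (Int -> Int -> Int)
minus =
    string "-" |> onsuccess (-)


leftAssoc : Parser s Int
leftAssoc =
    chainl minus number


rightAssoc : Parser s Int
rightAssoc =
    chainr minus number


-- parse leftAssoc "10-3-2"
-- Ok 5, that is (10 - 3) - 2
--
-- parse rightAssoc "10-3-2"
-- Ok 9, that is 10 - (3 - 2)
```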

count : Basics.Int -> Parser s a -> Parser s (List a)

Parse n occurrences of p.
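A brief sketch of count. The name threeDigits is made up for this example.

```elm
module Triple exposing (threeDigits)

import Combine exposing (Parser, count, regex)


-- Exactly three single digits; any further input is left unconsumed.
threeDigits : Parser s (List String)
threeDigits =
    count 3 (regex "[0-9]")


-- parse threeDigits "1234"
-- Ok ["1", "2", "3"]
```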

between : Parser s l -> Parser s r -> Parser s a -> Parser s a

Parse something between two other parsers.

The parser

between (string "(") (string ")") (string "a")

is equivalent to the parser

string "(" |> keep (string "a") |> ignore (string ")")

parens : Parser s a -> Parser s a

Parse something between parentheses.

braces : Parser s a -> Parser s a

Parse something between braces {}.

brackets : Parser s a -> Parser s a

Parse something between square brackets [].
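The three delimiter helpers can be sketched together as follows. The names digits, array, grouped and block are made up, and the sketch assumes the helpers match the delimiters exactly without skipping surrounding whitespace, so the sample inputs contain none.

```elm
module Delimited exposing (array, block, grouped)

import Combine exposing (Parser, braces, brackets, parens, regex, sepBy, string)


digits : Parser s String
digits =
    regex "[0-9]+"


-- A comma-separated list of numbers in square brackets.
array : Parser s (List String)
array =
    brackets (sepBy (string ",") digits)


-- A single number in parentheses.
grouped : Parser s String
grouped =
    parens digits


-- A single number in braces.
block : Parser s String
block =
    braces digits


-- parse array "[1,2,3]"
-- Ok ["1", "2", "3"]
--
-- parse grouped "(42)"
-- Ok "42"
```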

keep : Parser s a -> Parser s x -> Parser s a

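Join two parsers, keeping the result of the parser passed to keep. In a pipeline, p |> keep q runs p, then q, and returns the result of q.

unprefix : Parser s String
unprefix =
  string ">"
    |> keep (while ((==) ' '))
    |> keep (while ((/=) ' '))

parse unprefix "> a"
-- Ok "a"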

ignore : Parser s x -> Parser s a -> Parser s a

Join two parsers, ignoring the result of the parser passed to ignore. In a pipeline, p |> ignore q runs p, then q, and returns the result of p.

unsuffix : Parser s String
unsuffix =
  regex "[a-z]"
    |> ignore (regex "[!?]")

parse unsuffix "a!"
-- Ok "a"

State Combinators

withState : (s -> Parser s a) -> Parser s a

Get the parser's state and pipe it into a parser.

putState : s -> Parser s ()

Replace the parser's state.

modifyState : (s -> s) -> Parser s ()

Modify the parser's state.
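The three state combinators above can be combined as in the following sketch, which counts words in the state and reads the counter back as the parse result. The names word and countedWords are made up for this example.

```elm
module Counting exposing (countedWords)

import Combine exposing (Parser, ignore, keep, many, modifyState, putState, regex, succeed, whitespace, withState)


-- A lowercase word followed by optional whitespace. The counter is
-- incremented only after the word itself has been parsed.
word : Parser Int String
word =
    regex "[a-z]+"
        |> ignore (modifyState ((+) 1))
        |> ignore whitespace


-- Reset the counter, parse as many words as possible, then return the
-- counter via `withState succeed`.
countedWords : Parser Int Int
countedWords =
    putState 0
        |> keep (many word)
        |> keep (withState succeed)


-- runParser countedWords 0 "one two three"
-- succeeds with 3 as both the final state and the result
```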

withLocation : (ParseLocation -> Parser s a) -> Parser s a

Get the current position in the input stream and pipe it into a parser.

withLine : (Basics.Int -> Parser s a) -> Parser s a

Get the current line and pipe it into a parser.

withColumn : (Basics.Int -> Parser s a) -> Parser s a

Get the current column and pipe it into a parser.
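As a sketch of how the location combinators might be used, the parser below pairs a parsed word with the location at which it started. The name locatedWord is made up, and no sample output is shown because this page does not state whether lines and columns are counted from zero or one.

```elm
module Located exposing (locatedWord)

import Combine exposing (ParseLocation, Parser, map, regex, withLocation)


-- `withLocation` captures the location before the word is consumed,
-- so the pair holds the position of the word's first character.
locatedWord : Parser s ( ParseLocation, String )
locatedWord =
    withLocation
        (\location ->
            regex "[a-z]+"
                |> map (\parsed -> ( location, parsed ))
        )
```

withLine and withColumn follow the same pattern when only one of the two numbers is needed.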

withSourceLine : (String -> Parser s a) -> Parser s a

Get the current source line and pipe it into a parser. Intended for debugging purposes only.

currentLocation : InputStream -> ParseLocation

Get the current location (source line, line and column) in the input stream.

currentSourceLine : InputStream -> String

Get the current source line in the input stream.

currentLine : InputStream -> Basics.Int

Get the current line in the input stream.

currentColumn : InputStream -> Basics.Int

Get the current column in the input stream.

currentStream : InputStream -> String

Get the current string stream. That might be useful for applying memoization.

modifyInput : (String -> String) -> Parser s ()

Modify the parser's InputStream input (String).

modifyPosition : (Basics.Int -> Basics.Int) -> Parser s ()

Modify the parser's InputStream position (Int).