YuyaAizawa / peg / Peg.Parser

A parser combinator implementation for Parsing Expression Grammar (PEG).

Parse


type Parser a

A value that decides whether the given input string should be accepted and, when it is accepted, converts it into an Elm value.

parse : String -> Parser a -> Maybe a

Parses the given string and returns the result: Just value when the string is accepted, Nothing otherwise.

parse "abc" (match "abc") -- Just "abc"
parse "xyz" (match "abc") -- Nothing

Parsers

return : a -> Parser a

This parser always succeeds and yields the given argument without consuming any input. Typically used with andThen.

fail : Parser a

This parser always fails. It corresponds to "mzero" in a monadic context. Typically used with andThen.

match : String -> Parser String

Generates a parser that accepts the specified string. When it accepts, the parser returns that same string.

parse "abc" (match "abc") -- Just "abc"
parse "xyz" (match "abc") -- Nothing

char : (Char -> Basics.Bool) -> Parser Char

Generates a parser that accepts a single character satisfying the specified predicate. When it accepts, the parser returns the accepted character.

char Char.isUpper |> parse "A"  -- Just 'A'
char Char.isUpper |> parse "a"  -- Nothing

chars : (Char -> Basics.Bool) -> Parser String

Generates a parser that accepts a run of consecutive characters satisfying the specified predicate. When it accepts, the parser returns the matched string.

chars Char.isUpper |> parse "AAA"  -- Just "AAA"
chars Char.isUpper |> parse "aaa"  -- Nothing

Basic Combinators

seq2 : Parser a -> Parser b -> (a -> b -> result) -> Parser result

Concatenates the two specified parsers; in other words, generates a new parser that accepts what the first parser accepts followed by what the second accepts. The two results are combined by the function given as the third argument.

seq2 (match "con") (match "cat") (++) |> parse "concat" -- Just "concat"

choice : List (() -> Parser a) -> Parser a

Generates a new parser from the specified parsers. The resulting parser accepts every input that any of the specified parsers accepts. This combinator is ordered: if the first parser accepts the input, the second parser is ignored. The parsers need to be given in '() -> Parser' form. (This is useful for avoiding infinite reference.)

p =
    choice
    [ \() -> match "foo"
    , \() -> match "bar"
    ]
p |> parse "foo" -- Just "foo"
p |> parse "bar" -- Just "bar"
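The '() -> Parser' form is what lets a parser refer to itself. The following is a minimal sketch, not taken from the package, of a recursive parser that counts how deeply parentheses are nested. The name depth is hypothetical, and the sketch assumes that choice moves on to the next alternative when an earlier one fails, as in standard PEG ordered choice.

```elm
import Peg.Parser exposing (Parser, choice, match, parse, return, seq3)


-- Hypothetical example: counts the nesting depth of "()" pairs.
-- The reference to depth sits inside a lambda, so the definition
-- is not evaluated until the parser actually runs.
depth : Parser Int
depth =
    choice
        [ \() -> seq3 (match "(") depth (match ")") (\_ inner _ -> inner + 1)
        , \() -> return 0
        ]


-- parse "(())" depth should give Just 2
-- parse "" depth should give Just 0
```

Without the lambda wrappers, depth would be defined directly in terms of itself, which Elm rejects for plain values.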

option : Parser a -> Parser (Maybe a)

Generates an optional parser. It accepts any input and consumes input only if the specified parser accepts. The parse result is a Maybe value.

option (match "foo") |> parse "" -- Just Nothing

zeroOrMore : Parser a -> Parser (List a)

Generates a zero-or-more parser. It accepts zero or more consecutive repetitions of the input that the specified parser accepts. It always behaves greedily, consuming as much input as possible.

p = zeroOrMore (match "a")
p |> parse "aaa" -- Just ["a", "a", "a"]
p |> parse ""    -- Just []

oneOrMore : Parser a -> Parser (List a)

Generates a one-or-more parser. It accepts one or more consecutive repetitions of the input that the specified parser accepts. It always behaves greedily, consuming as much input as possible.

p = oneOrMore (match "a")
p |> parse "aaa" -- Just ["a", "a", "a"]
p |> parse ""    -- Nothing

andPredicate : Parser a -> Parser ()

Generates an and-predicate parser. The parse succeeds if the specified parser accepts the input and fails if the specified parser rejects it, but in either case it never consumes any input.

word = chars (always True)
p = seq2 (andPredicate (match "A")) word (\_ w -> w)
p |> parse "Apple"  -- Just "Apple"
p |> parse "Banana" -- Nothing

notPredicate : Parser a -> Parser ()

Generates a not-predicate parser. The parse succeeds if the specified parser rejects the input and fails if the specified parser accepts it, but in either case it never consumes any input.

nums = chars Char.isDigit
p = seq2 (notPredicate (match "0")) nums (\_ i -> i)
p |> parse "1234" -- Just "1234"
p |> parse "0123" -- Nothing

Transform

map : (a -> b) -> Parser a -> Parser b

Generates a new parser that returns the original parser's result transformed by the specified mapping function.
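As a small illustration of map, the sketch below turns a parser of digit strings into a parser that yields how many digits were read. The name digitCount is hypothetical and not part of the package.

```elm
import Peg.Parser exposing (Parser, chars, map, parse)


-- Hypothetical example: chars yields the matched digits as a String,
-- and map converts that String into its length.
digitCount : Parser Int
digitCount =
    chars Char.isDigit
        |> map String.length


-- parse "1234" digitCount should give Just 4
```

Because map only transforms a successful result, it cannot turn a success into a failure; andThen is the tool for that.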

andThen : (a -> Parser b) -> Parser a -> Parser b

Chains a further parser onto the original parser. When the original parser succeeds, its result is passed to the specified function and the parser that function returns is run next; when the original parser fails, the whole parse fails. This makes it possible to set more stringent conditions for parse success.

intParser =
  chars Char.isDigit
    |> andThen (\str ->
      case String.toInt str of
        Just i -> return i
        Nothing -> fail)

Position


type alias Position =
    { begin : Basics.Int, end : Basics.Int }

The parsed position in the source string.

withPosition : Parser a -> Parser ( a, Position )

Generates a parser that returns its result together with the position. The begin value is inclusive and the end value is exclusive.

parse "abc" (match "abc" |> withPosition)
  -- Just ( "abc", { begin = 0, end = 3 } )

Sequence Utils

seq3 : Parser a -> Parser b -> Parser c -> (a -> b -> c -> result) -> Parser result

seq4 : Parser a -> Parser b -> Parser c -> Parser d -> (a -> b -> c -> d -> result) -> Parser result

seq5 : Parser a -> Parser b -> Parser c -> Parser d -> Parser e -> (a -> b -> c -> d -> e -> result) -> Parser result

seq6 : Parser a -> Parser b -> Parser c -> Parser d -> Parser e -> Parser f -> (a -> b -> c -> d -> e -> f -> result) -> Parser result
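seq3 through seq6 carry no description of their own. Judging by their signatures, they extend seq2 to more parsers. A minimal sketch under that assumption follows; the name parenthesized is hypothetical.

```elm
import Peg.Parser exposing (Parser, chars, match, parse, seq3)


-- Hypothetical example: accepts a word wrapped in parentheses
-- and keeps only the word, discarding both brackets.
parenthesized : Parser String
parenthesized =
    seq3
        (match "(")
        (chars Char.isAlpha)
        (match ")")
        (\_ name _ -> name)


-- parse "(abc)" parenthesized should give Just "abc"
```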

intersperseSeq2 : Parser i -> Parser a -> Parser b -> (a -> b -> result) -> Parser result

Concatenates two parsers with the specified parser in between.

ws = match " " |> oneOrMore
varTy = choice [ \() -> match "int", \() -> match "char" ]
varName = chars Char.isAlpha
varDecl =
  intersperseSeq2
  ws varTy varName           -- matches like "int x" or "char  foo"
  (\ty name -> ( ty, name )) -- result like ("int", "x") or ("char", "foo")

intersperseSeq3 : Parser i -> Parser a -> Parser b -> Parser c -> (a -> b -> c -> result) -> Parser result

intersperseSeq4 : Parser i -> Parser a -> Parser b -> Parser c -> Parser d -> (a -> b -> c -> d -> result) -> Parser result

intersperseSeq5 : Parser i -> Parser a -> Parser b -> Parser c -> Parser d -> Parser e -> (a -> b -> c -> d -> e -> result) -> Parser result

intersperseSeq6 : Parser i -> Parser a -> Parser b -> Parser c -> Parser d -> Parser e -> Parser f -> (a -> b -> c -> d -> e -> f -> result) -> Parser result
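The larger variants are not described individually. The sketch below assumes that intersperseSeq3 behaves like intersperseSeq2 with one more parser, running the separator between each adjacent pair and dropping its result. The names ws and assignment are hypothetical.

```elm
import Peg.Parser exposing (Parser, chars, intersperseSeq3, match, oneOrMore, parse)


-- One or more spaces, used as the separator.
ws : Parser (List String)
ws =
    match " " |> oneOrMore


-- Hypothetical example: accepts input such as "x = 42".
-- Assumes the separator is required between name and "=",
-- and again between "=" and the digits.
assignment : Parser ( String, String )
assignment =
    intersperseSeq3
        ws
        (chars Char.isAlpha)
        (match "=")
        (chars Char.isDigit)
        (\name _ value -> ( name, value ))


-- Under that assumption, parse "x = 42" assignment should give Just ( "x", "42" )
```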

join : Parser i -> Parser a -> Parser (List a)

Generates a one-or-more parser with the specified separator parser in between.

chars Char.isAlpha
  |> join (match ", ")
  |> parse "foo, bar, baz" -- Just ["foo", "bar", "baz"]

infixl : Parser (a -> a -> a) -> Parser a -> Parser a

Generates a parser for left-associative infix operators. The first argument parses the operator, and the second parses the term.

The following sample parses the four arithmetic operations.

let
  nat =
    chars Char.isDigit
      |> andThen (\num ->
        case String.toInt num of
          Just i -> return i
          Nothing -> fail)

  muldiv =
    nat
      |> infixl (
        choice
        [ \_ -> match "*" |> map (always (*))
        , \_ -> match "/" |> map (always (//))
        ])

  addsub =
    muldiv
      |> infixl (
        choice
        [ \_ -> match "+" |> map (always (+))
        , \_ -> match "-" |> map (always (-))
        ])
in
  addsub
    |> parse "1+2*4-5+6/2" -- Just 7

infixr : Parser (a -> a -> a) -> Parser a -> Parser a

Generates a parser for right-associative infix operators. The first argument parses the operator, and the second parses the term.
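The difference from infixl shows up with an operator that is not associative, such as subtraction. The sketch below is not from the package; the names nat, minus, leftAssoc and rightAssoc are hypothetical, and it assumes the operator function receives the left operand first, as the arithmetic sample for infixl suggests.

```elm
import Peg.Parser exposing (Parser, andThen, chars, fail, infixl, infixr, map, match, parse, return)


-- Natural number parser, built the same way as in the infixl sample.
nat : Parser Int
nat =
    chars Char.isDigit
        |> andThen
            (\num ->
                case String.toInt num of
                    Just i ->
                        return i

                    Nothing ->
                        fail
            )


-- Parses "-" and yields the subtraction function.
minus : Parser (Int -> Int -> Int)
minus =
    match "-" |> map (always (-))


leftAssoc : Parser Int
leftAssoc =
    nat |> infixl minus


rightAssoc : Parser Int
rightAssoc =
    nat |> infixr minus


-- parse "10-3-2" leftAssoc  groups as (10-3)-2 and should give Just 5
-- parse "10-3-2" rightAssoc groups as 10-(3-2) and should give Just 9
```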