pithub / elm-parser-bug-workaround / Parser.Workaround

Workarounds for a bug in Parser.

The Problem

The Elm Parser internally keeps track of the current position in two ways: as a row and column, and as an offset into the source string.

See the Positions chapter in the Parser documentation for more details.

Normally both kinds of position information (row and column vs. offset) are in sync with each other. (For a given source string, you can calculate row and column from the offset and vice versa.)

There's a bug in the Parser code though. The following parsers break this synchronization: lineComment, multiComment, chompUntil, and chompUntilEndOr. They set row and column after the (closing) token, but the offset before it.

Here's an example with chompUntil:

import Parser exposing ((|.), (|=), Parser)

testParser : Parser { row : Int, col : Int, offset : Int }
testParser =
    Parser.succeed (\row col offset -> { row = row, col = col, offset = offset })
        |. Parser.chompUntil "token"
        |= Parser.getRow
        |= Parser.getCol
        |= Parser.getOffset

Parser.run testParser "< token >"
--> Ok { row = 1, col = 8, offset = 2 }

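The state after the test parser is run: row and column (1, 8) point just after "token", whereas the offset (2) still points just before it.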

Workaround

As a workaround, this package offers xxxBefore and xxxAfter parsers which consistently position both row/column and offset either before or after the (closing) token.
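
As a rough sketch of what "consistent" means here, the test parser from above can be rerun with the two chompUntil workarounds. The helper positionsAfter and the Positions alias are names made up for this illustration; the expected results in the comments are derived from the description above (everything before or everything after the token), not from a recorded run.

```elm
import Parser exposing ((|.), (|=), Parser)
import Parser.Workaround


type alias Positions =
    { row : Int, col : Int, offset : Int }


-- Hypothetical helper: run the given chomper, then report all three positions.
positionsAfter : Parser () -> Parser Positions
positionsAfter chomper =
    Parser.succeed Positions
        |. chomper
        |= Parser.getRow
        |= Parser.getCol
        |= Parser.getOffset


-- Everything points just before "token".
-- Expected: Ok { row = 1, col = 3, offset = 2 }
beforeResult : Result (List Parser.DeadEnd) Positions
beforeResult =
    Parser.run (positionsAfter (Parser.Workaround.chompUntilBefore "token")) "< token >"


-- Everything points just after "token".
-- Expected: Ok { row = 1, col = 8, offset = 7 }
afterResult : Result (List Parser.DeadEnd) Positions
afterResult =
    Parser.run (positionsAfter (Parser.Workaround.chompUntilAfter "token")) "< token >"
```

In both cases the column is the offset plus one, which is what you would expect for a single-line ASCII input.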

Why are there two different workarounds for each buggy parser?

On the one hand, if you already have working parsers and don't use the row or column information, then you can replace them with the xxxBefore variants and they will continue to work. (There's one exception for the multiComment parser though, see below.)

On the other hand, the xxxAfter variants are usually easier to use, because you don't need to chomp the (closing) token yourself. So if you write new parsers, you'll likely want to use the xxxAfter variants. Plus, if the bug is ever fixed, the fixed parsers can be expected to work like the xxxAfter variants. (If you want to know why, you can look at the description of this pull request.)
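
To illustrate the difference in usage, here is a minimal sketch of two parsers that skip ahead to a marker and then read a number. The parser names and the "id=" input format are invented for this example; only the chompUntilBefore and chompUntilAfter calls come from this package.

```elm
import Parser exposing ((|.), (|=), Parser)
import Parser.Workaround


-- Before variant: the parser stops in front of the marker,
-- so the marker still has to be chomped explicitly.
idWithBefore : Parser Int
idWithBefore =
    Parser.succeed identity
        |. Parser.Workaround.chompUntilBefore "id="
        |. Parser.token "id="
        |= Parser.int


-- After variant: the marker is already consumed.
idWithAfter : Parser Int
idWithAfter =
    Parser.succeed identity
        |. Parser.Workaround.chompUntilAfter "id="
        |= Parser.int


-- Both are expected to yield Ok 42 for this input.
beforeResult : Result (List Parser.DeadEnd) Int
beforeResult =
    Parser.run idWithBefore "user id=42"


afterResult : Result (List Parser.DeadEnd) Int
afterResult =
    Parser.run idWithAfter "user id=42"
```

The Before version mirrors how existing code built on Parser.chompUntil typically looks, which is why it works as a drop-in replacement there.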

Guidelines

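If you are unsure whether to use the xxxBefore or the xxxAfter parsers or whether to use the workarounds at all, you could follow these guidelines: in existing parsers that don't rely on row or column information, replace the affected parsers with the xxxBefore variants; in new parsers, prefer the xxxAfter variants.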

Exception

There's one exception to the rules above: if the multiComment parser is used with Nestable comments, then it isn't affected by the bug. (For this mode it's implemented differently.)

Therefore this parser should be replaced with the multiCommentAfter workaround to keep the current behavior.
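
As a sketch of how this exception might look in practice, assume existing code parses Elm-style block comments with Parser.multiComment. The delimiters "{-" and "-}" and the parser names below are chosen for illustration only.

```elm
import Parser exposing ((|.), Parser)
import Parser.Workaround


-- Nestable mode is not affected by the bug, so the After variant
-- is the one that keeps the existing behavior.
nestedComment : Parser ()
nestedComment =
    Parser.Workaround.multiCommentAfter "{-" "-}" Parser.Nestable


-- NotNestable mode is affected. Existing code that chomps the closing
-- string itself can switch to the Before variant and keep that step.
flatComment : Parser ()
flatComment =
    Parser.Workaround.multiCommentBefore "{-" "-}" Parser.NotNestable
        |. Parser.token "-}"
```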

Parsers

lineCommentBefore : String -> Parser ()

Just like Parser.lineComment except it consistently stops before the linefeed character.

lineCommentAfter : String -> Parser ()

Just like Parser.lineComment except it consistently stops after the linefeed character.

multiCommentBefore : String -> String -> Parser.Nestable -> Parser ()

Just like Parser.multiComment except it consistently stops before the last closing string.

multiCommentAfter : String -> String -> Parser.Nestable -> Parser ()

Just like Parser.multiComment except it consistently stops after the last closing string.

chompUntilBefore : String -> Parser ()

Just like Parser.chompUntil except it consistently stops before the string.

chompUntilAfter : String -> Parser ()

Just like Parser.chompUntil except it consistently stops after the string.

chompUntilEndOrBefore : String -> Parser ()

Just like Parser.chompUntilEndOr except it consistently stops before the string (if it is found).

chompUntilEndOrAfter : String -> Parser ()

Just like Parser.chompUntilEndOr except it consistently stops after the string (if it is found).