Parse/decode broken JSON.
When reading these docs or the code in this module, it might be useful to refer to json.org (for the diagrams) or RFC 8259 (for the official word). The diagrams especially.
When successful, parsing returns a Json.Encode.Value. Use it with Json.Decode.decodeValue to extract the information you need into your application's data structures.
parse : String -> Result (List Parser.DeadEnd) Json.Encode.Value
Parse the given JSON string.
This assumes a spec-compliant JSON string; it will choke on "broken" JSON. This seems kind of weird for a package that's all about parsing broken JSON. However, we all have to start somewhere. Read the code, copy it, modify it, make it work for your use case.
Errors come straight from elm/parser and may not be super useful. Sorry. I may switch to elm/parser's Parser.Advanced to improve this at some point.
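As a rough sketch of the parse-then-decode flow described above: the example below assumes this module is imported as Json.Decode.Broken (the module name is not stated in this text), and the "name" field and the decodeName helper are purely hypothetical.

```elm
module ParseExample exposing (decodeName)

import Json.Decode as Decode
import Json.Decode.Broken as Broken -- assumed module name


-- Hypothetical helper: parse a JSON string, then decode a "name" field
-- from the resulting Json.Encode.Value.
decodeName : String -> Result String String
decodeName input =
    case Broken.parse input of
        Err deadEnds ->
            -- The dead ends come straight from elm/parser; here we only count them.
            Err
                ("could not parse: "
                    ++ String.fromInt (List.length deadEnds)
                    ++ " dead end(s)"
                )

        Ok value ->
            Decode.decodeValue (Decode.field "name" Decode.string) value
                |> Result.mapError Decode.errorToString
```

Keeping the two stages separate means parse failures and decode failures can be reported differently if that is useful to you.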
This isn't going to get you total flexibility, but using Config and co. will at least help you put together a consistent parser for JSON-like data. By consistent, I mean that you override, say, the number parser and it will be applied everywhere you might expect to see a number, be that at the top level, or nested within any depths of objects or arrays.
parseWith : Config -> String -> Result (List Parser.DeadEnd) Json.Encode.Value
Parse the given JSON string with a custom configuration.
Configuration for the parser.
defaultConfig : Config
Default configuration for the parser.
Using this module, a strict parser for a JSON value is defined by:
    Parser.oneOf
        [ object defaultConfig
        , array defaultConfig
        , string
        , number
        , true
        , false
        , null
        ]
According to the specification, a JSON document is: optional whitespace, a JSON value (that oneOf … expression above), then more optional whitespace. That's what the json parser does. Hence parsing a compliant JSON document is:

    Parser.run (json defaultConfig) "…"
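A minimal sketch of wrapping that call in a function, assuming the module is imported as Json.Decode.Broken. Whether json itself insists on reaching the end of input is not stated here, so this sketch adds Parser.end explicitly; the names strictDocument and parseStrict are hypothetical.

```elm
module StrictExample exposing (parseStrict)

import Json.Decode.Broken as Broken -- assumed module name
import Json.Encode as Encode
import Parser exposing ((|.), Parser)


-- A whole document: the json parser, then nothing else.
-- Parser.end is added here as a precaution against trailing content.
strictDocument : Parser Encode.Value
strictDocument =
    Broken.json Broken.defaultConfig
        |. Parser.end


parseStrict : String -> Result (List Parser.DeadEnd) Encode.Value
parseStrict input =
    Parser.run strictDocument input
```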
Those component parsers are also exposed, as are several other sub-parsers. Use them as building blocks to compose a parser for broken JSON as you need. If you need to parse non-compliant quoted strings, for example, you might start by looking at stringLiteral. It might even be best to copy just the string code from this module into your project, and use the other parsers in this module – object, array, and so on – to compose a new parser by creating a new Config or deriving from defaultConfig.
json : Config -> Parser Json.Encode.Value
Parser for JSON.
This is a JSON value surrounded by optional whitespace.
object : Config -> Parser Json.Encode.Value
Parser for a JSON object.
key : Parser String
Parser for a JSON object key.
array : Config -> Parser Json.Encode.Value
Parser for a JSON array.
string : Parser Json.Encode.Value
Parser for a quoted JSON string.
string and some of its helpers have been adapted from elm/parser's DoubleQuoteString example.
stringLiteral : Parser String -> Parser Char -> Parser String
Parser for a quoted JSON string literal.
This gives some flexibility over parsing unescaped string content and escape sequences. The literal must still start and end with ", but it's possible to change the rules for content to allow, for example, new-lines or carriage returns, or to process non-standard escape sequences.
One other difference from string is that this yields the actual String rather than a re-encoded Value. This is also used for object keys which need to be captured as String.
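As an illustration of changing the content rules, here is a sketch of a string parser that tolerates raw tabs and new-lines inside the quotes. It assumes the module is imported as Json.Decode.Broken, and that stringLiteral's first argument parses a run of unescaped content while the second parses what follows a backslash (this matches the types, but the exact looping behaviour is an assumption). lenientUnescaped and lenientString are hypothetical names.

```elm
module LenientStringExample exposing (lenientString)

import Json.Decode.Broken as Broken -- assumed module name
import Json.Encode as Encode
import Parser exposing ((|.), Parser)


-- Accept any run of characters other than a double quote or a backslash,
-- including tabs, new-lines, and carriage returns.
-- At least one character is required so that an enclosing loop always
-- makes progress.
lenientUnescaped : Parser String
lenientUnescaped =
    let
        isContent c =
            c /= '"' && c /= '\\'
    in
    Parser.getChompedString
        (Parser.chompIf isContent
            |. Parser.chompWhile isContent
        )


-- Reuse the module's own escape parser, and re-encode the result as a Value.
lenientString : Parser Encode.Value
lenientString =
    Broken.stringLiteral lenientUnescaped Broken.escape
        |> Parser.map Encode.string
```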
escape : Parser Char
Parser for an escape sequence.
This does not include the leading escape prefix, i.e. the backslash (written \\ in an Elm string literal).
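A sketch of extending the escape rules with one non-standard sequence, a backslash followed by a single quote. It assumes the module is imported as Json.Decode.Broken and that stringLiteral consumes the backslash before handing over to the escape parser, as the note above suggests. lenientEscape and stringWithQuoteEscape are hypothetical names.

```elm
module EscapeExample exposing (stringWithQuoteEscape)

import Json.Decode.Broken as Broken -- assumed module name
import Parser exposing (Parser)


-- Try the extra, non-standard escape first, then fall back to the
-- module's standard escape parser.
lenientEscape : Parser Char
lenientEscape =
    Parser.oneOf
        [ Parser.token "'" |> Broken.yields '\''
        , Broken.escape
        ]


-- A string literal with standard content rules but the extended escapes.
stringWithQuoteEscape : Parser String
stringWithQuoteEscape =
    Broken.stringLiteral Broken.unescaped lenientEscape
```

The custom branch is listed first because Parser.token fails without consuming input, so oneOf can safely move on to the standard escapes.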
unicodeHexCode : Parser String
Parser for a Unicode hexadecimal code.
E.g. "AbCd" or "1234" or "000D".
It will match exactly 4 hex digits, case-insensitive.
Goes well with hexChar.
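One way the two might be combined, sketched under the assumption that the module is imported as Json.Decode.Broken: a parser for the part of a Unicode escape that follows the backslash. The name unicodeEscape is hypothetical.

```elm
module UnicodeEscapeExample exposing (unicodeEscape)

import Json.Decode.Broken as Broken -- assumed module name
import Parser exposing ((|.), (|=), Parser)


-- Match "u" followed by exactly four hex digits, then convert the
-- digits to a Char with hexChar.
unicodeEscape : Parser Char
unicodeEscape =
    Parser.succeed Broken.hexChar
        |. Parser.token "u"
        |= Broken.unicodeHexCode
```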
unescaped : Parser String
Parser for unescaped string contents.
The JSON specifications are specific about what characters are permissible in a quoted string. Perhaps most interestingly, horizontal tabs, new-lines, and carriage returns are not permitted; these must be escaped.
number : Parser Json.Encode.Value
Parser for a JSON number.
int : Parser String
Parser for the integer portion of a JSON number.
    123.456e+78
    ^^^
frac : Parser ()
Parser for an optional fractional portion of a JSON number.
    123.456e+78
       ^^^^
exp : Parser ()
Parser for an optional exponent portion of a JSON number.
    123.456e+78
           ^^^^
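The three portions can be chained to capture the text of a whole number. This is only a sketch, assuming the module is imported as Json.Decode.Broken; it is not necessarily how number is implemented, and numberText and numberAsFloat are hypothetical names.

```elm
module NumberExample exposing (numberAsFloat)

import Json.Decode.Broken as Broken -- assumed module name
import Parser exposing ((|.), Parser)


-- Chomp the integer portion, then the optional fraction and exponent,
-- and capture everything that was chomped as a single String.
numberText : Parser String
numberText =
    Parser.getChompedString
        (Broken.int
            |. Broken.frac
            |. Broken.exp
        )


-- Convert the captured text to a Float, failing the parse if the
-- conversion is not possible.
numberAsFloat : Parser Float
numberAsFloat =
    numberText
        |> Parser.andThen
            (\text ->
                case String.toFloat text of
                    Just value ->
                        Parser.succeed value

                    Nothing ->
                        Parser.problem ("not a number: " ++ text)
            )
```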
digit : Parser ()
Parser for a single decimal digit.
digits : Parser ()
Parser for one or more decimal digits.
This chomps characters; it does not yield them. Wrap with getChompedString to obtain the matched string.
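For instance, a minimal sketch assuming the module is imported as Json.Decode.Broken (digitString is a hypothetical name):

```elm
module DigitsExample exposing (digitString)

import Json.Decode.Broken as Broken -- assumed module name
import Parser exposing (Parser)


-- digits only chomps; getChompedString turns the chomped span into a String.
digitString : Parser String
digitString =
    Parser.getChompedString Broken.digits
```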
digitsMaybe : Parser ()
Parser for zero or more decimal digits.
zero : Parser ()
Parser for a single decimal zero digit, 0.
oneNine : Parser ()
Parser for a single decimal digit between 1 and 9 inclusive.
true : Parser Json.Encode.Value
Parser for a JSON true literal.
false : Parser Json.Encode.Value
Parser for a JSON false literal.
null : Parser Json.Encode.Value
Parser for a JSON null literal.
ws : Parser ()
Parser for JSON whitespace.
This is the whitespace that appears between significant elements of JSON, and before and after JSON documents, not whitespace within quoted strings.
hexChar : String -> Char
Convert a Unicode hexadecimal code to a Char.
Useful with unicodeHexCode.
Note that ECMA 404 does not put a limit on the character ranges, i.e. it is permissible in JSON to specify a character for which Unicode does not have a character assignment. This leans on the behaviour of Char.fromCode to determine what happens for codes not covered by Unicode.
yields : a -> Parser b -> Parser a
Parser that, on success, always returns the given value a, discarding whatever the wrapped parser produced.
For example:

    token "true" |> yields (Encode.bool True)

When the token true is matched, a boolean true value is yielded.
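The same helper can be handy for non-compliant literals. As a sketch, assuming the module is imported as Json.Decode.Broken, here is a parser that accepts a few capitalisations of true; the name lenientTrue and the choice of spellings are hypothetical.

```elm
module YieldsExample exposing (lenientTrue)

import Json.Decode.Broken as Broken -- assumed module name
import Json.Encode as Encode
import Parser exposing (Parser)


-- Accept "true", "True", or "TRUE"; whichever token matches,
-- the result is the same boolean Value.
lenientTrue : Parser Encode.Value
lenientTrue =
    Parser.oneOf
        [ Parser.token "true"
        , Parser.token "True"
        , Parser.token "TRUE"
        ]
        |> Broken.yields (Encode.bool True)
```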