A charstring is a sequence of numbers that encodes the shape of a glyph with drawing and layout operators like moveto, lineto, and curveto.
Because both operators and their arguments are numbers, we have to differentiate the two.
The operators use the byte values 0..31 (read as unsignedInt8), with the exception of 28, which introduces a 16-bit number. Arguments use all other values.
To be able to use 0..31 as arguments too, the argument values are shifted (specifics are in Charstring.Number).
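To make the shift concrete, here is a minimal sketch of the argument encoding as laid out in the Type 2 charstring spec. It is not the actual Charstring.Number module: the module name NumberSketch and the names number and fromBytes are hypothetical, and returning a Float (so the 16.16 fixed-point form fits) is an assumption made for the example.

```elm
module NumberSketch exposing (fromBytes, number)

import Bytes exposing (Endianness(..))
import Bytes.Decode as Decode exposing (Decoder)
import Bytes.Encode as Encode


{-| Decode one argument, given its first byte (hypothetical helper).
-}
number : Int -> Decoder Float
number b0 =
    if b0 >= 32 && b0 <= 246 then
        -- one byte, shifted by 139: covers -107 .. 107
        Decode.succeed (toFloat (b0 - 139))

    else if b0 >= 247 && b0 <= 250 then
        -- two bytes: covers 108 .. 1131
        Decode.map (\b1 -> toFloat ((b0 - 247) * 256 + b1 + 108)) Decode.unsignedInt8

    else if b0 >= 251 && b0 <= 254 then
        -- two bytes: covers -1131 .. -108
        Decode.map (\b1 -> toFloat (0 - (b0 - 251) * 256 - b1 - 108)) Decode.unsignedInt8

    else if b0 == 28 then
        -- 28 is followed by a big-endian signed 16-bit integer
        Decode.map toFloat (Decode.signedInt16 BE)

    else if b0 == 255 then
        -- 255 is followed by a 16.16 fixed-point number
        Decode.map (\n -> toFloat n / 65536) (Decode.signedInt32 BE)

    else
        -- every other value below 32 is an operator, not an argument
        Decode.fail


{-| Run `number` on a list of byte values, for experimenting in the REPL.
-}
fromBytes : List Int -> Maybe Float
fromBytes bytes =
    Encode.sequence (List.map Encode.unsignedInt8 bytes)
        |> Encode.encode
        |> Decode.decode (Decode.unsignedInt8 |> Decode.andThen number)
```

For instance, `fromBytes [ 139 ]` decodes the single byte 139 as the argument 0, which is how small values such as 0..31 stay usable as arguments without colliding with operator bytes.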
The arguments come first and are pushed onto a stack (or really a deque, since we mostly use it first in, first out). When an operator is found, the arguments and the operator are bundled together.
A tricky thing is that while most operators only take these arguments, the mask operators can also chomp some bytes after the operator token. This means that we have to decode from left to right, one full operation at a time.
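The left-to-right, one-operation-at-a-time decoding can be sketched as follows. This is a simplified illustration, not the package's decoder: RawOperation, operation, byteList and run are hypothetical names, only single-byte arguments (32..246) are handled, the two-byte escape operator (12) is left out, and the number of mask bytes is passed in as maskWidth, whereas a real decoder has to derive it from the hints declared so far.

```elm
module OperationSketch exposing (RawOperation, operation, run)

import Bytes.Decode as Decode exposing (Decoder, Step(..))
import Bytes.Encode as Encode


type alias RawOperation =
    { arguments : List Int, operator : Int, mask : List Int }


{-| Decode exactly one operation: its arguments, the operator,
and (for the mask operators) the bytes that trail the operator.
-}
operation : Int -> Decoder RawOperation
operation maskWidth =
    Decode.loop [] (step maskWidth)


step : Int -> List Int -> Decoder (Step (List Int) RawOperation)
step maskWidth stack =
    Decode.unsignedInt8
        |> Decode.andThen
            (\byte ->
                if byte >= 32 && byte <= 246 then
                    -- an argument: undo the shift and keep collecting
                    Decode.succeed (Loop ((byte - 139) :: stack))

                else if byte == 19 || byte == 20 then
                    -- hintmask / cntrmask: consume the mask bytes after the operator
                    byteList maskWidth
                        |> Decode.map (\mask -> Done (RawOperation (List.reverse stack) byte mask))

                else if byte < 32 && byte /= 12 && byte /= 28 then
                    -- an ordinary operator closes the operation
                    Decode.succeed (Done (RawOperation (List.reverse stack) byte []))

                else
                    -- multi-byte numbers and escape operators: not covered by this sketch
                    Decode.fail
            )


{-| Read `n` raw bytes as a list of integers.
-}
byteList : Int -> Decoder (List Int)
byteList n =
    Decode.loop ( n, [] )
        (\( remaining, acc ) ->
            if remaining <= 0 then
                Decode.succeed (Done (List.reverse acc))

            else
                Decode.map (\b -> Loop ( remaining - 1, b :: acc )) Decode.unsignedInt8
        )


run : Int -> List Int -> Maybe RawOperation
run maskWidth bytes =
    Encode.sequence (List.map Encode.unsignedInt8 bytes)
        |> Encode.encode
        |> Decode.decode (operation maskWidth)
```

With `run 1 [ 19, 170, 149 ]`, the byte 170 is consumed as mask data rather than being read as an argument of the next operation, which is why the bytes cannot simply be split into numbers and operators up front.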
List Operation
The Charstring is what defines the actual shape of a glyph. It is a list of drawing instructions (like moveto, lineto, and curveto).
The drawing operations. For the full details, see the Type 2 charstring spec.
{ x : Basics.Int, y : Basics.Int }
A 2D point with integer coordinates
decode : { global : Subroutines, local : Maybe Subroutines } -> Bytes.Decode.Decoder Charstring
Decode a Charstring given global and local subroutines.
Array Bytes
Subroutines are initially stored as an array of Bytes objects. Global subroutines are stored in the CFF table itself (its Global Subr INDEX); local subroutines are referenced from the Private DICT.
At any point between operators in a charstring, a subroutine can be invoked. Subroutines are pieces of charstrings that occur often and are therefore factored out to save space.
Subroutines can be either global (used by all fonts in a fontset) or local (used only in this particular font).
Decoding subroutines correctly is tricky because the decoding depends on the current State, in particular the arguments on the stack (State.argumentStack).
The solution I've settled on is to store the subroutines as Bytes, and when a subroutine is called, we evaluate the normal charstring decoder with the subroutine bytes.
Storing the subroutines this way is cheap, because a Bytes slice really only stores an offset and a length. It doesn't copy the underlying Bytes.
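The idea of running the ordinary decoder on a subroutine's bytes, with the current state threaded through, might look roughly like this. It is a sketch under assumptions, not the package's implementation: callSubroutine, pushArgument and fromList are hypothetical names, the state is reduced to a plain list of arguments, and the subroutine index is taken as already resolved.

```elm
module SubroutineSketch exposing (callSubroutine, example)

import Array exposing (Array)
import Bytes exposing (Bytes)
import Bytes.Decode as Decode exposing (Decoder)
import Bytes.Encode as Encode


{-| Run `body` on the bytes of subroutine `index`, starting from the
caller's current state, and hand the resulting state back to the caller.
-}
callSubroutine : (state -> Decoder state) -> Array Bytes -> Int -> state -> Decoder state
callSubroutine body subroutines index state =
    case Array.get index subroutines |> Maybe.andThen (Decode.decode (body state)) of
        Just newState ->
            Decode.succeed newState

        Nothing ->
            -- unknown index, or the subroutine bytes failed to decode
            Decode.fail


{-| Stand-in for the real charstring decoder: read one single-byte
argument and push it onto the stack carried in the state.
-}
pushArgument : List Int -> Decoder (List Int)
pushArgument stack =
    Decode.map (\byte -> (byte - 139) :: stack) Decode.unsignedInt8


fromList : List Int -> Bytes
fromList bytes =
    Encode.encode (Encode.sequence (List.map Encode.unsignedInt8 bytes))


example : Maybe (List Int)
example =
    let
        subroutines =
            Array.fromList [ fromList [ 149 ], fromList [ 159 ] ]
    in
    -- the outer buffer is left unread: only the subroutine bytes are decoded
    Decode.decode (callSubroutine pushArgument subroutines 1 [ 10 ]) (fromList [ 0 ])
```

Here the subroutine sees the argument 10 that the caller already had on the stack and adds its own, which is the dependency on State that makes decoding subroutines ahead of time unworkable.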