folkertdev / elm-cff / Charstring

A charstring is a sequence of numbers that encodes the shape of a glyph with drawing and layout operators like moveto, lineto, and curveto.

Because both operators and their arguments are numbers, we have to differentiate the two. Operators use the byte values 0..31 (read as unsignedInt8) and arguments use all other values. In the Type 2 format, 28 is the one exception: it introduces a 16-bit number. To make 0..31 available as arguments too, argument values are shifted (specifics are in Charstring.Number).
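
As a rough illustration of this split, the sketch below classifies the first byte of a token. The `Token` type and the `classify` function are hypothetical names and not part of this package. The shift of 139 for one-byte numbers comes from the Type 2 charstring format; the multi-byte number forms are only recognized here, not decoded.

```elm
module TokenSketch exposing (Token(..), classify)

{-| Hypothetical helper for illustration; not part of elm-cff.
-}


type Token
    = Operator Int
    | SmallNumber Int
    | MultiByteNumber Int


{-| Classify the first byte of a token (an unsigned 8-bit value, 0..255).
-}
classify : Int -> Token
classify byte =
    if byte == 28 then
        -- 28 sits in the operator range but introduces a 16-bit number
        MultiByteNumber byte

    else if byte <= 31 then
        Operator byte

    else if byte <= 246 then
        -- one-byte numbers are shifted by 139, covering -107..107
        SmallNumber (byte - 139)

    else
        -- 247..255 need one or more following bytes to form the number
        MultiByteNumber byte
```

For example, `classify 21` gives `Operator 21`, while `classify 139` gives `SmallNumber 0`: the shift is what lets small values such as 0 appear as arguments.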

The arguments come first and are pushed onto a stack (or really a deque, since we mostly use it first in, first out). When an operator is found, the arguments and the operator are bundled together.
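
The bundling can be sketched as a fold over already-classified tokens. This is a simplified illustration, not the package's decoder: `Token`, `Bundled`, and `bundle` are hypothetical names, and the sketch assumes every operator takes all pending arguments, which does not hold for operators such as subroutine calls.

```elm
module BundleSketch exposing (Bundled, Token(..), bundle)


type Token
    = Argument Int
    | Operator Int


type alias Bundled =
    { operator : Int, arguments : List Int }


{-| Walk the tokens left to right. Arguments accumulate until an operator
appears; the operator then takes everything collected so far.
-}
bundle : List Token -> List Bundled
bundle tokens =
    let
        step token ( pending, done ) =
            case token of
                Argument n ->
                    -- newest argument at the front; reversed when bundled
                    ( n :: pending, done )

                Operator op ->
                    ( []
                    , { operator = op, arguments = List.reverse pending } :: done
                    )
    in
    List.foldl step ( [], [] ) tokens
        |> Tuple.second
        |> List.reverse
```

With the input `[ Argument 100, Argument 200, Operator 21, Argument 50, Operator 6 ]`, the result pairs operator 21 with `[ 100, 200 ]` and operator 6 with `[ 50 ]`, keeping the arguments in the order they were written.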

A tricky thing is that while most operators take only the arguments that precede them, the mask operators also consume some bytes after the operator token. This means that we have to decode from left to right, one full operation at a time.
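
A minimal sketch of that look-ahead, using elm/bytes. The function names are hypothetical and this is not the package's own decoder. It relies on two details of the Type 2 format: the mask operators are `hintmask` (19) and `cntrmask` (20), and the mask holds one bit per stem hint declared so far, rounded up to whole bytes. The stem count is passed in as a parameter here; a real decoder would track it in its state.

```elm
module MaskSketch exposing (maskOperator)

import Bytes.Decode as Decode exposing (Decoder)


{-| Decode a mask operator together with the mask bytes that follow it.
`stemCount` is the number of stem hints declared so far.
-}
maskOperator : Int -> Decoder ( Int, List Int )
maskOperator stemCount =
    let
        -- one bit per stem, rounded up to a whole number of bytes
        maskLength =
            (stemCount + 7) // 8
    in
    Decode.unsignedInt8
        |> Decode.andThen
            (\operator ->
                if operator == 19 || operator == 20 then
                    Decode.map (\mask -> ( operator, mask ))
                        (unsignedInt8s maskLength)

                else
                    Decode.fail
            )


{-| Read `n` unsigned bytes into a list.
-}
unsignedInt8s : Int -> Decoder (List Int)
unsignedInt8s n =
    Decode.loop ( n, [] )
        (\( remaining, acc ) ->
            if remaining <= 0 then
                Decode.succeed (Decode.Done (List.reverse acc))

            else
                Decode.map
                    (\byte -> Decode.Loop ( remaining - 1, byte :: acc ))
                    Decode.unsignedInt8
        )
```

Because the number of mask bytes depends on hints seen earlier in the same charstring, the bytes cannot be tokenized in a separate pass first.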


type alias Charstring =
    List Operation

The Charstring is what defines the actual shape of a glyph. It is a list of drawing instructions (like moveto, lineto, and curveto).


type Operation
    = HintMask (List Basics.Int)
    | CounterMask (List Basics.Int)
    | HStem Basics.Int Basics.Int
    | VStem Basics.Int Basics.Int
    | Width Basics.Int
    | MoveTo Point
    | LineTo Point
    | CurveTo Point Point Point

The drawing operations. For the full details see the Type 2 Charstring Format specification (Adobe Technical Note #5177).


type alias Point =
    { x : Basics.Int, y : Basics.Int }

A 2D point with integer coordinates.
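
To show how these types fit together, here is a small sketch that renders a Charstring as SVG path data. `toPathData` is a hypothetical helper, not part of this package. It assumes that the points are absolute coordinates and that CurveTo carries two control points followed by the end point; the documentation above states neither, so check both against real output before relying on it.

```elm
module PathSketch exposing (toPathData)

import Charstring exposing (Charstring, Operation(..), Point)


{-| Turn the drawing operations into SVG path data.
-}
toPathData : Charstring -> String
toPathData charstring =
    charstring
        |> List.filterMap segment
        |> String.join " "


segment : Operation -> Maybe String
segment operation =
    case operation of
        MoveTo p ->
            Just ("M " ++ point p)

        LineTo p ->
            Just ("L " ++ point p)

        CurveTo c1 c2 end ->
            -- assumed order: first control, second control, end point
            Just ("C " ++ String.join " " (List.map point [ c1, c2, end ]))

        -- hints, masks and the width do not draw anything
        _ ->
            Nothing


point : Point -> String
point { x, y } =
    String.fromInt x ++ "," ++ String.fromInt y
```

For instance, `[ MoveTo { x = 0, y = 0 }, LineTo { x = 100, y = 0 }, LineTo { x = 50, y = 80 } ]` becomes `"M 0,0 L 100,0 L 50,80"`.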

decode : { global : Subroutines, local : Maybe Subroutines } -> Bytes.Decode.Decoder Charstring

Decode a Charstring given global and local subroutines.
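
A minimal usage sketch based on the signature above. `decodeGlyph` is a hypothetical wrapper, and it assumes a font without any subroutines, so the global array is empty and there are no local subroutines.

```elm
module DecodeSketch exposing (decodeGlyph)

import Array
import Bytes exposing (Bytes)
import Bytes.Decode as Decode
import Charstring exposing (Charstring)


{-| Run the charstring decoder on the raw bytes of one glyph, for a font
that defines no subroutines at all.
-}
decodeGlyph : Bytes -> Maybe Charstring
decodeGlyph glyphBytes =
    Decode.decode
        (Charstring.decode { global = Array.empty, local = Nothing })
        glyphBytes
```

The result is `Nothing` when the bytes do not form a valid charstring, as with any elm/bytes decoder.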


type alias Subroutines =
    Array Bytes

Subroutines are initially stored as an array of Bytes objects. Global subroutines are a CFF table; local subroutines are part of the PRIVATE table.

At any point between operators in a charstring, a subroutine can be invoked. Subroutines are pieces of charstrings that occur often and are therefore abstracted to save space.

Subroutines can be either global (used by all fonts in a fontset) or local (used only in this particular font). Decoding subroutines correctly is tricky because the decoding depends on the current State, in particular the arguments on the stack (State.argumentStack).

The solution I've settled on is to store the subroutines as Bytes, and when a subroutine is called, we evaluate the normal charstring decoder with the subroutine bytes. Storing the subroutines this way is cheap, because a Bytes slice really only stores an offset and a length. It doesn't copy the underlying Bytes.
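
The lookup step might look roughly like the sketch below. `bias` and `runSubroutine` are hypothetical names, not the package's internals. The bias values are taken from the Type 2 charstring format, which stores the operand of a subroutine call offset by an amount that depends on the number of subroutines. How the current State is passed into the decoder is left out.

```elm
module SubroutineSketch exposing (bias, runSubroutine)

import Array exposing (Array)
import Bytes exposing (Bytes)
import Bytes.Decode as Decode exposing (Decoder)


{-| The bias added to a subroutine operand, chosen by subroutine count.
-}
bias : Int -> Int
bias count =
    if count < 1240 then
        107

    else if count < 33900 then
        1131

    else
        32768


{-| Look up a subroutine by its (biased) operand and run `decoder` on the
stored bytes. Gives `Nothing` for an unknown index or a failed decode.
-}
runSubroutine : Decoder a -> Array Bytes -> Int -> Maybe a
runSubroutine decoder subroutines operand =
    Array.get (operand + bias (Array.length subroutines)) subroutines
        |> Maybe.andThen (Decode.decode decoder)
```

Nothing is decoded until a call actually happens, so subroutines that a glyph never invokes cost only the array slot that holds their Bytes.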