matheus23 / elm-markdown-transforms / Markdown.Scaffolded

Rendering Markdown with Scaffolds, Reducers and Folds

(This is called recursion-schemes in other languages, but don't worry: you don't have to write recursive functions. That is the whole point of this module ;) )

This module provides a more complicated, but also more powerful and composable way of rendering markdown than the built-in elm-markdown Renderer.

If you feel a little overwhelmed with this module at first, I recommend taking a look at the What are reducers? section.

Main Datastructure


type Block children
    = Heading ({ level : Markdown.Block.HeadingLevel, rawText : String, children : List children })
    | Paragraph (List children)
    | BlockQuote (List children)
    | Text String
    | CodeSpan String
    | Strong (List children)
    | Emphasis (List children)
    | Strikethrough (List children)
    | Link ({ title : Maybe String, destination : String, children : List children })
    | Image ({ alt : String, src : String, title : Maybe String })
    | UnorderedList ({ items : List (Markdown.Block.ListItem children) })
    | OrderedList ({ startingIndex : Basics.Int, items : List (List children) })
    | CodeBlock ({ body : String, language : Maybe String })
    | HardLineBreak
    | ThematicBreak
    | Table (List children)
    | TableHeader (List children)
    | TableBody (List children)
    | TableRow (List children)
    | TableCell (Maybe Markdown.Block.Alignment) (List children)
    | TableHeaderCell (Maybe Markdown.Block.Alignment) (List children)

A datatype that enumerates all possible ways markdown could wrap some children.

Kind of like a 'Scaffold' around something that's already built, which will get torn down after building is finished.

This does not include Html tags.

If you look at the left hand sides of all of the functions in the elm-markdown Renderer, you'll notice a similarity to this custom type, except it's missing a type for 'html'.

Defining this data structure has some advantages in composing multiple Renderers.

It has a type parameter children, which is supposed to be filled with String, Html msg or similar. Take a look at some reducers for examples of this.

There are some neat tricks you can do with this data structure, for example, Block Never represents only non-nested blocks of markdown.

map : (a -> b) -> Block a -> Block b

Transform each child of a Block using the given function.

For example, we can transform the lists of words inside each block into concatenated Strings:

wordsToWordlist : Block (List String) -> Block String
wordsToWordlist block =
    map (\listOfWords -> String.join ", " listOfWords)
        block

Paragraph
    [ [ "This", "paragraph", "was", "full", "of", "individual", "words", "once." ]
    , [ "It", "also", "contained", "another", "paragraph" ]
    ]
    |> wordsToWordlist
--> Paragraph
-->     [ "This, paragraph, was, full, of, individual, words, once."
-->     , "It, also, contained, another, paragraph"
-->     ]

HardLineBreak |> wordsToWordlist
--> HardLineBreak

The ability to define this function is one of the reasons for our Block definition. If you try defining map for elm-markdown's Renderer you'll find out it doesn't work.

indexedMap : (List Basics.Int -> a -> b) -> Block a -> Block b

A Block's children are indexed from 0 to n (if n+1 is the number of children).

Most arguments to the mapping function are therefore [0], [1], ... etc.

All children will get unique List Int arguments.

In some cases like lists, there might be two levels of indices: [0,0], or [1,0].

In these cases, the first integer is the 'closest' index from the point of view of the child.

OrderedList
    { startingIndex = 0
    , items =
        [ [ (), () ]
        , [ (), (), () ]
        ]
    }
    |> indexedMap (\indices _ -> indices)
--> OrderedList
-->     { startingIndex = 0
-->     , items =
-->         [ [ [ 0, 0 ], [ 1, 0 ] ]
-->         , [ [ 0, 1 ], [ 1, 1 ], [ 2, 1 ] ]
-->         ]
-->     }

High-level Transformations

These functions are not as composable as the transformation building blocks, but might suffice for your use case. Take a look at the Transformation Building Blocks section if you find you need something more flexible.

parameterized : (Block view -> environment -> view) -> Block (environment -> view) -> environment -> view

Use this function if you want to parameterize your view by an environment.

Another way of thinking about this use-case is: use this if you want to 'render to functions'.

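Examples of what the environment type variable can be: a Model (to render to Model -> Html Msg), or a TemplateInfo (to render to TemplateInfo -> Html msg).

Usually, for the above use cases you would have to define a function of type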

reduceTemplate :
    Block (TemplateInfo -> Html msg)
    -> (TemplateInfo -> Html msg)

for example, so that you can turn it back into a Renderer (TemplateInfo -> Html msg) for elm-markdown.

If you were to define such a function, you would have to pass around the TemplateInfo parameter a lot. This function will take care of that for you.
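As a rough sketch of what that could look like, the following defines reduceTemplate via parameterized. The TemplateInfo record and its accentColor field are hypothetical and only serve as an illustration; the reducer simply forwards to reduceHtml and uses the environment to pick a style.

```elm
module TemplateExample exposing (TemplateInfo, reduceTemplate)

import Html exposing (Html)
import Html.Attributes as Attr
import Markdown.Scaffolded as Scaffolded exposing (Block)


{-| Hypothetical environment type, made up for this example.
-}
type alias TemplateInfo =
    { accentColor : String }


{-| The lambda only sees children that are already rendered (Html msg),
plus the environment. `parameterized` passes the environment on to
all children for us.
-}
reduceTemplate : Block (TemplateInfo -> Html msg) -> (TemplateInfo -> Html msg)
reduceTemplate =
    Scaffolded.parameterized
        (\block info ->
            Scaffolded.reduceHtml [ Attr.style "color" info.accentColor ] block
        )
```

Note that the reducer never has to apply any child to info by hand.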

Anti use-cases

In some cases using this function would be overkill. The alternative to this function is to simply parameterize your whole renderer (and not use this library):

renderMarkdown : List String -> Block (Html Msg) -> Html Msg
renderMarkdown censoredWords markdown =
    ...

renderer : List String -> Markdown.Renderer (Html Msg)
renderer censoredWords =
    toRenderer
        { renderHtml = ...
        , renderMarkdown = renderMarkdown censoredWords
        }

In this example you can see how we pass through the 'censored words'. It behaves kind of like some global context in which we create our renderer.

It is hard to convey the abstract notion of when to use parameterized and when not to. I'll give it a try: if you want to parse your markdown once and need to quickly render different versions of it (for example with different Models or different TemplateInfos), then use this. If you only want to decouple some variable from your renderer that is mostly static (for example censored words), don't use this.

parameterized over multiple Parameters

If you want to parameterize your renderer over multiple variables, there are two options:

  1. Add a field to the environment type used in this function
  2. Take another parameter in curried form

Although both are possible, I highly recommend the first option, as it is by far easier to deal with only one call to parameterized, not with two calls that would be required for option 2.

Missing Functionality

If this function doesn't quite do what you want, just try to re-create what you need by using map directly. parameterized basically just documents a pattern that is really easy to re-create: Its implementation is just 1 line of code.
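A minimal sketch of how such a re-creation with map might look is shown below. The name parameterizedSketch is made up for this example, and this is not necessarily the library's actual source, only one definition that fits the type signature of parameterized.

```elm
module ParameterizedSketch exposing (parameterizedSketch)

import Markdown.Scaffolded as Scaffolded exposing (Block)


{-| Apply every child (a function of the environment) to the
environment, then hand the resulting block to the reducer.
-}
parameterizedSketch :
    (Block view -> environment -> view)
    -> Block (environment -> view)
    -> environment
    -> view
parameterizedSketch reducer block environment =
    reducer (Scaffolded.map (\render -> render environment) block) environment
```

Starting from a definition like this, you can change how or when the environment is passed to the children.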

validating : (Block view -> Result error view) -> Block (Result error view) -> Result error view

This transform enables validating the content of your Block before rendering.

The most prominent use case for this function is linting markdown files.

But it might also be possible that your view type can't always be reduced from a Block view to a view, so you need to generate an error in these cases.
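As an illustration, here is a sketch of a small lint that rejects links that do not start with "https://". The rule itself and the name reduceChecked are assumptions made up for this example; everything else is rendered with reduceHtml.

```elm
module LintExample exposing (reduceChecked)

import Html exposing (Html)
import Markdown.Scaffolded as Scaffolded exposing (Block)


{-| Fails with an error message for non-https links and renders
every other block with `reduceHtml`.
-}
reduceChecked : Block (Result String (Html msg)) -> Result String (Html msg)
reduceChecked =
    Scaffolded.validating
        (\block ->
            case block of
                Scaffolded.Link { destination } ->
                    if String.startsWith "https://" destination then
                        Ok (Scaffolded.reduceHtml [] block)

                    else
                        Err ("Non-https link: " ++ destination)

                _ ->
                    Ok (Scaffolded.reduceHtml [] block)
        )
```

The lambda only deals with children that were rendered successfully; errors from nested blocks are propagated by validating.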

Missing Functionality

If this function doesn't quite do what you need to do, try using foldResults. The validating definition basically just documents a common pattern. Its implementation is just 1 line of code.

withDataSource : (Block view -> DataSource view) -> Block (DataSource view) -> DataSource view

This transform allows you to perform elm-pages' DataSource requests without having to think about how to thread these through your renderer.

Some applications that can be realized like this:

Missing Functionality

If this function doesn't quite do what you need to do, try using foldStaticHttpRequests. The withDataSource definition basically just documents a common pattern. Its implementation is just 1 line of code.

Transformation Building Blocks

reduceHtml : List (Html.Attribute msg) -> Block (Html msg) -> Html msg

This will reduce a Block to Html similar to what the defaultHtmlRenderer in elm-markdown does. That is, it renders similar to what the CommonMark spec expects.

It also takes a list of attributes for convenience, so if you want to attach styles, id's, classes or events, you can use this.

However, the attributes parameter is ignored for Text nodes.

reduceWords : Block (List String) -> List String

Extracts all words from the blocks and inlines. Excludes any markup characters, if they had an effect on the markup.

The words are split according to the \s JavaScript regular expression (regex).

Inline code spans are split into words and included, but code blocks are ignored.

If you need something more specific, I highly recommend rolling your own function for this.

This is useful if you need to e.g. create header slugs.
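For instance, a header slug could be sketched like this. The name headingSlug and the choice of lower-casing and joining with "-" are assumptions for illustration; real slugs usually also need to strip punctuation.

```elm
module SlugExample exposing (headingSlug)

import Markdown.Scaffolded as Scaffolded exposing (Block)


{-| Collect the words of a block whose children were already reduced
to word lists, lower-case them and join them with dashes.
-}
headingSlug : Block (List String) -> String
headingSlug block =
    Scaffolded.reduceWords block
        |> List.map String.toLower
        |> String.join "-"
```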

reducePretty : Block String -> String

Convert a block of markdown back to markdown text. (See the 'Formatting Markdown' test in the test suite.)

This just renders one particular style of markdown. Your use-case might need something completely different. I recommend taking a look at the source code and adapting it to your needs.

Note: This function doesn't support GFM tables. The function Markdown.PrettyTables.reducePrettyTable extends this function with table pretty-printing. Table pretty-printing is complicated, even when ignoring column sizes. The type Block String -> String is just "not powerful" enough to render a table to a string in such a way that it is syntactically valid again.

reduce : { accumulate : List a -> a, extract : Block a -> a } -> Block a -> a

Reduces a block down to anything that can be accumulated.

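You provide two functions: accumulate, which describes how a list of values of type a is combined into one (for example List.sum or List.concat), and extract, which describes how a single block produces a value to be accumulated.

For example, this can count the number of headings in a markdown document: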

reduce
    { accumulate = List.sum
    , extract =
        \block ->
            case block of
                Heading _ ->
                    1

                _ ->
                    0
    }

Or this extracts code blocks:

reduce
    { accumulate = List.concat
    , extract =
        \block ->
            case block of
                CodeBlock codeBlock ->
                    [ codeBlock ]

                _ ->
                    []
    }

The special thing about this function is how you don't have to worry about accumulating the other generated values recursively.

foldFunction : Block (environment -> view) -> environment -> Block view

Transform a block that contains functions into a function that produces blocks.

One really common use-case is having access to a Model inside your html renderers. In these cases you want your markdown to be 'rendered to a function'.

So let's say you've got a Markdown.Html.Renderer like so:

renderHtml :
    Markdown.Html.Renderer
        (List (Model -> Html Msg)
         -> (Model -> Html Msg)
        )

It has this type to be able to depend on the Model. Eventually you'll want to render to Model -> Html Msg.

So now you can define your Markdown.Renderer.Renderer like so:

renderer : Markdown.Renderer.Renderer (Model -> Html Msg)
renderer =
    toRenderer
        { renderHtml = renderHtml
        , renderMarkdown = renderMarkdown
        }

renderMarkdown :
    Block (Model -> Html Msg)
    -> (Model -> Html Msg)
renderMarkdown block model =
    foldFunction block
        -- ^ result : Model -> Block (Html Msg)
        model
        -- ^ result : Block (Html Msg)
        |> reduceHtml []
        -- ^ result : Html Msg

foldResults : Block (Result error view) -> Result error (Block view)

Thread results through your Blocks.

The input is a block that contains possibly failed views. The output becomes Err, if any of the input block's children had an error (then it's the first error). If all of the block's children were Ok, then the result is going to be Ok.
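A small sketch of both cases, using String as a stand-in for the view type (the module and value names are made up for this example):

```elm
module FoldResultsExample exposing (allOk, withError)

import Markdown.Scaffolded as Scaffolded exposing (Block)


{-| All children succeeded, so the whole block is Ok:
Ok (Paragraph [ "a", "b" ])
-}
allOk : Result String (Block String)
allOk =
    Scaffolded.foldResults (Scaffolded.Paragraph [ Ok "a", Ok "b" ])


{-| One child failed, so the whole block fails:
Err "broken link"
-}
withError : Result String (Block String)
withError =
    Scaffolded.foldResults (Scaffolded.Paragraph [ Ok "a", Err "broken link" ])
```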

foldStaticHttpRequests : Block (DataSource view) -> DataSource (Block view)

Accumulate elm-pages' DataSources over blocks.

Using this, it is possible to write reducers that produce views as a result of performing static http requests.

foldIndexed : Block (List Basics.Int -> view) -> List Basics.Int -> Block view

Fold your blocks with index information. This uses indexedMap under the hood.

This is quite advanced, but also very useful. If you're looking for a working example, please take a look at the test for this function.

What are 'reducers'?

In the context of this library, we're often working with functions of the type Block view -> view, where view might be something like Html Msg or String, or, more generally, with functions of the structure Block a -> b.

I refer to functions of that structure as 'reducers'. (This is somewhat different from the 'real' terminology, but I feel the name captures the nature of 'reducing once' very well.)

If you know List.foldr you already know an example for a reducer (the first argument)! The reducers in this module are no different, we just write them in different ways.

We can do the same thing we did for this library for lists:

type ListScaffold elem a
    = Empty
    | Cons elem a

reduceEmpty = 0

reduceCons a b = a + b

handler listElement =
    case listElement of
        Empty ->
            reduceEmpty

        Cons elem accumulated ->
            reduceCons elem accumulated

foldr : (ListScaffold a b -> b) -> List a -> b
foldr handle list =
    case list of
        [] ->
            handle Empty

        x :: xs ->
            handle (Cons x (foldr handle xs))

foldr handler == List.foldr reduceCons reduceEmpty

The last line illustrates how the different ways of writing these reducers relate: for List.foldr we simply provide the cases (empty or cons) as separate arguments, whereas for reducers in this library we create a custom type with one case for empty and one for cons.

What are 'folds'?

Some functions have a type that is similar to, but not quite, the type of a reducer. For example:

All of these examples have the structure Block (F a) -> F (Block a) for some F. You might have to squint your eyes at the last two of these examples. Especially the last one. Let me rewrite it with a type alias:

type alias Function a b =
    a -> b

foldFunction : Block (Function env a) -> Function env (Block a)

Combining Reducers

You can combine multiple 'reducers' into one. There's no function for doing this, but a pattern you might want to follow.

Let's say you want to accumulate both all the words in your markdown and the Html you want it to render to, then you can do this:

type alias Rendered =
    { html : Html Msg
    , words : List String
    }

reduceRendered : Block Rendered -> Rendered
reduceRendered block =
    { html = block |> map .html |> reduceHtml []
    , words = block |> map .words |> reduceWords
    }

If you want to render to more things, just add another field to the record type and follow the pattern. It is even possible to let the rendered html depend on the words inside itself (or on something else you're additionally reducing to).

Conversions

Did you already start to write a custom elm-markdown Renderer, but want to use this library? Don't worry. They're compatible. You can convert between them!

fromRenderer : Markdown.Renderer.Renderer view -> Block view -> view

There are two ways of thinking about this function:

  1. Render a Block using the given elm-markdown Renderer.
  2. Extract a function of type (Block view -> view) out of the elm-markdown Renderer. This is useful if you want to make use of the utilities present in this library.
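For example, the second way of thinking about it could be sketched like this, assuming elm-markdown's defaultHtmlRenderer as the starting point (the name reduceDefault is made up for this example):

```elm
module FromRendererExample exposing (reduceDefault)

import Html exposing (Html)
import Markdown.Renderer
import Markdown.Scaffolded as Scaffolded exposing (Block)


{-| Turn elm-markdown's default renderer into a reducer, so it can be
combined with the other utilities in this module.
-}
reduceDefault : Block (Html msg) -> Html msg
reduceDefault =
    Scaffolded.fromRenderer Markdown.Renderer.defaultHtmlRenderer
```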

toRenderer : { renderMarkdown : Block view -> view, renderHtml : Markdown.Html.Renderer (List view -> view) } -> Markdown.Renderer.Renderer view

Convert a function that works with Block to a Renderer for use with elm-markdown.

(The second parameter is a Markdown.Html.Renderer)
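A minimal end-to-end sketch, assuming no custom html tags are needed (hence the empty Markdown.Html.oneOf) and using reduceHtml as the reducer. The names renderer and render are made up for this example.

```elm
module ToRendererExample exposing (render, renderer)

import Html exposing (Html)
import Markdown.Html
import Markdown.Parser
import Markdown.Renderer
import Markdown.Scaffolded as Scaffolded


{-| A Renderer built from a reducer. Html tags in the markdown are not
handled, since the list of html renderers is empty.
-}
renderer : Markdown.Renderer.Renderer (Html msg)
renderer =
    Scaffolded.toRenderer
        { renderHtml = Markdown.Html.oneOf []
        , renderMarkdown = Scaffolded.reduceHtml []
        }


{-| Parse a markdown string and render it with the renderer above.
-}
render : String -> Result String (List (Html msg))
render markdown =
    markdown
        |> Markdown.Parser.parse
        |> Result.mapError
            (List.map Markdown.Parser.deadEndToString >> String.join "\n")
        |> Result.andThen (Markdown.Renderer.render renderer)
```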

Utilities

I mean to aggregate utilities for transforming Blocks in this section.

bumpHeadings : Basics.Int -> Block view -> Block view

Bump all Heading elements by the given positive number of levels. Levels are capped at H6, and non-positive amounts leave the heading unchanged.

import Markdown.Block as Block

bumpHeadings 2
    (Heading
        { level = Block.H1
        , rawText = ""
        , children = []
        }
    )
--> Heading
-->     { level = Block.H3
-->     , rawText = ""
-->     , children = []
-->     }

bumpHeadings 1
    (Heading
        { level = Block.H6
        , rawText = ""
        , children = []
        }
    )
--> Heading
-->     { level = Block.H6
-->     , rawText = ""
-->     , children = []
-->     }

bumpHeadings -1
    (Heading
        { level = Block.H2
        , rawText = ""
        , children = []
        }
    )
--> Heading
-->     { level = Block.H2
-->     , rawText = ""
-->     , children = []
-->     }