Pandoc User’s Guide
John MacFarlane
August 7, 2018
Synopsis
pandoc [options] [input-file]…
Description
Pandoc is a
Haskell library
for converting from one markup format to another, and a command-line
tool that uses this library.
Pandoc can convert between numerous markup and word processing
formats, including, but not limited to, various flavors of
Markdown,
HTML,
LaTeX and
Word
docx. For the full lists of input and output formats, see the
--from and --to
options below. Pandoc can
also produce
PDF output: see
creating a PDF, below.
Pandoc’s enhanced version of Markdown includes syntax for
tables,
definition lists,
metadata blocks,
footnotes,
citations,
math, and much more. See below under
Pandoc’s Markdown.
Pandoc has a modular design: it consists of a set of readers, which
parse text in a given format and produce a native representation of
the document (an abstract syntax tree or AST),
and a set of writers, which convert this native representation into
a target format. Thus, adding an input or output format requires
only adding a reader or writer. Users can also run custom
pandoc
filters to modify the intermediate AST.
Because pandoc’s intermediate representation of a document is less
expressive than many of the formats it converts between, one should
not expect perfect conversions between every format and every other.
Pandoc attempts to preserve the structural elements of a document,
but not formatting details such as margin size. And some document
elements, such as complex tables, may not fit into pandoc’s simple
document model. While conversions from pandoc’s Markdown to all
formats aspire to be perfect, conversions from formats more
expressive than pandoc’s Markdown can be expected to be lossy.
Using pandoc
If no input-files are specified, input is
read from stdin. Output goes to
stdout by default. For output to a file, use
the -o option:
pandoc -o output.html input.txt
By default, pandoc produces a document fragment. To produce a
standalone document (e.g. a valid HTML file including
<head> and
<body>), use the -s or
--standalone flag:
pandoc -s -o output.html input.txt
For more information on how standalone documents are produced, see
Templates below.
If multiple input files are given, pandoc will
concatenate them all (with blank lines between them) before
parsing. (Use --file-scope to parse files
individually.)
Specifying formats
The format of the input and output can be specified explicitly
using command-line options. The input format can be specified
using the -f/--from option, the output format
using the -t/--to option. Thus, to convert
hello.txt from Markdown to LaTeX, you could
type:
pandoc -f markdown -t latex hello.txt
To convert hello.html from HTML to Markdown:
pandoc -f html -t markdown hello.html
Supported input and output formats are listed below under
Options (see -f
for input formats and -t for output formats).
You can also use pandoc --list-input-formats
and pandoc --list-output-formats to print lists
of supported formats.
If the input or output format is not specified explicitly,
pandoc will attempt to guess it from the
extensions of the filenames. Thus, for example,
pandoc -o hello.tex hello.txt
will convert hello.txt from Markdown to LaTeX.
If no output file is specified (so that output goes to
stdout), or if the output file’s extension is
unknown, the output format will default to HTML. If no input file
is specified (so that input comes from
stdin), or if the input files’ extensions are
unknown, the input format will be assumed to be Markdown.
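The guessing rule described above can be sketched in Python. This is only an illustration of the documented defaults, not pandoc's implementation: the function names are hypothetical, and the extension table is a small illustrative subset.

```python
import os

# Illustrative, partial extension table (an assumption); pandoc knows many more.
EXTENSION_FORMATS = {
    ".md": "markdown",
    ".txt": "markdown",
    ".html": "html",
    ".tex": "latex",
    ".docx": "docx",
}

def guess_format(filename, default):
    """Return the format implied by a filename's extension, or `default`."""
    if filename is None:  # None stands for stdin or stdout
        return default
    extension = os.path.splitext(filename)[1].lower()
    return EXTENSION_FORMATS.get(extension, default)

def guess_formats(input_file=None, output_file=None):
    """Apply the documented fallbacks: Markdown for input, HTML for output."""
    return (guess_format(input_file, "markdown"),
            guess_format(output_file, "html"))

print(guess_formats("hello.txt", "hello.tex"))
print(guess_formats())
```

The two calls correspond to pandoc -o hello.tex hello.txt and to a bare pandoc reading stdin and writing stdout.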
Character encoding
Pandoc uses the UTF-8 character encoding for both input and
output. If your local character encoding is not UTF-8, you should
pipe input and output through
iconv:
iconv -t utf-8 input.txt | pandoc | iconv -f utf-8
Note that in some output formats (such as HTML, LaTeX, ConTeXt,
RTF, OPML, DocBook, and Texinfo), information about the character
encoding is included in the document header, which will only be
included if you use the -s/--standalone option.
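The two iconv steps above amount to decoding from the local encoding before pandoc runs and re-encoding afterwards. A minimal Python sketch of the same round trip follows; the helper names are hypothetical and Latin-1 is used only as an example of a non-UTF-8 local encoding.

```python
def to_utf8(data: bytes, source_encoding: str) -> bytes:
    """Decode bytes from source_encoding and re-encode them as UTF-8."""
    return data.decode(source_encoding).encode("utf-8")

def from_utf8(data: bytes, target_encoding: str) -> bytes:
    """Decode UTF-8 bytes and re-encode them in target_encoding."""
    return data.decode("utf-8").encode(target_encoding)

local_bytes = "café".encode("latin-1")        # what a Latin-1 file would contain
utf8_bytes = to_utf8(local_bytes, "latin-1")  # what pandoc expects on input
print(utf8_bytes)
print(from_utf8(utf8_bytes, "latin-1") == local_bytes)
```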
Creating a PDF
To produce a PDF, specify an output file with a
.pdf extension:
pandoc test.txt -o test.pdf
By default, pandoc will use LaTeX to create the PDF, which
requires that a LaTeX engine be installed (see
--pdf-engine below).
Alternatively, pandoc can use
ConTeXt,
pdfroff, or any of the following
HTML/CSS-to-PDF-engines, to create a PDF:
wkhtmltopdf,
weasyprint
or
prince.
To do this, specify an output file with a .pdf
extension, as before, but add the --pdf-engine
option or -t context,
-t html, or -t ms to the
command line (-t html defaults to
--pdf-engine=wkhtmltopdf).
PDF output can be controlled using
variables for LaTeX (if
LaTeX is used) and variables
for ConTeXt (if ConTeXt is used). When using an
HTML/CSS-to-PDF-engine, --css affects the
output. If wkhtmltopdf is used, then the
variables margin-left,
margin-right, margin-top,
margin-bottom, footer-html,
header-html and papersize
will affect the output.
To debug the PDF creation, it can be useful to look at the
intermediate representation: instead of
-o test.pdf, use for example
-s -o test.tex to output the generated LaTeX.
You can then test it with pdflatex test.tex.
When using LaTeX, the following packages need to be available
(they are included with all recent versions of
TeX Live):
amsfonts,
amsmath,
lm,
unicode-math,
ifxetex,
ifluatex,
listings
(if the --listings option is used),
fancyvrb,
longtable,
booktabs,
graphicx
and
grffile
(if the document contains images),
hyperref,
xcolor
(with colorlinks),
ulem,
geometry
(with the geometry variable set),
setspace
(with linestretch), and
babel
(with lang). The use of
xelatex or lualatex as the
LaTeX engine requires
fontspec.
xelatex uses
polyglossia
(with lang),
xecjk,
and
bidi
(with the dir variable set). If the
mathspec variable is set,
xelatex will use
mathspec
instead of
unicode-math.
The
upquote
and
microtype
packages are used if available, and
csquotes
will be used for typography if
\usepackage{csquotes} is present in the
template or included via
-H/--include-in-header. The
natbib,
biblatex,
bibtex,
and
biber
packages can optionally be used for
citation rendering.
Reading from the Web
Instead of an input file, an absolute URI may be given. In this
case pandoc will fetch the content using HTTP:
pandoc -f html -t markdown http://www.fsf.org
It is possible to supply a custom User-Agent string or other
header when requesting a document from a URL:
pandoc -f html -t markdown --request-header User-Agent:"Mozilla/5.0" \
http://www.fsf.org
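For comparison, the same NAME:VAL header can be attached to a request with Python's standard library. This sketch only builds the request object and sends nothing; build_request is a hypothetical helper, not part of pandoc.

```python
import urllib.request

def build_request(url, header):
    """Split a NAME:VAL string and attach it to a urllib request."""
    name, _, value = header.partition(":")
    return urllib.request.Request(url, headers={name.strip(): value.strip()})

request = build_request("http://www.fsf.org", "User-Agent:Mozilla/5.0")
# urllib normalizes header names with str.capitalize(), hence "User-agent".
print(request.get_header("User-agent"))
```

Nothing is fetched until the request is passed to urllib.request.urlopen.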
Options
General options
-f FORMAT,
-r FORMAT,
--from=FORMAT,
--read=FORMAT
Specify input format. FORMAT can be:
commonmark (CommonMark Markdown)
creole (Creole 1.0)
docbook (DocBook)
docx (Word docx)
epub (EPUB)
fb2 (FictionBook2 e-book)
gfm (GitHub-Flavored Markdown), or the deprecated and less accurate
markdown_github; use markdown_github only if you need extensions
not supported in gfm.
haddock (Haddock markup)
html (HTML)
jats (JATS XML)
json (JSON version of native AST)
latex (LaTeX)
markdown (Pandoc’s Markdown)
markdown_mmd (MultiMarkdown)
markdown_phpextra (PHP Markdown Extra)
markdown_strict (original unextended Markdown)
mediawiki (MediaWiki markup)
muse (Muse)
native (native Haskell)
odt (ODT)
opml (OPML)
org (Emacs Org mode)
rst (reStructuredText)
t2t (txt2tags)
textile (Textile)
tikiwiki (TikiWiki markup)
twiki (TWiki markup)
vimwiki (Vimwiki)
Extensions can be individually enabled or disabled by
appending +EXTENSION or
-EXTENSION to the format name. See
Extensions below, for a
list of extensions and their names. See
--list-input-formats and
--list-extensions, below.
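The +EXTENSION/-EXTENSION syntax is easy to take apart mechanically. The following Python sketch shows one way to split such a specification; parse_format_spec is a hypothetical helper, and it assumes that neither format names nor extension names contain + or - (so it would not handle a custom writer path containing those characters).

```python
import re

def parse_format_spec(spec):
    """Split 'format+ext1-ext2' into (format, enabled, disabled)."""
    match = re.match(r"[^+-]+", spec)
    if not match:
        raise ValueError(f"no format name in {spec!r}")
    base = match.group(0)
    enabled, disabled = [], []
    # Each remaining chunk is a sign followed by an extension name.
    for sign, name in re.findall(r"([+-])([^+-]+)", spec[match.end():]):
        (enabled if sign == "+" else disabled).append(name)
    return base, enabled, disabled

print(parse_format_spec("markdown_strict+footnotes+pipe_tables"))
print(parse_format_spec("markdown-raw_html+smart"))
```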
-t FORMAT,
-w FORMAT,
--to=FORMAT,
--write=FORMAT
Specify output format. FORMAT can be:
asciidoc (AsciiDoc)
beamer (LaTeX beamer slide show)
commonmark (CommonMark Markdown)
context (ConTeXt)
docbook or docbook4 (DocBook 4)
docbook5 (DocBook 5)
docx (Word docx)
dokuwiki (DokuWiki markup)
epub or epub3 (EPUB v3 book)
epub2 (EPUB v2)
fb2 (FictionBook2 e-book)
gfm (GitHub-Flavored Markdown), or the deprecated and less accurate
markdown_github; use markdown_github only if you need extensions
not supported in gfm.
haddock (Haddock markup)
html or html5 (HTML, i.e. HTML5/XHTML polyglot markup)
html4 (XHTML 1.0 Transitional)
icml (InDesign ICML)
jats (JATS XML)
json (JSON version of native AST)
latex (LaTeX)
man (groff man)
markdown (Pandoc’s Markdown)
markdown_mmd (MultiMarkdown)
markdown_phpextra (PHP Markdown Extra)
markdown_strict (original unextended Markdown)
mediawiki (MediaWiki markup)
ms (groff ms)
muse (Muse)
native (native Haskell)
odt (OpenOffice text document)
opml (OPML)
opendocument (OpenDocument)
org (Emacs Org mode)
plain (plain text)
pptx (PowerPoint slide show)
rst (reStructuredText)
rtf (Rich Text Format)
texinfo (GNU Texinfo)
textile (Textile)
slideous (Slideous HTML and JavaScript slide show)
slidy (Slidy HTML and JavaScript slide show)
dzslides (DZSlides HTML5 + JavaScript slide show)
revealjs (reveal.js HTML5 + JavaScript slide show)
s5 (S5 HTML and JavaScript slide show)
tei (TEI Simple)
zimwiki (ZimWiki markup)
the path of a custom lua writer, see Custom writers below
Note that odt, docx,
and epub output will not be directed to
stdout unless forced with
-o -.
Extensions can be individually enabled or disabled by
appending +EXTENSION or
-EXTENSION to the format name. See
Extensions below, for a
list of extensions and their names. See
--list-output-formats and
--list-extensions, below.
-o FILE,
--output=FILE
Write output to FILE instead of
stdout. If FILE is
-, output will go to
stdout, even if a non-textual format
(docx, odt,
epub2, epub3) is
specified.
--data-dir=DIRECTORY
Specify the user data directory to search for pandoc data
files. If this option is not specified, the default user
data directory will be used. This is, in UNIX:
$HOME/.pandoc
in Windows XP:
C:\Documents And Settings\USERNAME\Application Data\pandoc
and in Windows Vista or later:
C:\Users\USERNAME\AppData\Roaming\pandoc
You can find the default user data directory on your system
by looking at the output of
pandoc --version. A
reference.odt,
reference.docx,
epub.css, templates,
slidy, slideous, or
s5 directory placed in this directory
will override pandoc’s normal defaults.
--bash-completion
Generate a bash completion script. To enable bash completion
with pandoc, add this to your .bashrc:
eval "$(pandoc --bash-completion)"
--verbose
Give verbose debugging output. Currently this only has an
effect with PDF output.
--quiet
Suppress warning messages.
--fail-if-warnings
Exit with error status if there are any warnings.
--log=FILE
Write log messages in machine-readable JSON format to
FILE. All messages above DEBUG level
will be written, regardless of verbosity settings
(--verbose, --quiet).
--list-input-formats
List supported input formats, one per line.
--list-output-formats
List supported output formats, one per line.
--list-extensions[=FORMAT]
List supported extensions, one per line, preceded by a
+ or - indicating
whether it is enabled by default in
FORMAT. If FORMAT
is not specified, defaults for pandoc’s Markdown are given.
--list-highlight-languages
List supported languages for syntax highlighting, one per
line.
--list-highlight-styles
List supported styles for syntax highlighting, one per line.
See --highlight-style.
-v, --version
Print version.
-h, --help
Show usage message.
Reader options
--base-header-level=NUMBER
Specify the base level for headers (defaults to 1).
--strip-empty-paragraphs
Deprecated. Use the
+empty_paragraphs extension
instead. Ignore paragraphs with no content. This
option is useful for converting word processing documents
where users have used empty paragraphs to create
inter-paragraph space.
--indented-code-classes=CLASSES
Specify classes to use for indented code blocks, for example
perl,numberLines or
haskell. Multiple classes may be
separated by spaces or commas.
--default-image-extension=EXTENSION
Specify a default extension to use when image paths/URLs
have no extension. This allows you to use the same source
for formats that require different kinds of images.
Currently this option only affects the Markdown and LaTeX
readers.
--file-scope
Parse each file individually before combining for multifile
documents. This will allow footnotes in different files with
the same identifiers to work as expected. If this option is
set, footnotes and links will not work across files. Reading
binary files (docx, odt, epub) implies
--file-scope.
-F PROGRAM,
--filter=PROGRAM
Specify an executable to be used as a filter transforming
the pandoc AST after the input is parsed and before the
output is written. The executable should read JSON from
stdin and write JSON to stdout. The JSON must be formatted
like pandoc’s own JSON input and output. The name of the
output format will be passed to the filter as the first
argument. Hence,
pandoc --filter ./caps.py -t latex
is equivalent to
pandoc -t json | ./caps.py latex | pandoc -f json -t latex
The latter form may be useful for debugging filters.
Filters may be written in any language.
Text.Pandoc.JSON exports
toJSONFilter to facilitate writing
filters in Haskell. Those who would prefer to write filters
in python can use the module
pandocfilters,
installable from PyPI. There are also pandoc filter
libraries in
PHP,
perl,
and
JavaScript/node.js.
In order of preference, pandoc will look for filters in
1. a specified full or relative path (executable or
   non-executable)
2. $DATADIR/filters (executable or
   non-executable) where $DATADIR is the
   user data directory (see --data-dir,
   above).
3. $PATH (executable only)
Filters and lua-filters are applied in the order specified
on the command line.
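As an illustration of the JSON interface, here is a minimal sketch of a caps-style filter written in plain Python, without the pandocfilters module. It assumes the JSON AST layout in which every element is an object with a "t" (type) key and an optional "c" (contents) key. The function names and the sample document are made up for this example.

```python
import json

def walk(node, action):
    """Apply `action` to every AST element (a dict with a "t" key), bottom-up."""
    if isinstance(node, list):
        return [walk(item, action) for item in node]
    if isinstance(node, dict):
        node = {key: walk(value, action) for key, value in node.items()}
        if "t" in node:
            return action(node)
    return node

def caps(element):
    """Upper-case the text of Str elements; leave everything else alone."""
    if element["t"] == "Str":
        return {"t": "Str", "c": element["c"].upper()}
    return element

def apply_filter(json_text):
    """Parse a JSON document, transform it, and serialize it again."""
    return json.dumps(walk(json.loads(json_text), caps))

# To run this as a filter executable, the script would end with:
#     sys.stdout.write(apply_filter(sys.stdin.read()))
sample = json.dumps({
    "pandoc-api-version": [1, 17, 5],  # sample value only
    "meta": {},
    "blocks": [{"t": "Para", "c": [{"t": "Str", "c": "hello"},
                                   {"t": "Space"},
                                   {"t": "Str", "c": "world"}]}],
})
print(apply_filter(sample))
```

This sketch ignores the output-format argument that pandoc passes to the filter; a real filter could read it from sys.argv[1].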
--lua-filter=SCRIPT
Transform the document in a similar fashion as JSON filters
(see --filter), but use pandoc’s built-in
lua filtering system. The given lua script is expected to
return a list of lua filters which will be applied in order.
Each lua filter must contain element-transforming functions
indexed by the name of the AST element on which the filter
function should be applied.
The pandoc lua module provides helper
functions for element creation. It is always loaded into the
script’s lua environment.
The following is an example lua script for macro-expansion:
function expand_hello_world(inline)
  if inline.c == '{{helloworld}}' then
    return pandoc.Emph{ pandoc.Str "Hello, World" }
  else
    return inline
  end
end

return {{Str = expand_hello_world}}
In order of preference, pandoc will look for lua filters in
1. a specified full or relative path (executable or
   non-executable)
2. $DATADIR/filters (executable or
   non-executable) where $DATADIR is the
   user data directory (see --data-dir,
   above).
-M KEY[=VAL],
--metadata=KEY[:VAL]
Set the metadata field KEY to the value
VAL. A value specified on the command
line overrides a value specified in the document using
YAML metadata
blocks. Values will be parsed as YAML boolean or
string values. If no value is specified, the value will be
treated as Boolean true. Like --variable,
--metadata causes template variables to
be set. But unlike --variable,
--metadata affects the metadata of the
underlying document (which is accessible from filters and
may be printed in some output formats) and metadata values
will be escaped when inserted into the template.
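The value rules above (Boolean or string, with a missing value meaning true) can be sketched as follows. This is an approximation: parse_metadata_arg is a hypothetical helper, and treating only "true" and "false" as Booleans is an assumption of this sketch, since the exact set of YAML Boolean spellings pandoc accepts is not stated here.

```python
def parse_metadata_arg(arg):
    """Parse KEY, KEY=VAL, or KEY:VAL into a (key, value) pair."""
    for index, char in enumerate(arg):
        if char in "=:":
            key, raw = arg[:index], arg[index + 1:]
            break
    else:
        return arg, True  # no value given: Boolean true
    lowered = raw.strip().lower()
    if lowered in ("true", "false"):  # assumption: only these are Booleans
        return key, lowered == "true"
    return key, raw

print(parse_metadata_arg("draft"))
print(parse_metadata_arg("draft=false"))
print(parse_metadata_arg("title:My Guide"))
```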
-p, --preserve-tabs
Preserve tabs instead of converting them to spaces (the
default). Note that this will only affect tabs in literal
code spans and code blocks; tabs in regular text will be
treated as spaces.
--tab-stop=NUMBER
Specify the number of spaces per tab (default is 4).
--track-changes=accept|reject|all
Specifies what to do with insertions, deletions, and
comments produced by the MS Word Track
Changes feature. accept (the
default) inserts all insertions and ignores all deletions.
reject inserts all deletions and ignores
insertions. Both accept and
reject ignore comments.
all puts in insertions, deletions, and
comments, wrapped in spans with
insertion, deletion,
comment-start, and
comment-end classes, respectively. The
author and time of change are included.
all is useful for scripting: only
accepting changes from a certain reviewer, say, or before a
certain date. If a paragraph is inserted or deleted,
track-changes=all produces a span with
the class
paragraph-insertion/paragraph-deletion
before the affected paragraph break. This option only
affects the docx reader.
--extract-media=DIR
Extract images and other media contained in or linked from
the source document to the path DIR,
creating it if necessary, and adjust the image references
in the document so they point to the extracted files. If the
source format is a binary container (docx, epub, or odt),
the media is extracted from the container and the original
filenames are used. Otherwise the media is read from the
file system or downloaded, and new filenames are constructed
based on SHA1 hashes of the contents.
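A content-based filename of this kind can be sketched with Python's hashlib. The exact naming scheme pandoc uses is not spelled out above, so the layout here (hex digest plus the original extension) and the helper name are assumptions made for illustration.

```python
import hashlib
import os

def media_filename(contents: bytes, original_path: str) -> str:
    """Name a media file after the SHA1 hash of its contents, keeping the extension."""
    extension = os.path.splitext(original_path)[1]
    return hashlib.sha1(contents).hexdigest() + extension

print(media_filename(b"example image bytes", "images/logo.png"))
```

Because the name depends only on the contents, two references to identical files map to the same extracted file.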
--abbreviations=FILE
Specifies a custom abbreviations file, with abbreviations
one to a line. If this option is not specified, pandoc will
read the data file abbreviations from the
user data directory or fall back on a system default. To see
the system default, use
pandoc --print-default-data-file=abbreviations.
The only use pandoc makes of this list is in the Markdown
reader. Strings ending in a period that are found in this
list will be followed by a nonbreaking space, so that the
period will not produce sentence-ending space in formats
like LaTeX.
General writer options
-s, --standalone
Produce output with an appropriate header and footer (e.g. a
standalone HTML, LaTeX, TEI, or RTF file, not a fragment).
This option is set automatically for pdf,
epub, epub3,
fb2, docx, and
odt output. For native
output, this option causes metadata to be included;
otherwise, metadata is suppressed.
--template=FILE|URL
Use the specified file as a custom template for the
generated document. Implies --standalone.
See Templates, below, for a
description of template syntax. If no extension is
specified, an extension corresponding to the writer will be
added, so that --template=special looks
for special.html for HTML output. If the
template is not found, pandoc will search for it in the
templates subdirectory of the user data
directory (see --data-dir). If this
option is not used, a default template appropriate for the
output format will be used (see
-D/--print-default-template).
-V KEY[=VAL],
--variable=KEY[:VAL]
Set the template variable KEY to the
value VAL when rendering the document
in standalone mode. This is generally only useful when the
--template option is used to specify a
custom template, since pandoc automatically sets the
variables used in the default templates. If no
VAL is specified, the key will be given
the value true.
-D FORMAT,
--print-default-template=FORMAT
Print the system default template for an output
FORMAT. (See -t for
a list of possible FORMATs.) Templates
in the user data directory are ignored.
--print-default-data-file=FILE
Print a system default data file. Files in the user data
directory are ignored.
--eol=crlf|lf|native
Manually specify line endings: crlf
(Windows), lf (macOS/Linux/UNIX), or
native (line endings appropriate to the
OS on which pandoc is being run). The default is
native.
--dpi=NUMBER
Specify the dpi (dots per inch) value for conversion from
pixels to inches/centimeters and vice versa. The default is
96dpi. Technically, the correct term would be ppi (pixels
per inch).
--wrap=auto|none|preserve
Determine how text is wrapped in the output (the source
code, not the rendered version). With
auto (the default), pandoc will attempt
to wrap lines to the column width specified by
--columns (default 72). With
none, pandoc will not wrap lines at all.
With preserve, pandoc will attempt to
preserve the wrapping from the source document (that is,
where there are nonsemantic newlines in the source, there
will be nonsemantic newlines in the output as well).
Automatic wrapping does not currently work in HTML output.
--columns=NUMBER
Specify length of lines in characters. This affects text
wrapping in the generated source code (see
--wrap). It also affects calculation of
column widths for plain text tables (see
Tables below).
--toc,
--table-of-contents
Include an automatically generated table of contents (or, in
the case of latex,
context, docx,
odt, opendocument,
rst, or ms, an
instruction to create one) in the output document. This
option has no effect unless
-s/--standalone is used, and it has no
effect on man,
docbook4, docbook5, or
jats output.
--toc-depth=NUMBER
Specify the number of section levels to include in the table
of contents. The default is 3 (which means that level 1, 2,
and 3 headers will be listed in the contents).
--strip-comments
Strip out HTML comments in the Markdown or Textile source,
rather than passing them on to Markdown, Textile or HTML
output as raw HTML. This does not apply to HTML comments
inside raw HTML blocks when the
markdown_in_html_blocks extension is not
set.
--no-highlight
Disables syntax highlighting for code blocks and inlines,
even when a language attribute is given.
--highlight-style=STYLE|FILE
Specifies the coloring style to be used in highlighted
source code. Options are pygments (the
default), kate,
monochrome,
breezeDark, espresso,
zenburn, haddock, and
tango. For more information on syntax
highlighting in pandoc, see
Syntax
highlighting, below. See also
--list-highlight-styles.
Instead of a STYLE name, a JSON file
with extension .theme may be supplied.
This will be parsed as a KDE syntax highlighting theme and
(if valid) used as the highlighting style.
To generate the JSON version of an existing style, use
--print-highlight-style.
--print-highlight-style=STYLE|FILE
Prints a JSON version of a highlighting style, which can be
modified, saved with a .theme extension,
and used with --highlight-style.
--syntax-definition=FILE
Instructs pandoc to load a KDE XML syntax definition file,
which will be used for syntax highlighting of appropriately
marked code blocks. This can be used to add support for new
languages or to use altered syntax definitions for existing
languages.
-H FILE,
--include-in-header=FILE
Include contents of FILE, verbatim, at
the end of the header. This can be used, for example, to
include special CSS or JavaScript in HTML documents. This
option can be used repeatedly to include multiple files in
the header. They will be included in the order specified.
Implies --standalone.
-B FILE,
--include-before-body=FILE
Include contents of FILE, verbatim, at
the beginning of the document body (e.g. after the
<body> tag in HTML, or the
\begin{document} command in LaTeX). This
can be used to include navigation bars or banners in HTML
documents. This option can be used repeatedly to include
multiple files. They will be included in the order
specified. Implies --standalone.
-A FILE,
--include-after-body=FILE
Include contents of FILE, verbatim, at
the end of the document body (before the
</body> tag in HTML, or the
\end{document} command in LaTeX). This
option can be used repeatedly to include multiple files.
They will be included in the order specified. Implies
--standalone.
--resource-path=SEARCHPATH
List of paths to search for images and other resources. The
paths should be separated by : on Linux,
UNIX, and macOS systems, and by ; on
Windows. If --resource-path is not
specified, the default resource path is the working
directory. Note that, if --resource-path
is specified, the working directory must be explicitly
listed or it will not be searched. For example:
--resource-path=.:test will search the
working directory and the test
subdirectory, in that order.
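The search order can be sketched in Python. find_resource is a hypothetical helper, and the sketch assumes a plain first-match lookup; os.pathsep is : on Linux, UNIX, and macOS and ; on Windows, matching the separators described above.

```python
import os

def find_resource(name, search_path=None):
    """Return the first existing match for `name` along the search path, or None."""
    # With no search path, only the working directory is consulted.
    directories = search_path.split(os.pathsep) if search_path else ["."]
    for directory in directories:
        candidate = os.path.join(directory, name)
        if os.path.exists(candidate):
            return candidate
    return None

# Equivalent of --resource-path=.:test on a POSIX system.
print(find_resource("figure.png", os.pathsep.join([".", "test"])))
```

Note that a search path of just test would skip the working directory entirely, as the option description warns.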
--request-header=NAME:VAL
Set the request header NAME to the
value VAL when making HTTP requests
(for example, when a URL is given on the command line, or
when resources used in a document must be downloaded). If
you’re behind a proxy, you also need to set the environment
variable http_proxy to
http://....
Options affecting specific writers
--self-contained
Produce a standalone HTML file with no external
dependencies, using data: URIs to
incorporate the contents of linked scripts, stylesheets,
images, and videos. Implies --standalone.
The resulting file should be self-contained,
in the sense that it needs no external files and no net
access to be displayed properly by a browser. This option
works only with HTML output formats, including
html4, html5,
html+lhs, html5+lhs,
s5, slidy,
slideous, dzslides,
and revealjs. Scripts, images, and
stylesheets at absolute URLs will be downloaded; those at
relative URLs will be sought relative to the working
directory (if the first source file is local) or relative to
the base URL (if the first source file is remote). Elements
with the attribute
data-external="1" will be left
alone; the documents they link to will not be incorporated
in the document. Limitation: resources that are loaded
dynamically through JavaScript cannot be incorporated; as a
result, --self-contained does not work
with --mathjax, and some advanced
features (e.g. zoom or speaker notes) may not work in an
offline self-contained reveal.js slide show.
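To show what incorporating a resource as a data: URI means, here is a small Python sketch that base64-encodes some bytes into such a URI. The helper name is hypothetical and the MIME type is supplied by the caller; how pandoc itself determines MIME types is not covered here.

```python
import base64

def data_uri(contents: bytes, mime_type: str) -> str:
    """Encode raw bytes as a base64 data: URI."""
    encoded = base64.b64encode(contents).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"

print(data_uri(b"hi", "text/plain"))  # data:text/plain;base64,aGk=
```

An image referenced by a src attribute would be replaced by a URI of this form, so the HTML file no longer needs the external file.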
--html-q-tags
Use <q> tags for quotes in HTML.
--ascii
Use only ASCII characters in output. Currently supported for
XML and HTML formats (which use numerical entities instead
of UTF-8 when this option is selected) and for groff ms and
man (which use hexadecimal escapes).
--reference-links
Use reference-style links, rather than inline links, in
writing Markdown or reStructuredText. By default inline
links are used. The placement of link references is affected
by the --reference-location option.
--reference-location=block|section|document
Specify whether footnotes (and references, if
reference-links is set) are placed at the
end of the current (top-level) block, the current section,
or the document. The default is document.
Currently only affects the markdown writer.
--atx-headers
Use ATX-style headers in Markdown and AsciiDoc output. The
default is to use setext-style headers for levels 1-2, and
then ATX headers. (Note: for gfm output,
ATX headers are always used.)
--top-level-division=[default|section|chapter|part]
Treat top-level headers as the given division type in LaTeX,
ConTeXt, DocBook, and TEI output. The hierarchy order is
part, chapter, then section; all headers are shifted such
that the top-level header becomes the specified type. The
default behavior is to determine the best division type via
heuristics: unless other conditions apply,
section is chosen. When the LaTeX
document class is set to report,
book, or memoir
(unless the article option is specified),
chapter is implied as the setting for
this option. If beamer is the output
format, specifying either chapter or
part will cause top-level headers to
become \part{..}, while second-level
headers remain as their default type.
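The shifting of header levels for LaTeX output can be illustrated with a short Python sketch. It covers only an explicitly chosen division type, not the default heuristics or the beamer special case, and the function name is hypothetical.

```python
# LaTeX sectioning commands, from highest to lowest.
LATEX_DIVISIONS = ["part", "chapter", "section", "subsection",
                   "subsubsection", "paragraph", "subparagraph"]

def division_for(level, top_level="section"):
    """Map a header level (1 = top) to a LaTeX sectioning command name."""
    index = LATEX_DIVISIONS.index(top_level) + level - 1
    if index >= len(LATEX_DIVISIONS):
        return None  # deeper than LaTeX's sectioning hierarchy
    return LATEX_DIVISIONS[index]

for level in (1, 2, 3):
    print(level, division_for(level, "chapter"))
```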
-N, --number-sections
Number section headings in LaTeX, ConTeXt, HTML, or EPUB
output. By default, sections are not numbered. Sections with
class unnumbered will never be numbered,
even if --number-sections is specified.
--number-offset=NUMBER[,NUMBER,…]
Offset for section headings in HTML output (ignored in other
output formats). The first number is added to the section
number for top-level headers, the second for second-level
headers, and so on. So, for example, if you want the first
top-level header in your document to be numbered
6, specify
--number-offset=5. If your document
starts with a level-2 header which you want to be numbered
1.5, specify
--number-offset=1,4. Offsets are 0 by
default. Implies --number-sections.
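The two examples above can be reproduced with a short Python sketch. It assumes that each offset is simply added to the corresponding component of the ordinary section number; section_numbers is a hypothetical helper that takes a list of header levels in document order.

```python
def section_numbers(levels, offsets=()):
    """Number headers (given by level) and add per-level offsets to each component."""
    counters = []
    numbers = []
    for level in levels:
        # Drop counters for deeper levels, pad with zeros for skipped levels.
        counters = counters[:level] + [0] * (level - len(counters))
        counters[level - 1] += 1
        padded = list(offsets) + [0] * (level - len(offsets))
        numbers.append(".".join(str(c + o) for c, o in zip(counters, padded)))
    return numbers

print(section_numbers([1], offsets=(5,)))     # ['6']
print(section_numbers([2], offsets=(1, 4)))   # ['1.5']
```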
--listings
Use the
listings
package for LaTeX code blocks.
-i, --incremental
Make list items in slide shows display incrementally (one by
one). The default is for lists to be displayed all at once.
--slide-level=NUMBER
Specifies that headers with the specified level create
slides (for beamer,
s5, slidy,
slideous, dzslides).
Headers above this level in the hierarchy are used to divide
the slide show into sections; headers below this level
create subheads within a slide. Note that content that is
not contained under slide-level headers will not appear in
the slide show. The default is to set the slide level based
on the contents of the document; see
Structuring the
slide show.
--section-divs
Wrap sections in <section> tags (or
<div> tags for
html4), and attach identifiers to the
enclosing <section> (or
<div>) rather than the header
itself. See Header
identifiers, below.
--email-obfuscation=none|javascript|references
Specify a method for obfuscating mailto:
links in HTML documents. none leaves
mailto: links as they are.
javascript obfuscates them using
JavaScript. references obfuscates them by
printing their letters as decimal or hexadecimal character
references. The default is none.
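The references method can be approximated in a few lines of Python. This sketch uses decimal character references only, whereas the description above mentions both decimal and hexadecimal forms; the function name is hypothetical.

```python
import html

def obfuscate_references(text):
    """Replace every character with a decimal character reference."""
    return "".join(f"&#{ord(char)};" for char in text)

encoded = obfuscate_references("a@b")
print(encoded)                 # &#97;&#64;&#98;
print(html.unescape(encoded))  # a@b
```

A browser decodes the references back to the original address, while the raw HTML source no longer contains it in plain form.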
--id-prefix=STRING
Specify a prefix to be added to all identifiers and internal
links in HTML and DocBook output, and to footnote numbers in
Markdown and Haddock output. This is useful for preventing
duplicate identifiers when generating fragments to be
included in other pages.
-T STRING,
--title-prefix=STRING
Specify STRING as a prefix at the
beginning of the title that appears in the HTML header (but
not in the title as it appears at the beginning of the HTML
body). Implies --standalone.
-c URL,
--css=URL
Link to a CSS style sheet. This option can be used
repeatedly to include multiple files. They will be included
in the order specified.
A stylesheet is required for generating EPUB. If none is
provided using this option (or the
stylesheet metadata field), pandoc will
look for a file epub.css in the user data
directory (see --data-dir). If it is not
found there, sensible defaults will be used.
--reference-doc=FILE
Use the specified file as a style reference in producing a
docx or ODT file.
Docx
For best results, the reference docx should be a
modified version of a docx file produced using pandoc.
The contents of the reference docx are ignored, but
its stylesheets and document properties (including
margins, page size, header, and footer) are used in
the new docx. If no reference docx is specified on the
command line, pandoc will look for a file
reference.docx in the user data
directory (see --data-dir). If this
is not found either, sensible defaults will be used.
To produce a custom reference.docx,
first get a copy of the default
reference.docx:
pandoc --print-default-data-file reference.docx > custom-reference.docx
Then open custom-reference.docx in
Word, modify the styles as you wish, and save the
file. For best results, do not make changes to this
file other than modifying the styles used by pandoc:
[paragraph] Normal, Body Text, First Paragraph,
Compact, Title, Subtitle, Author, Date, Abstract,
Bibliography, Heading 1, Heading 2, Heading 3, Heading
4, Heading 5, Heading 6, Heading 7, Heading 8, Heading
9, Block Text, Footnote Text, Definition Term,
Definition, Caption, Table Caption, Image Caption,
Figure, Captioned Figure, TOC Heading; [character]
Default Paragraph Font, Body Text Char, Verbatim Char,
Footnote Reference, Hyperlink; [table] Table.
ODT
For best results, the reference ODT should be a
modified version of an ODT produced using pandoc. The
contents of the reference ODT are ignored, but its
stylesheets are used in the new ODT. If no reference
ODT is specified on the command line, pandoc will look
for a file reference.odt in the
user data directory (see
--data-dir). If this is not found
either, sensible defaults will be used.
To produce a custom reference.odt,
first get a copy of the default
reference.odt:
pandoc --print-default-data-file reference.odt > custom-reference.odt
Then open custom-reference.odt in
LibreOffice, modify the styles as you wish, and save
the file.
PowerPoint
Any template included with a recent install of
Microsoft PowerPoint (either with
.pptx or .potx
extension) should work, as will most templates derived
from these.
The specific requirement is that the template should
contain the following four layouts as its first four
layouts:
Title Slide
Title and Content
Section Header
Two Content
All templates included with a recent version of MS
PowerPoint will fit these criteria. (You can click on
Layout under the
Home menu to check.)
You can also modify the default
reference.pptx: first run
pandoc --print-default-data-file reference.pptx > custom-reference.pptx,
and then modify
custom-reference.pptx in MS
PowerPoint (pandoc will use the first four layout
slides, as mentioned above).
--epub-cover-image=FILE
Use the specified image as the EPUB cover. It is recommended
that the image be less than 1000px in width and height. Note
that in a Markdown source document you can also specify
cover-image in a YAML metadata block (see
EPUB Metadata, below).
--epub-metadata=FILE
Look in the specified XML file for metadata for the EPUB.
The file should contain a series of
Dublin
Core elements. For example:
<dc:rights>Creative Commons</dc:rights>
<dc:language>es-AR</dc:language>
By default, pandoc will include the following metadata
elements: <dc:title> (from the
document title), <dc:creator> (from
the document authors), <dc:date>
(from the document date, which should be in
ISO
8601 format), <dc:language>
(from the lang variable, or, if it is not
set, the locale), and
<dc:identifier id="BookId">
(a randomly generated UUID). Any of these may be overridden
by elements in the metadata file.
Note: if the source document is Markdown, a YAML metadata
block in the document can be used instead. See below under
EPUB Metadata.
--epub-embed-font=FILE
Embed the specified font in the EPUB. This option can be
repeated to embed multiple fonts. Wildcards can also be
used: for example, DejaVuSans-*.ttf.
However, if you use wildcards on the command line, be sure
to escape them or put the whole filename in single quotes,
to prevent them from being interpreted by the shell. To use
the embedded fonts, you will need to add declarations like
the following to your CSS (see --css):
@font-face {
font-family: DejaVuSans;
font-style: normal;
font-weight: normal;
src:url("DejaVuSans-Regular.ttf");
}
@font-face {
font-family: DejaVuSans;
font-style: normal;
font-weight: bold;
src:url("DejaVuSans-Bold.ttf");
}
@font-face {
font-family: DejaVuSans;
font-style: italic;
font-weight: normal;
src:url("DejaVuSans-Oblique.ttf");
}
@font-face {
font-family: DejaVuSans;
font-style: italic;
font-weight: bold;
src:url("DejaVuSans-BoldOblique.ttf");
}
body { font-family: "DejaVuSans"; }
--epub-chapter-level=NUMBER
Specify the header level at which to split the EPUB into
separate chapter files. The default is to
split into chapters at level 1 headers. This option only
affects the internal composition of the EPUB, not the way
chapters and sections are displayed to users. Some readers
may be slow if the chapter files are too large, so for large
documents with few level 1 headers, one might want to use a
chapter level of 2 or 3.
--epub-subdirectory=DIRNAME
Specify the subdirectory in the OCF container that is to
hold the EPUB-specific contents. The default is
EPUB. To put the EPUB contents in the top
level, use an empty string.
--pdf-engine=pdflatex|lualatex|xelatex|wkhtmltopdf|weasyprint|prince|context|pdfroff
Use the specified engine when producing PDF output. The
default is pdflatex. If the engine is not
in your PATH, the full path of the engine may be specified
here.
--pdf-engine-opt=STRING
Use the given string as a command-line argument to the
pdf-engine. If used multiple times, the
arguments are provided with spaces between them. Note that
no check for duplicate options is done.
Citation rendering
--bibliography=FILE
Set the bibliography field in the
document’s metadata to FILE, overriding
any value set in the metadata, and process citations using
pandoc-citeproc. (This is equivalent to
--metadata bibliography=FILE --filter pandoc-citeproc.)
If --natbib or
--biblatex is also supplied,
pandoc-citeproc is not used, making this
equivalent to
--metadata bibliography=FILE. If you
supply this argument multiple times, each
FILE will be added to bibliography.
--csl=FILE
Set the csl field in the document’s
metadata to FILE, overriding any value
set in the metadata. (This is equivalent to
--metadata csl=FILE.) This option is only
relevant with pandoc-citeproc.
--citation-abbreviations=FILE
Set the citation-abbreviations field in
the document’s metadata to FILE,
overriding any value set in the metadata. (This is
equivalent to
--metadata citation-abbreviations=FILE.)
This option is only relevant with
pandoc-citeproc.
--natbib
Use
natbib
for citations in LaTeX output. This option is not for use
with the pandoc-citeproc filter or with
PDF output. It is intended for use in producing a LaTeX file
that can be processed with
bibtex.
--biblatex
Use
biblatex
for citations in LaTeX output. This option is not for use
with the pandoc-citeproc filter or with
PDF output. It is intended for use in producing a LaTeX file
that can be processed with
bibtex
or
biber.
Math rendering in HTML
The default is to render TeX math as far as possible using Unicode
characters. Formulas are put inside a span with
class="math", so that they may be
styled differently from the surrounding text if needed. However,
this gives acceptable results only for basic math; usually you
will want to use --mathjax or another of the
following options.
--mathjax[=URL]
Use
MathJax to
display embedded TeX math in HTML output. TeX math will be
put between \(...\) (for inline math) or
\[...\] (for display math) and wrapped in
<span> tags with class
math. Then the MathJax JavaScript will
render it. The URL should point to the
MathJax.js load script. If a
URL is not provided, a link to the
Cloudflare CDN will be inserted.
--mathml
Convert TeX math to
MathML (in
epub3, docbook4,
docbook5, jats,
html4 and html5). This
is the default in odt output. Note that
currently only Firefox and Safari (and select e-book
readers) natively support MathML.
--webtex[=URL]
Convert TeX formulas to <img> tags
that link to an external script that converts formulas to
images. The formula will be URL-encoded and concatenated
with the URL provided. For SVG images you can for example
use
--webtex https://latex.codecogs.com/svg.latex?.
If no URL is specified, the CodeCogs URL generating PNGs
will be used
(https://latex.codecogs.com/png.latex?).
Note: the --webtex option will affect
Markdown output as well as HTML, which is useful if you’re
targeting a version of Markdown without native math support.
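As a rough illustration of how such an image URL is put together, the following Python sketch URL-encodes a formula and appends it to the base URL. The helper name webtex_url is hypothetical, and the exact set of characters pandoc escapes may differ from what urllib.parse.quote escapes.

```python
from urllib.parse import quote

# Default base URL named in the text above.
DEFAULT_WEBTEX_URL = "https://latex.codecogs.com/png.latex?"


def webtex_url(formula, base=DEFAULT_WEBTEX_URL):
    """Return the base URL followed by the URL-encoded formula (hypothetical helper)."""
    # safe="" percent-encodes every reserved character, including "/".
    return base + quote(formula, safe="")


print(webtex_url("x^2"))
```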
--katex[=URL]
Use
KaTeX
to display embedded TeX math in HTML output. The
URL is the base URL for the KaTeX
library. If a URL is not provided, a
link to the KaTeX CDN will be inserted.
--katex-stylesheet=URL
The URL should point to the
katex.css stylesheet. If this option is
not specified, a link to the KaTeX CDN will be inserted.
Note that this option does not imply
--katex.
--gladtex
Enclose TeX math in <eq> tags in
HTML output. The resulting HTML can then be processed by
GladTeX
to produce images of the typeset formulas and an HTML file
with links to these images. So, the procedure is:
pandoc -s --gladtex input.md -o myfile.htex
gladtex -d myfile-images myfile.htex
# produces myfile.html and images in myfile-images
Options for wrapper scripts
--dump-args
Print information about command-line arguments to
stdout, then exit. This option is
intended primarily for use in wrapper scripts. The first
line of output contains the name of the output file
specified with the -o option, or
- (for stdout) if no
output file was specified. The remaining lines contain the
command-line arguments, one per line, in the order they
appear. These do not include regular pandoc options and
their arguments, but do include any options appearing after
a -- separator at the end of the line.
--ignore-args
Ignore command-line arguments (for use in wrapper scripts).
Regular pandoc options are not ignored. Thus, for example,
pandoc --ignore-args -o foo.html -s foo.txt -- -e latin1
is equivalent to
pandoc -o foo.html -s
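A wrapper script might consume the output of --dump-args along the following lines. This is a minimal Python sketch that relies only on the layout described above (output file on the first line, remaining arguments one per line); the function name parse_dump_args is hypothetical, and the sample text is made up for illustration rather than captured from a real run.

```python
def parse_dump_args(output):
    """Split --dump-args output into (output_file, remaining_args)."""
    lines = output.splitlines()
    if not lines:
        raise ValueError("expected at least one line of output")
    # First line: output file name, or "-" for stdout.
    # Remaining lines: one command-line argument per line.
    return lines[0], lines[1:]


# Illustrative sample shaped like the description; not real pandoc output.
sample = "foo.html\nfoo.txt\n-e\nlatin1\n"
out_file, extra_args = parse_dump_args(sample)
print(out_file)
print(extra_args)
```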
Templates
When the -s/--standalone option is used, pandoc
uses a template to add header and footer material that is needed for
a self-standing document. To see the default template that is used,
just type
pandoc -D *FORMAT*
where FORMAT is the name of the output format.
A custom template can be specified using the
--template option. You can also override the
system default templates for a given output format
FORMAT by putting a file
templates/default.*FORMAT* in the user data
directory (see --data-dir, above).
Exceptions:
For odt output, customize the
default.opendocument template.
For pdf output, customize the
default.latex template (or the
default.context template, if you use
-t context, or the
default.ms template, if you use
-t ms, or the default.html
template, if you use -t html).
docx has no template (however, you can use
--reference-doc to customize the output).
Templates contain variables, which allow for
the inclusion of arbitrary information at any point in the file.
They may be set at the command line using the
-V/--variable option. If a variable is not set,
pandoc will look for the key in the document’s metadata – which can
be set using either
YAML metadata
blocks or with the --metadata option.
Variables set by pandoc
Some variables are set automatically by pandoc. These vary
somewhat depending on the output format, but include the
following:
sourcefile, outputfile
source and destination filenames, as given on the command
line. sourcefile can also be a list if
input comes from multiple files, or empty if input is from
stdin. You can use the following snippet in your template to
distinguish them:
$if(sourcefile)$
$for(sourcefile)$
$sourcefile$
$endfor$
$else$
(stdin)
$endif$
Similarly, outputfile can be
- if output goes to the terminal.
title, author,
date
allow identification of basic aspects of the document.
Included in PDF metadata through LaTeX and ConTeXt. These
can be set through a
pandoc title
block, which allows for multiple authors, or through
a YAML metadata block:
---
author:
- Aristotle
- Peter Abelard
...
subtitle
document subtitle, included in HTML, EPUB, LaTeX, ConTeXt,
and Word docx; renders in LaTeX only when using a document
class that supports \subtitle, such as
beamer or the
KOMA-Script
series (scrartcl,
scrreprt,
scrbook).
To make subtitle work with other
LaTeX document classes, you can add the following to
header-includes:
\providecommand{\subtitle}[1]{%
\usepackage{titling}
\posttitle{%
\par\large#1\end{center}}
}
institute
author affiliations (in LaTeX and Beamer only). Can be a
list, when there are multiple authors.
abstract
document summary, included in LaTeX, ConTeXt, AsciiDoc, and
Word docx
keywords
list of keywords to be included in HTML, PDF, and AsciiDoc
metadata; may be repeated as for author,
above
header-includes
contents specified by
-H/--include-in-header (may have multiple
values)
toc
non-null value if
--toc/--table-of-contents was specified
toc-title
title of table of contents (works only with EPUB,
opendocument, odt, docx, pptx)
include-before
contents specified by
-B/--include-before-body (may have
multiple values)
include-after
contents specified by
-A/--include-after-body (may have
multiple values)
body
body of document
meta-json
JSON representation of all of the document’s metadata. Field
values are transformed to the selected output format.
Language variables
lang
identifies the main language of the document, using a code
according to
BCP
47 (e.g. en or
en-GB). For some output formats, pandoc
will convert it to an appropriate format stored in the
additional variables babel-lang,
polyglossia-lang (LaTeX) and
context-lang (ConTeXt).
Native pandoc Spans and Divs with the lang attribute (value
in BCP 47) can be used to switch the language in that range.
In LaTeX output, babel-otherlangs and
polyglossia-otherlangs variables will be
generated automatically based on the lang
attributes of Spans and Divs in the document.
dir
the base direction of the document, either
rtl (right-to-left) or
ltr (left-to-right).
For bidirectional documents, native pandoc
spans and divs with
the dir attribute (value
rtl or ltr) can be
used to override the base direction in some output formats.
This may not always be necessary if the final renderer
(e.g. the browser, when generating HTML) supports the
Unicode
Bidirectional Algorithm.
When using LaTeX for bidirectional documents, only the
xelatex engine is fully supported (use
--pdf-engine=xelatex).
Variables for slides
Variables are available for
producing slide
shows with pandoc, including all
reveal.js
configuration options.
titlegraphic
title graphic for Beamer documents
logo
logo for Beamer documents
slidy-url
base URL for Slidy documents (defaults to
https://www.w3.org/Talks/Tools/Slidy2)
slideous-url
base URL for Slideous documents (defaults to
slideous)
s5-url
base URL for S5 documents (defaults to
s5/default)
revealjs-url
base URL for reveal.js documents (defaults to
reveal.js)
theme, colortheme,
fonttheme, innertheme,
outertheme
themes for LaTeX
beamer
documents
themeoptions
options for LaTeX beamer themes (a list).
navigation
controls navigation symbols in beamer
documents (default is empty for no
navigation symbols; other valid values are
frame, vertical, and
horizontal).
section-titles
enables title pages for new sections in
beamer documents (default = true).
beamerarticle
when true, the beamerarticle package is
loaded (for producing an article from beamer slides).
aspectratio
aspect ratio of slides (for beamer only,
1610 for 16:10, 169
for 16:9, 149 for 14:9,
141 for 1.41:1, 54 for
5:4, 43 for 4:3 which is the default, and
32 for 3:2).
Variables for LaTeX
LaTeX variables are used when
creating a PDF.
papersize
paper size, e.g. letter,
a4
fontsize
font size for body text (e.g. 10pt,
12pt)
documentclass
document class, e.g.
article,
report,
book,
memoir
classoption
option for document class, e.g. oneside;
may be repeated for multiple options
beameroption
in beamer, adds an extra beamer option with
\setbeameroption{}
geometry
option for
geometry
package, e.g. margin=1in; may be repeated
for multiple options
margin-left,
margin-right,
margin-top,
margin-bottom
sets margins, if geometry is not used
(otherwise geometry overrides these)
linestretch
adjusts line spacing using the
setspace
package, e.g. 1.25,
1.5
fontfamily
font package for use with pdflatex:
TeX
Live includes many options, documented in the
LaTeX
Font Catalogue. The default is
Latin
Modern.
fontfamilyoptions
options for package used as fontfamily:
e.g. osf,sc with
fontfamily set to
mathpazo
provides Palatino with old-style figures and true small
caps; may be repeated for multiple options
mainfont, sansfont,
monofont, mathfont,
CJKmainfont
font families for use with xelatex or
lualatex: take the name of any system
font, using the
fontspec
package. Note that if CJKmainfont is
used, the
xecjk
package must be available.
mainfontoptions,
sansfontoptions,
monofontoptions,
mathfontoptions,
CJKoptions
options to use with mainfont,
sansfont, monofont,
mathfont, CJKmainfont
in xelatex and
lualatex. Allow for any choices available
through
fontspec,
such as the OpenType features
Numbers=OldStyle,Numbers=Proportional.
May be repeated for multiple options.
fontenc
allows font encoding to be specified through
fontenc package (with
pdflatex); default is
T1 (see guide to
LaTeX font
encodings)
microtypeoptions
options to pass to the microtype package
colorlinks
add color to link text; automatically enabled if any of
linkcolor, citecolor,
urlcolor, or toccolor
are set
linkcolor, citecolor,
urlcolor, toccolor
color for internal links, citation links, external links,
and links in table of contents: uses options allowed by
xcolor,
including the dvipsnames,
svgnames, and x11names
lists
links-as-notes
causes links to be printed as footnotes
indent
uses document class settings for indentation (the default
LaTeX template otherwise removes indentation and adds space
between paragraphs)
subparagraph
disables default behavior of LaTeX template that redefines
(sub)paragraphs as sections, changing the appearance of
nested headings in some classes
thanks
specifies contents of acknowledgments footnote after
document title.
toc
include table of contents (can also be set using
--toc/--table-of-contents)
toc-depth
level of section to include in table of contents
secnumdepth
numbering depth for sections, if sections are numbered
lof, lot
include list of figures, list of tables
bibliography
bibliography to use for resolving references
biblio-style
bibliography style, when used with
--natbib or
--biblatex.
biblio-title
bibliography title, when used with
--natbib or
--biblatex.
biblatexoptions
list of options for biblatex.
natbiboptions
list of options for natbib.
pagestyle
An option for LaTeX’s \pagestyle{}. The
default article class supports plain
(default), empty, and
headings; headings puts section titles in the
header.
Variables for ConTeXt
papersize
paper size, e.g. letter,
A4, landscape (see
ConTeXt
Paper Setup); may be repeated for multiple options
layout
options for page margins and text arrangement (see
ConTeXt
Layout); may be repeated for multiple options
margin-left,
margin-right,
margin-top,
margin-bottom
sets margins, if layout is not used
(otherwise layout overrides these)
fontsize
font size for body text (e.g. 10pt,
12pt)
mainfont, sansfont,
monofont, mathfont
font families: take the name of any system font (see
ConTeXt
Font Switching)
linkcolor, contrastcolor
color for links outside and inside a page, e.g.
red, blue (see
ConTeXt
Color)
linkstyle
typeface style for links, e.g. normal,
bold, slanted,
boldslanted, type,
cap, small
indenting
controls indentation of paragraphs, e.g.
yes,small,next (see
ConTeXt
Indentation); may be repeated for multiple options
whitespace
spacing between paragraphs, e.g. none,
small (using
setupwhitespace)
interlinespace
adjusts line spacing, e.g. 4ex (using
setupinterlinespace);
may be repeated for multiple options
headertext, footertext
text to be placed in running header or footer (see
ConTeXt
Headers and Footers); may be repeated up to four
times for different placement
pagenumbering
page number style and location (using
setuppagenumbering);
may be repeated for multiple options
toc
include table of contents (can also be set using
--toc/--table-of-contents)
lof, lot
include list of figures, list of tables
pdfa
adds to the preamble the setup necessary to generate
PDF/A-1b:2005. To successfully generate PDF/A the required
ICC color profiles have to be available and the content and
all included files (such as images) have to be
standard-conforming. The ICC profiles can be obtained from
ConTeXt
ICC Profiles. See also
ConTeXt
PDFA for more details.
Variables for man pages
section
section number in man pages
header
header in man pages
footer
footer in man pages
adjusting
adjusts text to left (l), right
(r), center (c), or
both (b) margins
hyphenate
if true (the default), hyphenation will
be used
Variables for ms
pointsize
point size (e.g. 10p)
lineheight
line height (e.g. 12p)
fontfamily
font family (e.g. T or
P)
indent
paragraph indent (e.g. 2m)
Using variables in templates
Variable names are sequences of alphanumerics,
-, and _, starting with a
letter. A variable name surrounded by $ signs
will be replaced by its value. For example, the string
$title$ in
<title>$title$</title>
will be replaced by the document title.
To write a literal $ in a template, use
$$.
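The substitution rule can be pictured with a small Python sketch. It handles only plain variables and the $$ escape, not conditionals or loops, and it is not pandoc's implementation; the function name interpolate and the choice to render an unset variable as an empty string are assumptions made for illustration.

```python
import re

# "$$" is a literal dollar sign; "$name$" is a variable reference whose
# name starts with a letter and continues with alphanumerics, "-", or "_".
_TOKEN = re.compile(r"\$\$|\$([A-Za-z][A-Za-z0-9_-]*)\$")


def interpolate(template, variables):
    """Replace $name$ with its value and $$ with a literal dollar sign."""
    def replace(match):
        name = match.group(1)
        if name is None:
            return "$"  # matched the "$$" escape
        # Assumption: an unset variable renders as an empty string.
        return str(variables.get(name, ""))
    return _TOKEN.sub(replace, template)


print(interpolate("<title>$title$</title>", {"title": "Pandoc"}))
```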
Templates may contain conditionals. The syntax is as follows:
$if(variable)$
X
$else$
Y
$endif$
This will include X in the template if
variable has a truthy value; otherwise it will
include Y. Here a truthy value is any of the
following:
a string that is not entirely white space,
a non-empty array where the first value is truthy,
any number (including zero),
any object,
the boolean true (to specify the boolean
true value using YAML metadata or the
--metadata flag, use
true, True, or
TRUE; with the
--variable flag, simply omit a value for
the variable, e.g. --variable draft).
X and Y are placeholders for
any valid template text, and may include interpolated variables or
other conditionals. The $else$ section may be
omitted.
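The truthiness rules listed above can be restated as a short Python sketch. It models metadata values with Python types (str, list, int or float, dict, bool); the function name is_truthy is hypothetical, and treating an unset value (None) as not truthy is an assumption.

```python
def is_truthy(value):
    """Apply the truthiness rules listed above to a Python value."""
    # bool must be checked before numbers: in Python, bool is a subclass of int.
    if isinstance(value, bool):
        return value
    if isinstance(value, (int, float)):
        return True  # any number, including zero
    if isinstance(value, str):
        return value.strip() != ""  # not entirely white space
    if isinstance(value, list):
        return len(value) > 0 and is_truthy(value[0])
    if isinstance(value, dict):
        return True  # any object
    # Assumption: unset (None) or unrecognized values are not truthy.
    return False


print(is_truthy(0), is_truthy("   "), is_truthy([""]))
```

Note that zero counts as truthy under these rules, unlike in many programming languages.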
When variables can have multiple values (for example,
author in a multi-author document), you can use
the $for$ keyword:
$for(author)$
<meta name="author" content="$author$" />
$endfor$
You can optionally specify a separator to be used between
consecutive items:
$for(author)$$author$$sep$, $endfor$
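The effect of the separator is that it appears only between items, never after the last one. A minimal Python sketch of that behavior follows; the helper render_for is hypothetical, and treating a single non-list value as a one-item list is an assumption.

```python
def render_for(values, render_item, sep=""):
    """Render each value with render_item and join the results with sep."""
    # Assumption: a single non-list value is treated as a one-item list.
    if not isinstance(values, list):
        values = [values]
    return sep.join(render_item(v) for v in values)


authors = ["Aristotle", "Peter Abelard"]
print(render_for(authors, lambda a: a, sep=", "))
```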
A dot can be used to select a field of a variable that takes an
object as its value. So, for example:
$author.name$ ($author.affiliation$)
If you use custom templates, you may need to revise them as pandoc
changes. We recommend tracking the changes in the default
templates, and modifying your custom templates accordingly. An
easy way to do this is to fork the
pandoc-templates
repository and merge in changes after each pandoc release.
Templates may contain comments: anything on a line after
$-- will be treated as a comment and ignored.
Extensions
The behavior of some of the readers and writers can be adjusted by
enabling or disabling various extensions.
An extension can be enabled by adding +EXTENSION
to the format name and disabled by adding
-EXTENSION. For example,
--from markdown_strict+footnotes is strict
Markdown with footnotes enabled, while
--from markdown-footnotes-pipe_tables is pandoc’s
Markdown without footnotes or pipe tables.
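To make the notation concrete, here is a small Python sketch that splits such a format string into the base format and the lists of enabled and disabled extensions. The function name parse_format_spec is hypothetical, and the sketch does not check that the names are real formats or extensions.

```python
import re


def parse_format_spec(spec):
    """Split e.g. 'markdown-footnotes-pipe_tables' into (base, enabled, disabled)."""
    match = re.match(r"[^+-]+", spec)
    if match is None:
        raise ValueError("format name is missing: %r" % spec)
    base = match.group(0)
    enabled, disabled = [], []
    # Each extension is introduced by "+" (enable) or "-" (disable).
    for sign, name in re.findall(r"([+-])([^+-]+)", spec[match.end():]):
        (enabled if sign == "+" else disabled).append(name)
    return base, enabled, disabled


print(parse_format_spec("markdown_strict+footnotes"))
print(parse_format_spec("markdown-footnotes-pipe_tables"))
```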
The markdown reader and writer make by far the most use of
extensions. Extensions only used by them are therefore covered in
the section Pandoc’s
Markdown below. (See
Markdown variants for
commonmark and gfm.) In the
following, extensions that also work for other formats are covered.
Typography
Extension: smart
Interpret straight quotes as curly quotes,
--- as em-dashes, -- as
en-dashes, and ... as ellipses. Nonbreaking
spaces are inserted after certain abbreviations, such as
Mr.
This extension can be enabled/disabled for the following
formats:
input formats
markdown,
commonmark, latex,
mediawiki, org,
rst, twiki
output formats
markdown, latex,
context, rst
enabled by default in
markdown, latex,
context (both input and output)
Note: If you are writing Markdown, then the
smart extension has the reverse effect: what
would have been curly quotes comes out straight.
In LaTeX, smart means to use the standard TeX
ligatures for quotation marks (`` and
'' for double quotes, `
and ' for single quotes) and dashes
(-- for en-dash and ---
for em-dash). If smart is disabled, then in
reading LaTeX pandoc will parse these characters literally. In
writing LaTeX, enabling smart tells pandoc to
use the ligatures when possible; if smart is
disabled, pandoc will use Unicode quotation mark and dash
characters.
Headers and sections
Extension: auto_identifiers
A header without an explicitly specified identifier will be
automatically assigned a unique identifier based on the header
text.
This extension can be enabled/disabled for the following
formats:
input formats
markdown, latex,
rst, mediawiki,
textile
output formats
markdown, muse
enabled by default in
markdown, muse
The algorithm used to derive the identifier from the header text
is:
Remove all formatting, links, etc.
Remove all footnotes.
Remove all punctuation, except underscores, hyphens, and
periods.
Replace all spaces and newlines with hyphens.
Convert all alphabetic characters to lowercase.
Remove everything up to the first letter (identifiers may
not begin with a number or punctuation mark).
If nothing is left after this, use the identifier
section.
Thus, for example,
Header                        Identifier
Header identifiers in HTML    header-identifiers-in-html
*Dogs*?--in *my* house?       dogs--in-my-house
[HTML], [S5], or [RTF]?       html-s5-or-rtf
3. Applications               applications
33                            section
These rules should, in most cases, allow one to determine the
identifier from the header text. The exception is when several
headers have the same text; in this case, the first will get an
identifier as described above; the second will get the same
identifier with -1 appended; the third with
-2; and so on.
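The steps above, together with the rule for repeated headers, can be sketched in Python as follows. This is an illustration, not pandoc's own code: the function names are hypothetical, the input is assumed to be header text with formatting, links, and footnotes already removed, and collapsing runs of white space into a single hyphen is an assumption.

```python
def header_identifier(text):
    """Derive an identifier from plain header text (steps 1 and 2 already done)."""
    # Step 3: drop punctuation except underscores, hyphens, and periods.
    kept = "".join(c for c in text if c.isalnum() or c in "_-." or c.isspace())
    # Step 4: join words with hyphens (assumption: runs of white space collapse).
    ident = "-".join(kept.split())
    # Step 5: convert to lowercase.
    ident = ident.lower()
    # Step 6: remove everything up to the first letter.
    for i, c in enumerate(ident):
        if c.isalpha():
            ident = ident[i:]
            break
    else:
        ident = ""
    # Step 7: fall back to "section" if nothing is left.
    return ident or "section"


def unique_identifiers(headers):
    """Append -1, -2, ... to identifiers of headers that repeat earlier text."""
    seen = {}
    result = []
    for text in headers:
        base = header_identifier(text)
        count = seen.get(base, 0)
        result.append(base if count == 0 else "%s-%d" % (base, count))
        seen[base] = count + 1
    return result


print(header_identifier("3. Applications"))
print(unique_identifiers(["Intro", "Intro", "Intro"]))
```

The sketch does not guard against a suffixed identifier colliding with a header whose own text already produces that identifier.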
These identifiers are used to provide link targets in the table
of contents generated by the
--toc|--table-of-contents option. They also
make it easy to provide links from one section of a document to
another. A link to this section, for example, might look like
this:
See the section on
[header identifiers](#header-identifiers-in-html-latex-and-context).
Note, however, that this method of providing links to sections
works only in HTML, LaTeX, and ConTeXt formats.
If the --section-divs option is specified,
then each section will be wrapped in a
section (or a div, if
html4 was specified), and the identifier will
be attached to the enclosing <section>
(or <div>) tag rather than the header
itself. This allows entire sections to be manipulated using
JavaScript or treated differently in CSS.
Extension: ascii_identifiers
Causes the identifiers produced by
auto_identifiers to be pure ASCII. Accents
are stripped off of accented Latin letters, and non-Latin
letters are omitted.
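One way to approximate this behavior is shown in the Python sketch below, which decomposes accented letters with Unicode normalization and then drops every non-ASCII character. This is an assumption about a reasonable approach, not pandoc's actual mapping; letters that have no canonical decomposition (for example ø) are simply dropped here, which may differ from what pandoc does. The function name ascii_identifier is hypothetical.

```python
import unicodedata


def ascii_identifier(ident):
    """Strip accents from Latin letters and drop remaining non-ASCII characters."""
    # NFD splits an accented letter into its base letter plus combining marks.
    decomposed = unicodedata.normalize("NFD", ident)
    # Keep only ASCII; combining marks and non-Latin letters are discarded.
    return "".join(c for c in decomposed if ord(c) < 128)


print(ascii_identifier("caf\u00e9-\u00fcber"))
```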
Math Input
The extensions
tex_math_dollars,
tex_math_single_backslash,
and
tex_math_double_backslash
are described in the section about Pandoc’s Markdown.
However, they can also be used with HTML input. This is handy for
reading web pages formatted using MathJax, for example.
Raw HTML/TeX
The following extensions (especially how they affect Markdown
input/output) are also described in more detail in their
respective sections of Pandoc’s
Markdown.
Extension:
raw_html
When converting from HTML, parse elements that are not
representable in pandoc’s AST as raw HTML. By default, this is disabled
for HTML input.
Extension:
raw_tex
Allows raw LaTeX, TeX, and ConTeXt to be included in a document.
This extension can be enabled/disabled for the following formats
(in addition to markdown):
input formats
latex, org,
textile
output formats
textile, commonmark
Extension:
native_divs
This extension is enabled by default for HTML input. This means
that divs are parsed to pandoc native
elements. (Alternatively, you can parse them to raw HTML using
-f html-native_divs+raw_html.)
When converting HTML to Markdown, for example, you may want to
drop all divs and spans:
pandoc -f html-native_divs-native_spans -t markdown
Extension:
native_spans
Analogous to native_divs above.
Literate Haskell support
Extension: literate_haskell
Treat the document as literate Haskell source.
This extension can be enabled/disabled for the following
formats:
input formats
markdown, rst,
latex
output formats
markdown, rst,
latex, html
If you append +lhs (or
+literate_haskell) to one of the formats
above, pandoc will treat the document as literate Haskell
source. This means that
In Markdown input, bird track sections will
be parsed as Haskell code rather than block quotations. Text
between \begin{code} and
\end{code} will also be treated as
Haskell code. For ATX-style headers the character
= will be used instead of #.
In Markdown output, code blocks with classes
haskell and literate
will be rendered using bird tracks, and block quotations
will be indented one space, so they will not be treated as
Haskell code. In addition, headers will be rendered
setext-style (with underlines) rather than ATX-style (with
# characters). (This is because ghc treats
# characters in column 1 as introducing line
numbers.)
In reStructuredText input, bird track
sections will be parsed as Haskell code.
In reStructuredText output, code blocks with class
haskell will be rendered using bird
tracks.
In LaTeX input, text in code environments
will be parsed as Haskell code.
In LaTeX output, code blocks with class
haskell will be rendered inside
code environments.
In HTML output, code blocks with class
haskell will be rendered with class
literatehaskell and bird tracks.
Examples:
pandoc -f markdown+lhs -t html
reads literate Haskell source formatted with Markdown
conventions and writes ordinary HTML (without bird tracks).
pandoc -f markdown+lhs -t html+lhs
writes HTML with the Haskell code in bird tracks, so it can be
copied and pasted as literate Haskell source.
Note that GHC expects the bird tracks in the first column, so
indented literate code blocks (e.g. inside an itemized
environment) will not be picked up by the Haskell compiler.
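For readers unfamiliar with the term, bird tracks are the > markers that prefix each line of code in literate Haskell source. The Python sketch below shows the transformation on a code block; the function name to_bird_tracks is hypothetical, and emitting a bare > for empty lines is an assumption rather than a description of pandoc's output.

```python
def to_bird_tracks(code):
    """Prefix every line of code with '> ' in the first column."""
    # Assumption: an empty line inside the block becomes a bare ">".
    return "\n".join("> " + line if line else ">" for line in code.split("\n"))


print(to_bird_tracks('main :: IO ()\nmain = putStrLn "hi"'))
```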
Other extensions
Extension: empty_paragraphs
Allows empty paragraphs. By default empty paragraphs are
omitted.
This extension can be enabled/disabled for the following
formats:
input formats
docx, html
output formats
docx, odt,
opendocument, html
Extension: styles
Read all docx styles as divs (for paragraph styles) and spans
(for character styles) regardless of whether pandoc understands
the meaning of these styles. This can be used with
docx custom styles.
Disabled by default.
input formats
docx
Extension: amuse
In the muse input format, this enables
Text::Amuse extensions to Emacs Muse markup.
Extension: citations
Some aspects of Pandoc’s Markdown
citation syntax are also accepted in
org input.
Extension: ntb
In the context output format this enables the
use of
Natural
Tables (TABLE) instead of the default
Extreme
Tables (xtables). Natural tables allow more fine-grained
global customization but come at a performance penalty compared
to extreme tables.
Pandoc’s Markdown
Pandoc understands an extended and slightly revised version of John
Gruber’s
Markdown
syntax. This document explains the syntax, noting differences from
standard Markdown. Except where noted, these differences can be
suppressed by using the markdown_strict format
instead of markdown. Extensions can be enabled or
disabled to specify the behavior more granularly. They are described
in the following. See also
Extensions above, for extensions
that work also on other formats.
Philosophy
Markdown is designed to be easy to write, and, even more
importantly, easy to read:
A Markdown-formatted document should be publishable as-is, as
plain text, without looking like it’s been marked up with tags
or formatting instructions. –
John
Gruber