ARTICLE

Using Markup Languages with Hugo

19 min read, Dec 27, 2019

From Hugo in Action by Atishay Jain

Hugo supports a variety of markup languages for generating content. Markdown is the most popular and most widely used among them.

In this article we’ll create formatted posts using Markdown to fill up the content pages used in the website for Acme Corporation.


Markup languages

Hugo natively supports Markdown and HTML for content markup. It also supports AsciiDoc, reStructuredText and Pandoc via external helpers. External helpers are command-line parsers that Hugo calls as separate programs, so Hugo can’t guarantee great performance with them.

Markdown is the most popular format for writing text-based content. It’s the simplest of the content markup languages, and it’s easy to read as plain text without any formatting by a document processor. AsciiDoc is built for creating larger pieces of content like books. It provides more features than Markdown, though they come at a cost in flexibility. This article is written using AsciiDoc. reStructuredText is used for documentation projects; it’s a formally defined and stricter language that provides easier parsing at the cost of simplicity. Pandoc is a tool for file format conversion that supports a superset of Markdown, which Hugo can convert to HTML by calling the Pandoc command line.

Markdown favors simplicity and ease of use over features. It’s the perfect language for website content that doesn’t span more than a couple of pages. Hugo’s Markdown parser is fast and powerful. Users who are new to Hugo are advised to begin with Markdown unless they already know a different markup language.

Table 1. Content Markup Languages in Hugo

Note that Hugo constantly adds support for new formats. At the time of writing, there were active discussions about natively supporting AsciiDoc in Hugo.

Markdown

Markdown is an extremely lightweight format for writing plain text documents that are easy to read and write yet have some level of structure and formatting support. It was created in 2004 by John Gruber and Aaron Swartz. Because it can be written as plain text in the command line, git commit messages, plain text boxes and chat pods, Markdown provides basic formatting anywhere text can be supplied. Markdown is the most popular format for README documents, which are used to document code. Original Markdown has a limited set of features, which was extended by CommonMark and further extended by GitHub Flavored Markdown (GFM), which is among the most popular variants. Hugo supports most of GFM and adds support for more extensions via two libraries, Blackfriday and MMark.

Blackfriday is built as a general-purpose Markdown parser. It supports GitHub Flavored Markdown and adds support for more extensions. Hugo parses files with the extensions md and markdown with Blackfriday.

MMark extends Blackfriday for generating IETF documentation. The IETF, or Internet Engineering Task Force, maintains the documentation for TCP, IP, HTTP and other protocols of the Internet. MMark supports cross-references, citations, callouts and indices for content. Hugo parses files with the extension mmark with MMark.

Markdown Editors

Markdown is built as a language that can be read and written in a plain text editor without any special support for the format, and many users don’t use a special editor for it. Text editors like Sublime Text and VS Code provide color coding to help identify special formatting in Markdown. They also support a live preview of the rendered output. If you’re looking for a dedicated Markdown editor, tools like Typora and iA Writer provide a lot of capabilities that help create good Markdown documents, including keyboard shortcuts and inline or live previews. Online tools like Dropbox Paper also support Markdown.

Apart from these, Pandoc can convert many file formats, including Microsoft Office, OpenOffice, LaTeX and MediaWiki, to Markdown.

Organizing your posts with block elements

Markdown is extremely easy to write. A plain blob of text converts to a paragraph. A single line break within text is ignored. Two or more consecutive line breaks leave an empty line between blocks of text, which is required to start a new paragraph. This requirement isn’t arbitrary. Markdown is made for writing and reading in text boxes where automatic text wrapping may not be available. Therefore, authors are allowed to wrap lines manually without impacting the output. To force a line break within a paragraph, add two spaces at the end of the line and then a newline character (via the Enter key).

Markdown is a practical language, built with the objective of human readability and understandability. As a naturally evolved rather than committee-built language, Markdown has been patched whenever a major issue emerged after a syntax was standardized, which makes it an easy-to-use language with some idiosyncrasies.

Listing 1. Writing paragraphs in markdown

I am a paragraph in markdown with line
 wrapping to fit in this width.
 I am a continuation of the first paragraph
 as there is no empty line before me.
  
 I am the second paragraph.
  
  
 I am the third one. Even though there are
 two line breaks before me, this does not
 create any newline characters. After me there
 are two spaces before the newline character.
 I have line break before me and even though
 I am not a new paragraph, I do start on a
 new line due to the manual line break via
 spaces before the newline character.
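
The paragraph rules described above can be sketched in a few lines of Python. This is only a toy illustration of the three rules (blank lines separate paragraphs, single newlines are ignored, two trailing spaces force a line break) and not how Hugo's parser is implemented; the function name `render_paragraphs` is made up for this example.

```python
import re


def render_paragraphs(text):
    """Toy renderer for paragraphs and manual line breaks only."""
    html_paragraphs = []
    # One or more whitespace-only lines separate paragraphs.
    for block in re.split(r"\n(?:[ \t]*\n)+", text.strip()):
        lines = block.split("\n")
        parts = []
        for index, line in enumerate(lines):
            is_last = index == len(lines) - 1
            if line.endswith("  ") and not is_last:
                # Two trailing spaces before the newline: manual line break.
                parts.append(line.strip() + "<br>")
            else:
                # A plain newline is only soft wrapping.
                parts.append(line.strip())
        html_paragraphs.append("<p>" + " ".join(parts) + "</p>")
    return html_paragraphs


sample = (
    "First line\n"
    "same paragraph.\n"
    "\n"
    "Second paragraph.\n"
    "\n"
    "\n"
    "Third with break.  \n"
    "New line, same paragraph."
)
for paragraph in render_paragraphs(sample):
    print(paragraph)
```

Note that the two empty lines before the third paragraph behave exactly like a single empty line.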

There are two ways to create top-level headings. The first is to underline the text with a row of equal signs for level one or dashes for level two. This makes headings stand out and keeps them readable in plain text. The second approach is to put hashes (#) before the text. A single hash creates level one and a double hash creates level two. We can continue this way up to level six in most parsers. To distinguish headings from hashtags, which have become popular for labeling issues, GitHub Flavored Markdown requires a space after the hash signs for a line to be considered a valid heading. Hashes in the middle of a line don’t create headings.

Listing 2. Headings in markdown

Top Level H1
 =============
 H2
 ---
  
 #Just a tag
 Also a # tag.
 # Alternate H1
 ## Alternate H2
 ### H3
 ###### H6

Figure 1. Elements in Markdown Part 1 — Block Elements (Code section 1)
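
As a rough illustration of the hash rule, the following Python sketch classifies a line as a hash-style heading only when one to six hashes at the start of the line are followed by whitespace. The helper name `heading_level` is hypothetical, and the pattern is a simplification: real parsers also handle cases such as leading spaces and empty headings, which are ignored here.

```python
import re

# One to six hashes at the start of the line, then at least one space or tab.
ATX_HEADING = re.compile(r"^(#{1,6})[ \t]+(.*?)\s*$")


def heading_level(line):
    """Return 1-6 for a hash-style heading line, or 0 if it is not a heading."""
    match = ATX_HEADING.match(line)
    return len(match.group(1)) if match else 0


samples = [
    "#Just a tag",
    "Also a # tag.",
    "# Alternate H1",
    "## Alternate H2",
    "###### H6",
]
for sample in samples:
    print(f"{sample!r}: level {heading_level(sample)}")
```

The first two samples come out as level 0 because the hash is either not followed by a space or not at the start of the line.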

Quotes are prefixed by > as the first element on the line. List items are created with *, - or + at the start of the line. Sub-lists are added by indenting the bullet with spaces. Numbered (ordered) lists are created with a number followed by a dot. Ordered and unordered lists can be mixed in Markdown.

Horizontal lines, represented by the <hr> tag, can be created with a row of at least three dashes or stars.

Listing 3. Block elements in markdown

* This is a list element
 + This is also a list element
 - This is also a list element
  - This is a sublist element
  + Also a sublist element
      + Sublist level 2
      1. Numbered sublist
      2. Next item
         3. Next indent level
  
 1. Numbered sublist
 2. Next item
      1. Next indent level
         * Sublist non numbered
 3. Back
  
 Horizontal Lines:
 ------------------------------------
 ***********************************
 ***
 ---
  
 > Block Quote

Figure 2. Elements in Markdown Part 2 — Lists (Code section 1)

Let’s use this to set up the privacy policy page for the Acme Corporation website. The content can be generated via a website like privacypolicies.com and converted to Markdown via Pandoc (https://pandoc.org/try/). Using the Universal theme, the page looks like this:

Figure 3. Privacy Policy for Acme Corporation formatted using block elements (Code section 1)

We can switch the Acme Corporation website back to the Eclectic theme by updating config.yaml to verify cross-theme compatibility. We also fill in the terms of use page using these features. We won’t switch to the Universal theme for every section hereafter.

Figure 4. Terms of Use for Acme Corporation via theme Eclectic (Code section 2)

Formatting, Inline links, code and images

You may have noticed that links in the page were automatically converted by Hugo to HTML anchor tags, which can be clicked to go to the target page. This is a feature of Hugo’s Markdown parser, and we don’t need to write anything special to enable it. Markdown provides basic formatting support. We can surround text with _ (underscore) or * (star) for italics and __ (double underscore) for boldface. We can use three underscores for both bold and italics. ~~ (double tildes) are used for strikethrough.

Inline links can be created using [Visible text](http://example.org/path/to/file). We can add a title via [visible text](http://link "Title"), where Title is shown as a tooltip. We can share link targets across the text via footnote-style links: specify a label in the link as [visible text][target 1] and then define the shared target at the bottom as [target 1]: https://path/to/target. A label can also be used on its own, so [target 1] automatically links to the target we specified.

Inline code can be specified by surrounding it with backticks, as in `Inline Code`, and placing it within free-flowing content. It gets formatted with a monospace font in the HTML code tag to allow spaces to flow through.

Markdown supports <img> tags to show inline images. Support for block images, image dimensions and other details is non-existent, and it’s left to the theme to implement. Images can be inlined using a syntax similar to links, prefixed with an ! (exclamation mark, commonly called a bang). We can use relative paths in the image tag as well.

Listing 4. Inline Formatting in Markdown

*Italics*
  
 _Italics_
  
 __Bold__
  
 ___Bold+Italics___
  
 this_is_not_emphasis
  
 ~~Strikethrough~~
  
 Content with a -- (dash) and a --- (long dash).
  
 [link](http://link/path/to/target)
  
 [link](http://link/path/to/target "TITLE ON LINK")
  
 [Shared Links with footnotes][target 1]
  
 [Second shared link][target 1]
  
 [target 1]
  
 [target 1]: http://footnote.com
  
 Sample inline code `a++` can be specified here.
  
 ![Alt Text](/path/to/image "Optional Tooltip")

Figure 5. Elements in Markdown Part 4 — Inline Elements (Code section 3)

Using these features, we can properly label parts of the privacy policy for the Acme Corporation website and format it. The formatted privacy policy page looks much more professional and complete.

Figure 6. Updated privacy page (Code section 3)

One big advantage of having content alongside code in the JAMstack is the ability to version it. With Markdown, these formatting updates are extremely easy to understand, and we get full support for forks, branches and pull requests for content. We can also have a proper software lifecycle for the website content, including staging, branch views and a proper release cycle. We can write bots to manage content. Figure 7 shows the diff view of the privacy page after the updates for inline elements.

Figure 7. Diff view for the privacy page on GitHub. Each content change can be clearly viewed, reviewed and managed as code.

HTML

Markdown is built to pass HTML through into content. Technically, any valid HTML is also valid Markdown. You can place HTML tags within Markdown and they will be rendered within the content. You can also write Unicode characters using the escape syntax used in HTML and XML documents. This provides access to the entire set of Unicode characters, including localized text, emojis and symbols.

Listing 5. Using HTML in markdown

HTML Escaped characters:
  
 Copyright: &copy;
  
 Registered: &reg;
  
 Trademark: &trade;
  
 Less Than: &lt;
  
 Greater Than: &gt;
  
 Ampersand: &amp;
  
 Smiley: &#x1F604;
  
 Embedded HTML: x<sup>2</sup>
  
  
 Floating image via HTML: <img src="/image/logo.png" style="float: right; padding: 0 0 0 10px"> Follow up text after the image. This honors the floats and wraps around the image, automatically going into the next line.

Figure 8. Elements in Markdown Part 5: HTML Escaping and inline HTML (Code section 4)
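
To see which characters the escapes in the listing stand for, we can decode them outside of Hugo. The sketch below uses Python's standard `html.unescape` function purely as an illustration; it is not part of Hugo, and a browser performs the equivalent decoding when it renders the page. The code prints the code point and Unicode name instead of the character itself so that the output is safe on any terminal.

```python
import html
import unicodedata

# The HTML escapes used in the listing above.
entities = ["&copy;", "&reg;", "&trade;", "&lt;", "&gt;", "&amp;", "&#x1F604;"]

for entity in entities:
    char = html.unescape(entity)  # decode the escape to a single character
    print(f"{entity:10} U+{ord(char):04X}  {unicodedata.name(char)}")
```

Named escapes such as `&copy;` exist only for a fixed set of characters, while the numeric form (`&#x1F604;`) can address any Unicode code point.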

Hugo supports disabling HTML tags (but not Unicode escapes) via the skipHTML key in the configuration file config.yaml. With shortcodes available, inline HTML has little value. Inline HTML can turn into a security risk if we don’t trust the content creators, because arbitrary JavaScript and CSS can be added. The theme creator has little control over embedded HTML, so allowing it can turn into a big problem when updating a theme. Users get the freedom to be creative with HTML, and the content accumulates layout, alignment, color and other styling information that is difficult to clean up. It’s strongly advised to minimize the use of embedded HTML in Markdown content. Formatting should live in the theme, exposed as shortcodes if needed.

When we created the config file for Acme Corporation’s website, we specified the copyright directive as Copyright &copy;. Here we used escaped HTML for the Unicode copyright symbol. Writing © directly also works. Many themes accept Markdown in the params area, giving us the power to format the content that is rendered. Hugo provides a simple utility method to convert Markdown to HTML.

Tables, Task Lists, code block

Hugo supports the extensions to Markdown popularized by GitHub in GitHub Flavored Markdown, using the exact same syntax. We can create tables, task lists and code blocks within Hugo pages.

Hugo supports tables styled like GitHub’s as well as a shorter form where we can omit the | (pipe) characters at the edges.

Task lists follow the GitHub style and are displayed as disabled checkboxes. They can’t be toggled automatically, because handling a click and updating the content isn’t possible without a server to edit the files.

Markdown supports blocks of code using three backticks (```), popularly called code fences, at the start and end of the code block. The opening code fence can be followed by the name of the language to get language-specific syntax highlighting. Hugo uses Chroma, a syntax highlighter written in Go, for this purpose but can also switch to Pygments, the popular Python-based syntax highlighter. Pygments is slower than Chroma but provides more options to control syntax highlighting. Hugo provides the theme creator with a CSS file which they can include in the page for syntax highlighting. Hugo has code fences disabled by default. To enable code fences, add the following to your config.yaml:

pygmentsCodefences: true

Other syntax highlighting options can be supplied to Chroma:

pygmentsUseClasses: false               # Set to true to use CSS classes instead of inline styling
pygmentsStyle: monokai                  # Inline style to use
pygmentsCodefencesGuessSyntax: true     # Guess language if not provided
pygmentsUseClassic: false               # Set to true to use pygments instead of chroma
pygmentsOptions: "linenos=true"         # Parameters to pass to pygments

Listing 6. Tables, Code Fences and Task Lists from GitHub Flavored Markdown

Table:
  
    Name | Job
 --------|------
    Alex | Web Developer
     Bob | Sys Admin
    Gabby| Technical Writer
  
 Alternate Table:
  
 |  Name | Mantra |
 |  ---  |   ---  |
 | Alex  | There must be a better way. |
 | Bob   | Play it safe. |
 | Gabby | Try everything, but do what you like. |
  
 Acme Website task list
 - [x] Get the home page up
 - [x] Update Privacy Policy and Terms of Use
 - [ ] Add the about page
 - [ ] Start the blog
 - [ ] Enable contact us
  
 ```js
 var x= 10;
 x++;
 console.log(x);
 ```

Figure 9. Elements in Markdown Part 6 — Tables, Code blocks and task lists (Code section 4)

Fractions, emojis and other Hugo extensions

Hugo extends Markdown with added features that make our day-to-day use of Markdown easier and more fun. In our config.yaml, we can set enableEmoji: true to use emojis directly in our source with a syntax similar to Slack, GitHub, Basecamp, Trello, Gitter and Bitbucket, with the exact same list of names. You can use the emoji cheat sheet at https://www.webfx.com/tools/emoji-cheat-sheet/ for a list of supported emojis.

Hugo automatically converts fractions like 1/2 to ½ and so on. It also automatically generates IDs for headers so that we can link directly to them.

It also supports HTML definition lists, another type of list in HTML besides ordered and unordered lists, which is used relatively rarely. To declare a definition list, put the term on one line and each definition on a following line that starts with a : (colon).

Apart from this, Hugo supports custom shortcodes, with which we can extend Markdown by adding custom elements that render to HTML. Hugo also comes with a set of built-in shortcodes.

Listing 7. Emoticons, fractions and definition lists in Hugo

## Direct Emojis
 Smile please :smile:
  
 I :heart: Hugo
  
 Wink :wink:
  
  
 ## Fractions
  
 1/2
  
 100/999
  
 Not a Number/5
  
 A link to [Fractions](#fractions)
  
  
 ## Definition Lists
  
 Alex
 : Hippy Web Developer
 : Technophile
  
 Bob
 : Classic SysAdmin
 : Conservative
  
 Gabby
 : Cool Content Master
 : Cautious

Not all themes have support for all markdown features. If you plan to rely on a third-party theme, a good idea is to check for feature support for all the markdown features you plan to use. A good sample page which you can try is included as markdown.md in the code content with this article.

Figure 10. Elements in Markdown Part 7 — Emojis fractions and definition lists. (Code section 4)
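
The link [Fractions](#fractions) in the listing works because the header text is turned into an ID. The Python sketch below shows one plausible way to derive such an ID (lowercase, drop punctuation, hyphenate spaces). It is an assumption for illustration only and not Hugo's exact algorithm, so check the generated HTML for the real IDs; the function name `heading_id` is made up.

```python
import re


def heading_id(text):
    """Hypothetical helper: derive a link-friendly ID from header text."""
    slug = text.strip().lower()
    slug = re.sub(r"[^\w\s-]", "", slug)  # drop punctuation
    slug = re.sub(r"\s+", "-", slug)      # runs of whitespace become one hyphen
    return slug


for header in ["Direct Emojis", "Fractions", "Definition Lists"]:
    print(f"## {header}  ->  #{heading_id(header)}")
```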

Using all these content features, we can now pull up the about us page for the Acme Corporation. The about page has a lot of formatted content as it’s used for both marketing and investor relations.

Figure 11. About Us page for Acme Corporation using advanced markdown features (Code section 4)

Metadata Languages

Writing content is more than providing the raw data. A lot of contextual metadata is associated with the content, like the creation date, the tags, the URL and the author name, which would normally fill up the metadata table of a database. That information also needs a home. Hugo answers this need with the concept of the front matter. The front matter is a set of key-value pairs, placed right before the content, that defines the metadata for that content.

Hugo is smart with the metadata and provides a sane set of defaults. This is why we’re able to get along with providing little metadata and still render pages. By default, Hugo gets information from the filename, the git version control system and OS attributes like the modified date. We need to deal with front matter only if we need something that Hugo can’t guess itself or we need to override the defaults to perform certain tasks.

Metadata before content

The concept of putting metadata before the content isn’t new. It has been there since the early days of programming. In Pascal, strings are represented by a length followed by raw binary data of that length. Many binary file formats start with a signature, which is metadata associated with the file. For example, if you open a .pdf file in a text editor, it starts with %PDF; a .png starts with .PNG and a .gif with GIF. Jekyll, the first popular static site builder, introduced the concept of metadata in the front matter, and Hugo picked up the concept.
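
The file signatures mentioned above are easy to check in code. The following Python sketch looks only at the first few bytes of some data to guess its format. The table covers just the three formats named in the text, and the function name `sniff_format` is made up for this example.

```python
# Leading bytes of the three formats mentioned above.
SIGNATURES = {
    b"%PDF": "pdf",
    b"\x89PNG": "png",  # 0x89 is not printable, so a text editor may show it as "."
    b"GIF8": "gif",     # GIF files start with GIF87a or GIF89a
}


def sniff_format(data):
    """Guess a file format from the signature at the start of the data."""
    for signature, name in SIGNATURES.items():
        if data.startswith(signature):
            return name
    return "unknown"


print(sniff_format(b"%PDF-1.7\n"))
print(sniff_format(b"\x89PNG\r\n\x1a\n"))
print(sniff_format(b"GIF89a"))
print(sniff_format(b"---\ntitle: About Us\n---\n"))
```

The last sample is a text file that opens with a front matter delimiter, which follows the same idea in a human-readable form.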

Before going into the metadata, let’s discuss the ways to provide it. The content language and the metadata language are two different languages. Although content is provided as markdown, it’s an ill-suited choice for providing metadata. Metadata needs to be structured and needs to be easy to split into keys and values. Therefore, it needs a different language. Hugo supports three languages for providing metadata — YAML, TOML and JSON.

YAML Ain’t Markup Language (YAML)

YAML is a language for structured data in which keys and values are separated by a : (colon). The name is meant to highlight the fact that the core use case of YAML is structured data and not marking up text, as we do in Markdown. YAML is sensitive to spaces. We use YAML for the config file of the Acme Corporation website. YAML supports plain key-value pairs, lists and dictionaries (also called maps and objects) as data structures.

# YAML comments are prefixed by a #(Hash). They can appear anywhere.
 ---                    # Three dashes wrap a YAML grouping block
 key: value             # YAML guesses the type.
 key2: 12.0             # Enter and same indentation declares new keys. This has a float value
 key3:                  # Lists can be added via - (dash)
   - a
   - b
   - c
 key4: [d, e, f]        # Lists can also be added via square brackets([])
 key5: 10               # YAML auto-coerces numerical types
 key6: "10"             # Wrap in quotes to declare strings.
 key7:
    "hello"             # Since hello is at different indentation, it is read as value
 key8:                  # null value
 key9: false            # Boolean false(coerced)
 # Multi-line string with newline characters can be declared with a |(pipe) character
 key10: |
    This is a multi line string
    where enter keys are valid
  
    This is another para. The string ends with two
    empty lines.
  
  
 key11: >
     This is a multi line string where new lines
     will be merged back in.
  
     Again we need two blank lines to close it.
  
  
 key12:                  # Dictionary
    key13: value13
    key14:
      - List Item 1
      - List Item 2
    key15: 10
 key13: {key14: value14} # Alternate dictionary
 ---                     # Closing of a YAML section.

YAML section markers are needed to separate metadata from content in content files but aren’t required in pure YAML files like config.yaml. YAML has the same tradeoff as Markdown: human readability over a strict specification. This comes with complexity in parsers and weird edge cases that may be difficult to understand. YAML is the default metadata language in Jekyll (though not in Hugo) and is extremely popular. We have chosen YAML for the config file in this article due to its popularity and ease of readability.

Tom’s Obvious Minimal Language (TOML)

TOML is the default metadata language in Hugo. Most of the community uses this language, and most of the documentation is written in TOML. Unlike YAML, the objective of TOML is obviousness over readability. It’s still human readable, it has first-class support for dates, and it has fewer edge cases to worry about when writing a parser. Unfortunately, it isn’t as popular as YAML and may be intimidating to newcomers. To be successful with the Hugo community, it’s important to understand TOML. TOML uses the equal sign (=) instead of YAML’s colon (:). Unlike YAML, it’s sensitive to newlines but not to indentation.

# TOML comments are prefixed by a #(Hash). They can appear anywhere.
 +++                        # Three pluses wrap a TOML grouping block
 key = "value"              # TOML requires strings to have quotes around them
 key2= 12.0                 # Enter declares new keys. This has a float value
 key3=  [                   # Lists can be added via [] (square brackets).
    "a",                    #    All list elements need to have the same type
    "b",                    #    All list elements are separated by commas(,)
    "c"
    ]
 key4= ["d", "e", "f"]      # All spacing and indentation is optional
 key5= 10                   # TOML recognizes floats, integers, booleans and dates
 key6= 2020-01-01T00:00:00Z # TOML understands dates natively
      key7= "hello"         # No newlines but indentation allowed
 key8 = ""                  # null value not present
 key9 = false               # Boolean false
 key10= """
    This is a multi line string
    where enter keys are valid.
  
    Multi line strings end with three quote(") symbols
    """
 key11= '''
   Strings in single quotes, both single line and multiline, are represented
   as is and nothing, not even backslash(\) can escape text.
   '''
 [key12]                    # Dictionary (called a table in TOML)
    key13= 'value13'        # The indentation is optional
    key14= [
        "List \" Item 1",
        "List Item 2"
    ]
    key15 = 10
 +++                        # Closing of a TOML section.

TOML section markers are required in content files but not in pure .toml files. Netlify supports a netlify.toml configuration file for specifying the configuration of the website within code rather than in a form.

JavaScript Object Notation (JSON)

JSON is a standard information exchange format that is extremely popular on the web. Most services expose their functionality via JSON-based APIs. The objective of JSON is machine readability and efficient transmission over the network. Human readability is a bonus. Although JSON is meant to be more human readable than binary formats, it has strict language rules that make writing a parser easier but may get in the way of reading it. JSON is insensitive to spaces and newlines and relies on explicit markers for content.

// JSON does not support comments. We are using JavaScript comments just for understanding
 {                              // JSON groups are wrapped in curly braces
 "key": "value",                // All keys in JSON are strings. All strings have double quotes
 "key2": 12.0,                  // All keys are separated by commas(,) except the last one. New lines are not important.
 "key3":  [                     // Lists can be added via [] (square brackets).
    1,                          //    All list elements need not be of the same type
    "b",                        //    All list elements are separated by commas(,) except the last element
    "c"
    ],
 "key4": ["d", "e", "f"],       // All spacing and indentation is optional
 "key5": 10,                    // JSON has a single number type for both integers and floats
 "key6": "2020-01-01T00:00:00Z", // JSON has no date type. Dates are stored as strings
      "key7":
      "hello",                  // Insensitive to new lines and indentation
 "key8": null,                  // null value is supported
 "key9": false,                 // Boolean false
 "key10": "Multi line strings need \n (newline characters). No single quotes or special modes available",
 "key12": {                     // Dictionary
       "key13": "value13",      // The indentation is optional
       "key14": [
          "List \" Item 1",
          "List Item 2"
       ],
       "key15": 10              // No comma after the last item
    }                           // Closing of the dictionary
 }                              // Closing of the JSON section

Even pure JSON files require the curly braces to mark JSON objects. The objective of JSON is interoperability, which comes at a cost to readability. JSON has a lot of quotes, strict commas and brackets, with no regard for newlines.
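
The strictness of JSON is easy to observe with any standard parser. The sketch below uses Python's built-in `json` module, chosen here only for illustration since Hugo itself is written in Go. The keys mirror a few of those in the listing, and the helper name `is_valid_json` is made up for this example.

```python
import json

# A subset of the keys from the JSON listing, without the explanatory comments.
front_matter = """
{
  "key2": 12.0,
  "key4": ["d", "e", "f"],
  "key5": 10,
  "key6": "2020-01-01T00:00:00Z",
  "key8": null,
  "key9": false
}
"""

data = json.loads(front_matter)
for key, value in data.items():
    # Show which native type each JSON value maps to.
    print(f"{key}: {type(value).__name__}")


def is_valid_json(text):
    """Return True when the text parses as strict JSON."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True


print(is_valid_json('{"key": "value"}'))             # well formed
print(is_valid_json('{"key": "value",}'))            # comma after the last item
print(is_valid_json('{"key": "value"} // comment'))  # comments are not JSON
```

The date stays a plain string after parsing, so it is up to the consuming program to interpret it as a date.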

Metadata conversions

Hugo supports all metadata languages in parallel. Different documents can use different metadata languages and Hugo parses them properly.
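
Since each metadata language uses a different opening marker (three dashes for YAML, three pluses for TOML and a curly brace for JSON), the language of a page's front matter can be guessed from its first characters. The Python sketch below shows the idea; it is an illustration under that assumption and not Hugo's implementation, and `detect_front_matter` is a hypothetical name.

```python
# Opening delimiter line for each metadata language that uses one.
DELIMITERS = {"---": "yaml", "+++": "toml"}


def detect_front_matter(document):
    """Guess the front matter language from the opening delimiter."""
    stripped = document.lstrip()
    if stripped.startswith("{"):
        return "json"
    first_line = stripped.split("\n", 1)[0].strip()
    return DELIMITERS.get(first_line, "none")


pages = [
    "---\ntitle: About Us\n---\nContent",
    '+++\ntitle = "About Us"\n+++\nContent',
    '{"title": "About Us"}\nContent',
    "Content without any front matter",
]
for page in pages:
    print(detect_front_matter(page))
```

A real implementation needs more care. For example, this sketch would mistake a page that starts with a horizontal rule (---) for YAML front matter.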

Hugo can also convert content between metadata languages. We can use the Hugo command line hugo convert toTOML <content file> to convert the content to TOML. We can similarly convert to JSON/YAML by updating the command with toJSON and toYAML respectively.

This conversion is optional, as all of these languages are fully supported in Hugo for the front matter.

Figure 12. Gabby, the content editor discusses with Alex the move to Markdown for content markup.

The Front Matter

The front matter consists of all the metadata properties associated with a specific page. Just as a website has a config.yaml, we can place page-specific YAML content in the page itself. With the front matter we can override a number of properties. Some of the common ones are described in Table 3.

Table 3. Common Front Matter variables in Hugo. Examples are for the About Us page in Acme Corporation’s website

To update these in the About Us page, we can add the following to the top of the page:

---
 title: About Us
 slug: about
 date: 2010-01-01T00:00:00Z
 description: World's leading manufacturer of digital shapes. We shape the world. You live in it.
 draft: false
 ---
 <page content>

The Eclectic theme places the title on the page, so we don’t need to add it manually. Other than this, there seem to be no major changes. If you view the generated HTML, you should see the updated description, and the document title should also show up in the tab bar.
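
To make the split between front matter and content concrete, here is a small Python sketch that separates the two for a page with YAML delimiters. It handles only flat key: value lines, so it is a deliberately naive stand-in for a real YAML parser, and the function name `split_front_matter` is invented for this example.

```python
def split_front_matter(page):
    """Split a page into (metadata dict, body). Flat 'key: value' lines only."""
    lines = page.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, page  # no front matter at all
    try:
        # Index of the closing delimiter within the full list of lines.
        end = next(
            i for i, line in enumerate(lines[1:], start=1) if line.strip() == "---"
        )
    except StopIteration:
        return {}, page  # opening delimiter without a closing one
    metadata = {}
    for line in lines[1:end]:
        key, separator, value = line.partition(":")  # split at the first colon only
        if separator:
            metadata[key.strip()] = value.strip()
    return metadata, "\n".join(lines[end + 1:])


page = """---
title: About Us
slug: about
date: 2010-01-01T00:00:00Z
draft: false
---
<page content>"""

metadata, body = split_front_matter(page)
print(metadata)
print(body)
```

Every value comes back as a string here. A real YAML parser would also coerce draft to a boolean and could interpret the date.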

Figure 13. Address bar update with the front matter (Code section 5)

Figure 14. Metadata updates with the front matter (Code section 5)

Some of the features like draft can only be enabled via the front matter.

The front matter and the content together form what a content creator works on regularly. With a working knowledge of these two areas, we can design as many pages of content as we want in a website.

Summary

Hugo supports multiple languages for content, including AsciiDoc for long-form content, reStructuredText for documentation and Pandoc for extended Markdown, along with two flavors of Markdown: regular and MMark (for IETF documentation).
Regular markdown has all the features which are needed for writing regular web pages and blog posts.
Markdown supports block elements like headings, various types of lists and sublists, along with inline images, links, basic formatting and code blocks.
GitHub Flavored Markdown is supported by Hugo, which adds task lists and tables to Markdown.
Hugo’s rendering engine also supports features like emoticons, automatic fractions and definition lists.
Apart from data, a web page also needs metadata which can be supplied in Hugo using YAML, TOML and JSON.
Although Hugo has sensible defaults and we can build websites without writing a single metadata item, Hugo provides the ability to override most metadata items like the title, description, date, as well as theme-specific params.
