Converting simple markup to HTML

Submitted by: @import:stackexchange-codereview

Problem

This is going to be rather long and generic, so apologies in advance.

I've been reading a lot about Haskell lately, but I've never really programmed anything with it beyond simple experiments in ghci. So, I wanted to finally try and do some coding exercise that was easy but non-trivial and ended up choosing the "Recluse" problem from a book called Eloquent JavaScript.

The aim is to make a program that takes a text document with simple, custom markup and formats it into HTML according to the following rules:

  • Paragraphs are separated by blank lines.
  • A paragraph that starts with a '%' symbol is a header. The more '%' symbols, the smaller the header.
  • Inside paragraphs, pieces of text can be emphasised by putting them between asterisks.
  • Footnotes are written between braces.

So for example, the text document

% Heading

%% Sub-heading

Text with *emphasis*.

Another {an example footnote} paragraph.

would be formatted as

Heading
Sub-heading

Text with emphasis.

Another¹ paragraph.

  1. an example footnote

The main program should read the text document from stdin and output the HTML to stdout.

This seemed like a simple enough task at first (I'd estimate that using Python, a language I have a lot of experience with, I could've finished it in about 15-30 minutes), but as I got deeper into the implementation, I realized I had no clue how to do something like this in Haskell. The footnotes seemed particularly challenging, as you kind of have to accumulate them on the side while building the rest of the document, and I didn't have any idea how to express that in functional terms (at least not without dragging an extra footnotes argument through every single function call).

I hacked at the problem for a few evenings, re-reading Haskell tutorials and perusing the standard library reference, and finally ended up with a solution that works correctly, with some assumptions (for example, nested markup is not supported). However, my solution feels like it's

Solution

One of Haskell's advantages is its good selection of high-quality parser libraries. Using one here might be a bit overkill, but since this is a learning exercise anyway, it is a good chance to pick up a parser library (e.g. parsec), which could certainly come in handy later. With such a library you could also support nested markup without much trouble.
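
To give a feel for the parser-combinator style, here is a minimal sketch of the inline markup with nesting. It deliberately uses `Text.ParserCombinators.ReadP` rather than parsec, only because ReadP ships with GHC's base library and needs no extra package; parsec code would look broadly similar. The `Inline` type and all function names are made up for illustration and are not from the original program.

```haskell
import Text.ParserCombinators.ReadP

-- Hypothetical representation of inline markup (an assumption, not the asker's type).
data Inline
  = Plain String       -- ordinary text
  | Emph [Inline]      -- text between asterisks
  | Footnote [Inline]  -- text between braces
  deriving (Show, Eq)

-- One or more characters that carry no markup meaning.
plain :: ReadP Inline
plain = Plain <$> munch1 (`notElem` "*{}")

-- Emphasis may contain plain text and footnotes.
emph :: ReadP Inline
emph = Emph <$> between (char '*') (char '*') (many1 (plain +++ footnote))

-- A footnote may contain plain text and emphasis.
footnote :: ReadP Inline
footnote = Footnote <$> between (char '{') (char '}') (many1 (plain +++ emph))

-- A whole paragraph body: any sequence of inline elements up to end of input.
inlines :: ReadP [Inline]
inlines = many (plain +++ emph +++ footnote) <* eof

-- Nothing signals malformed markup, e.g. an unclosed asterisk.
parseInlines :: String -> Maybe [Inline]
parseInlines s = case readP_to_S inlines s of
  [(result, "")] -> Just result
  _              -> Nothing
```

In GHCi, `parseInlines "a *b* c"` can be used to inspect the parsed structure. Note that this sketch rejects unbalanced markup outright instead of treating a stray asterisk as literal text; that is a design choice of the example, not a requirement of the exercise.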

About memory: Haskell strings, being (lazy) linked lists of characters, just aren't very memory efficient. For this reason it is often recommended to use Data.Text (for Unicode text) or ByteString (for raw bytes) instead of plain strings if you need memory-efficient string handling.
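
As a rough illustration of what working with Data.Text looks like, the following sketch splits a document into paragraphs and detects header levels. It assumes the `text` package is available (it is normally installed alongside GHC) and that a blank line is exactly an empty line; the function names are hypothetical.

```haskell
import qualified Data.Text as T

-- Split a document into paragraphs at blank lines, dropping empty pieces.
-- Assumption: a "blank line" contains no characters at all.
paragraphs :: T.Text -> [T.Text]
paragraphs = filter (not . T.null) . map T.strip . T.splitOn (T.pack "\n\n")

-- Number of leading '%' symbols; 0 means an ordinary paragraph.
headerLevel :: T.Text -> Int
headerLevel = T.length . T.takeWhile (== '%')

-- The paragraph text with the leading '%' symbols and padding removed.
headerText :: T.Text -> T.Text
headerText = T.strip . T.dropWhile (== '%')
```

Reading the input with `Data.Text.IO.getContents` instead of the Prelude's `getContents` would then keep the whole pipeline on `Text`.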


"The footnotes seemed particularly challenging, as you kind of have to accumulate them on the side while building the rest of the document, and I didn't have any idea how to express that in functional terms (at least not without dragging an extra footnotes argument in every single function call)."

As you already found out, one way to track state through the program without adding extra arguments is the State monad. In this case, however, I feel that it made the program more complicated than it needed to be.
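
For reference, a stripped-down version of the State approach might look like the sketch below. It is not your code: the `Piece` type and function names are invented for the example, and it assumes the `transformers` package (bundled with GHC) for `Control.Monad.Trans.State`.

```haskell
import Control.Monad.Trans.State (State, get, put, runState)

-- Hypothetical pieces of an already-split paragraph.
data Piece = Plain String | Footnote String

-- State: (number for the next footnote, footnotes collected so far, newest first).
type Collect = State (Int, [String])

-- Render one piece; a footnote becomes a numbered marker and is stored in the state.
renderPiece :: Piece -> Collect String
renderPiece (Plain s) = return s
renderPiece (Footnote s) = do
  (n, notes) <- get
  put (n + 1, s : notes)
  return ("<sup>" ++ show n ++ "</sup>")

-- Run the computation and return the rendered text plus footnotes in document order.
renderPieces :: [Piece] -> (String, [String])
renderPieces ps =
  let (chunks, (_, notes)) = runState (mapM renderPiece ps) (1, [])
  in (concat chunks, reverse notes)
```

The state threading is hidden inside `mapM`, which is convenient, but every function that touches footnotes now has to live in the `Collect` monad.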

I think an extra footnotes argument (and a second one to count them) would actually have been the simplest solution here.
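
A minimal sketch of that explicit-argument version, again with a made-up `Piece` type and function name, could look like this. The counter and the footnote list go in as arguments and come back out as part of the result.

```haskell
-- Hypothetical pieces of an already-split paragraph.
data Piece = Plain String | Footnote String

-- Arguments: number for the next footnote, footnotes collected so far (newest first),
-- and the pieces to render.
-- Result: rendered text, the next free footnote number, and the updated footnote list.
renderPieces :: Int -> [String] -> [Piece] -> (String, Int, [String])
renderPieces n notes [] = ("", n, notes)
renderPieces n notes (Plain s : rest) =
  let (html, n', notes') = renderPieces n notes rest
  in (s ++ html, n', notes')
renderPieces n notes (Footnote s : rest) =
  let (html, n', notes') = renderPieces (n + 1) (s : notes) rest
  in ("<sup>" ++ show n ++ "</sup>" ++ html, n', notes')
```

To process several paragraphs, the returned number and list are passed into the call for the next paragraph, and the list is reversed once at the end to get the footnotes in document order.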

As a general design note, I think a two-step approach would make the code more manageable and extensible: first parse the string into an internal representation (I would call it a tree, except that until you support nested markup it won't actually be a tree), then write a function that turns that representation into HTML.

This way you separate the code that does the parsing from the code that produces the HTML, which is good style. It also lets you add another output format later without duplicating any parsing code, makes nested markup easier to support, and makes it easier to replace your manual parsing code with a parsing library should you decide to do so.
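
The second step of that design could be sketched as follows. The types `Inline` and `Block` and the two renderers are assumptions made for this example (no nesting, no HTML escaping); the point is only that both output formats consume the same representation and neither contains any parsing code.

```haskell
import Data.List (mapAccumL)

-- Hypothetical internal representation produced by the parsing step.
data Inline = Plain String | Emph String | Footnote String
data Block  = Header Int [Inline] | Paragraph [Inline]

-- Render one inline element, threading the footnotes collected so far.
inlineHtml :: [String] -> Inline -> ([String], String)
inlineHtml notes (Plain s)    = (notes, s)
inlineHtml notes (Emph s)     = (notes, "<em>" ++ s ++ "</em>")
inlineHtml notes (Footnote s) =
  (notes ++ [s], "<sup>" ++ show (length notes + 1) ++ "</sup>")

-- Render one block inside the given tag.
blockHtml :: [String] -> Block -> ([String], String)
blockHtml notes block =
  let (tag, is) = case block of
        Header n xs  -> ("h" ++ show n, xs)
        Paragraph xs -> ("p", xs)
      (notes', parts) = mapAccumL inlineHtml notes is
  in (notes', "<" ++ tag ++ ">" ++ concat parts ++ "</" ++ tag ++ ">")

-- HTML output: all blocks, followed by the numbered footnote list if there is one.
toHtml :: [Block] -> String
toHtml doc =
  let (notes, blocks) = mapAccumL blockHtml [] doc
      footer
        | null notes = []
        | otherwise  = ["<ol>" ++ concatMap (\n -> "<li>" ++ n ++ "</li>") notes ++ "</ol>"]
  in unlines (blocks ++ footer)

-- A second output format reusing the same representation: plain text,
-- with footnotes shown inline in square brackets.
toPlain :: [Block] -> String
toPlain = unlines . map block
  where
    block (Header _ is)  = concatMap inline is
    block (Paragraph is) = concatMap inline is
    inline (Plain s)     = s
    inline (Emph s)      = s
    inline (Footnote s)  = " [" ++ s ++ "]"
```

A main program would then be roughly `interact (toHtml . parseDocument)`, where `parseDocument` stands for whatever parsing function you end up writing.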

Context

StackExchange Code Review Q#1176, answer score: 7
