HTML to Markdown converter

Submitted by: @import:stackexchange-codereview
converter, markdown, html

Problem

I've made a simple HTML→Markdown converter in JavaScript and am looking for any feedback. For now, I've used Stack Exchange's /editing-help as a guide for what to convert, but I might look at CommonMark's spec later on.

It parses the string with DOMParser and then walks the child nodes of the body, converting each one.

My test HTML string right now is:

h1 

h2 

h3 
text outside everything

(and another element!) 

a link! 

    item 1
    item 2
    item 3

    item 1
    item 2
    item 3

BOLD TEXT and ITALICISED TEXT 

blockquote


and that conversion 'works':

# h1

## h2

### h3

text outside everything

## (and another element!)

!enter image description here

a link!

  • item 1
  • item 2
  • item 3



  1. item 1
  2. item 2
  3. item 3



BOLD TEXT and ITALICISED TEXT

> blockquote


Code



var str = "h1
"
str += "h2
";
str += "h3
";
str += "text outside everything
";
str += "(and another element!)
"
str += "
";
str += "a link!
";
str += "
  • item 1
  • item 2
  • item 3


";
str += "
  • item 1
  • item 2
  • item 3


";
str += "BOLD TEXT and ITALICISED TEXT
";
str += "blockquote";

var doc = new DOMParser().parseFromString(str, 'text/html');
var childnodes = doc.body.childNodes;
var markdown = '';

var conversions = {
br: function(data) {
return '\n\n';
},
h1: function(data) {
return '# ';
},
h2: function(data) {
return '## ';
},
h3: function(data) {
return '### ';
},
hr: function(data) {
return '---\n';
},
blockquote: function(data) {
return '> ';
},
img: function(data) {
var imgStr = "![alt text](" + data.curEl.src + ")";
return imgStr;
},
a: function(data) {
return "[" + data.html + "](" + data.curEl.href + ")";
},
ul: function(data) {
var lis = childnodes[data.i].childNodes;
var newmd = '';
var lislength = lis.length;
for (var x = 0; x -1 ? '' : html);
}
}

var length = childnodes.length;
fo

Solution

A few things to point out:

ul: function(data) {
    var lis = childnodes[data.i].childNodes;
    var newmd = '';
    var lislength = lis.length;
    for (var x = 0; x < lislength; x++) {
      newmd += "- " + lis[x].innerHTML + "\n";
    }
    return newmd;
  },


You can declare lislength in the for loop's initialiser, rather than on a separate line, if you like.

for (var x = 0, lislength = lis.length; x < lislength; x++) {


Caching the length this way is especially helpful when the value comes from a function or method call that is somewhat slow, since it avoids re-calling it on every iteration.

Also, note that you should keep your use of quotes consistent: pick either "" or '' and use it throughout.
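
As a rough sketch of the cached-length loop, here is the ul handler rewritten as a standalone function. The name listToMarkdown is hypothetical, it takes the list element as an argument instead of reading the global childnodes, and the nodeName check is an added assumption to skip whitespace text nodes between items.

```javascript
// Hypothetical helper: converts the items of one <ul> element to Markdown bullets.
function listToMarkdown(listEl) {
  var lis = listEl.childNodes;
  var newmd = '';
  // lis.length is read once, in the loop initialiser, not on every pass.
  for (var x = 0, lislength = lis.length; x < lislength; x++) {
    // Assumption: anything that is not an <li> (e.g. whitespace text) is skipped.
    if (lis[x].nodeName !== 'LI') {
      continue;
    }
    newmd += '- ' + lis[x].innerHTML + '\n';
  }
  return newmd;
}
```

Single quotes are used throughout here, in line with the point about consistency.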

img: function(data) {
    var imgStr = "![alt text](" + data.curEl.src + ")";
    return imgStr;
  },


You can return the string directly instead of assigning it to imgStr first.

return "![alt text](" + data.curEl.src + ")";


strong: function(data) {
    return "**" + data.html + "**";
  },


<b> is synonymous with <strong>, so you may want to consider handling that, also.
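
One possible way to cover both tags without duplicating code is to point a second key at the same handler. This is only a sketch: it shows the strong handler from the question in isolation, and the b key is an addition that does not appear in the original code.

```javascript
// Only the handler relevant here; data.html holds the element's inner HTML.
var conversions = {
  strong: function (data) {
    return '**' + data.html + '**';
  }
};

// Reuse the same function so <b> and <strong> produce identical Markdown.
conversions.b = conversions.strong;
```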

On an abstraction note, you should be passing the result of
var doc = new DOMParser().parseFromString(str, 'text/html');
to a class or function and leaving the rest to be handled inside it. That way childnodes and markdown are buried within the structure, rather than exposed as external globals.
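
A minimal sketch of that encapsulation might look like the following. The function name htmlToMarkdown is hypothetical, only a few handlers are shown, and each handler receives the element itself rather than the data object used in the question, which is a simplification for illustration.

```javascript
// Hypothetical wrapper: takes a parsed document and keeps all state local.
function htmlToMarkdown(doc) {
  var childnodes = doc.body.childNodes; // local, no longer a global
  var markdown = '';                    // local, no longer a global

  // A small subset of handlers, each taking the current element.
  var conversions = {
    h1: function (el) { return '# ' + el.textContent + '\n\n'; },
    h2: function (el) { return '## ' + el.textContent + '\n\n'; },
    blockquote: function (el) { return '> ' + el.textContent + '\n\n'; }
  };

  for (var i = 0, length = childnodes.length; i < length; i++) {
    var node = childnodes[i];
    var name = node.nodeName.toLowerCase();
    if (Object.prototype.hasOwnProperty.call(conversions, name)) {
      markdown += conversions[name](node);
    } else {
      // Text nodes and unhandled elements fall back to their plain text.
      markdown += node.textContent;
    }
  }
  return markdown;
}

// In a browser this would be called as:
// var doc = new DOMParser().parseFromString(str, 'text/html');
// var result = htmlToMarkdown(doc);
```

Because the function only depends on the document passed in, it can be called repeatedly on different inputs without leftover state from a previous run.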

Context

StackExchange Code Review Q#101791, answer score: 8
