Converting Pandoc Markdown images from captioned to inline
Problem
After writing a rather long document in Markdown and using pandoc to convert it to a PDF, I found, to my dismay, that many of the images were out of place, and that they all had their alternate text proudly displayed underneath them as captions. My document is rather instructional, so this rearrangement was harmful to its readability.
I eventually found a way to display the images inline. I still wanted to write the document in standard Markdown, though, so I wrote a Python script to convert all the standalone images in a document to this inline form.
pandoc_images.py:

import sys

# Convert standalone images in standard Markdown
# to inline images in Pandoc's Markdown
# (see http://pandoc.org/README.html#images)

with open(sys.argv[1], 'r') as markdown:
    lines = markdown.read().splitlines()

for index, line in enumerate(lines):
    # A line is standalone when both of its neighbours are blank
    # (or when it sits at the very start or end of the file).
    is_first_line = index == 0
    preceding_blank = True if is_first_line else not lines[index - 1]
    is_last_line = index == len(lines) - 1
    following_blank = True if is_last_line else not lines[index + 1]
    is_standalone = preceding_blank and following_blank
    # Markdown image syntax begins with '!['
    is_image = line.startswith('![')
    print(line + ('\\\n' if is_standalone and is_image else ''))

Example (text.md):

This is some text.

![This is an image.](...)

### This is a header.
Running python3 pandoc_images.py text.md would produce:

This is some text.

![This is an image.](...)\

### This is a header.
It seems like a lot of mess (enumerate, bounds checking, etc.) for such a simple job, though. Is there any way I can improve any of this code?

Solution
How about a regular expression?
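import re

def convert(s):
    # Group 1: the image plus the blank line (or start of text) before it.
    # Group 2: the blank line (or end of text) after it.
    # The newline right after the image is matched outside both groups,
    # so it is dropped from the output.
    return re.sub(r"((?:\A|^ *\n)!\[.*\]\(.*\))\n(^ *\n|\Z)", r"\1\\\2", s, flags=re.M)

# The image markup in these test strings is a placeholder.
def test1():
    print(convert("""![alt](image.png)\n\nthis is a test\n"""))

def test2():
    print(convert("""line 1\n\n![alt](image.png)\n\nanother test\n"""))

def test3():
    print(convert("""line 1\n\n![alt](image.png)\n"""))

def test4():
    print(convert("""line 1\n\n![alt](image.png)\nNot blank\n"""))

Note: I am using ^ *\n to match a blank line, i.e. the line can also contain spaces.

Code Snippets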
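The following is a minimal sketch of a variant of the regular expression above, assuming the blank line after the image should be kept, as the script in the question does. It moves the trailing newline and blank line into a lookahead, so they are checked but not consumed; this also lets two standalone images separated by a single blank line both match. The name STANDALONE_IMAGE and the sample text (including the image.png target) are illustrative assumptions, not part of the original answer.

```python
import re

# Variant of the pattern above: the newline and the following blank line
# (or end of text) sit in a lookahead, so they are checked but not consumed.
STANDALONE_IMAGE = re.compile(
    r"((?:\A|^ *\n)!\[.*\]\(.*\))(?=\n(?:^ *\n|\Z))",
    re.M,
)


def convert(text):
    """Append a backslash to every standalone image line, keeping blank lines."""
    return STANDALONE_IMAGE.sub(r"\1\\", text)


# Illustrative input; the image target "image.png" is a placeholder.
sample = (
    "This is some text.\n"
    "\n"
    "![This is an image.](image.png)\n"
    "\n"
    "### This is a header.\n"
)

if __name__ == "__main__":
    print(convert(sample), end="")
```

With this variant the header in the sample still has a blank line in front of it after conversion, which matters if the Markdown reader requires a blank line before a header.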
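import re

def convert(s):
    # Group 1: the image plus the blank line (or start of text) before it.
    # Group 2: the blank line (or end of text) after it.
    # The newline right after the image is matched outside both groups,
    # so it is dropped from the output.
    return re.sub(r"((?:\A|^ *\n)!\[.*\]\(.*\))\n(^ *\n|\Z)", r"\1\\\2", s, flags=re.M)

# The image markup in these test strings is a placeholder.
def test1():
    print(convert("""![alt](image.png)\n\nthis is a test\n"""))

def test2():
    print(convert("""line 1\n\n![alt](image.png)\n\nanother test\n"""))

def test3():
    print(convert("""line 1\n\n![alt](image.png)\n"""))

def test4():
    print(convert("""line 1\n\n![alt](image.png)\nNot blank\n"""))

Context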
StackExchange Code Review Q#102335, answer score: 2